Install Structsure Edge
The following are step-by-step instructions on how to configure Structsure for a Snowball Edge deployment.
Snowball Edge Unlock
This doc will guide you through the basics of unlocking a Snowball Edge device for use. The full docs can be found here or on page 81 of the AWS Snowball Edge Developer Guide PDF.
Requirements
In order to unlock the Snowball Edge, you need the following:
- Snowball Edge CLI or OpsHub
- Snowball Edge Unlock Code (unique per snowball)
- Snowball Edge Manifest file (unique per snowball)
- The IP of the Snowball Edge device
Unlocking using the CLI
Issue the following command to unlock the Snowball Edge Device:
snowballEdge unlock-device --endpoint https://192.0.2.0 --manifest-file /Downloads/JID2EXAMPLE-0c40-49a7-9f53-916aEXAMPLE81-manifest.bin --unlock-code 12345-abcde-12345-ABCDE-12345
Obtain the Access/Secret Keys
Issue the following commands to obtain the root user's Access/Secret keys and export them for future use. This assumes you have jq and grep available. If you don't have jq or grep, manually parse the Access Key Id and Secret Access Key.
AWS_ACCESS_KEY_ID=$(snowballEdge list-access-keys | jq -r '.AccessKeyIds[0]')
AWS_SECRET_ACCESS_KEY=$(snowballEdge get-secret-access-key --access-key-id "${AWS_ACCESS_KEY_ID}" | grep -Eo "^aws_secret_access_key = (.*)$" | cut -d' ' -f3)
export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
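If you want to sanity-check the parsing pipeline without a device on hand, you can exercise it against a sample of the expected output. The values below are the standard AWS documentation example credentials, not real keys:

```bash
# Hypothetical sample of `snowballEdge get-secret-access-key` output, using the
# well-known AWS documentation example credentials.
sample_output='[snowballEdge]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'

# The same grep/cut pipeline as above, applied to the sample:
secret=$(printf '%s\n' "$sample_output" | grep -Eo '^aws_secret_access_key = (.*)$' | cut -d' ' -f3)
echo "$secret"
```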
At this point you should have an unlocked Snowball Edge device and the root user's Access/Secret keys.
Snowball Edge AMI Import
Import the AMI
You will need to import the AMI to each Snowball Edge device. You can follow the AWS documentation for this as found here; the broad steps are as follows:
Unlock the Snowball Edge device and obtain the Access/Secret keys. See above, Snowball Edge Unlock for instructions.
Obtain your Snowball Edge IP address. You can export this info for easy substitution:
export AWS_REGION=snow
export SNOWBALL_IP=192.168.2.75
Create a vmimport role and policy as described here:
cat << EOF > /trust-policy.json
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Principal":{
"Service":"vmie.amazonaws.com"
},
"Action":"sts:AssumeRole"
}
]
}
EOF
aws iam create-role --role-name structsure-vmimport --assume-role-policy-document file:///trust-policy.json --endpoint http://${SNOWBALL_IP}:6078
cat << EOF > /import-policy.json
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"s3:GetBucketLocation",
"s3:GetObject",
"s3:ListBucket",
"s3:GetMetadata"
],
"Resource":[
"arn:aws:s3:::*",
"arn:aws:s3:::*/*"
]
}
]
}
EOF
aws iam create-policy --policy-name structsure-vmimport --policy-document file:///import-policy.json --endpoint http://${SNOWBALL_IP}:6078
# Note down the policy ARN and substitute it into the next command
aws iam attach-role-policy --role-name structsure-vmimport --policy-arn <Policy ARN> --endpoint http://${SNOWBALL_IP}:6078
Determine your S3 Bucket name:
aws s3 ls --endpoint "http://${SNOWBALL_IP}:8080"
export AMI_BUCKET=my-bucket-name
Copy the AMI to the S3 Bucket:
aws s3 cp ~/Downloads/export-ami-0123456789abcdef0.raw s3://${AMI_BUCKET} --endpoint "http://${SNOWBALL_IP}:8080"
Import the AMI Snapshot. Save the resulting Snapshot-ID for use in the next step.
aws ec2 import-snapshot --disk-container "Format=RAW,UserBucket={S3Bucket=${AMI_BUCKET},S3Key=export-ami-0123456789abcdef0.raw}" --endpoint "http://${SNOWBALL_IP}:8008"
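Note that import-snapshot returns an import task ID rather than the snapshot ID itself; the snapshot ID appears in aws ec2 describe-import-snapshot-tasks output once the task completes. As a sketch, it can be extracted with jq (the JSON below is a hypothetical completed-task response):

```bash
# Hypothetical `describe-import-snapshot-tasks` response for a completed task.
task_json='{"ImportSnapshotTasks":[{"SnapshotTaskDetail":{"SnapshotId":"s.snap-0123456789abcdef0","Status":"completed"}}]}'
SNAPSHOT_ID=$(printf '%s' "$task_json" | jq -r '.ImportSnapshotTasks[0].SnapshotTaskDetail.SnapshotId')
echo "$SNAPSHOT_ID"
```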
Register the AMI:
Note: you may need to update the device name, root device name, volume size, and volume type depending on the OS of your AMI.
aws ec2 register-image \
--name my-ami-name \
--description "Structsure Edge AMI" \
--block-device-mappings "[{\"DeviceName\":\"/dev/xvda\",\"Ebs\":{\"VolumeType\":\"sbp1\",\"DeleteOnTermination\":true,\"SnapshotId\":\"s.snap-0123456789abcdef0\",\"VolumeSize\":120}}]" \
--root-device-name /dev/xvda \
--endpoint http://${SNOWBALL_IP}:8008
Snowball Edge SSH Key Pair
Follow this guide to generate a SSH Key Pair. Note that these keys will be unique per snowball, unless you choose to import a single key-pair for each device. Be sure to safely retain the private key, as it is not recoverable if lost.
The full documentation can be found here.
Generate a Key Pair
Unlock the Snowball Edge device and obtain the Access/Secret keys. See above, Snowball Edge Unlock for instructions.
Obtain your Snowball Edge IP address. You can export this info for easy substitution:
export AWS_REGION=snow
export SNOWBALL_IP=192.168.2.75
Create and save the key pair. Note that this command requires jq and chmod.
If you do not have those tools available, you can manually capture the key material from the output and save it to a pem file.
KEY_PAIR_RESPONSE=$(aws ec2 create-key-pair --key-name structsure-edge --endpoint http://${SNOWBALL_IP}:8008)
echo "${KEY_PAIR_RESPONSE}" | jq -r '.KeyMaterial' > "private-key-${SNOWBALL_IP}.pem"
chmod 0400 private-key-${SNOWBALL_IP}.pem
At this point you should have a key pair on the Snowball Edge device and the private key on your workstation. You could also import a key pair at instance creation with the command aws ec2 import-key-pair --key-name <value> --public-key-material <value>.
Snowball Edge Instance Deployment
Plan your instances
You'll want to deploy the following:
- 1 Deploy box
- 3 Control Plane nodes
- Agent nodes totaling at least 36 vCPU (6 agents recommended)
Instance sizing will depend on the resources available to you. Plan how to best utilize capacity by mapping the instances being deployed onto each device and dividing the available vCPU with the smallest remainder. Here are some suggested minimum sizes:
- Deploy box: sbe-c.large
- Control Plane Nodes: sbe-c.2xlarge
- Agent Nodes: sbe-c.4xlarge
The instance types available can be found here or starting on page 188 of the AWS Snowball PDF.
Here's an example scenario where the System Integrator is using 8 vCPU worth of instances on Snowball-0 and Snowball-1, leaving the rest of the resources for use by Structsure Edge. Note that this will most likely not be the case, but it demonstrates shuffling instances around and changing instance sizes to optimize utilization.
SB-0: Total vCPU: 104 | SI Instances: 8 | Agent Nodes (x6): 96 | Free vCPU: 0
SB-1: Total vCPU: 104 | SI Instances: 8 | Agent Nodes (x6): 96 | Free vCPU: 0
SB-2: Total vCPU: 104 | Control Plane: 8 | Agent Nodes (x6): 96 | Free vCPU: 0
SB-3: Total vCPU: 104 | Control Plane: 8 | Agent Nodes (x6): 96 | Free vCPU: 0
SB-4: Total vCPU: 104 | Control Plane: 8 | Agent Nodes (x6): 96 | Free vCPU: 0
SB-5: Total vCPU: 104 | Deploy Box: 8 | Agent Nodes (x6): 96 | Free vCPU: 0
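The arithmetic behind each row can be checked quickly in shell, assuming sbe-c.2xlarge provides 8 vCPU and sbe-c.4xlarge provides 16 vCPU:

```bash
# Six sbe-c.4xlarge agents plus one 8-vCPU instance should exactly fill a
# 104-vCPU snowball.
agents=6
vcpu_per_agent=16   # sbe-c.4xlarge (assumed)
other=8             # SI instances, a control plane node, or the deploy box
total_used=$(( agents * vcpu_per_agent + other ))
echo "vCPU used: ${total_used} of 104"
```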
Plan your Storage
You'll want to divide up the storage sensibly as well. Determine which snowball has the least free space, round that down to the nearest TB, divide it in half, divide that by the number of agent nodes you're going to run, round to a good even number, and use that for your rook-ceph storage.
For example:
- Free space: 7.68 TB
- Round down to 7 TB
- Divide by 2: 3.5 TB
- Divide by the number of agent nodes per snowball (here 6): 597.3 GB
- Round to 600 GB
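The same math can be sketched in shell (integer GB, taking 1 TB = 1024 GB):

```bash
free_tb=7    # 7.68 TB free, rounded down to the nearest TB
agents=6     # agent nodes per snowball
per_node_gb=$(( free_tb * 1024 / 2 / agents ))
echo "${per_node_gb} GB per agent node"   # 597 GB; round up to an even 600 GB
```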
We will want to mount an additional volume at /var/lib/rook. This volume can be relatively small, as it only needs to house rook-ceph's internal database. 100 GB should more than cover it.
We may want to mount in an additional volume for /var/log if your AMI doesn't have sufficient space. This will be dependent on the AMI that we're incorporating.
Plan Your IP Addressing and DNS Names
Each node will end up needing two IP addresses: one for the Virtual Network Interface (VNI) and one for the Direct Network Interface (DNI). Work with the SI to come up with an IP addressing scheme that will be "easy" and not cause problems. Keep in mind you'll need some addresses for the Virtual IPs (VIP) we'll assign to the "load balancers" created by kube-vip later.
Example:
control-plane-0.test.example.com:
- VNI: 10.0.0.100
- DNI: 10.0.0.200
control-plane-1.test.example.com:
- VNI: 10.0.0.101
- DNI: 10.0.0.201
control-plane-2.test.example.com:
- VNI: 10.0.0.102
- DNI: 10.0.0.202
agent-0.test.example.com:
- VNI: 10.0.0.110
- DNI: 10.0.0.210
...
agent-35.test.example.com:
- VNI: 10.0.0.145
- DNI: 10.0.0.245
Control Plane VIP: 10.0.0.250
Istio Ingress Gateway VIP: 10.0.0.251
Additional VIP?: 10.0.0.252
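The agent addressing pattern above can be generated mechanically; a small sketch for the first few agents:

```bash
# Print VNI/DNI assignments following the scheme above (agent N gets
# 10.0.0.(110+N) and 10.0.0.(210+N)).
plan=$(for i in 0 1 2; do
  printf 'agent-%d.test.example.com: VNI 10.0.0.%d, DNI 10.0.0.%d\n' "$i" $((110 + i)) $((210 + i))
done)
echo "$plan"
```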
Configure the Snowball Timeserver
Clustered systems do not operate properly if they have a large time skew, so we need to ensure the clocks are synced. It matters less whether the time is accurate, but it must be consistent between the systems. You can configure the timeserver for the Snowballs, which will in turn configure the clocks on the systems we're deploying. The full documentation for configuring NTP can be found here and here, or starting on pages 121 and 143 of the AWS Snowball PDF.
If the System Integrator (SI) has not provided a time server, request that they configure one. They should be able to configure the nearest managed switch as an NTP server that you can use.
snowballEdge update-time-servers 10.0.0.2
Deploy Instances
The following steps will walk you through deploying an EC2 instance on the Snowball. You will need to repeat these steps for each instance you're deploying on each Snowball, and the AMI ID will be unique per Snowball Edge device.
Get the AMI ID either from the OpsHub UI or from the CLI.
aws ec2 describe-images --endpoint http://${SNOWBALL_IP}:8008
You can export the AMI ID for future use. NOTE: this AMI ID will be unique per Snowball Edge Device.
export AMI_ID="s.ami-example0123456789"
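If you prefer not to copy the ID by hand, it can be pulled from the describe-images JSON with jq (the response below is a hypothetical sample):

```bash
# Hypothetical `describe-images` response with a single imported AMI.
images_json='{"Images":[{"ImageId":"s.ami-example0123456789","Name":"my-ami-name"}]}'
AMI_ID=$(printf '%s' "$images_json" | jq -r '.Images[0].ImageId')
echo "$AMI_ID"
```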
Get the physical network device ID. We need to determine which interface is connected and obtain its ID. Executing the following command will display all of the device info.
snowballEdge describe-device
Only one of the PhysicalNetworkInterfaceIds will have an IpAddress that isn't 0.0.0.0 (e.g. RJ45 or QSFP). Identify the correct interface using that IpAddress value. You can export the interface ID for future use. NOTE: this interface ID will be unique per Snowball Edge Device.
export PHYSICAL_DEVICE_ID="s.ni-EXAMPLE0a3a6499fd"
Create the VNI.
snowballEdge create-virtual-network-interface --physical-network-interface-id ${PHYSICAL_DEVICE_ID} --ip-address-assignment STATIC --static-ip-address-configuration IpAddress=10.0.0.100,Netmask=255.255.255.0
You will need to keep the VNI ID for future use. You can export it if you would like.
export VNI_ID="arn:aws:snowball-device:::interface/s.ni-example0123456789"
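The VNI ARN can likewise be extracted with jq; the response shape below is an assumption for illustration, so check it against your CLI's actual output:

```bash
# Hypothetical `create-virtual-network-interface` response.
vni_json='{"VirtualNetworkInterface":{"VirtualNetworkInterfaceArn":"arn:aws:snowball-device:::interface/s.ni-example0123456789","IpAddress":"10.0.0.100"}}'
VNI_ID=$(printf '%s' "$vni_json" | jq -r '.VirtualNetworkInterface.VirtualNetworkInterfaceArn')
echo "$VNI_ID"
```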
Run the instance
Set your instance type:
export INSTANCE_TYPE="sbe-c.4xlarge"
You may need to tweak the block-device-mappings setting. More information on this can be found in the AWS CLI documentation here.
Remember to configure the block device mapping as follows:
- Control plane nodes get zero additional disks (unless you decide they need a dedicated logging disk).
- Agent nodes get additional disks:
  - One for the Rook-Ceph database, mounted at /var/lib/rook
  - One for the Rook-Ceph block storage (unformatted and unpartitioned)
  - Optionally, one for a dedicated logging disk
aws ec2 run-instances --image-id ${AMI_ID} --instance-type ${INSTANCE_TYPE} --key-name=${KEY_NAME} --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=control-plane-0.test.example.com}]" --block-device-mappings "[{\"DeviceName\":\"/dev/sdb\",\"Ebs\":{\"DeleteOnTermination\":false,\"VolumeSize\":600}},{\"DeviceName\":\"/dev/sdc\",\"Ebs\":{\"DeleteOnTermination\":false,\"VolumeSize\":100}}]" --endpoint http://${SNOWBALL_IP}:8008
You will need to keep the Instance ID for the next commands. You can export it.
export INSTANCE_ID="s.i-example0123456789"
Wait for the instance to start. It must be running to associate the interfaces.
aws ec2 wait instance-running --instance-ids ${INSTANCE_ID} --endpoint http://${SNOWBALL_IP}:8008
Associate the VNI. Your --public-ip is the VNI address you created.
aws ec2 associate-address --public-ip 10.0.0.100 --instance-id ${INSTANCE_ID} --endpoint http://${SNOWBALL_IP}:8008
At this point you should be able to SSH to your instance via the VNI IP address.
ssh -i key.pem rocky@10.0.0.100
Configure Instance
Now that your instance is deployed, we will need to do some basic configuration that hasn't been automated. You will need to do these configurations as root user.
- Configure the hostname
- Configure and format the storage.
- Configure the DNI and routing.
Configure the hostname
hostnamectl set-hostname control-plane-0.test.example.com
You can verify with the following command:
hostnamectl status
Configure and format the storage
If you only added the rook-ceph block storage you're all done! If you've attached additional storage for other purposes you have some work to do.
List the disks. Note which disk is which for future steps
lsblk
You will need to put the large volume in the rook-ceph config, so note the device name down for later.
Partition, format, and mount the disk for the rook-ceph database.
Note: You should only have to do this on agent nodes. Note: Replace /dev/vdb in the commands below with whatever device name you identified in step 1.
fdisk /dev/vdb
# n then enter to make a new partition
# p then enter for primary
# press enter to accept the defaults for the rest of the prompts
# p then enter to show the new partition. This should output all of your partitions.
# w then enter to write the new partition
# The new partition should show up now
fdisk -l
# Make the file system
mkfs.xfs /dev/vdb1
# make the /var/lib/rook directory
mkdir -p /var/lib/rook
# Modify the fstab to mount the disk to the dir
vi /etc/fstab
# Add a new line that looks something like this:
# /dev/vdb1 /var/lib/rook xfs defaults 0 0
# Remount the file systems
mount -a
# Verify the mount is good
df -h
Add the Logs disk to the LVM. (Optional, for a dedicated logging disk)
Note: if your AMI does not support LVM, these steps will not work for you. You can follow the process in step 2 above to partition, format, and mount the disk to the logs dir instead.
# Create the new PV
pvcreate /dev/sdc
# Identify the volume group we want to extend
vgs
# Extend the logging volume group
vgextend rocky /dev/sdc
# Identify the Logical Volume
lvs
# Extend the logging volume
lvextend -l +100%FREE /dev/rocky/logs
# Resize the filesystem
xfs_growfs /var/log
# Check that things have been resized
df -h
Configure the DNI and routing
In order to make the VIPs for the API and ingress gateway(s) float between nodes, we need to configure the host with a DNI, but first we need to prepare the instance for the DNI.
Note: If you want to delete an instance, delete the DNI before deleting the instance it's attached to.
# show the devices
nmcli -p device
# show the connections
nmcli connection show
# Stage a connection to attach to the DNI using a STATIC ip
# Note you may need to change the device name from eth1.
# The ipv4.address should be the DNI address that you chose. The ipv4.gateway can be found from the nmcli con show command
nmcli con add con-name eth1 type ethernet ifname eth1 ipv4.method manual ipv4.address 10.0.0.200/24 ipv4.gateway 10.0.0.1 ipv4.route-metric 10
nmcli con show eth1
# This is to deprioritize eth0 to make eth1 (dni) priority. You may also need to use the UUID instead of the con name.
nmcli connection modify eth0 ipv4.route-metric 20
Now create the DNI, back on your local machine:
snowballEdge create-direct-network-interface --physical-network-interface-id ${PHYSICAL_DEVICE_ID} --instance-id ${INSTANCE_ID}
You should now be able to SSH in using the DNI IP address. Once on the instance, display the connection settings and make sure eth1 is higher priority than eth0:
# settings
nmcli dev show eth1
NOTE: when you apply the route config, you will most likely not be able to access the instance via the VNI IP address any longer. You should be able to connect via the DNI address at this point instead.
RKE2 and Structsure Deploy Guide
Configure Deploy box
This guide assumes you have deployed a system using the AMI provided in the Structsure Edge Data Transfer and have accessed this system via SSH.
Copy the files from your S3 Bucket. Be sure to substitute in your Snowball Edge Device IP, Bucket name, and Access/Secret Keys.
sudo su
mkdir -p /var/lib/structsure-edge
cd /var/lib/structsure-edge
export SNOWBALL_IP=192.168.1.2
export BUCKET_NAME=structsure-edge
export AWS_REGION=snow
export AWS_ACCESS_KEY_ID=my-access-key-id
export AWS_SECRET_ACCESS_KEY=my-secret-access-key
aws s3 cp --recursive s3://${BUCKET_NAME}/edge/ . --endpoint "http://${SNOWBALL_IP}:8080"
Install Docker and load the Utility Image.
The edge-util docker image should have all of the tooling you need (including k9s!) to perform an edge deployment. These steps will walk you through deploying Docker on a deployment instance based on the Structsure RKE2 AMI.
mkdir -p rpms/docker-repo
tar -C rpms/docker-repo -xzf rpms/docker-repo.tar.gz
cat << EOF > /etc/yum.repos.d/local.repo
[local]
name=Local Repository
baseurl=file:///var/lib/structsure-edge/rpms/docker-repo
enabled=1
gpgcheck=0
protect=1
EOF
dnf -y install docker-ce docker-ce-cli containerd.io --repo local
systemctl enable docker --now
docker load -i edge-util.tar
Prepare RKE2 Ansible
tar -zxf rke2-ansible.tar.gz
cp image-archives/* rke2-ansible/tarball_install/
cp -r rpms/* rke2-ansible/offline_repo/
Populate the SSH keys you use for the nodes into the rke2-ansible/ssh_keys directory, and be sure to change the permissions:
chmod 0400 rke2-ansible/ssh_keys/*.pem
Populate ansible inventory
vi rke2-ansible/inventory/structsure-edge/hosts.ini
Your hosts.ini should look something like this:
[rke2_servers]
; Add Server Node IPs here, this would be the DNI's set for each node.
; Add the hostname variable to rename the systems if desired
; add the ssh private key for each host. Note that these keys will be unique per snowball
; the server variable needs to be the ip of the kube-vip interface (same as the vip_address variable).
; the first time ansible is executed, do NOT supply the server variable to the first node
; after the cluster is up, and the api server is accessible via https://<kube-vip-ip>:6443 from all nodes, then add server=<kube-vip-ip> to the first node and re-run ansible
10.0.0.100 ansible_ssh_private_key_file=ssh_keys/private-key.pem
10.0.0.101 ansible_ssh_private_key_file=ssh_keys/private-key.pem server=10.0.0.200
10.0.0.102 ansible_ssh_private_key_file=ssh_keys/private-key.pem server=10.0.0.200
[rke2_agents]
; Add Agent Node IPs here, this would be the DNI's set for each node.
; add the ssh private key for each host. Note these keys will be unique per snowball
10.0.0.110 ansible_ssh_private_key_file=ssh_keys/private-key.pem
10.0.0.111 ansible_ssh_private_key_file=ssh_keys/private-key.pem
10.0.0.112 ansible_ssh_private_key_file=ssh_keys/private-key.pem
10.0.0.113 ansible_ssh_private_key_file=ssh_keys/private-key.pem
10.0.0.114 ansible_ssh_private_key_file=ssh_keys/private-key.pem
10.0.0.115 ansible_ssh_private_key_file=ssh_keys/private-key.pem
[rke2_cluster:children]
rke2_servers
rke2_agents
[rke2_servers:vars]
; vip_interface should be the same physical interface used to access the node instances
vip_interface=eth1
; this should be an IP address in the snowball cidr that is not in a DHCP pool or has a reservation
; this will be the IP address of the api server "load balancer"
vip_address=10.0.0.200
; you can set this and run ansible if you break quorum. Note that it does dumb recovery and you could lose data
recover_quorum=false
; this cidr is used for the application load balancer.
; Again, this should not be in a DHCP pool or it should have a reservation so its not used elsewhere
vip_cloud_provider_cidr=10.0.0.201/32
[rke2_agents:vars]
; This should be the same ip as the vip_address
server=10.0.0.200
[all:vars]
ansible_user=rocky
install_rke2_version=v1.26.12+rke2r1
selinux=true
Execute the ansible
docker run -it -v $(pwd):/work -w /work edge-util /bin/bash
or to keep your container running after exiting:
docker run -itd -v $(pwd):/work -w /work edge-util sh -c "while true; do sleep 30; done"
docker exec -it <container-id> /bin/bash
cd rke2-ansible
ansible-playbook -i inventory/structsure-edge/hosts.ini site.yml
These ansible scripts can take a long time.
Verify you can access the api via the kube vip address using curl
curl -k https://<kube_vip_ip>:6443
The following output is expected:
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "Unauthorized",
"reason": "Unauthorized",
"code": 401
}
After the first successful run of ansible, set the server variable for the first control plane node and re-run the ansible. You can limit the execution to just that one system by specifying its DNI IP address in the limit flag. If you need to add more nodes to the cluster or re-run ansible, make sure to include the master node IP address if you use the limit flag, for critical variables to work.
ansible-playbook -i inventory/structsure-edge/hosts.ini site.yml --limit '10.2.3.1'
When ansible is done, you should be able to access your cluster.
kubectl --kubeconfig connection/rke2.yaml get node
Export the kubeconfig for future use.
export KUBECONFIG=$(pwd)/connection/rke2.yaml
The kube-vip-cloud-provider deployment may need its imagePullPolicy set to IfNotPresent; by default it is Always, and pods may not come up. This can be edited with kubectl edit deployment kube-vip-cloud-provider -n kube-system. If the kube-proxy pods do not deploy correctly on some nodes, remove the static manifest file under /var/lib/rancher/rke2/agent/pod-manifests on each host where the pod isn't healthy, then run systemctl restart rke2-agent.
Deploy the Rook Operator. Within your rook-ceph/values.yaml, it is important to make sure hostpathRequiresPrivileged: true is set if SELinux was enabled in your rke2 ansible.
cd ..
helm install --create-namespace rook-operator charts/rook-ceph-v1.13.1.tgz --values rook-values/rook-operator.yaml -n rook-ceph
Verify the operator is running:
kubectl get all -n rook-ceph
Get a shell on an agent node and check to see what device name was assigned to the secondary disk attached at launch
lsblk
Determine which disk is the secondary (no partitions and should be the correct size); remember, only your agent nodes should be used as storage resources. Modify the rook-ceph-cluster/values.yaml and replace the storage value with this disk; it should look something like this:
```yaml
storage: # cluster level storage configuration and selection
  useAllNodes: false
  useAllDevices: false
  #deviceFilter:
  #config:
  #  crushRoot: "custom-root" # specify a non-default root label for the CRUSH map
  #  metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
  #  databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
  #  osdsPerDevice: "1" # this value can be overridden at the node or device level
  #  encryptedDevice: "true" # the default value for this option is "false"
  # Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
  # nodes below will be used as storage resources. Each node's 'name' field should match their 'kubernetes.io/hostname' label.
  nodes:
    - name: "agent-0-xxxxx"
      devices: # specific devices to use for storage can be specified for each node
        - name: "vdc"
          config:
            osdsPerDevice: "1"
    - name: "agent-1-xxxxx"
      devices:
        - name: "vdc"
          config:
            osdsPerDevice: "1"
```
Also, enable the rook-ceph toolbox (disabled by default) by setting toolbox.enabled: true within the rook-ceph-cluster/values.yaml.
Deploy the Rook Cluster
helm install --create-namespace -n rook-cluster rook-cluster charts/rook-ceph-cluster-v1.13.1.tgz --values rook-ceph-cluster/values.yaml
Wait for the rook cluster to provision.
watch 'kubectl get cephcluster -n rook-cluster; kubectl get po -n rook-cluster'
Phase should read Ready
Message should read Cluster Created Successfully
Health should read HEALTH_OK
Check the cluster status
kubectl exec -it -n rook-cluster $(kubectl get po -n rook-cluster -l app=rook-ceph-tools -o jsonpath="{.items[0].metadata.name}") -- ceph status
Health should say HEALTH_OK
Other things of note:
- one osd per agent node
- All pgs should be active+clean
- Data usage should show you the total storage available to the cluster
Display storage classes
kubectl get sc
The ceph block storage class should be default
Create a registry bucket:
kubectl apply -f manifests/zarf-registry-bucket.yaml
Wait for the bucket to be ready
watch kubectl get objectbucketclaims.objectbucket.io,objectbuckets.objectbucket.io -A
Get the connection info
kubectl get cm -n rook-cluster zarf-registry-bucket -oyaml
kubectl get secret -n rook-cluster zarf-registry-bucket -oyaml
Be sure to base64 decode the access and secret keys. Update the manifests/zarf-init-config.yaml file using the above output, being sure to replace:
- REGISTRY_STORAGE_S3_BUCKET
- REGISTRY_STORAGE_S3_ACCESSKEY
- REGISTRY_STORAGE_S3_SECRETKEY
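A quick round-trip sketch of the base64 decoding, using a placeholder value; substitute the encoded strings from the Secret above:

```bash
# Kubernetes Secret data values are base64-encoded; decode them before pasting
# into the config. Placeholder value for illustration:
encoded=$(printf '%s' 'my-secret-access-key' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```

With kubectl this can be done in one step, e.g. kubectl get secret -n rook-cluster zarf-registry-bucket -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d (the data key name here is an assumption about the bucket Secret's layout).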
Also add these two settings (the template pre-dates them):
REGISTRY_HPA_MIN: "2"
REGISTRY_PVC_ACCESS_MODE: ReadWriteMany
Initialize Zarf
ZARF_CONFIG=manifests/zarf-init-config.yaml zarf init --components git-server --confirm
This could be pretty slow
Taint the registry so it only runs on the control plane node:
kubectl patch deployment zarf-docker-registry -n zarf --patch "
spec:
template:
spec:
nodeSelector:
node-role.kubernetes.io/master: \"true\"
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule"
Recombine the zarf parts ahead of time to avoid accidentally corrupting them:
cat zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst.part00[1-9] > zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst
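You can sanity-check the recombination approach locally with a dummy payload; this sketch splits a file into numbered parts, recombines them with cat, and verifies the result byte-for-byte:

```bash
# Split a dummy payload into numbered parts, recombine, and compare.
printf 'example package payload' > package.bin
split -b 8 -d -a 3 package.bin package.bin.part   # package.bin.part000, 001, ...
cat package.bin.part* > package-recombined.bin
cmp package.bin package-recombined.bin && echo "parts recombined cleanly"
```

For the real package, comparing a sha256sum of the recombined file against a published checksum (if one was shipped) is a stronger check.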
Modify the structsure-values.yaml file if desired
Deploy Structsure
zarf package deploy zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst --set BIGBANG_VALUES=manifests/structsure-values.yaml --confirm
You could encounter the zarf hang-on-push issue. If you do, ctrl+c to cancel the push, clean up the temp files, and start over:
# ctrl+c
rm -rf /tmp/zarf-*
zarf package deploy zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst --set BIGBANG_VALUES=manifests/structsure-values.yaml --confirm
If the image push doesn't work with a package deploy, it is recommended to use package mirror-resources with a zarf connect git tunnel open to the gitea server, pushing directly to the zarf-docker-registry using the node address and nodeport it is using for the --registry-url. You will need to authenticate with --registry-push-username/password and --git-push-username/password. To get these credentials, run zarf tools get-creds. For example, to open the gitea tunnel:
zarf connect git
To mirror the resources/push images from the package to the package registry from a different terminal:
zarf package mirror-resources zarf-5.9/zarf-package-structsure-enterprise-amd64-v5.9.0.tar.zst --registry-url <DNI IP your node zarf-docker-registry is running on>:31999 --registry-push-username zarf-push --registry-push-password <zarf-push-password> --git-url http://127.0.0.1:<random tunnel port from zarf connect git> --git-push-username zarf-git-user --git-push-password <git-user-password>
This should bypass any image pushing issues you may encounter; a package deploy will need to be run after the package mirror-resources:
zarf package deploy zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst --set BIGBANG_VALUES=manifests/structsure-values.yaml --confirm