Version: 5.11.0

Install Structsure Edge

The following are step-by-step instructions on how to configure Structsure for a Snowball Edge deployment.

Snowball Edge Unlock

This doc will guide you through the basics of unlocking a Snowball Edge device for use. The full docs can be found here or on page 81 of the AWS Snowball Edge Developer Guide PDF.

Requirements

In order to unlock the Snowball Edge, you need the following:

  1. Snowball Edge CLI or OpsHub
  2. Snowball Edge Unlock Code (unique per snowball)
  3. Snowball Edge Manifest file (unique per snowball)
  4. The IP of the Snowball Edge device

Unlocking using the CLI

Issue the following command to unlock the Snowball Edge Device:

    snowballEdge unlock-device --endpoint https://192.0.2.0 --manifest-file /Downloads/JID2EXAMPLE-0c40-49a7-9f53-916aEXAMPLE81-manifest.bin --unlock-code 12345-abcde-12345-ABCDE-12345

Obtain the Access/Secret keys

Issue the following commands to obtain the root user's Access/Secret keys and export them for future use. These commands assume you have jq and grep available; if not, manually parse the Access Key ID and Secret Access Key from the command output.

AWS_ACCESS_KEY_ID=$(snowballEdge list-access-keys | jq -r '.AccessKeyIds[0]')
AWS_SECRET_ACCESS_KEY=$(snowballEdge get-secret-access-key --access-key-id "${AWS_ACCESS_KEY_ID}" | grep -Eo "^aws_secret_access_key = (.*)$" | cut -d' ' -f3)
export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
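If grep isn't available either, the same field can be pulled with awk. Here is a sketch against sample output (the INI-style layout is assumed from the response format the grep above parses; verify against your CLI's actual output):

```shell
# Stand-in for `snowballEdge get-secret-access-key --access-key-id ...` output.
SAMPLE='[snowball]
aws_access_key_id = AKIAEXAMPLE
aws_secret_access_key = wJalrEXAMPLEKEY'

# Split on " = " and print the value of the aws_secret_access_key line.
AWS_SECRET_ACCESS_KEY=$(printf '%s\n' "${SAMPLE}" | awk -F' = ' '/^aws_secret_access_key/ {print $2}')
echo "${AWS_SECRET_ACCESS_KEY}"   # prints: wJalrEXAMPLEKEY
```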

At this point you should have an unlocked Snowball Edge device and the root user's Access/Secret keys.

Snowball Edge AMI Import

Import the AMI

You will need to import the AMI to each Snowball Edge device.

You can follow the AWS documentation for this, found here. However, the broad steps are as follows:

  1. Unlock the Snowball Edge device and obtain the Access/Secret keys. See above, Snowball Edge Unlock for instructions.

  2. Obtain your Snowball Edge IP address. You can export this info for easy substitution:

    export AWS_REGION=snow
    export SNOWBALL_IP=192.168.2.75
  3. Create a vmimport role and policy as described here

    cat << EOF > /trust-policy.json
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "vmie.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
    EOF
    aws iam create-role --role-name structsure-vmimport --assume-role-policy-document file:///trust-policy.json --endpoint http://${SNOWBALL_IP}:6078
    cat << EOF > /import-policy.json
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetBucketLocation",
            "s3:GetObject",
            "s3:ListBucket",
            "s3:GetMetadata"
          ],
          "Resource": [
            "arn:aws:s3:::*",
            "arn:aws:s3:::*/*"
          ]
        }
      ]
    }
    EOF
    aws iam create-policy --policy-name structsure-vmimport --policy-document file:///import-policy.json --endpoint http://${SNOWBALL_IP}:6078
    # Note down the policy ARN and substitute it into the next command

    aws iam attach-role-policy --role-name structsure-vmimport --policy-arn <Policy ARN> --endpoint http://${SNOWBALL_IP}:6078
  4. Determine your S3 Bucket name:

    aws s3 ls --endpoint "http://${SNOWBALL_IP}:8080"
    export AMI_BUCKET=my-bucket-name
  5. Copy the AMI to the S3 Bucket:

    aws s3 cp ~/Downloads/export-ami-0123456789abcdef0.raw s3://${AMI_BUCKET} --endpoint "http://${SNOWBALL_IP}:8080"
  6. Import the AMI Snapshot. Save the resulting Snapshot-ID for use in the next step.

    aws ec2 import-snapshot --disk-container "Format=RAW,UserBucket={S3Bucket=${AMI_BUCKET},S3Key=export-ami-0123456789abcdef0.raw}" --endpoint "http://${SNOWBALL_IP}:8008"
  7. Register the AMI:

    Note: you may need to update the device name, root device name, volume size, and volume type depending on the OS of your AMI.

    aws ec2 register-image \
    --name my-ami-name \
    --description "Structsure Edge AMI" \
    --block-device-mappings "[{\"DeviceName\":\"/dev/xvda\",\"Ebs\":{\"VolumeType\":\"sbp1\",\"DeleteOnTermination\":true,\"SnapshotId\":\"s.snap-0123456789abcdef0\",\"VolumeSize\":120}}]" \
    --root-device-name /dev/xvda \
    --endpoint http://${SNOWBALL_IP}:8008

Snowball Edge SSH Key Pair

Follow this guide to generate an SSH key pair. Note that these keys will be unique per snowball unless you choose to import a single key pair to each device. Be sure to safely retain the private key, as it is not recoverable if lost.

The full documentation can be found here.

Generate a Key Pair

  1. Unlock the Snowball Edge device and obtain the Access/Secret keys. See above, Snowball Edge Unlock for instructions.

  2. Obtain your Snowball Edge IP address. You can export this info for easy substitution:

    export AWS_REGION=snow
    export SNOWBALL_IP=192.168.2.75
  3. Create and save the key pair. Note that these commands require jq and chmod.

    If you do not have those tools available, you can manually capture the key material from the output and save it to a pem file.

    KEY_PAIR_RESPONSE=$(aws ec2 create-key-pair --key-name structsure-edge --endpoint http://${SNOWBALL_IP}:8008)
    echo "${KEY_PAIR_RESPONSE}" | jq -r '.KeyMaterial' > "private-key-${SNOWBALL_IP}.pem"
    chmod 0400 private-key-${SNOWBALL_IP}.pem

At this point you should have a key pair on the Snowball Edge device and the private key on your workstation. Alternatively, you could import an existing public key as a key pair with aws ec2 import-key-pair --key-name <value> --public-key-material <value>.

Snowball Edge Instance Deployment

Plan your instances

You'll want to deploy the following:

  • 1 Deploy box
  • 3 Control Plane nodes
  • Agent nodes totaling at least 36 vCPUs of combined compute (6 agents recommended)

Instance sizing will depend on the resources available to you. Plan how to best utilize the space by mapping out the instances being deployed to each device and doing the math to divide the space with the smallest remainder. Here are some suggested minimum sizes:

  • Deploy box: sbe-c.large
  • Control Plane Nodes: sbe-c.2xlarge
  • Agent Nodes: sbe-c.4xlarge

The instance types available can be found here or starting on page 188 of the AWS Snowball PDF.

Here's an example scenario where the System Integrator (SI) is using 8 vCPU worth of instances on Snowball-0 and Snowball-1, leaving the rest of the resources for use by Structsure Edge.
Note that this will most likely not match your case; it demonstrates shuffling instances around and changing instance sizes to optimize utilization.

| Snowball | Total vCPU | Dedicated Use    | Agent Nodes (x6) | Free vCPU |
|----------|------------|------------------|------------------|-----------|
| SB-0     | 104        | SI Instances: 8  | 96               | 0         |
| SB-1     | 104        | SI Instances: 8  | 96               | 0         |
| SB-2     | 104        | Control Plane: 8 | 96               | 0         |
| SB-3     | 104        | Control Plane: 8 | 96               | 0         |
| SB-4     | 104        | Control Plane: 8 | 96               | 0         |
| SB-5     | 104        | Deploy Box: 8    | 96               | 0         |
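The per-device math in the example above can be sanity-checked with shell arithmetic. The agent size of 16 vCPU is an assumption based on using sbe-c.4xlarge agents; substitute your actual instance sizes:

```shell
TOTAL_VCPU=104      # total vCPU on one Snowball Edge device, from the example above
DEDICATED_VCPU=8    # SI instances, a control plane node, or the deploy box
AGENT_COUNT=6
AGENT_VCPU=16       # assumed size of one sbe-c.4xlarge agent
FREE_VCPU=$(( TOTAL_VCPU - DEDICATED_VCPU - AGENT_COUNT * AGENT_VCPU ))
echo "Free vCPU: ${FREE_VCPU}"   # prints: Free vCPU: 0
```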

Plan your Storage

You'll want to divide up the storage in a smart way as well. Determine which snowball has the smallest amount of free space, round that down to the nearest TB, divide it in half, divide by the number of agent nodes you're going to run, round to a good even number, and use that for your rook-ceph storage.

For example:

  • Free space: 7.68 TB
  • Round down to 7 TB
  • Divide by 2: 3.5 TB
  • Divide by the number of agent nodes per snowball (here, 6): 597.3 GB
  • Round to 600 GB
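The same calculation as a shell sketch, using integer arithmetic and 1 TB = 1024 GB as in the example above:

```shell
FREE_TB=7                  # 7.68 TB free, rounded down to the nearest TB
AGENTS_PER_SNOWBALL=6
# Half of the rounded space, in GB, divided across the agent nodes
PER_AGENT_GB=$(( FREE_TB * 1024 / 2 / AGENTS_PER_SNOWBALL ))
echo "${PER_AGENT_GB} GB per agent"   # prints: 597 GB per agent; round up to 600 GB
```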

We will want to mount an additional volume at /var/lib/rook. This volume can be relatively small, as it only needs to house the internal database rook-ceph needs; 100 GB should more than cover it.

You may want to mount an additional volume at /var/log if your AMI doesn't have sufficient space. This will be dependent on the AMI you're bringing.

Plan your IP addressing and DNS names

Each node will end up needing two IP addresses: one for the Virtual Network Interface (VNI) and one for the Direct Network Interface (DNI). Work with the SI to come up with an IP addressing scheme that will be "easy" and not cause problems. Keep in mind you'll need some addresses for the Virtual IPs (VIP) we'll assign to the "load balancers" created by kube-vip later.

Example:

control-plane-0.test.example.com:

  • VNI: 10.0.0.100
  • DNI: 10.0.0.200

control-plane-1.test.example.com:

  • VNI: 10.0.0.101
  • DNI: 10.0.0.201

control-plane-2.test.example.com:

  • VNI: 10.0.0.102
  • DNI: 10.0.0.202

agent-0.test.example.com:

  • VNI: 10.0.0.110
  • DNI: 10.0.0.210

...

agent-35.test.example.com:

  • VNI: 10.0.0.145

  • DNI: 10.0.0.245

  • Control Plane VIP: 10.0.0.250
  • Istio Ingress Gateway VIP: 10.0.0.251
  • Additional VIP?: 10.0.0.252
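If the SI's DNS isn't available yet, the plan can be captured temporarily in an /etc/hosts fragment on your workstation or deploy box. The addresses and names below are from the example above; the api name for the Control Plane VIP is hypothetical:

```
10.0.0.200 control-plane-0.test.example.com
10.0.0.201 control-plane-1.test.example.com
10.0.0.202 control-plane-2.test.example.com
10.0.0.210 agent-0.test.example.com
10.0.0.250 api.test.example.com   # Control Plane VIP (hypothetical name)
```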

Configure the Snowball Timeserver

Clustered systems do not operate properly with a large time skew, so we need to ensure the clocks are synced. It matters less that the time is accurate; it must be consistent between the systems. You can configure the timeserver for the Snowballs, which will in turn configure the clocks on the systems we're deploying. The full documentation for configuring NTP can be found here and here, or starting on pages 121 and 143 of the AWS Snowball PDF.

If the System Integrator (SI) has not provided a time server, request that they configure one. They should be able to configure the nearest managed switch as an NTP server that you can use.

snowballEdge update-time-servers 10.0.0.2

Deploy Instances

The following steps should walk you through deploying an EC2 instance on the snowball.
Note that you will need to repeat these steps for each instance you're deploying on each Snowball.
Note that the AMI ID will be unique per Snowball Edge Device.

  1. Get the AMI ID either from the OpsHub UI or from the CLI.

    aws ec2 describe-images --endpoint http://${SNOWBALL_IP}:8008

    You can export the AMI ID for future use. NOTE: this AMI ID will be unique per Snowball Edge Device.

    export AMI_ID="s.ami-example0123456789"
  2. Get the physical network device ID. We need to determine which interface is connected and obtain its ID. Executing the following command will display all of the device info.

    snowballEdge describe-device

    Only one of the PhysicalNetworkInterfaceIds will have an IpAddress that isn't 0.0.0.0 (e.g. RJ45, QSFP, etc.). Identify the correct interface using the IpAddress value that isn't 0.0.0.0. You can export the interface ID for future use. NOTE: this interface ID will be unique per Snowball Edge Device.

    export PHYSICAL_DEVICE_ID="s.ni-EXAMPLE0a3a6499fd"
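    The selection can also be scripted with jq. A hedged sketch against sample output (field names taken from the AWS Snowball Edge docs; verify against your device's actual JSON before relying on it):

    ```shell
    # Sample stand-in for `snowballEdge describe-device` output.
    DESCRIBE_JSON='{"PhysicalNetworkInterfaces":[
      {"PhysicalNetworkInterfaceId":"s.ni-EXAMPLEaaaa","IpAddress":"0.0.0.0"},
      {"PhysicalNetworkInterfaceId":"s.ni-EXAMPLEbbbb","IpAddress":"10.0.0.50"}]}'

    # Keep only the interface whose IpAddress is not 0.0.0.0.
    PHYSICAL_DEVICE_ID=$(printf '%s' "${DESCRIBE_JSON}" \
      | jq -r '.PhysicalNetworkInterfaces[] | select(.IpAddress != "0.0.0.0") | .PhysicalNetworkInterfaceId')
    echo "${PHYSICAL_DEVICE_ID}"   # prints: s.ni-EXAMPLEbbbb
    ```

    On a real device, pipe `snowballEdge describe-device` into the same jq filter.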
  3. Create the VNI.

    snowballEdge create-virtual-network-interface --physical-network-interface-id ${PHYSICAL_DEVICE_ID} --ip-address-assignment STATIC --static-ip-address-configuration IpAddress=10.0.0.100,Netmask=255.255.255.0

    You will need to keep the VNI ID for future use. You can export it if you would like.

    export VNI_ID="arn:aws:snowball-device:::interface/s.ni-example0123456789"
  4. Run the instance

    Set your instance type:

    export INSTANCE_TYPE="sbe-c.4xlarge"

    You may need to tweak the block-device-mappings setting. More information on this can be found in the AWS CLI documentation here.

    Remember to configure the block device mappings as follows:

    • Control plane nodes get zero additional disks (unless you decide they need a dedicated logging disk).
    • Agent nodes get additional disks:
      • One for the Rook-Ceph database mounted in /var/lib/rook
      • One for the Rook-Ceph block storage (unformatted and un-partitioned)
      • Optionally one for a dedicated logging disk
    aws ec2 run-instances --image-id ${AMI_ID} --instance-type ${INSTANCE_TYPE} --key-name=${KEY_NAME} --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=control-plane-0.test.example.com}]" --block-device-mappings "[{\"DeviceName\":\"/dev/sdb\",\"Ebs\":{\"DeleteOnTermination\":false,\"VolumeSize\":600}},{\"DeviceName\":\"/dev/sdc\",\"Ebs\":{\"DeleteOnTermination\":false,\"VolumeSize\":100}}]" --endpoint http://${SNOWBALL_IP}:8008

    You will need to keep the Instance ID for the next commands. You can export it.

    export INSTANCE_ID="s.i-example0123456789"
  5. Wait for the instance to start. It must be running to associate the interfaces.

    aws ec2 wait instance-running --instance-ids ${INSTANCE_ID} --endpoint http://${SNOWBALL_IP}:8008
  6. Associate the VNI. Your --public-ip is your VNI address you created.

    aws ec2 associate-address --public-ip 10.0.0.100 --instance-id ${INSTANCE_ID} --endpoint http://${SNOWBALL_IP}:8008

    At this point you should be able to SSH to your instance via the VNI IP address.

    ssh -i key.pem rocky@10.0.0.100

Configure Instance

Now that your instance is deployed, we will need to do some basic configuration that hasn't been automated. You will need to perform these configurations as the root user.

  • Configure the hostname
  • Configure and format the storage.
  • Configure the DNI and routing.
  1. Configure the hostname

    hostnamectl set-hostname control-plane-0.test.example.com

    You can verify with the following command:

    hostnamectl status
  2. Configure and format the storage

    If you only added the rook-ceph block storage you're all done! If you've attached additional storage for other purposes you have some work to do.

    1. List the disks. Note which disk is which for future steps

      lsblk

      You will need to put the large volume in the rook-ceph config, so note the device name down for later.

    2. Partition, format, and mount the disk for the rook-ceph database

      Note: you should only have to do this on agent nodes
      Note: Replace /dev/vdb with whatever device name you identified in step 1

      fdisk /dev/vdb
      # n then enter to make a new partition
      # p then enter for primary
      # press enter to accept the defaults for the rest of the prompts
      # p then enter to show the new partition. This should output all of your partitions.
      # w then enter to write the new partition

      # The new partition should show up now
      fdisk -l

      # Make the file system
      mkfs.xfs /dev/vdb1

      # make the /var/lib/rook directory
      mkdir -p /var/lib/rook

      # Modify the fstab to mount the disk to the dir
      vi /etc/fstab
      # Add a new line that looks something like this:
      # /dev/vdb1 /var/lib/rook xfs defaults 0 0

      # Remount the file systems
      mount -a

      # Verify the mount is good
      df -h
    3. Add the logs disk to the LVM (optional, for a dedicated logging disk)

      Note that if your AMI does not support LVM, these steps will not work for you. You can follow the process in step 2 above to partition, format, and mount the disk to the logs directory.

      # Create the new PV
      pvcreate /dev/sdc

      # Identify the volume group we want to extend
      vgs

      # Extend the volume group that contains the logging volume
      # (assumed to be "rocky" per the lvextend path below; use the name shown by vgs)
      vgextend rocky /dev/sdc

      # Identify the Logical Volume
      lvs

      # Extend the logging volume
      lvextend -l +100%FREE /dev/rocky/logs

      # Resize the filesystem
      xfs_growfs /var/log

      # Check that things have been resized
      df -h
  3. Configure the DNI and routing

    In order to make the VIPs for the API and ingress gateway(s) float between nodes, we need to configure the host with a DNI; but first, we need to prepare the instance for the DNI.

    Note: If you want to delete an instance, delete the DNI before deleting the instance it's attached to.

    # show the devices
    nmcli -p device

    # show the connections
    nmcli connection show

    # Stage a connection to attach to the DNI using a STATIC ip
    # Note you may need to change the device name from eth1.
    # The ipv4.address should be the DNI address that you chose. The ipv4.gateway can be found from the nmcli con show command
    nmcli con add con-name eth1 type ethernet ifname eth1 ipv4.method manual ipv4.address 10.0.0.200/24 ipv4.gateway 10.0.0.1 ipv4.route-metric 10

    nmcli con show eth1
    # Deprioritize eth0 so that eth1 (the DNI) takes priority. You may also need to use the UUID instead of the connection name.
    nmcli connection modify eth0 ipv4.route-metric 20

Now create the DNI.

  1. Create the DNI back on the local machine.

    snowballEdge create-direct-network-interface --physical-network-interface-id ${PHYSICAL_DEVICE_ID} --instance-id ${INSTANCE_ID}

    You should now be able to SSH in with the DNI IP address you set. Once on the instance, display the connection settings and make sure eth1 has higher priority than eth0:

    # show the device settings
    nmcli dev show eth1

    NOTE: when you apply the route config, you will most likely not be able to access the instance via the VNI IP address any longer. You should be able to connect via the DNI address at this point instead.

RKE2 and Structsure Deploy Guide

Configure Deploy box

This guide assumes you have deployed a system using the AMI provided in the Structsure Edge Data Transfer and have accessed this system via SSH.

  1. Copy the files from your S3 Bucket. Be sure to substitute in your Snowball Edge Device IP, Bucket name, and Access/Secret Keys.

    sudo su
    mkdir -p /var/lib/structsure-edge
    cd /var/lib/structsure-edge

    export SNOWBALL_IP=192.168.1.2
    export BUCKET_NAME=structsure-edge
    export AWS_REGION=snow
    export AWS_ACCESS_KEY_ID=my-access-key-id
    export AWS_SECRET_ACCESS_KEY=my-secret-access-key
    aws s3 cp --recursive s3://${BUCKET_NAME}/edge/ . --endpoint "http://${SNOWBALL_IP}:8080"
  2. Install Docker and load the Utility Image

    The edge-util docker image should have all of the tooling you need (including k9s!) to perform an edge deployment.
    These steps will walk you through deploying Docker on a deployment instance based on the Structsure RKE2 AMI.

    mkdir -p rpms/docker-repo
    tar -C rpms/docker-repo -xzf rpms/docker-repo.tar.gz

    cat << EOF > /etc/yum.repos.d/local.repo
    [local]
    name=Local Repository
    baseurl=file:///var/lib/structsure-edge/rpms/docker-repo
    enabled=1
    gpgcheck=0
    protect=1
    EOF
    dnf -y install docker-ce docker-ce-cli containerd.io --repo local
    systemctl enable docker --now

    docker load -i edge-util.tar
  3. Prepare RKE2 Ansible

    tar -zxf rke2-ansible.tar.gz
    cp image-archives/* rke2-ansible/tarball_install/
    cp -r rpms/* rke2-ansible/offline_repo/
  4. Place the SSH keys you used for the nodes in the rke2-ansible/ssh_keys directory, and be sure to restrict their permissions:

    chmod 0400 rke2-ansible/ssh_keys/*.pem
  5. Populate ansible inventory

    vi rke2-ansible/inventory/structsure-edge/hosts.ini

    Your hosts.ini should look something like this:

    [rke2_servers]
    ; Add Server Node IPs here; these would be the DNIs set for each node.
    ; Add the hostname variable to rename the systems if desired
    ; add the ssh private key for each host. Note that these keys will be unique per snowball
    ; the server variable needs to be the ip of the kube-vip interface (same as the vip_address variable).
    ; the first time ansible is executed, do NOT supply the server variable for the first node
    ; after the cluster is up, and the api server is accessible via https://<kube-vip-ip>:6443 from all nodes, then add server=<kube-vip-ip> to the first node and re-run ansible
    10.0.0.100 ansible_ssh_private_key_file=ssh_keys/private-key.pem
    10.0.0.101 ansible_ssh_private_key_file=ssh_keys/private-key.pem server=10.0.0.200
    10.0.0.102 ansible_ssh_private_key_file=ssh_keys/private-key.pem server=10.0.0.200

    [rke2_agents]
    ; Add Agent Node IPs here; these would be the DNIs set for each node.
    ; add the ssh private key for each host. Note these keys will be unique per snowball
    10.0.0.110 ansible_ssh_private_key_file=ssh_keys/private-key.pem
    10.0.0.111 ansible_ssh_private_key_file=ssh_keys/private-key.pem
    10.0.0.112 ansible_ssh_private_key_file=ssh_keys/private-key.pem
    10.0.0.113 ansible_ssh_private_key_file=ssh_keys/private-key.pem
    10.0.0.114 ansible_ssh_private_key_file=ssh_keys/private-key.pem
    10.0.0.115 ansible_ssh_private_key_file=ssh_keys/private-key.pem

    [rke2_cluster:children]
    rke2_servers
    rke2_agents

    [rke2_servers:vars]
    ; vip_interface should be the same physical interface used to access the node instances
    vip_interface=eth1
    ; this should be an IP address in the snowball CIDR that is not in a DHCP pool, or that has a reservation
    ; this will be the IP address of the api server "load balancer"
    vip_address=10.0.0.200
    ; you can set this and run ansible if you break quorum. Note that it does naive recovery and you could lose data
    recover_quorum=false
    ; this cidr is used for the application load balancer.
    ; Again, this should not be in a DHCP pool, or it should have a reservation so it's not used elsewhere
    vip_cloud_provider_cidr=10.0.0.201/32

    [rke2_agents:vars]
    ; This should be the same ip as the vip_address
    server=10.0.0.200

    [all:vars]
    ansible_user=rocky
    install_rke2_version=v1.26.12+rke2r1
    selinux=true
  6. Execute the Ansible playbook

    docker run -it -v $(pwd):/work -w /work edge-util /bin/bash

    or to keep your container running after exiting:

    docker run -itd -v $(pwd):/work -w /work edge-util sh -c "while true; do sleep 30; done"
    docker exec -it <container-id> /bin/bash
    cd rke2-ansible
    ansible-playbook -i inventory/structsure-edge/hosts.ini site.yml

    These ansible scripts can take a long time.

  7. Verify you can access the api via the kube vip address using curl

    curl -k https://<kube_vip_ip>:6443

    The following output is expected:

    {
    "kind": "Status",
    "apiVersion": "v1",
    "metadata": {},
    "status": "Failure",
    "message": "Unauthorized",
    "reason": "Unauthorized",
    "code": 401
    }
  8. After the first successful Ansible run, set the server variable for the first control plane node and re-run Ansible. You can limit the execution to just that one system by specifying its DNI IP address in the --limit flag. If you need to add more nodes to the cluster or re-run Ansible, make sure to include the first control plane node's IP address whenever you use the --limit flag, so that critical variables resolve correctly.

    ansible-playbook -i inventory/structsure-edge/hosts.ini site.yml --limit '10.2.3.1'
  9. When ansible is done, you should be able to access your cluster.

    kubectl --kubeconfig connection/rke2.yaml get node

    Export the kubeconfig for future use.

    export KUBECONFIG=$(pwd)/connection/rke2.yaml
Note: The kube-vip-cloud-provider deployment may need its imagePullPolicy set to IfNotPresent; by default it is Always, and pods may not come up. This can be edited with kubectl edit deployment kube-vip-cloud-provider -n kube-system. If the kube-proxy pods do not deploy correctly on some nodes, remove the static manifest file under /var/lib/rancher/rke2/agent/pod-manifests on each node where the pod isn't healthy, then run systemctl restart rke2-agent.

  1. Deploy the Rook Operator. In your Rook operator values file (rook-values/rook-operator.yaml), it is important to make sure hostpathRequiresPrivileged: true is set if SELinux was enabled in your RKE2 Ansible run.

    cd ..
    helm install --create-namespace rook-operator charts/rook-ceph-v1.13.1.tgz --values rook-values/rook-operator.yaml -n rook-ceph
  2. Verify operator is running:

    kubectl get all -n rook-ceph
  3. Get a shell on an agent node and check to see what device name was assigned to the secondary disk attached at launch

    lsblk

Determine which disk is the secondary (no partitions, and it should be the correct size); remember, only your agent nodes should be used as storage resources. Modify rook-ceph-cluster/values.yaml and replace the storage value with this disk; it should look something like this:

```yaml
storage: # cluster level storage configuration and selection
  useAllNodes: false
  useAllDevices: false
  #deviceFilter:
  #config:
  #  crushRoot: "custom-root" # specify a non-default root label for the CRUSH map
  #  metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
  #  databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
  #  osdsPerDevice: "1" # this value can be overridden at the node or device level
  #  encryptedDevice: "true" # the default value for this option is "false"
  # Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
  # nodes below will be used as storage resources. Each node's 'name' field should match their 'kubernetes.io/hostname' label.
  nodes:
    - name: "agent-0-xxxxx"
      devices: # specific devices to use for storage can be specified for each node
        - name: "vdc"
          config:
            osdsPerDevice: "1"
    - name: "agent-1-xxxxx"
      devices: # specific devices to use for storage can be specified for each node
        - name: "vdc"
          config:
            osdsPerDevice: "1"
```

Also enable the rook-ceph toolbox (it is disabled by default) by setting toolbox.enabled: true within rook-ceph-cluster/values.yaml.

  1. Deploy the Rook Cluster

    helm install --create-namespace -n rook-cluster rook-cluster charts/rook-ceph-cluster-v1.13.1.tgz --values rook-ceph-cluster/values.yaml
  2. Wait for the rook cluster to provision

    watch 'kubectl get cephcluster -n rook-cluster; kubectl get po -n rook-cluster'

    Phase should read Ready

    Message should read Cluster Created Successfully

    Health should read HEALTH_OK

  3. Check the cluster status

    kubectl exec -it -n rook-cluster $(kubectl get po -n rook-cluster -l app=rook-ceph-tools -o jsonpath="{.items[0].metadata.name}") -- ceph status

    Health should say HEALTH_OK

    Other things of note:

    • one osd per agent node
    • All pgs should be active+clean
    • data.usage should show you the total storage available to the cluster
  4. Display storage classes

    kubectl get sc

    The ceph block storage class should be default

  5. Create a registry bucket:

    kubectl apply -f manifests/zarf-registry-bucket.yaml
  6. Wait for the bucket to be ready

    watch kubectl get objectbucketclaims.objectbucket.io,objectbuckets.objectbucket.io -A
  7. Get the connection info

    kubectl get cm -n rook-cluster zarf-registry-bucket -oyaml
    kubectl get secret -n rook-cluster zarf-registry-bucket -oyaml

    Be sure to base64-decode the access and secret keys. Update the manifests/zarf-init-config.yaml file using the above output, being sure to replace:

    • REGISTRY_STORAGE_S3_BUCKET
    • REGISTRY_STORAGE_S3_ACCESSKEY
    • REGISTRY_STORAGE_S3_SECRETKEY

    Also add these two settings (the template pre-dates them):

    REGISTRY_HPA_MIN: "2"
    REGISTRY_PVC_ACCESS_MODE: ReadWriteMany
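    The decode step itself is just base64 -d on each copied value; a self-contained sketch with a placeholder value:

    ```shell
    # Stand-in for a value copied from the Secret's data map (always base64-encoded).
    ENCODED="$(printf 'my-access-key' | base64)"

    # Decode it before pasting into zarf-init-config.yaml.
    DECODED="$(printf '%s' "${ENCODED}" | base64 -d)"
    echo "${DECODED}"   # prints: my-access-key
    ```

    On the live cluster you could combine the steps, e.g. kubectl get secret -n rook-cluster zarf-registry-bucket -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d (the AWS_ACCESS_KEY_ID key name is assumed from the standard Rook object-bucket Secret; check the -oyaml output above for the actual keys).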

  8. Initialize Zarf

    ZARF_CONFIG=manifests/zarf-init-config.yaml zarf init --components git-server --confirm

    This could be pretty slow

  9. Constrain the registry deployment so it only runs on the control plane nodes:

    kubectl patch deployment zarf-docker-registry -n zarf --patch "
    spec:
      template:
        spec:
          nodeSelector:
            node-role.kubernetes.io/master: \"true\"
          tolerations:
          - key: node-role.kubernetes.io/master
            operator: Exists
            effect: NoSchedule"
  10. Recombine the zarf parts ahead of time to avoid accidentally corrupting them

    cat zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst.part00[1-9] > zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst
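    Recombining is a plain in-order concatenation of the numbered parts; the pattern can be sanity-checked on a scratch file (file names hypothetical):

    ```shell
    # Work in a throwaway directory.
    WORKDIR=$(mktemp -d)
    cd "${WORKDIR}"

    # Create a small stand-in "package" and split it into numbered parts.
    printf 'example-package-bytes' > pkg.tar.zst
    split -b 8 -d -a 3 pkg.tar.zst pkg.tar.zst.part

    # Recombine: the shell glob expands the parts in lexical (numeric) order.
    cat pkg.tar.zst.part* > rejoined.tar.zst
    cmp -s pkg.tar.zst rejoined.tar.zst && RESULT=ok || RESULT=corrupt
    echo "${RESULT}"   # prints: ok
    ```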
  11. Modify the structsure-values.yaml file if desired

  12. Deploy Structsure

    zarf package deploy zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst --set BIGBANG_VALUES=manifests/structsure-values.yaml --confirm

    You could encounter the zarf hang-on-push issue. If you do, you can Ctrl+C to cancel the push, clean up the temp files, and start over:

    # ctrl+c
    rm -rf /tmp/zarf-*
    zarf package deploy zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst --set BIGBANG_VALUES=manifests/structsure-values.yaml --confirm

    If the image push doesn't work with a package deploy, it is recommended to use zarf package mirror-resources with a zarf connect git tunnel open to the Gitea server, pushing directly to the zarf-docker-registry using the node address and NodePort it is using for the --registry-url. You will need to authenticate with --registry-push-username/password and --git-push-username/password; to get these credentials, run zarf tools get-creds. For example:

    To open the gitea tunnel:

    zarf connect git

    To mirror the resources/push images from the package to the package registry from a different terminal:

    zarf package mirror-resources zarf-5.9/zarf-package-structsure-enterprise-amd64-v5.9.0.tar.zst --registry-url <DNI IP your node zarf-docker-registry is running on>:31999 --registry-push-username zarf-push --registry-push-password <zarf-push-password> --git-url http://127.0.0.1:<random tunnel port from zarf connect git> --git-push-username zarf-git-user --git-push-password <git-user-password>

    This should bypass any image-pushing issues you may encounter; a package deploy will need to be run after the package mirror-resources:

    zarf package deploy zarf-package-structsure-enterprise-amd64-v5.0.0.tar.zst --set BIGBANG_VALUES=manifests/structsure-values.yaml --confirm