
Deploy Structsure Clusters in Disconnected Environments

Prerequisites

Before you begin deploying a Structsure cluster, ensure you have the following:

A DTA Package including:

  • A copy of the Structsure AWS IaC (preferably latest version)
  • Terraform offline bundle that corresponds to the commit of the AWS IaC
  • The latest Structsure Zarf Package
  • Utility image

Create Deploy Box

If available, create the deploybox from a plain RHEL8 AMI. Ensure that the deploybox is created in the same Virtual Private Cloud (VPC) as the cluster being deployed. The deploybox should also have a root volume of at least 75-100 GiB.

note

If you are using an RKE2/Rocky AMI as the deploybox and do not want to repartition your disks, you must use the /var/ directory as your working directory. By default, this is the only directory with enough space to hold the DTA without resizing. In addition, the home directory is mounted with the noexec option, so running Terraform commands from the home directory results in permission denied errors.
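
Both conditions in this note can be checked before you pick a working directory. The following is a minimal sketch rather than part of the Structsure tooling: the function names and the WORKDIR variable are illustrative assumptions, and the noexec check assumes findmnt (from util-linux) is installed.

```shell
#!/bin/bash
# Hypothetical helpers; names are illustrative and not part of Structsure.

# Print the free space, in whole GiB, on the filesystem that holds a directory.
free_gib() {
  df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Succeed if the filesystem that holds a directory is mounted with noexec.
# Returns 2 when findmnt is unavailable, meaning the result is unknown.
is_noexec() {
  command -v findmnt >/dev/null 2>&1 || return 2
  findmnt -no OPTIONS --target "$1" | tr ',' '\n' | grep -qx 'noexec'
}

WORKDIR="${WORKDIR:-/var}"   # assumption: candidate working directory
if [ -d "${WORKDIR}" ]; then
  echo "Free space under ${WORKDIR}: $(free_gib "${WORKDIR}") GiB"
  if is_noexec "${WORKDIR}"; then
    echo "WARNING: ${WORKDIR} is mounted noexec; expect permission denied errors."
  fi
fi
```

Set WORKDIR to another path to compare candidate directories before copying the DTA.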

Create a .json file similar to containers.json in this example. Import the .vmdk AMI using the following command, pointing it at the containers.json you created:

aws ec2 import-image --description "My server disks" --disk-containers "file://containers.json"
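
As a rough illustration of what such a file can contain, the sketch below writes a minimal containers.json in the layout that aws ec2 import-image accepts for a disk image stored in S3. The bucket name and object key are placeholders (assumptions), not values from the DTA.

```shell
#!/bin/bash
# Write a minimal containers.json for `aws ec2 import-image`.
# The bucket and key are placeholders (assumptions); point them at the
# S3 location where the .vmdk was uploaded.
S3_BUCKET="${S3_BUCKET:-my-import-bucket}"
S3_KEY="${S3_KEY:-images/deploybox.vmdk}"

cat > containers.json <<EOF
[
  {
    "Description": "My server disks",
    "Format": "vmdk",
    "UserBucket": {
      "S3Bucket": "${S3_BUCKET}",
      "S3Key": "${S3_KEY}"
    }
  }
]
EOF

echo "Wrote containers.json for s3://${S3_BUCKET}/${S3_KEY}"
```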

Extract all of the .tar files from the DTA:

tar -xvf xxx.tar
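
If the DTA contains many archives, a small loop avoids repeating the command for each one. This is a sketch only; the helper name and the example directory are assumptions.

```shell
#!/bin/bash
# Extract every .tar archive found directly inside a directory.
# Hypothetical helper; the DTA directory path is an assumption.
extract_all_tars() {
  local dir="$1" archive
  for archive in "${dir}"/*.tar; do
    [ -e "${archive}" ] || continue      # no matches: the glob stays literal
    echo "Extracting ${archive}"
    tar -xf "${archive}" -C "${dir}" || return 1
  done
}

# Example: extract_all_tars /var/dta
```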

Install RPMs (if you brought any) on the deploybox using the following:

dnf install rpms/*.rpm --nogpgcheck --disablerepo='*' --allowerasing -y

It is recommended to keep all repositories in a repos/ directory so that they can be easily mounted into your utility container. Clone the bare Git repositories in the DTA:

git clone --recursive structsure-aws-iac.git

To update or re-initialize the submodules on an already-cloned repository, run the following command:

git submodule update --init

note

Git submodules are referenced using relative paths in the AWS IaC repository, so if using an offline Git mirror, the rke2-aws-tf Git repository should be at the same folder level as the structsure-aws-iac repository. Refer to the .gitmodules file for the specific configuration of each submodule.

Additionally, when using a local git mirror, git may need to be configured to allow cloning submodules from file. This can be done by running:

git config --global protocol.file.allow always

Deploybox Password Expiration

If you are using one of the Structsure STIG'd AMIs as your deploybox, there is a STIG policy to set the password expiration to 60 days for user accounts. If you don't change the password for the rocky user or disable the password expiration policy, you will be locked out of the deploybox.

To disable password aging and expiration for the rocky user, run the following command with sudo to use interactive mode:

sudo chage rocky

Set the following prompts to the given values:

Minimum Password Age to 0
Maximum Password Age to 99999
Password Inactive to -1
Account Expiration Date to -1

OR

sudo chage -I -1 -m 0 -M 99999 -E -1 rocky

To re-enable password aging for the rocky user so that the password expires after 60 days, or to reset to the default settings, use the following command:

sudo chage -I -1 -m 1 -M 60 -W 7 rocky
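
To confirm which setting is in effect, chage -l prints the current aging values for a user. The helper below is a sketch, not a Structsure utility. It assumes the English-locale labels printed by shadow-utils, and the function name is hypothetical.

```shell
#!/bin/bash
# Read `chage -l <user>` output on stdin and print the maximum password age.
# Hypothetical helper; the label text assumes shadow-utils output in an
# English locale on the deploybox.
max_password_age() {
  awk -F': ' '/^Maximum number of days between password change/ { print $2 }'
}

# Example (run on the deploybox):
#   sudo chage -l rocky | max_password_age
```

A value of 99999 indicates that expiration is effectively disabled, while 60 indicates that the STIG policy value is active.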

Configure the Container Runtime (Docker)

Start Docker after installing docker-ce.

sudo systemctl enable docker --now

Start Utility Image and Create Volume Mount

Load the utility image (utility.image.tar.gz in this example) and start a container from it:

docker load -i utility.image.tar.gz

sudo docker run -it -d utilityimage:tag

Or, if the container was already initialized with docker run, then find the <container_id> using docker ps, and run:

sudo docker start <container_id>
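
The lookup and start can be combined into one step. This is a sketch under assumptions: the helper name is hypothetical, the container is assumed to have been given a name when it was created, and the docker calls may need a sudo prefix on your deploybox.

```shell
#!/bin/bash
# Start a stopped container by name instead of copying its ID by hand.
# Hypothetical helper; prefix the docker calls with sudo if your user is
# not allowed to talk to the Docker daemon.
start_container_by_name() {
  local name="$1" id
  # -a includes stopped containers and -q prints only IDs. The name filter
  # matches substrings, so only the first hit is used.
  id="$(docker ps -aq --filter "name=${name}" | head -n 1)"
  if [ -z "${id}" ]; then
    echo "No container matching ${name}" >&2
    return 1
  fi
  docker start "${id}"
}

# Example: start_container_by_name debug
```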

It is recommended to use a volume mount, so that the state is not ephemeral within the container. It can be added in a script called debug_utility.sh:

#!/bin/bash
# Path on the deploybox that holds your cloned repositories
export REPO_DIR=${REPO_DIR:="/absolute/path/to/your/repositories"}
# Start an interactive utility container with the repositories mounted at /opt/repos
docker run -it --name debug \
  --security-opt no-new-privileges \
  --restart on-failure:5 \
  --cpu-shares 1024 \
  --memory 1024m \
  --pids-limit 100 \
  -v "${REPO_DIR}":/opt/repos:z \
  -w /opt/repos \
  registry.gitlab.com/structsure/breakerbar/utility-image/volta-util:main \
  /bin/bash

and run with:

sudo ./debug_utility.sh

If successful, this should mount your repos directory into your utility container at /opt/repos.

Terragrunt for DT Cluster

Move the Terraform bundle into your deployment repository (in this case, the base of structsure-aws-iac) and extract the files:

tar -xvf <terraform_bundle>.tar

Terragrunt includes files to configure commonly-changed environment-specific variables. To configure these, navigate to /<path>/structsure-aws-iac/infra-iac/envs and create a copy of one of the existing environment file examples contained in that directory. Name the new environment file something relevant to your deployment.

Review the environment file for changes that need to be made. At a minimum, the following values will likely need to be updated:

  • AWS region
  • Allowed security groups
  • Cluster name
  • Domain name
  • Artifact bucket
  • Control plane node AMI
  • Control plane node instance type
  • Agent node name
  • Agent node AMI
  • Desired number of agent and control plane nodes
  • Agent node instance type
  • VPC ID
  • VPC Subnet IDs
  • Remote state bucket
  • Remote state DynamoDB

After these are updated, exec into the running utility container using the following command:

sudo docker exec -it <container_ID> /bin/bash

Within the utility container, declare the following environment variables from the root of your structsure-aws-iac working repository:

export TF_PLUGIN_CACHE_DIR="$(pwd)/.terragrunt-cache/providers/"
export TERRAGRUNT_ENV_FILE="$(pwd)/infra-iac/envs/your_terragrunt.hcl"
export WORKSPACE_NAME=<your_workspace>

You can also replace $(pwd) with the absolute path to the root of your working directory and export these variables from anywhere.
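
Because later steps depend on these variables, a quick pre-flight check can catch a missing export or a mistyped environment file path. The function below is a hypothetical helper written for illustration, not something shipped with the Structsure AWS IaC.

```shell
#!/bin/bash
# Pre-flight check for the three variables exported above.
# Hypothetical helper, not part of the Structsure tooling.
check_terragrunt_env() {
  missing=0
  for var in TF_PLUGIN_CACHE_DIR TERRAGRUNT_ENV_FILE WORKSPACE_NAME; do
    eval "value=\${${var}:-}"            # indirect lookup of the variable
    if [ -z "${value}" ]; then
      echo "ERROR: ${var} is not set" >&2
      missing=1
    fi
  done
  if [ -n "${TERRAGRUNT_ENV_FILE:-}" ] && [ ! -f "${TERRAGRUNT_ENV_FILE}" ]; then
    echo "ERROR: ${TERRAGRUNT_ENV_FILE} does not exist" >&2
    missing=1
  fi
  return "${missing}"
}

# Example: check_terragrunt_env && echo "Environment looks ready"
```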

Navigate to the /<path>/structsure-aws-iac/infra-iac/rke2-cluster folder and run git config --global --add safe.directory /<path>/structsure-aws-iac. Then, if applicable, edit the Git config file to replace the pointer to gitlab.com with the path to the local repository. This can be done outside of the container using vi .git/config. After the file is updated, run git submodule update --init. This is only necessary if you copied a checked-out repository that originally pointed to gitlab.com instead of cloning from a bare repository.

Initialize Terragrunt within structsure-aws-iac/infra-iac/rke2-cluster/ by using the following command:

terragrunt init -plugin-dir=/<path>/structsure-aws-iac/.terragrunt-cache/providers -get=false -input=false --terragrunt-log-level debug 2>&1 | tee terragrunt-init.log

note

Terragrunt will attempt to initialize the S3 bucket and DynamoDB table for its remote state if proper AWS credentials are present. In some cases, the automatic remote state initialization may partially fail; for example, when public access blocks cannot be created in the current account. If this happens, Terragrunt will still attempt to create the remaining resources, and subsequent attempts to run terragrunt init may succeed, since the public access block is not required for the remote state to function.

If you are using a Terragrunt workspace, select or create it by running terragrunt run-all workspace select -or-create <workspace_name>. Then confirm that the $TERRAGRUNT_ENV_FILE variable is still set and, if so, run terragrunt apply.

note

An error message stating that the apply failed while creating the S3 bucket may appear even though the apply was successful. This is expected because public access for S3 is blocked by default in IL4. If the S3 bucket exists, you should be able to continue.
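
One way to confirm that the bucket exists is the head-bucket call in the AWS CLI, which succeeds only when the bucket is present and reachable with the current credentials. The sketch below wraps that call. The helper name and the example bucket name are assumptions; use the remote state bucket configured in your environment file.

```shell
#!/bin/bash
# Succeed if the remote state bucket exists and is reachable with the
# current AWS credentials. Hypothetical helper; pass the remote state
# bucket name from your environment file.
state_bucket_exists() {
  aws s3api head-bucket --bucket "$1" >/dev/null 2>&1
}

# Example (bucket name is a placeholder):
#   if state_bucket_exists my-terraform-state-bucket; then
#     echo "Remote state bucket is present."
#   fi
```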

Terragrunt for Collab Cluster

The Collab Cluster infrastructure is the same as the DT Cluster infrastructure, but it deploys additional modules. After you create a DT Cluster, refer to the following guidance:

Each of the add-on IaC modules is dependent upon the main rke2-cluster IaC, so that module must be deployed first. If deploying a fully-featured collaboration environment, including all add-ons, the following short-hand can be used to deploy the RKE2 cluster and infrastructure for all available add-ons:

cd REPO_PATH
export TF_PLUGIN_CACHE_DIR="$(pwd)/.terragrunt-cache/providers/"
export TERRAGRUNT_ENV_FILE="$(pwd)/infra-iac/envs/your-custom-env.hcl"
export WORKSPACE_NAME="your-workspace-name"
cd infra-iac
terragrunt run-all init --terragrunt-parallelism=1
terragrunt run-all workspace select -or-create $WORKSPACE_NAME
terragrunt run-all plan
terragrunt run-all apply

This Terragrunt run-all command will recursively look through the child directories for Terragrunt modules (signified by the existence of a file: terragrunt.hcl) and apply them. It can automatically resolve interdependent modules and apply them sequentially, as necessary.
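
To preview which directories would be picked up, you can approximate the discovery step with find. This is only an approximation written for illustration: Terragrunt applies its own rules (for example, around included parent configurations), and the helper name is hypothetical.

```shell
#!/bin/bash
# List directories that contain a terragrunt.hcl file, which roughly
# corresponds to the modules a run-all command would visit.
# Hypothetical helper; copies under .terragrunt-cache are skipped.
list_terragrunt_modules() {
  find "$1" -name .terragrunt-cache -prune -o -type f -name terragrunt.hcl -print \
    | sed 's|/terragrunt.hcl$||' | sort
}

# Example: list_terragrunt_modules /<path>/structsure-aws-iac/infra-iac
```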

note

The command terragrunt run-all plan will have unusual and incorrect output if no infrastructure has yet been created. This is because of the interdependent modules; if the RKE2 cluster has not yet been built, for example, the Jira add-on will not be able to inherit its cluster name, since that does not yet exist. As such, Terragrunt will instead substitute placeholder values for these as-yet unknown values using its mock_outputs functionality.

Outputs

Terragrunt stores several useful values as outputs, which can be viewed using the terragrunt output command. For example, to configure your local terminal to use the cluster's kubeconfig file, run the following commands:

cd REPO_PATH/infra-iac/rke2-cluster
KUBECONFIG_URL=$(terragrunt output -raw kubeconfig_url)
aws s3 cp $KUBECONFIG_URL ~/.kube/config

This will allow you to run kubectl commands in the context of the cluster.

If multiple Terragrunt modules have been deployed, as for a collaboration environment, the outputs for all of them can be combined using a command, such as:

cd REPO_PATH/infra-iac
terragrunt run-all output -json | jq -n 'reduce inputs as $i ({}; . * $i)' > outputs.json
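
The jq filter in this command deep-merges a stream of JSON objects into one. The sketch below isolates that filter so its behavior can be tried on small inputs. The helper name and the sample objects are made up for illustration, and jq is assumed to be installed.

```shell
#!/bin/bash
# Deep-merge a stream of JSON objects read from stdin, using the same jq
# filter as the command above. Nested objects are merged recursively, and
# later objects win when scalar keys conflict.
merge_json_stream() {
  jq -n -c 'reduce inputs as $i ({}; . * $i)'
}

# Two made-up module outputs (assumptions for illustration only):
#   printf '%s\n' '{"a":{"x":1}}' '{"a":{"y":2},"b":3}' | merge_json_stream
#   prints {"a":{"x":1,"y":2},"b":3}
```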

Terragrunt also creates files (by default in the infra-iac/outputs directory) that match several of the Terraform outputs. These files are consumed during the deployment of the Structsure package.

Known Issues with Collab Cluster IAC

Terragrunt run-all commands did not work in IL4; therefore, each module and workspace had to be created individually. This was done manually by running terragrunt init, terragrunt workspace select -or-create, terragrunt plan, and terragrunt apply for each applicable module within structsure-aws-iac/infra-iac/modules.
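
The per-module sequence can be scripted so that it stops at the first failure. The following is a sketch under assumptions, not the procedure that was actually used in IL4: the helper name is hypothetical, the module paths are placeholders, and terragrunt init may need the same offline flags shown earlier for the rke2-cluster module.

```shell
#!/bin/bash
# Run init, workspace selection, plan, and apply in each module directory,
# one module at a time. Hypothetical helper; list rke2-cluster before the
# add-on modules, since they depend on it.
apply_modules_individually() {
  workspace="$1"; shift
  for module_dir in "$@"; do
    echo "==> ${module_dir}"
    (
      cd "${module_dir}" || exit 1
      terragrunt init &&
        terragrunt workspace select -or-create "${workspace}" &&
        terragrunt plan &&
        terragrunt apply
    ) || return 1
  done
}

# Example (paths are placeholders):
#   apply_modules_individually "$WORKSPACE_NAME" \
#     /<path>/structsure-aws-iac/infra-iac/rke2-cluster \
#     /<path>/structsure-aws-iac/infra-iac/modules/<module>
```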

Within IL4, we needed to set compatibility_mode=true in infra-iac/envs/prod-env-collab.hcl to prevent Terragrunt S3 bucket errors.

After terragrunt apply completes, run the following to merge the Big Bang outputs into a single bigbang-secrets.yaml and a single bigbang-values.yaml:

export CI_PROJECT_DIR=/opt/repos/structsure-aws-iac
yq eval-all '. as $i ireduce ({}; . * $i)' \
  ${CI_PROJECT_DIR}/infra-iac/outputs/bigbang-secrets.yaml \
  $(find ${CI_PROJECT_DIR}/infra-iac/outputs/ -name 'bigbang-secrets-*.yaml') \
  > ${CI_PROJECT_DIR}/infra-iac/outputs/bigbang-secrets.yaml

And

yq eval-all '. as $i ireduce ({}; . * $i)' \
  ${CI_PROJECT_DIR}/infra-iac/files/bigbang-values.yaml \
  $(find ${CI_PROJECT_DIR}/infra-iac/outputs/ -name 'bigbang-values-*.yaml') \
  > ${CI_PROJECT_DIR}/infra-iac/outputs/bigbang-values.yaml

Deploy the Structsure Zarf Package

Gitea, used by Zarf, needs a storage class to create a persistent volume. To create an EBS-backed storage class marked as default (for Gitea to use it), apply the example storage class that is located at: /structsure-aws-iac/infra-iac/files/aws-ebs-storageclass.yaml.

After the cluster is successfully deployed, change to the directory that contains the Zarf packages to begin deploying Zarf. Zarf creates a large number of temporary files, so the --tmpdir flag is used to ensure there is enough room. Specify a tmpdir with at least 10-15 GB of free space.

Run the following to initialize Zarf:

ZARF_CONFIG=/<absolute_path>/structsure-aws-iac/infra-iac/outputs/zarf-init-config.yaml zarf init --components git-server --log-level debug --tmpdir=/home/rocky/data --confirm

See the Configuration File section of the Structsure Installation Instructions document for notes on configuring Zarf. The configuration should include a valid path to bigbang-values.yaml in the bigbang_values_file field. bigbang-values.yaml is the output file created by the Structsure AWS IaC and assembled with either the yq or the jq command. If necessary, also specify the domain name and valid customer certificates within the zarf-config.yaml.

To deploy Zarf, run:

ZARF_CONFIG=/<path>/structsure-aws-iac/infra-iac/files/zarf-config.yaml zarf package deploy zarf-package-structsure-enterprise-amd64-v5.0.0-rc.3.tar.zst --log-level debug --confirm --tmpdir=/home/rocky/data

You may need to give the absolute path to your bigbang-values.yaml within your zarf-config.yaml.

After you start the deployment, Zarf may ask the following questions. Answer them as shown:

Are you sure you want to deploy the following zarf package? Y
Do you want to deploy xrd-snapshotter-crds? N