Help:User journeys/Canasta on AWS EKS with RDS

From Canasta Wiki

Journey info  ·  Platform: AWS EKS / Amazon RDS MariaDB  ·  Time: ~1 hour wall-clock (most of it waiting for AWS to provision)

This journey walks through deploying a Canasta wiki on Amazon EKS (managed Kubernetes) using Amazon RDS MariaDB as the external database. It targets operators who want a fully managed AWS setup — AWS handles the K8s control plane, the database, and the underlying VPC/networking; you operate the application. It's a worked example, not the only way; for the conceptual material on external databases see Help:External database, and for multi-node K8s requirements see Help:Multi-node Kubernetes.

The walkthrough has three independent phases:

  1. EKS — provision a managed Kubernetes cluster
  2. RDS — provision a managed MariaDB database in the same VPC
  3. Canasta — point Canasta-CLI at the cluster and database, deploy a wiki

Each phase is self-contained and runs in roughly 10–20 minutes. End-to-end the whole thing takes about an hour, of which most is waiting for AWS resources to provision.

Prerequisites

  • An AWS account with permissions to create EKS, EC2, RDS, IAM, and VPC resources. For testing, an IAM user with the AdministratorAccess managed policy is the simplest. Production deployments should use a least-privilege role.
  • AWS credits or a billing setup. The journey provisions an EKS control plane, two EC2 worker nodes, and an RDS MariaDB instance — all of which bill while running. Check current AWS pricing for your region before starting; production sizing scales these significantly.
  • An operator workstation (see "Operator workstation: testing vs. production" below) with:
    • aws CLI v2
    • eksctl
    • kubectl
    • helm
    • Canasta-CLI installed in canasta-native mode. EKS authentication relies on the aws eks get-token exec block in the kubeconfig, and the canasta-docker image doesn't include the AWS CLI. For other K8s flavors with static-auth kubeconfigs (k3s, Docker Desktop, kind), either CLI mode works.
  • An AWS profile configured with admin credentials. This guide assumes the profile is named canasta-personal; substitute your own.

Operator workstation: testing vs. production

This walkthrough assumes you're running the commands from a personal laptop or workstation. That's fine for testing, evaluation, and learning — and the rest of this guide reads as if "laptop" is the operator machine.

For a real production deployment, you'd typically run the operator commands from a more durable machine: a small EC2 instance in the same VPC as the cluster, a CI/CD runner, a long-lived shared admin workstation, or similar. Two reasons:

  • State durability: Canasta-CLI keeps per-instance state (.env, values.yaml, gitops repo state if you use it) at ~/canasta/<instance-id>/ on the operator machine. If your laptop dies, that state goes with it. The cluster keeps running, but you'd need to recover state from a backup or gitops repo to issue further commands. A stable operator host avoids that recovery.
  • Network proximity: An operator host inside the VPC has direct private-subnet access to RDS, so commands that briefly need DB connectivity (less common, but possible during certain maintenance flows) just work without bastions or VPC tunnels.

For everything in this walkthrough, both modes work — the steps are identical. Just substitute "the operator machine" for "your laptop" mentally if you're on a stable controller, and remember to back up ~/canasta/<id>/ either way.
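
One simple approach, in line with the tar-snapshot suggestion later in this guide (substitute your instance ID once it exists):

tar czf canasta-state-$(date +%Y%m%d).tar.gz -C ~ canasta/<instance-id>
# Copy the archive somewhere off the operator machine (S3, another host, etc.).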

Pre-flight checks

Before you start, confirm your AWS auth, Helm version, and Ansible collections are in good shape.

# Verify auth before starting
AWS_PROFILE=canasta-personal aws sts get-caller-identity
# Should print your IAM user ARN and account ID.

If you already have Helm installed and it's older than v4, update it first (canasta-CLI is currently tested against Helm 4):

brew upgrade helm    # or your distro's equivalent
helm version --short
# Expect v4.x.y

A fresh canasta-docker install or get-canasta.sh bootstrap takes care of the matching Ansible collection automatically. If you set up the canasta-CLI checkout manually (just pip install -r requirements.txt without the corresponding ansible-galaxy collection install -r requirements.yml), run that explicitly:

.venv/bin/ansible-galaxy collection install -r requirements.yml --upgrade

If that errors with SSL certificate-verify failures (Python's bundled certs vs. macOS's keychain), point it at the certifi bundle:

CERTIFI=$(.venv/bin/python -c 'import certifi; print(certifi.where())')
SSL_CERT_FILE="$CERTIFI" .venv/bin/ansible-galaxy collection install -r requirements.yml --upgrade

ℹ️ Note: Architecture: Your operator workstation (laptop or stable controller, see above) is not the Kubernetes control plane and does not run any worker nodes. EKS hosts the control plane in AWS-managed infrastructure (multi-AZ, auto-healed). Worker nodes are EC2 instances in your own VPC. kubectl and helm are HTTPS clients pointed at the EKS API endpoint. If your operator workstation shuts down, the cluster keeps running, pods keep serving traffic, and scheduled CronJobs (like Canasta's backup) keep firing — you just can't issue new commands until it comes back up.


Phase 1: Provision the EKS cluster

EKS is a managed Kubernetes control plane. AWS runs the control plane (API server, etcd, scheduler) inside their own VPC. You only have to provide the worker nodes, the VPC, and the IAM roles. eksctl automates the boilerplate.

Decisions to make first

  • Region: us-east-1, or wherever your other resources live. Same-region traffic between EKS, RDS, and S3 is free and lower latency.
  • Cluster name: canasta-test in this guide; your choice. Used in resource names and the kubectl context.
  • Kubernetes version: latest stable supported by AWS (1.31 at time of writing). EKS supports a rolling window of versions.
  • Node type: c7i-flex.large for testing. 2 vCPU / 4 GB; available on accounts where t3.medium is gated by a Free-Tier-only billing restriction (see note below).
  • Node count: 2, with autoscaling 1–3. Matches Canasta's default 2-node K8s topology.
  • Node group type: managed. AWS handles AMI updates and node lifecycle.

Free-Tier instance-type gotcha: Some AWS accounts (particularly newer ones still in their first 12-month Free Tier window, or accounts with billing locked to free-tier-only resources) reject paid-tier instance types like t3.medium with the error The specified instance type is not eligible for Free Tier. The CloudFormation node-group stack rolls back, the cluster stack stays up, and you have to clean up manually before retrying. c7i-flex.large typically works on those accounts; if it doesn't either, query the available types with aws ec2 describe-instance-types --filters Name=free-tier-eligible,Values=true --query 'InstanceTypes[].InstanceType' --output table.

If you hit this and need to retry:

  1. aws cloudformation update-termination-protection --stack-name eksctl-canasta-test-nodegroup-<id> --no-enable-termination-protection
  2. aws cloudformation delete-stack ... (same stack)
  3. eksctl create nodegroup --cluster canasta-test --name ng-flex --node-type c7i-flex.large --nodes 2 --managed

Provision

One command provisions the VPC, EKS control plane, IAM roles, and managed node group:

AWS_PROFILE=canasta-personal eksctl create cluster \
  --name canasta-test \
  --region us-east-1 \
  --version 1.31 \
  --node-type c7i-flex.large \
  --nodes 2 \
  --nodes-min 1 \
  --nodes-max 3 \
  --managed \
  --tags 'project=canasta-test,owner=<you>,env=ephemeral'

This runs for about 15–20 minutes. While it runs, you can prep RDS in parallel — see Phase 2.

Behind the scenes

eksctl creates two CloudFormation stacks:

  • eksctl-<name>-cluster — VPC, subnets, IAM roles, EKS control plane (~12 min)
  • eksctl-<name>-nodegroup-<id> — managed node group, EC2 instances (~5 min after the cluster stack)

Watch progress:

AWS_PROFILE=canasta-personal aws cloudformation list-stacks \
  --region us-east-1 \
  --stack-status-filter CREATE_IN_PROGRESS CREATE_COMPLETE \
  --query 'StackSummaries[?contains(StackName, `canasta-test`)].[StackName,StackStatus]' \
  --output table

Verify

eksctl create cluster writes a kubeconfig context for the new cluster, but if the cluster name was used previously on this machine (e.g., recreating after a teardown) the stored API endpoint can be stale. Refresh it explicitly:

AWS_PROFILE=canasta-personal aws eks update-kubeconfig \
  --region us-east-1 \
  --name canasta-test

Then verify:

AWS_PROFILE=canasta-personal kubectl get nodes
# Expect 2 nodes in Ready state.

AWS_PROFILE=canasta-personal kubectl get pods -A
# Expect kube-system pods (coredns, kube-proxy, aws-node) all Running.

Enable OIDC + EBS CSI driver

EKS clusters provisioned via eksctl create cluster don't ship with the AWS EBS CSI driver active by default for K8s 1.31+. Without it, PVCs against the default gp2 storage class never bind, pods stall in Pending, and canasta create times out at the web rollout. This is a one-time setup per cluster:

# 1. Enable IAM OIDC provider on the cluster (required for IRSA)
AWS_PROFILE=canasta-personal eksctl utils associate-iam-oidc-provider \
  --region us-east-1 \
  --cluster canasta-test \
  --approve

# 2. Create an IAM role for the EBS CSI controller's service account
AWS_PROFILE=canasta-personal eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster canasta-test \
  --region us-east-1 \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --role-only \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve

# 3. Install the EBS CSI driver as an EKS addon, bound to the role
ROLE_ARN=$(AWS_PROFILE=canasta-personal aws iam get-role \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --query 'Role.Arn' --output text)
AWS_PROFILE=canasta-personal aws eks create-addon \
  --cluster-name canasta-test \
  --region us-east-1 \
  --addon-name aws-ebs-csi-driver \
  --service-account-role-arn "$ROLE_ARN"

# 4. Wait for the addon to become ACTIVE (~2 min)
AWS_PROFILE=canasta-personal aws eks wait addon-active \
  --cluster-name canasta-test \
  --region us-east-1 \
  --addon-name aws-ebs-csi-driver

Verify provisioning works:

AWS_PROFILE=canasta-personal kubectl get csidriver
# Expect: ebs.csi.aws.com (with TRUE under ATTACHREQUIRED)

Install the NGINX Ingress controller

canasta create --ingress-class nginx requires an NGINX Ingress controller in the cluster. EKS doesn't ship one — install it via Helm:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --wait

The chart's default service type is LoadBalancer, which on EKS provisions an AWS Network Load Balancer (NLB) automatically. Verify:

kubectl get svc -n ingress-nginx ingress-nginx-controller
# EXTERNAL-IP will populate with an NLB DNS name like
# a1b2c3-...elb.us-east-1.amazonaws.com once provisioning completes (~2 min).

Point your wiki's DNS (CNAME) at that NLB hostname before — or shortly after — running canasta create. The create command doesn't block on DNS, but Let's Encrypt certificate issuance only succeeds once the DNS is in place.
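
If your DNS zone lives in Route 53, the CNAME can be set from the CLI as well; a sketch, with the zone ID, wiki FQDN, and NLB hostname as placeholders:

AWS_PROFILE=canasta-personal aws route53 change-resource-record-sets \
  --hosted-zone-id <your-hosted-zone-id> \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "<wiki-fqdn>",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "<nlb-hostname>"}]
      }
    }]
  }'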

ℹ️ Note: Re-running on a domain you've used before: Each helm install ingress-nginx provisions a fresh load balancer with a brand-new hostname. If you tear down and re-run this walkthrough, the previous NLB is deleted but any CNAME pointing at it goes stale — and dig +short <wiki-fqdn> may still return the dead hostname. cert-manager's HTTP-01 challenge then sits in pending indefinitely because Let's Encrypt can't reach the new ingress. Always update your CNAME to today's NLB hostname before expecting a cert to issue. Symptom: kubectl get challenge -n canasta-<id> shows a long-pending challenge; dig +short <wiki-fqdn> returns a hostname that no longer resolves to any IP.

Capture VPC info for the next phase

Both RDS and Canasta need the VPC ID and subnet IDs:

AWS_PROFILE=canasta-personal aws cloudformation describe-stacks \
  --stack-name eksctl-canasta-test-cluster \
  --region us-east-1 \
  --query 'Stacks[0].Outputs[?OutputKey==`VPC` || OutputKey==`SubnetsPrivate` || OutputKey==`SharedNodeSecurityGroup`].[OutputKey,OutputValue]' \
  --output table

You'll get values for:

  • VPC — the VPC ID, e.g. vpc-0bfff10b7f4ac9ee4
  • SubnetsPrivate — two subnet IDs (in different AZs), needed for RDS
  • SharedNodeSecurityGroup — the SG attached to all worker nodes; RDS will allow ingress from this SG

Save these somewhere. Phase 2 needs them.
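
One convenient way to keep them on hand is to export them in the shell session you'll keep using (the Phase 2 commands spell out the <...> placeholders; substitute whichever form you prefer):

export VPC_ID=<VPC>
export SUBNET_A=<SubnetsPrivate-1>
export SUBNET_B=<SubnetsPrivate-2>
export NODE_SG=<SharedNodeSecurityGroup>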


Phase 2: Provision RDS MariaDB

RDS is AWS-managed MariaDB. We deploy it inside the EKS VPC's private subnets so cluster pods can reach it without going over the public internet. RDS handles backups, patching, replicas, etc.

Decisions

  • Engine: mariadb. What Canasta uses elsewhere.
  • Engine version: 10.11.16. Matches the in-cluster Canasta DB version and keeps SQL behavior consistent.
  • Instance class: db.t3.micro for testing. Smallest available; cheap to run for an evaluation. Production wikis should use db.t3.small or larger.
  • Allocated storage: 20 GB gp2. Smallest; easy to grow later.
  • Multi-AZ: no, for testing. Doubles the cost; production should turn this on.
  • Public access: no. Private subnets only; reachable from EKS pods.
  • Backups: 0-day retention for testing. Production should be 7+ days.

Step 1: Create a DB subnet group

RDS requires at least two subnets in different availability zones, even for single-AZ deployments. We re-use the EKS private subnets:

AWS_PROFILE=canasta-personal aws rds create-db-subnet-group \
  --region us-east-1 \
  --db-subnet-group-name canasta-test-rds-sng \
  --db-subnet-group-description "Canasta test RDS subnet group on EKS private subnets" \
  --subnet-ids <SubnetsPrivate-1> <SubnetsPrivate-2> \
  --tags Key=project,Value=canasta-test

Substitute the two SubnetsPrivate values from Phase 1.

ℹ️ Note: Avoid em-dashes (—) in any AWS resource description. The RDS API rejects them as "non-printable control characters." Use plain hyphens.

Step 2: Create the RDS security group

A new SG, allowing MariaDB (port 3306) from the EKS workers' SG.

RDS_SG=$(AWS_PROFILE=canasta-personal aws ec2 create-security-group \
  --region us-east-1 \
  --group-name canasta-test-rds-sg \
  --description "Canasta test RDS - allow MariaDB from EKS workers" \
  --vpc-id <VPC> \
  --tag-specifications 'ResourceType=security-group,Tags=[{Key=project,Value=canasta-test}]' \
  --query 'GroupId' --output text)

SG selection gotcha: EKS exposes two cluster-related security groups in its CloudFormation outputs:

  • SharedNodeSecurityGroup — the SG eksctl creates for self-managed node groups
  • ClusterSecurityGroupId — the SG EKS itself manages, attached to managed node group instances by default

For a managed node group (which is what eksctl create cluster --managed produces), the workers actually wear ClusterSecurityGroupId, not SharedNodeSecurityGroup. RDS ingress must allow that SG. Allowing SharedNodeSecurityGroup alone results in nc: timeout from pods trying to reach RDS — DNS resolves, but TCP connection times out.

Confirm what's attached to your nodes before configuring RDS ingress:

AWS_PROFILE=canasta-personal aws ec2 describe-instances \
  --region us-east-1 \
  --filters "Name=tag:eks:cluster-name,Values=canasta-test" \
  --query 'Reservations[].Instances[].SecurityGroups[*].GroupId' \
  --output table

Allow ingress from both SGs to be safe:

for SG in <SharedNodeSecurityGroup> <ClusterSecurityGroupId>; do
  AWS_PROFILE=canasta-personal aws ec2 authorize-security-group-ingress \
    --region us-east-1 \
    --group-id "$RDS_SG" \
    --protocol tcp \
    --port 3306 \
    --source-group "$SG"
done

This is a security-group-to-security-group reference — any EC2 instance wearing one of those SGs (which means any EKS worker, regardless of which SG actually got attached) can reach the RDS instance on port 3306. You don't have to enumerate IPs.
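
To double-check what the rule actually references, a read-only query (assumes $RDS_SG is still set from the create step above):

AWS_PROFILE=canasta-personal aws ec2 describe-security-groups \
  --region us-east-1 \
  --group-ids "$RDS_SG" \
  --query 'SecurityGroups[0].IpPermissions[].UserIdGroupPairs[].GroupId' \
  --output table
# Expect both SG IDs you authorized above.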

Step 3: Create the DB instance

Generate a strong password and create the instance:

RDS_PWD=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 24)
echo "$RDS_PWD"  # save this somewhere safe; you'll need it for Canasta

AWS_PROFILE=canasta-personal aws rds create-db-instance \
  --region us-east-1 \
  --db-instance-identifier canasta-test-mariadb \
  --db-instance-class db.t3.micro \
  --engine mariadb \
  --engine-version 10.11.16 \
  --master-username canasta \
  --master-user-password "$RDS_PWD" \
  --allocated-storage 20 \
  --storage-type gp2 \
  --db-name main \
  --vpc-security-group-ids "$RDS_SG" \
  --db-subnet-group-name canasta-test-rds-sng \
  --no-publicly-accessible \
  --no-multi-az \
  --backup-retention-period 0 \
  --no-deletion-protection \
  --tags Key=project,Value=canasta-test

Provisioning takes 5–10 minutes. Watch:

AWS_PROFILE=canasta-personal aws rds describe-db-instances \
  --region us-east-1 \
  --db-instance-identifier canasta-test-mariadb \
  --query 'DBInstances[0].DBInstanceStatus' \
  --output text
# "creating" → "backing-up" → "available"

Capture the endpoint

When the DB is available:

AWS_PROFILE=canasta-personal aws rds describe-db-instances \
  --region us-east-1 \
  --db-instance-identifier canasta-test-mariadb \
  --query 'DBInstances[0].Endpoint.Address' \
  --output text
# Returns something like: canasta-test-mariadb.<id>.us-east-1.rds.amazonaws.com

Save this hostname. Phase 3 needs it.
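
Optionally, smoke-test connectivity from inside the cluster right away rather than waiting for Phase 3; this is the same idea as the mariadb one-liner under Troubleshooting (substitute the endpoint and the password you saved):

kubectl run rds-check --image=mariadb:10.11 --rm -it --restart=Never -- \
  mariadb -h <RDS-endpoint> -u canasta -p<RDS-password> --connect-timeout=10 main -e 'SELECT 1'
# A returned 1 means the security-group plumbing from Step 2 works; a timeout means revisit it.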


Phase 3: Deploy Canasta

Canasta-CLI runs on your laptop (or any operator workstation with kubectl access). It uses the local kubectl context to drive the cluster.

Architecture recap

For self-managed K8s (k3s), Canasta installs the cluster on a target host via SSH and stores instance state on that host. For EKS, the cluster already exists and you talk to it via kubectl over HTTPS. So:

  • No --host flag. Canasta runs locally; instance_path defaults to ~/canasta/<instance-id>/ on the laptop.
  • Local kubectl/helm drive the cluster.
  • The current kubectl context determines which cluster Canasta deploys into. Canasta has no --cluster or --context flag; whatever kubectl config current-context returns at the moment you run canasta create is where the wiki ends up. If you have multiple contexts, take a moment to verify the right one is current — it's an easy mis-deploy otherwise.
  • <instance-id>-db-credentials Secret in the cluster carries the RDS password.

Instance state files (.env, values.yaml, settings) live at ~/canasta/<instance-id>/ on the operator workstation. Back this directory up regardless of which mode you're in (simple tar snapshot or canasta gitops init to push state to a git repo). For the testing-vs-production split, see the "Operator workstation" section in the Prerequisites.

Step 1: Set the kubectl context to the EKS cluster

eksctl create cluster writes a kubeconfig context for the new cluster, but if you already had other contexts (docker-desktop, your work cluster, etc.), it does not automatically switch to the new one. Run this explicitly so the next canasta command targets EKS:

# Adds (or refreshes) the EKS context in ~/.kube/config and sets it as current
AWS_PROFILE=canasta-personal aws eks update-kubeconfig \
  --region us-east-1 \
  --name canasta-test

Verify:

kubectl config current-context
# Expect: arn:aws:eks:us-east-1:<account-id>:cluster/canasta-test

kubectl get nodes
# Expect 2 nodes in Ready state.

If current-context shows something else (e.g. docker-desktop), switch explicitly:

kubectl config use-context arn:aws:eks:us-east-1:<account-id>:cluster/canasta-test

ℹ️ Note: Foot-gun warning: Anything that uses the local kubectl context — canasta create, canasta backup, kubectl apply, helm upgrade — runs against whatever context is current. If you switch contexts mid-session (e.g. to debug a different cluster), make sure you switch back before issuing more canasta commands. A short prompt-line helper that displays the current context (e.g. via kube-ps1) is worth its weight here.

Step 2: Build the Canasta envfile

Create a file with the RDS credentials and any other instance settings. Restic backup credentials (S3, etc.) can also go here:

cat > /tmp/eks-test.env <<EOF
USE_EXTERNAL_DB=true
MYSQL_HOST=<RDS-endpoint-from-phase-2>
MYSQL_USER=canasta
MYSQL_PASSWORD=<RDS-password-you-saved>

# Optional: Restic on S3 for backups
RESTIC_REPOSITORY=s3:s3.amazonaws.com/<your-backup-bucket>/ekstest
RESTIC_PASSWORD=<choose-a-restic-password>
AWS_ACCESS_KEY_ID=<an-access-key-with-bucket-perms>
AWS_SECRET_ACCESS_KEY=<the-secret>
EOF
chmod 600 /tmp/eks-test.env

The AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY here should be a separate IAM user scoped to ONLY the backup bucket — least privilege. Don't reuse your admin credentials.
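
A sketch of what such a scoped policy could look like; the policy name is arbitrary and the bucket name is a placeholder (attach the resulting policy to the backup-only IAM user):

AWS_PROFILE=canasta-personal aws iam create-policy \
  --policy-name canasta-backup-bucket-only \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      { "Effect": "Allow",
        "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
        "Resource": "arn:aws:s3:::<your-backup-bucket>" },
      { "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        "Resource": "arn:aws:s3:::<your-backup-bucket>/*" }
    ]
  }'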

Step 3: Create the instance

Confirm the kubectl context one more time before running this — it determines where the wiki ends up:

kubectl config current-context
# Should show: arn:aws:eks:us-east-1:<account-id>:cluster/canasta-test

Storage class gotcha: Canasta-CLI persists a defaultStorageClass setting in its registry (~/Library/Application Support/canasta/conf.json on macOS, ~/.config/canasta/conf.json elsewhere). If the same laptop has previously been used against a k3s cluster, the registry likely has "defaultStorageClass": "local-path" (k3s default). EKS doesn't have a local-path storage class — its default is gp2. PVC binding silently fails, pods stall in Pending, and canasta create times out at Waiting for deployment "canasta-<id>-web" rollout to finish.

Fix either by passing --storage-class gp2 explicitly to canasta create, or by editing the registry once for this controller:

python3 -c "
import json, os
# Registry path on macOS; on Linux use ~/.config/canasta/conf.json instead
p = os.path.expanduser('~/Library/Application Support/canasta/conf.json')
with open(p) as f: cfg = json.load(f)
# Set the default storage class Canasta-CLI passes to new instances
cfg.setdefault('Settings', {})['defaultStorageClass'] = 'gp2'
with open(p, 'w') as f: json.dump(cfg, f, indent=2)
"

Then create:

canasta create \
  -i ekstest \
  --wiki main \
  --orchestrator k8s \
  -e /tmp/eks-test.env \
  -n ekstest.example.com \
  --ingress-class nginx \
  --tls-email <your-email> \
  --staging-certs \
  --storage-class gp2

Substitute the domain name with whatever DNS you'll point at the NGINX Ingress NLB. Use --staging-certs for the first run — Let's Encrypt has rate limits on production certs, and staging issues are unlimited (the cert will be browser-untrusted but proves the issuance flow works). Switch to production by re-running without --staging-certs once you've confirmed the chain.

What this does:

  1. Generates passwords (admin, MW secret key) and writes ~/canasta/ekstest/.env on your laptop.
  2. Creates K8s Secrets in the cluster: ekstest-db-credentials (with the RDS password from your envfile) and ekstest-mw-secrets.
  3. Renders Helm values (values.yaml) and applies the Canasta Helm chart to namespace canasta-ekstest.
  4. Deploys web, jobrunner, varnish, caddy pods. Skips the bundled db StatefulSet since USE_EXTERNAL_DB=true.
  5. Waits for the web pod to become ready and runs MediaWiki's install.php against the RDS instance.

Step 4: Verify

canasta list
# ekstest should appear with KUBERNETES / RUNNING

kubectl get pods -n canasta-ekstest
# Expect: caddy, varnish, web, jobrunner all Running

DNS-wise, the wiki's public address is the NGINX Ingress NLB from Phase 1; point the domain you passed via -n at that hostname (see "Install the NGINX Ingress controller"). For testing without DNS:

kubectl port-forward -n canasta-ekstest svc/canasta-ekstest-caddy 8080:80
# Then http://localhost:8080

Step 5 (optional): Set up scheduled backups

If you put Restic credentials in the envfile in step 2, the backup-env Secret is already created in the cluster. Now schedule the backups:

canasta backup init -i ekstest    # creates the Restic repo at the S3 path
canasta backup schedule set -i ekstest '*/30 * * * *'   # every 30 min

A K8s CronJob is created in the namespace. Each firing dumps the wiki database from RDS into the snapshot, captures the canasta-managed Secrets for disaster recovery, and pushes everything to S3.
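
You can confirm the CronJob exists before the first firing (the exact object name may vary by CLI version):

kubectl get cronjob -n canasta-ekstest
# Expect one CronJob with SCHEDULE */30 * * * *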

Verify after the next cron firing:

canasta backup list -i ekstest
# Should show one or more snapshots.

canasta backup files -i ekstest --snapshot <id>
# Confirms /currentsnapshot/config/backup/db_*.sql and secrets-ekstest.yaml are present.


Cleanup

These resources cost money. When you're done testing, tear them down:

# Canasta instance + namespace + Helm release + K8s Secrets
canasta delete -i ekstest

# RDS (no final snapshot, since this is a test)
AWS_PROFILE=canasta-personal aws rds delete-db-instance \
  --region us-east-1 \
  --db-instance-identifier canasta-test-mariadb \
  --skip-final-snapshot

# Wait for RDS deletion to complete (~5 min)
AWS_PROFILE=canasta-personal aws rds wait db-instance-deleted \
  --region us-east-1 \
  --db-instance-identifier canasta-test-mariadb

# RDS subnet group (must come after DB deletion)
AWS_PROFILE=canasta-personal aws rds delete-db-subnet-group \
  --region us-east-1 \
  --db-subnet-group-name canasta-test-rds-sng

# RDS security group (must come after DB deletion)
AWS_PROFILE=canasta-personal aws ec2 delete-security-group \
  --region us-east-1 \
  --group-id <RDS_SG>
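
If you installed the NGINX Ingress controller in Phase 1, uninstall it before deleting the cluster so AWS releases the NLB and its ENIs first; this sidesteps the stranded-ENI failure described under Troubleshooting:

# NGINX Ingress controller (and the NLB it provisioned)
helm uninstall ingress-nginx -n ingress-nginx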

# EKS cluster + node group + VPC + everything eksctl created
AWS_PROFILE=canasta-personal eksctl delete cluster \
  --name canasta-test \
  --region us-east-1

EKS cluster deletion takes 10–15 minutes. It tears down the node group, control plane, NAT gateway, and VPC.

ℹ️ Note: Don't forget S3. If you're done with the backup bucket and don't need the snapshots, delete it via console or CLI. S3 storage is cheap but not free; an idle bucket with snapshots will still bill a few cents per month.

ℹ️ Note: Rotate IAM access keys that you no longer need. Test-scoped access keys should be deleted from IAM rather than left active.

Production considerations

This guide produces a *test* deployment. Before running production traffic, consider:

  • Multi-AZ for RDS. Single-AZ has a multi-hour RTO if the AZ fails.
  • Larger node types. The testing-sized nodes used here (2 vCPU / 4 GB class, e.g. t3.medium or c7i-flex.large) are fine for low-traffic wikis. Heavier workloads need t3.large or beyond.
  • Cluster autoscaler. EKS managed node groups can autoscale, but the autoscaler addon needs to be installed.
  • IAM Roles for Service Accounts (IRSA) instead of static AWS access keys for the backup containers. Cleaner credential lifecycle.
  • canasta gitops init for declarative cluster state in a git repo. Survives laptop loss.
  • Backup destination outside this AWS account. Same-account backups are convenient but don't survive an account compromise.
  • Cost alerts. AWS billing alarms below your credit budget save you from surprises.

Troubleshooting

eksctl fails partway

eksctl uses CloudFormation under the hood. Check the failed stack in the AWS Console for the underlying error. Common causes: VPC quota exceeded, Elastic IP quota, IAM permission gaps. Delete the failed stack manually before retrying.
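
For example, deleting a failed node-group stack from the CLI (the stack name comes from the console or the list-stacks command in Phase 1):

AWS_PROFILE=canasta-personal aws cloudformation delete-stack \
  --region us-east-1 \
  --stack-name <failed-stack-name>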

Pods on EKS can't reach RDS

Verify the security group ingress rule on the RDS security group lists the EKS worker security group as the source. From within a cluster pod:

kubectl run -n canasta-ekstest mariadb-test \
  --image=mariadb:10.11 \
  --rm -it --restart=Never -- \
  mariadb -h <RDS-endpoint> -u canasta -p<password> main -e 'SELECT 1'

If this hangs, the SG ingress isn't right or the RDS subnet group isn't using the same VPC.

Canasta create succeeds but the wiki returns HTTP 500

Most likely the web pod can't reach RDS or the credentials are wrong. Check pod logs:

kubectl logs -n canasta-ekstest deploy/canasta-ekstest-web --tail=100

Look for MariaDB connection errors. Verify by exec'ing into the pod and testing the DB connection manually.
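
One way to do that from the web pod, assuming the image ships PHP's mysqli extension (MediaWiki's MariaDB support depends on it); the endpoint and password are placeholders:

kubectl exec -n canasta-ekstest -it deploy/canasta-ekstest-web -- \
  php -r 'var_dump(mysqli_connect("<RDS-endpoint>", "canasta", "<RDS-password>", "main"));'
# A mysqli object means connectivity and credentials are fine; a connection error points at the cause.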

EKS cluster deletion fails on the VPC

eksctl delete sometimes can't tear down the VPC because of stranded ENIs (typically from LoadBalancer services). Delete Services of type LoadBalancer first, wait a couple of minutes for AWS to release the ENIs, then re-run eksctl delete.
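
To find them:

kubectl get svc -A | grep LoadBalancer
# In this walkthrough the only one is ingress-nginx-controller; helm uninstall ingress-nginx -n ingress-nginx removes it.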

cert-manager challenge stays pending for the wiki domain

kubectl get challenge -n canasta-<id> shows a pending challenge for several minutes. Check whether the wiki's DNS still points at a previous test's NLB:

dig +short <wiki-fqdn>
kubectl get svc -n ingress-nginx ingress-nginx-controller

If the resolved hostname doesn't match the current EXTERNAL-IP, update your CNAME to today's NLB. cert-manager retries the challenge automatically; the cert typically issues within a minute or two of DNS propagation. (See the install-NGINX-Ingress section in Phase 1 for the NLB hostname rotation pattern.)