Help:External database

From Canasta Wiki

Canasta can be pointed at an existing external MySQL/MariaDB-compatible database server instead of running its bundled db container. This is the recommended path for production HA: it removes the database tier as a single point of failure, since a managed service (RDS Multi-AZ, Aurora, Cloud SQL, Azure Database for MySQL) or a self-hosted cluster (Galera or Percona XtraDB Cluster, typically fronted by MaxScale or ProxySQL) handles replication and failover.

When to use this

Use an external database when:

  • You already operate a managed database service for other applications and want Canasta to use it.
  • You need the wiki to survive failure of the node hosting Canasta — the bundled db container uses node-local storage, so a node failure on a single-node deployment, or on the DB-pinned node in a multi-node Kubernetes deployment, takes the wiki down regardless of how many web replicas are running.
  • You want maintenance runs (update.php and similar) to hit a database that outlives any single Canasta-managed DB pod, so they don't stall waiting for that pod to restart.

The default deployment runs one MariaDB container on local storage. That's fine for development, evaluation, and small single-node production deployments where the host's uptime is the wiki's uptime. For anything more demanding, an external database is the simplest HA win.

Prerequisites

On the database server:

  • MySQL 5.7+ or MariaDB 10.3+ (matches what the bundled db uses; older versions may work but are not tested).
  • Network reachability from the Canasta host (or, for Kubernetes, from the web and jobrunner pods).
  • A user account on the database server named root, with CREATE DATABASE and full GRANT privileges. Canasta connects to the database server as root at create time to run the install.php step that provisions the wiki's database and schema. For self-hosted MariaDB / MySQL this is usually already in place. For managed services (RDS, Aurora, Cloud SQL, Azure Database for MySQL) where the master user has a different name, create a dedicated root@'%' user with appropriate grants, e.g.:
CREATE USER 'root'@'%' IDENTIFIED BY 'STRONG_PASSWORD_HERE';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;

If your security policy doesn't allow WITH GRANT OPTION, narrow the grants to the database name(s) you'll create — the user must still be able to CREATE at least the wiki database(s) that install.php will provision.
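For example, scoped to a single planned wiki database — a sketch assuming the wiki database will be named mywiki:

```sql
-- Narrower alternative to the blanket grant above, scoped to one
-- database (mywiki is an assumed name; adjust to your wiki ID).
-- GRANT ALL at database scope includes the CREATE privilege, so
-- install.php can still run CREATE DATABASE mywiki.
CREATE USER 'root'@'%' IDENTIFIED BY 'STRONG_PASSWORD_HERE';
GRANT ALL PRIVILEGES ON `mywiki`.* TO 'root'@'%';
FLUSH PRIVILEGES;
```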

For TLS-required databases, make sure the client trusts the server's CA — Canasta uses the system CA bundle inside the container.

Setup

External-database configuration is supplied via an env file passed to canasta create with -e/--envfile. Pass the password through this file, not on the command line, so it doesn't leak through ps or shell history.

Create the env file

cat > /path/to/external-db.env <<'EOF'
USE_EXTERNAL_DB=true
MYSQL_HOST=db.example.internal
MYSQL_PORT=3306
MYSQL_USER=root
MYSQL_PASSWORD=STRONG_PASSWORD_HERE
MYSQL_SSL=false
EOF
chmod 600 /path/to/external-db.env

Keys, in detail:

  • USE_EXTERNAL_DB (required) — the flag that switches Canasta out of bundled-DB mode. If set to false or omitted, the other MYSQL_* keys are ignored and the bundled db container is used.
  • MYSQL_HOST (required when USE_EXTERNAL_DB=true) — hostname or IP of the database server. canasta create fails with a clear error if this isn't set.
  • MYSQL_PORT (default 3306) — port the database server listens on.
  • MYSQL_USER (default root) — username MediaWiki uses at runtime to connect to the database. The install-time connection always uses root (see Prerequisites), so in the common case where the same account handles both, leave this at the default.
  • MYSQL_PASSWORD (required) — password for the root account on the database server. Stored in the instance's .env file (Compose) or the <id>-db-credentials Secret (Kubernetes).
  • MYSQL_SSL (default false) — set to true to require TLS for connections to the database server.
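The required/optional semantics above can be summarized as a small validator. This is a hypothetical sketch of the behavior this page describes, not Canasta's actual code:

```python
# Sketch of the envfile key semantics described above (illustrative
# only; Canasta's real validation code may differ).
DEFAULTS = {"MYSQL_PORT": "3306", "MYSQL_USER": "root", "MYSQL_SSL": "false"}

def parse_envfile(text):
    """Parse KEY=VALUE lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def external_db_config(env):
    """Return the effective DB config, or None in bundled-DB mode."""
    if env.get("USE_EXTERNAL_DB", "false").lower() != "true":
        return None  # bundled db container; other MYSQL_* keys ignored
    if not env.get("MYSQL_HOST"):
        raise ValueError("USE_EXTERNAL_DB=true requires MYSQL_HOST")
    if not env.get("MYSQL_PASSWORD"):
        raise ValueError("MYSQL_PASSWORD is required")
    return {k: env.get(k, DEFAULTS.get(k)) for k in
            ("MYSQL_HOST", "MYSQL_PORT", "MYSQL_USER",
             "MYSQL_PASSWORD", "MYSQL_SSL")}

cfg = external_db_config(parse_envfile(
    "USE_EXTERNAL_DB=true\nMYSQL_HOST=db.example.internal\nMYSQL_PASSWORD=s3cret\n"))
print(cfg["MYSQL_USER"], cfg["MYSQL_PORT"])  # defaults applied: root 3306
```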

Create the instance

Pass the env file with -e:

canasta create \
  --id mywiki \
  --wiki main \
  --domain-name wiki.example.com \
  -e /path/to/external-db.env

For a Kubernetes deployment, also pass --orchestrator kubernetes and any storage / TLS flags needed for that environment (see Help:Multi-node Kubernetes).

What happens during create:

  • Canasta reads USE_EXTERNAL_DB from the env file. If true, it validates that MYSQL_HOST is also set and refuses to continue otherwise.
  • The bundled db service is not started. On Compose, this is achieved by setting COMPOSE_PROFILES to an empty value (the db service is gated behind an internal-db profile that's only enabled in bundled-DB mode). On Kubernetes, the generated values.yaml sets db.enabled: false and populates an externalDatabase block.
  • install.php runs against MYSQL_HOST:MYSQL_PORT, connecting as root with MYSQL_PASSWORD (the install-time connection described under Prerequisites), and creates the wiki database on the external server.
  • The web and jobrunner containers see USE_EXTERNAL_DB=true at runtime and skip the wait-for-local-db loop they use in bundled-DB mode.
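On Kubernetes, the generated values.yaml described above might look roughly like this. Field names here are illustrative assumptions — inspect the file Canasta actually generates for your instance for the exact schema:

```yaml
# Illustrative sketch only; key names are assumptions.
db:
  enabled: false          # bundled MariaDB disabled
externalDatabase:
  host: db.example.internal
  port: 3306
  user: root
  ssl: false
  # The password is delivered via the <id>-db-credentials Secret,
  # not via values.yaml.
```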

Verify

After canasta create finishes:

# The bundled db container/pod should NOT exist:
docker ps --filter "name=mywiki" --format '{{.Names}}' | grep db   # Compose: empty output (the db container would be mywiki_db_1 or mywiki-db-1, depending on Compose version)
kubectl get pod -n canasta-mywiki | grep -E '\bdb\b'   # K8s: empty output

# The wiki responds:
curl -I https://wiki.example.com/wiki/Main_Page  # 200 (after TLS provisioning)

You can also check that the wiki database was created on the external server:

SHOW DATABASES LIKE 'mywiki';
SELECT COUNT(*) FROM mywiki.user;   -- should return at least 1 (the wiki admin)

Wiki farms

The external-DB flow is per-instance, not per-wiki. All wikis in the same Canasta instance share the configured external database server (the same way they all share the bundled db container by default). Each wiki gets its own database on that shared server, named after the wiki — e.g. canasta add --id mywiki -w docs ... creates a docs database alongside the existing main database.

The MYSQL_USER needs CREATE DATABASE at the time canasta add runs, since the wiki-add flow also calls install.php.
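If you narrowed the grants as suggested under Prerequisites, extend them before adding a wiki. A sketch, assuming the new wiki will be named docs:

```sql
-- Assumes the narrowed-grant setup from Prerequisites and a new wiki
-- named docs (an illustrative name). GRANT ALL at database scope
-- includes CREATE, so install.php can still run CREATE DATABASE docs.
GRANT ALL PRIVILEGES ON `docs`.* TO 'root'@'%';
FLUSH PRIVILEGES;
```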

Read replicas

Canasta's external-DB envfile keys (MYSQL_HOST, MYSQL_USER, etc.) configure a single primary endpoint. That single-endpoint shape is sufficient for two common HA topologies:

  • Single-endpoint HA — failover and read routing happen below MediaWiki, so MW only ever sees one address. Examples: AWS RDS Multi-AZ (the writer endpoint, with replicas behind it), Aurora cluster endpoints, MaxScale or ProxySQL fronting a Galera or replication cluster, a load-balanced VIP in front of a MariaDB pair.
  • Application-blind clusters where every node accepts both reads and writes. Examples: a Galera/Percona XtraDB cluster behind a VIP, where MW sends every query to the VIP and the cluster handles consistency.

For both, the default Canasta plumbing is enough — point MYSQL_HOST at the cluster endpoint, and that's it.

Explicit primary + replica routing

If you want MediaWiki itself to route reads to specific replicas and writes to the primary (the pattern Wikimedia uses, and the only way to use lagged replicas safely), MediaWiki has its own configuration variable $wgDBservers that takes a list of servers with per-server load weights:

  • The first entry is the primary. Give it load = 0 so it receives writes (DB_PRIMARY) but no general read traffic.
  • load > 0 on replicas — they receive reads (DB_REPLICA), weighted by load value.

MediaWiki's load balancer routes queries by load weight; chronology protection (read-after-write consistency for the same user across replication lag) is handled in the session store.
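MediaWiki's actual load balancer does much more (lag checks, connection reuse, group loads), but the core weight-based routing can be sketched like this — an illustration of the concept, not MediaWiki's code:

```python
# Illustrative sketch of load-weight routing; hostnames are the same
# example names used in the $wgDBservers block below on this page.
import random

servers = [
    {"host": "primary.db.example.com",  "load": 0},    # writes only
    {"host": "replica1.db.example.com", "load": 100},
    {"host": "replica2.db.example.com", "load": 100},
]

def pick_primary(servers):
    """Writes (DB_PRIMARY) always go to the first entry."""
    return servers[0]

def pick_replica(servers, rng=random):
    """Weighted random choice among load>0 entries (DB_REPLICA reads)."""
    pool = [s for s in servers if s["load"] > 0]
    total = sum(s["load"] for s in pool)
    r = rng.uniform(0, total)
    for s in pool:
        r -= s["load"]
        if r <= 0:
            return s
    return pool[-1]

print(pick_primary(servers)["host"])  # primary.db.example.com
```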

MediaWiki precedence makes this work today with no Canasta code change. If $wgDBservers is defined, MediaWiki uses it and ignores $wgDBserver — so adding the array to a per-wiki Settings.php overrides CanastaBase's single-server default. The same configuration works on both Compose and Kubernetes (same image, same LocalSettings.php include flow).

Add a file like this at config/settings/wikis/<wiki-id>/Settings.php on the instance:

// Overrides CanastaBase's default single-server $wgDBserver.
// MediaWiki prefers $wgDBservers when it's set.
$wgDBservers = [
    // Primary (writes): load=0
    [
        'host'     => 'primary.db.example.com',
        'dbname'   => $wgDBname,
        'user'     => getenv( 'MYSQL_USER' ) ?: 'root',
        'password' => getenv( 'MYSQL_PASSWORD' ),
        'type'     => 'mysql',
        'load'     => 0,
        'flags'    => DBO_DEFAULT | DBO_TRX,
    ],
    // Replicas (reads): load > 0
    [
        'host'     => 'replica1.db.example.com',
        'dbname'   => $wgDBname,
        'user'     => getenv( 'MYSQL_USER' ) ?: 'root',
        'password' => getenv( 'MYSQL_PASSWORD' ),
        'type'     => 'mysql',
        'load'     => 100,
        'flags'    => DBO_DEFAULT | DBO_TRX,
    ],
    [
        'host'     => 'replica2.db.example.com',
        'dbname'   => $wgDBname,
        'user'     => getenv( 'MYSQL_USER' ) ?: 'root',
        'password' => getenv( 'MYSQL_PASSWORD' ),
        'type'     => 'mysql',
        'load'     => 100,
        'flags'    => DBO_DEFAULT | DBO_TRX,
    ],
];

A few things to know:

  • MYSQL_HOST must still point at the primary. Canasta uses the single-endpoint env var for install.php, the wait_for_db init container on Kubernetes, and any maintenance scripts that exec into the web pod and run mariadb-dump. Those flows are write-touching or write-adjacent and need to reach the primary directly. $wgDBservers only governs the application's runtime queries.
  • Chronology protection requires a shared session store. MediaWiki uses its session cache ($wgMainCacheType) to remember "this user just wrote at timestamp T" and avoids serving them a replica that hasn't caught up. Verify your session store uses a network-shared backend (Memcached, Redis) — not the database-backed SqlBagOStuff default — before pointing reads at lagged replicas; otherwise users may see stale data right after editing.
  • Canasta doesn't validate or template $wgDBservers. It's user-managed configuration in Settings.php. Mistakes — wrong host, wrong load weight, missing flags — surface as MediaWiki errors, not Canasta errors.
  • The same Settings.php path works for any per-wiki MW config. If you have multiple wikis sharing one Canasta instance with one external DB cluster, each wiki's config/settings/wikis/<wiki-id>/Settings.php can configure its own $wgDBservers independently — useful if some wikis want replica routing and others don't.
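For the session-store point above, a minimal sketch of switching to a shared Memcached backend — the hostname is illustrative, and any network-shared cache backend works:

```php
// Illustrative: a shared cache so chronology protection works across
// web replicas. The Memcached hostname is an assumption.
$wgMainCacheType = CACHE_MEMCACHED;
$wgSessionCacheType = CACHE_MEMCACHED;
$wgMemCachedServers = [ 'memcached.example.internal:11211' ];
```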

For the full $wgDBservers reference (all keys, group routing, more advanced patterns), see Manual:$wgDBservers on mediawiki.org.

Switching an existing instance

The five external-DB keys are immutable after instance creation:

  • USE_EXTERNAL_DB
  • MYSQL_HOST
  • MYSQL_PORT
  • MYSQL_USER
  • MYSQL_SSL

canasta config set blocks attempts to change these with the message:

'<key>' cannot be changed after instance creation. Delete the instance
and recreate with the new database configuration.

The reason: all wikis in the instance share the same database connection. Pointing a running instance at a different database would orphan the existing data; doing it via a config-set command would silently break the wiki at the next restart. To migrate from bundled-DB to external-DB (or vice versa), the supported path is:

  1. canasta export the wiki(s) on the existing instance.
  2. canasta delete the instance.
  3. Set up the external database (or the bundled DB) according to the new shape.
  4. canasta create a fresh instance with the desired database configuration.
  5. canasta import the exported data.

A future feature (not yet implemented) may automate this; for now the export/delete/create/import flow is the canonical way.

On Kubernetes instances, four additional keys (MYSQL_PASSWORD, WIKI_DB_PASSWORD, MYSQL_ROOT_PASSWORD, MW_SECRET_KEY) are also blocked from canasta config set because the running pods authenticate from K8s Secret resources rather than from .env. Updating .env alone wouldn't change the wiki's behavior. To rotate one of these on a K8s instance, see Help:Secrets for the kubectl-based flow.

See also