Backup & disaster recovery

The control-plane PostgreSQL database is the single source of truth — organizations, projects, providers, budgets, limits, guardrail config, API keys, usage ledger, and the audit log all live there. Protecting it is the core of your DR plan.

Who this is for

Platform engineers responsible for data protection and recovery.

What to back up

| Data | Where it lives | Why it matters |
| --- | --- | --- |
| Tenant & policy config | Control-plane PostgreSQL | Recreates all orgs, projects, budgets, providers, guardrails |
| API keys | Control-plane PostgreSQL | Keys keep working after restore |
| Usage ledger | Control-plane PostgreSQL | Month-to-date spend and history |
| Audit log | Control-plane PostgreSQL | Compliance trail |
| Identity | Keycloak PostgreSQL | Users, brokered-IdP config, group mappings |
| Secrets | Kubernetes Secrets / your secret store | Provider keys, OIDC secrets, DB passwords |

Metrics, logs, and traces in the observability stack are operational telemetry; back them up only if your retention policy requires it. New telemetry accumulates again from live traffic after a rebuild, whereas the configuration database cannot be regenerated.

Backing up PostgreSQL

The control-plane database runs on CloudNativePG, which supports scheduled backups to object storage:

yaml
postgres:
  backup:
    enabled: true
    method: objectStore            # or volumeSnapshot
    objectStore:
      destinationPath: s3://backups/opsta-ai-gateway/
      endpointURL: https://s3.internal:9000

This produces base backups plus continuous WAL archiving, enabling point-in-time recovery. Apply the same approach to the Keycloak database cluster.
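For the Keycloak cluster, or any cluster whose backups are not scheduled through the chart values above, the underlying CloudNativePG resource is a `ScheduledBackup`. The following is a minimal sketch, not the chart's own output. It assumes a Keycloak cluster named `keycloak-pg` in the `opsta-ai-gateway` namespace (both hypothetical for Keycloak) and that the Cluster already has an object-store backup target configured.

```yaml
# Sketch only: resource name, cluster name, and schedule are assumptions.
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: keycloak-pg-daily          # hypothetical name
  namespace: opsta-ai-gateway      # assumed to match the control-plane namespace
spec:
  # CloudNativePG expects a six-field cron expression (seconds come first).
  schedule: "0 0 2 * * *"          # every day at 02:00:00
  backupOwnerReference: self
  method: barmanObjectStore
  cluster:
    name: keycloak-pg              # hypothetical Keycloak cluster name
```

WAL archiving is configured on the Cluster itself, so a schedule like this only controls how often base backups are taken.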

Back up before every upgrade

Always take a fresh backup immediately before an upgrade so you can roll back a schema migration if needed.
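One way to take that pre-upgrade backup is an on-demand CloudNativePG `Backup` resource. This sketch assumes the cluster is named `opsta-pg` in the `opsta-ai-gateway` namespace and that object-store backups are already enabled. The resource name is hypothetical.

```yaml
# Sketch only: apply before starting the upgrade.
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: opsta-pg-pre-upgrade       # hypothetical name; use one per upgrade
  namespace: opsta-ai-gateway
spec:
  method: barmanObjectStore
  cluster:
    name: opsta-pg
```

Wait for the backup to report a completed phase before starting the upgrade.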

Restoring

  1. Restore the PostgreSQL cluster from the most recent base backup (and replay WAL to a target time for point-in-time recovery), following CloudNativePG's restore procedure.
  2. Ensure the Kubernetes Secrets are present (from your secret store or backup).
  3. Bring up the platform with helm install/upgrade. On start, the control plane connects to the restored database and reconciles the gateway from it — providers, budgets, guardrails, and keys are projected back onto the data plane automatically.
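Step 1 can be sketched as a CloudNativePG `Cluster` that bootstraps from the object store. This is an illustration under assumptions: the chart may create the Cluster for you and expose recovery through its own values, and the restored cluster name, storage size, credentials Secret, and target time below are all hypothetical.

```yaml
# Sketch only: bootstrap a new cluster from the existing backups.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: opsta-pg-restored          # hypothetical name for the new cluster
  namespace: opsta-ai-gateway
spec:
  instances: 3
  storage:
    size: 20Gi                     # assumption; match the original cluster
  bootstrap:
    recovery:
      source: opsta-pg-origin
      # Omit recoveryTarget to replay all archived WAL (latest state).
      recoveryTarget:
        targetTime: "2026-06-14T08:00:00Z"   # hypothetical point in time
  externalClusters:
    - name: opsta-pg-origin
      barmanObjectStore:
        destinationPath: s3://backups/opsta-ai-gateway/
        endpointURL: https://s3.internal:9000
        serverName: opsta-pg       # name the original cluster archived under
        s3Credentials:
          accessKeyId:
            name: backup-s3-creds  # hypothetical Secret
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-s3-creds
            key: ACCESS_SECRET_KEY
```

Recovery in CloudNativePG creates a new cluster rather than overwriting the existing one, so the platform's database connection settings need to point at whichever cluster name ends up in service.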

Because the gateway holds no configuration of its own, restoring the database restores the whole platform's behavior. There's no separate gateway state to recover.

List the Backup resources to confirm that a recent backup completed:

bash
$ kubectl -n opsta-ai-gateway get backup
NAME                       AGE   CLUSTER    METHOD              PHASE       ERROR
opsta-pg-backup-20260614   2m    opsta-pg   barmanObjectStore   completed

Disaster-recovery posture

  • Database: in HA, a 3-instance cluster tolerates node loss; object-store backups protect against cluster loss.
  • Reproducibility: the platform rebuilds from the chart plus the restored database — no hand-built state.
  • Secrets: keep them in an external secret store so they survive cluster loss independently.
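If the external secret store is synced into the cluster with the External Secrets Operator (an assumption; the platform does not require this operator), a provider key could be projected as shown below. The store name, remote path, and key names are hypothetical.

```yaml
# Sketch only: sync one provider key from an external store into a Kubernetes Secret.
# Newer operator releases serve this kind under external-secrets.io/v1.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: provider-keys              # hypothetical name
  namespace: opsta-ai-gateway
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: platform-secret-store    # hypothetical store
  target:
    name: provider-keys            # Kubernetes Secret to create
  data:
    - secretKey: provider-api-key  # key inside the Kubernetes Secret
      remoteRef:
        key: opsta/providers       # hypothetical path in the external store
        property: provider-api-key
```

With this arrangement, a rebuilt cluster recreates its Secrets from the external store, which is what restore step 2 relies on.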
