Object storage
SeaweedFS in production — architecture, the MinIO cutover runbook, rollback, soak checklist, and secrets handling
Architecture
Production object storage is SeaweedFS running as a single weed mini process — master, volume server, filer, and S3 gateway in one pod. The S3 API listens on seaweedfs:8333 inside the cluster only; there is no Ingress route to it.
containers:
  - name: seaweedfs
    # Statically pinned; NOT tracked by ArgoCD Image Updater (app images only).
    image: chrislusf/seaweedfs:4.31
    command: [weed]
    args: [mini, -dir=/data]
The Django backend is the only consumer. It reads S3_ENDPOINT_URL, S3_ACCESS_KEY, and S3_SECRET_KEY from the backend-secrets Secret (mapped to the AWS_* django-storages settings in lcmd_db/config/settings/base.py). Two buckets:
| Bucket | Used for |
|---|---|
| compounds | Molecule xyz files (<pk>/main.xyz), written by subset imports; BUCKETS.molecules |
| local-media | Django default storage (STORAGES["default"]) |
Both are created idempotently by manage.py initialize_buckets (existing buckets are skipped), which also runs at the end of the migrate Job on every sync — a fresh store bootstraps itself.
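
The skip-if-present behaviour can be sketched as follows. This is a minimal illustration, not the actual initialize_buckets command: ensure_buckets and the InMemoryS3 stand-in are hypothetical names, and the only assumption about the real client is that it offers the boto3-style list_buckets() and create_bucket(Bucket=...) calls.

```python
class InMemoryS3:
    """Stand-in for an S3 client; mimics the two boto3-style calls used below."""

    def __init__(self):
        self.buckets = []
        self.create_calls = 0

    def list_buckets(self):
        return {"Buckets": [{"Name": name} for name in self.buckets]}

    def create_bucket(self, Bucket):
        self.create_calls += 1
        self.buckets.append(Bucket)


def ensure_buckets(client, names):
    """Create any missing buckets and return the names that were created."""
    existing = {bucket["Name"] for bucket in client.list_buckets()["Buckets"]}
    created = []
    for name in names:
        if name in existing:
            continue  # already present: skipping is what makes reruns harmless
        client.create_bucket(Bucket=name)
        created.append(name)
    return created


client = InMemoryS3()
first = ensure_buckets(client, ["compounds", "local-media"])   # fresh store
second = ensure_buckets(client, ["compounds", "local-media"])  # rerun
print(first, second)
```

On a fresh store the first call creates both buckets; the second call finds them and creates nothing, which is why running the command on every sync is safe.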
File bytes are always streamed through the Django API — e.g. GET /api/v1/molecules/<id>/xyz_file/ returns a FileResponse read from the bucket. The storage endpoint itself is never exposed to clients.
History — SeaweedFS replaced MinIO (2026-06) after the upstream community edition was archived. The passive MinIO rollback deployment and its PVC were removed once the cutover had soaked; pre-cutover object data is gone by design (everything in compounds is reproducible from git LFS via a full reimport).
For the local-development SeaweedFS (docker-compose, same weed mini setup), see Architecture → SeaweedFS storage.
Cutover runbook
The PR that carries this page is the cutover: merging it flips S3_ENDPOINT_URL in backend-secrets from MinIO to http://seaweedfs:8333. SeaweedFS starts empty — the data is re-imported from git LFS rather than copied over.
Molecule file downloads return 404 between steps 3 and 5, until the reimport has refilled the store. Announce the window.
Freeze writes and dump the database
Announce a write freeze: no imports, no bulk creation through the API. Then take a one-off dump — this is the only backup that exists, and the rollback path after step 5 depends on it:
kubectl -n prod exec postgres-0 -- \
sh -c 'pg_dump -U "$POSTGRES_USER" -Fc "$POSTGRES_DB"' \
> pre-seaweedfs-$(date +%Y%m%d).dump
Merge the cutover PR
Wait for the ArgoCD sync, then confirm the live Secret actually flipped:
kubectl -n prod get secret backend-secrets \
-o jsonpath='{.data.S3_ENDPOINT_URL}' | base64 -d
# → http://seaweedfs:8333
Restart the backend
Secret changes do not restart pods (there is no Reloader controller in this cluster) — the running backend keeps its old env until you roll it:
kubectl -n prod rollout restart deployment/backend
kubectl -n prod rollout status deployment/backend
Bootstrap the buckets
kubectl -n prod exec deployment/backend -- python manage.py initialize_buckets
The migrate Job runs this too, but it is a PreSync hook: during the cutover sync its pod started before the new Secret was applied, so it bootstrapped MinIO, not SeaweedFS. Run it once by hand; it is idempotent.
Reimport everything from git LFS
From a laptop with a working kubeconfig (see Dataset imports for the Job mechanics). Nervous? Canary a single subset first:
cd apps/backend
uv run manage.py launch_import_job --subset OSCARDHBD --reload # canary
uv run manage.py launch_import_job --all --reload # full reimport
--reload deletes each subset's existing entities before re-importing, so DB rows and SeaweedFS objects are recreated together. Large subsets take ~30 min each on the bulk path.
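
As a toy model of what "recreated together" means, the sketch below uses plain dictionaries in place of the database and the object store. It is not the import job's code: reload_subset, the auto-incrementing pk counter, and the placeholder file contents are all assumptions made for illustration; only the <pk>/main.xyz key layout comes from this page.

```python
from itertools import count

_next_pk = count(1)  # assumption: stand-in for an auto-incrementing primary key


def reload_subset(rows, objects, subset, files_by_name):
    """Delete a subset's rows and their objects, then recreate both together."""
    for pk in [pk for pk, row in rows.items() if row["subset"] == subset]:
        del rows[pk]
        objects.pop(f"{pk}/main.xyz", None)  # tolerate a store that lacks the object
    for name, payload in files_by_name.items():
        pk = next(_next_pk)
        rows[pk] = {"subset": subset, "name": name}
        objects[f"{pk}/main.xyz"] = payload


rows, objects = {}, {}
files = {"mol-a": b"placeholder", "mol-b": b"placeholder"}  # made-up entries

reload_subset(rows, objects, "OSCARDHBD", files)  # state before the cutover
objects.clear()                                   # the new store starts empty
reload_subset(rows, objects, "OSCARDHBD", files)  # reimport with --reload

missing = [pk for pk in rows if f"{pk}/main.xyz" not in objects]
print(len(rows), missing)
```

In this model every row that exists after the reload has a matching object in the new store, even though the store started empty.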
Verify
Counts plus a sampled storage round-trip, from a shell in the backend pod:
kubectl -n prod exec -it deployment/backend -- python manage.py shell
from lcmd_db.apps.molecules.models import Molecule
from lcmd_db.constants.storage import BUCKETS
from lcmd_db.core.models.storage import Storage
storage = Storage(bucket=BUCKETS.molecules)
print(Molecule.objects.count()) # matches the pre-cutover count
for m in Molecule.objects.order_by("?")[:20]:
    assert storage.exists(m.xyz_file.name), m.pk
And from the outside:
# streams the file bytes through Django
curl -fsS https://lcmd-app.epfl.ch/api/v1/molecules/<id>/xyz_file/ | head -2
# detail API still renders xyz_file as a URL string
curl -fsS https://lcmd-app.epfl.ch/api/v1/molecules/<id>/ | jq .xyz_file
The xyz_file URL is presigned against the cluster-internal endpoint, so it is not fetchable from a browser; its presence only confirms that the serializer/storage wiring resolves. Downloads go through the /xyz_file/ endpoint.
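
To see at a glance why the returned URL is not browser-fetchable, one can inspect its host. The helper below is a rough heuristic sketch, not part of the codebase: is_cluster_internal is a hypothetical name, and the example presigned URL (object id and signature) is made up.

```python
from urllib.parse import urlsplit


def is_cluster_internal(url):
    """Heuristic: bare Service name, or a *.svc / *.cluster.local host."""
    host = urlsplit(url).hostname or ""
    return bool(host) and (
        "." not in host or host.endswith((".svc", ".cluster.local"))
    )


# Illustrative values only; the id and signature are invented.
presigned = "http://seaweedfs:8333/compounds/42/main.xyz?X-Amz-Signature=abc"
public = "https://lcmd-app.epfl.ch/api/v1/molecules/42/xyz_file/"

print(is_cluster_internal(presigned), is_cluster_internal(public))
```

A host such as seaweedfs resolves only through cluster DNS, so the check flags the presigned URL and not the public API URL.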
Rollback
Before step 5 (no reimport yet): revert the cutover PR, wait for ArgoCD to restore the MinIO values in backend-secrets, then run kubectl -n prod rollout restart deployment/backend. MinIO still holds all the data, so there is nothing else to do.
After step 5: revert + restart is not enough, because the --reload reimport deleted and re-created the DB rows, and their files exist only in SeaweedFS. The reverted backend would point at MinIO objects keyed by rows that no longer exist. Either:
- restore the step-1 dump, which returns the database to the state matching MinIO's objects (this is what the write freeze protects; anything written after the dump is lost):
  kubectl -n prod exec -i postgres-0 -- \
    sh -c 'pg_restore -U "$POSTGRES_USER" -d "$POSTGRES_DB" --clean --if-exists' \
    < pre-seaweedfs-YYYYMMDD.dump
- or re-run the full reimport (launch_import_job --all --reload) against MinIO after the revert.
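
The key mismatch behind the "after step 5" case can be worked through with a small example. This is an illustrative sketch under a stated assumption: re-created rows receive new primary keys, so the old <pk>/main.xyz objects in MinIO no longer line up with any row. missing_objects and all pk values are hypothetical.

```python
def missing_objects(row_pks, object_keys):
    """Return the pks whose '<pk>/main.xyz' object is absent from the store."""
    return sorted(pk for pk in row_pks if f"{pk}/main.xyz" not in object_keys)


minio_keys = {"1/main.xyz", "2/main.xyz", "3/main.xyz"}  # written before the cutover
dumped_pks = {1, 2, 3}       # rows captured by the step-1 dump
reimported_pks = {4, 5, 6}   # rows re-created by --reload (assumed new pks)

after_revert_only = missing_objects(reimported_pks, minio_keys)
after_dump_restore = missing_objects(dumped_pks, minio_keys)
print(after_revert_only, after_dump_restore)
```

With a revert alone every row misses its file; restoring the dump brings back rows whose keys match what MinIO holds, and a full reimport against MinIO achieves the same by rewriting the objects instead.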
Soak checklist
Run daily for ~10 days after the cutover:
| Check | How |
|---|---|
| Backend logs clean of storage errors | kubectl -n prod logs deployment/backend --since=24h | grep -iE 'boto|s3|storage|signature' → empty |
| Sampled xyz downloads work | curl -fsS https://lcmd-app.epfl.ch/api/v1/molecules/<id>/xyz_file/ → 200 |
| SeaweedFS pod healthy | kubectl -n prod get pods -l app=seaweedfs → 0 restarts (probes already hit /healthz) |
| Disk headroom on the data volume | kubectl -n prod exec deployment/seaweedfs -- df -h /data |
| Imports still work end to end | uv run manage.py launch_import_job --subset OSCARDHBD --limit 5 --dry-run |
| ArgoCD app converged | kubectl -n argocd get application lcmd-app → Synced / Healthy |
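
For anyone who prefers to script the log check, the same case-insensitive pattern as the grep row above can be applied in Python. A minimal sketch: storage_error_lines is a hypothetical helper and the sample log lines are invented for illustration.

```python
import re

# Same alternation as the grep -iE pattern in the checklist.
STORAGE_ERROR = re.compile(r"boto|s3|storage|signature", re.IGNORECASE)


def storage_error_lines(log_text):
    """Return the lines that the checklist's grep would have printed."""
    return [line for line in log_text.splitlines() if STORAGE_ERROR.search(line)]


# Invented sample; real input would be the output of `kubectl logs`.
sample = "\n".join([
    "INFO GET /api/v1/molecules/ 200",
    "ERROR botocore.exceptions.ClientError: SignatureDoesNotMatch",
    "INFO GET /api/v1/health/ 200",
])

hits = storage_error_lines(sample)
print(len(hits))
```

An empty result corresponds to the "→ empty" expectation in the table; any hit deserves a look before the soak is considered quiet.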
Only after a quiet soak, merge the MinIO-removal PR.
The removal PR deletes the MinIO Deployment and its PVC — the pre-cutover data is gone and the rollback paths above stop working. Point of no return.
Editing the storage secrets
The prod overlay keeps one sops file per Secret in infrastructure/kubernetes/app/overlays/prod/ (backend-secrets.enc.yaml, seaweedfs-secrets.enc.yaml, ...), each listed in secrets-generator.yaml for KSOPS. Hard-won rules:
- Always edit through sops (sops infrastructure/kubernetes/app/overlays/prod/backend-secrets.enc.yaml), never as text. sops ≥ 3.13 (the version pinned in the devcontainer) hard-fails decryption when the MAC is stale. A 2026-06 incident (deleting a document from the then-multidoc secrets file as text) broke MAC verification for every secret in the repo; the per-Secret file split exists so one bad edit can never invalidate the others.
- The pre-commit hook sops-decrypt-check (scripts/check-sops-mac.sh) decrypt-verifies every changed *.enc.yaml wherever an age key is available, and skips cleanly where it isn't (e.g. CI).
- The age key lives at ~/.config/sops/age/keys.txt; in the devcontainer it persists in the lcmd-db-sops-age volume.
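
The hook's "verify where a key exists, skip where it doesn't" behaviour might look roughly like the selection logic below. This is a sketch of the idea, not the contents of scripts/check-sops-mac.sh: files_to_verify is a hypothetical helper, and the actual decryption of each selected file (for example with sops --decrypt) is left out.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path


def files_to_verify(changed_files, age_key_path):
    """Return changed *.enc.yaml files, or [] when no age key is available."""
    if not Path(age_key_path).expanduser().is_file():
        return []  # no key (e.g. CI): skip cleanly instead of failing
    return [f for f in changed_files if fnmatch(Path(f).name, "*.enc.yaml")]


changed = [
    "README.md",
    "infrastructure/kubernetes/app/overlays/prod/backend-secrets.enc.yaml",
]

with tempfile.TemporaryDirectory() as tmp:
    key = Path(tmp) / "keys.txt"
    without_key = files_to_verify(changed, key)  # key file does not exist yet
    key.write_text("placeholder, not a real age key\n")
    with_key = files_to_verify(changed, key)

print(without_key, with_key)
```

Without a key nothing is selected; with one, only the encrypted file is, and each selected file would then be decrypted to confirm its MAC still verifies.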
See Kubernetes access → Secrets management for obtaining the key and general sops usage.
Known gaps and hardening
Handover notes — read before assuming any safety net exists:
There are no automated backups — not for PostgreSQL, not for object storage.
- Object storage is reproducible: every file can be regenerated from git LFS with launch_import_job --all --reload. Losing the SeaweedFS PVC costs a reimport, not data.
- PostgreSQL is not: user accounts, subset metadata, and anything created through the app exist only in the database, protected solely by manual dumps (like the cutover's step 1).
- Both stores sit on single local-path PVCs on the same node: one disk failure takes both.
- Recommended first hardening: a pg_dump CronJob with copies shipped off the node.
- Key escrow before a maintainer departs: add a second age recipient to .sops.yaml, run sops updatekeys on each *.enc.yaml, and store the private key in the lab's password manager. Until then, exactly one keys.txt (plus the in-cluster copy for ArgoCD) can decrypt prod secrets.