cert-manager

cert-manager issues the single Let’s Encrypt wildcard certificate that fronts every web-exposed addon in the lab. It is one of the four platform services terraform/bootstrap/ installs before ArgoCD takes over, and the most operationally load-bearing one — break cert renewal and every https://*.lab.jackhall.dev hostname stops working at the browser.

There is a chicken-and-egg problem here: cert-manager has to exist before the wildcard Certificate resource can be created, and the wildcard cert has to exist before the central lab Gateway can terminate TLS for any addon. Putting cert-manager in terraform/bootstrap/ means that ordering is enforced by depends_on rather than by hand. See Platform / Why Terraform for the broader rationale.

Everything lives in terraform/bootstrap/cert-manager.tf:

  • The cert-manager namespace and Helm release (chart version v1.16.2, CRDs enabled).
  • The DNS-01 service-account key — generated by Terraform, uploaded to GSM as cert-manager-dns01-key, then synced into the cluster by ESO rather than written directly. This keeps the rule from ADR-0001 intact: the only Kubernetes Secret Terraform creates directly is the ESO bootstrap key.
  • The letsencrypt-dns01 ClusterIssuer.
  • The wildcard-lab-jackhall-dev Certificate, producing the wildcard-lab-jackhall-dev-tls Secret.

Resource | Purpose
ClusterIssuer/letsencrypt-dns01 | The one issuer every Certificate in the cluster references. ACME DNS-01 against Cloud DNS for the lab.jackhall.dev zone.
Certificate/wildcard-lab-jackhall-dev (in cert-manager) | The cluster-wide wildcard. Renewed automatically; nothing else issues a Certificate today.
Secret/wildcard-lab-jackhall-dev-tls (in cert-manager) | The actual cert + key the Gateway reads via cross-namespace ReferenceGrant.
tofu output wildcard_cert_secret | Prints cert-manager/wildcard-lab-jackhall-dev-tls, the value Gateway/HTTPRoute manifests reference.
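
Because the output is a single namespace/name string, scripts that consume it usually need the two halves separately. A minimal sketch of that split follows; the helper name split_secret_ref is hypothetical and is not part of the repository.

```shell
# Hypothetical helper: split a "namespace/name" reference into two fields.
# Prints "<namespace> <name>" on stdout, or fails if there is no slash.
split_secret_ref() {
  ref=$1
  case "$ref" in
    */*) ;;  # looks like namespace/name, carry on
    *) echo "expected namespace/name, got: $ref" >&2; return 1 ;;
  esac
  # Everything before the first slash is the namespace, the rest is the name.
  printf '%s %s\n' "${ref%%/*}" "${ref#*/}"
}
```

Assuming it is run from terraform/bootstrap/, it could be fed with split_secret_ref "$(tofu output -raw wildcard_cert_secret)".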

Health checks the operator runs after a bootstrap or upgrade:

kubectl get clusterissuer letsencrypt-dns01
kubectl -n cert-manager get certificate wildcard-lab-jackhall-dev

Both should report READY=True. If the Certificate still reports READY=False after about 5 minutes, the most common cause is that NS delegation for lab.jackhall.dev hasn’t propagated yet; see the runbook in terraform/bootstrap/README.md.
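
The same check can be scripted so that it blocks until both resources are Ready instead of being polled by hand. This is a sketch only: the function name wait_for_wildcard_cert and the 300s default (taken from the "about 5 minutes" guidance) are assumptions, not something the repository ships.

```shell
# Hypothetical wrapper around `kubectl wait`; not part of the repository.
# Blocks until the issuer and the wildcard Certificate report Ready,
# or fails once the timeout (default 300s) expires.
wait_for_wildcard_cert() {
  timeout="${1:-300s}"
  # ClusterIssuer is cluster-scoped, so no namespace flag is needed.
  kubectl wait --for=condition=Ready \
    clusterissuer/letsencrypt-dns01 --timeout="$timeout" &&
  kubectl -n cert-manager wait --for=condition=Ready \
    certificate/wildcard-lab-jackhall-dev --timeout="$timeout"
}
```

If the wait times out, kubectl -n cert-manager describe certificate wildcard-lab-jackhall-dev usually shows which step of issuance is stuck.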

cert-manager sits at the centre of three other systems:

  • External Secrets Operator delivers the DNS-01 SA key. Terraform writes the key to GSM; ESO syncs it to the K8s Secret/cert-manager-dns01-key that the ClusterIssuer reads via serviceAccountSecretRef. cert-manager itself never sees a Terraform-managed Secret.
  • Cloud DNS receives the _acme-challenge.lab.jackhall.dev TXT records cert-manager writes during DNS-01. The hosted zone is named explicitly in the ClusterIssuer (hostedZoneName) rather than auto-discovered, because the SA holds roles/dns.admin scoped to this zone and lacks the project-level dns.managedZones.list permission auto-discovery would call. Naming the zone drops the list call entirely.
  • The lab Gateway terminates TLS using the wildcard cert. The Secret lives in cert-manager; the Gateway lives in gateway-system; a single ReferenceGrant in the cert-manager namespace authorises the cross-namespace read. This replaces the per-addon *-wildcard-cert grants the per-addon-Gateway era used to ship — one grant covers every hostname under *.lab.jackhall.dev.

GSM secret cert-manager-dns01-key
    │  ESO ExternalSecret (refresh 1h)
    ▼
Secret cert-manager/cert-manager-dns01-key
    │  ClusterIssuer/letsencrypt-dns01 reads the SA JSON
    ▼
ACME DNS-01 challenge → TXT record in Cloud DNS (lab.jackhall.dev zone)
    │  LE issues / renews
    ▼
Secret cert-manager/wildcard-lab-jackhall-dev-tls
    │  ReferenceGrant in cert-manager → Gateway in gateway-system
    ▼
Listener on *.lab.jackhall.dev:443 (the lab Gateway)
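
When issuance or renewal stalls, walking the chain from top to bottom narrows down the broken hop. The sketch below is one way to do that with read-only kubectl calls. The function name trace_cert_chain is hypothetical, the ExternalSecret and ReferenceGrant are listed rather than fetched by name because their names are not given here, and the jsonpath assumes the issuer has a cloudDNS DNS-01 solver.

```shell
# Hypothetical troubleshooting helper; read-only, not part of the repository.
trace_cert_chain() {
  ns=cert-manager
  # 1. ESO: the ExternalSecret that syncs the DNS-01 key from GSM.
  kubectl -n "$ns" get externalsecrets
  # 2. The synced Kubernetes Secret the ClusterIssuer reads.
  kubectl -n "$ns" get secret cert-manager-dns01-key
  # 3. The hosted zone named explicitly in the ClusterIssuer.
  kubectl get clusterissuer letsencrypt-dns01 \
    -o jsonpath='{.spec.acme.solvers[*].dns01.cloudDNS.hostedZoneName}{"\n"}'
  # 4. Any in-flight ACME challenges (normally empty between renewals).
  kubectl -n "$ns" get challenges.acme.cert-manager.io
  # 5. The wildcard Certificate and the Secret it produces.
  kubectl -n "$ns" get certificate wildcard-lab-jackhall-dev
  kubectl -n "$ns" get secret wildcard-lab-jackhall-dev-tls
  # 6. The ReferenceGrant that lets the Gateway read that Secret.
  kubectl -n "$ns" get referencegrants
}
```

The first command that errors or returns nothing unexpected marks the hop to investigate.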

DNS-01 is the only ACME challenge type that can issue a wildcard cert, and the only one that needs no inbound connection from the public internet. The lab is LAN-only by design (see Split-horizon DNS and ADR-0003), so HTTP-01 is structurally unavailable. DNS-01 also produces one cert that covers every present and future addon hostname, which is what makes the central-Gateway pattern work without per-addon issuance.
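
One caveat on "every hostname": a wildcard matches exactly one DNS label, so the cert covers names directly under lab.jackhall.dev but not deeper subdomains or the bare zone. The sketch below illustrates that rule; the function name covered_by_wildcard and the example hostnames are hypothetical.

```shell
# Hypothetical check: would *.lab.jackhall.dev cover this hostname?
# Returns 0 if covered, 1 otherwise.
covered_by_wildcard() {
  host=$1
  zone=lab.jackhall.dev
  case "$host" in
    *."$zone") label=${host%."$zone"} ;;  # strip the ".lab.jackhall.dev" suffix
    *) return 1 ;;                        # different zone, or the bare zone itself
  esac
  case "$label" in
    ""|*.*) return 1 ;;  # empty label, or more than one label deep
    *) return 0 ;;       # exactly one label: covered
  esac
}

covered_by_wildcard grafana.lab.jackhall.dev && echo "covered"
covered_by_wildcard a.b.lab.jackhall.dev || echo "not covered"
```

An addon that needed a two-level name such as a.b.lab.jackhall.dev would therefore need its own Certificate.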

terraform/bootstrap/cert-manager.tf is the canonical source — the ClusterIssuer and Certificate specs are short enough to read end-to-end.