cert-manager
cert-manager issues the single Let’s Encrypt wildcard certificate that fronts every web-exposed addon in the lab. It is one of the four platform services that `terraform/bootstrap/` installs before ArgoCD takes over, and the most operationally load-bearing one: if cert renewal breaks, every `https://*.lab.jackhall.dev` hostname stops working at the browser.
Why Terraform installs it
This is a chicken-and-egg problem: cert-manager has to exist before the wildcard `Certificate` resource can be created, and the wildcard cert has to exist before the central `lab` Gateway can terminate TLS for any addon. Putting cert-manager in `terraform/bootstrap/` keeps that ordering enforced by `depends_on` rather than by hand. See Platform / Why Terraform for the broader rationale.
Where the config lives
Everything lives in `terraform/bootstrap/cert-manager.tf`:

- The `cert-manager` namespace and Helm release (chart version `v1.16.2`, CRDs enabled).
- The DNS-01 service-account key, generated by Terraform, uploaded to GSM as `cert-manager-dns01-key`, then synced into the cluster by ESO rather than written directly. This keeps the rule from ADR-0001 intact: the only Kubernetes `Secret` Terraform creates directly is the ESO bootstrap key.
- The `letsencrypt-dns01` `ClusterIssuer`.
- The `wildcard-lab-jackhall-dev` `Certificate`, producing the `wildcard-lab-jackhall-dev-tls` `Secret`.
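For orientation, the issuer and the wildcard certificate might look roughly like the manifests below when rendered as YAML. This is a sketch, not a copy of `cert-manager.tf`: the resource names and namespace come from this page, while the ACME server URL, contact email, GCP project ID, managed-zone name, account-key Secret name, and the `key.json` key are placeholders assumed for illustration.

```yaml
# Sketch only: a rendered-YAML view of the issuer and the wildcard cert.
# Values marked "assumed" are placeholders, not taken from cert-manager.tf.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory  # assumed: Let's Encrypt production endpoint
    email: ops@example.com                  # assumed contact address
    privateKeySecretRef:
      name: letsencrypt-dns01-account-key   # assumed; holds the ACME account key
    solvers:
      - dns01:
          cloudDNS:
            project: example-gcp-project    # assumed project ID
            hostedZoneName: example-zone    # assumed managed-zone name for lab.jackhall.dev
            serviceAccountSecretRef:
              name: cert-manager-dns01-key  # Secret synced into the cluster by ESO
              key: key.json                 # assumed key inside that Secret
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-lab-jackhall-dev
  namespace: cert-manager
spec:
  secretName: wildcard-lab-jackhall-dev-tls  # Secret the Gateway later reads
  dnsNames:
    - "*.lab.jackhall.dev"
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-dns01
```

Whether the real `Certificate` also lists the bare `lab.jackhall.dev` name is not stated here; the file itself is the reference.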
Operational entry points
| Resource | Purpose |
|---|---|
| `ClusterIssuer/letsencrypt-dns01` | The one issuer every `Certificate` in the cluster references. ACME DNS-01 against Cloud DNS for the `lab.jackhall.dev` zone. |
| `Certificate/wildcard-lab-jackhall-dev` (in `cert-manager`) | The cluster-wide wildcard. Renewed automatically; nothing else issues a `Certificate` today. |
| `Secret/wildcard-lab-jackhall-dev-tls` (in `cert-manager`) | The actual cert + key the Gateway reads via cross-namespace `ReferenceGrant`. |
| `tofu output wildcard_cert_secret` | Prints `cert-manager/wildcard-lab-jackhall-dev-tls`, the value Gateway/HTTPRoute manifests reference. |
Health checks the operator runs after a bootstrap or upgrade:

```sh
kubectl -n cert-manager get clusterissuer letsencrypt-dns01
kubectl -n cert-manager get certificate wildcard-lab-jackhall-dev
```

Both should report `READY=True`. If the `Certificate` is still `False` after ~5 minutes, the most common cause is that NS delegation for `lab.jackhall.dev` hasn’t propagated; see the runbook in `terraform/bootstrap/README.md`.
How it interacts with the rest
cert-manager sits at the centre of three other systems:

- External Secrets Operator delivers the DNS-01 SA key. Terraform writes the key to GSM; ESO syncs it to the K8s `Secret/cert-manager-dns01-key` that the `ClusterIssuer` reads via `serviceAccountSecretRef`. cert-manager itself never sees a Terraform-managed Secret.
- Cloud DNS receives the `_acme-challenge.lab.jackhall.dev` TXT records cert-manager writes during DNS-01. The hosted zone is named explicitly in the `ClusterIssuer` (`hostedZoneName`) rather than auto-discovered, because the SA holds `roles/dns.admin` scoped to this zone and lacks the project-level `dns.managedZones.list` permission that auto-discovery needs. Naming the zone drops the list call entirely.
- The `lab` Gateway terminates TLS using the wildcard cert. The Secret lives in `cert-manager`; the Gateway lives in `gateway-system`; a single `ReferenceGrant` in the `cert-manager` namespace authorises the cross-namespace read. This replaces the per-addon `*-wildcard-cert` grants that the per-addon-Gateway era used to ship: one grant covers every hostname under `*.lab.jackhall.dev`.
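The first and third hand-offs above can be sketched as manifests. This is an illustration under stated assumptions rather than the repository's actual config: the Secret names, namespaces, the 1h refresh, and the `*.lab.jackhall.dev:443` listener come from this page, while the `ExternalSecret` API version, the store name `gsm`, the `key.json` key, the `ReferenceGrant` name, the `gatewayClassName`, and the `allowedRoutes` policy are hypothetical.

```yaml
# Sketch only: the hand-offs into and out of cert-manager.
# Values marked "assumed" are placeholders for illustration.
apiVersion: external-secrets.io/v1beta1  # assumed; depends on the installed ESO version
kind: ExternalSecret
metadata:
  name: cert-manager-dns01-key
  namespace: cert-manager
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: gsm                            # assumed store name
  target:
    name: cert-manager-dns01-key         # Secret the ClusterIssuer reads
  data:
    - secretKey: key.json                # assumed key name
      remoteRef:
        key: cert-manager-dns01-key      # GSM secret name
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: lab-gateway-wildcard-cert        # assumed name
  namespace: cert-manager                # a grant lives in the namespace of its target
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: gateway-system
  to:
    - group: ""                          # core API group, where Secret lives
      kind: Secret
      name: wildcard-lab-jackhall-dev-tls
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: lab
  namespace: gateway-system
spec:
  gatewayClassName: example-class        # assumed
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: "*.lab.jackhall.dev"
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: wildcard-lab-jackhall-dev-tls
            namespace: cert-manager      # cross-namespace, permitted by the grant above
      allowedRoutes:
        namespaces:
          from: All                      # assumed; lets addon HTTPRoutes in other namespaces attach
```

Without the `ReferenceGrant`, a Gateway API implementation should reject the cross-namespace `certificateRefs` entry, so the listener would not come up with this certificate.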

```
GSM secret cert-manager-dns01-key
        │
        │  ESO ExternalSecret (refresh 1h)
        ▼
Secret cert-manager/cert-manager-dns01-key
        │
        │  ClusterIssuer/letsencrypt-dns01 reads the SA JSON
        ▼
ACME DNS-01 challenge → TXT record in Cloud DNS (lab.jackhall.dev zone)
        │
        │  LE issues / renews
        ▼
Secret cert-manager/wildcard-lab-jackhall-dev-tls
        │
        │  ReferenceGrant in cert-manager → Gateway in gateway-system
        ▼
Listener on *.lab.jackhall.dev:443 (the lab Gateway)
```

Why DNS-01 (not HTTP-01)
DNS-01 is the only ACME challenge that lets the cluster issue a wildcard cert and the only one that doesn’t need inbound connectivity from the public internet. The lab is LAN-only by design (see Split-horizon DNS and ADR-0003), so HTTP-01 is structurally unavailable. DNS-01 also produces one cert that covers every present and future addon hostname, which is what makes the central-Gateway pattern work without per-addon issuance.
`terraform/bootstrap/cert-manager.tf` is the canonical source; the `ClusterIssuer` and `Certificate` specs are short enough to read end-to-end.