lab Gateway
The lab Gateway is the cluster’s one north-south HTTP entry point.
Every web-exposed addon — AdGuard’s admin UI, the dashboard, ArgoCD,
Hubble UI, and any future addon — attaches its HTTPRoute here via
cross-namespace parentRefs instead of bringing its own Gateway. One
LB IP, one Envoy proxy, one wildcard cert, one knob to tune.
This wasn’t always the shape: each addon used to bring its own Gateway. Collapsing them down was issue #48 — see ADR-0002 — Cilium as the unified networking layer for the broader “one Cilium for everything” rationale.
Where it runs
A single Gateway resource lives in the gateway-system namespace.
There is no chart and no values file —
kubernetes/apps/lab-gateway/
is just three raw manifests: a Namespace, the Gateway, and one
ReferenceGrant.
| Knob | Value |
|---|---|
| gatewayClassName | cilium |
| LB IP (lbipam.cilium.io/ips) | 192.168.1.201 (pinned, second slot in the pool) |
| Listener hostname | *.lab.jackhall.dev |
| Listener port / protocol | 443 / HTTPS |
| TLS mode | Terminate |
| Cert ref | cert-manager/wildcard-lab-jackhall-dev-tls |
| allowedRoutes.namespaces.from | All |
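
As a rough illustration, the knobs in the table could map onto a manifest like the sketch below. The values come from the table, and the Gateway name (lab) and listener name (https) come from the parentRefs block used elsewhere on this page. The apiVersion and the placement of the lbipam.cilium.io/ips annotation under spec.infrastructure.annotations are assumptions; the real manifest in kubernetes/apps/lab-gateway/ may attach the IP differently.

```yaml
# Sketch only: values taken from the table above, structure assumed.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: lab
  namespace: gateway-system
spec:
  gatewayClassName: cilium
  infrastructure:
    annotations:
      # Assumption: propagated to the LoadBalancer Service that Cilium generates.
      lbipam.cilium.io/ips: "192.168.1.201"
  listeners:
    - name: https
      hostname: "*.lab.jackhall.dev"
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          # Cross-namespace Secret ref; needs a ReferenceGrant in cert-manager.
          - kind: Secret
            name: wildcard-lab-jackhall-dev-tls
            namespace: cert-manager
      allowedRoutes:
        namespaces:
          from: All
```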
Why pin the LB IP
192.168.1.201 is load-bearing for AdGuard Home’s wildcard rewrite:

```
*.lab.jackhall.dev → 192.168.1.201
```

That single rewrite covers every addon’s hostname, present and future,
without ever editing the AdGuard UI. If the Gateway’s LB IP drifted
(re-allocation after a re-create, or a different pod claiming the
slot), every addon’s hostname would resolve to a stale address and the
LAN would silently break. Pinning the IP via lbipam.cilium.io/ips
makes the whole pattern correct by construction.
The other pinned IP, 192.168.1.200, belongs to AdGuard Home’s DNS
Service — that’s the address LAN devices configure as their resolver.
Together they’re the only two IPs in the .200–.230 Cilium LB pool
that have to stay stable; the rest are first-come for any future
LoadBalancer Service.
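
For orientation, a pool covering the .200–.230 range might be declared roughly as follows. This is a sketch under assumptions: the pool name is hypothetical, and the apiVersion and the blocks field follow recent Cilium releases, so check them against the CRD version installed in the cluster.

```yaml
# Sketch only: pool name and apiVersion are assumptions.
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: lab-pool               # hypothetical name
spec:
  blocks:
    # .200 (AdGuard DNS) and .201 (lab Gateway) are pinned by annotation;
    # the remaining addresses are handed out first-come.
    - start: 192.168.1.200
      stop: 192.168.1.230
```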
Cross-namespace parentRefs
allowedRoutes.namespaces.from: All is what makes the central-Gateway
pattern usable. Each addon’s HTTPRoute lives in the addon’s own
namespace and attaches with:

```yaml
parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    namespace: gateway-system
    name: lab
    sectionName: https
```

No per-namespace ReferenceGrant is required for that attachment:
route-to-Gateway attachment is authorised by the listener’s allowedRoutes,
and from: All admits routes from every namespace. That is what lets the
pattern scale: one knob on the Gateway instead of per-addon configuration
in N addon namespaces.
The flip side: a cross-namespace backend ref (an HTTPRoute
whose backendRefs point at a Service in another namespace) does still
need a ReferenceGrant in the backend’s namespace. The only addon
doing that today is Hubble UI, whose backend is owned
by Cilium and lives in kube-system.
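
A grant for that case could look like the sketch below. The grant name, the namespace that holds the HTTPRoute, and the Service name hubble-ui are assumptions for illustration; only the backend namespace (kube-system) comes from this page.

```yaml
# Sketch only: names marked hypothetical are not taken from the repo.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: hubble-ui-backend      # hypothetical name
  namespace: kube-system       # must live in the backend's namespace
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: hubble-ui     # hypothetical: wherever the HTTPRoute lives
  to:
    - group: ""                # core API group
      kind: Service
      name: hubble-ui          # assumed Service name
```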
Wildcard cert
The Gateway terminates TLS with a real Let’s Encrypt cert for
*.lab.jackhall.dev. The cert lives in the cert-manager namespace as
the Secret wildcard-lab-jackhall-dev-tls, produced by the
letsencrypt-dns01 ClusterIssuer (DNS-01 challenge against Cloud
DNS, see
Split-horizon DNS).
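
A Certificate that produces that Secret might look roughly like this. The Secret name, issuer name, and hostname come from the text above; the Certificate’s own name is hypothetical, and the real resource may set additional fields.

```yaml
# Sketch only: metadata.name is hypothetical.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-lab-jackhall-dev
  namespace: cert-manager
spec:
  secretName: wildcard-lab-jackhall-dev-tls
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-dns01
  dnsNames:
    - "*.lab.jackhall.dev"
```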
A single ReferenceGrant in cert-manager authorises the lab
Gateway in gateway-system to read that Secret. This replaces the
per-addon *-wildcard-cert grants that the per-addon-Gateway era had
to ship — one Gateway, one grant.
cert-manager handles renewal automatically; the renewed Secret propagates to the Gateway without manual intervention.
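
The single grant described above could be sketched as follows. The grant’s name is hypothetical; the namespaces and Secret name come from this page.

```yaml
# Sketch only: metadata.name is hypothetical.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: lab-gateway-wildcard-cert
  namespace: cert-manager      # the Secret's namespace
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: gateway-system
  to:
    - group: ""                # core API group
      kind: Secret
      name: wildcard-lab-jackhall-dev-tls
```

Naming the Secret in the to entry keeps the grant scoped to the one cert instead of every Secret in cert-manager.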
Adding a new web-exposed addon
The addon’s kubernetes/apps/<name>/ directory ships:

- A Namespace (or relies on CreateNamespace=true on its Application).
- An HTTPRoute in that namespace with the parentRefs block above and the addon’s Service in backendRefs. If the backend is in a different namespace, also a ReferenceGrant in the backend’s namespace.
- Optional gethomepage.dev/* annotations on the HTTPRoute so the dashboard auto-discovers the card.

That’s the entire integration. No edits to lab-gateway/, no new LB
IP, no new AdGuard rewrite. The wildcard listener matches every
hostname under *.lab.jackhall.dev automatically; the wildcard cert
covers every hostname under it; AdGuard’s wildcard rewrite resolves
every hostname under it to .201.
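
As a minimal sketch, an HTTPRoute for a hypothetical addon called foo might look like this. The addon name, namespace, Service name, and port are placeholders, and the specific gethomepage.dev annotation keys are assumed from Homepage’s Kubernetes discovery conventions, not from this repo.

```yaml
# Sketch only: "foo", its Service, and port 80 are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: foo
  namespace: foo
  annotations:
    # Optional dashboard discovery; keys are assumptions.
    gethomepage.dev/enabled: "true"
    gethomepage.dev/name: Foo
    gethomepage.dev/href: https://foo.lab.jackhall.dev
spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      namespace: gateway-system
      name: lab
      sectionName: https
  hostnames:
    - foo.lab.jackhall.dev
  rules:
    - backendRefs:
        # Same-namespace backend, so no ReferenceGrant is needed.
        - name: foo
          port: 80
```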
What lives where
Visually:

```
LAN client
   │
   │  DNS query for foo.lab.jackhall.dev
   ▼
192.168.1.200  (AdGuard Home, kubernetes/apps/adguard-home/)
   │
   │  wildcard rewrite → 192.168.1.201
   ▼
192.168.1.201  (the lab Gateway, this addon)
   │
   │  TLS terminate with wildcard cert (cert-manager)
   │  Host: foo.lab.jackhall.dev
   │  → match HTTPRoute parentRef'd here
   ▼
backend Service in foo's namespace
```

Two LB IPs, one cert, one Gateway. Adding addon #9 doesn’t change the DNS or Gateway hops; it adds one more leaf at the end of the path.
Recovering a wedged data path
The Gateway has one operational failure mode where the obvious next step doesn’t fix it.
Symptom. Every *.lab.jackhall.dev host times out at once —
dashboard, argocd, hubble, longhorn, AdGuard’s admin UI — while
AdGuard’s DNS at 192.168.1.200:53 still resolves. TCP to
192.168.1.201:443 and to the LB’s NodePort on the announcing worker
both time out (silent drop); other host ports on the same worker give a
clean RST, so the node itself is alive. The Gateway’s status still
says Programmed: True, attached HTTPRoutes still resolve their
refs, and the backend pods are Ready. Nothing in kubectl looks wrong.
The control plane is fine; the data plane has wedged on the
L2-announcing worker.
Identify which worker. The lab-l2-workers
CiliumL2AnnouncementPolicy parks the VIP on one worker at a time via
a lease. Find the current holder by ARP from any LAN host:
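
```bash
arp -n 192.168.1.201
```

The MAC in that entry belongs to the NIC of the worker currently holding
the lease. Match it against the ARP entries for the worker IPs:
192.168.1.241 is worker-01, .242 is worker-02, .243 is worker-03.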
Fix. Delete the cilium-envoy pod on that one worker:

```bash
kubectl -n kube-system delete pod \
  -l k8s-app=cilium-envoy \
  --field-selector spec.nodeName=worker-01
```

The DaemonSet recreates it in ~30s; the listener rebinds; every
*.lab.jackhall.dev host comes back.
Don’t restart cilium-agent first. It’s the obvious instinct and a
waste of a cycle. The agent reconnects xDS cleanly — the logs will
show Listener, RouteConfiguration, and Secret versions re-ack’d
— but doesn’t bounce the envoy process. Whatever stale state was
holding envoy’s listener wedged survives the agent restart untouched.
Only restarting cilium-envoy itself rebuilds the bound sockets.
The repo-side README at
kubernetes/apps/lab-gateway/README.md
spells out the full manifest set, the LB-IP allocation table in
context, and the migration that landed in issue #48.