TL;DR

A three-part walk through Cilium for practitioners. Part 1 builds the muscle memory on a local k3d cluster. Part 2 takes the same patterns to EKS and confronts the VPC-CNI question. Part 3 makes the compliance case under FIPS 140-3 and FedRAMP. The companion repo lives at github.com/hagzag/cilium-in-practice.

Why this series, why now

Three things converged on my desk over the last few months.

First, I keep getting pulled into reviews of incidents where the root cause has the same shape: workloads that could reach anything they wanted, network controls that operated only at L3/L4, and zero visibility into what those workloads were actually saying to each other. Capital One’s SSRF to the AWS metadata service. Tesla’s cryptojacked pods phoning home through Cloudflare. SUNBURST beaconing over DNS for weeks before anyone noticed. None of those needed novel exploits. They needed an egress allowlist and a Hubble flow log.

Second, my current project explicitly asked for pod-level network isolation and is on a path toward FedRAMP Moderate with FIPS 140-3 validated crypto. “Use NetworkPolicies” is not an answer at that scope. Auditors want control mapping, evidence, encryption in transit between pods, and audit-grade flow data. Vanilla Kubernetes does not produce any of that on its own.

Third, a VPC-CNI version regression (Orel Fichman’s writeup) turned a baseline NetworkPolicy into a silent total blackout on a customer cluster. That bug is fixed now, but it’s a tell — the AWS network-policy agent is younger code, with fewer eyes on it, than Cilium’s eBPF datapath. If you’re betting your isolation story on it, that’s a load-bearing assumption worth examining.

So I sat down with Cilium and worked through it end-to-end.

What’s in the series

Part 1 — Cilium + k3d, hands-on with eBPF network policies (Nov 2025)
A local k3d cluster, no cloud account, walking the policy progression from no-policy to default-deny to L4 allow to L7 HTTP filtering to egress lockdown. Every step backed by curl probes and Hubble flow logs you can see on your laptop. Companion lab: practice/part1/.

Part 2 — Cilium on EKS: from VPC-CNI to identity-aware policy (Apr 2026)
The same patterns, now on EKS. Three deployment modes side by side (VPC-CNI only, VPC-CNI + Cilium chaining, full Cilium replacement) with pros, cons, and a Terraform module for each. Includes the operational reality: AWS Load Balancer Controller interaction, IRSA, ENI exhaustion, Hubble Relay sizing.

Part 3 — Cilium under FIPS 140-3 and FedRAMP (May 2026)
The compliance case. NIST 800-53 control families mapped to specific Cilium capabilities, FIPS 140-3 builds (Chainguard cilium-agent-fips, cilium-envoy-fips), transparent pod-to-pod encryption choices (WireGuard vs IPsec), and what still needs OPA / Tetragon alongside Cilium.
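
To give a flavor of the policy progression Part 1 walks through, here is a minimal sketch of what the L7 HTTP filtering step might look like as a CiliumNetworkPolicy. It is not taken from the companion lab: the labels (app=frontend, app=backend), the policy name, port 8080, and the /api/ path are all illustrative assumptions. The script only writes the manifest to disk so it can be inspected before anything touches a cluster.

```shell
#!/usr/bin/env sh
# Sketch only. Labels, policy name, port and path are hypothetical,
# chosen for illustration; they are not the values used in the lab.
set -eu

cat > l7-allow-get.yaml <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-get-from-frontend
spec:
  # Once a policy with ingress rules selects these pods, ingress that
  # no rule allows is dropped (default-deny for the selected endpoints).
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
    # L3/L4 step: only pods labeled app=frontend, only TCP 8080.
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          # L7 step: within that connection, allow only GET on /api/...
          rules:
            http:
              - method: GET
                path: "/api/.*"
EOF

echo "wrote l7-allow-get.yaml"
```

On a cluster running Cilium, `kubectl apply -f l7-allow-get.yaml` would load the policy. With it in place, a GET under /api/ from a frontend pod should pass, while other methods or paths from the same pod should be rejected at L7, and traffic from any other pod should be dropped before it reaches the proxy.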

Who this is for

Practitioners on EKS or self-managed Kubernetes who have to make a network-policy decision that survives both an outage and an audit. The series assumes you know what a NetworkPolicy is and have used kubectl. It does not assume you know eBPF — Part 1 covers the mental model in a section, not a chapter.

If you only have time for one post, read Part 2. If you only have time for one lab, run Part 1.

How to follow along

git clone https://github.com/hagzag/cilium-in-practice
cd cilium-in-practice
task list
task check-tools
task part1:run

See you in Part 1.