Kubernetes networking is one of those topics that seems simple on the surface but hides a lot of nuance. Every pod gets its own IP address, every service gets a stable DNS name, and traffic just... flows. Until it doesn't.
## The Flat Network Model
Kubernetes mandates that all pods can communicate with each other without NAT. This is the "flat network" model and it's one of the most important design decisions in the entire system. It means that your application code doesn't need to care about the underlying network topology.
In practice, this is implemented by CNI (Container Network Interface) plugins like Calico, Cilium, or Flannel. Each takes a different approach:
- Calico uses BGP to distribute routes across nodes
- Cilium leverages eBPF for high-performance, kernel-level packet processing
- Flannel creates a simple overlay network using VXLAN
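Whatever plugin is in play, the contract from the application's point of view is the same: every pod has a routable IP, and reaching another pod is a plain TCP/UDP connection to that IP. A minimal sketch of a pod you could exec into to verify this (the name, namespace, and image are placeholders, not from any real cluster):

```yaml
# A throwaway pod for poking at the flat network. Once it's running,
# `kubectl exec ping-source -n team-a -- ping <other-pod-ip>` should
# succeed for any pod IP in the cluster -- no NAT, no port mapping.
apiVersion: v1
kind: Pod
metadata:
  name: ping-source        # hypothetical name
  namespace: team-a        # hypothetical namespace
spec:
  containers:
    - name: shell
      image: busybox:1.36
      command: ["sleep", "infinity"]   # keep the pod alive for exec
```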
## Service Discovery and DNS
Every Kubernetes service gets a DNS entry in the form <service>.<namespace>.svc.cluster.local. CoreDNS handles the resolution, and it's surprisingly fast — most queries resolve in under a millisecond.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api
  namespace: production
spec:
  selector:
    app: my-api
  ports:
    - port: 80
      targetPort: 8080
```
This service is reachable at my-api.production.svc.cluster.local from anywhere in the cluster.
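To see that name in action, a client pod in any other namespace can call it directly; a sketch, with the pod name and image as assumptions:

```yaml
# A one-shot client pod that hits the service through its cluster DNS name.
apiVersion: v1
kind: Pod
metadata:
  name: dns-test           # hypothetical client pod
  namespace: default       # deliberately a different namespace
spec:
  containers:
    - name: curl
      image: curlimages/curl:8.5.0
      # The fully qualified name is needed from outside `production`;
      # inside that namespace, plain `my-api` would also resolve.
      command: ["curl", "-s", "http://my-api.production.svc.cluster.local"]
  restartPolicy: Never
```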
## Ingress Controllers
Getting traffic into the cluster is where ingress controllers come in. NGINX Ingress Controller is the most common, but Traefik and Envoy-based solutions like Contour are gaining ground.
The key insight is that an ingress controller is just a reverse proxy that watches the Kubernetes API for Ingress resources and reconfigures itself accordingly. It's a beautiful pattern — declarative infrastructure at its finest.
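An Ingress resource is exactly the kind of object the controller watches for. A minimal sketch routing the service from earlier (the hostname is a placeholder, and `ingressClassName: nginx` assumes the NGINX Ingress Controller is installed):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-api
  namespace: production
spec:
  ingressClassName: nginx        # assumes the NGINX Ingress Controller
  rules:
    - host: api.example.com      # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api     # the Service defined above
                port:
                  number: 80
```

When this object is created, the controller notices it via its API watch and regenerates its proxy configuration; nothing ever has to restart the proxy by hand.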
## Service Mesh: Do You Need One?
Service meshes like Istio and Linkerd add a sidecar proxy to every pod, giving you mTLS, traffic splitting, retries, and observability without changing application code. The trade-off is operational complexity and per-pod resource overhead.
My rule of thumb: if you have fewer than 20 services, you probably don't need a service mesh. If you have more than 50, you probably do. In between, it depends on your team's maturity and your observability requirements.
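As a taste of what a mesh buys you, here is a sketch of traffic splitting with an Istio VirtualService; the subset names are assumptions and would have to be defined in a matching DestinationRule:

```yaml
# Send 90% of in-cluster traffic for my-api to subset v1, 10% to v2.
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: my-api
  namespace: production
spec:
  hosts:
    - my-api
  http:
    - route:
        - destination:
            host: my-api
            subset: v1          # hypothetical subset, defined in a
          weight: 90            # matching DestinationRule
        - destination:
            host: my-api
            subset: v2
          weight: 10
```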
## What I've Learned
After running Kubernetes in production for three years, the networking layer is where most subtle bugs hide. Misconfigured network policies, DNS caching issues, and conntrack table exhaustion have all bitten me at least once. The best defense is understanding the fundamentals deeply — once you know how packets actually flow through the system, debugging becomes much more tractable.
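On the network-policy front, the classic foot-gun is applying a default-deny policy without the corresponding allow rules. A sketch of the pair, reusing the service from earlier (the `frontend` label is a hypothetical client workload):

```yaml
# First, deny all ingress to every pod in the namespace...
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}              # empty selector matches every pod
  policyTypes:
    - Ingress
---
# ...then explicitly allow the frontend to reach the API pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: my-api
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend    # hypothetical client label
      ports:
        - port: 8080           # the container port behind the Service
```

Note that policies select pods and ports, not Services: the allow rule targets port 8080 (the `targetPort`), not the Service's port 80. That mismatch is exactly the kind of subtle bug this layer hides.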