A default-deny NetworkPolicy is five lines of spec. Those five lines will also kill DNS resolution for every pod they select, because an egress deny blocks UDP packets to kube-dns just as happily as it blocks the traffic you were actually worried about. The distance between "I understand network policies" and "I rolled out default deny without an outage" is mostly three blind spots: DNS, your ingress controller, and admission webhooks.
Out of the box, Kubernetes runs a flat pod network. Every pod can open a connection to every other pod in the cluster, across namespaces, no questions asked. If you've already done the work of building least-privilege service accounts, a flat network is the same problem one layer down: identity is locked tight while the network is wide open. This post is about closing that gap with Calico on a bare-metal cluster (K8s 1.31, Calico 3.x), in an order that doesn't take the cluster down while you do it.
One prerequisite worth stating plainly: the NetworkPolicy API objects exist in every cluster, but they do nothing unless your CNI enforces them. Calico does. If you're on a CNI without policy support, you can apply these manifests all day and traffic flows anyway, which is its own special category of false confidence.
## The rollout that looks right and isn't
The tempting approach goes like this: write one default-deny policy, template it across every namespace, apply, done. Security checkbox ticked before lunch.
Here's the policy everyone starts with:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: team-a
spec:
  podSelector: {} # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
```
The empty `podSelector` selects all pods, and listing both policy types makes them isolated in both directions. Correct, minimal, and the moment it lands cluster-wide, three things break in a predictable order.
### Failure one: DNS dies first, and it dies slowly
Every pod in a selected namespace loses the ability to resolve names, because queries to kube-dns in `kube-system` are egress traffic like any other. The nasty part is the failure mode. A connection to a denied endpoint hangs and then fails with a timeout you'll notice. DNS failures look different: each lookup waits out a 5-second timeout per attempt, multiplied by the search domain list your `ndots` config generates. Apps get slow before they get broken, which sends you debugging application performance instead of network policy. I wrote about how the search domain expansion amplifies this in [the ndots:5 post](https://guatulabs.dev/posts/wildcard-dns-ndots-5-the-tls-nightmare-and-how-to-fix-it/); default deny turns every one of those expanded lookups into a 5-second black hole.
### Failure two: your ingress controller can't reach anything
Traffic from Traefik or ingress-nginx to your backend pods is just pod-to-pod traffic crossing a namespace boundary. Default deny on the application namespace blocks it, and every service behind the ingress starts returning 502s and 504s. The application pods are healthy, the Service endpoints are populated, readiness probes pass (kubelet probes come from the node, and Calico permits them). Everything looks green except the part where users reach it. This also bites cert-manager: an HTTP-01 challenge needs the ingress controller to reach the temporary solver pod, so default deny can silently stall certificate issuance long after the initial rollout.
### Failure three: the webhook deadlock
This is the one that turns a degraded cluster into a stuck one. Admission webhooks (Kyverno, cert-manager's webhook, anything with a `ValidatingWebhookConfiguration`) receive calls from the API server. Deny ingress to the webhook pod and those calls time out. With `failurePolicy: Fail`, the API server now rejects the operations that webhook gates, and here's the kicker: the NetworkPolicy you're trying to apply to fix the problem is itself an API operation that flows through admission. You're locked out of the fix by the thing you broke.
It gets worse if the policies are managed by automation. With a Kyverno generate rule or [a GitOps controller](https://guatulabs.dev/posts/gitops-for-homelabs-argocd-app-of-apps/) syncing the policy, deleting the offending NetworkPolicy by hand buys you a few seconds before it's regenerated. You end up playing whack-a-mole against your own reconciliation loop while the cluster burns. The escape hatch is to pause the automation first (scale down Kyverno, disable ArgoCD auto-sync for that app), then remove the policy.
A detail that matters here: API server traffic to webhooks often originates from the control plane host network, not from a pod you can match with a `podSelector`. Allowing it means an `ipBlock` rule for your control plane CIDR, or excluding webhook namespaces from default deny entirely. I do the latter.
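For anyone taking the `ipBlock` route instead, a minimal sketch might look like the following. The namespace, pod label, CIDR, and port are all placeholders rather than values from this cluster: substitute the range your control plane nodes sit in and the port your webhook container listens on, and check the source address the webhook pod actually sees before trusting the CIDR.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-apiserver-to-webhook   # hypothetical name
  namespace: kyverno                 # assumed: namespace that hosts the webhook
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: admission-controller   # assumed pod label
  policyTypes: [Ingress]
  ingress:
    - from:
        - ipBlock:
            cidr: 10.0.0.0/24        # assumed control plane node CIDR
      ports:
        - protocol: TCP
          port: 9443                 # assumed webhook container port
```

The port is the one the pod listens on, not the Service port, because policy is evaluated after the Service address has been translated to a pod address.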
## A rollout order that works
The fix for all three failures is the same discipline: never apply a deny you haven't already written the allows for, and never apply it wider than you can watch.
### Step 1: one namespace, not the cluster
Pick a single application namespace with low blast radius. Resist the urge to start cluster-wide; the whole point of the first namespace is to discover the flows you forgot existed. `kubectl get networkpolicy -A` should stay boring while you learn.
### Step 2: the baseline trio
Default deny ships as a set of three policies applied together, in one `kubectl apply -f` of one directory. The deny:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
```
The DNS allow, which goes everywhere the deny goes, no exceptions:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```
Both protocols matter. DNS falls back to TCP for large responses, and an egress rule that only allows UDP produces intermittent failures that are miserable to track down.
The intra-namespace and ingress-controller allow:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-baseline-ingress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Ingress]
  ingress:
    # any pod in this same namespace
    - from:
        - podSelector: {}
    # everything in the ingress controller's namespace
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress
```
That `kubernetes.io/metadata.name` label is the load-bearing trick here. Since K8s 1.22, every namespace carries it automatically with its own name as the value, which gives you a stable way to select namespaces without inventing and maintaining your own labeling scheme.
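The ingress-controller rule in the baseline admits every pod in the `ingress` namespace. If that is wider than you want, a `podSelector` placed in the same `from` entry as the `namespaceSelector` narrows it to the controller pods alone. A sketch of a tighter variant, where the policy name and the `app.kubernetes.io/name: traefik` label are assumptions about how your controller pods are labeled:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller   # hypothetical name
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Ingress]
  ingress:
    - from:
        # one list item with two selectors: namespace AND pod must both match
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress
          podSelector:
            matchLabels:
              app.kubernetes.io/name: traefik   # assumed controller pod label
```

The indentation carries the meaning. With `podSelector` as a sibling key inside the same list item, both selectors must match. Written as a second list item with its own leading dash, it would instead mean "anything in the `ingress` namespace, or any pod in `team-a` with that label".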
With the trio applied, check behavior from inside the namespace before moving on:
```bash
# throwaway pod inside the locked-down namespace
kubectl -n team-a run probe --rm -it --image=busybox:1.36 --restart=Never -- sh

# inside the pod:
nslookup kubernetes.default                          # should answer instantly
wget -qO- -T 2 http://api.team-b.svc.cluster.local   # should time out
```
Fast DNS plus a cross-namespace connection that hangs until the timeout is the signature of a healthy baseline. A DNS lookup that hangs means the `allow-dns-egress` policy didn't land; an instant cross-namespace success means the deny didn't.
### Step 3: log before you deny
Calico's `Log` rule action is the visibility tool the vanilla NetworkPolicy API doesn't have. Before tightening further, I put a logging policy behind the allows so I can see what the deny is about to catch:
```yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: log-unmatched
spec:
  order: 4000 # evaluated after everything else
  namespaceSelector: projectcalico.org/name == 'team-a'
  types: [Ingress, Egress]
  ingress:
    - action: Log
  egress:
    - action: Log
```
With the iptables dataplane, `Log` uses the kernel `LOG` target, so dropped-candidate packets show up in the kernel log with a `calico-packet:` prefix (configurable via `logPrefix` in `FelixConfiguration`):

```bash
journalctl -k --grep calico-packet
```
Two caveats. Kernel logging is noisy, so treat this as a diagnostic you enable for hours, not a permanent fixture. And the eBPF dataplane doesn't support the `Log` action, so if you've switched dataplanes this tool isn't available.
This step is where "set and forget" turns into something closer to auditing. Run a logging policy for a day against a namespace before enforcing, and you find the flows nobody documented: the metrics scraper, the backup job, the sidecar that phones a service in another namespace.
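Each flow the log surfaces becomes a small, named allow. As an illustration, suppose the log shows pods in `team-a` calling a service in another namespace. The `monitoring` namespace, the `app: pushgateway` label, and port 9091 below are hypothetical stand-ins for whatever your own log reveals:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-pushgateway   # hypothetical name
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring   # hypothetical namespace
          podSelector:
            matchLabels:
              app: pushgateway                          # hypothetical pod label
      ports:
        - protocol: TCP
          port: 9091                                    # hypothetical port
```

Egress is only half the path. If the destination namespace is isolated too, it needs a matching ingress allow for `team-a` before the connection succeeds.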
One class of flow deserves special mention: anything running with `hostNetwork: true`. Node-level monitoring agents and some bare-metal ingress deployments source their traffic from the node's IP, not a pod IP, so `podSelector` and `namespaceSelector` rules never match them. If scraping or health checks break only after enforcement, this is usually why, and the fix is an `ipBlock` rule covering your node CIDR rather than another selector you'll fight with.
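A sketch of that `ipBlock` rule, assuming a node CIDR of `192.168.10.0/24` and a metrics port of 8080, neither of which comes from this cluster. Depending on encapsulation, traffic from a host-networked process on another node can arrive from that node's tunnel address rather than its primary IP, so confirm the source address in the logs before settling on a CIDR.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-node-scrape   # hypothetical name
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Ingress]
  ingress:
    - from:
        - ipBlock:
            cidr: 192.168.10.0/24   # assumed node CIDR
      ports:
        - protocol: TCP
          port: 8080                # assumed metrics port
```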
### Step 4: the cluster-wide backstop
Once the per-namespace pattern is proven, Calico's `GlobalNetworkPolicy` enforces namespace isolation as a guardrail across every tenant namespace at once, with infrastructure explicitly carved out:
```yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: tenant-isolation-backstop
spec:
  order: 3000
  namespaceSelector: >-
    projectcalico.org/name not in
    {"kube-system", "calico-system", "calico-apiserver",
    "ingress", "argocd", "cert-manager", "kyverno"}
  types: [Ingress, Egress]
  egress:
    # DNS keeps working even where namespace policies are missing
    - action: Allow
      protocol: UDP
      destination:
        selector: k8s-app == 'kube-dns'
        ports: [53]
    - action: Allow
      protocol: TCP
      destination:
        selector: k8s-app == 'kube-dns'
        ports: [53]
```
No explicit Deny rule, and that's deliberate. In Calico, when at least one policy selects an endpoint and no rule allows the packet, the packet is dropped at the end of evaluation. The backstop selects everything outside the exclusion list, allows DNS, and lets the implicit deny do the rest.
The `order: 3000` is doing real work. Calico assigns Kubernetes NetworkPolicies an order of 1000, and lower order means earlier evaluation. An allow in a namespace's own policy terminates evaluation before the backstop is ever consulted. The backstop only catches traffic nothing else has claimed, which means namespaces with proper policies behave per their policies, and namespaces without any get isolation by default instead of the flat network.
That exclusion list is the "infrastructure exclusion" pattern, and I'd argue it's the single most important decision in the whole rollout. The namespaces that run your CNI, your ingress, your GitOps controller, and your admission webhooks are the namespaces where a policy mistake costs you the ability to fix policy mistakes. Leave them out of automated enforcement. Write their policies by hand, later, one at a time, with the logging step in between.
### Step 5: automate generation, with the same exclusions
For new namespaces, a Kyverno generate rule stamps the baseline in automatically. This is the rule for the deny; the two allows need sibling rules of the same shape:
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-default-deny
spec:
  rules:
    - name: default-deny
      match:
        any:
          - resources:
              kinds: [Namespace]
      exclude:
        any:
          - resources:
              kinds: [Namespace]
              names: ["kube-system", "kube-public", "kube-node-lease",
                      "calico-system", "ingress", "argocd", "kyverno"]
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            podSelector: {}
            policyTypes: [Ingress, Egress]
```
Two operational notes. `synchronize: true` is what creates the regeneration loop from failure three: hand-deleting the generated policy gets it recreated within seconds, so during an incident you pause the ClusterPolicy before touching its output. And Kyverno treats generate rules as effectively immutable: if the generated resource definition is wrong, plan on deleting and recreating the ClusterPolicy rather than patching it in place.
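The ClusterPolicy shown in this step generates only the deny. A sketch of a companion for the DNS allow, reusing the same match and exclude blocks, could look like this. The ClusterPolicy name is made up, and whether it lives as its own ClusterPolicy or as a second entry under `rules` is a matter of taste:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-allow-dns-egress   # hypothetical name
spec:
  rules:
    - name: allow-dns-egress
      match:
        any:
          - resources:
              kinds: [Namespace]
      exclude:
        any:
          - resources:
              kinds: [Namespace]
              names: ["kube-system", "kube-public", "kube-node-lease",
                      "calico-system", "ingress", "argocd", "kyverno"]
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: allow-dns-egress
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          # same spec as the hand-written allow-dns-egress policy
          spec:
            podSelector: {}
            policyTypes: [Egress]
            egress:
              - to:
                  - namespaceSelector:
                      matchLabels:
                        kubernetes.io/metadata.name: kube-system
                    podSelector:
                      matchLabels:
                        k8s-app: kube-dns
                ports:
                  - protocol: UDP
                    port: 53
                  - protocol: TCP
                    port: 53
```

Keeping the exclude list identical across every generate rule is the point: a namespace should get the whole baseline or none of it.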
## Why this works
The mental model that makes all of this predictable: Kubernetes NetworkPolicies are additive allow-lists with an implicit deny that activates the moment any policy selects a pod. There is no deny rule in the vanilla API. A pod selected by zero policies accepts everything; a pod selected by any policy accepts only what the union of matching policies allows. That's why the baseline trio works as a set: the deny policy flips the pod into isolated mode, and the other two define the allowed surface.
Calico layers an ordered evaluation model on top. Policies are sorted by `order`, rules within a policy run top to bottom, and the first `Allow` or `Deny` terminates evaluation. Kubernetes-native policies slot in at order 1000 (you can see the converted versions with `calicoctl get networkpolicy --all-namespaces`, prefixed `knp.default.`). Pods matched by no policy at all fall through to Calico's per-namespace profiles, which default to allow. That layering is exactly what makes the backstop-at-3000 pattern safe: specific intent at 1000 wins, the guardrail catches the remainder, and the logging policy at 4000 sees only what's about to die.
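To see the ordering in action, here is a sketch of a namespaced Calico policy that sits ahead of the Kubernetes-native ones. The scenario is hypothetical: `team-b` should never reach `team-a`, even if someone later adds a Kubernetes NetworkPolicy that would allow it. The policy name and the choice of 500 are illustrative.

```yaml
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: deny-from-team-b   # hypothetical name
  namespace: team-a
spec:
  order: 500               # below 1000, so evaluated before Kubernetes-native policies
  selector: all()          # every pod in team-a
  types: [Ingress]
  ingress:
    - action: Deny
      source:
        namespaceSelector: projectcalico.org/name == 'team-b'
```

Traffic from `team-b` hits the `Deny` and evaluation stops there. Everything else matches no rule in this policy and moves on to the next one in order, so the existing allows keep working as before.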
Felix, Calico's per-node agent, also quietly saves you from the worst self-own. Its failsafe port list (SSH on 22, the API server on 6443, BGP on 179, etcd, Typha) is exempt from policy on host endpoints by default, so a bad policy can break your workloads without also locking you out of the nodes you need to fix it from. Don't shrink that list without a very specific reason.
## Lessons learned
The failure modes are knowable in advance. DNS, ingress, and webhooks fail in that order every time, and writing the allows before the deny is cheaper in every way than discovering them from a monitoring graph. If a rollout plan doesn't mention `kube-dns`, port 53, or `failurePolicy`, it isn't done.
Namespace-by-namespace beats cluster-wide, even though it feels slower. The first namespace takes a day because you're discovering undocumented flows. The tenth takes ten minutes because there's nothing left to discover. Going cluster-wide first inverts that: you discover everything at once, in production, with automation re-applying the breakage faster than you can remove it.
Exclude infrastructure from automation permanently, not temporarily. Every system that can generate or sync policies (Kyverno, ArgoCD, your own scripts) should carry the same exclusion list for `kube-system`, the CNI namespace, ingress, GitOps, and webhook namespaces. The asymmetry is stark: a missing policy in those namespaces costs you some security posture, while a wrong policy there costs you the control plane's ability to accept the fix.
Logging is the difference between policy as guesswork and policy as engineering. The `Log` action is crude (kernel log lines, iptables dataplane only), but it converts "why is this connection failing" from a hypothesis into a grep. I'd take crude visibility over elegant blindness in any network debugging session. This pattern, restrict by default and watch the boundary, is the same shape as the guardrails I build around [autonomous agent infrastructure](https://guatulabs.com/services): the deny is easy, and the engineering is in the observability that tells you what the deny will cost before you pay it.
The thing the docs undersell is that default deny is a migration, not a manifest. The YAML is trivial. The work is the inventory of flows your cluster actually depends on, and you only get that inventory by watching one namespace at a time with the logs on.