What this does

This guide configures Kubernetes Network Policies to isolate the services of an AI service mesh with explicit ingress and egress rules. Inference pods communicate only with the API gateway and model storage, agent pods reach only tool APIs and the vector database, and every other connection is denied by default. This zero-trust networking model limits lateral movement if an AI service is compromised and makes it harder to exfiltrate proprietary model weights.

Steps

Verify that the NetworkPolicy API is available in the cluster:
```
kubectl api-resources | grep networkpolicies
```
Expected output: a networkpolicies row. This confirms only that the API resource exists; whether policies are enforced depends on the CNI plugin (see Common failures).

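Apply namespace labels for policy targeting. The allow-from-gateway policy later in this guide selects a namespace labeled purpose=gateway, so label the namespace that hosts the API gateway the same way: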

kubectl label namespace ai-inference purpose=inference
kubectl label namespace ai-agents purpose=agents
kubectl label namespace ai-data purpose=data
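
The labeling commands above can be wrapped in a small loop so they are safe to re-run. This is a minimal sketch, not part of the original procedure: the function name label_ai_namespaces and the overridable KUBECTL variable are illustrative assumptions, and --overwrite is added so an existing label does not cause an error.

```shell
# Sketch: label every namespace the policies select on, idempotently.
# KUBECTL is an assumed override hook so the loop can be dry-run with KUBECTL=echo.
KUBECTL="${KUBECTL:-kubectl}"

label_ai_namespaces() {
  # namespace:purpose pairs taken from the commands above
  local pair
  for pair in ai-inference:inference ai-agents:agents ai-data:data; do
    # ${pair%%:*} is the namespace, ${pair##*:} is the label value
    "$KUBECTL" label namespace "${pair%%:*}" "purpose=${pair##*:}" --overwrite
  done
}
```

Setting KUBECTL=echo before calling label_ai_namespaces prints the three commands instead of running them. Afterwards, kubectl get namespaces -L purpose shows the label as a column.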

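Create a deny-all default policy for each namespace to establish a zero-trust baseline. The manifest below targets ai-inference; apply a copy with the namespace changed for ai-agents and ai-data: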

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: ai-inference
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

Expected: after applying, no pod in the namespace can send or receive traffic, DNS lookups included, until an allow policy selects it.
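
Because the deny-all manifest differs only in its namespace, one way to cover all three namespaces is to generate it. The sketch below assumes a hypothetical helper named render_deny_all that prints one YAML document per namespace argument; it only renders text and does not touch the cluster.

```shell
# Sketch: emit one deny-all manifest per namespace as a multi-document YAML stream.
render_deny_all() {
  local ns
  for ns in "$@"; do
    cat <<EOF
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: ${ns}
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF
  done
}
```

To apply the result, pipe it to kubectl, for example render_deny_all ai-inference ai-agents ai-data | kubectl apply -f -.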

Allow ingress from the API gateway to inference pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-gateway
  namespace: ai-inference
spec:
  podSelector:
    matchLabels:
      app: vllm
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              purpose: gateway
      ports:
        - port: 8000
          protocol: TCP
  policyTypes: [Ingress]

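Allow egress from inference pods to model storage. This rule matches pods in namespaces labeled purpose=data, for example an in-cluster S3-compatible store. An S3 endpoint outside the cluster needs an ipBlock rule instead, and PVC mounts are typically handled by the node rather than the pod network, so they are usually not subject to Network Policies: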

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-to-storage
  namespace: ai-inference
spec:
  podSelector:
    matchLabels:
      app: vllm
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              purpose: data
      ports:
        - port: 443
          protocol: TCP
  policyTypes: [Egress]
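
If ai-data also has a deny-all baseline, the egress rule above is only half of the path: the storage pods must accept the connection too. The sketch below renders a matching ingress policy. The policy name allow-from-inference and the helper name render_storage_ingress are assumptions for illustration, and podSelector: {} is a placeholder to narrow to the labels of your storage pods.

```shell
# Sketch: destination-side ingress rule that pairs with allow-to-storage.
render_storage_ingress() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-inference   # hypothetical name
  namespace: ai-data
spec:
  podSelector: {}              # all pods in ai-data; narrow to your storage pods' labels
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              purpose: inference
          podSelector:         # same list element: namespace AND pod labels must both match
            matchLabels:
              app: vllm
      ports:
        - port: 443
          protocol: TCP
  policyTypes: [Ingress]
EOF
}
```

Running render_storage_ingress | kubectl apply -f - would create the policy. Placing namespaceSelector and podSelector in the same list element restricts the source to vllm pods in the inference namespace rather than any pod there.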

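Allow agent pods to reach tool APIs and the vector database. Use an ipBlock for destinations outside the cluster; 10.0.0.0/8 is a private range used here as an example, so replace it with the CIDRs of your tool endpoints. The second rule's podSelector has no namespaceSelector, so it matches weaviate pods in the ai-agents namespace only: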

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-to-tools
  namespace: ai-agents
spec:
  podSelector:
    matchLabels:
      app: ai-agent
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.0.0/8
            except: [10.0.0.0/28]  # Exclude sensitive subnets
      ports:
        - port: 443
          protocol: TCP
    - to:
        - podSelector:
            matchLabels:
              app: weaviate
      ports:
        - port: 8080
          protocol: TCP
  policyTypes: [Egress]
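
If the vector database runs in a different namespace than the agents, the bare podSelector above will not match it. The sketch below assumes, purely for illustration, that Weaviate lives in the namespace labeled purpose=data; the policy name allow-to-vectordb and the helper render_vectordb_egress are hypothetical.

```shell
# Sketch: egress from agent pods to weaviate pods in another namespace.
render_vectordb_egress() {
  # $1: purpose label of the namespace hosting Weaviate (assumed to be "data")
  local purpose="${1:-data}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-to-vectordb
  namespace: ai-agents
spec:
  podSelector:
    matchLabels:
      app: ai-agent
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              purpose: ${purpose}
          podSelector:
            matchLabels:
              app: weaviate
      ports:
        - port: 8080
          protocol: TCP
  policyTypes: [Egress]
EOF
}
```

Calling render_vectordb_egress with no argument targets purpose=data; pass a different label value if your vector database namespace is labeled otherwise.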

Allow DNS resolution egress (necessary for service discovery). Every namespace with a deny-all baseline needs its own copy of this policy; the manifest below covers ai-inference:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: ai-inference
spec:
  podSelector: {}
  egress:
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - port: 53
          protocol: UDP
        - port: 53      # DNS falls back to TCP for large responses
          protocol: TCP
  policyTypes: [Egress]
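
To confirm DNS still works inside each locked-down namespace, a throwaway pod can attempt a lookup. This is a sketch under assumptions: check_dns and the KUBECTL override are illustrative names, the alpine image's nslookup is assumed to be sufficient, and the result relies on kubectl run returning the container's exit status.

```shell
# Sketch: run a DNS lookup from a temporary pod in each namespace given.
KUBECTL="${KUBECTL:-kubectl}"

check_dns() {
  local ns
  for ns in "$@"; do
    # The pod is selected by allow-dns (podSelector: {}), so the lookup should succeed.
    if "$KUBECTL" run dns-probe --rm -i --restart=Never --image=alpine \
         -n "$ns" -- nslookup kubernetes.default.svc.cluster.local >/dev/null 2>&1; then
      echo "$ns: dns ok"
    else
      echo "$ns: dns FAILED"
    fi
  done
}
```

For example, check_dns ai-inference ai-agents ai-data prints one line per namespace. A FAILED line can also mean the pod could not start, so check kubectl get events in that namespace before concluding the policy is at fault.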

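Test isolation by attempting to reach a restricted service from an unauthorized pod. The first command opens a shell in a temporary pod; run the second command inside that shell:
```
kubectl run -it --rm test-pod --image=alpine --namespace=default -- sh
wget -qO- -T 5 http://vllm.ai-inference.svc.cluster.local:8000/health
```
Expected: the request fails with a timeout instead of returning a response, confirming that the deny-all policy blocks cross-namespace traffic into ai-inference.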
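
The same check can be run non-interactively, which is convenient in scripts. The sketch below is an assumption-laden helper, not part of the original steps: probe and KUBECTL are illustrative names, and the verdict relies on kubectl run returning the exit status of wget.

```shell
# Sketch: one-shot reachability probe from a temporary pod.
KUBECTL="${KUBECTL:-kubectl}"

# Prints "reachable" if the HTTP request succeeds, "blocked" if it fails or times out.
probe() {
  local ns="$1" url="$2"
  if "$KUBECTL" run np-probe --rm -i --restart=Never --image=alpine \
       -n "$ns" -- wget -qO- -T 5 "$url" >/dev/null 2>&1; then
    echo reachable
  else
    echo blocked
  fi
}
```

With the policies in place, probe default http://vllm.ai-inference.svc.cluster.local:8000/health should print blocked. Because "blocked" is also printed when the service is down or the pod fails to start, it is worth running the same probe from an allowed source and confirming it prints reachable.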

Verification

kubectl get networkpolicy -A --no-headers | wc -l

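Expected output: the number of Network Policies across all namespaces. With the policies defined above it should be at least 5, and higher once deny-all and allow-dns are applied to every namespace. The count shows that the policies exist, not that the CNI enforces them.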
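
A raw count does not show which namespace is missing its baseline. The sketch below filters the same kubectl listing per namespace; missing_baseline is a hypothetical helper that reads the listing on stdin and assumes the default column order of NAMESPACE then NAME.

```shell
# Sketch: report namespaces that have no policy named deny-all.
# stdin: output of `kubectl get networkpolicy -A --no-headers`
# args:  namespaces that are expected to have the baseline
missing_baseline() {
  local listing ns
  listing="$(cat)"
  for ns in "$@"; do
    # awk exits 0 only if a row matches both the namespace and the policy name
    printf '%s\n' "$listing" |
      awk -v ns="$ns" '$1 == ns && $2 == "deny-all" { found = 1 } END { exit !found }' ||
      echo "$ns"
  done
}
```

Usage: kubectl get networkpolicy -A --no-headers | missing_baseline ai-inference ai-agents ai-data. Empty output means every listed namespace has its deny-all policy.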

Common failures

Service-to-service communication broken: a deny-all policy was applied without corresponding allow rules. Audit each namespace with kubectl describe networkpolicy -n <namespace> and make sure every legitimate communication path has an explicit allow rule.
DNS resolution fails: the deny-all policy blocks port 53 to CoreDNS. Add the allow-dns policy from the Steps section to every namespace with a deny-all baseline.
Network policy has no effect: the CNI may not enforce Kubernetes Network Policies. Check with kubectl get pods -n kube-system | grep -E "calico|cilium|antrea" to confirm a compatible CNI is running.
Ingress allowed but the connection still fails: Network Policies are connection-aware, so replies on an allowed connection need no rule of their own. The usual cause is that the client's namespace has a deny-all egress baseline and no egress rule toward the server. Each connection needs an egress allow on the source pod and an ingress allow on the destination pod.
