08. GPU Node Selection

Chapter 8 of 24 · 20 min

GPU node selection controls which nodes run pods that request GPU resources. The Kubernetes scheduler already accounts for resource availability when placing pods, but labels, taints, and node affinity rules are what keep GPU workloads on GPU nodes and other workloads off them.

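Node labels provide the metadata that selection rules match against. Labels such as nvidia.com/gpu.count, nvidia.com/gpu.memory, and nvidia.com/gpu.family describe GPU characteristics. They are NVIDIA-specific rather than vendor-neutral, and in clusters running NVIDIA GPU Feature Discovery they are applied automatically, so manual labeling is only needed where that component is not installed.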

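# Label a node by hand with GPU characteristics.
# Values follow the GPU Feature Discovery conventions: memory in MiB,
# family as the GPU architecture name.
kubectl label node gpu-worker-1 \
  nvidia.com/gpu.count=1 \
  nvidia.com/gpu.memory=40960 \
  nvidia.com/gpu.family=ampere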

# List nodes that carry the GPU count label
kubectl get nodes -l nvidia.com/gpu.count

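A taint repels pods from a node unless the pod carries a matching toleration. Tainting GPU nodes keeps workloads that do not need a GPU off them, which creates a dedicated GPU pool. The taint effect determines what happens to pods without a matching toleration: NoSchedule blocks new pods, PreferNoSchedule only discourages placement, and NoExecute also evicts pods already running on the node.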

# Taint a GPU node to dedicate it to AI workloads (excerpt of the Node object)
apiVersion: v1
kind: Node
metadata:
  name: gpu-worker-1
spec:
  taints:
    - key: nvidia.com/gpu
      value: "compute"
      effect: NoSchedule
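
The same taint can also be applied imperatively, which avoids editing the Node object directly. The sketch below wraps the relevant kubectl calls in small shell functions; the function names are hypothetical helpers, and the key, value, and effect are taken from the manifest above.

```shell
# Add the dedicated-GPU taint to the node passed as $1.
taint_gpu_node() {
  kubectl taint nodes "$1" nvidia.com/gpu=compute:NoSchedule
}

# Print the taints currently set on a node as key=value:effect, one per line.
show_taints() {
  kubectl get node "$1" \
    -o jsonpath='{range .spec.taints[*]}{.key}={.value}:{.effect}{"\n"}{end}'
}

# Remove the taint again; the trailing "-" tells kubectl to delete it.
untaint_gpu_node() {
  kubectl taint nodes "$1" nvidia.com/gpu=compute:NoSchedule-
}
```

For example, running taint_gpu_node gpu-worker-1 and then show_taints gpu-worker-1 should list nvidia.com/gpu=compute:NoSchedule among the node's taints.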

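Node affinity selects nodes by label, while tolerations allow a pod onto a tainted node. The two are independent: node affinity does not add a toleration, so a pod that targets GPU nodes must also declare a toleration for the GPU taint or it stays Pending. Together, the taint keeps non-GPU workloads off GPU nodes and node affinity keeps GPU workloads from landing elsewhere. The fragment below requires a node whose nvidia.com/gpu.count label is greater than 0 (Gt compares the label value as an integer) and uses pod anti-affinity to prefer placing inference-server replicas on different nodes.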

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: nvidia.com/gpu.count
              operator: Gt
              values:
                - "0"
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: inference-server
          topologyKey: kubernetes.io/hostname
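
Tolerations have to be declared explicitly, so a pod that uses only the affinity fragment above would stay Pending on a node tainted with nvidia.com/gpu=compute:NoSchedule. The following sketch shows one way to spot that situation and to add the missing toleration to an existing Deployment. The function names are hypothetical, and using a merge patch is an assumption about how you prefer to edit the workload; adding a tolerations block to the manifest works equally well.

```shell
# List scheduling failures in the current namespace, oldest first.
# A missing toleration shows up here as a message that mentions the taint.
scheduling_failures() {
  kubectl get events \
    --field-selector reason=FailedScheduling \
    --sort-by=.lastTimestamp
}

# Add a toleration for the nvidia.com/gpu=compute:NoSchedule taint to the
# Deployment named in $1. Note: a JSON merge patch replaces the whole
# tolerations list rather than appending to it.
add_gpu_toleration() {
  kubectl patch deployment "$1" --type merge -p '{
    "spec": {"template": {"spec": {"tolerations": [
      {"key": "nvidia.com/gpu", "operator": "Equal",
       "value": "compute", "effect": "NoSchedule"}
    ]}}}
  }'
}
```

A typical sequence is scheduling_failures to confirm the cause, then add_gpu_toleration with the Deployment name, after which the Deployment rolls out new pods that carry the toleration.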

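Topology spread constraints distribute pods across failure domains such as zones or nodes, which improves availability when a domain fails. The topologyKey names the node label that defines a domain, and maxSkew is the maximum allowed difference in matching pod count between domains. With whenUnsatisfiable: DoNotSchedule, a pod that would exceed maxSkew stays Pending; ScheduleAnyway turns the constraint into a preference.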

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: inference-server
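
To make maxSkew concrete, the sketch below computes the skew from a list of per-zone pod counts: the largest count minus the smallest. The helper names and the zone names in the usage example are hypothetical, and every eligible zone, including zones with zero matching pods, needs a line in the input.

```shell
# Read "zone count" lines on stdin and print the skew:
# (most matching pods in any zone) - (fewest matching pods in any zone).
zone_skew() {
  awk '
    NR == 1      { min = $2 + 0; max = $2 + 0 }
    $2 + 0 < min { min = $2 + 0 }
    $2 + 0 > max { max = $2 + 0 }
    END          { print max - min }
  '
}

# Show which zone each node belongs to (the label used as topologyKey above).
nodes_by_zone() {
  kubectl get nodes -L topology.kubernetes.io/zone
}
```

For instance, printf 'zone-a 2\nzone-b 1\nzone-c 1\n' | zone_skew prints 1, which satisfies maxSkew: 1. Placing a fourth replica in zone-a would give counts of 3, 1, and 1, a skew of 2, so with DoNotSchedule the scheduler places it in zone-b or zone-c instead.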

EXERCISE

Configure a GPU node pool in a Kubernetes cluster. Label the node with GPU characteristics, apply taints for dedicated scheduling, and deploy an inference workload using node affinity and topology spread constraints. Verify that the scheduler respects the constraints and pods spread appropriately.

# Label and taint the GPU node
kubectl label node worker-gpu-2 \
  node.kubernetes.io/gpu-pool=ai-inference \
  nvidia.com/gpu.product=NVIDIA-A100-40GB

kubectl taint node worker-gpu-2 \
  nvidia.com/gpu=ai-inference:NoSchedule

# Deploy with a matching toleration and node affinity
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: inference-server
  template:
    metadata:
      labels:
        app: inference-server
    spec:
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: nvidia.com/gpu.product
                    operator: Exists
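      # Prefer spreading replicas across GPU nodes; ScheduleAnyway keeps a
      # single-node pool schedulable.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: inference-server
      containers: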
        - name: inference
          image: inference/model-server:v2.1
          resources:
            limits:
              nvidia.com/gpu: 1
EOF
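
The verification step of the exercise could look like the following sketch. The helper names are hypothetical; the label selector, Deployment name, and node name come from the manifest and commands above. Each replica requests one GPU, so with a single GPU node the number of Running replicas is capped by that node's allocatable nvidia.com/gpu count and the rest remain Pending.

```shell
# Wait for the rollout to finish (or time out if replicas stay Pending).
wait_for_rollout() {
  kubectl rollout status deployment/inference-server --timeout=120s
}

# Show which node each replica landed on and its current phase.
pods_by_node() {
  kubectl get pods -l app=inference-server \
    -o custom-columns='POD:.metadata.name,NODE:.spec.nodeName,PHASE:.status.phase'
}

# Print how many GPUs the node in $1 advertises as allocatable.
# The dot in the resource name is escaped for kubectl's JSONPath.
gpu_allocatable() {
  kubectl get node "$1" \
    -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}
```

After wait_for_rollout returns, pods_by_node should show every Running replica on worker-gpu-2, and gpu_allocatable worker-gpu-2 helps explain any replicas that remain Pending.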