Kubernetes scheduling determines where your pods run. Understanding node affinity, taints and tolerations, resource requests/limits, and autoscaling is critical for running production workloads efficiently.
Resource Requests and Limits
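Every production pod should specify resource requests (the amount the scheduler reserves for the container on a node, its guaranteed minimum) and limits (the maximum the container is allowed to use at runtime).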
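apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:               # required by apps/v1; must match the template labels
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: myapp:latest
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi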
Sizing Guidelines
- Requests: Set to the average resource usage (what the app normally needs)
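- Limits: Set to 2x the request as a starting point (room for spikes). A container that exceeds its memory limit is OOMKilled; one that exceeds its CPU limit is only throttled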
- CPU: 1000m = 1 CPU core. Start with 100m-250m for most microservices
- Memory: Monitor actual usage with kubectl top pods and adjust
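One way to make sure every container ends up with requests and limits, even when a manifest omits them, is a namespace-level LimitRange. The following is a minimal sketch rather than a recommended policy: the object name default-resources and the namespace production are hypothetical, and the values simply reuse the 250m/256Mi requests and 2x limits from the Deployment above.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources   # hypothetical name
  namespace: production     # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:       # applied when a container sets no requests
        cpu: 250m
        memory: 256Mi
      default:              # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
```

Containers that declare their own requests and limits keep them; the defaults only fill in what is missing.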
Node Affinity
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: the pod is only scheduled onto nodes in these zones
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                  - eu-central-1a
                  - eu-central-1b
      # Soft preference: favor compute-optimized nodes, but fall back to others
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80
          preference:
            matchExpressions:
              - key: node-type
                operator: In
                values:
                  - compute-optimized
Taints and Tolerations
# Taint a node (only GPU workloads)
kubectl taint nodes gpu-node-1 gpu=true:NoSchedule
# Pod with toleration
spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: ml-training
      image: tensorflow:latest
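A toleration only allows a pod onto a tainted node; it does not steer the pod there, so the training pod above could still land on an untainted node. A common way to close that gap is to pair the toleration with a node selector (or node affinity). The sketch below assumes the GPU nodes carry a label accelerator=gpu, for example set with kubectl label nodes gpu-node-1 accelerator=gpu; that label and the pod name are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training          # hypothetical pod name
spec:
  nodeSelector:
    accelerator: gpu         # hypothetical label, assumed to exist on GPU nodes
  tolerations:               # same toleration as above: permits the tainted node
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: ml-training
      image: tensorflow:latest
```

The taint keeps other workloads off the GPU nodes, while the selector keeps this workload on them.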
Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
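To make the averageUtilization targets concrete: utilization is measured as a percentage of each container's resource requests (not its limits), and the Kubernetes documentation describes the scaling rule roughly as below. The numbers in the worked example are invented for illustration.

```latex
\[
\text{desiredReplicas} =
  \left\lceil \text{currentReplicas} \times
  \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
% Worked example (illustrative numbers):
% 4 replicas averaging 90% CPU utilization against the 70% target
\[
\left\lceil 4 \times \frac{90}{70} \right\rceil = \lceil 5.14 \rceil = 6
\]
```

When several metrics are configured, as with CPU and memory here, the HPA computes a replica count for each and uses the largest, bounded by minReplicas and maxReplicas.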
Pod Disruption Budgets
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2  # or maxUnavailable: 1
  selector:
    matchLabels:
      app: api-server
Production Best Practices
- Always set resource requests and limits
- Use pod anti-affinity to spread replicas across nodes
- Configure PodDisruptionBudgets for critical services
- Set up HPA with both CPU and memory metrics
- Use topology spread constraints for zone-aware scheduling
- Monitor with kubectl top nodes and kubectl top pods
- Right-size containers using VPA recommendations
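The anti-affinity and topology spread items in the list above have no manifest elsewhere in this article, so here is a minimal sketch of how they might look for the api-server Deployment. It assumes the pods carry the label app: api-server (the label the PodDisruptionBudget selects on); the weight, maxSkew, and the choice of soft rather than hard rules are illustrative, not prescriptive.

```yaml
# Pod template fragment: goes under spec.template.spec of the Deployment
affinity:
  podAntiAffinity:
    # Soft rule: prefer not to co-locate two api-server pods on one node
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: api-server
          topologyKey: kubernetes.io/hostname
topologySpreadConstraints:
  # Keep the per-zone pod counts within 1 of each other where possible
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: api-server
```

For VPA recommendations, one option is to run the Vertical Pod Autoscaler in recommendation-only mode. This assumes the VPA components are installed in the cluster (they are not part of core Kubernetes); the object name api-vpa is hypothetical.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa              # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Off"        # only compute recommendations, never evict or resize pods
```

With updateMode "Off" the recommendations appear in the object's status (for example via kubectl describe vpa api-vpa) and can be copied into the Deployment's requests by hand, which avoids the VPA and the HPA both acting on CPU and memory at once.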