Part 3: Pods, containers, and lifecycle | CodeTrekNomad: Technical Blog for Developers

Pod ≠ container (recap)

A pod is a group of containers that share:

Network namespace: the same IP address, the same localhost, the same port space.
Volumes: containers can mount the same volumes declared in the pod spec.
Lifecycle: all containers in a pod are scheduled onto the same node and are created and torn down together with the pod.

Use a multi-container pod when the containers must run on the same node and talk over localhost or a shared volume (for example a sidecar log collector or an Envoy proxy). Otherwise, split them into separate pods.

Container types in a pod

App container (main)

The main container that runs the workload. Most pods have only one app container.

Init container

Init containers run before the app containers, sequentially (init-1 finishes → init-2 → … → app starts). Use them for:

Database migrations.
Waiting for a dependency to become ready (wait-for-db).
Downloading initial config or certificates.

spec:
  initContainers:
    - name: wait-db
      image: busybox:1.36
      command: ["sh", "-c", "until nc -z postgres 5432; do sleep 2; done"]
    - name: migrate
      image: my-app:1.2.0
      command: ["./migrate", "up"]
  containers:
    - name: app
      image: my-app:1.2.0

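If an init container fails, the app containers do not start. The kubelet retries the failed init container according to the pod's restartPolicy: with Always or OnFailure it is restarted until it succeeds; with Never the pod is marked Failed.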

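Sidecar container (K8s 1.29+)

Before 1.29, a sidecar was only a convention (an extra container in containers[]). Since 1.29, where the feature is enabled by default (it became stable in 1.33), Kubernetes supports native sidecars, declared as an initContainers entry with restartPolicy: Always: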

spec:
  initContainers:
    - name: log-agent
      image: fluent-bit:3.0
      restartPolicy: Always # ← Sidecar: starts before the app, runs for the pod's lifetime
  containers:
    - name: app
      image: my-app:1.2.0

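A native sidecar:

Starts before the app containers (like an init container) but does not block them: the kubelet moves on once the sidecar has started, rather than waiting for it to exit.
Runs for the whole life of the pod and is shut down after the app containers.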
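The ordering rules for init containers and native sidecars can be sketched as a small model. This is a simplified illustration, not kubelet code: `startup_order` and `shutdown_order` are hypothetical helper names, the pod spec is a plain dict mirroring the YAML above, and the model ignores probes, failures, and restarts.

```python
def startup_order(pod_spec):
    """Order in which containers are started (simplified model).

    Init containers start one by one in declaration order. A regular init
    container must exit successfully before the next one starts; a native
    sidecar (restartPolicy: Always) only has to be started. App containers
    come last and are not ordered relative to each other in practice.
    """
    init = [c["name"] for c in pod_spec.get("initContainers", [])]
    apps = [c["name"] for c in pod_spec.get("containers", [])]
    return init + apps


def shutdown_order(pod_spec):
    """Order in which still-running containers are stopped (simplified model).

    Regular init containers have already exited, so only app containers and
    native sidecars remain. App containers stop first, then sidecars in the
    reverse of their start order.
    """
    sidecars = [
        c["name"]
        for c in pod_spec.get("initContainers", [])
        if c.get("restartPolicy") == "Always"
    ]
    apps = [c["name"] for c in pod_spec.get("containers", [])]
    return apps + sidecars[::-1]


# Hypothetical pod spec combining the two YAML examples above.
spec = {
    "initContainers": [
        {"name": "wait-db"},
        {"name": "log-agent", "restartPolicy": "Always"},
    ],
    "containers": [{"name": "app"}],
}

print(startup_order(spec))
print(shutdown_order(spec))
```

The point of the model is that `log-agent` is already running when `app` starts and is still running while `app` shuts down, which is what a log shipper needs.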

Pod phases

Pending → Running → Succeeded / Failed
              │
              └── (container restart) → Running

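Phase	Meaning
Pending	The pod has been accepted by the cluster (stored in etcd) but has not been scheduled onto a node yet, or its images are still being pulled.
Running	The pod is bound to a node and at least one container is running, starting, or restarting.
Succeeded	All containers exited with code 0 and will not be restarted (typical for Jobs/batch).
Failed	All containers have terminated, and at least one exited non-zero and will not be restarted.
Unknown	The node is unreachable, so the kubelet cannot report status.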

Container states

Inside a pod, each container has its own state:

State	Details
Waiting	Pulling the image, waiting for init containers, or waiting for resources. `reason` gives the specifics: `ContainerCreating`, `ImagePullBackOff`, `CrashLoopBackOff`.
Running	The process is running. `startedAt` records when it started.
Terminated	The process has exited. See `exitCode` and `reason` (`Completed`, `OOMKilled`, `Error`).

# Show the detailed state
kubectl get pod my-app -o jsonpath='{.status.containerStatuses[*].state}'

# Or, in a more readable form
kubectl describe pod my-app | grep -A5 "State:"
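When the same information is needed in a script, the JSON form of the pod can be summarized with a few lines of Python. The sketch below assumes output shaped like `kubectl get pod my-app -o json`; the sample document is hypothetical and trimmed to the fields used, and `summarize_states` is an illustrative helper, not part of any Kubernetes client.

```python
import json

# Hypothetical, trimmed output of `kubectl get pod my-app -o json`.
SAMPLE_POD_JSON = """
{
  "status": {
    "containerStatuses": [
      {
        "name": "app",
        "restartCount": 4,
        "state": {"waiting": {"reason": "CrashLoopBackOff"}},
        "lastState": {"terminated": {"exitCode": 1, "reason": "Error"}}
      },
      {
        "name": "log-agent",
        "restartCount": 0,
        "state": {"running": {"startedAt": "2024-01-01T00:00:00Z"}}
      }
    ]
  }
}
"""


def summarize_states(pod):
    """Return (name, state, reason, last_exit_code) for each container."""
    rows = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        state = status.get("state", {})
        # Exactly one of waiting / running / terminated is expected to be set.
        kind = next(
            (k for k in ("waiting", "running", "terminated") if k in state),
            "unknown",
        )
        reason = state.get(kind, {}).get("reason")
        # lastState holds the previous run, which is what matters after a crash.
        last_exit = status.get("lastState", {}).get("terminated", {}).get("exitCode")
        rows.append((status["name"], kind, reason, last_exit))
    return rows


rows = summarize_states(json.loads(SAMPLE_POD_JSON))
for row in rows:
    print(row)
```

Reading `lastState` next to `state` matters for crash loops: the current state only says the container is waiting, while the previous termination carries the exit code.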

Restart policy

Policy	Behavior	Used for
`Always` (default)	Always restart when the container exits (regardless of exit code)	Deployment, DaemonSet
`OnFailure`	Restart only when the exit code ≠ 0	Job
`Never`	Never restart	Debugging, special-case Jobs

Backoff: the kubelet restarts the container with an increasing delay: 10s → 20s → 40s → … → 5 minutes (cap). The delay resets once a container has run for 10 minutes without problems. This waiting state is CrashLoopBackOff. It is not an error of its own; it only means the kubelet is waiting before the next restart.
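To get a feel for how quickly the delay reaches the cap, the doubling sequence can be computed directly. This is a minimal sketch of the arithmetic described above, not the kubelet's implementation; `restart_delays` is a hypothetical helper, and the base of 10 seconds and cap of 300 seconds are the values quoted in the text.

```python
def restart_delays(restarts, base=10, cap=300):
    """Delay in seconds before each of the first `restarts` restarts."""
    return [min(base * 2 ** i, cap) for i in range(restarts)]


delays = restart_delays(7)
print(delays)
print(sum(delays))  # total seconds spent waiting across these restarts
```

With these values the cap is reached on the sixth restart, so a pod that has been crash-looping for a while restarts only once every five minutes. That is worth keeping in mind when waiting to see whether a fix worked.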

Debugging CrashLoopBackOff: a procedure

1. kubectl describe pod <name>
   → Check Events: why it crashed (OOMKilled? exit code?)
   → Check Last State: exitCode, reason

2. kubectl logs <name> --previous
   → Logs from the container's previous run (before the restart)
   → If the container crashes too quickly, the logs may be empty

3. Check the exit code:
   - 1: application error (exception, bad config)
   - 137 (128+9): SIGKILL → OOMKilled or killed by the kubelet
   - 139 (128+11): SIGSEGV → segfault
   - 143 (128+15): SIGTERM → asked to shut down, but the app does not handle the signal

4. If OOMKilled → raise the memory limit or fix the leak (article 13)
5. If application error → fix the app, or correct the config/secret
6. If the logs are empty → try running the image locally: docker run <image>
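The exit-code arithmetic in step 3 (128 + signal number) can be decoded with Python's standard `signal` module. This is a small sketch, assuming a Linux or other POSIX host where the signal numbers match those in the list; `describe_exit_code` is a hypothetical helper. A process may also choose to exit with a code above 128 on its own, so treat the result as a hint rather than proof.

```python
import signal


def describe_exit_code(code):
    """Give a rough interpretation of a container exit code."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return "application error"


for code in (0, 1, 137, 139, 143):
    print(code, describe_exit_code(code))
```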

Resource fields

Each container declares requests and limits for CPU/memory:

containers:
  - name: app
    image: my-app:1.2.0
    resources:
      requests:
        cpu: "100m" # Used by the scheduler to place the pod on a node
        memory: "128Mi"
      limits:
        memory: "256Mi" # Enforced by the kernel through cgroups
        # cpu limit: usually left unset (see article 13)

Full details on requests/limits, QoS classes, throttling, and OOMKilled → article 13 and the in-depth article on Kubernetes requests/limits.
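The units in the example are easy to misread: `100m` means 100 millicores (a tenth of a CPU), and `Mi` is a binary mebibyte. The sketch below converts a subset of the Kubernetes quantity syntax to plain numbers. `parse_cpu` and `parse_memory` are hypothetical helpers that handle only the suffixes listed in the code, not the full quantity grammar.

```python
# Multipliers for the memory suffixes handled here (a subset of the full syntax).
MEMORY_SUFFIXES = {
    "Ki": 1024,
    "Mi": 1024 ** 2,
    "Gi": 1024 ** 3,
    "k": 1000,
    "M": 1000 ** 2,
    "G": 1000 ** 3,
}


def parse_cpu(quantity):
    """Convert a CPU quantity such as "100m" or "2" to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as "128Mi" to bytes."""
    for suffix, multiplier in MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * multiplier
    return int(quantity)  # no suffix: already in bytes


print(parse_cpu("100m"))
print(parse_memory("128Mi"))
print(parse_memory("256Mi"))
```

Note the difference between `128Mi` (134,217,728 bytes) and `128M` (128,000,000 bytes): mixing the two in requests and limits is a common source of small, confusing discrepancies.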

Graceful shutdown

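When a pod is deleted (scale down, rolling update, drain):

1. The pod is marked Terminating
2. The endpoints controller removes the pod from its Services (no new traffic); this runs in parallel with steps 3-5, not strictly before them
3. The kubelet runs the preStop hook (if any), then sends SIGTERM to the container
4. The container has terminationGracePeriodSeconds (default 30s), counted from the start of step 3, to clean up
5. When the time is up → SIGKILL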

spec:
  terminationGracePeriodSeconds: 60 # Give the app 60s to clean up
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"] # Wait for the LB to drain

The preStop hook runs before SIGTERM; use it to wait for the load balancer to drain connections. The time it takes counts against terminationGracePeriodSeconds.
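On the application side, graceful shutdown only works if the process actually reacts to SIGTERM. The following is a minimal sketch of that pattern in Python, assuming a simple polling worker; `handle_sigterm`, `install_handler`, and `serve` are hypothetical names, and a real service would hook the same event into its server or framework shutdown path.

```python
import signal
import threading

# Set once SIGTERM arrives; the main loop checks it between units of work.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: only record the request, do the cleanup elsewhere."""
    shutdown_requested.set()


def install_handler():
    """Register the handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve(do_work, cleanup, poll_seconds=1.0):
    """Run do_work() until shutdown is requested, then run cleanup() once."""
    while not shutdown_requested.is_set():
        do_work()
        # Sleep, but wake up immediately if the event is set meanwhile.
        shutdown_requested.wait(poll_seconds)
    cleanup()
```

A typical entry point would call `install_handler()` and then `serve(process_next_item, close_connections)`, where both callbacks are the application's own. The cleanup has to finish within what remains of terminationGracePeriodSeconds; otherwise the container is killed with SIGKILL and shows exit code 137.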

What to remember when operating Kubernetes

Pod = a group of containers sharing network + volumes + lifecycle.
Init containers run sequentially before the app; native sidecars (1.29+) run alongside it for the pod's whole lifetime.
Pod phases: Pending → Running → Succeeded/Failed.
CrashLoopBackOff: describe → logs --previous → exit code → fix the root cause.
Graceful shutdown: SIGTERM → grace period → SIGKILL. Use preStop to drain.

Frequently asked questions

A pod is stuck in Pending. Where do I start debugging?

Answer: kubectl describe pod → check Events. Common causes: Insufficient cpu/memory (nodes are full → add nodes or lower requests), a node selector/affinity that matches no node, a PVC that is not bound yet, or image pull failures (wrong image name, registry auth).

A container exits with code 137, but describe does not say OOMKilled?

Answer: SIGKILL (137) can come from the cgroup OOM killer (the kubelet records OOMKilled) or from the kubelet itself, for example after N failed liveness probes when the container does not exit on SIGTERM within the grace period. Check the liveness probe config and Events.

When should I use a multi-container pod versus separate pods?

Answer: Use a multi-container pod when the containers must share a node and talk over localhost or a shared volume (sidecar proxy, log shipper, adapter). If they can be deployed and scaled independently → separate pods. Rule of thumb: when in doubt, split.

Next article (Stage II): Deployment, ReplicaSet, and rollouts: rolling updates and rollbacks without downtime.