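[k8s] Storage: PV / PVC / StorageClass

When a Pod dies, the data inside it dies with it. If your database runs in K8s, a single Pod recreation wipes the data, unless you use a Persistent Volume. The K8s storage model exists to solve exactly this problem: it decouples the data's lifecycle from the Pod's.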

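The short version

Pods in K8s are ephemeral; for data to outlive a Pod restart you need a PV (Persistent Volume) and a PVC (Persistent Volume Claim). A StorageClass lets a PVC get its PV provisioned automatically (dynamic provisioning), so no administrator has to create one by hand. A StatefulSet with volumeClaimTemplates gives each Pod its own PVC automatically. For backups use Velero, which can back up the whole cluster or a single namespace, volume snapshots included.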

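PV / PVC basics

PV (Persistent Volume)

A PV is a cluster-scoped storage resource. Think of it as "a disk", created either by an administrator or dynamically through a StorageClass.

PVC (Persistent Volume Claim)

A PVC is a Pod's "request form" for storage. A Pod never uses a PV directly; it asks for storage through a PVC. K8s then binds the claim to a PV that satisfies it, or provisions a new one through the StorageClass.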

Analogy

PV = a parking space in the lot
PVC = a parking permit (I need a 10GB space)
StorageClass = the parking attendant (finds you a space or builds a new one)

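Static vs dynamic provisioning

Static provisioning: create PVs by hand

The administrator creates the PV up front, and users then claim it with a PVC.

# The administrator creates the PV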
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-01
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: /data/pv-01
---
# The user creates the PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 10Gi

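K8s binds the app-data PVC to the local-pv-01 PV because the storageClassName and access mode match and the PV's 20Gi capacity covers the 10Gi request. Binding is one-to-one, so the claim gets the whole 20Gi volume.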
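
A bound claim does nothing until a Pod references it. The manifest below is a minimal sketch of that last step; the Pod name, image, and command are made up for illustration, and only the claimName (app-data) comes from the example above. The Pod has to live in the same namespace as the PVC.

```yaml
# Hypothetical Pod that mounts the app-data claim at /data.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox:1.36
      # Append a timestamp on every start, so restarts are visible in the file.
      command: ["sh", "-c", "date >> /data/started.log && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data   # the PVC defined above
```

Deleting and recreating this Pod should leave /data/started.log in place with one more line in it, since the file lives on the PV rather than in the container.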

The problem: every new PVC needs an administrator to create a PV by hand. 10 services = 10 PVs = 10 manual operations. It doesn't scale.

Dynamic provisioning: the StorageClass creates PVs for you

Define a StorageClass, reference it from the PVC, and K8s creates the PV automatically.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
---
# The PVC references the StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 50Gi

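Because this StorageClass uses volumeBindingMode: WaitForFirstConsumer, the PVC stays Pending until a Pod that uses it is scheduled; at that point the provisioner creates a 50Gi EBS gp3 volume. (With the default Immediate mode, the volume would be created as soon as the PVC is.) After the PVC is deleted, reclaimPolicy: Retain keeps the PV and the underlying EBS volume, which guards against deleting data by accident.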

Common StorageClass examples

local-path (built into k3s, for development)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-path
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rancher.io/local-path
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

k3s ships with this one by default. Data lives on the Node's local disk. It is simple and convenient, but if the Node dies the data is gone.
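
Because the class carries the is-default-class annotation, a PVC that leaves out storageClassName should pick it up automatically. A minimal sketch, where the claim name scratch-data and the 1Gi size are placeholders chosen for illustration:

```yaml
# Hypothetical PVC with no storageClassName: the cluster's default
# StorageClass (local-path in the example above) is filled in for it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Note that omitting the field is not the same as setting storageClassName: "", which explicitly asks for a PV with no class and skips dynamic provisioning.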

NFS (shared across Nodes)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage
provisioner: nfs.csi.k8s.io
parameters:
  server: 192.168.1.100
  share: /exports/k8s
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - nfsvers=4.1
  - hard
  - noresvport

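NFS supports ReadWriteMany (multiple Pods reading and writing the same volume at once), which suits shared-file scenarios. Its performance does not match block storage, though.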
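
To make the shared-volume case concrete, here is a sketch of a claim against the nfs-storage class that asks for ReadWriteMany. The claim name shared-uploads and the 20Gi size are assumptions for illustration:

```yaml
# Hypothetical RWX claim backed by the nfs-storage class above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads
spec:
  accessModes:
    - ReadWriteMany        # many Nodes may mount it read-write
  storageClassName: nfs-storage
  resources:
    requests:
      storage: 20Gi
```

Any number of Pods, on any Nodes, can then list claimName: shared-uploads under their volumes and see the same files.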

AWS EBS (production)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  kmsKeyId: arn:aws:kms:ap-northeast-1:123456789:key/xxx
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer

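WaitForFirstConsumer: the PV is not created right away. K8s waits until the Pod has been scheduled to a Node and then creates the EBS volume in that Node's AZ. This avoids ending up with a PV in AZ-a and a Pod scheduled to AZ-b.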

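AccessModes: who can read and write

Mode	Short	Meaning	Typical use
ReadWriteOnce	RWO	Read-write by a single Node	Databases, single-Pod services
ReadOnlyMany	ROX	Read-only by many Nodes	Static assets, config files
ReadWriteMany	RWX	Read-write by many Nodes at once	Shared uploads, CMS
ReadWriteOncePod	RWOP	Read-write by a single Pod	Strict data integrity

Note: not every storage backend supports every mode. A standard EBS volume attaches to one Node at a time, so it offers RWO but not RWX; EFS supports RWX. Check the CSI driver's support matrix before choosing storage.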
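
The difference between RWO and RWOP is easy to miss: RWO still lets several Pods on the same Node share the volume, while RWOP limits it to one Pod in the whole cluster. A sketch of an RWOP claim follows; the name ledger-data and the size are hypothetical, and it assumes the CSI driver behind fast-ssd and the cluster version both support this mode.

```yaml
# Hypothetical claim that only one Pod at a time may use.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ledger-data
spec:
  accessModes:
    - ReadWriteOncePod     # cannot be combined with other access modes
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi
```

While one Pod is using ledger-data, a second Pod that references the same claim will not start.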

StatefulSet + volumeClaimTemplates

Each Pod in a StatefulSet needs its own PVC (they cannot share one), and volumeClaimTemplates creates them automatically:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  serviceName: redis-svc
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
          command:
            - redis-server
            - /etc/redis/redis.conf
          volumeMounts:
            - name: redis-data
              mountPath: /data
            - name: redis-config
              mountPath: /etc/redis
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
      volumes:
        - name: redis-config
          configMap:
            name: redis-config
  volumeClaimTemplates:
    - metadata:
        name: redis-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 5Gi

K8s automatically creates:

redis-data-redis-cluster-0 (5Gi PVC)
redis-data-redis-cluster-1 (5Gi PVC)
redis-data-redis-cluster-2 (5Gi PVC)

When a Pod is killed and recreated, it binds back to its own PVC, so no data is lost.
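
The StatefulSet above refers to two objects it does not define: the Service named in serviceName (redis-svc) and the redis-config ConfigMap. A minimal sketch of both is below. The names come from the manifest, but the redis.conf contents are an assumption chosen so that Redis writes into the mounted /data volume.

```yaml
# Headless Service referenced by serviceName: redis-svc.
# clusterIP: None gives each Pod a stable DNS name such as
# redis-cluster-0.redis-svc.
apiVersion: v1
kind: Service
metadata:
  name: redis-svc
spec:
  clusterIP: None
  selector:
    app: redis
  ports:
    - name: redis
      port: 6379
---
# ConfigMap mounted at /etc/redis (contents are illustrative).
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
data:
  redis.conf: |
    dir /data
    appendonly yes
```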

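Reclaim Policy: what happens to the PV when the PVC is deleted

Policy	Behavior	Recommended for
`Retain`	PVC deleted; the PV and the underlying storage are kept	Production (protects data)
`Delete`	PVC deleted; the PV and the underlying storage are deleted with it	Dev and test (automatic cleanup)

Strong recommendation: always use Retain in production. With Delete, a single kubectl delete pvc is enough to make your data disappear for good.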

Backup strategy: Velero

Even with Retain you still need backups. A Node's disk fails, a cloud volume dies: K8s cannot protect you from any of that.

Installing Velero

velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.9.0 \
  --bucket my-k8s-backups \
  --secret-file ./credentials-velero \
  --backup-location-config region=ap-northeast-1 \
  --snapshot-location-config region=ap-northeast-1

Backing up a whole namespace

# Create a backup
velero backup create prod-backup \
  --include-namespaces production \
  --ttl 720h    # keep for 30 days

# Scheduled backup (every day at 3 a.m.)
velero schedule create daily-prod \
  --schedule="0 3 * * *" \
  --include-namespaces production \
  --ttl 720h
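
If you keep cluster state in Git, the same schedule can be written as a manifest instead of a CLI call. The sketch below is roughly what the velero schedule create command above produces; it assumes Velero is installed in its default velero namespace.

```yaml
# Declarative equivalent of the daily-prod schedule (sketch).
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-prod
  namespace: velero          # assumption: default install namespace
spec:
  schedule: "0 3 * * *"      # every day at 03:00
  template:
    includedNamespaces:
      - production
    ttl: 720h0m0s            # 30 days
```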

Restoring

# List backups
velero backup get

# Restore into the same cluster
velero restore create --from-backup prod-backup

# Restore into a new namespace (disaster recovery)
velero restore create --from-backup prod-backup \
  --namespace-mappings production:production-restored
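
The namespace-remapping restore can also be expressed as a Restore resource, which is handy when a restore needs to go through review. This is a sketch under the same assumption that Velero runs in the velero namespace; the resource name prod-restore is made up.

```yaml
# Declarative form of the remapped restore (sketch).
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: prod-restore         # hypothetical name
  namespace: velero          # assumption: default install namespace
spec:
  backupName: prod-backup
  namespaceMapping:
    production: production-restored
```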

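Common pitfalls

PVC stuck in Pending: the StorageClass doesn't exist, or its provisioner isn't installed. Check the Events with kubectl describe pvc. With WaitForFirstConsumer, Pending is also the normal state until a Pod uses the claim.

Pod scheduled to the wrong AZ: the EBS volume is in AZ-a, the Pod is in AZ-b, and the mount fails. Fix it with volumeBindingMode: WaitForFirstConsumer.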

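Out of space but can't expand: the StorageClass doesn't set allowVolumeExpansion: true. Once it is set (and the CSI driver supports expansion), you can raise the PVC's spec.resources.requests.storage and K8s grows the volume for you. Volumes can only grow, never shrink.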
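
As an illustration, growing the db-data claim from earlier in this post is just a matter of re-applying it with a larger request. The 100Gi figure is arbitrary, and the sketch assumes the fast-ssd class with allowVolumeExpansion: true.

```yaml
# db-data with the request raised from 50Gi to 100Gi (sketch).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 100Gi         # was 50Gi; only increases are accepted
```

kubectl describe pvc db-data shows the resize progress in its Conditions and Events.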

StatefulSet deleted but the PVCs are still there: this is by design (it protects the data). To clean up, run kubectl delete pvc by hand.
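
For environments where the leftover PVCs are just clutter, newer Kubernetes versions offer an opt-in field on the StatefulSet that changes this default. Whether it is available depends on your cluster version, so treat the fragment below as a sketch to check against your own setup; it shows only the added field, applied to the redis-cluster example.

```yaml
# Fragment: opt in to automatic PVC cleanup (version-dependent).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis-cluster
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete      # remove the PVCs when the StatefulSet is deleted
    whenScaled: Retain       # keep the PVCs of Pods removed by a scale-down
  # ... the rest of the spec stays as shown earlier
```

What happens to the PV afterwards is still decided by the reclaim policy, so with Retain the underlying volume survives.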

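Storage is the part of K8s where things most easily go wrong. A dead Pod can be recreated; lost data is simply lost. Time spent understanding StorageClass and your backup strategy is worth ten times the time spent learning Helm.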


Terry Yao's Blog
