Kubernetes Cost Optimization: 10 Money-Saving Strategies for Enterprise Container Platforms

📅 2026-03-15 · ⏱ 7 min read

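📑 Table of Contents

Kubernetes Cost Challenges
Cluster-Level Optimization
1. Enable Cluster Autoscaler
2. Plan Node Pools Deliberately
Pod-Level Optimization
3. Set Resource Requests and Limits Precisely
4. Adopt Vertical Pod Autoscaler (VPA)
5. Use Horizontal Pod Autoscaler (HPA)
Namespaces and Governance
6. Enforce ResourceQuota
7. Set LimitRange Defaults
Monitoring and Observability
8. Build a Cost Dashboard
9. Deploy Kubecost
10. Run Regular Cost Audits
Expected Benefits
Next Steps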

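Kubernetes Cost Challenges

As containerized deployments have become widespread, Kubernetes has become standard enterprise infrastructure. However, a 2026 Flexera report notes that misconfigured K8s clusters can waste as much as 60% of their resources.

The root cause is that developers tend to over-provision resource requests so that applications never crash for lack of resources. Because the scheduler reserves requested capacity whether or not it is used, this conservative habit directly leaves large amounts of capacity idle.

💡 Key idea: Kubernetes cost optimization is not about cutting resources; it is about making every dollar count.

Cluster-Level Optimization

1. Enable Cluster Autoscaler

Cluster Autoscaler adds nodes when Pods are stuck in Pending because no node can fit their resource requests, and removes nodes that stay underutilized: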

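# Excerpt: a real deployment also needs --cloud-provider, node group
# discovery flags, and a ServiceAccount with the matching RBAC rules.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  replicas: 1
  selector:                  # required by apps/v1; must match the template labels
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
      - name: cluster-autoscaler
        # k8s.gcr.io is frozen; current images are served from registry.k8s.io
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0
        command:
          - ./cluster-autoscaler
          - --v=4
          - --scale-down-enabled=true
          - --scale-down-delay-after-add=5m
          - --scale-down-unneeded-time=3m
          - --skip-nodes-with-local-storage=false
          - --balance-similar-node-groups=true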

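Key parameters to tune:

Parameter	Recommended value	Description
`scale-down-delay-after-add`	5m	How long to wait after a scale-up before scale-down evaluation resumes
`scale-down-unneeded-time`	3m	How long a node must stay unneeded before it becomes eligible for scale-down
`scale-down-utilization-threshold`	0.5	A node is considered for removal only when the sum of its Pod requests falls below 50% of allocatable
`max-graceful-termination-sec`	600	Maximum time, in seconds, to wait for Pods to terminate gracefully during scale-down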

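2. Plan Node Pools Deliberately

Different workloads belong on different node pools:

Node pool	Instance type	Use case	Cost strategy
System pool	m7g.large	Control plane components, monitoring	Reserved
General pool	c7g.xlarge	Web services, APIs	Savings Plans
Compute pool	c7g.4xlarge	Data processing, ML inference	Spot + On-Demand
CI/CD pool	m7g.medium	Builds, tests	100% Spot

⚠️ Spot node caveat: Spot nodes can be reclaimed at any time. Make sure workloads on Spot pools have an appropriate Pod Disruption Budget and a retry mechanism.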
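
As one possible shape for the Pod Disruption Budget mentioned above, here is a minimal sketch. The name `batch-worker-pdb`, the `team-a` namespace, the `app: batch-worker` label, and `minAvailable: 2` are illustrative assumptions, not values from a real cluster.

```yaml
# Sketch only: all names, labels, and counts here are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-worker-pdb
  namespace: team-a
spec:
  # Keep at least 2 matching Pods running while nodes are being drained.
  minAvailable: 2
  selector:
    matchLabels:
      app: batch-worker
```

A PDB only limits voluntary evictions such as a node drain, so it helps with Spot reclamation only if something drains the node before the instance disappears. The retry mechanism still has to cover the case where the node vanishes without warning.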

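Pod-Level Optimization

3. Set Resource Requests and Limits Precisely

This is the single most important part of K8s cost optimization. Requests set too high leave capacity idle; requests set too low put Pods at risk of OOM kills or eviction: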

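resources:
  requests:
    cpu: "250m"      # based on observed P95 usage
    memory: "512Mi"  # based on observed peak + 20% headroom
  limits:
    cpu: "1000m"     # burst ceiling, typically 2-4x the request; usage above it is throttled
    memory: "1Gi"    # hard cap; exceeding it triggers an OOM kill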

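💡 Practical tip: use the following PromQL query to find over-provisioned containers. Both sides are aggregated to (namespace, pod, container) because the kube-state-metrics and cAdvisor series carry different label sets and would not match otherwise:
(
  sum by (namespace, pod, container) (kube_pod_container_resource_requests{resource="cpu"})
  - sum by (namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!=""}[5m]))
)
/ sum by (namespace, pod, container) (kube_pod_container_resource_requests{resource="cpu"}) > 0.7
A ratio above 0.7 means more than 70% of the container's CPU request went unused over the sampled window.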

4. Adopt Vertical Pod Autoscaler (VPA)

VPA automatically adjusts a Pod's resource requests based on historical usage data:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"  # or "Off" to only produce recommendations
  resourcePolicy:
    containerPolicies:
    - containerName: my-app
      minAllowed:
        cpu: "100m"
        memory: "128Mi"
      maxAllowed:
        cpu: "2"
        memory: "4Gi"

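5. Use Horizontal Pod Autoscaler (HPA)

HPA scales the number of Pod replicas with actual traffic. When combining it with VPA, do not let both act on the same CPU or memory metric for one workload, because the two controllers will work against each other; drive HPA from a custom metric, or run VPA in "Off" mode for that workload:

CPU target: 70% is a reasonable starting point (leaves 30% headroom for traffic bursts)
Custom metrics: RPS (requests per second) tracks load more precisely than CPU
Scale-down cooldown: set behavior.scaleDown.stabilizationWindowSeconds: 300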
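
The CPU target and scale-down window above could be expressed roughly as follows with the `autoscaling/v2` API. This is a sketch under assumptions: the HPA name `my-app-hpa` and the replica bounds are hypothetical, and it targets the same `my-app` Deployment as the VPA example purely for illustration.

```yaml
# Sketch only: name and replica bounds are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of the CPU request, averaged across Pods
  behavior:
    scaleDown:
      # Wait 5 minutes of consistently lower load before removing replicas.
      stabilizationWindowSeconds: 300
```

Because this HPA scales on CPU, a VPA running in "Auto" mode on the same Deployment's CPU would conflict with it. In that setup, either switch the HPA to a custom metric such as RPS or keep the VPA in "Off" mode.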

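Namespaces and Governance

6. Enforce ResourceQuota

Prevent any single team or namespace from consuming a disproportionate share of cluster resources: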

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"
    requests.memory: "40Gi"
    limits.cpu: "40"
    limits.memory: "80Gi"
    pods: "50"
    services: "10"

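7. Set LimitRange Defaults

Set namespace-level default requests and limits so that containers whose authors forgot to set resources do not run unbounded. In a namespace with a compute ResourceQuota, Pods that omit these values are rejected unless a LimitRange supplies the defaults: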

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-a
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container

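Monitoring and Observability

8. Build a Cost Dashboard

Use Prometheus + Grafana to build cost monitoring that tracks these key metrics:

Metric	Calculation	Target
Cluster utilization	actual usage / provisioned capacity	> 65%
Request efficiency	actual usage / requested	> 60%
Spot coverage	Spot nodes / total nodes	> 40%
Idle Pods	number of Pods with CPU utilization < 5%	0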
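
Two of the ratios in the table could be precomputed as Prometheus recording rules so that Grafana panels stay cheap to render. The sketch below assumes kube-state-metrics and cAdvisor metrics are already being scraped; the group name and the `record` names are hypothetical, and only CPU is covered.

```yaml
# Prometheus rule file (sketch). Group and record names are hypothetical.
groups:
- name: k8s-cost-efficiency
  rules:
  # Request efficiency per namespace: CPU cores used / CPU cores requested.
  - record: namespace:cpu_request_efficiency:ratio
    expr: |
      sum by (namespace) (rate(container_cpu_usage_seconds_total{container!=""}[5m]))
      /
      sum by (namespace) (kube_pod_container_resource_requests{resource="cpu"})
  # Cluster utilization: CPU cores used / allocatable CPU cores on all nodes.
  - record: cluster:cpu_utilization:ratio
    expr: |
      sum(rate(container_cpu_usage_seconds_total{container!=""}[5m]))
      /
      sum(kube_node_status_allocatable{resource="cpu"})
```

The `container!=""` matcher drops the Pod-level cgroup series so usage is not double counted. A dashboard would compare these ratios against the 60% and 65% targets, ideally over a longer window than 5 minutes.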

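9. Deploy Kubecost

Kubecost breaks down actual cost per namespace, per Deployment, and even per Pod:

# Install Kubecost
helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update
helm install kubecost kubecost/cost-analyzer \
  --namespace kubecost \
  --create-namespace \
  --set kubecostToken="YOUR_TOKEN"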

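10. Run Regular Cost Audits

Set up a weekly cost audit checklist:

Find Deployments whose CPU/memory utilization is below 20%
Confirm that CronJobs set successfulJobsHistoryLimit and failedJobsHistoryLimit
Verify that HPA and VPA are working as expected
Review whether resource requests on new deployments are reasonable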
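
For the CronJob item in the checklist, a manifest that passes the audit might look like the sketch below. The job name, namespace, schedule, image, and the chosen limits are all hypothetical; the point is only where the two history fields live.

```yaml
# Sketch only: name, schedule, image, and limit values are hypothetical.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
  namespace: team-a
spec:
  schedule: "0 2 * * *"
  # Keep only a few finished Jobs (and their Pods) around.
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: registry.example.com/report:1.0
            resources:
              requests:
                cpu: "100m"
                memory: "128Mi"
```

Keeping a few failed Jobs is usually worthwhile for debugging, while successful ones rarely need to stay around.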

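Expected Benefits

Based on our experience implementing K8s cost optimization for 20+ enterprises, these are the expected savings for each strategy:

Strategy	Difficulty	Expected savings	Priority
Right-sizing (strategy 3)	⭐⭐	15-25%	🔴 Highest
Spot node pools (strategy 2)	⭐⭐⭐	20-40%	🔴 Highest
Cluster Autoscaler (strategy 1)	⭐⭐	10-20%	🟡 High
VPA + HPA (strategies 4-5)	⭐⭐	10-15%	🟡 High
ResourceQuota (strategies 6-7)	⭐	5-10%	🟢 Medium

💡 Combined impact: a typical enterprise that implements all of the strategies above can cut K8s operating costs by 30-50%.

Next Steps

Want a free cost health check for your Kubernetes cluster? Get in touch: the CloudSwap team offers expert consulting on container platform cost optimization.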