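OpenClaw3 K8s Deployment Best Practices
Abstract: This article documents the complete process of deploying the OpenClaw3 intelligent assistant to a K8s cluster. It covers 5 hours of hands-on experience, the hard-won lessons from that session, and the approach that was verified to work.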
📊 Deployment Overview
Architecture
    ┌─────────────────────────────────────────────────────┐
    │                     K8s Cluster                     │
    │  ┌─────────────────────────────────────────────┐    │
    │  │             openclaw namespace              │    │
    │  │  ┌─────────────────────────────────────┐    │    │
    │  │  │  openclaw-gateway Deployment        │    │    │
    │  │  │  ┌───────────────────────────────┐  │    │    │
    │  │  │  │  openclaw Container (4Gi)     │  │    │    │
    │  │  │  │  - Gateway: 18789             │  │    │    │
    │  │  │  │  - Feishu WebSocket           │  │    │    │
    │  │  │  │  - Aliyun Bailian API         │  │    │    │
    │  │  │  └───────────────────────────────┘  │    │    │
    │  │  │              ▼ PVC (200Gi)          │    │    │
    │  │  │              - config/              │    │    │
    │  │  │              - workspace/           │    │    │
    │  │  └─────────────────────────────────────┘    │    │
    │  └─────────────────────────────────────────────┘    │
    └─────────────────────────────────────────────────────┘
                               ▼
                    ┌─────────────────────┐
                    │  External Services  │
                    ├─────────────────────┤
                    │  - Aliyun Bailian   │
                    │  - Feishu Bot       │
                    │  - Harbor Registry  │
                    └─────────────────────┘
Environment Requirements
| Item | Requirement |
|------|-------------|
| K8s version | 1.20+ |
| Storage | CephFS (ReadWriteMany) |
| Image registry | Harbor (private) |
| Memory | 4Gi (Pod limit) |
| CPU | 2 cores (Pod limit) |
🔧 Deployment Steps
Step 1: Create the namespace and RBAC

    # Create the namespace
    kubectl create namespace openclaw

    # Create the ServiceAccount
    kubectl create serviceaccount openclaw-sa -n openclaw

    # Create the ClusterRole
    kubectl create clusterrole openclaw-clusterrole \
      --verb=get,list,watch,create,update,patch,delete \
      --resource=pods,pods/log,pods/exec,deployments,services,configmaps,secrets,persistentvolumeclaims

    # Bind the ClusterRole to the ServiceAccount
    kubectl create clusterrolebinding openclaw-clusterrolebinding \
      --clusterrole=openclaw-clusterrole \
      --serviceaccount=openclaw:openclaw-sa
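Before moving on, it can be worth confirming that the binding actually grants what the gateway needs. The sketch below uses `kubectl auth can-i` with ServiceAccount impersonation; the helper name `check_sa_access` is illustrative and not part of the original setup, and it assumes your own user is allowed to impersonate ServiceAccounts.

```shell
# Hypothetical helper: ask the API server whether openclaw-sa may perform
# a verb on a resource. `kubectl auth can-i` prints "yes" or "no".
check_sa_access() {
  verb="$1"; resource="$2"
  answer=$(kubectl auth can-i "$verb" "$resource" \
    --as=system:serviceaccount:openclaw:openclaw-sa \
    -n openclaw 2>/dev/null) || true
  if [ "$answer" = "yes" ]; then
    echo "OK   $verb $resource"
  else
    echo "DENY $verb $resource"
    return 1
  fi
}
```

For example, `check_sa_access list pods && check_sa_access get secrets` should print two `OK` lines once the ClusterRoleBinding exists.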
Step 2: Create the image pull Secret

    kubectl create secret docker-registry harbor-registry-secret \
      --docker-server=<harbor-address> \
      --docker-username=<username> \
      --docker-password=<password> \
      -n openclaw
Step 3: Create the application Secret

    kubectl create secret generic openclaw-secrets \
      --from-literal=FEISHU_APP_ID='<your-app-id>' \
      --from-literal=FEISHU_APP_SECRET='<your-app-secret>' \
      --from-literal=DASHSCOPE_API_KEY='<your-api-key>' \
      -n openclaw
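A typo in a key name only surfaces later, when the Pod fails to start because a `secretKeyRef` cannot be resolved. As a minimal sketch, the check below confirms that each of the three keys exists and is non-empty without printing any secret value. The helper name `check_secret_keys` is an assumption made for illustration.

```shell
# Hypothetical helper: confirm each expected key exists in openclaw-secrets
# and is non-empty. Only the key names are printed, never the values.
check_secret_keys() {
  missing=0
  for key in FEISHU_APP_ID FEISHU_APP_SECRET DASHSCOPE_API_KEY; do
    value=$(kubectl get secret openclaw-secrets -n openclaw \
      -o "jsonpath={.data.$key}" 2>/dev/null) || true
    if [ -n "$value" ]; then
      echo "present: $key"
    else
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```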
Step 4: Create the PVC

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: openclaw-data-pvc
      namespace: openclaw
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: csi-cephfs-sc
      resources:
        requests:
          storage: 200Gi
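After applying the manifest (for example with `kubectl apply -f pvc.yaml`, where the filename is an assumption), the claim should reach the `Bound` phase. The polling helper below is a sketch; `wait_for_pvc_bound` is a hypothetical name. Note that if the StorageClass uses `WaitForFirstConsumer` volume binding, the claim stays `Pending` until a Pod mounts it, so a timeout here is not necessarily an error.

```shell
# Hypothetical helper: poll the claim until its phase is Bound.
# Arguments: attempts (default 30) and seconds between polls (default 2).
wait_for_pvc_bound() {
  attempts="${1:-30}"; interval="${2:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    phase=$(kubectl get pvc openclaw-data-pvc -n openclaw \
      -o 'jsonpath={.status.phase}' 2>/dev/null) || true
    if [ "$phase" = "Bound" ]; then
      echo "PVC bound after $i check(s)"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "PVC still '${phase:-unknown}' after $attempts checks" >&2
  return 1
}
```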
Step 5: Deploy the application

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: openclaw-gateway
      namespace: openclaw
    spec:
      replicas: 1
      strategy:
        type: Recreate
      selector:
        matchLabels:
          app: openclaw
      template:
        metadata:
          labels:
            app: openclaw
        spec:
          serviceAccountName: openclaw-sa
          imagePullSecrets:
            - name: harbor-registry-secret
          containers:
            - name: openclaw
              image: <harbor-address>/crystalforge/openclaw-cn-base:1.0.3-feishu
              imagePullPolicy: Always
              command: ['/bin/bash', '-c', 'openclaw gateway --allow-unconfigured']
              resources:
                requests:
                  cpu: 500m
                  memory: 1Gi
                limits:
                  cpu: 2
                  memory: 4Gi
              livenessProbe:
                httpGet:
                  path: /health
                  port: 18789
                initialDelaySeconds: 60
                periodSeconds: 30
                timeoutSeconds: 10
                failureThreshold: 3
              ports:
                - containerPort: 18789
                  name: gateway
              env:
                - name: TZ
                  value: Asia/Shanghai
                - name: FEISHU_APP_ID
                  valueFrom:
                    secretKeyRef:
                      name: openclaw-secrets
                      key: FEISHU_APP_ID
                - name: FEISHU_APP_SECRET
                  valueFrom:
                    secretKeyRef:
                      name: openclaw-secrets
                      key: FEISHU_APP_SECRET
                - name: DASHSCOPE_API_KEY
                  valueFrom:
                    secretKeyRef:
                      name: openclaw-secrets
                      key: DASHSCOPE_API_KEY
              volumeMounts:
                - name: data-volume
                  mountPath: /root/.openclaw
                  subPath: config
                - name: data-volume
                  mountPath: /openclaw/workspace
                  subPath: workspace
          volumes:
            - name: data-volume
              persistentVolumeClaim:
                claimName: openclaw-data-pvc
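In line with verifying each step before continuing, the manifest can be applied and the rollout awaited in one go. This is a sketch only: `deploy_and_wait` is a hypothetical helper, and the 300-second timeout is an arbitrary choice rather than a value from the original deployment.

```shell
# Hypothetical helper: apply a manifest and block until the rollout finishes.
# If the rollout does not complete, print recent events so the reason shows.
deploy_and_wait() {
  manifest="$1"
  kubectl apply -f "$manifest" || return 1
  if kubectl rollout status deployment/openclaw-gateway \
       -n openclaw --timeout=300s; then
    echo "rollout complete"
  else
    kubectl get events -n openclaw --sort-by=.lastTimestamp | tail -n 20
    return 1
  fi
}
```

Usage would look like `deploy_and_wait deployment.yaml`, where `deployment.yaml` is whatever file holds the Deployment above.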
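🚨 Hard-Won Lessons (5 Hours of Hands-On Experience)
Lesson 1: Config file format
Problem: Pod stuck in CrashLoopBackOff
Error: JSON5: invalid character '}' at 1:503
Cause: writing the JSON through a heredoc inside `kubectl exec` corrupted the file
Wrong approach:

    kubectl exec ... -- bash -c 'cat > config.json << EOF
    { ... }
    EOF'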
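The article does not record exactly how the file was damaged, but one common mechanism is shell expansion: with an unquoted heredoc delimiter, the shell rewrites anything that looks like a variable. Nested quoting is another (a single quote in the JSON ends the outer `bash -c '...'` string). The self-contained sketch below shows only the expansion case, using an invented `appSecret` value.

```shell
# Unquoted delimiter: the shell expands $$ (its own PID) inside the body,
# so the secret that reaches the file is no longer the one you typed.
unquoted_heredoc() {
  cat <<EOF
{"appSecret": "pa$$word"}
EOF
}

# Quoted delimiter ('EOF'): expansion is disabled and the bytes are kept.
quoted_heredoc() {
  cat <<'EOF'
{"appSecret": "pa$$word"}
EOF
}
```

Running both functions and comparing the output shows the first one replacing `$$` with a process ID while the second keeps the text intact.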
Correct approach:

    # 1. Edit locally and validate the format
    vim config.json
    jq . config.json > /dev/null && echo "✅ valid JSON"

    # 2. Copy the file into the Pod
    kubectl cp config.json namespace/pod:/path/config.json

    # 3. Verify the copy, then restart
    kubectl exec ... -- cat /path/config.json | jq .
    kubectl rollout restart deployment/xxx
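The validate, copy, verify, restart sequence can be wrapped so that no step is skipped. The following is a sketch under assumptions: `push_config` is a hypothetical helper, the Pod name and remote path are passed in by the caller, and `jq` is installed locally. Keep in mind that `jq` accepts strict JSON only, so a file relying on JSON5 extras such as comments or trailing commas will be rejected by this check.

```shell
# Hypothetical helper: validate locally, copy into the Pod, re-validate the
# copy, then restart. Arguments: local file, pod name, remote path.
push_config() {
  file="$1"; pod="$2"; remote="$3"
  # Refuse to upload anything jq cannot parse.
  jq empty "$file" || { echo "invalid JSON: $file" >&2; return 1; }
  kubectl cp "$file" "openclaw/$pod:$remote" || return 1
  # Read the file back from the Pod and parse it again on the local side.
  kubectl exec -n openclaw "$pod" -- cat "$remote" | jq empty || {
    echo "remote copy is not valid JSON" >&2; return 1; }
  kubectl rollout restart deployment/openclaw-gateway -n openclaw
}
```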
Lesson 2: Choosing a storage layout
Problem: Pod stuck in Pending
Cause: the multi-PVC layout exceeded the cluster quota
Wrong approach:

    # Separate PVCs for config and workspace
    volumes:
      - name: config-volume
        persistentVolumeClaim:
          claimName: openclaw-config-pvc
      - name: workspace-volume
        persistentVolumeClaim:
          claimName: openclaw-workspace-pvc

Correct approach:

    # Pod spec: a single PVC
    volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: openclaw-data-pvc

    # Container spec: one mount per subPath of that PVC
    volumeMounts:
      - name: data-volume
        mountPath: /root/.openclaw
        subPath: config
      - name: data-volume
        mountPath: /openclaw/workspace
        subPath: workspace
Lesson 3: How configuration is updated
Problem: configuration changes cannot be saved
Cause: a ConfigMap mount is read-only
Wrong approach:

    volumes:
      - name: config-volume
        configMap:
          name: openclaw-config

Correct approach:

    # Pod spec
    volumes:
      - name: data-volume
        persistentVolumeClaim:
          claimName: openclaw-data-pvc

    # Container spec
    volumeMounts:
      - name: data-volume
        mountPath: /root/.openclaw/.openclaw/openclaw.json
        subPath: openclaw.json
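Whether a given mount is writable can be checked from outside the Pod before the application tries to save anything. The sketch below assumes the container image ships `sh` and `test`; `config_is_writable` is a hypothetical helper name.

```shell
# Hypothetical helper: report whether the config file inside the Pod is
# writable. On a read-only mount `test -w` fails even for root.
config_is_writable() {
  path="${1:-/root/.openclaw/.openclaw/openclaw.json}"
  if kubectl exec -n openclaw deployment/openclaw-gateway -- \
       sh -c 'test -w "$1"' _ "$path"; then
    echo "writable: $path"
  else
    echo "read-only or missing: $path"
    return 1
  fi
}
```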
✅ Verification Checklist
Deployment verification

    # Check Pod status
    kubectl get pods -n openclaw

    # Check recent logs
    kubectl logs -n openclaw deployment/openclaw-gateway --tail=50

    # Check that the Feishu channel is enabled in the config
    kubectl exec -n openclaw deployment/openclaw-gateway -- \
      cat /root/.openclaw/.openclaw/openclaw.json | jq '.channels.feishu.enabled'
Functional verification

    # Check the gateway health endpoint
    curl http://<node-ip>:18789/health
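Because the liveness probe waits 60 seconds before its first check, a single `curl` right after deployment may fail even on a healthy rollout. A retrying check is sketched below; `wait_for_health` is a hypothetical helper, and the retry counts are arbitrary defaults.

```shell
# Hypothetical helper: retry the /health endpoint until it answers.
# Arguments: base URL, attempts (default 10), seconds between tries (default 3).
wait_for_health() {
  url="$1"; attempts="${2:-10}"; interval="${3:-3}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS --max-time 5 "$url/health" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "no healthy response from $url/health" >&2
  return 1
}
```

The manifests in this article do not define a Service, so whether `<node-ip>:18789` is reachable depends on cluster setup not covered here. One alternative is to run `kubectl port-forward -n openclaw deployment/openclaw-gateway 18789:18789` in another terminal and then call `wait_for_health http://127.0.0.1:18789`.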
📞 Troubleshooting
Pod CrashLoopBackOff

    # 1. Check the logs
    kubectl logs -n openclaw deployment/openclaw-gateway --tail=100

    # 2. Check the config file format
    kubectl exec -n openclaw deployment/openclaw-gateway -- \
      cat /root/.openclaw/.openclaw/openclaw.json | jq .

    # 3. Push a corrected config and recreate the Pod
    kubectl cp config.json namespace/pod:/root/.openclaw/.openclaw/openclaw.json
    kubectl delete pod -n openclaw -l app=openclaw
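Once a container has restarted, plain `kubectl logs` shows the new instance, which may not yet contain the error. The sketch below prints the last termination reason (for example `Error` or `OOMKilled`) and the logs of the crashed instance via `--previous`. `show_last_crash` is a hypothetical helper, and it assumes the Pod has a single container.

```shell
# Hypothetical helper: print why the container last exited, then the logs
# of that crashed instance rather than the freshly restarted one.
show_last_crash() {
  pod=$(kubectl get pods -n openclaw -l app=openclaw \
    -o 'jsonpath={.items[0].metadata.name}') || return 1
  [ -n "$pod" ] || { echo "no pod with label app=openclaw" >&2; return 1; }
  kubectl get pod "$pod" -n openclaw \
    -o 'jsonpath={.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'
  kubectl logs "$pod" -n openclaw --previous --tail=100
}
```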
Pod Pending

    # Check scheduling events for the Pod
    kubectl describe pod -n openclaw -l app=openclaw

    # Check PVC status
    kubectl get pvc -n openclaw
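Since the Pending state in this deployment was traced to quota, it helps to collect scheduling events, claim status, and quota usage together. A minimal sketch follows; `diagnose_pending` is a hypothetical helper, and the quota section prints nothing useful if the namespace has no ResourceQuota.

```shell
# Hypothetical helper: gather the usual evidence for a Pending Pod.
diagnose_pending() {
  echo "== scheduling events =="
  kubectl get events -n openclaw \
    --field-selector reason=FailedScheduling --sort-by=.lastTimestamp
  echo "== claims =="
  kubectl get pvc -n openclaw
  echo "== quota =="
  kubectl describe resourcequota -n openclaw
}
```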
Image pull failure

    # Inspect the pull Secret
    kubectl get secret harbor-registry-secret -n openclaw -o yaml

    # Test the credentials against Harbor
    docker login <harbor-address>

    # Recreate the Secret
    kubectl delete secret harbor-registry-secret -n openclaw
    kubectl create secret docker-registry ...
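A frequent cause of pull failures is a mismatch between the registry host stored in the Secret and the host in the image reference. The sketch below lists only the registry hosts in the Secret, not the credentials. `pull_secret_registries` is a hypothetical helper; it assumes `jq` is installed and that `base64 -d` is the decode flag on your system.

```shell
# Hypothetical helper: list the registry hosts stored in the pull Secret.
# Compare the result with <harbor-address> in the image reference.
pull_secret_registries() {
  kubectl get secret harbor-registry-secret -n openclaw \
    -o 'jsonpath={.data.\.dockerconfigjson}' \
    | base64 -d | jq -r '.auths | keys[]'
}
```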
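🎯 Core Principles
Simple > complex, verify > assume, back up > modify
Deployment principles
- Single PVC: avoids the scheduling problems caused by multiple PVCs
- No initContainer: start the Pod first, configure it afterwards
- Step-by-step verification: confirm each step succeeded before continuing
- Local validation: validate JSON config with jq before deploying it
Configuration principles
- Back up first: take a backup before every change
- Format validation: check JSON with jq
- kubectl cp: do not pipe config through heredoc or echo
- Restart to verify: restart the Pod after every config update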
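The "back up first" principle can be made routine with a small helper that copies the live config out of the Pod before any edit. This is a sketch under assumptions: `backup_config` and the timestamped filename pattern are illustrative, not taken from the original workflow.

```shell
# Hypothetical helper: save the live config to a timestamped local file
# before editing it, and fail if nothing could be read.
backup_config() {
  remote="${1:-/root/.openclaw/.openclaw/openclaw.json}"
  backup="openclaw.json.$(date +%Y%m%d-%H%M%S).bak"
  kubectl exec -n openclaw deployment/openclaw-gateway -- cat "$remote" \
    > "$backup" || { rm -f "$backup"; return 1; }
  [ -s "$backup" ] || { echo "backup is empty" >&2; rm -f "$backup"; return 1; }
  echo "$backup"
}
```

The function prints the name of the backup file, so it can be captured with `saved=$(backup_config)` and restored later with `kubectl cp` if a change goes wrong.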
Author: John
Date: 2026-03-10
Version: v1.0
Category: DevOps/K8s