Testing the Tencent Cloud CBS CSI driver
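CSI (Container Storage Interface) is the standard interface Kubernetes uses to talk to storage systems; Kubernetes support for it reached GA in 1.13. Cloud vendors and distributed storage projects can implement their own CSI-compliant drivers, so that Kubernetes can manage the lifecycle of persistent volumes through CSI calls.
If you run a self-managed Kubernetes cluster on Tencent Cloud and want to persist data on Tencent Cloud CBS cloud disks, you can install the CSI drivers that Tencent provides. CBS, COS, and CFS can all be consumed from Kubernetes this way. Git project: kubernetes-csi-tencentcloud
Installation is straightforward if you follow the documentation in the git repository.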
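StorageClass, PVC, and PV in brief
PV = a storage volume, an abstraction over an actual disk
PVC = a higher-level abstraction, normally consumed by a Pod or StatefulSet. It describes the storage requirements, such as size, lifecycle, and access mode, and a PV is matched against them
StorageClass = an abstraction over a storage product, such as Tencent Cloud CBS disks or NFS; a PVC that references a StorageClass can have its PV created automatically
A PVC can either reference a StorageClass to get a PV created automatically, or bind to a manually created PV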
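Creating a PV manually
Create a CBS disk manually in the Tencent Cloud console and note its disk id. You can then create a PV by hand and bind a PVC to it; the PVC finds the PV either by label matching or by naming the PV directly.
An example follows: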
kind: StorageClass
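- disk-7naj0icr is the disk that was created in the Tencent Cloud console
- The nodeAffinity section specifies the availability zone; on Tencent Cloud a CVM instance and a disk must be in the same zone for the disk to be attached
- The PVC can name the PV directly with volumeName: pvname
- Alternatively, selector.matchLabels can be used to bind the PVC to the PV
A newly created PV is in the Available state; once bound to a PVC it becomes Bound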
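The points above can be sketched as a static PV plus a PVC. This is only an illustration, not the original manifest: the driver name com.tencent.cloud.csi.cbs and the topology key are written as I recall them from the kubernetes-csi-tencentcloud README and should be checked against your installed version, and the object names, the class name cbs, the size, and the zone are hypothetical placeholders.

```yaml
# Sketch under stated assumptions; verify driver name and topology key
# against the installed kubernetes-csi-tencentcloud release.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cbs-static-pv                    # hypothetical name
  labels:
    disk: disk-7naj0icr                  # hypothetical label for selector matching
spec:
  capacity:
    storage: 10Gi                        # assumed; should match the real disk size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain  # keep the console-created disk when the PVC is deleted
  storageClassName: cbs                  # assumed class name; must equal the PVC's
  csi:
    driver: com.tencent.cloud.csi.cbs    # assumed driver name
    volumeHandle: disk-7naj0icr          # disk id from the console
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.com.tencent.cloud.csi.cbs/zone   # assumed topology key
              operator: In
              values:
                - ap-guangzhou-3         # hypothetical zone; use the disk's zone
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cbs-static-pvc                   # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: cbs                  # must match the PV
  resources:
    requests:
      storage: 10Gi
  volumeName: cbs-static-pv              # option 1: name the PV directly
  # option 2, instead of volumeName:
  # selector:
  #   matchLabels:
  #     disk: disk-7naj0icr
```

Only one of volumeName or selector is needed; naming the PV directly is the more predictable choice when there is exactly one disk to bind.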
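Creating PVs automatically with a StorageClass
This creates a 10 GB Premium Cloud Disk on demand, prepaid for a term of one month with automatic renewal on expiry. If your CVM nodes span multiple zones, remember to set volumeBindingMode to WaitForFirstConsumer so that provisioning is topology-aware.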
cbs.yml
kind: StorageClass
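For orientation, a StorageClass and PVC matching the description above might look roughly like the following. This is a sketch rather than the original cbs.yml: the provisioner name and the parameter keys (diskType, diskChargeType, diskChargeTypePrepaidPeriod, diskChargePrepaidRenewFlag) are written as I recall them from the driver's README and should be confirmed there, and the class name cbs-premium is hypothetical. The PVC name cbs-premium.10g is the one that appears in the kubectl output later in this post.

```yaml
# Sketch under stated assumptions; confirm parameter keys and values
# in the kubernetes-csi-tencentcloud CBS documentation.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cbs-premium                        # hypothetical class name
provisioner: com.tencent.cloud.csi.cbs     # assumed provisioner name
parameters:
  diskType: CLOUD_PREMIUM                  # Premium Cloud Disk
  diskChargeType: PREPAID                  # prepaid billing
  diskChargeTypePrepaidPeriod: "1"         # one month; parameter values must be strings
  diskChargePrepaidRenewFlag: NOTIFY_AND_AUTO_RENEW   # renew automatically on expiry
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer    # provision in the zone where the pod is scheduled
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cbs-premium.10g
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: cbs-premium
  resources:
    requests:
      storage: 10Gi
```

With WaitForFirstConsumer the PVC stays Pending until a pod that uses it is scheduled, which is expected behavior rather than an error.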
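When the pod is created, the PV is created automatically, and a new disk shows up in the Tencent Cloud console. With a Deployment, the volume would need to support shared read-write access, because a Deployment may create several pods; with ReadWriteOnce the volume can be mounted by only one node at a time, so extra pods run into mount problems.
If you do need a Deployment, set replicas to 1 and .spec.strategy.type to Recreate: all existing pods are killed before new ones are created, which guarantees the disk is never mounted twice.
A PVC status of Bound means it has been bound to a PV. PVs are cluster-scoped resources, while PVCs are namespaced.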
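A minimal sketch of such a Deployment is shown here. The name nettool-cbs is borrowed from the shell prompts later in this post, while the image, the command, and the labels are placeholders of mine; the claim name assumes the PVC cbs-premium.10g.

```yaml
# Sketch only: image, command, and labels are illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nettool-cbs
spec:
  replicas: 1                # a ReadWriteOnce disk can back only one pod here
  strategy:
    type: Recreate           # kill the old pod before starting the new one
  selector:
    matchLabels:
      app: nettool-cbs
  template:
    metadata:
      labels:
        app: nettool-cbs
    spec:
      containers:
        - name: nettool
          image: busybox:1.36              # placeholder image
          command: ["sh", "-c", "while true; do sleep 3600; done"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: cbs-premium.10g     # assumed PVC name
```

With the default RollingUpdate strategy the new pod would start while the old one still holds the disk, which is the double-mount situation Recreate avoids.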
ubuntu@k8s-dev-m1:~/k8s$ kubectl get pvc
Testing
Write performance
Roughly 100+ MB/s, which matches expectations
root@nettool-cbs:/# time dd if=/dev/zero of=/data/z0 count=1024 bs=1k
PV and data reclaim tests
After the pod has restarted once, the data is still there
ubuntu@k8s-dev-m1:~/k8s$ kubectl get po
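Deleting the pod can be a little slow because the disk has to be detached; once the delete completes, the Tencent Cloud console shows the disk in the "to be attached" state. After the pod is recreated, the data is still there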
root@nettool-cbs:/data# ls
Deleting the PVC also deletes the PV, because reclaimPolicy defaults to Delete; the Tencent Cloud console shows the disk in the "to be reclaimed" state
ubuntu@k8s-dev-m1:~/k8s$ kubectl delete pvc cbs-premium.10g
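Pitfalls
If PSP (PodSecurityPolicy) is enabled, the installation may fail, because the cbs-node container needs many privileges. In my tests it failed even after I bound a privileged PSP to the service account. PSP is on its way to being deprecated anyway, so I simply turned it off.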
Error: Warning FailedCreate 1s daemonset-controller Error creating: pods "cbs-csi-node-" is forbidden: unable to validate against any pod security policy: [spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.containers[1].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[4]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[5]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[6]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[1].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[4]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[5]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[6]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[1].securityContext.allowPrivilegeEscalation: Invalid value: true: Allowing privilege escalation for containers is not allowed spec.containers[1].securityContext.capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added spec.containers[1].volumeMounts[3].readOnly: Invalid value: false: must be read-only spec.containers[1].volumeMounts[4].readOnly: Invalid value: false: must be read-only]
Switching the images in the YAML to Tencent's own container registry makes pulls much faster
For example: ccr.ccs.tencentyun.com/k8scsi/csi-node-driver-registrar:v1.2.0
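If PVs are created automatically through a StorageClass, the Tencent Cloud sub-account needs the finance permission QcloudCVMFinanceAccess
Error: [TencentCloudSDKError] Code=UnauthorizedOperation.NotHavePaymentRight, Message=(6e17e85dfc0b)您没有付款权限,无法完成支付,请确认您的付款权限后重试, RequestId=3474841d-5273-42fb-abe8-6e17e85dfc0b (the message reads: you do not have payment permission and the payment cannot be completed; check your payment permission and try again)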
Binding a PVC to a PV manually also requires a storageClass: the storageClassName on the PVC and on the PV must match
Error: Cannot bind to requested volume "disk-7naj0icr": storageClassName does not match
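The metrics port conflicts with Calico and breaks node networking
The default metrics port is 9099, which the calico-node health endpoint also tries to listen on. The fix is to start CBS with -metric_port=19099, or to disable the metrics server with -enable_metrics_server=false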
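As a rough illustration, the flag would be appended to the args of the CBS driver container in whichever workload manifest starts it. The fragment below is an assumption about the manifest layout, not a copy of the project's YAML: the container name is hypothetical, and any args already present must be kept.

```yaml
# Fragment of a pod template spec; container name is a hypothetical placeholder.
containers:
  - name: cbs-csi                      # hypothetical; use the driver container in your manifest
    args:
      # ...keep the existing args here...
      - "-metric_port=19099"           # move the metrics port away from 9099
      # or, to turn the metrics server off instead:
      # - "-enable_metrics_server=false"
```

Moving the port keeps the metrics available; disabling the server is the simpler choice if nothing scrapes them.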
calico node error log
2022-01-21 07:33:49.433 [INFO][58] ipsets.go 304: Finished resync family="inet" numInconsistenciesFound=0 resyncDuration=1.018007ms
2022-01-21 07:33:49.433 [INFO][58] int_dataplane.go 765: Finished applying updates to dataplane. msecToApply=1.285159
2022-01-21 07:33:50.015 [ERROR][58] health.go 196: Health endpoint failed, trying to restart it... error=listen tcp 127.0.0.1:9099: bind: address already in use
2022-01-21 07:33:51.016 [ERROR][58] health.go 196: Health endpoint failed, trying to restart it... error=listen tcp 127.0.0.1:9099: bind: address already in use