I recently needed to set up a new Kubernetes cluster. Doing it with raw binaries is arguably the hard way: the process feels like assembling building blocks, but the payoff is a much clearer view of Kubernetes' internal components. This time I chose the fairly new 1.29 release. For well-known reasons, the container images below use domestic mirrors, or versions I re-tagged and pushed myself, wherever possible.
## Server planning

Since this is a production environment, I use 3 masters. Kubernetes needs etcd, kube-apiserver, kube-controller-manager and kube-scheduler on them, and I run these components as containers under a standalone kubelet, which means containerd has to be installed as well. If you are not deploying on a public cloud you will probably also need haproxy and keepalived to load-balance the apiserver. The nodes use ipvs for Services, and all servers run Ubuntu 22.04 upgraded to the 5.19 kernel. The server layout is as follows:
| hosts | ip | component |
|---|---|---|
| master1 | 10.1.1.1 | containerd, crictl, kubelet, etcd, kube-apiserver, kube-controller-manager, kube-scheduler |
| master2 | 10.1.1.2 | containerd, crictl, kubelet, etcd, kube-apiserver, kube-controller-manager, kube-scheduler |
| master3 | 10.1.1.3 | containerd, crictl, kubelet, etcd, kube-apiserver, kube-controller-manager, kube-scheduler |
| node1 | 10.1.1.4 | containerd, crictl, kubelet, kube-proxy, ipvs |
| node2 | 10.1.1.5 | containerd, crictl, kubelet, kube-proxy, ipvs |
## Container runtime

This cluster uses containerd as the container runtime, with crictl as the CLI client; both are installed on every node.
### containerd

Kubernetes introduced the CRI around the 1.1x releases and dropped docker as a runtime in 1.24, so the current recommendation is containerd or CRI-O. We install containerd from the binary release; the only part that needs care is the config file. First read containerd's "Versioning and release" notes to understand which containerd versions match which Kubernetes versions, then download the binary releases from the official containerd GitHub and runc. For the containerd configuration, the official Kubernetes "Container Runtimes" page is a good reference.
Extract the downloaded tarballs, put containerd under /usr/local/bin and runc under /usr/local/sbin, then configure containerd.
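As a rough sketch of those steps (the version numbers are placeholders I picked, not the ones used in this cluster; adjust to your release), assuming the default config path /etc/containerd/config.toml:

```bash
# Download and unpack the containerd release; binaries land in a local bin/ directory
wget https://github.com/containerd/containerd/releases/download/v1.7.20/containerd-1.7.20-linux-amd64.tar.gz
tar xzvf containerd-1.7.20-linux-amd64.tar.gz
sudo mv bin/* /usr/local/bin/

# runc ships as a single static binary
wget https://github.com/opencontainers/runc/releases/download/v1.1.13/runc.amd64
sudo install -m 755 runc.amd64 /usr/local/sbin/runc

# The config shown below goes to containerd's default config location
sudo mkdir -p /etc/containerd
sudo vim /etc/containerd/config.toml
```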
```toml
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
plugin_dir = ""
disabled_plugins = []
required_plugins = []
oom_score = 0

[grpc]
  address = "/run/containerd/containerd.sock"
  tcp_address = ""
  tcp_tls_cert = ""
  tcp_tls_key = ""
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[ttrpc]
  address = ""
  uid = 0
  gid = 0

[debug]
  address = ""
  uid = 0
  gid = 0
  level = ""

[metrics]
  address = ""
  grpc_histogram = false

[cgroup]
  path = ""

[timeouts]
  "io.containerd.timeout.shim.cleanup" = "5s"
  "io.containerd.timeout.shim.load" = "5s"
  "io.containerd.timeout.shim.shutdown" = "3s"
  "io.containerd.timeout.task.state" = "2s"

[plugins]
  [plugins."io.containerd.gc.v1.scheduler"]
    pause_threshold = 0.02
    deletion_threshold = 0
    mutation_threshold = 100
    schedule_delay = "0s"
    startup_delay = "100ms"
  [plugins."io.containerd.grpc.v1.cri"]
    disable_tcp_service = true
    stream_server_address = "127.0.0.1"
    stream_server_port = "0"
    stream_idle_timeout = "4h0m0s"
    enable_selinux = false
    sandbox_image = "ccr.ccs.tencentyun.com/google_container/pause-amd64:3.1"
    stats_collect_period = 10
    systemd_cgroup = false
    enable_tls_streaming = false
    max_container_log_line_size = 16384
    disable_cgroup = false
    disable_apparmor = false
    restrict_oom_score_adj = false
    max_concurrent_downloads = 3
    disable_proc_mount = false
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      no_pivot = false
      [plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
        runtime_type = ""
        runtime_engine = ""
        runtime_root = ""
        privileged_without_host_devices = false
      [plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
        runtime_type = ""
        runtime_engine = ""
        runtime_root = ""
        privileged_without_host_devices = false
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          runtime_engine = ""
          runtime_root = ""
          privileged_without_host_devices = false
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
    [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"
      max_conf_num = 1
      conf_template = ""
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://mirror.ccs.tencentyun.com","https://m.daocloud.io/docker.io"]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."quay.io"]
          endpoint = ["https://m.daocloud.io/quay.io"]
    [plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
      tls_cert_file = ""
      tls_key_file = ""
  [plugins."io.containerd.internal.v1.opt"]
    path = "/opt/containerd"
  [plugins."io.containerd.internal.v1.restart"]
    interval = "10s"
  [plugins."io.containerd.metadata.v1.bolt"]
    content_sharing_policy = "shared"
  [plugins."io.containerd.monitor.v1.cgroups"]
    no_prometheus = false
  [plugins."io.containerd.runtime.v1.linux"]
    shim = "containerd-shim"
    runtime = "runc"
    runtime_root = ""
    no_shim = false
    shim_debug = false
  [plugins."io.containerd.runtime.v2.task"]
    platforms = ["linux/amd64"]
  [plugins."io.containerd.service.v1.diff-service"]
    default = ["walking"]
  [plugins."io.containerd.snapshotter.v1.devmapper"]
    root_path = ""
    pool_name = ""
    base_image_size = ""
```
Make sure sandbox_image is set correctly: kubelet 1.27 removed the --pod-infra-container-image flag, so the pause image is now configured entirely on the containerd side.
Ubuntu 22.04 defaults to cgroup v2 (see the Kubernetes link above), so add SystemdCgroup = true under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]; the kubelet must be set to the systemd cgroup driver as well.
Create a systemd unit for containerd:
```ini
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd

Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=1048576
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target
```
Run sudo systemctl daemon-reload && sudo systemctl start containerd, and containerd should come up.
### crictl

crictl is likewise installed from the binary release on its official GitHub: extract it into /usr/local/bin, then write its config:
```bash
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
EOF
```
Once that is done, test it with sudo crictl images.
## Master components

### Standalone kubelet

On the masters the kubelet mainly acts as the base layer that runs the control-plane containers; kubelet itself is stable enough and keeps the setup uniform (docker compose would probably work too). If I remember correctly, a standalone kubelet was an officially recommended deployment method in older releases. Many online tutorials run etcd, the apiserver and the rest directly under systemd, which I find tedious to configure; running them as static pods under a standalone kubelet is simpler. Download the binary release from the Kubernetes GitHub; since this kubelet runs independently its version does not have to match the cluster, but I use 1.29 here as well. Extract it, put kubelet into /usr/local/bin, and create the systemd unit kubelet.service; this file can be reused on the worker nodes.
```ini
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service

[Service]
ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
ExecStartPre=/bin/mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
EnvironmentFile=/etc/kubernetes/kubelet
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target
```
/etc/kubernetes/kubelet is the environment file holding the binary's startup flags; many flags have been deprecated and moved into KubeletConfiguration.
```bash
KUBELET_ARGS="--config=/etc/kubernetes/kubeletConfig \
  --cert-dir=/etc/kubernetes/pki \
  --v=2"
```
On older versions, using containerd also required --container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock.
kubeletConfig is the main runtime configuration file and holds most of the settings; the standalone variant is fairly simple:
```yaml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
port: 10250
readOnlyPort: 10255
cgroupDriver: systemd
failSwapOn: true
staticPodPath: /etc/kubernetes/staticPods
imageGCHighThresholdPercent: 70
imageGCLowThresholdPercent: 50
runtimeRequestTimeout: 10m
enforceNodeAllocatable: ["pods"]
evictionHard:
  memory.available: 300M
authentication:
  anonymous:
    enabled: true
  webhook:
    enabled: false
authorization:
  mode: AlwaysAllow
```
Once kubelet is started via systemd, you can drop pod YAML files into /etc/kubernetes/staticPods.
## Certificates

Many of the communication and authentication paths between Kubernetes components rely on x509 certificates; the purpose of each certificate is described in the Kubernetes documentation. We generate them with cfssl: download cfssl and cfssljson into /usr/local/bin. Roughly the following certificates are needed:
| ca | cert | kind | usage |
|---|---|---|---|
| etcd-ca.pem, etcd-ca-key.pem | etcd-peer.pem, etcd-peer-key.pem | server, client | encrypts traffic between apiserver and etcd |
| k8s-ca.pem, k8s-ca-key.pem | apiserver.pem, apiserver-key.pem | server | apiserver serving certificate |
| | controller-manager.pem, controller-manager-key.pem | server, client | controller-manager certificate, authenticates the controller-manager |
| | kube-scheduler.pem, kube-scheduler-key.pem | server, client | scheduler certificate, authenticates the scheduler |
| | admin.pem, admin-key.pem | client | kubectl cluster administration |
| | kubelet-server.pem, kubelet-server-key.pem | server | manually specified kubelet serving certificate |
| kubelet-ca.pem, kubelet-ca-key.pem | kubelet-client.pem, kubelet-client-key.pem | client | authenticates the apiserver when it talks to the kubelet |
| agg-ca.pem, agg-ca-key.pem | agg.pem, agg-key.pem | client | aggregation layer authentication |
Certificates can currently use RSA or ECDSA keys; the generation steps are covered in an earlier post. An example CA CSR:
```json
{
  "CA": {
    "expiry": "876000h"
  },
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 4096
  },
  "names": [
    {
      "C": "CN",
      "ST": "GD",
      "L": "Shenzhen",
      "O": "dingding",
      "OU": "System"
    }
  ]
}
```
If expiry is left out, the CA defaults to 5 years, which will bite you in 5 years if you are not aware of it.
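The cfssl workflow itself is in the earlier post linked above; as a minimal sketch, assuming the CSR above is saved as ca-csr.json and using a hypothetical ca-config.json signing profile and etcd-peer-csr.json (these file names are my own placeholders), the etcd CA and peer certificate could be produced like this:

```bash
# Create the etcd CA
cfssl gencert -initca ca-csr.json | cfssljson -bare etcd-ca

# A signing profile that allows both server and client usage
cat > ca-config.json <<EOF
{
  "signing": {
    "profiles": {
      "peer": {
        "expiry": "876000h",
        "usages": ["signing", "key encipherment", "server auth", "client auth"]
      }
    }
  }
}
EOF

# Sign the etcd peer certificate; etcd-peer-csr.json lists the member IPs in its hosts field
cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem \
  -config=ca-config.json -profile=peer etcd-peer-csr.json | cfssljson -bare etcd-peer
```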
## etcd cluster

etcd is the storage engine of Kubernetes, so backups matter down the road. Building the cluster with containers is straightforward: copy the manifest into /etc/kubernetes/staticPods on each master. etcd yaml:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-server1
spec:
  hostNetwork: true
  containers:
  - image: quay.io/coreos/etcd:v3.5.15
    name: etcd-container
    command:
    - etcd
    - --name
    - etcd1
    - --data-dir
    - /data/etcd
    - --listen-peer-urls
    - https://10.1.1.1:2380
    - --listen-client-urls
    - https://0.0.0.0:4001
    - --initial-advertise-peer-urls
    - https://10.1.1.1:2380
    - --advertise-client-urls
    - https://10.1.1.1:4001
    - --initial-cluster
    - etcd1=https://10.1.1.1:2380,etcd2=https://10.1.1.2:2380,etcd3=https://10.1.1.3:2380
    - --initial-cluster-token
    - dtalk-k8s-etcd
    - --client-cert-auth
    - --trusted-ca-file=/etc/etcdssl/etcd-ca.pem
    - --cert-file=/etc/etcdssl/etcd-peer.pem
    - --key-file=/etc/etcdssl/etcd-peer-key.pem
    - --peer-client-cert-auth
    - --peer-trusted-ca-file=/etc/etcdssl/etcd-ca.pem
    - --peer-cert-file=/etc/etcdssl/etcd-peer.pem
    - --peer-key-file=/etc/etcdssl/etcd-peer-key.pem
    ports:
    - containerPort: 2380
      hostPort: 2380
      name: serverport
    - containerPort: 4001
      hostPort: 4001
      name: clientport
    volumeMounts:
    - mountPath: /data/etcd
      name: dataetcd
    - mountPath: /etc/etcdssl
      name: etcdssl
      readOnly: true
    - mountPath: /usr/lib/ssl
      name: usrlibssl
      readOnly: true
    - mountPath: /etc/ssl
      name: etcssl
      readOnly: true
  volumes:
  - hostPath:
      path: /data/etcd
    name: dataetcd
  - hostPath:
      path: /data/etcdssl
    name: etcdssl
  - hostPath:
      path: /usr/lib/ssl
    name: usrlibssl
  - hostPath:
      path: /etc/ssl
    name: etcssl
```
Adjust the IPs and the --name argument, and place one copy on each master. The flags map to the usual etcd settings:
- ETCD_NAME: node name, unique within the cluster
- ETCD_DATA_DIR: data directory
- ETCD_LISTEN_PEER_URLS: listen address for cluster (peer) traffic
- ETCD_LISTEN_CLIENT_URLS: listen address for client traffic
- ETCD_INITIAL_ADVERTISE_PEER_URLS: advertised peer address
- ETCD_ADVERTISE_CLIENT_URLS: advertised client address
- ETCD_INITIAL_CLUSTER: addresses of all cluster members
- ETCD_INITIAL_CLUSTER_TOKEN: cluster token
- ETCD_INITIAL_CLUSTER_STATE: state when joining; new for a new cluster, existing to join an existing one
Once all members are up, check the logs with crictl logs; if nothing looks wrong, run the health check:
```bash
etcdctl --cacert=/etc/etcdssl/etcd-ca.pem --cert=/etc/etcdssl/etcd-peer.pem --key=/etc/etcdssl/etcd-peer-key.pem \
  --endpoints=https://10.1.1.1:4001,https://10.1.1.2:4001,https://10.1.1.3:4001 \
  --write-out=table endpoint health
```
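Since backups were called out as important above, here is a minimal sketch of taking and checking a snapshot with etcdctl (the backup path is a placeholder of mine):

```bash
# Take a snapshot from one member
etcdctl --cacert=/etc/etcdssl/etcd-ca.pem --cert=/etc/etcdssl/etcd-peer.pem --key=/etc/etcdssl/etcd-peer-key.pem \
  --endpoints=https://10.1.1.1:4001 \
  snapshot save /data/backup/etcd-$(date +%F).db

# Inspect the snapshot
etcdctl --write-out=table snapshot status /data/backup/etcd-$(date +%F).db
```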
## kube-apiserver

The apiserver certificate CSR:
```json
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "10.1.1.1",
    "10.1.1.2",
    "10.1.1.3",
    "10.1.1.10",
    "192.168.0.1",
    "localhost",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GD",
      "L": "Shenzhen",
      "O": "dingding",
      "OU": "System"
    }
  ]
}
```
hosts must include the three masters and the load-balancer IP; 192.168.0.1 is the first IP of the cluster service range and is also the kubernetes.default service.
Create a token.csv for the initial kubelet node authentication:
```bash
cat > token.csv << EOF
$(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
```
The apiserver manifest:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
spec:
  hostNetwork: true
  containers:
  - name: kube-apiserver
    image: registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver-amd64:v1.29.7
    command:
    - kube-apiserver
    args:
    - --bind-address=0.0.0.0
    - --advertise-address=10.1.1.1
    - --secure-port=7443
    - --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,NodeRestriction
    - --service-cluster-ip-range=192.168.0.0/16
    - --enable-bootstrap-token-auth
    - --authorization-mode=RBAC
    - --token-auth-file=/etc/kubernetes/pki/token.csv
    - --client-ca-file=/etc/kubernetes/pki/k8s-ca.pem
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.pem
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver-key.pem
    - --service-account-key-file=/etc/kubernetes/pki/k8s-ca-key.pem
    - --service-account-signing-key-file=/etc/kubernetes/pki/k8s-ca-key.pem
    - --service-account-issuer=https://kubernetes.default.svc
    - --service-node-port-range=1-65535
    - --allow-privileged=true
    - --endpoint-reconciler-type=lease
    - --storage-backend=etcd3
    - --etcd-servers=https://10.1.1.1:4001,https://10.1.1.2:4001,https://10.1.1.3:4001
    - --etcd-cafile=/etc/kubernetes/pki/etcd-ca.pem
    - --etcd-certfile=/etc/kubernetes/pki/etcd-peer.pem
    - --etcd-keyfile=/etc/kubernetes/pki/etcd-peer-key.pem
    - --requestheader-client-ca-file=/etc/kubernetes/pki/agg-ca.pem
    - --proxy-client-cert-file=/etc/kubernetes/pki/agg.pem
    - --proxy-client-key-file=/etc/kubernetes/pki/agg-key.pem
    - --kubelet-client-certificate=/etc/kubernetes/pki/kubelet-client.pem
    - --kubelet-client-key=/etc/kubernetes/pki/kubelet-client-key.pem
    - --requestheader-allowed-names=aggregator
    - --requestheader-extra-headers-prefix=X-Remote-Extra-
    - --requestheader-group-headers=X-Remote-Group
    - --requestheader-username-headers=X-Remote-User
    - --enable-aggregator-routing=true
    - --v=2
    ports:
    - containerPort: 7443
      hostPort: 7443
      name: https
    volumeMounts:
    - mountPath: /etc/kubernetes/pki
      name: kubepki
      readOnly: true
    - mountPath: /etc/ssl
      name: etcssl
      readOnly: true
    - mountPath: /usr/lib/ssl
      name: usrlibssl
      readOnly: true
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki
    name: kubepki
  - hostPath:
      path: /etc/ssl
    name: etcssl
  - hostPath:
      path: /usr/lib/ssl
    name: usrlibssl
```
Remember to change the --advertise-address IP.

--enable-bootstrap-token-auth and --token-auth-file are used for kubelet node authentication; see kubelet TLS bootstrapping for details.

--requestheader-client-ca-file and the --proxy-client-* certificates authenticate extension API servers; --requestheader-allowed-names specifies the allowed certificate CNs. See the official docs on configuring the aggregation layer.

The --kubelet-client-* certificates are used by the apiserver to authenticate itself when talking to the kubelets.

Older apiservers could bind an insecure localhost:8080 listener for local use, which made the controller-manager and scheduler configs simpler (no certificates needed); newer versions no longer allow this.
## kubectl

Configure kubectl first, since it is needed for the following steps against the apiserver. Put the kubectl binary into /usr/local/bin and generate a kubeconfig from the admin.pem certificate:
```bash
kubectl config set-cluster kubernetes --certificate-authority=k8s-ca.pem --embed-certs=true --server=https://10.1.1.10:7443 --kubeconfig=kube.config
kubectl config set-credentials admin --client-certificate=admin.pem --client-key=admin-key.pem --embed-certs=true --kubeconfig=kube.config
kubectl config set-context kubernetes --cluster=kubernetes --user=admin --kubeconfig=kube.config
kubectl config use-context kubernetes --kubeconfig=kube.config
mv kube.config ~/.kube/config
```
Once configured, run kubectl version and check that the server version comes back correctly.
## kube-controller-manager

The controller-manager certificate CSR:
```json
{
  "CN": "system:kube-controller-manager",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "hosts": [
    "127.0.0.1",
    "10.1.1.1",
    "10.1.1.2",
    "10.1.1.3"
  ],
  "names": [
    {
      "C": "CN",
      "ST": "GD",
      "L": "Shenzhen",
      "O": "system:kube-controller-manager",
      "OU": "system"
    }
  ]
}
```
Both CN and O are system:kube-controller-manager. In the Kubernetes RBAC model the certificate CN is the user and O is the group; system:kube-controller-manager is the built-in role intended for the controller-manager.
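To double-check the subject after signing, an illustrative openssl command:

```bash
# Expect O=system:kube-controller-manager and CN=system:kube-controller-manager in the subject
openssl x509 -in controller-manager.pem -noout -subject
```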
Generate kube-controller-manager.kubeconfig, used to talk to the apiserver:
```bash
kubectl config set-cluster kubernetes --certificate-authority=k8s-ca.pem --embed-certs=true --server=https://10.1.1.10:7443 --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-credentials system:kube-controller-manager --client-certificate=controller-manager.pem --client-key=controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
```
controller-manager.yaml
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
spec:
  containers:
  - command:
    - kube-controller-manager
    args:
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig
    - --service-account-private-key-file=/etc/kubernetes/pki/k8s-ca-key.pem
    - --root-ca-file=/etc/kubernetes/pki/k8s-ca.pem
    - --feature-gates=RotateKubeletServerCertificate=true
    - --controllers=*,bootstrapsigner,tokencleaner
    - --cluster-signing-key-file=/etc/kubernetes/pki/k8s-ca-key.pem
    - --cluster-signing-cert-file=/etc/kubernetes/pki/k8s-ca.pem
    - --tls-cert-file=/etc/kubernetes/pki/controller-manager.pem
    - --tls-private-key-file=/etc/kubernetes/pki/controller-manager-key.pem
    - --use-service-account-credentials=true
    - --v=2
    - --leader-elect=true
    image: registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager-amd64:v1.29.7
    livenessProbe:
      tcpSocket:
        port: 10257
      initialDelaySeconds: 15
      timeoutSeconds: 1
    name: kube-controller-manager
    volumeMounts:
    - mountPath: /etc/kubernetes
      name: kubeconf
      readOnly: true
    - mountPath: /var/log/kubernetes
      name: logfile
    - mountPath: /etc/ssl
      name: etcssl
      readOnly: true
    - mountPath: /usr/lib/ssl
      name: usrlibssl
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /var/log/kubernetes
    name: logfile
  - hostPath:
      path: /etc/kubernetes
    name: kubeconf
  - hostPath:
      path: /etc/ssl
    name: etcssl
  - hostPath:
      path: /usr/lib/ssl
    name: usrlibssl
```
## kube-scheduler

The scheduler certificate CSR:
```json
{
  "CN": "system:kube-scheduler",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "hosts": [
    "127.0.0.1",
    "10.1.1.1",
    "10.1.1.2",
    "10.1.1.3"
  ],
  "names": [
    {
      "C": "CN",
      "ST": "GD",
      "L": "Shenzhen",
      "O": "system:kube-scheduler",
      "OU": "system"
    }
  ]
}
```
system:kube-scheduler is the built-in scheduler role.
Generate kube-scheduler.kubeconfig:
```bash
kubectl config set-cluster kubernetes --certificate-authority=k8s-ca.pem --embed-certs=true --server=https://10.1.1.10:7443 --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-credentials system:kube-scheduler --client-certificate=kube-scheduler.pem --client-key=kube-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
```
kube-scheduler.yaml
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
spec:
  hostNetwork: true
  containers:
  - name: kube-scheduler
    image: registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler-amd64:v1.29.7
    command:
    - kube-scheduler
    args:
    - --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig
    - --v=2
    - --leader-elect=true
    livenessProbe:
      tcpSocket:
        port: 10259
      initialDelaySeconds: 15
      timeoutSeconds: 1
    volumeMounts:
    - mountPath: /var/log/kubernetes
      name: logfile
    - mountPath: /etc/kubernetes
      name: kubeconf
      readOnly: true
    - mountPath: /etc/ssl
      name: etcssl
      readOnly: true
    - mountPath: /usr/lib/ssl
      name: usrlibssl
      readOnly: true
  volumes:
  - hostPath:
      path: /var/log/kubernetes
    name: logfile
  - hostPath:
      path: /etc/kubernetes
    name: kubeconf
  - hostPath:
      path: /etc/ssl
    name: etcssl
  - hostPath:
      path: /usr/lib/ssl
    name: usrlibssl
```
Verify that the control plane is up with kubectl get cs.
## Worker nodes

### OS tuning and base packages

The worker nodes need some sysctl tuning, time synchronisation and ulimit adjustments, plus ipvs for Service load balancing; the traditional iptables-based Service implementation does not scale well once there are many rules, so a fresh install might as well follow the current recommendation. On cloud servers time sync is usually preconfigured; if not, systemd-timesyncd is the simplest option.
ulimit: add the following to /etc/security/limits.conf:
```text
* soft nofile 655360
* hard nofile 655360
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
```
sysctl.conf: this set is from Ubuntu 20.04, but 22.04 should be much the same:
```text
net.core.netdev_max_backlog=10000
net.core.somaxconn=32768
net.ipv4.conf.all.rp_filter=1
net.ipv4.tcp_max_syn_backlog=8096
fs.inotify.max_user_instances=8192
fs.file-max=2097152
fs.inotify.max_user_watches=524288
net.core.bpf_jit_enable=1
net.core.bpf_jit_harden=1
net.core.bpf_jit_kallsyms=1
net.core.dev_weight_tx_bias=1
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_rmem=4096 12582912 16777216
net.ipv4.tcp_wmem=4096 12582912 16777216
net.core.rps_sock_flow_entries=8192
net.ipv4.neigh.default.gc_thresh1=2048
net.ipv4.neigh.default.gc_thresh2=4096
net.ipv4.neigh.default.gc_thresh3=8192
net.ipv4.tcp_max_orphans=32768
net.ipv4.tcp_max_tw_buckets=32768
net.ipv4.tcp_fastopen = 3
vm.max_map_count=262144
kernel.threads-max=30058
net.ipv4.ip_forward=1
```
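The plan above assumes ipvs is available on the nodes; as a sketch (the module list is the commonly used one, adjust to your kernel), installing the userspace tools and loading the modules on Ubuntu could look like this:

```bash
sudo apt-get install -y ipvsadm ipset conntrack

# Load the ipvs and container-networking modules at boot
cat <<EOF | sudo tee /etc/modules-load.d/ipvs.conf
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
overlay
br_netfilter
EOF
sudo systemctl restart systemd-modules-load

# Confirm the modules are present
lsmod | grep -e ip_vs -e nf_conntrack
```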
### kubelet

Install the binary as described above; on the worker nodes the configuration is a bit more involved. First create token.csv and the bootstrap kubeconfig used for the initial exchange between kubelet and apiserver: a low-privilege token is used to automatically obtain the certificate and kubeconfig used for real communication.
```bash
BOOTSTRAP_TOKEN=$(awk -F "," '{print $1}' /etc/kubernetes/token.csv)

kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.1.1.10:7443 --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-credentials kubelet-bootstrap --token=${BOOTSTRAP_TOKEN} --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig
```
The systemd unit is the same as in the earlier section; the kubelet args file:
```bash
KUBELET_ARGS="--config=/etc/kubernetes/kubeletConfig \
  --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --cert-dir=/etc/kubernetes/pki \
  --v=2"
```
kubelet.kubeconfig is generated automatically; you only need --bootstrap-kubeconfig configured.

--rotate-certificates is deprecated and will be removed in some future release; configure rotation in kubeletConfig instead.

kubeletConfig is the key configuration file; most settings live here:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
port: 10250
readOnlyPort: 10255
serializeImagePulls: false
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/kubelet-ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
cgroupsPerQOS: true
staticPodPath: /etc/kubernetes/staticPods
imageGCHighThresholdPercent: 70
imageGCLowThresholdPercent: 50
featureGates:
  RotateKubeletServerCertificate: true
rotateCertificates: true
serverTLSBootstrap: true
runtimeRequestTimeout: 10m
enforceNodeAllocatable: ["pods", "system-reserved", "kube-reserved"]
systemReservedCgroup: /system.slice
kubeReservedCgroup: /system.slice/kubelet.service
systemReserved:
  cpu: 500m
  memory: 512Mi
kubeReserved:
  cpu: 500m
  memory: 512Mi
evictionHard:
  memory.available: 300Mi
clusterDomain: "cluster.local"
clusterDNS:
- "169.254.20.10"
```
clientCAFile is the CA the kubelet uses to authenticate clients; requests from the apiserver to the kubelet use their own separate CA.

cgroupDriver: systemd, because Ubuntu 22.04 defaults to cgroup v2.

tlsCertFile and tlsPrivateKeyFile manually specify the kubelet serving certificate; note that its CA is k8s-ca.pem.

serverTLSBootstrap makes the kubelet request its serving certificate from the apiserver via a CSR. For security reasons, however, the CSR approver built into Kubernetes core does not auto-approve node serving certificates; to use the RotateKubeletServerCertificate feature, the cluster operator has to run a custom controller or approve the serving-certificate requests manually. See the notes on certificate rotation in the kubelet TLS docs.

If neither of the above is set, the kubelet self-signs a kubelet.crt/kubelet.key pair.

The clusterDNS IP here is the local DNS cache configured later; if you do not use a local DNS cache, use the second IP of the cluster service range, 192.168.0.2.

Because the kubelet has x509 authentication enabled and the apiserver is given a kubelet client certificate, a pair of clusterrolebindings is needed for authorization; the user is the certificate CN:
```bash
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
kubectl create clusterrolebinding kubelet-admin --clusterrole=system:kubelet-api-admin --user=kubeletadmin
```
Set up RBAC so that bootstrap CSRs are approved and renewed automatically:
```bash
kubectl create clusterrolebinding auto-approve-csrs-for-group --clusterrole=system:certificates.k8s.io:certificatesigningrequests:nodeclient --group=system:kubeletbootstrap --user=kubelet-bootstrap
kubectl create clusterrolebinding auto-approve-renewals-for-nodes --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeclient --group=system:nodes
```
Manually approve the kubelet serving CSR:
```bash
# kubectl -n kube-system get csr
NAME        AGE   SIGNERNAME                      REQUESTOR                 REQUESTEDDURATION   CONDITION
csr-brvcz   85s   kubernetes.io/kubelet-serving   system:node:<node name>   <none>              Pending

# kubectl certificate approve csr-brvcz
```
Once the condition becomes Approved,Issued you are done; a kubelet-server-<date>.pem file appears in the cert dir.
### kube-proxy

kube-proxy is deployed as a DaemonSet and authenticates directly with its ServiceAccount token, so no certificate has to be issued and there is no expiry to worry about. The manifest is adapted from bookstack.cn.
```bash
APISERVER="https://10.1.1.10:7443"
CLUSTER_CIDR="172.18.0.0/16"

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-proxy
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-proxy
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node-proxier
subjects:
- kind: ServiceAccount
  name: kube-proxy
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-proxy
  namespace: kube-system
  labels:
    app: kube-proxy
data:
  kubeconfig.conf: |-
    apiVersion: v1
    kind: Config
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: ${APISERVER}
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration
    bindAddress: 0.0.0.0
    clientConnection:
      acceptContentTypes: ""
      burst: 10
      contentType: application/vnd.kubernetes.protobuf
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
      qps: 5
    # CIDR range of the cluster's Pod IPs
    clusterCIDR: ${CLUSTER_CIDR}
    configSyncPeriod: 15m0s
    conntrack:
      # max NAT connections tracked per core, default 32768
      maxPerCore: 32768
      min: 131072
      tcpCloseWaitTimeout: 1h0m0s
      tcpEstablishedTimeout: 24h0m0s
    enableProfiling: false
    healthzBindAddress: 0.0.0.0:10256
    iptables:
      # SNAT all traffic destined for a Service CLUSTER IP
      masqueradeAll: false
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ipvs:
      minSyncPeriod: 0s
      # ipvs scheduler, default rr; supported values:
      # rr: round-robin
      # lc: least connection
      # dh: destination hashing
      # sh: source hashing
      # sed: shortest expected delay
      # nq: never queue
      scheduler: rr
      syncPeriod: 30s
    metricsBindAddress: 0.0.0.0:10249
    # forward Services in ipvs mode
    mode: ipvs
    # oom-score-adj for the kube-proxy process, range [-1000, 1000];
    # lower means less likely to be killed, -999 keeps kube-proxy alive during a system OOM
    oomScoreAdj: -999
EOF
```
CLUSTER_CIDR is the cluster's Pod CIDR; use a private range, the same as the IP pool in the calico configuration below.
Create the DaemonSet:
```bash
ARCH="amd64"
VERSION="v1.29.7"

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    k8s-app: kube-proxy-ds-${ARCH}
  name: kube-proxy-ds-${ARCH}
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: kube-proxy-ds-${ARCH}
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: kube-proxy-ds-${ARCH}
    spec:
      priorityClassName: system-node-critical
      containers:
      - name: kube-proxy
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy-${ARCH}:${VERSION}
        imagePullPolicy: IfNotPresent
        command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=\$(NODE_NAME)
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /var/lib/kube-proxy
          name: kube-proxy
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
      hostNetwork: true
      serviceAccountName: kube-proxy
      volumes:
      - name: kube-proxy
        configMap:
          name: kube-proxy
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: lib-modules
        hostPath:
          path: /lib/modules
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - operator: Exists
      nodeSelector:
        kubernetes.io/arch: ${ARCH}
EOF
```
At this point the kube components are basically installed. Check each component's logs for obvious errors; if everything looks clean, move on to the network setup.
If you see a lot of permission errors mentioning system:node:<nodeName>, check whether the system:node clusterrolebinding is set up correctly:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
```
## Networking and cluster add-ons

### calico

calico has been rock solid for me for a long time. On Tencent Cloud it needs the IPIP Always mode. The manifest can be downloaded from the official site; we use the Kubernetes API datastore and the manifest for clusters with fewer than 50 nodes:
```bash
curl https://raw.githubusercontent.com/projectcalico/calico/v3.27.4/manifests/calico.yaml -O
```
The file is long, but the main changes are:

- CALICO_IPV4POOL_CIDR: set to 172.18.0.0/16
- CALICO_IPV4POOL_IPIP: set to Always (CrossSubnet only uses IPIP encapsulation across subnets)
After editing, kubectl apply it, then confirm connectivity by pinging Pod IPs; if something is off, inspect routes and interfaces with ip route, ip addr, and so on.
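A quick connectivity check might look like this (the pod IP is an example value from the 172.18.0.0/16 pool):

```bash
# Wait for the calico-node pods to become Ready
kubectl -n kube-system get pods -l k8s-app=calico-node -o wide

# Pick a pod IP scheduled on another node and ping it
kubectl get pods -A -o wide
ping -c 3 172.18.25.14

# In IPIP mode, routes to remote pod CIDRs should point at the tunl0 interface
ip route | grep tunl0
```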
### coredns

CoreDNS is a key cluster component, handling DNS resolution inside the cluster. It has many plugins; besides kubernetes, plugins such as hosts and rewrite are also very useful, see the official plugin docs.
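For example (an illustrative snippet of mine, separate from the manifest below), a hosts block and a rewrite rule could be added to a Corefile like this:

```text
.:53 {
    # Pin a few internal records without running a separate DNS server
    hosts {
        10.1.1.100 gitlab.internal.example
        fallthrough
    }
    # Keep an old service name resolving after a rename
    rewrite name old-svc.default.svc.cluster.local new-svc.default.svc.cluster.local
    kubernetes cluster.local in-addr.arpa ip6.arpa
    forward . /etc/resolv.conf
}
```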
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
- apiGroups:
  - discovery.k8s.io
  resources:
  - endpointslices
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready :8181
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
    app.kubernetes.io/name: coredns
spec:
  # replicas: not specified here:
  # 1. Default is 1.
  # 2. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
      app.kubernetes.io/name: coredns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
        app.kubernetes.io/name: coredns
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: k8s-app
                operator: In
                values: ["kube-dns"]
            topologyKey: kubernetes.io/hostname
      containers:
      - name: coredns
        image: ccr.ccs.tencentyun.com/bxrapp/coredns:1.11.1
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
      - name: config-volume
        configMap:
          name: coredns
          items:
          - key: Corefile
            path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
    app.kubernetes.io/name: coredns
spec:
  selector:
    k8s-app: kube-dns
    app.kubernetes.io/name: coredns
  clusterIP: 192.168.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
```
ready :8181 must be present in the Corefile, otherwise the Deployment's readiness probe fails.
### local dns cache

A node-local DNS cache takes load off the main cluster DNS and avoids the 5-second DNS delays caused by conntrack conflicts when netfilter performs DNAT. It works by running a CoreDNS instance on every worker node as a DaemonSet; pods are pointed at this local address (via the kubelet config or an injected dnsConfig), so lookups hit the local cache first and only fall back to the main DNS on a cache miss.
References: the "本地 DNS 缓存" post on imroc.cc, and Tencent's article on using NodeLocal DNS Cache in TKE clusters.
By default, the resolver libraries in most images send the A and AAAA queries concurrently over UDP on the same socket. Because UDP is connectionless, the two requests may create conntrack entries concurrently; if they get DNAT-ed to the same cluster DNS Pod IP the entries collide, and since conntrack creation and insertion are not locked, the entry inserted later is dropped and that request times out, retrying after the default 5 seconds. That is the 5-second DNS delay. Images whose base library is glibc can control this behaviour through resolv.conf options (use TCP, or avoid identical five-tuples by resolving A and AAAA serially or over different sockets), but images based on alpine use musl libc, which does not support those resolv.conf options, so they cannot work around it; the best solution is still a local DNS cache.
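For glibc-based images, that resolv.conf workaround can be injected per pod with dnsConfig; a sketch (the pod name and image are placeholders, and this only illustrates the workaround, the local cache below remains the better fix):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-options-demo
spec:
  containers:
  - name: app
    image: ccr.ccs.tencentyun.com/library/nginx:latest  # placeholder image
  dnsConfig:
    options:
    - name: single-request-reopen  # retry A/AAAA lookups from a new socket to avoid the conntrack race
```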
Create the ServiceAccount and Service:
```bash
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns-upstream
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNSUpstream"
spec:
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns
EOF
```
Create the DaemonSet:
```bash
UPSTREAM_CLUSTER_IP=$(kubectl -n kube-system get services kube-dns-upstream -o jsonpath="{.spec.clusterIP}")

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  Corefile: |
    cluster.local:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 169.254.20.10
        forward . ${UPSTREAM_CLUSTER_IP} {
            force_tcp
        }
        prometheus :9253
        health 169.254.20.10:8080
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        forward . ${UPSTREAM_CLUSTER_IP} {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        forward . ${UPSTREAM_CLUSTER_IP} {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        forward . /etc/resolv.conf {
            force_tcp
        }
        prometheus :9253
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
    spec:
      priorityClassName: system-node-critical
      serviceAccountName: node-local-dns
      hostNetwork: true
      dnsPolicy: Default  # Don't use cluster DNS.
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      containers:
      - name: node-cache
        image: ccr.ccs.tencentyun.com/bxrapp/k8s-dns-node-cache:1.23.1
        resources:
          requests:
            cpu: 25m
            memory: 5Mi
        args: [ "-localip", "169.254.20.10", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns-upstream" ]
        securityContext:
          privileged: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9253
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            host: 169.254.20.10
            path: /health
            port: 8080
          initialDelaySeconds: 60
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - name: config-volume
          mountPath: /etc/coredns
        - name: kube-dns-config
          mountPath: /etc/kube-dns
      volumes:
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      - name: config-volume
        configMap:
          name: node-local-dns
          items:
          - key: Corefile
            path: Corefile.base
EOF
```
On every worker, change the cluster DNS address in kubeletConfig and restart kubelet:
```bash
sed -i 's/192.168.0.2/169.254.20.10/g' /etc/kubernetes/kubeletConfig
sudo systemctl restart kubelet
```
Since our earlier kubeletConfig already uses this address, nothing needs changing here.
Afterwards, running nslookup inside a pod shows that the DNS server has changed to 169.254.20.10.
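A quick check from a throwaway pod (the image choice is illustrative):

```bash
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never -- nslookup kubernetes.default
# The Server field of the output should now show 169.254.20.10
```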
### kubelet-csr-approver

This component auto-approves kubelet serving certificate CSRs; with it in place, kubelet server certificates can be approved and rotated automatically. See its GitHub for deployment and configuration. The main settings are:
- PROVIDER_REGEX: regular expression the node hostnames must match
- PROVIDER_IP_PREFIXES: the worker nodes' IP ranges; hostnames must resolve correctly in DNS (see the example below)
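As a sketch, assuming the Deployment-based install, these are set as environment variables on the kubelet-csr-approver container (the values are examples matching this cluster's layout, not defaults):

```yaml
# excerpt from the kubelet-csr-approver Deployment spec
env:
- name: PROVIDER_REGEX
  value: "^(master|node)[0-9]+$"
- name: PROVIDER_IP_PREFIXES
  value: "10.1.1.0/24"
```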
When a CSR is approved you will see a log line like:
```text
{"level":"INFO","ts":"2024-08-19T09:44:04.648Z","caller":"controller/csr_controller.go:169","msg":"CSR approved","controller":"certificatesigningrequest","controllerGroup":"certificates.k8s.io","controllerKind":"CertificateSigningRequest","CertificateSigningRequest":{"name":"csr-w9x65"},"namespace":"","name":"csr-w9x65","reconcileID":"f0deda4b-9a4f-4bae-8491-f0ff019cdf3d"}
```
### metrics server

metrics-server is an important extension API for the apiserver; kubectl top and HPA depend on it. The aggregation-layer x509 certificates configured on the apiserver earlier are exactly what authenticates these extension APIs. Install from the manifest:
```bash
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
After it starts you may run into two problems:
- The apiserver cannot reach metrics-server, because our apiserver cannot reach Pod IPs. Add hostNetwork: true and change --secure-port=10251 to avoid a port clash (see the sketch below); alternatively install kube-proxy and calico on the masters, but that is more hassle.
- kubelet x509 certificate problems: set serverTLSBootstrap: true in kubeletConfig and approve the serving certificates properly, or simply pass --kubelet-insecure-tls to metrics-server. See Zeng Xu's BLOG.
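A sketch of the corresponding edits to the metrics-server Deployment in components.yaml (only the changed lines are shown, and the values follow the notes above rather than the upstream defaults):

```yaml
# excerpt of the metrics-server Deployment in components.yaml
spec:
  template:
    spec:
      hostNetwork: true              # let our apiserver reach metrics-server without pod routing
      containers:
      - name: metrics-server
        args:
        - --secure-port=10251        # moved to avoid clashing with other host-network listeners
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls     # only if you skip proper kubelet serving certificates
```

Once it is running, kubectl top nodes should start returning node metrics after a minute or two.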