kainstall = kubeadm install kubernetes
Deploy a Kubernetes HA cluster with one command, using a shell script built on kubeadm, and easily get a robust cluster that is ready for production use.
https://github.com/lework/kainstall
Why build this? Isn't an Ansible playbook good enough?
Because I'm lazy. Ansible playbooks are great at orchestration, but they require installing Python and Ansible and downloading multiple YAML files. Because I'm lazy, I wanted a simpler way to deploy a distributed Kubernetes HA cluster quickly: a shell script runs directly on the servers without any extra tooling, saving time and effort. The script is a single file of roughly 100 KB, small enough to give you the super fast experience of installing a cluster with one command, and combined with the offline package it can install a cluster without internet access, which is a genuinely great experience.
During installation the script sets up the following on each node:
- selinux
- swap
- firewalld
- epel repository
- limits
- history logging
- journal logging
- chrony time synchronization
- ssh-login-info
- audit logging
- ipvs kernel modules
- docker and kube components

It supports installing:
- a kubernetes cluster, plus adding or removing nodes (see the add example after the install command below)
- ingress components: nginx or traefik
- network components: flannel or calico (must be chosen at init time)
- monitor component: prometheus
- log component: elasticsearch
- storage components: rook or longhorn
- web UI components: dashboard or kubesphere
- addon components: metrics-server, nodelocaldns
- a specific kubernetes version

Install a cluster online with a single command:
bash -c "$(curl -sSL https://cdn.jsdelivr.net/gh/lework/kainstall/kainstall.sh)" \
- init \
--master 192.168.77.130,192.168.77.131,192.168.77.132 \
--worker 192.168.77.133,192.168.77.134 \
--user root \
--password 123456 \
--port 22 \
--version 1.19.3
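As a further illustration of the add/remove node support listed above, here is a minimal sketch of adding one more worker to the cluster created by the command above. The add subcommand and the -w/--worker flag are the ones used in the comments further down; the IP, credentials, and version are the same illustrative values as above, so adjust them to your environment:
# Sketch: add an extra worker node to an existing kainstall cluster.
bash -c "$(curl -sSL https://cdn.jsdelivr.net/gh/lework/kainstall/kainstall.sh)" \
- add \
--worker 192.168.77.135 \
--user root \
--password 123456 \
--port 22 \
--version 1.19.3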
For more operations, see the kainstall repository.
To install offline, first download the offline package:
wget http://kainstall.oss-cn-shanghai.aliyuncs.com/1.19.3/centos7.tgz
bash -c "$(curl -sSL https://cdn.jsdelivr.net/gh/lework/kainstall/kainstall.sh)" \
- init \
--master 192.168.77.130,192.168.77.131,192.168.77.132 \
--worker 192.168.77.133,192.168.77.134 \
--user root \
--password 123456 \
--port 22 \
--version 1.19.3 \
--offline-file centos7.tgz
More offline packages: the kainstall-offline repository.
A QQ group (467645743) has been set up; feel free to join if you have questions.
+1, great project!
Great work, the shell scripting is really well written. Could you share how you learned shell? Do you have any other shell projects?
Hello: I deployed a k8s cluster with the script without specifying a version, using bash kainstall-centos.sh init --master 192.168.200.121,192.168.200.122 --worker 192.168.200.123,192.168.200.124 --user root --password 1qaz2wsx --port 22, so the default version was chosen.
On completion it reported errors:
ERROR Summary:
[2021-08-17T00:39:00.081914830+0800]: ERROR: [waiting] ingress-nginx pod ready failed.
[2021-08-17T00:39:35.795033905+0800]: ERROR: [apply] add kubernetes dashboard ingress failed.
Looking at the log I found these errors:
utils::retry 6 kubectl wait --namespace ingress-nginx --for=condition=ready pod --selector=app.kubernetes.io/component=controller --timeout=60s
Warning: Permanently added '192.168.200.121' (ECDSA) to the list of known hosts.
error: timed out waiting for the condition on pods/ingress-nginx-controller-7d8f68bdbd-pk88q
Retry 1/6 exited 1, retrying in 1 seconds...
error: timed out waiting for the condition on pods/ingress-nginx-controller-7d8f68bdbd-pk88q
Retry 2/6 exited 1, retrying in 2 seconds...
error: timed out waiting for the condition on pods/ingress-nginx-controller-7d8f68bdbd-pk88q
Retry 3/6 exited 1, retrying in 4 seconds...
error: timed out waiting for the condition on pods/ingress-nginx-controller-7d8f68bdbd-pk88q
Retry 4/6 exited 1, retrying in 8 seconds...
error: timed out waiting for the condition on pods/ingress-nginx-controller-7d8f68bdbd-pk88q
Retry 5/6 exited 1, retrying in 16 seconds...
error: timed out waiting for the condition on pods/ingress-nginx-controller-7d8f68bdbd-pk88q
Retry 6/6 exited 1, no more retries left.
[2021-08-17T00:39:00.081914830+0800]: ERROR: [waiting] ingress-nginx pod ready failed.
Warning: Permanently added '192.168.200.121' (ECDSA) to the list of known hosts.
error: unable to recognize "STDIN": no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"
Retry 1/6 exited 1, retrying in 1 seconds...
error: unable to recognize "STDIN": no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"
Retry 2/6 exited 1, retrying in 2 seconds...
error: unable to recognize "STDIN": no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"
Retry 3/6 exited 1, retrying in 4 seconds...
error: unable to recognize "STDIN": no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"
Retry 4/6 exited 1, retrying in 8 seconds...
error: unable to recognize "STDIN": no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"
Retry 5/6 exited 1, retrying in 16 seconds...
error: unable to recognize "STDIN": no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"
Retry 6/6 exited 1, no more retries left.
[2021-08-17T00:39:35.795033905+0800]: ERROR: [apply] add kubernetes dashboard ingress failed.
Checking the service status:
[root@master01 opt]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                      ERROR
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Healthy     ok
etcd-0               Healthy     {"health":"true","reason":""}
[root@master01 opt]#
[root@master01 opt]# kubectl get nodes
NAME               STATUS   ROLES                  AGE   VERSION
k8s-master-node1   Ready    control-plane,master   23m   v1.22.0
k8s-master-node2   Ready    control-plane,master   19m   v1.22.0
k8s-worker-node1   Ready    worker                 18m   v1.22.0
k8s-worker-node2   Ready    worker                 18m   v1.22.0
[root@master01 opt]# kubectl get ns
NAME                   STATUS   AGE
default                Active   24m
ingress-nginx          Active   15m
kube-node-lease        Active   24m
kube-public            Active   24m
kube-system            Active   24m
kubernetes-dashboard   Active   8m44s
[root@master01 opt]#
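A side note on the scheduler Unhealthy line above: recent kubeadm releases generate the kube-scheduler and kube-controller-manager manifests with the insecure ports disabled (--port=0), so the port 10251 probe that kubectl get cs relies on is refused even when the scheduler is fine, and ComponentStatus itself is deprecated, as the warning shows. A simple way to double-check, assuming the default kubeadm pod labels:
# Check the control-plane static pods instead of the deprecated ComponentStatus API.
kubectl get pods -n kube-system -l component=kube-scheduler
kubectl get pods -n kube-system -l component=kube-controller-manager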
Because 1.22 removed v1beta1, the ingress controller pieces all had problems. That issue has now been fixed, so you can reinstall ingress.
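For reference, the API change behind those errors: Ingress moved out of networking.k8s.io/v1beta1 (removed in 1.22) into networking.k8s.io/v1, where pathType is required and the backend is written as service.name/service.port instead of serviceName/servicePort. Below is a minimal sketch of a dashboard-style Ingress in the v1 form; the host, annotation, and ingressClassName are illustrative assumptions, not the script's actual manifest:
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"   # the dashboard service serves HTTPS
spec:
  ingressClassName: nginx               # assumed ingress-nginx class name
  rules:
  - host: dashboard.example.com         # illustrative host
    http:
      paths:
      - path: /
        pathType: Prefix                # required field in networking.k8s.io/v1
        backend:
          service:                      # replaces serviceName/servicePort from v1beta1
            name: kubernetes-dashboard
            port:
              number: 443
EOF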
The QQ link in the kainstall README doesn't open; could you post the QQ group number?
After adding elasticsearch, ES reports an error:
java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:292)
at org.elasticsearch.node.Node.<init>(Node.java:376)
at org.elasticsearch.node.Node.<init>(Node.java:281)
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:219)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:219)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:399)
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159)
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:75)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:116)
at org.elasticsearch.cli.Command.main(Command.java:79)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:81)
For complete error details, refer to the log at /usr/share/elasticsearch/logs/k8s-logs.log
You need to add the following under env in the es-cluster YAML:
- name: node.max_local_storage_nodes
  value: "3"
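For context, a minimal sketch of where that env entry sits, assuming the es-cluster manifest is a StatefulSet with a container named elasticsearch (the surrounding names and replica count are illustrative, not taken from the script):
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: elasticsearch
        env:
        - name: node.max_local_storage_nodes
          value: "3"   # allow up to 3 ES nodes to share the same data path, avoiding the node-lock error above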
After resetting the cluster, reinstalling fails. Steps to reproduce:
1. bash kainstall-ubuntu.sh init -m xxx --version 1.20.6 (works)
2. bash kainstall-ubuntu.sh add -w xxx --version 1.20.6 (works)
3. bash kainstall-ubuntu.sh reset (works)
4. bash kainstall-ubuntu.sh init -m 172.17.31.10 -w 172.17.31.11 --version 1.20.6 (fails)
[2021-12-07T22:48:43.843009307+0800]: INFO: [check] sshpass command exists.
[2021-12-07T22:48:43.844762232+0800]: INFO: [check] wget command exists.
[2021-12-07T22:48:43.967038796+0800]: INFO: [check] ssh 172.17.31.10 connection succeeded.
[2021-12-07T22:48:44.131261546+0800]: INFO: [check] ssh 172.17.31.11 connection succeeded.
[2021-12-07T22:48:44.132638066+0800]: INFO: [check] os support: ubuntu20.04 ubuntu20.10 ubuntu21.04 ubuntu18.04
[2021-12-07T22:48:44.246906858+0800]: INFO: [check] 172.17.31.10 os support succeeded.
[2021-12-07T22:48:44.391252598+0800]: INFO: [check] 172.17.31.11 os support succeeded.
[2021-12-07T22:48:44.394797116+0800]: INFO: [init] Get 172.17.31.10 InternalIP.
[2021-12-07T22:48:44.512532553+0800]: INFO: [command] get MGMT_NODE_IP value succeeded.
[2021-12-07T22:48:44.514228279+0800]: INFO: [init] master: 172.17.31.10
[2021-12-07T22:48:52.129289350+0800]: INFO: [init] init master 172.17.31.10 succeeded.
[2021-12-07T22:48:52.559981054+0800]: INFO: [init] 172.17.31.10 set hostname and hostname resolution succeeded.
[2021-12-07T22:48:52.561673278+0800]: INFO: [init] 172.17.31.10: set audit-policy file.
[2021-12-07T22:48:52.676739186+0800]: INFO: [init] 172.17.31.10: set audit-policy file succeeded.
[2021-12-07T22:48:52.678454078+0800]: INFO: [init] worker: 172.17.31.11
[2021-12-07T22:49:00.137476057+0800]: INFO: [init] init worker 172.17.31.11 succeeded.
[2021-12-07T22:49:00.656946306+0800]: INFO: [install] install docker on 172.17.31.10.
[2021-12-07T22:49:06.107280418+0800]: INFO: [install] install docker on 172.17.31.10 succeeded.
[2021-12-07T22:49:06.109137495+0800]: INFO: [install] install kube on 172.17.31.10
[2021-12-07T22:49:13.957090675+0800]: INFO: [install] install kube on 172.17.31.10 succeeded.
[2021-12-07T22:49:13.959173245+0800]: INFO: [install] install docker on 172.17.31.11.
[2021-12-07T22:49:17.974488740+0800]: INFO: [install] install docker on 172.17.31.11 succeeded.
[2021-12-07T22:49:17.976069907+0800]: INFO: [install] install kube on 172.17.31.11
[2021-12-07T22:49:25.850685739+0800]: INFO: [install] install kube on 172.17.31.11 succeeded.
[2021-12-07T22:49:25.852310759+0800]: INFO: [install] install haproxy on 172.17.31.11
[2021-12-07T22:49:28.783115900+0800]: INFO: [install] install haproxy on 172.17.31.11 succeeded.
[2021-12-07T22:49:28.784874939+0800]: INFO: [kubeadm init] kubeadm init on 172.17.31.10
[2021-12-07T22:49:28.786523020+0800]: INFO: [kubeadm init] 172.17.31.10: set kubeadmcfg.yaml
[2021-12-07T22:49:28.909218319+0800]: INFO: [kubeadm init] 172.17.31.10: set kubeadmcfg.yaml succeeded.
[2021-12-07T22:49:28.910937717+0800]: INFO: [kubeadm init] 172.17.31.10: kubeadm init start.
[2021-12-07T22:53:32.232152130+0800]: ERROR: [kubeadm init] 172.17.31.10: kubeadm init failed.
ERROR Summary:
[2021-12-07T22:53:32.232152130+0800]: ERROR: [kubeadm init] 172.17.31.10: kubeadm init failed.
See detailed log >>> /tmp/kainstall.lZtGxGCHJV/kainstall.log
The error log:
root@i-hx0g9sad:~# tail -40 /tmp/kainstall.lZtGxGCHJV/kainstall.log
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
[2021-12-07T22:53:32.232152130+0800]: ERROR: [kubeadm init] 172.17.31.10: kubeadm init failed.
The kubelet error: