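As information technology advances, enterprise business systems grow more complex and place higher demands on stability, reliability, and performance, so an efficient and dependable monitoring platform becomes essential. Building Zabbix on a k8s cluster that runs the Kylin operating system combines the container orchestration of k8s, the monitoring capabilities of Zabbix, and the security of Kylin. This article walks through that setup step by step: preparing the nodes, installing Docker and Kubernetes, and deploying MySQL, Zabbix Server, Zabbix Web, and Zabbix Agent.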
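1.1 Configure a static IP

Example (minimum required parameters; replace GATEWAY with your own gateway, which must be reachable from the IPADDR/PREFIX subnet):
[root@k8s-master01 ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens192
BOOTPROTO="none"
NAME="ens192"
DEVICE="ens192"
ONBOOT="yes"
IPADDR=192.168.3.174
PREFIX=24
GATEWAY=192.168.1.1
DNS1=114.114.114.114
[root@k8s-master01 ~]# systemctl restart network  # restart the network service
[root@k8s-master01 ~]# ip a  # check the IP address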
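In the example, IPADDR with PREFIX=24 implies the 192.168.3.0/24 network, while GATEWAY is 192.168.1.1, so it is worth checking that the values you enter actually belong together. The following is a minimal sketch of such a check in plain shell arithmetic. The helper names ip_to_int and same_subnet are hypothetical, not system tools, and the sketch assumes dotted-quad IPv4 addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address (e.g. 192.168.3.174) to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# same_subnet IP GATEWAY PREFIX
# Returns 0 when both addresses share the same network for the prefix length.
same_subnet() {
  mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}
```

For instance, same_subnet 192.168.3.174 192.168.1.1 24 returns a non-zero status, which suggests the gateway value should be adjusted before restarting the network service.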
1.2 Add hosts entries
[root@k8s-master01 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.3.174 k8s-master01
192.168.3.175 k8s-node01
192.168.3.176 k8s-node02
[root@k8s-master01 ~]#
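If you script this step on every node instead of editing /etc/hosts by hand, appending entries idempotently avoids duplicates when the script is re-run. A minimal sketch follows. add_host_entry is a hypothetical helper, and its optional third argument (the file to edit) is an assumption added so the function can be tried on a scratch file first.

```shell
# add_host_entry IP NAME [FILE]
# Appends "IP NAME" to FILE (default /etc/hosts) unless NAME is already listed.
add_host_entry() {
  ip=$1
  name=$2
  file=${3:-/etc/hosts}
  # awk exits 0 only if some hostname field (column 2 onward) equals NAME exactly.
  if ! awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}
```

Calling add_host_entry 192.168.3.175 k8s-node01 twice leaves a single line for k8s-node01 in the file.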
(1) Remove any existing Docker and configure the repositories
# Remove older Docker versions that may be installed
sudo yum remove docker docker-engine docker-ce 'docker-*' containerd.io
# Add the Aliyun CentOS 8 base repository
sudo curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-8.repo
# Add the Docker CE repository (Aliyun mirror)
sudo yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Fix the yum variable (key step: force the CentOS 8 packages)
echo "8" | sudo tee /etc/yum/vars/centos_version > /dev/null
sudo sed -i 's/\$releasever/$centos_version/g' /etc/yum.repos.d/docker-ce.repo
sudo sed -i 's/\$releasever/$centos_version/g' /etc/yum.repos.d/CentOS-Base.repo
# Refresh the cache
sudo yum makecache
# List all available Docker CE versions
yum list docker-ce --showduplicates | sort -r
# Let the package manager resolve dependencies; this installs the default or latest version in the repository
yum install -y docker-ce docker-ce-cli containerd.io
# Or install a recent but not the latest version (for example 26.1.3)
yum install -y docker-ce-26.1.3-1.el8 docker-ce-cli-26.1.3-1.el8 containerd.io
(2) Configure a domestic registry mirror for Docker
# Create /etc/docker if it does not exist
sudo mkdir -p /etc/docker
# Configure the Docker daemon, in particular the cgroup driver
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "registry-mirrors": ["https://hub.docker-alhk.dkdun.com/"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
# Restart Docker to apply the configuration
sudo systemctl daemon-reload
sudo systemctl restart docker
# Enable Docker at boot
sudo systemctl enable docker.service
sudo systemctl status docker
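The cgroup driver matters because the kubelet and the container runtime need to use the same one. As a quick sanity check before moving on, the sketch below confirms that the setting made it into the configuration file. uses_systemd_cgroup is a hypothetical helper, and the plain-text match assumes the option is written exactly as in the daemon.json above.

```shell
# uses_systemd_cgroup [FILE]
# Returns 0 if FILE (default /etc/docker/daemon.json) sets the systemd cgroup driver.
uses_systemd_cgroup() {
  grep -q '"native.cgroupdriver=systemd"' "${1:-/etc/docker/daemon.json}"
}
```

On a running host, docker info --format '{{.CgroupDriver}}' reports the driver the daemon is actually using, and after the restart it should print systemd.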
(1) Configure a domestic yum repository for Kubernetes
cat <<EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Refresh the index cache
yum makecache
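(2) Install kubelet, kubeadm, and kubectl
# Install a pinned version; pinning is recommended to avoid compatibility problems
yum install -y kubeadm-1.23.17-0 kubelet-1.23.17-0 kubectl-1.23.17-0 --disableexcludes=kubernetes
# Or install the current latest stable version
# sudo yum install -y kubelet kubeadm kubectl
# Enable the kubelet service; it does not run normally until the cluster is initialized
sudo systemctl enable kubelet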
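(1) Initialize the Kubernetes control plane (master node)
Run the following steps on the master node only:
# --apiserver-advertise-address: replace with the IP of your master node
# --image-repository: use a domestic image registry
# --kubernetes-version: must match the installed version
# --service-cidr: IP address range of the Service network
# --pod-network-cidr: Pod network CIDR; must match the CNI plugin deployed later
kubeadm init \
  --apiserver-advertise-address=192.168.3.174 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version=v1.23.17 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16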
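The Service CIDR, the Pod CIDR, and the node network (192.168.3.0/24 in this article) should not overlap, otherwise routing inside the cluster becomes ambiguous. Before running kubeadm init with your own ranges, a check along the following lines can help. This is a sketch under assumptions: ip_to_int, cidr_range, and cidrs_overlap are hypothetical helper names, and only IPv4 CIDRs are handled.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# cidr_range CIDR
# Prints "FIRST LAST": the first and last address of the CIDR as integers.
cidr_range() {
  addr=${1%/*}
  prefix=${1#*/}
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  first=$(( $(ip_to_int "$addr") & mask ))
  last=$(( first | (~mask & 0xFFFFFFFF) ))
  echo "$first $last"
}

# cidrs_overlap CIDR_A CIDR_B
# Returns 0 if the two ranges share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  # Ranges [a,b] and [c,d] overlap exactly when a <= d and c <= b.
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}
```

With the values used here, cidrs_overlap 10.96.0.0/12 10.244.0.0/16 returns a non-zero status, meaning the Service and Pod ranges are disjoint.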
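On success, kubeadm prints follow-up instructions, including the commands below and the kubeadm join command for the worker nodes.

Then run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config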
On each worker node, join the cluster (use the token and hash printed by your own kubeadm init):
kubeadm join 192.168.3.174:6443 --token g5aqri.02haydobvjfu7dnl \
  --discovery-token-ca-cert-hash sha256:95fd4a7fd82ccde7b329ea551130340a837a3aa612145caa2435d2c18b3c91ef
Check the cluster:
[root@k8s-master01 ~]# kubectl label node k8s-node01 node-role.kubernetes.io/worker=worker
node/k8s-node01 labeled
[root@k8s-master01 ~]# kubectl label node k8s-node02 node-role.kubernetes.io/worker=worker
node/k8s-node02 labeled
[root@k8s-master01 ~]# kubectl get nodes
NAME           STATUS     ROLES                  AGE     VERSION
k8s-master01   NotReady   control-plane,master   4h29m   v1.23.17
k8s-node01     NotReady   worker                 3m45s   v1.23.17
k8s-node02     NotReady   worker                 41m     v1.23.17
[root@k8s-master01 ~]#
# After a while (nodes report Ready only once the CNI network plugin is running)
[root@k8s-master01 ~]# kubectl get nodes
NAME           STATUS   ROLES                  AGE   VERSION
k8s-master01   Ready    control-plane,master   24m   v1.23.17
k8s-node01     Ready    worker                 18m   v1.23.17
k8s-node02     Ready    worker                 18m   v1.23.17
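Instead of re-running kubectl get nodes by hand until every node shows Ready, the STATUS column can be checked in a script. The sketch below assumes the default column layout shown above (STATUS is the second column). all_nodes_ready is a hypothetical helper that reads the output of kubectl get nodes --no-headers from standard input.

```shell
# all_nodes_ready
# Reads "kubectl get nodes --no-headers" output on stdin.
# Returns 0 only if there is at least one node and every STATUS is exactly "Ready".
all_nodes_ready() {
  awk '$2 != "Ready" { bad = 1 } END { exit (bad || NR == 0) }'
}
```

A possible use is kubectl get nodes --no-headers | all_nodes_ready && echo "cluster ready". Note that a cordoned node reports Ready,SchedulingDisabled and is treated as not ready by this check. kubectl also offers a built-in alternative, kubectl wait --for=condition=Ready nodes --all --timeout=300s.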
If pulling the images fails, import them manually first. Download and extract zabbix-all.zip, shared via Baidu Netdisk:
Link: https://pan.baidu.com/s/1VsVlPF8RAZUlP3F2b5IxYQ?pwd=5h2u (extraction code: 5h2u)
# Import Zabbix Server
docker load -i zabbix-server-mysql.tar.gz
# Import Zabbix Web
docker load -i zabbix-web-nginx-mysql.tar.gz
# Import Zabbix Agent
docker load -i zabbix-agent.tar.gz
# Import the database
docker load -i mysql:8.0.gz
(1) This test uses MySQL. Create the file mysql-deploy.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
  namespace: zabbix
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi  # adjust the size to your needs
  # If you have a StorageClass, uncomment and set it (for example storageClassName: "standard")
  # storageClassName: "your-storage-class"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
  namespace: zabbix
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0  # official MySQL image
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3306
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "Root@123"  # root password; change it
            - name: MYSQL_DATABASE
              value: "zabbix"  # Zabbix database name
            - name: MYSQL_USER
              value: "zabbix"  # Zabbix database user
            - name: MYSQL_PASSWORD
              value: "zabbix"  # Zabbix user password; change it
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql  # MySQL data directory
          resources:
            requests:
              cpu: 300m
              memory: 300Mi
            limits:
              cpu: 800m
              memory: 800Mi
      volumes:
        - name: mysql-data
          persistentVolumeClaim:
            claimName: mysql-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
  namespace: zabbix
spec:
  selector:
    app: mysql
  ports:
    - port: 3306
      targetPort: 3306
  type: ClusterIP  # reachable only inside the cluster
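Every manifest in this article sets namespace: zabbix, but none of the steps shown creates that namespace, and kubectl apply rejects namespaced objects when their namespace does not exist. Assuming the namespace is not yet present in your cluster, one way to create it is sketched below. namespace_manifest is a hypothetical helper that only prints a Namespace manifest, so it can be inspected before anything is applied.

```shell
# namespace_manifest NAME
# Prints a minimal Namespace manifest for NAME on stdout.
namespace_manifest() {
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: $1
EOF
}
```

Piping the result into kubectl, as in namespace_manifest zabbix | kubectl apply -f -, creates the namespace; kubectl create namespace zabbix has the same effect.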
(2) Apply the deployment
kubectl apply -f mysql-deploy.yaml
(3) Verify that MySQL is running
kubectl get pods -n zabbix -l app=mysql  # STATUS should be Running
Example:
[root@k8s-master01 ~]# kubectl get pods -n zabbix -l app=mysql
NAME                     READY   STATUS    RESTARTS   AGE
mysql-758b6cf559-7d2kh   1/1     Running   0          40m
[root@k8s-master01 ~]#
(1) Zabbix Server is the core component that processes monitoring data, and it needs a connection to MySQL. Create zabbix-server-deploy.yaml:
[root@k8s-master01 ~]# cat zabbix-server-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zabbix-server
  namespace: zabbix
spec:
  replicas: 1
  selector:
    matchLabels:
      app: zabbix-server
  template:
    metadata:
      labels:
        app: zabbix-server
    spec:
      containers:
        - name: zabbix-server
          image: zabbix/zabbix-server-mysql:latest  # official Zabbix Server image (version 6.4)
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 10051  # default Zabbix Server port
          env:
            - name: DB_SERVER_HOST
              value: "mysql"  # name of the MySQL Service
            - name: MYSQL_DATABASE
              value: "zabbix"
            - name: MYSQL_USER
              value: "zabbix"
            - name: MYSQL_PASSWORD
              value: "zabbix"  # must match the Zabbix user password in MySQL
            - name: MYSQL_ROOT_PASSWORD
              value: "Root@123"  # MySQL root password (used to initialize the database)
          resources:
            requests:
              cpu: 300m
              memory: 300Mi
            limits:
              cpu: 1000m
              memory: 800Mi
---
apiVersion: v1
kind: Service
metadata:
  name: zabbix-server
  namespace: zabbix
spec:
  selector:
    app: zabbix-server
  ports:
    - port: 10051
      targetPort: 10051
  type: ClusterIP  # internal access only
(2) Apply the deployment
kubectl apply -f zabbix-server-deploy.yaml
(3) Verify the Zabbix Server status
kubectl get pods -n zabbix -l app=zabbix-server
(1) Zabbix Web provides the web interface and needs connections to both Zabbix Server and MySQL. Create zabbix-web-deploy.yaml:
[root@k8s-master01 ~]# cat zabbix-web-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zabbix-web
  namespace: zabbix
spec:
  replicas: 1
  selector:
    matchLabels:
      app: zabbix-web
  template:
    metadata:
      labels:
        app: zabbix-web
    spec:
      containers:
        - name: zabbix-web
          image: zabbix/zabbix-web-nginx-mysql:latest  # official Web image (Nginx + PHP)
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080  # Web port
          env:
            - name: ZBX_SERVER_HOST
              value: "zabbix-server"  # name of the Zabbix Server Service
            - name: DB_SERVER_HOST
              value: "mysql"
            - name: MYSQL_DATABASE
              value: "zabbix"
            - name: MYSQL_USER
              value: "zabbix"
            - name: MYSQL_PASSWORD
              value: "zabbix"
            - name: PHP_TZ
              value: "Asia/Shanghai"  # set the time zone to Shanghai
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: 1000m
              memory: 512Mi
---
apiVersion: v1
kind: Service
metadata:
  name: zabbix-web
  namespace: zabbix
spec:
  selector:
    app: zabbix-web
  ports:
    - port: 80
      targetPort: 8080
  type: NodePort  # expose on a node port for external access
(2) Apply the deployment
kubectl apply -f zabbix-web-deploy.yaml
(3) Verify the Web status
kubectl get pods -n zabbix -l app=zabbix-web
[root@k8s-master01 ~]# kubectl get pods -n zabbix -l app=zabbix-web
NAME                          READY   STATUS    RESTARTS   AGE
zabbix-web-7bc9f48d8f-7bvvz   1/1     Running   0          27m
[root@k8s-master01 ~]#
[root@k8s-master01 ~]# cat zabbix-agent-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: zabbix-agent
  namespace: zabbix
spec:
  selector:
    matchLabels:
      app: zabbix-agent
  template:
    metadata:
      labels:
        app: zabbix-agent
    spec:
      hostNetwork: true
      tolerations:
        # Tolerate the master node taint so the agent also runs on the master (key step)
        - key: "node-role.kubernetes.io/master"
          operator: "Exists"
          effect: "NoSchedule"
        # Tolerate memory-pressure and disk-pressure taints
        - key: "node.kubernetes.io/memory-pressure"
          operator: "Exists"
          effect: "NoSchedule"
        - key: "node.kubernetes.io/disk-pressure"
          operator: "Exists"
          effect: "NoSchedule"
      containers:
        - name: zabbix-agent
          image: zabbix/zabbix-agent:latest
          ports:
            - containerPort: 10050
          env:
            - name: ZBX_SERVER_HOST
              value: "zabbix-server"
            - name: ZBX_HOSTNAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          resources:
            requests:
              cpu: 50m
              memory: 64Mi
            limits:
              cpu: 100m
              memory: 128Mi
[root@k8s-master01 ~]# kubectl apply -f zabbix-agent-daemonset.yaml
[root@k8s-master01 ~]# kubectl get pods -n zabbix -l app=zabbix-agent -o wide
NAME                 READY   STATUS    RESTARTS   AGE   IP              NODE           NOMINATED NODE   READINESS GATES
zabbix-agent-6lgmp   1/1     Running   0          41m   192.168.3.174   k8s-master01   <none>           <none>
zabbix-agent-hm6fz   1/1     Running   0          41m   192.168.3.176   k8s-node02     <none>           <none>
zabbix-agent-t7jd5   1/1     Running   0          33m   192.168.3.175   k8s-node01     <none>           <none>
[root@k8s-master01 ~]#
(1) Get a node IP (any worker node or the master node will do)
[root@k8s-master01 ~]# kubectl get nodes -o wide
NAME           STATUS   ROLES                  AGE   VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                                  KERNEL-VERSION                    CONTAINER-RUNTIME
k8s-master01   Ready    control-plane,master   30d   v1.23.17   192.168.3.174   <none>        Kylin Linux Advanced Server V10 (Lance)   4.19.90-52.22.v2207.ky10.x86_64   docker://20.10.24
k8s-node01     Ready    worker                 30d   v1.23.17   192.168.3.175   <none>        Kylin Linux Advanced Server V10 (Lance)   4.19.90-52.22.v2207.ky10.x86_64   docker://20.10.24
k8s-node02     Ready    worker                 30d   v1.23.17   192.168.3.176   <none>        Kylin Linux Advanced Server V10 (Lance)   4.19.90-52.22.v2207.ky10.x86_64   docker://20.10.24
[root@k8s-master01 ~]#
(2) Access URL: http://<node IP>:<NodePort>. The NodePort is the number after the colon in the PORT(S) column (30211 here):
[root@k8s-master01 ~]# kubectl get svc zabbix-web -n zabbix
NAME         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
zabbix-web   NodePort   10.99.186.185   <none>        80:30211/TCP   177m
[root@k8s-master01 ~]#
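Because the NodePort is assigned by the cluster, it usually differs between installations. The following sketch builds the access URL from a node IP and the PORT(S) value printed by kubectl get svc. nodeport_url is a hypothetical helper, and it assumes a single port mapping in the form port:nodePort/protocol.

```shell
# nodeport_url NODE_IP PORTS
# PORTS is the PORT(S) column of "kubectl get svc", e.g. 80:30211/TCP.
nodeport_url() {
  node_ip=$1
  node_port=${2#*:}           # drop the service port and colon: 30211/TCP
  node_port=${node_port%%/*}  # drop the protocol suffix: 30211
  echo "http://${node_ip}:${node_port}"
}
```

For the output shown in this step, nodeport_url 192.168.3.174 80:30211/TCP prints http://192.168.3.174:30211. The port alone can also be read with kubectl get svc zabbix-web -n zabbix -o jsonpath='{.spec.ports[0].nodePort}'.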
(3) Log in
Username: Admin
Password: zabbix (change it after the first login)

Error messages