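1. Create a YAML file to install the kube-state-metrics monitoring component, and change the node name in it to match your cluster (run kubectl get nodes to list node names)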
vim metrics.yaml
apiVersion: v1
automountServiceAccountToken: false
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.4.2
  name: kube-state-metrics
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.4.2
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - volumeattachments
  verbs:
  - list
  - watch
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  - validatingwebhookconfigurations
  verbs:
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - networkpolicies
  - ingresses
  verbs:
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.4.2
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: monitoring
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    # Lets Prometheus scrape this Service; without this annotation the
    # prometheus-service-endpoints job will not scrape these metrics
    prometheus.io/scraped: "true"
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.4.2
  name: kube-state-metrics
  namespace: monitoring
spec:
  # clusterIP: None # commented out so the metrics can be reached through the Service
  ports:
  - name: http-metrics
    port: 8088
    targetPort: http-metrics
  - name: telemetry
    port: 8089
    targetPort: telemetry
  selector:
    app.kubernetes.io/name: kube-state-metrics
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.4.2
  name: kube-state-metrics
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  template:
    metadata:
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/version: 2.4.2
    spec:
      nodeName: k8s-master-1 # run on k8s-master-1; change to your own node name
      tolerations: # tolerate the master node taint
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      automountServiceAccountToken: true
      containers:
      # - image: k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.4.2
      - image: anjia0532/google-containers.kube-state-metrics.kube-state-metrics:v2.4.2
        # livenessProbe:
        #   httpGet:
        #     path: /healthz
        #     port: 8088
        #   initialDelaySeconds: 5
        #   timeoutSeconds: 5
        name: kube-state-metrics
        ports:
        - containerPort: 8088
          name: http-metrics
        - containerPort: 8089
          name: telemetry
        # readinessProbe:
        #   httpGet:
        #     path: /
        #     port: 8089
        #   initialDelaySeconds: 5
        #   timeoutSeconds: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsUser: 65534
          runAsGroup: 65534
          # fsGroup: 65534
      serviceAccountName: kube-state-metrics
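The nodeName value in the manifest has to match a node in your own cluster. A minimal sketch of patching it from the shell, assuming the manifest was saved as metrics.yaml and contains a single nodeName: line; set_node_name is a hypothetical helper name, not a kubectl feature, and it drops any trailing comment on that line:

```shell
# set_node_name FILE NODE
# Hypothetical helper: rewrites the value of the "nodeName:" key in FILE to NODE.
# Assumes NODE is a plain Kubernetes node name (no "/" or "&" characters).
set_node_name() {
  file=$1
  node=$2
  # Keep the indentation and the key (capture group 1), replace the rest of the line.
  # -i.bak edits in place and leaves a backup copy named FILE.bak.
  sed -i.bak -E "s/^([[:space:]]*nodeName:).*/\1 ${node}/" "$file"
}
```

For example, after picking a name from the output of kubectl get nodes, run set_node_name metrics.yaml followed by that name.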
2. Create the monitoring namespace
kubectl create namespace monitoring
3. Apply the YAML file
kubectl apply -f metrics.yaml
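Note: if the container reports "connection refused" at startup, comment out the corresponding health probes in the YAML file. A likely cause is that kube-state-metrics listens on ports 8080 and 8081 by default, while this manifest declares 8088 and 8089 without passing --port or --telemetry-port.
An example of a normal startup is shown below.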
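One way to confirm the startup from the shell is to wait for the Deployment to finish rolling out and then list its pod. This is only a sketch: check_ksm is a hypothetical wrapper name, and the 120 second timeout is an arbitrary choice.

```shell
# check_ksm
# Hypothetical helper: waits for the kube-state-metrics Deployment created by
# metrics.yaml to become ready, then lists the pods selected by its label.
check_ksm() {
  # Blocks until the rollout completes or the timeout expires (non-zero exit).
  kubectl -n monitoring rollout status deployment/kube-state-metrics --timeout=120s &&
    # The pod should show STATUS Running once the rollout has succeeded.
    kubectl -n monitoring get pods -l app.kubernetes.io/name=kube-state-metrics
}
```

If the rollout times out, kubectl describe and kubectl logs on the pod usually show why.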
4. Download and extract the helm-v3.9.4-linux-amd64.tar.gz package, then create a symlink for the helm command
tar xvf helm-v3.9.4-linux-amd64.tar.gz
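The step mentions a symlink but gives no command for it. The archive unpacks into a linux-amd64 directory that contains the helm binary; a minimal sketch of linking it onto the PATH follows. The helper name link_helm is hypothetical, and /usr/local/bin as the target is an assumption (writing there normally needs root).

```shell
# link_helm SRC_DIR BIN_DIR
# Hypothetical helper: symlinks SRC_DIR/helm into BIN_DIR as "helm".
link_helm() {
  src_dir=$1
  bin_dir=$2
  # Resolve to an absolute path so the link stays valid from any working directory.
  abs_src=$(cd "$src_dir" && pwd) || return 1
  # -f replaces an existing link, so the step can be repeated safely.
  ln -sf "$abs_src/helm" "$bin_dir/helm"
}
```

Run from the directory where the archive was extracted, this would be link_helm linux-amd64 /usr/local/bin, after which helm version should print the client version.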
5. Download the configuration files and templates from the official site and install the k8s agent
https://git.zabbix.com/projects/ZT/repos/kubernetes-helm/browse?at=refs%2Fheads%2Frelease%2F6.0
(The following are the configuration steps from the official site)
Add the repository:
helm repo add zabbix-chart-6.0 https://cdn.zabbix.com/zabbix/integrations/kubernetes-helm/6.0
Export the chart's default values to the file $HOME/zabbix_values.yaml:
helm show values zabbix-chart-6.0/zabbix-helm-chrt > $HOME/zabbix_values.yaml
Change the values in $HOME/zabbix_values.yaml to match your environment.
List the namespaces of the cluster.
kubectl get namespaces
Create the monitoring namespace if it does not already exist in the cluster.
kubectl create namespace monitoring
Deploy Zabbix in the Kubernetes cluster (update the YAML file path if necessary).
helm install zabbix zabbix-chart-6.0/zabbix-helm-chrt --dependency-update -f $HOME/zabbix_values.yaml -n monitoring
View the pods.
kubectl get pods -n monitoring
View information about a pod.
kubectl describe pods/POD_NAME -n monitoring
View all containers of a pod.
kubectl get pods POD_NAME -n monitoring -o jsonpath='{.spec.containers[*].name}'
View the logs of a container in a pod.
kubectl logs -f pods/POD_NAME -c CONTAINER_NAME -n monitoring
Open a shell prompt in a container.
kubectl exec -it pods/POD_NAME -c CONTAINER_NAME -n monitoring -- sh
To uninstall/delete the zabbix deployment:
helm delete zabbix -n monitoring
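6. Edit every macro in the template that relates to IP or TOKEN (the IP is the k8s server's IP address; how to obtain the TOKEN is described below)
How to obtain the TOKEN:
Run the following command on the k8s server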
kubectl describe secrets $(kubectl get secrets -n monitoring |grep zabbix-service-account | grep -v zabbix-service-account-token-hctrs | cut -f1 -d ' ') -n monitoring |grep -E '^token' |cut -f2 -d':'|tr -d '\t'|tr -d ' '
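The pipeline above scrapes the token out of human-readable describe output. An alternative sketch reads the token field of the secret directly and decodes it. The helper name get_sa_token is hypothetical, the secret name has to be looked up first with kubectl get secrets -n monitoring, and base64 -d assumes the GNU coreutils base64 found on most Linux servers.

```shell
# get_sa_token NAMESPACE SECRET_NAME
# Hypothetical helper: prints the decoded service account token stored in the
# given secret. Secret data is base64-encoded, so it is decoded before printing.
get_sa_token() {
  kubectl get secret "$2" -n "$1" -o jsonpath='{.data.token}' | base64 -d
}
```

The printed value is what goes into the TOKEN macro; the arguments are the monitoring namespace and the name of the zabbix-service-account token secret in your cluster.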
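Note: the API request URL used in the macros assumes the default port. If the API returns no data, run kubectl config view |grep server|cut -f 2- -d ":" | tr -d " " to get the API address of the current environment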
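If the kubeconfig lists more than one cluster, the grep-based command prints one address per cluster. A sketch that limits the output to the cluster of the current context, using the --minify flag of kubectl config view; get_api_server is a hypothetical helper name:

```shell
# get_api_server
# Hypothetical helper: prints the API server URL of the current kubeconfig context.
get_api_server() {
  # --minify keeps only the entries used by the current context,
  # so clusters[0] is the cluster kubectl is talking to right now.
  kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
}
```

The result has the form https://HOST:PORT and can be pasted into the API URL macro.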
Error messages