K8s Monitoring: A First Look

[Table of Contents]
  1. Overview
  2. Deploying Prometheus Operator with Helm
  3. Deploying kube-prometheus
  4. Inspecting namespace=monitor
  5. Exposing Grafana

Overview

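CoreOS maintains an open-source project, Prometheus Operator, created to make the Prometheus monitoring system easier to manage when it runs on a K8s cluster.
Official tagline: The Prometheus Operator creates, configures, and manages Prometheus monitoring instances.
For project details, see the official documentation page.
The steps below deploy it with Helm; see also the official GitHub page.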

This walkthrough installs the following:

  • coreos/prometheus-operator
  • coreos/kube-prometheus

Deploying Prometheus Operator with Helm

  1. Deploying Prometheus Operator with Helm requires adding the chart repo first

    $ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
  2. Create a K8s namespace

    $ kubectl create namespace monitor
  3. Deploy Prometheus Operator

  • Helm chart: coreos/prometheus-operator
$ helm install coreos/prometheus-operator --name prometheus-operator --namespace=monitor
# --set rbacEnable=true
NAME: prometheus-operator
LAST DEPLOYED: Wed Jan 23 16:19:01 2019
NAMESPACE: monitor
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/PodSecurityPolicy
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
prometheus-operator false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim

==> v1/ConfigMap
NAME DATA AGE
prometheus-operator 1 40s

==> v1/ServiceAccount
NAME SECRETS AGE
prometheus-operator 1 40s

==> v1beta1/ClusterRole
NAME AGE
prometheus-operator 40s
psp-prometheus-operator 40s

==> v1beta1/ClusterRoleBinding
NAME AGE
prometheus-operator 40s
psp-prometheus-operator 40s

==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
prometheus-operator 1 1 1 1 40s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
prometheus-operator-858c485-26tt6 1/1 Running 0 40s


NOTES:
The Prometheus Operator has been installed. Check its status by running:
kubectl --namespace monitor get pods -l "app=prometheus-operator,release=prometheus-operator"

Visit https://github.com/coreos/prometheus-operator for instructions on how
to create & configure Alertmanager and Prometheus instances using the Operator.
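The namespace step and the status check suggested in the chart NOTES can be wrapped so that they are safe to re-run. The following is a minimal sketch, assuming `kubectl` is already configured for the target cluster and that the release is named `prometheus-operator` as in the transcript above. The function names `ensure_namespace` and `wait_for_operator` are hypothetical helpers, and the 120s timeout is an arbitrary choice.

```shell
# Hypothetical helpers; only the kubectl subcommands themselves are real.

# Create the namespace only when it is missing, so the step can be re-run.
ensure_namespace() {
  ns="$1"
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}

# Block until the operator Deployment reports a finished rollout, then
# list its pods with the label selector printed in the chart NOTES.
wait_for_operator() {
  ns="$1"
  kubectl --namespace "$ns" rollout status deployment/prometheus-operator --timeout=120s &&
    kubectl --namespace "$ns" get pods -l "app=prometheus-operator,release=prometheus-operator"
}
```

A possible usage is `ensure_namespace monitor` before the `helm install` step and `wait_for_operator monitor` right after it.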

Deploying kube-prometheus

What does this chart include?
According to the official GitHub page, it bundles the components commonly needed alongside Prometheus:

  • The Prometheus Operator
  • Highly available Prometheus
  • Highly available Alertmanager
  • Prometheus node-exporter
  • kube-state-metrics
  • Grafana
# Install kube-prometheus
$ helm install coreos/kube-prometheus --name kube-prometheus --namespace monitor
#
NAME: kube-prometheus
LAST DEPLOYED: Wed Jan 23 16:49:17 2019
NAMESPACE: monitor
STATUS: DEPLOYED

RESOURCES:
==> v1/PrometheusRule
NAME AGE
kube-prometheus-alertmanager 2s
kube-prometheus-exporter-kube-controller-manager 2s
kube-prometheus-exporter-kube-etcd 2s
kube-prometheus-exporter-kube-scheduler 2s
kube-prometheus-exporter-kube-state 2s
kube-prometheus-exporter-kubelets 1s
kube-prometheus-exporter-kubernetes 1s
kube-prometheus-exporter-node 1s
kube-prometheus-rules 1s
kube-prometheus 1s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
kube-prometheus-exporter-node-58kmg 0/1 ContainerCreating 0 2s
kube-prometheus-exporter-node-774zm 0/1 ContainerCreating 0 2s
kube-prometheus-exporter-node-cx69j 0/1 ContainerCreating 0 2s
kube-prometheus-exporter-kube-state-658f46b8dd-s94v6 0/2 ContainerCreating 0 2s
kube-prometheus-grafana-f869c754-n2skj 0/2 ContainerCreating 0 2s

==> v1/Secret
NAME TYPE DATA AGE
alertmanager-kube-prometheus Opaque 1 2s
kube-prometheus-grafana Opaque 2 2s

==> v1beta1/RoleBinding
NAME AGE
kube-prometheus-exporter-kube-state 2s

==> v1beta1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-prometheus-exporter-node 3 3 0 3 0 <none> 2s

==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kube-prometheus-exporter-kube-state 1 1 1 0 2s
kube-prometheus-grafana 1 1 1 0 2s

==> v1beta1/PodSecurityPolicy
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
kube-prometheus-alertmanager false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim
kube-prometheus-exporter-kube-state false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim
kube-prometheus-exporter-node false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim,hostPath
kube-prometheus-grafana false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim,hostPath
kube-prometheus false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim

==> v1beta1/ClusterRole
NAME AGE
psp-kube-prometheus-alertmanager 2s
kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-node 2s
psp-kube-prometheus-grafana 2s
kube-prometheus 2s
psp-kube-prometheus 2s

==> v1beta1/Role
NAME AGE
kube-prometheus-exporter-kube-state 2s

==> v1/Alertmanager
NAME AGE
kube-prometheus 2s

==> v1beta1/ClusterRoleBinding
NAME AGE
psp-kube-prometheus-alertmanager 2s
kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-node 2s
psp-kube-prometheus-grafana 2s
kube-prometheus 2s
psp-kube-prometheus 2s

==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-prometheus-alertmanager ClusterIP 10.27.245.216 <none> 9093/TCP 2s
kube-prometheus-exporter-kube-controller-manager ClusterIP None <none> 10252/TCP 2s
kube-prometheus-exporter-kube-dns ClusterIP None <none> 10054/TCP,10055/TCP 2s
kube-prometheus-exporter-kube-etcd ClusterIP None <none> 4001/TCP 2s
kube-prometheus-exporter-kube-scheduler ClusterIP None <none> 10251/TCP 2s
kube-prometheus-exporter-kube-state ClusterIP 10.27.244.151 <none> 80/TCP 2s
kube-prometheus-exporter-node ClusterIP 10.27.248.143 <none> 9100/TCP 2s
kube-prometheus-grafana ClusterIP 10.27.247.179 <none> 80/TCP 2s
kube-prometheus ClusterIP 10.27.253.137 <none> 9090/TCP 2s

==> v1/ConfigMap
NAME DATA AGE
kube-prometheus-grafana 10 2s

==> v1/ServiceAccount
NAME SECRETS AGE
kube-prometheus-exporter-kube-state 1 2s
kube-prometheus-exporter-node 1 2s
kube-prometheus-grafana 1 2s
kube-prometheus 1 2s

==> v1/Prometheus
NAME AGE
kube-prometheus 2s

==> v1/ServiceMonitor
NAME AGE
kube-prometheus-alertmanager 1s
kube-prometheus-exporter-kube-controller-manager 1s
kube-prometheus-exporter-kube-dns 1s
kube-prometheus-exporter-kube-etcd 1s
kube-prometheus-exporter-kube-scheduler 1s
kube-prometheus-exporter-kube-state 1s
kube-prometheus-exporter-kubelets 1s
kube-prometheus-exporter-kubernetes 1s
kube-prometheus-exporter-node 1s
kube-prometheus-grafana 1s
kube-prometheus 1s


NOTES:
DEPRECATION NOTICE:

- alertmanager.ingress.fqdn is not used anymore, use alertmanager.ingress.hosts []
- prometheus.ingress.fqdn is not used anymore, use prometheus.ingress.hosts []
- grafana.ingress.fqdn is not used anymore, use prometheus.grafana.hosts []

- additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- prometheus.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- alertmanager.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kube-controller-manager.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kube-etcd.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kube-scheduler.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kubelets.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kubernetes.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
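In the output above, most pods are still in ContainerCreating two seconds after the install. One way to pause until they are up is `kubectl wait`. This is a minimal sketch, assuming a kubectl version that ships the `wait` subcommand; `wait_for_pods` is a hypothetical helper name and the 300s default is arbitrary. Note that `--all` only covers pods that exist when the command starts, so pods the operator creates later (for example from the Prometheus and Alertmanager StatefulSets) may need a second run.

```shell
# Hypothetical helper: wait until every pod currently in the namespace is Ready.
# $1 = namespace, $2 = optional timeout (defaults to 300s, an arbitrary value).
wait_for_pods() {
  ns="$1"
  limit="${2:-300s}"
  kubectl --namespace "$ns" wait --for=condition=Ready pods --all --timeout="$limit"
}
```

For this walkthrough the call would be `wait_for_pods monitor`.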

Inspecting namespace=monitor

$ kubectl -n monitor get all
NAME READY STATUS RESTARTS AGE
pod/alertmanager-kube-prometheus-0 2/2 Running 0 1m
pod/kube-prometheus-exporter-kube-state-66b8849c9b-5vxmr 2/2 Running 0 58s
pod/kube-prometheus-exporter-node-58kmg 1/1 Running 0 1m
pod/kube-prometheus-exporter-node-774zm 1/1 Running 0 1m
pod/kube-prometheus-exporter-node-cx69j 1/1 Running 0 1m
pod/kube-prometheus-grafana-f869c754-n2skj 2/2 Running 0 1m
pod/prometheus-kube-prometheus-0 3/3 Running 1 1m
pod/prometheus-operator-858c485-26tt6 1/1 Running 0 31m

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,6783/TCP 1m
service/kube-prometheus ClusterIP 10.27.253.137 <none> 9090/TCP 1m
service/kube-prometheus-alertmanager ClusterIP 10.27.245.216 <none> 9093/TCP 1m
service/kube-prometheus-exporter-kube-state ClusterIP 10.27.244.151 <none> 80/TCP 1m
service/kube-prometheus-exporter-node ClusterIP 10.27.248.143 <none> 9100/TCP 1m
service/kube-prometheus-grafana ClusterIP 10.27.247.179 <none> 80/TCP 1m
service/prometheus-operated ClusterIP None <none> 9090/TCP 1m

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-prometheus-exporter-node 3 3 3 3 3 <none> 1m

NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-prometheus-exporter-kube-state 1 1 1 1 1m
deployment.apps/kube-prometheus-grafana 1 1 1 1 1m
deployment.apps/prometheus-operator 1 1 1 1 31m

NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-prometheus-exporter-kube-state-658f46b8dd 0 0 0 1m
replicaset.apps/kube-prometheus-exporter-kube-state-66b8849c9b 1 1 1 58s
replicaset.apps/kube-prometheus-grafana-f869c754 1 1 1 1m
replicaset.apps/prometheus-operator-858c485 1 1 1 31m

NAME DESIRED CURRENT AGE
statefulset.apps/alertmanager-kube-prometheus 1 1 1m
statefulset.apps/prometheus-kube-prometheus 1 1 1m
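Reading a listing like the one above by eye works for a handful of pods; for a quick scripted check, the pod lines can be filtered with awk. This is a minimal sketch, assuming input in the `kubectl get pods --no-headers` column layout (NAME READY STATUS RESTARTS AGE); `list_unready_pods` is a hypothetical helper name. It treats any status other than Running as a problem, so finished Job pods would also be reported.

```shell
# Hypothetical helper: print pods that are not fully ready.
# Reads `kubectl get pods --no-headers` style lines from stdin.
list_unready_pods() {
  awk '{
    split($2, ready, "/")   # READY column, e.g. "2/2" -> ready[1], ready[2]
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}
```

A possible usage is `kubectl -n monitor get pods --no-headers | list_unready_pods`; empty output means every listed pod is Running with all containers ready.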

Exposing Grafana

Edit the Service in place and change its type to LoadBalancer to expose Grafana:

$ kubectl edit -n monitor svc/kube-prometheus-grafana
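The interactive edit can also be done non-interactively, which is easier to script and to repeat. This is a minimal sketch, assuming the cluster runs on a provider that can provision load balancers; `expose_as_loadbalancer` and `external_ip` are hypothetical helper names. Some providers publish a hostname instead of an IP, in which case the jsonpath field would be `hostname` rather than `ip`.

```shell
# Hypothetical helper: switch a Service to type LoadBalancer without an editor.
expose_as_loadbalancer() {
  ns="$1"
  svc="$2"
  kubectl --namespace "$ns" patch service "$svc" \
    --patch '{"spec": {"type": "LoadBalancer"}}'
}

# Hypothetical helper: print the external IP once the provider assigns one.
# Prints nothing while the address is still pending.
external_ip() {
  ns="$1"
  svc="$2"
  kubectl --namespace "$ns" get service "$svc" \
    --output jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

For this walkthrough the calls would be `expose_as_loadbalancer monitor kube-prometheus-grafana` followed by `external_ip monitor kube-prometheus-grafana`.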

Once it is exposed, you can start digging into the details of K8s, Prometheus, and Grafana monitoring.