Publishing Services with Helm

[Table of Contents]
  1. Install Helm
  2. Install Tiller (the Helm server-side role)
  3. Error notes
    1. Running helm init without the Helm RBAC set up
    2. initialize Helm on both client and server
    3. GCP Insufficient CPU
  4. Inspecting Tiller
  5. Helm install example: Consul service
    1. Installing the hosted stable/consul chart
    2. Exposing the Consul UI service on GKE
    3. Adding the loadBalancerSourceRanges parameter to the Helm chart
  6. Other commands you may need

Install Helm

  • macOS:
    brew install kubernetes-helm

  • Linux: via the install script
    $ sudo curl https://raw.githubusercontent.com/helm/helm/master/scripts/get | bash

Install Tiller (the Helm server-side role)

  • Create the RBAC resources for Helm
    $ vi helm-rbac-config.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
  • Apply the RBAC config

    $ kc apply -f helm-rbac-config.yaml
    serviceaccount/tiller created
    clusterrolebinding.rbac.authorization.k8s.io/tiller created
  • Initialize with helm init

    $ helm init --service-account tiller
    Creating /Users/afu/.helm
    Creating /Users/afu/.helm/repository
    Creating /Users/afu/.helm/repository/cache
    Creating /Users/afu/.helm/repository/local
    Creating /Users/afu/.helm/plugins
    Creating /Users/afu/.helm/starters
    Creating /Users/afu/.helm/cache/archive
    Creating /Users/afu/.helm/repository/repositories.yaml
    Adding stable repo with URL: https://kubernetes-charts.storage.googleapis.com
    Adding local repo with URL: http://127.0.0.1:8879/charts
    $HELM_HOME has been configured at /Users/afu/.helm.

    Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

    Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
    To prevent this, run `helm init` with the --tiller-tls-verify flag.
    For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
    Happy Helming!
  • To upgrade the Tiller pod:
    helm init --service-account tiller --upgrade

Error notes

Running helm init without the Helm RBAC set up

You get the output below; the cluster administrator still has to grant an appropriate permission policy.

$ helm init
$HELM_HOME has been configured at /home/afu/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!

initialize Helm on both client and server

$ helm list
Error: configmaps is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "configmaps" in API group "" in the namespace "kube-system"

This error occurs because Tiller does not have enough permissions inside the K8s cluster.
An extra ServiceAccount has to be configured; see the official documentation referenced below.
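
A minimal sketch of the fix, assuming Helm v2 with Tiller already deployed: create the ServiceAccount and ClusterRoleBinding, then point the existing tiller-deploy Deployment at the new account.

# Sketch only: create the RBAC objects and re-point the existing Tiller deployment
$ kubectl -n kube-system create serviceaccount tiller
$ kubectl create clusterrolebinding tiller --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
$ kubectl -n kube-system patch deploy tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'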

GCP Insufficient CPU

# Check pod status after deploying Consul
$ kc get pod
NAME READY STATUS RESTARTS AGE
consul-qpp4r 0/1 Pending 0 0s
consul-server-0 0/1 Pending 0 1m
consul-server-1 0/1 Pending 0 1m
consul-server-2 0/1 Pending 0 1m

# consul-server-0 events, as shown in the GKE console
# Stateful Set: consul-server:
0/1 nodes are available: 1 Insufficient cpu.

Cause: the current GKE nodes do not have enough CPU; the node resources need to be scaled up.
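
A hedged sketch of how to confirm and work around this; the pool name, zone, and machine type below are just examples to adjust to your environment.

# Confirm the scheduling failure reason from the CLI as well
$ kubectl describe pod consul-server-0 | grep -A3 Events

# One option: add a larger node pool (example values)
$ gcloud container node-pools create pool-2 \
    --cluster afu-first-cluster-1 --zone asia-east1-a \
    --machine-type n1-standard-2 --num-nodes 3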

Inspecting Tiller

$ helm list

$ kc get -n kube-system pod -l app=helm
NAME READY STATUS RESTARTS AGE
tiller-deploy-6f8d4f6c9c-tj8dd 1/1 Running 0 3m23s

$ kc logs -n kube-system tiller-deploy-6f8d4f6c9c-tj8dd
[main] 2019/01/19 06:13:00 Starting Tiller v2.12.1 (tls=false)
[main] 2019/01/19 06:13:00 GRPC listening on :44134
[main] 2019/01/19 06:13:00 Probes listening on :44135
[main] 2019/01/19 06:13:00 Storage driver is ConfigMap
[main] 2019/01/19 06:13:00 Max history per release is 0

Helm install example: Consul service

# Clone the chart repo
$ git clone https://github.com/hashicorp/consul-helm.git
$ cd consul-helm

# Check the Consul version tags
$ git log
$ git tag

# Run Helm
$ helm install --dry-run ./
$ helm install --name consul ./
NAME: consul
LAST DEPLOYED: Sat Jan 19 17:36:07 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
consul-dns ClusterIP 10.106.118.143 <none> 53/TCP,53/UDP 1s
consul-server ClusterIP None <none> 8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP 1s
consul-ui ClusterIP 10.96.148.23 <none> 80/TCP 0s

==> v1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
consul 3 3 0 3 0 <none> 0s

==> v1/StatefulSet
NAME DESIRED CURRENT AGE
consul-server 3 3 0s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
consul-dnlpp 0/1 ContainerCreating 0 0s
consul-fwvvs 0/1 ContainerCreating 0 0s
consul-hscvv 0/1 ContainerCreating 0 0s
consul-server-0 0/1 Pending 0 0s
consul-server-1 0/1 Pending 0 0s
consul-server-2 0/1 Pending 0 0s

==> v1beta1/PodDisruptionBudget
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
consul-server N/A 1 0 1s

==> v1/ConfigMap
NAME DATA AGE
consul-client-config 1 1s
consul-server-config 1 1s

Delete the release

$ helm delete --dry-run consul
release "consul" deleted
$ helm ls --all consul
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
consul 1 Sat Jan 19 17:36:07 2019 DELETED consul-0.5.0 default
$ helm del --purge consul
release "consul" deleted


Installing the hosted stable/consul chart

helm install stable/consul
NAME: youthful-stoat
LAST DEPLOYED: Sat Jan 19 18:07:33 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
youthful-stoat-consul-tests 1 0s

==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
youthful-stoat-consul ClusterIP None <none> 8500/TCP,8400/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP,8600/UDP 0s
youthful-stoat-consul-ui NodePort 10.103.14.242 <none> 8500:30481/TCP 0s

==> v1beta1/StatefulSet
NAME DESIRED CURRENT AGE
youthful-stoat-consul 3 1 0s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
youthful-stoat-consul-0 0/1 Pending 0 0s

==> v1beta1/PodDisruptionBudget
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
youthful-stoat-consul-pdb N/A 1 0 0s

==> v1/Secret
NAME TYPE DATA AGE
youthful-stoat-consul-gossip-key Opaque 1 0s


NOTES:
1. Watch all cluster members come up.
$ kubectl get pods --namespace=default -w
2. Test cluster health using Helm test.
$ helm test youthful-stoat-consul
3. (Optional) Manually confirm consul cluster is healthy.
$ CONSUL_POD=$(kubectl get pods -l='release=youthful-stoat-consul' --output=jsonpath={.items[0].metadata.name})
$ kubectl exec $CONSUL_POD consul members --namespace=default | grep server

Exposing the Consul UI service on GKE

There are two approaches:

  1. Change the consul-ui service type
# vi values.yaml
ui:
  service:
    enabled: true
    type: LoadBalancer

# Apply the change
$ helm upgrade consul ./
  2. Create a consul ingress (not working yet)
# ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: consul-ui
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: consul-ui
          servicePort: 80

Adding the loadBalancerSourceRanges parameter to the Helm chart

In this case, loadBalancerSourceRanges was not originally defined in the chart.

  • First, add it to values.yaml
ui:
  service:
    enabled: true
    type: LoadBalancer
    loadBalancerSourceRanges:
    - 61.216.133.43/32
  • Add the parameter to templates/ui-service.yaml (see the rendering note after this list)
spec:
  ports:
    - name: http
      port: 80
      targetPort: 8500
  {{- if .Values.ui.service.type }}
  type: {{ .Values.ui.service.type }}
  {{- end }}
  {{- if .Values.ui.service.loadBalancerSourceRanges }}
  loadBalancerSourceRanges: {{ .Values.ui.service.loadBalancerSourceRanges }}
  {{- end }}
  • Upgrade the Helm release

helm upgrade consul ./
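
One caveat with the template above: `{{ .Values.ui.service.loadBalancerSourceRanges }}` prints the list with Go's default formatting, which happens to work for a single CIDR but can mis-parse when there are several entries. A common alternative (a sketch, not the chart's official template) is to render the list with toYaml:

  {{- if .Values.ui.service.loadBalancerSourceRanges }}
  loadBalancerSourceRanges:
{{ toYaml .Values.ui.service.loadBalancerSourceRanges | indent 4 }}
  {{- end }}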


Other commands you may need

# list releases
helm list

# install a chart archive
helm install

# inspect a chart
helm inspect

# download a named release
helm get

# given a release name, delete the release from Kubernetes
helm delete

# fetch release history
helm history

# displays the status of the named release
helm status [flags] RELEASE-NAME

# test a release
helm test [RELEASE] [flags]

# print the client/server version information
helm version

First Look at K8s Monitoring

[Table of Contents]
  1. Overview
  2. Deploy Prometheus Operator with Helm
  3. Deploy kube-prometheus
  4. Inspect the monitor namespace
  5. Expose Grafana

Overview

CoreOS has an open-source project, the Prometheus Operator, created to make it easier to run and manage the Prometheus monitoring system on a K8s cluster.
Official tagline: The Prometheus Operator creates, configures, and manages Prometheus monitoring instances.
See the official project page for details.
The deployment below is done with Helm; see the official GitHub page for reference.

The steps below install:

  • coreos/prometheus-operator
  • coreos/kube-prometheus

Deploy Prometheus Operator with Helm

  1. Deploying the Prometheus Operator with Helm requires adding the repo first

    $ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
  2. Create a K8s namespace

    kc create namespace monitor
  3. Deploy the Prometheus Operator

  • Helm release:coreos/prometheus-operator
helm install coreos/prometheus-operator --name prometheus-operator --namespace=monitor
# --set rbacEnable=true
NAME: prometheus-operator
LAST DEPLOYED: Wed Jan 23 16:19:01 2019
NAMESPACE: monitor
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/PodSecurityPolicy
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
prometheus-operator false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim

==> v1/ConfigMap
NAME DATA AGE
prometheus-operator 1 40s

==> v1/ServiceAccount
NAME SECRETS AGE
prometheus-operator 1 40s

==> v1beta1/ClusterRole
NAME AGE
prometheus-operator 40s
psp-prometheus-operator 40s

==> v1beta1/ClusterRoleBinding
NAME AGE
prometheus-operator 40s
psp-prometheus-operator 40s

==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
prometheus-operator 1 1 1 1 40s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
prometheus-operator-858c485-26tt6 1/1 Running 0 40s


NOTES:
The Prometheus Operator has been installed. Check its status by running:
kubectl --namespace monitor get pods -l "app=prometheus-operator,release=prometheus-operator"

Visit https://github.com/coreos/prometheus-operator for instructions on how
to create & configure Alertmanager and Prometheus instances using the Operator.

Deploy kube-prometheus

What does this include?
According to the official GitHub page, it covers the usual Prometheus essentials:

  • The Prometheus Operator
  • Highly available Prometheus
  • Highly available Alertmanager
  • Prometheus node-exporter
  • kube-state-metrics
  • Grafana
# Install kube-prometheus
$ helm install coreos/kube-prometheus --name kube-prometheus --namespace monitor
#
NAME: kube-prometheus
LAST DEPLOYED: Wed Jan 23 16:49:17 2019
NAMESPACE: monitor
STATUS: DEPLOYED

RESOURCES:
==> v1/PrometheusRule
NAME AGE
kube-prometheus-alertmanager 2s
kube-prometheus-exporter-kube-controller-manager 2s
kube-prometheus-exporter-kube-etcd 2s
kube-prometheus-exporter-kube-scheduler 2s
kube-prometheus-exporter-kube-state 2s
kube-prometheus-exporter-kubelets 1s
kube-prometheus-exporter-kubernetes 1s
kube-prometheus-exporter-node 1s
kube-prometheus-rules 1s
kube-prometheus 1s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
kube-prometheus-exporter-node-58kmg 0/1 ContainerCreating 0 2s
kube-prometheus-exporter-node-774zm 0/1 ContainerCreating 0 2s
kube-prometheus-exporter-node-cx69j 0/1 ContainerCreating 0 2s
kube-prometheus-exporter-kube-state-658f46b8dd-s94v6 0/2 ContainerCreating 0 2s
kube-prometheus-grafana-f869c754-n2skj 0/2 ContainerCreating 0 2s

==> v1/Secret
NAME TYPE DATA AGE
alertmanager-kube-prometheus Opaque 1 2s
kube-prometheus-grafana Opaque 2 2s

==> v1beta1/RoleBinding
NAME AGE
kube-prometheus-exporter-kube-state 2s

==> v1beta1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-prometheus-exporter-node 3 3 0 3 0 <none> 2s

==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kube-prometheus-exporter-kube-state 1 1 1 0 2s
kube-prometheus-grafana 1 1 1 0 2s

==> v1beta1/PodSecurityPolicy
NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP READONLYROOTFS VOLUMES
kube-prometheus-alertmanager false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim
kube-prometheus-exporter-kube-state false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim
kube-prometheus-exporter-node false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim,hostPath
kube-prometheus-grafana false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim,hostPath
kube-prometheus false RunAsAny RunAsAny MustRunAs MustRunAs false configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim

==> v1beta1/ClusterRole
NAME AGE
psp-kube-prometheus-alertmanager 2s
kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-node 2s
psp-kube-prometheus-grafana 2s
kube-prometheus 2s
psp-kube-prometheus 2s

==> v1beta1/Role
NAME AGE
kube-prometheus-exporter-kube-state 2s

==> v1/Alertmanager
NAME AGE
kube-prometheus 2s

==> v1beta1/ClusterRoleBinding
NAME AGE
psp-kube-prometheus-alertmanager 2s
kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-kube-state 2s
psp-kube-prometheus-exporter-node 2s
psp-kube-prometheus-grafana 2s
kube-prometheus 2s
psp-kube-prometheus 2s

==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-prometheus-alertmanager ClusterIP 10.27.245.216 <none> 9093/TCP 2s
kube-prometheus-exporter-kube-controller-manager ClusterIP None <none> 10252/TCP 2s
kube-prometheus-exporter-kube-dns ClusterIP None <none> 10054/TCP,10055/TCP 2s
kube-prometheus-exporter-kube-etcd ClusterIP None <none> 4001/TCP 2s
kube-prometheus-exporter-kube-scheduler ClusterIP None <none> 10251/TCP 2s
kube-prometheus-exporter-kube-state ClusterIP 10.27.244.151 <none> 80/TCP 2s
kube-prometheus-exporter-node ClusterIP 10.27.248.143 <none> 9100/TCP 2s
kube-prometheus-grafana ClusterIP 10.27.247.179 <none> 80/TCP 2s
kube-prometheus ClusterIP 10.27.253.137 <none> 9090/TCP 2s

==> v1/ConfigMap
NAME DATA AGE
kube-prometheus-grafana 10 2s

==> v1/ServiceAccount
NAME SECRETS AGE
kube-prometheus-exporter-kube-state 1 2s
kube-prometheus-exporter-node 1 2s
kube-prometheus-grafana 1 2s
kube-prometheus 1 2s

==> v1/Prometheus
NAME AGE
kube-prometheus 2s

==> v1/ServiceMonitor
NAME AGE
kube-prometheus-alertmanager 1s
kube-prometheus-exporter-kube-controller-manager 1s
kube-prometheus-exporter-kube-dns 1s
kube-prometheus-exporter-kube-etcd 1s
kube-prometheus-exporter-kube-scheduler 1s
kube-prometheus-exporter-kube-state 1s
kube-prometheus-exporter-kubelets 1s
kube-prometheus-exporter-kubernetes 1s
kube-prometheus-exporter-node 1s
kube-prometheus-grafana 1s
kube-prometheus 1s


NOTES:
DEPRECATION NOTICE:

- alertmanager.ingress.fqdn is not used anymore, use alertmanager.ingress.hosts []
- prometheus.ingress.fqdn is not used anymore, use prometheus.ingress.hosts []
- grafana.ingress.fqdn is not used anymore, use prometheus.grafana.hosts []

- additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- prometheus.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- alertmanager.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kube-controller-manager.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kube-etcd.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kube-scheduler.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kubelets.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels
- exporter-kubernetes.additionalRulesConfigMapLabels is not used anymore, use additionalRulesLabels

Inspect the monitor namespace

kc -n monitor get all
NAME READY STATUS RESTARTS AGE
pod/alertmanager-kube-prometheus-0 2/2 Running 0 1m
pod/kube-prometheus-exporter-kube-state-66b8849c9b-5vxmr 2/2 Running 0 58s
pod/kube-prometheus-exporter-node-58kmg 1/1 Running 0 1m
pod/kube-prometheus-exporter-node-774zm 1/1 Running 0 1m
pod/kube-prometheus-exporter-node-cx69j 1/1 Running 0 1m
pod/kube-prometheus-grafana-f869c754-n2skj 2/2 Running 0 1m
pod/prometheus-kube-prometheus-0 3/3 Running 1 1m
pod/prometheus-operator-858c485-26tt6 1/1 Running 0 31m

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,6783/TCP 1m
service/kube-prometheus ClusterIP 10.27.253.137 <none> 9090/TCP 1m
service/kube-prometheus-alertmanager ClusterIP 10.27.245.216 <none> 9093/TCP 1m
service/kube-prometheus-exporter-kube-state ClusterIP 10.27.244.151 <none> 80/TCP 1m
service/kube-prometheus-exporter-node ClusterIP 10.27.248.143 <none> 9100/TCP 1m
service/kube-prometheus-grafana ClusterIP 10.27.247.179 <none> 80/TCP 1m
service/prometheus-operated ClusterIP None <none> 9090/TCP 1m

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-prometheus-exporter-node 3 3 3 3 3 <none> 1m

NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-prometheus-exporter-kube-state 1 1 1 1 1m
deployment.apps/kube-prometheus-grafana 1 1 1 1 1m
deployment.apps/prometheus-operator 1 1 1 1 31m

NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-prometheus-exporter-kube-state-658f46b8dd 0 0 0 1m
replicaset.apps/kube-prometheus-exporter-kube-state-66b8849c9b 1 1 1 58s
replicaset.apps/kube-prometheus-grafana-f869c754 1 1 1 1m
replicaset.apps/prometheus-operator-858c485 1 1 1 31m

NAME DESIRED CURRENT AGE
statefulset.apps/alertmanager-kube-prometheus 1 1 1m
statefulset.apps/prometheus-kube-prometheus 1 1 1m

Expose Grafana

Expose Grafana by editing the service in place and changing its type to LoadBalancer.

kc edit -n monitor svc/kube-prometheus-grafana
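
If you prefer not to edit interactively, a one-line patch works too; a sketch, assuming the same namespace and release name as above:

$ kubectl -n monitor patch svc kube-prometheus-grafana -p '{"spec":{"type":"LoadBalancer"}}'
$ kubectl -n monitor get svc kube-prometheus-grafana -w   # wait for the EXTERNAL-IP to appear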

Once it is exposed, you can start digging into the K8s - Prometheus - Grafana monitoring details.

First Look at the gcloud SDK

[Table of Contents]
  1. First look
    1. Installation notes
    2. Version info
    3. Initialization
    4. gcloud update log
  2. gcloud usage examples
  3. Connecting to a GKE cluster with gcloud

First look

These are my notes from downloading and installing gcloud on macOS,
following the official documentation page.

Installation notes

[afu@MacBook-Pro ~/google-cloud-sdk $] ll
total 696
-rw-r--r--@ 1 afu staff 980B 9 25 02:42 LICENSE
-rw-r--r--@ 1 afu staff 673B 9 25 02:42 README
-rw-r--r--@ 1 afu staff 299K 9 25 02:42 RELEASE_NOTES
-rw-r--r--@ 1 afu staff 8B 9 25 02:42 VERSION
drwxr-xr-x@ 11 afu staff 352B 9 25 02:42 bin
-rw-r--r--@ 1 afu staff 2.6K 9 25 02:42 completion.bash.inc
-rw-r--r--@ 1 afu staff 2.0K 9 25 02:42 completion.zsh.inc
drwxr-xr-x@ 3 afu staff 96B 9 25 02:42 data
drwxr-xr-x@ 3 afu staff 96B 9 25 02:47 deb
-rwxr-xr-x@ 1 afu staff 2.0K 9 25 02:42 install.bat
-rwxr-xr-x@ 1 afu staff 4.4K 9 25 02:42 install.sh
drwxr-xr-x@ 7 afu staff 224B 9 25 02:42 lib
-rw-r--r--@ 1 afu staff 377B 9 25 02:42 path.bash.inc
-rw-r--r--@ 1 afu staff 1.2K 9 25 02:42 path.fish.inc
-rw-r--r--@ 1 afu staff 31B 9 25 02:42 path.zsh.inc
drwxr-xr-x@ 5 afu staff 160B 9 25 02:47 platform
-rw-r--r--@ 1 afu staff 39B 9 25 02:47 properties
drwxr-xr-x@ 3 afu staff 96B 9 25 02:47 rpm
[afu@MacBook-Pro ~/google-cloud-sdk $] less install.sh
[afu@MacBook-Pro ~/google-cloud-sdk $]

✘[afu@MacBook-Pro ~/google-cloud-sdk $] ./install.sh
Welcome to the Google Cloud SDK!

To help improve the quality of this product, we collect anonymized usage data
and anonymized stacktraces when crashes are encountered; additional information
is available at <https://cloud.google.com/sdk/usage-statistics>. You may choose
to opt out of this collection now (by choosing 'N' at the below prompt), or at
any time in the future by running the following command:

gcloud config set disable_usage_reporting true

Do you want to help improve the Google Cloud SDK (Y/n)? Y


Your current Cloud SDK version is: 218.0.0
The latest available version is: 230.0.0

┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Components │
├──────────────────┬──────────────────────────────────────────────────────┬──────────────────────────┬───────────┤
│ Status │ Name │ ID │ Size │
├──────────────────┼──────────────────────────────────────────────────────┼──────────────────────────┼───────────┤
│ Update Available │ BigQuery Command Line Tool │ bq │ < 1 MiB │
│ Update Available │ Cloud SDK Core Libraries │ core │ 9.3 MiB │
│ Update Available │ Cloud Storage Command Line Tool │ gsutil │ 3.6 MiB │
│ Not Installed │ App Engine Go Extensions │ app-engine-go │ 56.4 MiB │
│ Not Installed │ Cloud Bigtable Command Line Tool │ cbt │ 6.3 MiB │
│ Not Installed │ Cloud Bigtable Emulator │ bigtable │ 5.6 MiB │
│ Not Installed │ Cloud Datalab Command Line Tool │ datalab │ < 1 MiB │
│ Not Installed │ Cloud Datastore Emulator │ cloud-datastore-emulator │ 18.3 MiB │
│ Not Installed │ Cloud Datastore Emulator (Legacy) │ gcd-emulator │ 38.1 MiB │
│ Not Installed │ Cloud Firestore Emulator │ cloud-firestore-emulator │ 32.2 MiB │
│ Not Installed │ Cloud Pub/Sub Emulator │ pubsub-emulator │ 33.4 MiB │
│ Not Installed │ Cloud SQL Proxy │ cloud_sql_proxy │ 3.7 MiB │
│ Not Installed │ Emulator Reverse Proxy │ emulator-reverse-proxy │ 14.5 MiB │
│ Not Installed │ Google Cloud Build Local Builder │ cloud-build-local │ 5.9 MiB │
│ Not Installed │ Google Container Registry s Docker credential helper │ docker-credential-gcr │ 1.8 MiB │
│ Not Installed │ gcloud Alpha Commands │ alpha │ < 1 MiB │
│ Not Installed │ gcloud Beta Commands │ beta │ < 1 MiB │
│ Not Installed │ gcloud app Java Extensions │ app-engine-java │ 107.5 MiB │
│ Not Installed │ gcloud app PHP Extensions │ app-engine-php │ 21.9 MiB │
│ Not Installed │ gcloud app Python Extensions │ app-engine-python │ 6.2 MiB │
│ Not Installed │ gcloud app Python Extensions (Extra Libraries) │ app-engine-python-extras │ 28.5 MiB │
│ Not Installed │ kubectl │ kubectl │ < 1 MiB │
└──────────────────┴──────────────────────────────────────────────────────┴──────────────────────────┴───────────┘
To install or remove components at your current SDK version [218.0.0], run:
$ gcloud components install COMPONENT_ID
$ gcloud components remove COMPONENT_ID

To update your SDK installation to the latest version [230.0.0], run:
$ gcloud components update


Modify profile to update your $PATH and enable shell command
completion?

Do you want to continue (Y/n)? Y

The Google Cloud SDK installer will now prompt you to update an rc
file to bring the Google Cloud CLIs into your environment.

Enter a path to an rc file to update, or leave blank to use
[/Users/afu/.zshrc]:
Backing up [/Users/afu/.zshrc] to [/Users/afu/.zshrc.backup].
[/Users/afu/.zshrc] has been updated.

==> Start a new shell for the changes to take effect.


For more information on how to get started, please visit:
https://cloud.google.com/sdk/docs/quickstarts


[afu@MacBook-Pro ~/google-cloud-sdk $]

Version info

[afu@MacBook-Pro ~/google-cloud-sdk $] gcloud -v
Google Cloud SDK 218.0.0
bq 2.0.34
core 2018.09.24
gsutil 4.34
[afu@MacBook-Pro ~/google-cloud-sdk $]

Initialization

[afu@MacBook-Pro ~/google-cloud-sdk $] gcloud init
Welcome! This command will take you through the configuration of gcloud.

Your current configuration has been set to: [default]

You can skip diagnostics next time by using the following flag:
gcloud init --skip-diagnostics

Network diagnostic detects and fixes local network connection issues.
Checking network connection...done.
Reachability Check passed.
Network diagnostic (1/1 checks) passed.

You must log in to continue. Would you like to log in (Y/n)? Y

Your browser has been opened to visit:

https://accounts.google.com/o/oauth2/auth?redirect_uri=http%3A%2F%2Flocalhost%3A8085%2F&prompt=select_account&response_type=code&client_id=32555940559.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fappengine.admin+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcompute+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth&access_type=offline




Updates are available for some Cloud SDK components. To install them,
please run:
$ gcloud components update

You are logged in as: [afu at gmail.com].

Pick cloud project to use:
[1] integral-plexus-123456
[2] Create a new project
Please enter numeric choice or text value (must exactly match list
item): 1

Your current project has been set to: [integral-plexus-123456].

Do you want to configure a default Compute Region and Zone? (Y/n)? Y

Which Google Compute Engine zone would you like to use as project
default?
If you do not specify a zone via a command line flag while working
with Compute Engine resources, the default is assumed.
[1] us-east1-b
[2] us-east1-c
# ~~~~~~~~~~~~
# ~~~~~~~~~~~~
[48] europe-north1-b
[49] europe-north1-c
[50] northamerica-northeast1-a
Did not print [6] options.
Too many options [56]. Enter "list" at prompt to print choices fully.
Please enter numeric choice or text value (must exactly match list
item): 26

Your project default Compute Engine zone has been set to [asia-east1-b].
You can change it by running [gcloud config set compute/zone NAME].

Your project default Compute Engine region has been set to [asia-east1].
You can change it by running [gcloud config set compute/region NAME].

Created a default .boto configuration file at [/Users/afu/.boto]. See this file and
[https://cloud.google.com/storage/docs/gsutil/commands/config] for more
information about configuring Google Cloud Storage.
Your Google Cloud SDK is configured and ready to use!

* Commands that require authentication will use afu at gmail.com by default
* Commands will reference project `integral-plexus-123456` by default
* Compute Engine commands will use region `asia-east1` by default
* Compute Engine commands will use zone `asia-east1-b` by default

Run `gcloud help config` to learn how to change individual settings

This gcloud configuration is called [default]. You can create additional configurations if you work with multiple accounts and/or projects.
Run `gcloud topic configurations` to learn more.

Some things to try next:

* Run `gcloud --help` to see the Cloud Platform services you can interact with. And run `gcloud help COMMAND` to get help on any gcloud command.
* Run `gcloud topic -h` to learn about advanced features of the SDK like arg files and output formatting
[afu@MacBook-Pro ~/google-cloud-sdk $]

gcloud update log

[afu@MacBook-Pro ~/google-cloud-sdk $] gcloud components update
Do you want to continue (Y/n)? Y

╔════════════════════════════════════════════════════════════╗
╠═ Creating update staging area ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Uninstalling: BigQuery Command Line Tool ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Uninstalling: Cloud SDK Core Libraries ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Uninstalling: Cloud Storage Command Line Tool ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Uninstalling: gcloud cli dependencies ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Installing: BigQuery Command Line Tool ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Installing: Cloud SDK Core Libraries ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Installing: Cloud Storage Command Line Tool ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Installing: gcloud cli dependencies ═╣
╠════════════════════════════════════════════════════════════╣
╠═ Creating backup and activating new installation ═╣
╚════════════════════════════════════════════════════════════╝

Performing post processing steps...done.

Update done!

To revert your SDK to the previously installed version, you may run:
$ gcloud components update --version 218.0.0

# Version info after the update
[afu@MacBook-Pro ~/google-cloud-sdk $] gcloud -v
Google Cloud SDK 230.0.0
bq 2.0.39
core 2019.01.11
gsutil 4.35
[afu@MacBook-Pro ~/google-cloud-sdk $]

gcloud usage examples

# Set the default zone
$ gcloud config set compute/zone asia-east1-a
Updated property [compute/zone].

# Read a specific setting
$ gcloud config get-value compute/zone
asia-east1-a
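
To review everything at once, these also come in handy:

# Show the whole active configuration
$ gcloud config list

# Show all named configurations
$ gcloud config configurations list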

Connecting to a GKE cluster with gcloud

Prerequisite: the cluster has already been created on GKE; copy the command below from the GKE console.

# Run
[afu@MacBook-Pro ~/google-cloud-sdk $] gcloud container clusters get-credentials afu-first-cluster-1 --zone asia-east1-a --project integral-plexus-123456

Fetching cluster endpoint and auth data.
kubeconfig entry generated for afu-first-cluster-1.

# Inspect ~/.kube/config
# You will find the new cluster and context entries, plus the updated current-context
[afu@MacBook-Pro ~/google-cloud-sdk $] less ~/.kube/config

[afu@MacBook-Pro ~/google-cloud-sdk $] kc get nodes
NAME STATUS ROLES AGE VERSION
gke-afu-first-cluster-1-pool-1-dd90def6-lrcb Ready <none> 1h v1.11.6-gke.2
[afu@MacBook-Pro ~/google-cloud-sdk $]

GKE node migration

[Table of Contents]
  1. Cordon the node first
  2. Then move the pods
  3. Finally shrink pool-1 on GKE

Here is the story:
my GKE lab started out with MACHINE_TYPE g1-small, which has only 1 CPU.
For the master role this is not enough, so I wanted to migrate to nodes with a bigger machine type.

I followed this write-up; it migrates a whole node pool,
whereas I only migrated a single node. The steps are as follows:

Cordon the node first

kubectl cordon does not affect the services and pods already running on the node; it only stops new pods from being scheduled onto the cordoned node.

  1. First create a new node pool on GKE with enough resources, e.g. MACHINE_TYPE n1-standard-2

    # List the node pools
    $ gcloud container node-pools list --cluster afu-first-cluster-1
    NAME MACHINE_TYPE DISK_SIZE_GB NODE_VERSION
    pool-1 g1-small 30 1.11.6-gke.2
    pool-2 n1-standard-2 100 1.11.6-gke.2
  2. Cordon the old node:

    # Current nodes in pool-1
    $ kubectl get nodes -l cloud.google.com/gke-nodepool=pool-1
    NAME STATUS ROLES AGE VERSION
    gke-afu-first-cluster-1-pool-1-dd90def6-7h0t Ready <none> 21h v1.11.6-gke.2
    gke-afu-first-cluster-1-pool-1-dd90def6-dhfn Ready <none> 21h v1.11.6-gke.2
    gke-afu-first-cluster-1-pool-1-dd90def6-lrcb Ready <none> 23h v1.11.6-gke.2
    gke-afu-first-cluster-1-pool-1-dd90def6-slvd Ready <none> 21h v1.11.6-gke.2

    # Cordon node gke-afu-first-cluster-1-pool-1-dd90def6-lrcb
    $ kubectl cordon gke-afu-first-cluster-1-pool-1-dd90def6-lrcb
    node/gke-afu-first-cluster-1-pool-1-dd90def6-lrcb cordoned

    # Nodes in pool-1 after cordoning
    $ kubectl get nodes -l cloud.google.com/gke-nodepool=pool-1
    NAME STATUS ROLES AGE VERSION
    gke-afu-first-cluster-1-pool-1-dd90def6-7h0t Ready <none> 21h v1.11.6-gke.2
    gke-afu-first-cluster-1-pool-1-dd90def6-dhfn Ready <none> 21h v1.11.6-gke.2
    gke-afu-first-cluster-1-pool-1-dd90def6-lrcb Ready,SchedulingDisabled <none> 23h v1.11.6-gke.2
    gke-afu-first-cluster-1-pool-1-dd90def6-slvd Ready <none> 21h v1.11.6-gke.2

Then move the pods

  1. Drain the old node
    kubectl drain evicts the pods on the node so that they are rescheduled onto other nodes.
# Drain the node; DaemonSet pods cannot be evicted.
$ kubectl drain --force gke-afu-first-cluster-1-pool-1-dd90def6-lrcb
node/gke-afu-first-cluster-1-pool-1-dd90def6-lrcb already cordoned
error: unable to drain node "gke-afu-first-cluster-1-pool-1-dd90def6-lrcb", aborting command...

There are pending nodes to be drained:
gke-afu-first-cluster-1-pool-1-dd90def6-lrcb
error: DaemonSet-managed pods (use --ignore-daemonsets to ignore): consul-zk9wn, calico-node-6fdsp, ip-masq-agent-l85nd

# Drain the node again with extra flags
# --ignore-daemonsets  skip DaemonSet-managed pods
# --delete-local-data  delete pods that use local (emptyDir) data
# --grace-period       termination grace period
$ kubectl drain --force --ignore-daemonsets --delete-local-data --grace-period=10 gke-afu-first-cluster-1-pool-1-dd90def6-lrcb
node/gke-afu-first-cluster-1-pool-1-dd90def6-lrcb already cordoned
WARNING: Ignoring DaemonSet-managed pods: consul-zk9wn, calico-node-6fdsp, ip-masq-agent-l85nd
pod/l7-default-backend-7ff48cffd7-zbd8m evicted
pod/kube-dns-autoscaler-67c97c87fb-nc2rd evicted
pod/tiller-deploy-77c96688d7-jckdw evicted
pod/metrics-server-v0.2.1-fd596d746-cdp8p evicted
pod/calico-typha-horizontal-autoscaler-5ff7f558cc-zms8d evicted
pod/calico-typha-vertical-autoscaler-5d4bf57df5-pwjvn evicted
pod/calico-typha-5b857668fd-lzkf7 evicted
pod/calico-node-vertical-autoscaler-547d98499d-nt874 evicted
pod/kube-dns-7549f99fcc-kbvkt evicted
node/gke-afu-first-cluster-1-pool-1-dd90def6-lrcb evicted

Finally shrink pool-1 on GKE

Finally, resize pool-1 on GKE from 4 nodes to 3; GKE then removes the drained node.
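
The same shrink can be done from the CLI; a sketch only (depending on the SDK version the flag is --size or --num-nodes):

$ gcloud container clusters resize afu-first-cluster-1 \
    --node-pool pool-1 --num-nodes 3 --zone asia-east1-a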

Done. I have run through this twice and it met the migration needs both times; with this experience the next migration will feel much more familiar.

Reference: official documentation page
Reference: a community write-up

My First K8s Dashboard ^.^

[Table of Contents]
  1. Getting started: deploy the dashboard
    1. Download kubernetes-dashboard.yaml
    2. Edit kubernetes-dashboard.yaml
    3. Create the kubernetes-dashboard
    4. Create An Authentication Token (RBAC)
      1. Create Service Account
      2. Create ClusterRoleBinding
      3. Deploy the ServiceAccount & ClusterRoleBinding
    5. Accessing the Dashboard
      1. Bearer Token
      2. Building a kubeconfig file

Once the K8s environment is up, you still need a management UI for the administrators.
The project provides an official Dashboard GUI; the installation is recorded below.

Getting started: deploy the dashboard

I followed the official documentation page.

Download kubernetes-dashboard.yaml

$ wget https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml

Edit kubernetes-dashboard.yaml

Set the service type to NodePort:

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8443
      # nodePort: 30001
  type: NodePort # <<<~~~~
  selector:
    k8s-app: kubernetes-dashboard

Create the kubernetes-dashboard

[afu@dev-k8sm1 ~]$ kc apply -f kubernetes-dashboard.yaml
secret/kubernetes-dashboard-certs created
serviceaccount/kubernetes-dashboard created
role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
deployment.apps/kubernetes-dashboard created
service/kubernetes-dashboard created

Create An Authentication Token (RBAC)

See the official documentation page.

Create Service Account

k8s-dashboard-adminuser.yml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system

Create ClusterRoleBinding

k8s-dashboard-CRB.yml

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system

Deploy the ServiceAccount & ClusterRoleBinding

[afu@dev-k8sm1 ~]$ kc apply -f k8s-dashboard-adminuser.yml -f k8s-dashboard-CRB.yml
serviceaccount/admin-user created
clusterrolebinding.rbac.authorization.k8s.io/admin-user created

Accessing the Dashboard

Access URL: https://192.168.100.174:NodePort/

There are two ways to sign in to the Dashboard:

  1. Kubeconfig: select a kubeconfig file you created to configure access to the cluster.
  2. Token: every Service Account has a Secret holding a Bearer Token, which can be used to sign in to the dashboard.

Bearer Token

Complete the login by fetching the token; see the official documentation page.
Commands:

kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') | awk '/^token:/{print $2}'

Building a kubeconfig file

mkdir ansible/k8s/kubeconfig-exercise
cd ansible/k8s/kubeconfig-exercise
vi kubeconfig-demo

apiVersion: v1
kind: Config
preferences: {}

clusters:
- cluster:
  name: TestOCE

users:
- name: admin-user
#- name: experimenter

contexts:
- context:
  name: exp-default
- context:
  name: dev-frontend
- context:
  name: dev-storage

Go to your kubeconfig-exercise directory.

  • Add cluster details to your configuration file:

    $ kubectl config --kubeconfig=kubeconfig-demo set-cluster development --server=https://192.168.100.174 --insecure-skip-tls-verify
    Cluster "development" set.
  • Add user details to your configuration file:

    $ kubectl config --kubeconfig=kubeconfig-demo set-credentials admin-user --username=admin --password=ooxxqqpp
    User "admin-user" set.
  • Add context details to your configuration file:

    $ kubectl config --kubeconfig=kubeconfig-demo set-context dev-frontend --cluster=TestOCE --namespace=default --user=admin-user
    Context "dev-frontend" modified.
    $ kubectl config --kubeconfig=kubeconfig-demo set-context dev-storage --cluster=TestOCE --namespace=default --user=admin-user
    Context "dev-storage" modified.
    $ kubectl config --kubeconfig=kubeconfig-demo set-context exp-default --cluster=TestOCE --namespace=default --user=admin-user
    Context "exp-default" modified.
  • To view the kubeconfig-demo file, you can use the config view command.

    $ kubectl config --kubeconfig=kubeconfig-demo view

    # To see only the configuration information associated with the current context, use the --minify flag.
    $ kubectl config --kubeconfig=kubeconfig-demo view --minify
  • Set the current-context to dev-frontend:
    $ kubectl config --kubeconfig=kubeconfig-demo use-context dev-frontend

  • Change the current context to dev-storage:
    $ kubectl config --kubeconfig=kubeconfig-demo use-context dev-storage

  • Delete the context
    kubectl config delete-context kubernetes-admin@kubernetes

The book Kubernetes: Up and Running also covers kubectl config use-context in chapter 4, pp. 37-38.

Introduction to K8s Services

[Table of Contents]
  1. Creating a service
    1. Creating a service with kubectl expose
      1. Quick summary
      2. Lab example
      3. Lab example observations
    2. Exposing a service externally
      1. Approach: kubectl port-forward to forward TCP connections
      2. Approach: change the existing service type
      3. Approach: kubectl expose --type=NodePort
  2. Service discovery
    1. The close relationship between kube-dns and the service CLUSTER-IP
    2. About Endpoints
    3. Exploring services manually
  3. kube-proxy and the CLUSTER-IP love-hate relationship
  4. Probes and health checks
    1. Key points for defining probes
    2. Use cases for the three probe modes
    3. Defining a liveness probe
    4. Defining a readiness probe
    5. Configuration notes
      1. Extra options for httpGet checks in HTTP applications

A K8s Service exists so that a deployed application can accept connections and actually serve traffic, delivering its value.
This post digs into the Service details, which is a useful foundation for deploying applications.


Creating a service

Creating a service with kubectl expose

Quick summary:

  • A Service is created from an existing Deployment or ReplicaSet object
  • kubectl expose requirement: the object must provide a selector and a port (or the corresponding flags)

Related log:

$ kubectl expose pod alpaca-prod
error: could not retrieve selectors via --selector flag or introspection: the pod has no labels and cannot be exposed
See 'kubectl expose -h' for help and examples.

$ kubectl expose replicasets alpaca-prod
error: could not find port via --port flag or introspection

Lab example:

# Create the replicaset object
[afu@k8s ~]$ kubectl apply -f 7-78-kuard-rs.yaml
replicaset.extensions/alpaca-prod created

# Get pod info with --show-labels
[afu@k8s ~]$ kubectl get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE LABELS
alpaca-prod-5bbdx 1/1 Running 0 15s 172.17.0.5 minikube <none> app=alpaca,env=prod,ver=1
alpaca-prod-88gdw 1/1 Running 0 15s 172.17.0.6 minikube <none> app=alpaca,env=prod,ver=1
alpaca-prod-p749d 1/1 Running 0 15s 172.17.0.7 minikube <none> app=alpaca,env=prod,ver=1


[afu@k8s ~]$ kubectl get replicaset -o wide
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
alpaca-prod 3 3 3 27s kuard gcr.io/kuar-demo/kuard-amd64:2 app=alpaca,env=prod,ver=1


[afu@k8s ~]$ kubectl get daemonset -o wide
No resources found.

[afu@k8s ~]$ kubectl get deployment -o wide
No resources found.

# Create the service with kubectl expose
[afu@k8s ~]$ kubectl expose replicasets alpaca-prod
service/alpaca-prod exposed

Lab example observations

[afu@k8s ~]$ kubectl get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
alpaca-prod ClusterIP 10.103.75.220 <none> 8080/TCP 13s app=alpaca,env=prod,ver=1
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 65m <none>

Observations:

  1. In the alpaca-prod service created above, look at the SELECTOR column: app=alpaca,env=prod,ver=1.
    kubectl expose takes the label and port from the replicaset object above and uses them as the service parameters.
  2. The service type is ClusterIP, which provides a virtual CLUSTER-IP.
  3. The system matches pods via the SELECTOR and load-balances service -> pod.
  4. This service cannot be reached from outside; it is only accessible inside the K8s cluster.

Exposing a service externally

Continuing from the alpaca-prod service example above.
In that example, alpaca-prod defaults to a CLUSTER-IP, which gives in-cluster access... I did not fully understand its purpose yet.
Here we use another service type, NodePort, to open the service to external connections.

Approach: kubectl port-forward to forward TCP connections

About port-forward:

  1. It forwards raw TCP connections, unlike kubectl proxy, which only handles HTTP traffic.
  2. See the official K8s port-forward tutorial page.
# port-forward against a service object
[afu@k8s ~]$ kubectl port-forward --address 0.0.0.0 service/alpaca-prod 30111:8080
Forwarding from 0.0.0.0:30111 -> 8080
Handling connection for 30111
Handling connection for 30111

# port-forward against a pod object
[afu@k8s ~]$ kubectl port-forward --address 0.0.0.0 kuard-config 30333:8080
Forwarding from 0.0.0.0:30333 -> 8080
Handling connection for 30333
Handling connection for 30333

Approach: change the existing service type

Change type: ClusterIP to type: NodePort

[afu@k8s ~]$ kubectl edit service alpaca-prod
service/alpaca-prod edited

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2019-01-15T09:05:36Z"
  labels:
    app: alpaca
    env: prod
    ver: "1"
  name: alpaca-prod
  namespace: default
  resourceVersion: "5202"
  selfLink: /api/v1/namespaces/default/services/alpaca-prod
  uid: b87c02f5-18a4-11e9-8813-08002730aeb3
spec:
  clusterIP: 10.103.75.220
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: alpaca
    env: prod
    ver: "1"
  sessionAffinity: None
  type: NodePort
  # ^ changed type: ClusterIP to type: NodePort
status:
  loadBalancer: {}

Observe after the change:

[afu@k8s ~]$ kubectl get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
alpaca-prod NodePort 10.103.75.220 <none> 8080:30044/TCP 37m app=alpaca,env=prod,ver=1
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 102m <none>


[afu@k8s ~]$ kubectl describe service alpaca-prod
Name: alpaca-prod
Namespace: default
Labels: app=alpaca
env=prod
ver=1
Annotations: <none>
Selector: app=alpaca,env=prod,ver=1
Type: NodePort
IP: 10.103.75.220
Port: <unset> 8080/TCP
TargetPort: 8080/TCP
NodePort: <unset> 30044/TCP
Endpoints: 172.17.0.5:8080,172.17.0.6:8080,172.17.0.7:8080
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>

Approach: kubectl expose --type=NodePort

For kubectl expose --type=NodePort, see the official documentation page.

# Create a second example, alpaca-prod22
[afu@k8s ~]$ kubectl apply -f 7-78-kuard22-rs.yaml
replicaset.extensions/alpaca-prod22 created

# kubectl expose + --type=NodePort
[afu@k8s ~]$ kubectl expose replicaset alpaca-prod22 --type=NodePort
service/alpaca-prod22 exposed

[afu@k8s ~]$ kubectl get service -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
alpaca-prod NodePort 10.103.75.220 <none> 8080:30044/TCP 53m app=alpaca,env=prod,ver=1
alpaca-prod22 NodePort 10.105.98.135 <none> 8080:30142/TCP 20s app=alpaca,env=prod,ver=2
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 118m <none>

[afu@k8s ~]$ kubectl describe service alpaca-prod22
Name: alpaca-prod22
Namespace: default
Labels: app=alpaca
env=prod
ver=2
Annotations: <none>
Selector: app=alpaca,env=prod,ver=2
Type: NodePort
IP: 10.105.98.135
Port: <unset> 8080/TCP
TargetPort: 8080/TCP
NodePort: <unset> 30142/TCP
Endpoints: 172.17.0.10:8080,172.17.0.8:8080,172.17.0.9:8080
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>

Exposing a service via NodePort still requires a public IP in the environment to reach it (see the sketch below).
On GCP you can run: gcloud compute instances list
On Minikube you can run: kubectl cluster-info
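
A quick sketch of checking access from outside; the node IP and NodePort come from the example above, the firewall rule name is just an example, and on GCP the NodePort may need to be opened first:

# Find a node's external IP
$ kubectl get nodes -o wide

# On GCP, open the NodePort if it is not already allowed
$ gcloud compute firewall-rules create allow-nodeport-30044 --allow tcp:30044

# Hit the service via <node-external-ip>:<NodePort>
$ curl http://NODE_EXTERNAL_IP:30044/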


Service discovery

Service discovery helps you find out which services are running, where they run, their IP/port, and so on.
Quickly locating the right object in order to investigate and fix problems, or to bring a service online fast, all depends on service discovery.
Three aspects of service discovery are introduced below.

The close relationship between kube-dns and the service CLUSTER-IP

Earlier we mentioned the CLUSTER-IP without being clear about what it is for; with the kube-dns service you can resolve and reach a service by FQDN.
kube-dns is installed by default when the K8s cluster is built, and through this DNS service you can look up the mapping between a service and its CLUSTER-IP.

From the kuard lab above, you can use the DNS query feature on the Kuard web page to look up the A records:
kubectl port-forward --address 0.0.0.0 service/alpaca-prod 30111:8080

  • To query the default namespace, just enter the service name

    • For example: alpaca-prod
      ;; QUESTION SECTION:
      ;alpaca-prod.default.svc.cluster.local. IN A

      ;; ANSWER SECTION:
      alpaca-prod.default.svc.cluster.local. 5 IN A 10.103.75.220

  • To query a different namespace, enter service.namespace

    • For example: kubernetes-dashboard.kube-system
      ;; QUESTION SECTION:
      ;kubernetes-dashboard.kube-system.svc.cluster.local. IN A

      ;; ANSWER SECTION:
      kubernetes-dashboard.kube-system.svc.cluster.local. 5 IN A 10.109.91.186
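
Besides the Kuard UI, you can run the same lookup from a throwaway pod inside the cluster; a sketch (busybox:1.28 is used because its nslookup is well-behaved):

$ kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- \
    nslookup alpaca-prod.default.svc.cluster.local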

About Endpoints

When K8s creates a service object, it also creates an endpoint object,
for example for the alpaca-prod service from the lab above.

[afu@k8s ~]$ kubectl describe ep alpaca-prod
Name: alpaca-prod
Namespace: default
Labels: app=alpaca
env=prod
ver=1
Annotations: <none>
Subsets:
Addresses: 172.17.0.5,172.17.0.6,172.17.0.7
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
<unset> 8080 TCP

Events: <none>

[afu@k8s ~]$ kubectl get ep alpaca-prod
NAME ENDPOINTS AGE
alpaca-prod 172.17.0.5:8080,172.17.0.6:8080,172.17.0.7:8080 6h55m

You can also use --watch to keep observing changes to the service endpoints.

[afu@k8s ~]$ kubectl get ep alpaca-prod22 --watch
NAME ENDPOINTS AGE
alpaca-prod22 172.17.0.10:8080,172.17.0.8:8080,172.17.0.9:8080 4s
alpaca-prod22 172.17.0.10:8080,172.17.0.8:8080,172.17.0.9:8080 2m11s
alpaca-prod22 172.17.0.10:8080,172.17.0.8:8080,172.17.0.9:8080 0s

Exploring services manually

Labels were applied to the service, replicaset, and pod earlier.
With --show-labels you can display the label information of every object.

# Get pod info with --show-labels
[afu@k8s ~]$ kubectl get pod -o wide --show-labels
NAME READY STATUS ··· AGE IP NODE ··· LABELS
alpaca-prod-5bbdx 1/1 Running ··· 7h11m 172.17.0.5 minikube ··· app=alpaca,env=prod,ver=1
alpaca-prod-88gdw 1/1 Running ··· 7h11m 172.17.0.6 minikube ··· app=alpaca,env=prod,ver=1
alpaca-prod-p749d 1/1 Running ··· 7h11m 172.17.0.7 minikube ··· app=alpaca,env=prod,ver=1
alpaca-prod22-9zrwf 1/1 Running ··· 21m 172.17.0.8 minikube ··· app=alpaca,env=prod,ver=2
alpaca-prod22-fh44h 1/1 Running ··· 21m 172.17.0.10 minikube ··· app=alpaca,env=prod,ver=2
alpaca-prod22-zdtrv 1/1 Running ··· 21m 172.17.0.9 minikube ··· app=alpaca,env=prod,ver=2

Because pods carry label information, among many service pods you can use --selector to filter out exactly the pod objects you need.

  • -l, --selector='': Selector (label query) to filter on, supports '=', '==', and '!='.(e.g. -l key1=value1,key2=value2)
    # Filter the alpaca pods with ver=2
    [afu@k8s ~]$ kubectl get pod -o wide --selector=ver=2,app=alpaca
    NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
    alpaca-prod22-9zrwf 1/1 Running 0 33m 172.17.0.8 minikube <none>
    alpaca-prod22-fh44h 1/1 Running 0 33m 172.17.0.10 minikube <none>
    alpaca-prod22-zdtrv 1/1 Running 0 33m 172.17.0.9 minikube <none>

To recap, replicaset and service objects likewise use a selector to define and select the pods that belong to them.


kube-proxy and the CLUSTER-IP love-hate relationship

  1. My take on the service CLUSTER-IP is that it is intangible: it does not exist on the host or in the CRI network, it lives inside iptables rules.
  2. A CLUSTER-IP cannot be reached directly; the service behind it is accessed through kube-proxy, kubectl port-forward, and similar mechanisms.
  3. kube-proxy acts on services: whenever a service is created, an endpoint record is created with it, and kube-proxy rewrites the iptables rules based on that record,
    so every pod or endpoint change rewrites the iptables rules again.
  • Quoting the official description
    --service-cluster-ip-range ipNet     Default: 10.0.0.0/24
    A CIDR notation IP range from which to assign service cluster IPs.
    This must not overlap with any IP ranges assigned to nodes for pods.
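
You can see those rewritten rules yourself; a sketch on the minikube node, using the alpaca-prod CLUSTER-IP from above (the chain name assumes kube-proxy's iptables mode):

# On the node (e.g. via `minikube ssh`), list the NAT rules kube-proxy maintains
$ sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.103.75.220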

The CLUSTER-IP environment variables can be observed through the Kuard service:

  • Start kuard: [afu@k8s ~]$ kubectl port-forward --address 0.0.0.0 service/alpaca-prod 30111:8080

Probes and health checks

When K8s runs an application container, it uses a health-check process to keep the container running.
That health check only makes sure the application process inside the container stays up, and restarts it if it does not.
But just checking "is the process running" is not enough: if the process deadlocks, it is still running yet cannot answer requests, and the basic health check cannot detect this and still considers it healthy...

K8s supports two kinds of probes: liveness and readiness.
Their purposes differ:

  • A liveness probe checks the pod while it is already running and receiving traffic; if the check fails, the pod is restarted to try to recover the service.
  • A readiness probe works the other way around: it checks the pod before it receives any traffic, and only after the check passes does the pod join the service and start receiving requests.
    The liveness probe can call into the application's own logic and expect a healthy response, to judge whether the application is actually working.
    The readiness probe checks, right after the deployment initializes, whether the pod is ready before traffic is allowed in.

What they share is making sure pods run in a healthy state in the K8s cluster. The basics of both are introduced below:

Key points for defining probes

  • Probes are defined in the pod manifest.
  • Liveness and readiness are defined separately for each pod.
  • Three modes are supported:
    • exec: run a command or script
    • httpGet: check an HTTP response
    • tcpSocket: attempt a TCP socket connection

Use cases for the three probe modes

  1. To run a script or program inside the container, define an exec probe; a zero exit status means the probe is considered healthy.
  2. For an HTTP service, define an httpGet probe and use the HTTP response status to decide whether the HTTP process is working.
  3. To probe a TCP socket connection, define tcpSocket.

Defining a liveness probe

# Manifest file
spec:
  containers:
  - image: k8s.gcr.io/echoserver:1.10
    livenessProbe:
      httpGet:
        path: /healthy
        port: 8080
      initialDelaySeconds: 1
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 3

Defining a readiness probe

Its configuration is almost identical to the liveness probe:

# A sample configuration only
spec:
  containers:
  - image: k8s.gcr.io/echoserver:1.10
    readinessProbe:
      httpGet:
        path: /healthy
        port: 8080
      initialDelaySeconds: 1
      periodSeconds: 10
      timeoutSeconds: 1
      failureThreshold: 3
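
The other two probe modes look like this; a sketch only, where the script path and port are made-up examples:

# Sketch only; /bin/healthcheck.sh and port 3306 are hypothetical
livenessProbe:
  exec:
    command: ["/bin/sh", "-c", "/bin/healthcheck.sh"]
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  tcpSocket:
    port: 3306
  initialDelaySeconds: 5
  periodSeconds: 10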

Configuration notes

Many probe settings depend on the scenario; tune them so the liveness and readiness checks fit your needs:

  • initialDelaySeconds: how many seconds after the container starts before the first probe runs.
  • periodSeconds: how often to probe; default 10 seconds, minimum 1 second.
  • timeoutSeconds: probe timeout; default 1 second, minimum 1 second.
  • successThreshold: minimum consecutive successes after a failure for the probe to be considered successful; default 1.
    • For liveness it must be 1; minimum value is 1.
  • failureThreshold: how many consecutive failures count as a failure; default 3, minimum 1.
    • For liveness, reaching this threshold restarts the pod.
    • For readiness, reaching it marks the pod as not ready.

Extra options for httpGet checks in HTTP applications

  • host: hostname to connect to; defaults to the pod IP.
  • scheme: scheme used for the connection; default HTTP.
  • path: path of the request sent to the HTTP server.
  • httpHeaders: custom HTTP request headers.
  • port: the TCP port the container listens on, between 1 and 65535.

For HTTP probes, the kubelet sends an HTTP request to the configured path and port to perform the check.
By default the kubelet sends the probe to the container's IP address, unless the host field is set and overrides it.
In most cases you do not need to set host.

There is one situation where you might set it:
if the container listens on 127.0.0.1 and the pod's hostNetwork field is true, then the httpGet host should be set to 127.0.0.1.
If the pod relies on virtual hosting, which is probably the more common case, you should not use host but instead set the Host header via httpHeaders.
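
A sketch of that last case; the virtual-host name is hypothetical:

livenessProbe:
  httpGet:
    path: /healthy
    port: 8080
    # host: 127.0.0.1            # only when hostNetwork: true and the app listens on loopback
    httpHeaders:
    - name: Host
      value: myapp.example.com   # hypothetical virtual host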

Installing a Home K8s Practice Environment on CentOS 7

[Table of Contents]
  1. Basic K8s environment installation
    1. Notes:
      1. Once kubeadm init runs later, the file /var/lib/kubelet/config.yaml is generated, and the kubelet service keeps restarting itself until it succeeds.
  • Initializing the K8s cluster
    1. Test connectivity to gcr.io before initializing
    2. Initialization experience
    3. Initialization parameters:
    4. Notes:
      1. The first initialization failed for two reasons
  • Allow kubectl to run without root
  • Set up the CNI: Flannel
    1. Check the status
  • Joining your nodes
    1. Two things to confirm before joining nodes
  • This only completes the basic K8s installation
  • This post is a review of the key points
  • The challenges come later
  • Previously I built a K8s + CRI-O environment once, recorded in the five posts below:
    Post 1
    Post 2
    Post 3
    Post 4
    Post 5

    This time, building on that experience, I set up the lab environment the company will use; to keep things simple, the CRI is the familiar Docker.
    The installation mainly follows the official documentation and is not repeated here, since pasting it verbatim would add nothing.
    Any caveats or error messages are recorded below.


    Basic K8s environment installation

    Installation has three steps:

    1. Preparation

      • Disable swap before starting
      • Make sure SELinux is in permissive mode
      • Configure and apply the kernel parameters in /etc/sysctl.d/k8s.conf
        • net.bridge.bridge-nf-call-ip6tables = 1
        • net.bridge.bridge-nf-call-iptables = 1
        • net.ipv4.ip_forward = 1
        • vm.swappiness = 0
    2. Install the CRI component first, Docker in this case
      See the official documentation page

    3. Install kubeadm, kubelet and kubectl
      See the official documentation page

    Notes:

    • After enabling the service, the kubelet still did not start successfully; the error message was:
      Jan 10 18:20:40 dev-k8sm1 kubelet: F0110 18:20:40.118057   40783 server.go:189] 
      failed to load Kubelet config file /var/lib/kubelet/config.yaml,
      error failed to read kubelet config file "/var/lib/kubelet/config.yaml",
      error: open /var/lib/kubelet/config.yaml: no such file or directory

    Once kubeadm init runs later, the file /var/lib/kubelet/config.yaml is generated, and the kubelet service keeps restarting itself until it succeeds.


    Initializing the K8s cluster

    With the basic K8s components installed, we can now initialize the cluster.

    Test connectivity to gcr.io before initializing

    # kubeadm config images pull log
    [afu@dev-k8sm1 ~]$ sudo kubeadm config images pull
    [config/images] Pulled k8s.gcr.io/kube-apiserver:v1.13.1
    [config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.13.1
    [config/images] Pulled k8s.gcr.io/kube-scheduler:v1.13.1
    [config/images] Pulled k8s.gcr.io/kube-proxy:v1.13.1
    [config/images] Pulled k8s.gcr.io/pause:3.1
    [config/images] Pulled k8s.gcr.io/etcd:3.2.24
    [config/images] Pulled k8s.gcr.io/coredns:1.2.6

    Initialization experience

    The init parameters must include the pod network CIDR, via --pod-network-cidr.
    To pin the API server to a specific interface IP, add --apiserver-advertise-address.

    Initialization parameters:

    sudo kubeadm init --apiserver-advertise-address=$(hostname -i)  --pod-network-cidr=10.244.0.0/16

    Notes:

    The first initialization failed for two reasons

    # Initialization output
    [init] Using Kubernetes version: v1.13.1
    [preflight] Running pre-flight checks
    [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
    error execution phase preflight: [preflight] Some fatal errors occurred:
    [ERROR Swap]: running with swap on is not supported. Please disable swap
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
    1. The output reminds us to open firewall ports 6443/TCP and 10250/TCP
      How to open them

      sudo firewall-cmd --zone=public --add-port=6443/tcp --permanent
      sudo firewall-cmd --zone=public --add-port=10250/tcp --permanent
      sudo firewall-cmd --reload
    2. I forgot to disable swap, oops
      How to disable it

      # Disable swap
      sudo swapoff -a
      sudo vi /etc/fstab
      # Comment out the swap volume below
      #/dev/mapper/VolGroup00-LogVol01 swap swap defaults

      sudo vi /etc/sysctl.d/k8s.conf
      net.bridge.bridge-nf-call-ip6tables = 1
      net.bridge.bridge-nf-call-iptables = 1
      vm.swappiness=0

      sudo sysctl --system

    Successful initialization log:

    [afu@dev-k8sm1 ~]$ sudo kubeadm init --apiserver-advertise-address=$(hostname -i)  --pod-network-cidr=10.244.0.0/16
    [init] Using Kubernetes version: v1.13.1
    Your Kubernetes master has initialized successfully!

    To start using your cluster, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
    https://kubernetes.io/docs/concepts/cluster-administration/addons/

    You can now join any number of machines by running the following on each node
    as root:

    kubeadm join API-IP:6443 --token abc.zzxxccvv --discovery-token-ca-cert-hash sha256:0123456789

    Allow kubectl to run without root

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    If you are root, it is recommended to run:
    export KUBECONFIG=/etc/kubernetes/admin.conf


    Set up the CNI: Flannel

    To use Flannel as the CNI, kubeadm init must be given --pod-network-cidr=10.244.0.0/16.
    I specified that earlier, so this is covered.
    You also need sysctl net.bridge.bridge-nf-call-iptables=1 so that traffic can pass through iptables.
    That was also set earlier. The Flannel manifest itself still has to be applied, as sketched below.
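
    A sketch of applying the Flannel manifest; the URL is the one the Flannel project documented at the time, so check the current docs before using it:

    $ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml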

    Check the status

    [afu@dev-k8sm1 ~]$ sudo systemctl status kubelet
    ● kubelet.service - kubelet: The Kubernetes Node Agent
    Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
    └─10-kubeadm.conf
    Active: active (running) since Thu 2019-01-10 21:12:30 CST; 19min ago
    Docs: https://kubernetes.io/docs/
    Main PID: 56904 (kubelet)

    [afu@dev-k8sm1 ~]$ kubectl get pods --all-namespaces
    NAMESPACE NAME READY STATUS RESTARTS AGE
    kube-system coredns-86c58d9df4-4kp2l 1/1 Running 0 18m
    kube-system coredns-86c58d9df4-zfh7s 1/1 Running 0 18m
    kube-system etcd-dev-k8sm1.kingbay-tech.com 1/1 Running 0 17m
    kube-system kube-apiserver-dev-k8sm1.kingbay-tech.com 1/1 Running 0 17m
    kube-system kube-controller-manager-dev-k8sm1.kingbay-tech.com 1/1 Running 0 17m
    kube-system kube-flannel-ds-amd64-rx877 1/1 Running 0 28s
    kube-system kube-proxy-dvm46 1/1 Running 0 18m
    kube-system kube-scheduler-dev-k8sm1.kingbay-tech.com 1/1 Running 0 17m

    Joining your nodes

    

Before joining nodes, confirm two things on each worker:

1. The container runtime (Docker) has been installed
2. kubeadm has been installed

Once both are confirmed, run the kubeadm join command from the init output above on each node to complete the join; a short sketch follows below.
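
A minimal sketch of the join and verification steps, reusing the placeholder token and hash from the init output above:

    # on each worker node, as root (the placeholders come from the init output;
    # `kubeadm token create --print-join-command` on the master regenerates the real values)
    kubeadm join API-IP:6443 --token abc.zzxxccvv --discovery-token-ca-cert-hash sha256:0123456789

    # back on the master: the new node should appear and reach Ready once the CNI pods are running
    kubectl get nodes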

This only completes a basic K8s installation.

This post is really a review of the key points.

The real challenges come later.

CD - Chapter 10 Notes, Part 2

[Table of Contents]
1. Chapter 10: Tips and tricks
  1. Tips and tricks P.271 ~
  2. Log deployment activities
  3. Don't delete old files, move them elsewhere
  4. Deployment is the whole team's responsibility
  5. Server applications should not have a GUI
  6. Give new deployments a warm-up period
  7. Fail fast
  8. Don't modify the production environment directly
2. Summary

Chapter 10: Tips and tricks

Tips and tricks P.271 ~

Sending a deployment team to deploy an unfamiliar system or service is often the start of an unpleasant experience.
The people who actually perform deployments should therefore take part in building the deployment process, building team rapport from the start.
The deployment documentation can then be written, revised, and validated from the beginning, laying the foundation for correct and efficient deployments.

Log deployment activities

When deployments are automated by the development team, it is very important to record the environment dependencies, package requirements, and file and directory permissions involved.
Post-deployment logs and a deployment checklist should also be kept.

Don't delete old files, move them elsewhere

As mentioned in the previous post, keep a copy of the old version. By the same basic principle, until the upgrade or deployment is complete, always leave yourself a way back rather than painting yourself into a corner.
A UNIX best practice: deploy each version of the application into its own directory, then point a symlink at the target version.
Switching (or rolling back) versions then becomes much easier, as sketched below.
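
A minimal shell sketch of that symlink pattern; the paths and release names are illustrative, not from the book:

    # each release lives in its own directory, e.g. /opt/myapp/releases/1.4.0 and /opt/myapp/releases/1.5.0
    # "current" is just a symlink; deploying 1.5.0 means repointing it
    ln -sfn /opt/myapp/releases/1.5.0 /opt/myapp/current

    # rolling back is repointing it at the previous release
    ln -sfn /opt/myapp/releases/1.4.0 /opt/myapp/current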

Deployment is the whole team's responsibility

Every team member should know how to deploy and how to maintain the deployment scripts, which is achieved by using the same, canonical deployment scripts for every single deployment (the "shared master copy" idea).

Server applications should not have a GUI

Irony: how is Windows supposed to feel about this?

The book's reasoning behind this heading: historically, GUI-style applications kept configuration that was not scriptable and were very sensitive to which directory the application lived in.
The main problem is that many of them can only start while a user is logged in.

Give new deployments a warm-up period

By the time a site officially launches it should already have been running for a while, long enough for the application servers and databases to build their caches and establish all their connections, i.e. to finish warming up.

Aside: a system needing a warm-up is new to me; pre-warming caches in a CDN, on the other hand, is very common.

The book brings up that bird again: canary releases are one way to achieve this warm-up.
After the new servers have handled a small share of requests without any anomalies, live traffic can be progressively switched over to the new system.

Fail fast

Deployment scripts should also be covered by the test plan.
To make sure a deployment went well, these tests should form part of the deployment's own verification; ideally, the moment a problem is detected, the deployment stops immediately (see the sketch below).
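
A minimal sketch of the fail-fast idea in a deployment script; the deployment step and health-check URL are illustrative assumptions, not from the book:

    #!/bin/bash
    # abort on the first failing command, unset variable, or failing pipe stage
    set -euo pipefail

    ./deploy_new_version.sh            # hypothetical deployment step

    # smoke test: if the health endpoint doesn't answer with 2xx, curl fails and the script stops here
    curl --fail --silent http://localhost:8080/healthz > /dev/null
    echo "deployment verified"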

Don't modify the production environment directly

Earlier chapters already stressed the importance of version control; making any change directly in production is strongly discouraged.
Production should be treated as "locked down", so that uncontrolled changes cannot push the system into an unstable, unknown state.
Only with that constraint in place can you move towards the bright path of continuous deployment.

Summary

• Among the stages of the deployment pipeline, the focus here is on deployment to the test and production environments.
• What sets these stages apart is that automated testing is mandatory: once the tests pass, the pipeline should be able to carry out the remaining deployment stages at the "push of a button".
• The deployment history should let the team see clearly which version was deployed to which environment; differences between versions must also be transparent.
• The best way to reduce the risk of failed deployments is continuous release: every revision should follow the process and be deployed to each test environment.
• The more frequently you deploy, the smaller the "blast radius" of any problem, and the closer continuous deployment gets to a 100% success rate.
• The deployment process should be implemented and exercised from the very beginning and improved iteratively; don't wait until the critical final moment to think about it.
• Development, business, testing, DBAs, support, operations, and every other team involved need to keep communicating and cooperating to make delivering the company's products and services more efficient.

CD - Chapter 10 Notes, Part 1

[Table of Contents]
1. Chapter 10: Continuous deployment
  1. Continuous deployment P.267 ~
    1. Bringing the canary along on the continuous deployment journey builds confidence in release correctness and helps control risk
    2. Continuously releasing user-installed software P.269 ~
  2. Tips and tricks P.271 ~

Chapter 10: Continuous deployment

Continuous deployment P.267 ~

Taken to its logical extreme, "continuous deployment" means deploying to production whenever a version passes the automated tests.
The term was coined by Timothy Fitz.
Deploying all the time does not mean deploying whenever you feel like it; the key is still that the automated tests pass, and that deployment to production then happens continuously.

With a deployment pipeline, every commit that passes all the automated tests should be deployed straight to production. For this to work without incident, the automated tests must cover the entire application.

Automated test types: unit tests, component tests, functional acceptance tests, non-functional acceptance tests.

Bringing the canary along on the continuous deployment journey builds confidence in release correctness and helps control risk

There will always be people who oppose continuous deployment, worried that the risk is too high.
Looked at the other way, a continuous deployment process pushes team members to do the right thing.
The more often you do the right thing, the more efficient you become, and that is also an excellent way to reduce risk.

Continuously releasing user-installed software P.269 ~

Is software released on your own company's platform easy to manage?
Is software released onto someone else's platform easy to manage?
They call for completely different management mindsets.
You need to consider:

1. Managing a large number of versions, their history, and the target platforms/devices
2. Whether upgrades are automatic, whether users are notified, and whether they run in the foreground
3. Pre-upgrade precondition checks
4. A rollback plan for failed upgrades
As an example of how many versions a single product can have in flight at the same time, here is a browser's release-channel table as of late October 2018:

Stable releases
  Windows                       70.0.3538.67  (October 16, 2018)
  macOS                         70.0.3538.64  (October 17, 2018)
  Linux / Android / iOS         70.0.3538.75  (October 23, 2018)
Preview releases
  Beta (Windows, macOS, Linux)  70.0.3538.67  (October 16, 2018)
  Beta (Android)                70.0.3538.64  (October 15, 2018)
  Beta (iOS)                    71.0.3578.14  (October 19, 2018)
  Dev (Windows, macOS, Linux)   71.0.3578.20  (October 23, 2018)
  Dev (Android)                 71.0.3578.12  (October 18, 2018)
  Dev (iOS)                     72.0.3585.0   (October 23, 2018)
  Canary (Windows, macOS)       72.0.3589.0   (October 23, 2018)
  Canary (Android)              72.0.3589.0   (October 23, 2018)

Users may not understand, or may simply not want, an "upgrade"; without useful information about what the upgrade brings, it is hard to get an upgrade wave going XD

This reminds me of how Apple always tells everyone how great the new iPhone / MacBook is: time for an upgrade wave!!!

Mindset: if the development team lacks a complete test and release process and is terrified of pushing the software to market, how can it possibly persuade customers to upgrade?

To provide a rock-solid upgrade process, you need to handle the migration of binaries, data, and configuration. Until the upgrade has fully completed, keep a copy of the old version so that if the update fails you can quietly restore the previous good state.

Finally, if the model is software that customers install themselves, the key is being able to send error reports back to the development team.
Those reports may describe unpleasant encounters such as insufficient resources or incompatibilities between the system and the code.

Tips and tricks P.271 ~

Setting up a PHP stack with Docker

[Table of Contents]
1. Requirements
  1. Problem review

Requirements

Recently I received a request to set up a PHP-based project.
Since it's only for a demo, Docker was the obvious way to build the environment.

The stack needs three pieces: PHP, Nginx, and MySQL (a sketch of wiring the three containers together appears after the build step below).
I've built this kind of stack before, so it shouldn't be a big deal, but every project needs a different set of PHP extensions, so understanding the project and baking those extensions into the Docker PHP image is the main task.
For Nginx and MySQL, the official Docker images can be pulled and used as-is.
Official Nginx image
Official MySQL image

After digging into the project and confirming the list of required PHP extensions,
I wrote the following PHP Dockerfile based on the official php image plus those extension requirements:

    FROM php:5.6-fpm

    RUN apt-get update \
    && apt-get install -y \
    libfreetype6-dev \
    libjpeg62-turbo-dev \
    libpng-dev \
    libmcrypt-dev \
    && docker-php-ext-install -j$(nproc) iconv \
    && docker-php-ext-configure gd --with-freetype-dir=/usr/include/ --with-jpeg-dir=/usr/include/ \
    && docker-php-ext-install -j$(nproc) gd

    RUN docker-php-ext-install mysql mysqli pdo pdo_mysql mbstring mcrypt sockets

PHP projects generally need the composer command to initialize the project.
The official composer image can be used, again adding the required PHP extensions; the composer Dockerfile looks like this:

    FROM composer

    RUN apk upgrade --update && \
    apk --no-cache add libmcrypt-dev mysql-client libpng-dev $PHPIZE_DEPS \
    && pecl install mcrypt-1.0.1 \
    && docker-php-ext-enable mcrypt

    RUN docker-php-ext-install -j$(nproc) gd mysqli
    RUN docker-php-ext-install -j$(nproc) pdo pdo_mysql iconv mbstring sockets

    WORKDIR /app

    ENTRYPOINT ["/bin/sh", "/docker-entrypoint.sh"]

    CMD ["composer"]

With the Dockerfiles written, build the images:

# docker build command
    docker build -t tag-name ./
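
The post does not show how the three containers are connected; here is a minimal sketch using docker run, where the network name, container names, published port, credentials, and volume paths are all illustrative assumptions:

    # user-defined network so the containers can reach each other by name
    docker network create demo-net

    # MySQL from the official image (credentials are placeholders)
    docker run -d --name mysql --network demo-net \
      -e MYSQL_ROOT_PASSWORD=secret -e MYSQL_DATABASE=demo mysql:5.7

    # the PHP-FPM image built above ("tag-name"), with the project mounted at /var/www/html
    docker run -d --name php --network demo-net \
      -v /Documents/test/project:/var/www/html tag-name

    # Nginx from the official image; nginx.conf must proxy *.php requests to the php container (fastcgi_pass php:9000)
    docker run -d --name nginx --network demo-net -p 8080:80 \
      -v /Documents/test/nginx.conf:/etc/nginx/conf.d/default.conf \
      -v /Documents/test/project:/var/www/html nginx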

composer install & create-project commands:

    # composer create-project project/name
    sudo docker run --rm --interactive --tty -v /Documents/test:/app --user $(id -u):$(id -g) composer create-project project/name
    # composer install
    sudo docker run --rm --interactive --tty -v /Documents/test:/app --user $(id -u):$(id -g) composer install

The main point of this post is to record the problems encountered along the way, as notes for future reference.

Problem review

1. Problem:
      Cannot find autoconf. Please check your autoconf installation and the $PHP_AUTOCONF environment variable. Then, rerun this script. ERROR: 'phpize' failed

  • Fix:
        RUN apk --no-cache add $PHPIZE_DEPS
2. Problem: running composer install or create-project produced the error: requested PHP extension mcrypt is missing from your system

  • Fix: the docker image was missing the PHP mcrypt extension, so add:
        RUN pecl install mcrypt-1.0.1
        RUN docker-php-ext-enable mcrypt
3. Problem: when running the project, PHP failed with: PHP Fatal error: Call to undefined function imagettftext()

  • Fix: imagettftext() depends on GD built with FreeType support, but the GD extension had been built without it. Enable it in the PHP Dockerfile like this:
        RUN docker-php-ext-install -j$(nproc) iconv
        RUN docker-php-ext-configure gd --with-freetype-dir=/usr/include/ --with-jpeg-dir=/usr/include/
        RUN docker-php-ext-install -j$(nproc) gd