Merge pull request #608 from coreos/locksmithd-to-cluo

Switch Kubernetes clusters from locksmith to Container Linux Update Operator
This commit is contained in:
Dalton Hubble
2017-07-17 11:26:14 -07:00
committed by GitHub
17 changed files with 139 additions and 71 deletions

View File

@@ -11,7 +11,10 @@ Notable changes between releases.
### Examples / Modules
* Upgrade Kubernetes v1.6.6 example clusters
* Kubernetes example clusters enable etcd TLS (unless experimental self-hosted etcd is enabled)
* Kubernetes example clusters enable etcd TLS
* Deploy the Container Linux Update Operator (CLUO) to coordinate reboots of Container Linux nodes in Kubernetes clusters. See the cluster [addon docs](Documentation/cluster-addons.md).
* Kubernetes examples (terraform and non-terraform) mask locksmithd
* Terraform modules `bootkube` and `profiles` (Kubernetes) mask locksmithd
## v0.6.1 (2017-05-25)

View File

@@ -60,32 +60,32 @@ Client machines should boot and provision themselves. Local client VMs should ne
We're ready to use bootkube to create a temporary control plane and bootstrap a self-hosted Kubernetes cluster.
Secure copy the etcd TLS assets to `/etc/ssl/etcd/*` on **every** node.
Secure copy the etcd TLS assets to `/etc/ssl/etcd/*` on **every controller** node.
```bash
for node in 'node1' 'node2' 'node3'; do
```sh
for node in 'node1'; do
scp -r assets/tls/etcd-* assets/tls/etcd core@$node.example.com:/home/core/
ssh core@$node.example.com 'sudo mkdir -p /etc/ssl/etcd && sudo mv etcd-* etcd /etc/ssl/etcd/ && sudo chown -R etcd:etcd /etc/ssl/etcd && sudo chmod -R 500 /etc/ssl/etcd/'
done
```
Secure copy the `kubeconfig` to `/etc/kubernetes/kubeconfig` on **every** node which will path activate the `kubelet.service`.
Secure copy the `kubeconfig` to `/etc/kubernetes/kubeconfig` on **every node** to path activate the `kubelet.service`.
```bash
```sh
for node in 'node1' 'node2' 'node3'; do
scp assets/auth/kubeconfig core@$node.example.com:/home/core/kubeconfig
ssh core@$node.example.com 'sudo mv kubeconfig /etc/kubernetes/kubeconfig'
done
```
Secure copy the `bootkube` generated assets to any controller node and run `bootkube-start` (takes ~10 minutes).
Secure copy the `bootkube` generated assets to **any controller** node and run `bootkube-start` (takes ~10 minutes).
```sh
scp -r assets core@node1.example.com:/home/core
ssh core@node1.example.com 'sudo mv assets /opt/bootkube/assets && sudo systemctl start bootkube'
```
Optionally watch the Kubernetes control plane bootstrapping with the bootkube temporary api-server. You will see quite a bit of output.
Watch the Kubernetes control plane bootstrapping with the bootkube temporary api-server. You will see quite a bit of output.
```sh
$ ssh core@node1.example.com 'journalctl -f -u bootkube'
@@ -96,11 +96,11 @@ $ ssh core@node1.example.com 'journalctl -f -u bootkube'
[ 299.311743] bootkube[5]: All self-hosted control plane components successfully started
```
You may cleanup the `bootkube` assets on the node, but you should keep the copy on your laptop. It contains a `kubeconfig` used to access the cluster.
[Verify](#verify) the Kubernetes cluster is accessible once complete. Then install **important** cluster [addons](cluster-addons.md). You may cleanup the `bootkube` assets on the node, but you should keep the copy on your laptop. It contains a `kubeconfig` used to access the cluster.
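A minimal cleanup sketch, assuming the assets were copied to `node1.example.com` as above:
```sh
# Remove the bootstrap assets from the controller node; keep the local copy,
# since assets/auth/kubeconfig is needed to access the cluster.
ssh core@node1.example.com 'sudo rm -rf /opt/bootkube/assets'
```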
## Verify
[Install kubectl](https://coreos.com/kubernetes/docs/latest/configure-kubectl.html) on your laptop. Use the generated kubeconfig to access the Kubernetes cluster. Verify that the cluster is accessible and that the kubelet, apiserver, scheduler, and controller-manager are running as pods.
[Install kubectl](https://coreos.com/kubernetes/docs/latest/configure-kubectl.html) on your laptop. Use the generated kubeconfig to access the Kubernetes cluster. Verify that the cluster is accessible and that the apiserver, scheduler, and controller-manager are running as pods.
```sh
$ export KUBECONFIG=assets/auth/kubeconfig
@@ -128,7 +128,9 @@ kube-system pod-checkpointer-hb960 1/1 Running 0
kube-system pod-checkpointer-hb960-node1.example.com 1/1 Running 0 6m
```
Try deleting pods to see that the cluster is resilient to failures and machine restarts (Container Linux auto-updates).
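For example (the pod name below is illustrative; pick one from your own cluster):
```sh
# Delete a control plane pod and watch its Deployment/DaemonSet recreate it
kubectl --kubeconfig=assets/auth/kubeconfig -n kube-system delete pod kube-scheduler-694795526-fks0b
kubectl --kubeconfig=assets/auth/kubeconfig -n kube-system get pods -w
```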
## Addons
Install **important** cluster [addons](cluster-addons.md).
## Going further

View File

@@ -0,0 +1,30 @@
## Cluster Addons
Kubernetes clusters run cluster addons atop Kubernetes itself. Addons may be considered essential for bootstrapping (non-optional), important (highly recommended), or optional.
## Essential
Several addons are considered essential. CoreOS cluster creation tools include them, as do Kubernetes clusters deployed via the Matchbox examples or our Terraform modules.
### kube-proxy
`kube-proxy` is deployed as a DaemonSet.
### kube-dns
`kube-dns` is deployed as a Deployment.
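A quick sanity check that both essential addons are present:
```sh
# List DaemonSets and Deployments in kube-system; kube-proxy and kube-dns should appear
kubectl -n kube-system get daemonsets,deployments
```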
## Important
### Container Linux Update Operator
The [Container Linux Update Operator](https://github.com/coreos/container-linux-update-operator) (CLUO) coordinates reboots of auto-updating Container Linux nodes so that only one node reboots at a time and nodes are drained before rebooting. CLUO enables the auto-update behavior Container Linux clusters are known for, but does so in a Kubernetes-native way. Deploying CLUO is strongly recommended.
Create the `update-operator` deployment and `update-agent` DaemonSet.
```sh
kubectl apply -f examples/addons/cluo/update-operator.yaml
kubectl apply -f examples/addons/cluo/update-agent.yaml
```
*Note: CLUO replaces `locksmithd` reboot coordination. The `update_engine` systemd unit on each host still performs the Container Linux update check, download, and install to the inactive partition.*
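To verify the handoff (a sketch, assuming SSH access as the `core` user), check that `locksmithd` is masked while `update-engine` keeps running, and that the CLUO pods are up:
```sh
ssh core@node1.example.com 'systemctl is-enabled locksmithd.service; systemctl is-active update-engine.service'
kubectl -n kube-system get pods -l app=container-linux-update-agent
kubectl -n kube-system get pods -l app=container-linux-update-operator
```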

View File

@@ -0,0 +1,55 @@
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: container-linux-update-agent
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        app: container-linux-update-agent
        container-linux-update.v1.coreos.com/agent-version: v0.2.1
      annotations:
        container-linux-update.v1.coreos.com/agent-version: v0.2.1
    spec:
      containers:
      - name: update-agent
        image: quay.io/coreos/container-linux-update-operator:v0.2.1
        command:
        - "/bin/update-agent"
        volumeMounts:
        - mountPath: /var/run/dbus
          name: var-run-dbus
        - mountPath: /etc/coreos
          name: etc-coreos
        - mountPath: /usr/share/coreos
          name: usr-share-coreos
        - mountPath: /etc/os-release
          name: etc-os-release
        env:
        # read by update-agent as the node name to manage reboots for
        - name: UPDATE_AGENT_NODE
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      volumes:
      - name: var-run-dbus
        hostPath:
          path: /var/run/dbus
      - name: etc-coreos
        hostPath:
          path: /etc/coreos
      - name: usr-share-coreos
        hostPath:
          path: /usr/share/coreos
      - name: etc-os-release
        hostPath:
          path: /etc/os-release

View File

@@ -0,0 +1,23 @@
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: container-linux-update-operator
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: container-linux-update-operator
    spec:
      containers:
      - name: update-operator
        image: quay.io/coreos/container-linux-update-operator:v0.2.1
        command:
        - "/bin/update-operator"
        - "--manage-agent=false"
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace

View File

@@ -9,11 +9,11 @@
"metadata": {
"domain_name": "node1.example.com",
"etcd_initial_cluster": "node1=https://node1.example.com:2380",
"etcd_endpoints": "https://node1.example.com:2379",
"etcd_name": "node1",
"k8s_dns_service_ip": "10.3.0.10",
"ssh_authorized_keys": [
"ADD ME"
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPQFdwVLr+alsWIgYRz9OdqDhnx9jjuFbkdSdpqq4gd9uZApYlivMDD4UgjFazQpezx8DiNhu9ym7i6LgAcdwi+10hE4L9yoJv9uBgbBxOAd65znqLqF91NtV4mlKP5YfJtR7Ehs+pTB+IIC+o5veDbPn+BYgDMJ2x7Osbn1/gFSDken/yoOFbYbRMGMfVEQYjJzC4r/qCKH0bl/xuVNLxf9FkWSTCcQFKGOndwuGITDkshD4r2Kk8gUddXPxoahBv33/2QH0CY5zbKYjhgN6I6WtwO+O1uJwtNeV1AGhYjurdd60qggNwx+W7623uK3nIXvJd3hzDO8u5oa53/tIL fake-test-key-REMOVE-ME"
]
}
}

View File

@@ -8,10 +8,9 @@
},
"metadata": {
"domain_name": "node2.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"ssh_authorized_keys": [
"ADD ME"
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPQFdwVLr+alsWIgYRz9OdqDhnx9jjuFbkdSdpqq4gd9uZApYlivMDD4UgjFazQpezx8DiNhu9ym7i6LgAcdwi+10hE4L9yoJv9uBgbBxOAd65znqLqF91NtV4mlKP5YfJtR7Ehs+pTB+IIC+o5veDbPn+BYgDMJ2x7Osbn1/gFSDken/yoOFbYbRMGMfVEQYjJzC4r/qCKH0bl/xuVNLxf9FkWSTCcQFKGOndwuGITDkshD4r2Kk8gUddXPxoahBv33/2QH0CY5zbKYjhgN6I6WtwO+O1uJwtNeV1AGhYjurdd60qggNwx+W7623uK3nIXvJd3hzDO8u5oa53/tIL fake-test-key-REMOVE-ME"
]
}
}

View File

@@ -8,10 +8,9 @@
},
"metadata": {
"domain_name": "node3.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"ssh_authorized_keys": [
"ADD ME"
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPQFdwVLr+alsWIgYRz9OdqDhnx9jjuFbkdSdpqq4gd9uZApYlivMDD4UgjFazQpezx8DiNhu9ym7i6LgAcdwi+10hE4L9yoJv9uBgbBxOAd65znqLqF91NtV4mlKP5YfJtR7Ehs+pTB+IIC+o5veDbPn+BYgDMJ2x7Osbn1/gFSDken/yoOFbYbRMGMfVEQYjJzC4r/qCKH0bl/xuVNLxf9FkWSTCcQFKGOndwuGITDkshD4r2Kk8gUddXPxoahBv33/2QH0CY5zbKYjhgN6I6WtwO+O1uJwtNeV1AGhYjurdd60qggNwx+W7623uK3nIXvJd3hzDO8u5oa53/tIL fake-test-key-REMOVE-ME"
]
}
}

View File

@@ -8,7 +8,6 @@
"metadata": {
"domain_name": "node1.example.com",
"etcd_initial_cluster": "node1=https://node1.example.com:2380",
"etcd_endpoints": "https://node1.example.com:2379",
"etcd_name": "node1",
"k8s_dns_service_ip": "10.3.0.10",
"pxe": "true",

View File

@@ -7,7 +7,6 @@
},
"metadata": {
"domain_name": "node2.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"pxe": "true",
"ssh_authorized_keys": [

View File

@@ -7,7 +7,6 @@
},
"metadata": {
"domain_name": "node3.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"pxe": "true",
"ssh_authorized_keys": [

View File

@@ -27,15 +27,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
mask: true
- name: kubelet.path
enable: true
contents: |

View File

@@ -4,15 +4,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
mask: true
- name: kubelet.path
enable: true
contents: |

View File

@@ -64,9 +64,11 @@ Note: The `cached-container-linux-install` profile will PXE boot and install Con
You may set certain optional variables to override defaults. Set `experimental_self_hosted_etcd = "true"` to deploy "self-hosted" etcd atop Kubernetes instead of running etcd on hosts directly.
```hcl
# Optional (defaults)
# cached_install = "false"
# install_disk = "/dev/sda"
# container_linux_oem = ""
# experimental_self_hosted_etcd = "true"
# experimental_self_hosted_etcd = "false"
```
The default creates an example Kubernetes cluster with 1 controller and 2 workers, but check `multi-controller.tfvars.example` for a variant that defines 3 controllers and 1 worker.
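For instance (a sketch; Terraform loads `terraform.tfvars` automatically):
```sh
# Use the multi-controller example variables and preview the plan
cp multi-controller.tfvars.example terraform.tfvars
terraform plan
```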
@@ -101,7 +103,7 @@ module.cluster.null_resource.bootkube-start: Still creating... (8m40s elapsed)
Apply complete! Resources: 37 added, 0 changed, 0 destroyed.
```
You can now move on to the "Machines" section. Apply will loop until it can successfully copy the kubeconfig to each node and start the one-time Kubernetes bootstrapping process on a controller. In practice, you may see `apply` fail if it connects before the disk install has completed. Run terraform apply until it reconciles successfully.
You can now move on to the "Machines" section. Apply will loop until it can successfully copy the kubeconfig and etcd TLS assets to each node and start the one-time Kubernetes bootstrapping process on a controller. In practice, you may see `apply` fail if it connects before the disk install has completed. Run terraform apply until it reconciles successfully.
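One possible retry loop (a sketch, assuming `apply` runs non-interactively as it did with the Terraform versions these modules target):
```sh
# Retry until disk installs finish and the copy/bootstrap steps succeed
until terraform apply; do
  sleep 60
done
```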
## Machines
@@ -149,7 +151,9 @@ kube-system kube-scheduler-694795526-fks0b 1/1 Running 1
kube-system pod-checkpointer-node1.example.com 1/1 Running 2 10m
```
Try restarting machines or deleting pods to see that the cluster is resilient to failures.
## Addons
Install **important** cluster [addons](../../../Documentation/cluster-addons.md).
## Going Further

View File

@@ -28,10 +28,8 @@ resource "matchbox_group" "controller" {
domain_name = "${element(var.controller_domains, count.index)}"
etcd_name = "${element(var.controller_names, count.index)}"
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", var.controller_names, var.controller_domains))}"
etcd_endpoints = "${join(",", formatlist("https://%s:2379", var.controller_domains))}"
etcd_on_host = "${var.experimental_self_hosted_etcd ? "false" : "true"}"
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
k8s_etcd_service_ip = "${module.bootkube.etcd_service_ip}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}
@@ -48,10 +46,8 @@ resource "matchbox_group" "worker" {
metadata {
domain_name = "${element(var.worker_domains, count.index)}"
etcd_endpoints = "${join(",", formatlist("https://%s:2379", var.controller_domains))}"
etcd_on_host = "${var.experimental_self_hosted_etcd ? "false" : "true"}"
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
k8s_etcd_service_ip = "${module.bootkube.etcd_service_ip}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}

View File

@@ -29,19 +29,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
{{ if eq .etcd_on_host "false" -}}
Environment="LOCKSMITHD_ENDPOINT=https://{{.k8s_etcd_service_ip}}:2379"
{{ else }}
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
{{ end }}
mask: true
- name: kubelet.path
enable: true
contents: |

View File

@@ -4,19 +4,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
{{ if eq .etcd_on_host "false" -}}
Environment="LOCKSMITHD_ENDPOINT=https://{{.k8s_etcd_service_ip}}:2379"
{{ else }}
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
{{ end }}
mask: true
- name: kubelet.path
enable: true
contents: |