Merge pull request #608 from coreos/locksmithd-to-cluo

Switch Kubernetes clusters from locksmith to Container Linux Update Operator
This commit is contained in:
Dalton Hubble
2017-07-17 11:26:14 -07:00
committed by GitHub
17 changed files with 139 additions and 71 deletions

View File

@@ -11,7 +11,10 @@ Notable changes between releases.
### Examples / Modules
* Upgrade Kubernetes v1.6.6 example clusters
* Kubernetes example clusters enable etcd TLS (unless experimental self-hosted etcd is enabled)
* Kubernetes example clusters enable etcd TLS
* Deploy the Container Linux Update Operator (CLUO) to coordinate reboots of Container Linux nodes in Kubernetes clusters. See the cluster [addon docs](Documentation/cluster-addons.md).
* Kubernetes examples (terraform and non-terraform) mask locksmithd
* Terraform modules `bootkube` and `profiles` (Kubernetes) mask locksmithd
## v0.6.1 (2017-05-25)

View File

@@ -60,32 +60,32 @@ Client machines should boot and provision themselves. Local client VMs should ne
We're ready to use bootkube to create a temporary control plane and bootstrap a self-hosted Kubernetes cluster.
Secure copy the etcd TLS assets to `/etc/ssl/etcd/*` on **every** node.
Secure copy the etcd TLS assets to `/etc/ssl/etcd/*` on **every controller** node.
```bash
for node in 'node1' 'node2' 'node3'; do
```sh
for node in 'node1'; do
scp -r assets/tls/etcd-* assets/tls/etcd core@$node.example.com:/home/core/
ssh core@$node.example.com 'sudo mkdir -p /etc/ssl/etcd && sudo mv etcd-* etcd /etc/ssl/etcd/ && sudo chown -R etcd:etcd /etc/ssl/etcd && sudo chmod -R 500 /etc/ssl/etcd/'
done
```
Secure copy the `kubeconfig` to `/etc/kubernetes/kubeconfig` on **every** node which will path activate the `kubelet.service`.
Secure copy the `kubeconfig` to `/etc/kubernetes/kubeconfig` on **every node** to path activate the `kubelet.service`.
```bash
```sh
for node in 'node1' 'node2' 'node3'; do
scp assets/auth/kubeconfig core@$node.example.com:/home/core/kubeconfig
ssh core@$node.example.com 'sudo mv kubeconfig /etc/kubernetes/kubeconfig'
done
```
Secure copy the `bootkube` generated assets to any controller node and run `bootkube-start` (takes ~10 minutes).
Secure copy the `bootkube` generated assets to **any controller** node and run `bootkube-start` (takes ~10 minutes).
```sh
scp -r assets core@node1.example.com:/home/core
ssh core@node1.example.com 'sudo mv assets /opt/bootkube/assets && sudo systemctl start bootkube'
```
Optionally watch the Kubernetes control plane bootstrapping with the bootkube temporary api-server. You will see quite a bit of output.
Watch the Kubernetes control plane bootstrapping with the bootkube temporary api-server. You will see quite a bit of output.
```sh
$ ssh core@node1.example.com 'journalctl -f -u bootkube'
@@ -96,11 +96,11 @@ $ ssh core@node1.example.com 'journalctl -f -u bootkube'
[ 299.311743] bootkube[5]: All self-hosted control plane components successfully started
```
You may cleanup the `bootkube` assets on the node, but you should keep the copy on your laptop. It contains a `kubeconfig` used to access the cluster.
[Verify](#verify) the Kubernetes cluster is accessible once complete. Then install **important** cluster [addons](cluster-addons.md). You may cleanup the `bootkube` assets on the node, but you should keep the copy on your laptop. It contains a `kubeconfig` used to access the cluster.
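A minimal cleanup sketch, assuming the assets were copied to `node1.example.com` as above:
```sh
# Remove the bootstrap assets from the controller node; keep the local copy,
# since assets/auth/kubeconfig is needed to access the cluster.
ssh core@node1.example.com 'sudo rm -rf /opt/bootkube/assets'
```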
## Verify
[Install kubectl](https://coreos.com/kubernetes/docs/latest/configure-kubectl.html) on your laptop. Use the generated kubeconfig to access the Kubernetes cluster. Verify that the cluster is accessible and that the kubelet, apiserver, scheduler, and controller-manager are running as pods.
[Install kubectl](https://coreos.com/kubernetes/docs/latest/configure-kubectl.html) on your laptop. Use the generated kubeconfig to access the Kubernetes cluster. Verify that the cluster is accessible and that the apiserver, scheduler, and controller-manager are running as pods.
```sh
$ export KUBECONFIG=assets/auth/kubeconfig
@@ -128,7 +128,9 @@ kube-system pod-checkpointer-hb960 1/1 Running 0
kube-system pod-checkpointer-hb960-node1.example.com 1/1 Running 0 6m
```
Try deleting pods to see that the cluster is resilient to failures and machine restarts (Container Linux auto-updates).
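For example (the pod name below is illustrative; pick one from your own cluster):
```sh
# Delete a control plane pod and watch its Deployment/DaemonSet recreate it
kubectl --kubeconfig=assets/auth/kubeconfig -n kube-system delete pod kube-scheduler-694795526-fks0b
kubectl --kubeconfig=assets/auth/kubeconfig -n kube-system get pods -w
```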
## Addons
Install **important** cluster [addons](cluster-addons.md).
## Going further

View File

@@ -0,0 +1,30 @@
## Cluster Addons
Kubernetes clusters run cluster addons atop Kubernetes itself. Addons may be considered essential for bootstrapping (non-optional), important (highly recommended), or optional.
## Essential
Several addons are considered essential. CoreOS cluster creation tools include them, as do Kubernetes clusters deployed via the Matchbox examples or our Terraform modules.
### kube-proxy
`kube-proxy` is deployed as a DaemonSet.
### kube-dns
`kube-dns` is deployed as a Deployment.
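A quick sanity check that both essential addons are present:
```sh
# List DaemonSets and Deployments in kube-system; kube-proxy and kube-dns should appear
kubectl -n kube-system get daemonsets,deployments
```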
## Important
### Container Linux Update Operator
The [Container Linux Update Operator](https://github.com/coreos/container-linux-update-operator) (CLUO) coordinates reboots of auto-updating Container Linux nodes so that only one node reboots at a time and nodes are drained before rebooting. CLUO enables the auto-update behavior Container Linux clusters are known for, but does so in a Kubernetes-native way. Deploying CLUO is strongly recommended.
Create the `update-operator` deployment and `update-agent` DaemonSet.
```sh
kubectl apply -f examples/addons/cluo/update-operator.yaml
kubectl apply -f examples/addons/cluo/update-agent.yaml
```
*Note: CLUO replaces `locksmithd` reboot coordination. The `update_engine` systemd unit on each host still performs the Container Linux update check, download, and install to the inactive partition.*
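To verify the handoff (a sketch, assuming SSH access as the `core` user), check that `locksmithd` is masked while `update-engine` keeps running, and that the CLUO pods are up:
```sh
ssh core@node1.example.com 'systemctl is-enabled locksmithd.service; systemctl is-active update-engine.service'
kubectl -n kube-system get pods -l app=container-linux-update-agent
kubectl -n kube-system get pods -l app=container-linux-update-operator
```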

View File

@@ -0,0 +1,55 @@
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: container-linux-update-agent
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        app: container-linux-update-agent
        container-linux-update.v1.coreos.com/agent-version: v0.2.1
      annotations:
        container-linux-update.v1.coreos.com/agent-version: v0.2.1
    spec:
      containers:
      - name: update-agent
        image: quay.io/coreos/container-linux-update-operator:v0.2.1
        command:
        - "/bin/update-agent"
        volumeMounts:
        - mountPath: /var/run/dbus
          name: var-run-dbus
        - mountPath: /etc/coreos
          name: etc-coreos
        - mountPath: /usr/share/coreos
          name: usr-share-coreos
        - mountPath: /etc/os-release
          name: etc-os-release
        env:
        # read by update-agent as the node name to manage reboots for
        - name: UPDATE_AGENT_NODE
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      volumes:
      - name: var-run-dbus
        hostPath:
          path: /var/run/dbus
      - name: etc-coreos
        hostPath:
          path: /etc/coreos
      - name: usr-share-coreos
        hostPath:
          path: /usr/share/coreos
      - name: etc-os-release
        hostPath:
          path: /etc/os-release

View File

@@ -0,0 +1,23 @@
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: container-linux-update-operator
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: container-linux-update-operator
    spec:
      containers:
      - name: update-operator
        image: quay.io/coreos/container-linux-update-operator:v0.2.1
        command:
        - "/bin/update-operator"
        - "--manage-agent=false"
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace

View File

@@ -9,11 +9,11 @@
"metadata": {
"domain_name": "node1.example.com",
"etcd_initial_cluster": "node1=https://node1.example.com:2380",
"etcd_endpoints": "https://node1.example.com:2379",
"etcd_name": "node1",
"k8s_dns_service_ip": "10.3.0.10",
"ssh_authorized_keys": [
"ADD ME"
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPQFdwVLr+alsWIgYRz9OdqDhnx9jjuFbkdSdpqq4gd9uZApYlivMDD4UgjFazQpezx8DiNhu9ym7i6LgAcdwi+10hE4L9yoJv9uBgbBxOAd65znqLqF91NtV4mlKP5YfJtR7Ehs+pTB+IIC+o5veDbPn+BYgDMJ2x7Osbn1/gFSDken/yoOFbYbRMGMfVEQYjJzC4r/qCKH0bl/xuVNLxf9FkWSTCcQFKGOndwuGITDkshD4r2Kk8gUddXPxoahBv33/2QH0CY5zbKYjhgN6I6WtwO+O1uJwtNeV1AGhYjurdd60qggNwx+W7623uK3nIXvJd3hzDO8u5oa53/tIL fake-test-key-REMOVE-ME"
]
}
}

View File

@@ -8,10 +8,9 @@
},
"metadata": {
"domain_name": "node2.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"ssh_authorized_keys": [
"ADD ME"
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPQFdwVLr+alsWIgYRz9OdqDhnx9jjuFbkdSdpqq4gd9uZApYlivMDD4UgjFazQpezx8DiNhu9ym7i6LgAcdwi+10hE4L9yoJv9uBgbBxOAd65znqLqF91NtV4mlKP5YfJtR7Ehs+pTB+IIC+o5veDbPn+BYgDMJ2x7Osbn1/gFSDken/yoOFbYbRMGMfVEQYjJzC4r/qCKH0bl/xuVNLxf9FkWSTCcQFKGOndwuGITDkshD4r2Kk8gUddXPxoahBv33/2QH0CY5zbKYjhgN6I6WtwO+O1uJwtNeV1AGhYjurdd60qggNwx+W7623uK3nIXvJd3hzDO8u5oa53/tIL fake-test-key-REMOVE-ME"
]
}
}

View File

@@ -8,10 +8,9 @@
},
"metadata": {
"domain_name": "node3.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"ssh_authorized_keys": [
"ADD ME"
"ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPQFdwVLr+alsWIgYRz9OdqDhnx9jjuFbkdSdpqq4gd9uZApYlivMDD4UgjFazQpezx8DiNhu9ym7i6LgAcdwi+10hE4L9yoJv9uBgbBxOAd65znqLqF91NtV4mlKP5YfJtR7Ehs+pTB+IIC+o5veDbPn+BYgDMJ2x7Osbn1/gFSDken/yoOFbYbRMGMfVEQYjJzC4r/qCKH0bl/xuVNLxf9FkWSTCcQFKGOndwuGITDkshD4r2Kk8gUddXPxoahBv33/2QH0CY5zbKYjhgN6I6WtwO+O1uJwtNeV1AGhYjurdd60qggNwx+W7623uK3nIXvJd3hzDO8u5oa53/tIL fake-test-key-REMOVE-ME"
]
}
}

View File

@@ -8,7 +8,6 @@
"metadata": {
"domain_name": "node1.example.com",
"etcd_initial_cluster": "node1=https://node1.example.com:2380",
"etcd_endpoints": "https://node1.example.com:2379",
"etcd_name": "node1",
"k8s_dns_service_ip": "10.3.0.10",
"pxe": "true",

View File

@@ -7,7 +7,6 @@
},
"metadata": {
"domain_name": "node2.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"pxe": "true",
"ssh_authorized_keys": [

View File

@@ -7,7 +7,6 @@
},
"metadata": {
"domain_name": "node3.example.com",
"etcd_endpoints": "https://node1.example.com:2379",
"k8s_dns_service_ip": "10.3.0.10",
"pxe": "true",
"ssh_authorized_keys": [

View File

@@ -27,15 +27,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
mask: true
- name: kubelet.path
enable: true
contents: |

View File

@@ -4,15 +4,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
mask: true
- name: kubelet.path
enable: true
contents: |

View File

@@ -64,9 +64,11 @@ Note: The `cached-container-linux-install` profile will PXE boot and install Con
You may set certain optional variables to override defaults. Set `experimental_self_hosted_etcd = "true"` to deploy "self-hosted" etcd atop Kubernetes instead of running etcd on hosts directly.
```hcl
# Optional (defaults)
# cached_install = "false"
# install_disk = "/dev/sda"
# container_linux_oem = ""
# experimental_self_hosted_etcd = "true"
# experimental_self_hosted_etcd = "false"
```
The default creates an example Kubernetes cluster with 1 controller and 2 workers, but check `multi-controller.tfvars.example` for a variant that defines 3 controllers and 1 worker.
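For instance (a sketch; Terraform loads `terraform.tfvars` automatically):
```sh
# Use the multi-controller example variables and preview the plan
cp multi-controller.tfvars.example terraform.tfvars
terraform plan
```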
@@ -101,7 +103,7 @@ module.cluster.null_resource.bootkube-start: Still creating... (8m40s elapsed)
Apply complete! Resources: 37 added, 0 changed, 0 destroyed.
```
You can now move on to the "Machines" section. Apply will loop until it can successfully copy the kubeconfig to each node and start the one-time Kubernetes bootstrapping process on a controller. In practice, you may see `apply` fail if it connects before the disk install has completed. Run terraform apply until it reconciles successfully.
You can now move on to the "Machines" section. Apply will loop until it can successfully copy the kubeconfig and etcd TLS assets to each node and start the one-time Kubernetes bootstrapping process on a controller. In practice, you may see `apply` fail if it connects before the disk install has completed. Run terraform apply until it reconciles successfully.
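One possible retry loop (a sketch, assuming `apply` runs non-interactively as it did with the Terraform versions these modules target):
```sh
# Retry until disk installs finish and the copy/bootstrap steps succeed
until terraform apply; do
  sleep 60
done
```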
## Machines
@@ -149,7 +151,9 @@ kube-system kube-scheduler-694795526-fks0b 1/1 Running 1
kube-system pod-checkpointer-node1.example.com 1/1 Running 2 10m
```
Try restarting machines or deleting pods to see that the cluster is resilient to failures.
## Addons
Install **important** cluster [addons](../../../Documentation/cluster-addons.md).
## Going Further

View File

@@ -28,10 +28,8 @@ resource "matchbox_group" "controller" {
domain_name = "${element(var.controller_domains, count.index)}"
etcd_name = "${element(var.controller_names, count.index)}"
etcd_initial_cluster = "${join(",", formatlist("%s=https://%s:2380", var.controller_names, var.controller_domains))}"
etcd_endpoints = "${join(",", formatlist("https://%s:2379", var.controller_domains))}"
etcd_on_host = "${var.experimental_self_hosted_etcd ? "false" : "true"}"
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
k8s_etcd_service_ip = "${module.bootkube.etcd_service_ip}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}
@@ -48,10 +46,8 @@ resource "matchbox_group" "worker" {
metadata {
domain_name = "${element(var.worker_domains, count.index)}"
etcd_endpoints = "${join(",", formatlist("https://%s:2379", var.controller_domains))}"
etcd_on_host = "${var.experimental_self_hosted_etcd ? "false" : "true"}"
k8s_dns_service_ip = "${module.bootkube.kube_dns_service_ip}"
k8s_etcd_service_ip = "${module.bootkube.etcd_service_ip}"
ssh_authorized_key = "${var.ssh_authorized_key}"
}
}

View File

@@ -29,19 +29,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
{{ if eq .etcd_on_host "false" -}}
Environment="LOCKSMITHD_ENDPOINT=https://{{.k8s_etcd_service_ip}}:2379"
{{ else }}
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
{{ end }}
mask: true
- name: kubelet.path
enable: true
contents: |

View File

@@ -4,19 +4,7 @@ systemd:
- name: docker.service
enable: true
- name: locksmithd.service
dropins:
- name: 40-etcd-lock.conf
contents: |
[Service]
Environment="REBOOT_STRATEGY=etcd-lock"
Environment="LOCKSMITHD_ETCD_CAFILE=/etc/ssl/etcd/etcd-client-ca.crt"
Environment="LOCKSMITHD_ETCD_CERTFILE=/etc/ssl/etcd/etcd-client.crt"
Environment="LOCKSMITHD_ETCD_KEYFILE=/etc/ssl/etcd/etcd-client.key"
{{ if eq .etcd_on_host "false" -}}
Environment="LOCKSMITHD_ENDPOINT=https://{{.k8s_etcd_service_ip}}:2379"
{{ else }}
Environment="LOCKSMITHD_ENDPOINT={{.etcd_endpoints}}"
{{ end }}
mask: true
- name: kubelet.path
enable: true
contents: |