Commit Graph

289 Commits

Author SHA1 Message Date
Dalton Hubble
9de4267c28 Update CoreDNS from v1.6.7 to v1.7.0
* https://coredns.io/2020/06/15/coredns-1.7.0-release/
2020-07-25 13:08:29 -07:00
Dalton Hubble
835890025b Update Calico from v3.15.0 to v3.15.1
* https://docs.projectcalico.org/v3.15/release-notes/
2020-07-15 22:03:54 -07:00
Dalton Hubble
2bab6334ad Update Kubernetes from v1.18.5 to v1.18.6
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186
2020-07-15 21:55:02 -07:00
Dalton Hubble
9a5132b2ad Update Cilium from v1.8.0 to v1.8.1
* https://github.com/cilium/cilium/releases/tag/v1.8.1
2020-07-05 15:58:53 -07:00
Dalton Hubble
5a7c963caf Update Kubernetes from v1.18.4 to v1.18.5
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185
2020-06-27 13:49:10 -07:00
Dalton Hubble
5043456b05 Update Calico from v3.14.1 to v3.15.0
* https://docs.projectcalico.org/v3.15/release-notes/
2020-06-26 02:39:01 -07:00
Dalton Hubble
c014b77090 Update Cilium from v1.8.0-rc4 to v1.8.0
* https://github.com/cilium/cilium/releases/tag/v1.8.0
2020-06-22 22:25:38 -07:00
Dalton Hubble
1c07dfbc2a Remove experimental kube-router CNI provider 2020-06-21 21:55:56 -07:00
Dalton Hubble
af36c53936 Add experimental Cilium CNI provider
* Accept experimental CNI `networking` mode "cilium"
* Run Cilium v1.8.0 with overlay vxlan tunnels and a
minimal set of features. We're interested in:
  * IPAM: Divide pod_cidr into /24 subnets per node
  * CNI networking pod-to-pod, pod-to-external
  * BPF masquerade
  * NetworkPolicy as defined by Kubernetes (no L7)
* Continue using kube-proxy with Cilium probe mode
* Firewall changes:
  * Require UDP 8472 for vxlan (Linux kernel default) between nodes
  * Optional ICMP echo(8) between nodes for host reachability (health)
  * Optional TCP 4240 between nodes for host reachability (health)
2020-06-21 16:21:09 -07:00
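
For reference, the feature set above maps onto cilium-config ConfigMap settings along these lines (a minimal sketch using Cilium v1.8 option names; the values shown are illustrative, not the module's exact configuration):

```
# Hypothetical cilium-config excerpt matching the features above
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  ipam: "cluster-pool"                # divide pod_cidr into per-node subnets
  cluster-pool-ipv4-mask-size: "24"   # /24 subnet per node
  tunnel: "vxlan"                     # overlay tunnels over UDP 8472
  enable-bpf-masquerade: "true"       # BPF masquerade
  kube-proxy-replacement: "probe"     # keep kube-proxy; probe for BPF features
```
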
Dalton Hubble
e75697ce35 Rename controller node label and NoSchedule taint
* Use node label `node.kubernetes.io/controller` to select
controller nodes (action required)
* Tolerate node taint `node-role.kubernetes.io/controller`
for workloads that should run on controller nodes. Don't
tolerate `node-role.kubernetes.io/master` (action required)
2020-06-17 22:46:35 -07:00
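
With the rename, a workload that should run on controller nodes selects and tolerates them roughly as follows (a sketch; the label value "true" is an assumption, check your node labels):

```
# PodSpec excerpt for a workload pinned to controller nodes (sketch)
spec:
  nodeSelector:
    node.kubernetes.io/controller: "true"   # assumed label value
  tolerations:
    - key: node-role.kubernetes.io/controller
      operator: Exists
```
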
Dalton Hubble
3fe903d0ac Update Kubernetes from v1.18.3 to v1.18.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184
* Remove unused template file
2020-06-17 19:23:12 -07:00
Dalton Hubble
fc1a7bac89 Remove unused Kubelet certificate and key pair
* Kubelet certificate and key pair in state (not distributed)
are no longer needed with Kubelet TLS bootstrap
* https://github.com/poseidon/terraform-render-bootstrap/pull/185

Fix https://github.com/poseidon/typhoon/issues/757
2020-06-11 21:20:41 -07:00
Dalton Hubble
c3b1f23b5d Update Calico from v3.14.0 to v3.14.1
* https://docs.projectcalico.org/v3.14/release-notes/
2020-05-29 00:35:16 -07:00
Dalton Hubble
ff7ec52d0a Update Kubernetes from v1.18.2 to v1.18.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md
2020-05-20 20:34:42 -07:00
Dalton Hubble
a83ddbb30e Add CoreDNS "soft" nodeAffinity for controller nodes
* Add nodeAffinity to CoreDNS deployment PodSpec to
prefer running CoreDNS pods on controllers, while
relying on podAntiAffinity for spreading.
* For single master clusters, running two CoreDNS pods
on the master or running one pod on a worker is
permissible.
* Note: It's still _possible_ to end up with CoreDNS pods
all running on workers since we only express scheduling
preference ("soft"), but unlikely. Plus the motivating
scenario (below) is also rare.

Background:

* CoreDNS replicas are set to the higher of 2 or the
number of control plane nodes to (at a minimum) support
Deployment updates or pod restarts and match the cluster
size (e.g. 5 master/controller nodes likely means a
larger cluster, so run 5 CoreDNS replicas)
* In the past (before v1.14), we required kube-dns (the
CoreDNS predecessor) pods to run on master nodes. With
CoreDNS this node selection was relaxed. We'd like a
gentler form of it now.

Motivation:

* On clusters using 100% preemptible/spot workers, it is
possible that CoreDNS pods schedule to workers that are all
preempted at the same time, causing a loss of cluster internal
DNS service until a CoreDNS pod reschedules (1 min). We'd like
CoreDNS to prefer controller/master nodes (which aren't preempted)
to reduce the possibility of control plane disruption
2020-05-09 22:48:56 -07:00
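
The "soft" preference described above reads roughly like this in the CoreDNS Deployment PodSpec (a sketch; the weight and label key are assumptions based on the controller label used elsewhere in this log):

```
# CoreDNS Deployment PodSpec excerpt (sketch)
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: node.kubernetes.io/controller
              operator: Exists
```
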
Dalton Hubble
157336db92 Update Calico from v3.13.3 to v3.14.0
* https://docs.projectcalico.org/v3.14/release-notes/
2020-05-09 16:02:38 -07:00
Dalton Hubble
1dc36b58b8 Fix Calico node crash loop on Pod restart
* Set a consistent MCS level/range for Calico install-cni
* Note: Rebooting a node was a workaround, because Kubelet
relabels /etc/kubernetes(/cni/net.d)

Background:

* On SELinux enforcing systems, the Calico CNI install-cni
container ran with default SELinux context and a random MCS
pair. install-cni places CNI configs by first creating a
temporary file and then moving them into place, which means
the file MCS categories depend on the container's SELinux
context.
* A calico-node Pod restart creates a new install-cni container
with a different MCS pair that cannot access the earlier
written file (it places configs every time), causing the
init container to error and calico-node to crash loop
* https://github.com/projectcalico/cni-plugin/issues/874

```
mv: inter-device move failed: '/calico.conf.tmp' to
'/host/etc/cni/net.d/10-calico.conflist'; unable to remove target:
Permission denied
Failed to mv files. This may be caused by selinux configuration on the
host, or something else.
```

Note, this isn't a host SELinux configuration issue.
2020-05-09 15:20:06 -07:00
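
The fix amounts to pinning a consistent SELinux level on the install-cni container, so a re-created container can replace files written by its predecessor (a sketch; the level shown is illustrative):

```
# calico-node DaemonSet initContainer excerpt (sketch)
initContainers:
  - name: install-cni
    securityContext:
      seLinuxOptions:
        level: "s0"   # consistent MCS level across Pod restarts
```
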
Dalton Hubble
924beb4b0c Enable Kubelet TLS bootstrap and NodeRestriction
* Enable bootstrap token authentication on kube-apiserver
* Generate the bootstrap.kubernetes.io/token Secret that
may be used as a bootstrap token
* Generate a bootstrap kubeconfig (with a bootstrap token)
to be securely distributed to nodes. Each Kubelet will use
the bootstrap kubeconfig to authenticate to kube-apiserver
as `system:bootstrappers` and send a node-unique CSR for
kube-controller-manager to automatically approve to issue
a Kubelet certificate and kubeconfig (expires in 72 hours)
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the `system:node-bootstrapper`
ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr nodeclient ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr selfnodeclient ClusterRole
* Enable NodeRestriction admission controller to limit the
scope of Node or Pod objects a Kubelet can modify to those of
the node itself
* Ability for a Kubelet to delete its Node object is retained
as preemptible nodes or those in auto-scaling instance groups
need to be able to remove themselves on shutdown. This need
continues to have precedence over any risk of a node deleting
itself maliciously

Security notes:

1. Issued Kubelet certificates authenticate as user `system:node:NAME`
and group `system:nodes` and are limited in their authorization
to perform API operations by Node authorization and NodeRestriction
admission. Previously, a Kubelet's authorization was broader. This
is the primary security motivation.

2. The bootstrap kubeconfig credential has the same sensitivity
as the previous generated TLS client-certificate kubeconfig.
It must be distributed securely to nodes. Its compromise still
allows an attacker to obtain a Kubelet kubeconfig

3. Bootstrapping Kubelet kubeconfigs with a limited lifetime offers
a slight security improvement.
  * An attacker who obtains the kubeconfig can likely obtain the
  bootstrap kubeconfig as well, to obtain the ability to renew
  their access
* A compromised bootstrap kubeconfig could plausibly be handled
by replacing the bootstrap token Secret, distributing the new
token to nodes, and letting the old credentials expire, whereas
a compromised TLS client-certificate kubeconfig can't be revoked
(no CRL). However,
  replacing a bootstrap token can be impractical in real cluster
  environments, so the limited lifetime is mostly a theoretical
  benefit.
  * Cluster CSR objects are visible via kubectl, which is nice

4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet
clients have more identity information, which can improve the
utility of audits and future features

Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/
2020-04-25 19:38:56 -07:00
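
The generated Secret follows the standard Kubernetes bootstrap token format, along these lines (a sketch; the token values are placeholders):

```
apiVersion: v1
kind: Secret
metadata:
  name: bootstrap-token-07401b   # must be bootstrap-token-<token-id>
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  token-id: "07401b"                       # placeholder
  token-secret: "f395accd246ae52d"         # placeholder
  usage-bootstrap-authentication: "true"   # usable to authenticate to kube-apiserver
  usage-bootstrap-signing: "true"          # usable to sign the cluster-info ConfigMap
```
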
Dalton Hubble
c62c7f5a1a Update Calico from v3.13.1 to v3.13.3
* https://docs.projectcalico.org/v3.13/release-notes/
2020-04-22 20:26:32 -07:00
Dalton Hubble
14d0b20879 Update Kubernetes from v1.18.1 to v1.18.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#downloads-for-v1182
2020-04-16 23:33:42 -07:00
Dalton Hubble
1ad53d3b1c Update Kubernetes from v1.18.0 to v1.18.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md
2020-04-08 19:37:27 -07:00
Dalton Hubble
45dc2f5c0c Update flannel from v0.11.0 to v0.12.0
* https://github.com/coreos/flannel/releases/tag/v0.12.0
2020-03-31 18:23:57 -07:00
Dalton Hubble
42723d13a6 Change default kube-system DaemonSet tolerations
* Change kube-proxy, flannel, and calico-node DaemonSet
tolerations to tolerate `node.kubernetes.io/not-ready`
and `node-role.kubernetes.io/master` (i.e. controllers)
explicitly, rather than tolerating all taints
* kube-system DaemonSets will no longer tolerate custom
node taints by default. Instead, custom node taints must
be enumerated to opt-in to scheduling/executing the
kube-system DaemonSets.

Background: Tolerating all taints ruled out use-cases
where certain nodes might legitimately need to keep
kube-proxy or CNI networking disabled
2020-03-25 22:43:50 -07:00
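
After this change, the tolerations are enumerated explicitly rather than tolerating everything, roughly (a PodSpec excerpt sketch):

```
# kube-proxy / flannel / calico-node DaemonSet PodSpec excerpt (sketch)
tolerations:
  - key: node-role.kubernetes.io/master   # run on controllers
    operator: Exists
  - key: node.kubernetes.io/not-ready     # run before nodes are Ready
    operator: Exists
```
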
Dalton Hubble
cb170f802d Update Kubernetes from v1.17.4 to v1.18.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md
2020-03-25 17:47:30 -07:00
Dalton Hubble
e76f0a09fa Switch from upstream hyperkube to component images
* Kubernetes plans to stop releasing the hyperkube image in
the future.
* Upstream will continue releasing container images for
`kube-apiserver`, `kube-controller-manager`, `kube-proxy`,
and `kube-scheduler`. Typhoon will use these images
* Upstream will release the kubelet as a binary for distros
to package, either as a traditional DEB/RPM or as a container
image for container-optimized operating systems. Typhoon will
take on the packaging of Kubelet and its dependencies as a new
container image (alongside kubectl)

Rel: https://github.com/kubernetes/kubernetes/pull/88676
See: https://github.com/poseidon/kubelet
2020-03-17 22:13:42 -07:00
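
Concretely, static pod manifests move from a shared hyperkube image to per-component upstream images (a sketch; the tag is illustrative):

```
# kube-apiserver static pod excerpt (sketch)
containers:
  - name: kube-apiserver
    image: k8s.gcr.io/kube-apiserver:v1.18.0   # upstream component image
```

Kubelet and kubectl are packaged separately in the poseidon/kubelet image linked above.
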
Dalton Hubble
73784c1b2c Update Kubernetes from v1.17.3 to v1.17.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1174
2020-03-12 22:57:14 -07:00
Dalton Hubble
804029edd5 Update Calico from v3.12.0 to v3.13.1
* https://docs.projectcalico.org/v3.13/release-notes/
2020-03-12 22:55:57 -07:00
Dalton Hubble
d1831e626a Update CoreDNS from v1.6.6 to v1.6.7
* https://coredns.io/2020/01/28/coredns-1.6.7-release/
2020-02-17 14:24:17 -08:00
Dalton Hubble
7961945834 Update Kubernetes from v1.17.2 to v1.17.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1173
2020-02-11 20:18:02 -08:00
Dalton Hubble
1ea8fe7a85 Update Calico from v3.11.2 to v3.12.0
* https://docs.projectcalico.org/release-notes/#v3120
2020-02-06 00:03:00 -08:00
Dalton Hubble
05297b94a9 Update Kubernetes from v1.17.1 to v1.17.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1172
2020-01-21 18:25:46 -08:00
Dalton Hubble
de85f1da7d Update Calico from v3.11.1 to v3.11.2
* https://docs.projectcalico.org/v3.11/release-notes/
2020-01-18 13:37:17 -08:00
Dalton Hubble
5ce4fc6953 Update Kubernetes from v1.17.0 to v1.17.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1171
2020-01-14 20:17:30 -08:00
Dalton Hubble
ac4b7af570 Configure kube-proxy to serve /metrics on 0.0.0.0:10249
* Set kube-proxy --metrics-bind-address to 0.0.0.0 (default
127.0.0.1) so Prometheus metrics can be scraped
* Add pod port list (informational only)
* Require node firewall rules to be updated before scrapes
can succeed
2019-12-29 11:56:52 -08:00
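
The change is a single flag on the kube-proxy container (a sketch; other flags elided):

```
# kube-proxy DaemonSet container excerpt (sketch)
command:
  - kube-proxy
  - --metrics-bind-address=0.0.0.0   # default 127.0.0.1; metrics port remains 10249
```
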
Dalton Hubble
c8c21deb76 Update Calico from v3.10.2 to v3.11.1
* https://docs.projectcalico.org/v3.11/release-notes/
2019-12-28 10:51:11 -08:00
Dalton Hubble
f021d9cb34 Update CoreDNS from v1.6.5 to v1.6.6
* https://coredns.io/2019/12/11/coredns-1.6.6-release/
2019-12-22 10:41:43 -05:00
Dalton Hubble
24e5513ee6 Update Calico from v3.10.1 to v3.10.2
* https://docs.projectcalico.org/v3.10/release-notes/
2019-12-09 20:56:18 -08:00
Dalton Hubble
0ddd90fd05 Update Kubernetes from v1.16.3 to v1.17.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md/#v1170
v0.16.0
2019-12-09 18:29:06 -08:00
Dalton Hubble
4369c706e2 Restore kube-controller-manager settings lost in static pod migration
* Migration from a self-hosted to a static pod control plane dropped
a few kube-controller-manager customizations
* Reduce kube-controller-manager --pod-eviction-timeout from 5m to 1m
to move pods more quickly when nodes are preempted
* Fix flex-volume-plugin-dir since the Kubernetes default points to
a read-only filesystem on Container Linux / Fedora CoreOS

Related:

* https://github.com/poseidon/terraform-render-bootstrap/pull/148
* 7b06557b7a
2019-12-08 22:37:36 -08:00
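
The restored settings look roughly like this in the static pod manifest (a sketch; the flex volume path is illustrative of a writable location):

```
# kube-controller-manager static pod excerpt (sketch)
command:
  - kube-controller-manager
  - --pod-eviction-timeout=1m                                # default was 5m
  - --flex-volume-plugin-dir=/var/lib/kubelet/volumeplugins  # writable path
```
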
Dalton Hubble
7df6bd8d1e Tune static pod CPU requests slightly lower
* Reduce kube-apiserver and kube-controller-manager CPU
requests from 200m to 150m. Prefer slightly lower commitment
after running with the requests chosen in #161 for a while
* Reduce calico-node CPU request from 150m to 100m to match
CoreDNS and flannel
2019-12-08 22:25:58 -08:00
Dalton Hubble
dce49114a0 Fix terraform format with fmt 2019-12-05 01:02:01 -08:00
Dalton Hubble
50a221e042 Annotate sensitive output variables to suppress display
* Annotate terraform output variables containing generated TLS
credentials and kubeconfigs as sensitive to suppress / mask
them in terraform CLI display.
* Allow for easier use in automation systems and logged environments
2019-12-05 00:57:07 -08:00
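
Marking an output sensitive is a one-line annotation (a sketch; the output name and referenced local are illustrative):

```
# Terraform output excerpt (sketch)
output "kubeconfig-admin" {
  value     = local.kubeconfig_admin   # illustrative local value
  sensitive = true                     # suppress in terraform CLI display
}
```
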
Dalton Hubble
4d7484f72a Change asset_dir variable from required to optional
* `asset_dir` is an absolute path to a directory where generated
assets from terraform-render-bootstrap are written (sensitive)
* Change `asset_dir` to default to "" so no assets are written
(favor Terraform output mechanisms). Previously, asset_dir was
required so all users set some path. To take advantage of the
new optionality, remove asset_dir or set it to ""
2019-12-05 00:56:54 -08:00
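
After this change, the variable declaration reads roughly (a sketch; the description wording is illustrative):

```
variable "asset_dir" {
  type        = string
  description = "Absolute path to a directory where generated assets should be written (empty disables writing)"
  default     = ""
}
```
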
Dalton Hubble
6c7ba3864f Introduce a Terraform output map with distribution assets
* Introduce a new `assets_dist` output variable that provides
a mapping from suggested asset paths to asset contents (for
assets that should be distributed to controller nodes). This
new output format is intended to align with a modified asset
distribution style in Typhoon.
* Lay the groundwork for `asset_dir` to become optional. The
output map provides output variable access to the minimal assets
that are required for bootstrap
* Assets that aren't required for bootstrap itself (e.g.
the etcd CA key) but can be used by admins may later be added
as specific output variables to further reduce asset_dir use

Background:

* `terraform-render-bootstrap` rendered assets were previously
only provided by rendering files to an `asset_dir`. This was
necessary, but created a responsibility to maintain those
assets on the machine where terraform apply was run
2019-12-04 20:15:40 -08:00
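
The new output maps suggested asset paths to contents, along these lines (a sketch; the keys and referenced locals are illustrative):

```
# Terraform output excerpt (sketch)
output "assets_dist" {
  value = {
    "auth/kubeconfig-kubelet" = local.kubeconfig_kubelet   # illustrative
    "tls/ca.crt"              = local.kube_ca_cert         # illustrative
  }
}
```
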
Dalton Hubble
8005052cfb Remove unused raw kubeconfig field outputs
* Remove unused `ca_cert`, `kubelet_cert`, `kubelet_key`,
and `server` outputs
* These outputs were once needed to support clusters with
managed instance groups, but that hasn't been the case for
quite some time
v0.15.0
2019-11-13 16:49:07 -08:00
Dalton Hubble
0f1f16c612 Add small CPU resource requests to static pods
* Set small CPU requests on static pods kube-apiserver,
kube-controller-manager, and kube-scheduler to align with
upstream tooling and for edge cases
* Control plane nodes are tainted to isolate them from
ordinary workloads. Even dense workloads can only compress
CPU resources on worker nodes.
* Control plane static pods use the highest priority class, so
contention favors control plane pods (over say node-exporter)
and CPU is compressible too.
* Effectively, a practical case for these requests hasn't been
observed. However, a small static pod CPU request may offer
a slight benefit if a controller became overloaded and the
above mechanisms were insufficient for some reason (bit of a
stretch, due to CPU compressibility)
* Continue to avoid setting a memory request for static pods.
It would impose a hard size requirement on controller nodes,
which isn't warranted and is handled more gently by Typhoon
default instance types across clouds and via docs
2019-11-13 16:44:33 -08:00
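
For illustration, the request is small and CPU-only (a sketch; see commit 7df6bd8d1e above for the values eventually chosen):

```
# static pod container excerpt (sketch)
resources:
  requests:
    cpu: 150m   # no memory request, per the reasoning above
```
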
Dalton Hubble
43e1230c55 Update CoreDNS from v1.6.2 to v1.6.5
* Add health `lameduck` option 5s. Before CoreDNS shuts down,
it will wait and report unhealthy for 5s to allow time for
plugins to shut down cleanly
* Minor bug fixes over a few releases
* https://coredns.io/2019/08/31/coredns-1.6.3-release/
* https://coredns.io/2019/09/27/coredns-1.6.4-release/
* https://coredns.io/2019/11/05/coredns-1.6.5-release/
2019-11-13 14:33:50 -08:00
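
The `lameduck` option lives in the Corefile's health stanza, e.g. (a ConfigMap sketch with other plugins elided):

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
      health {
        lameduck 5s   # report unhealthy for 5s before shutting down
      }
      forward . /etc/resolv.conf
    }
```
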
Dalton Hubble
1bba891d95 Adopt Terraform v0.12 templatefile function
* Adopt Terraform v0.12 type and templatefile function
features to replace the use of terraform-provider-template's
`template_dir`
* Use of `for_each` to write local assets requires
that consumers use Terraform v0.12.6+ (action required)
* Continue use of `template_file` as it's quite common. In
future, we may replace it as well.
* Remove outputs `id` and `content_hash` (no longer used)

Background:

* `template_dir` was added to `terraform-provider-template`
to add support for template directory rendering in the CoreOS
Tectonic Kubernetes distribution (~2017)
* Terraform v0.12 introduced a native `templatefile` function
and v0.12.6 introduced native `for_each` support (July 2019)
that makes it possible to replace `template_dir` usage
2019-11-13 14:05:01 -08:00
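
The replacement pattern pairs `templatefile` with `for_each`, roughly (a sketch; paths, names, and template variables are illustrative, and `fileset` itself requires Terraform v0.12.8+):

```
# Render each manifest template into asset_dir (sketch)
resource "local_file" "manifests" {
  for_each = fileset("${path.module}/resources/manifests", "*.yaml")

  filename = "${var.asset_dir}/manifests/${each.value}"
  content = templatefile(
    "${path.module}/resources/manifests/${each.value}",
    { cluster_name = var.cluster_name }   # illustrative template vars
  )
}
```
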
Dalton Hubble
0daa1276c6 Update Kubernetes from v1.16.2 to v1.16.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1163
2019-11-13 13:02:01 -08:00
Dalton Hubble
a2b1dbe2c0 Update Calico from v3.10.0 to v3.10.1
* https://docs.projectcalico.org/v3.10/release-notes/
2019-11-07 11:07:15 -08:00