252 Commits

Author SHA1 Message Date
Dalton Hubble
0ddd90fd05 Update Kubernetes from v1.16.3 to v1.17.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md/#v1170
v0.16.0
2019-12-09 18:29:06 -08:00
Dalton Hubble
4369c706e2 Restore kube-controller-manager settings lost in static pod migration
* Migration from a self-hosted to a static pod control plane dropped
a few kube-controller-manager customizations
* Reduce kube-controller-manager --pod-eviction-timeout from 5m to 1m
to move pods more quickly when nodes are preempted
* Fix flex-volume-plugin-dir since the Kubernetes default points to
a read-only filesystem on Container Linux / Fedora CoreOS

Related:

* https://github.com/poseidon/terraform-render-bootstrap/pull/148
* 7b06557b7a
2019-12-08 22:37:36 -08:00
Dalton Hubble
7df6bd8d1e Tune static pod CPU requests slightly lower
* Reduce kube-apiserver and kube-controller-manager CPU
requests from 200m to 150m. Prefer slightly lower commitment
after running with the requests chosen in #161 for a while
* Reduce calico-node CPU request from 150m to 100m to match
CoreDNS and flannel
2019-12-08 22:25:58 -08:00
Dalton Hubble
dce49114a0 Fix terraform format with fmt
2019-12-05 01:02:01 -08:00
Dalton Hubble
50a221e042 Annotate sensitive output variables to suppress display
* Annotate terraform output variables containing generated TLS
credentials and kubeconfigs as sensitive to suppress / mask
them in terraform CLI display.
* Allow for easier use in automation systems and logged environments
2019-12-05 00:57:07 -08:00
Dalton Hubble
4d7484f72a Change asset_dir variable from required to optional
* `asset_dir` is an absolute path to a directory where generated
assets from terraform-render-bootstrap are written (sensitive)
* Change `asset_dir` to default to "" so no assets are written
(favor Terraform output mechanisms). Previously, asset_dir was
required so all users set some path. To take advantage of the
new optionality, remove asset_dir or set it to ""
2019-12-05 00:56:54 -08:00
Dalton Hubble
6c7ba3864f Introduce a Terraform output map with distribution assets
* Introduce a new `assets_dist` output variable that provides
a mapping from suggested asset paths to asset contents (for
assets that should be distributed to controller nodes). This
new output format is intended to align with a modified asset
distribution style in Typhoon.
* Lay the groundwork for `asset_dir` to become optional. The
output map provides output variable access to the minimal assets
that are required for bootstrap
* Assets that aren't required for bootstrap itself (e.g.
the etcd CA key) but can be used by admins may later be added
as specific output variables to further reduce asset_dir use

Background:

* `terraform-render-bootstrap` rendered assets were previously
only provided by rendering files to an `asset_dir`. This was
necessary, but created a responsibility to maintain those
assets on the machine where terraform apply was run
2019-12-04 20:15:40 -08:00
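
As a rough illustration of how a consumer might use a path-to-contents map like `assets_dist`, the Python sketch below writes each entry beneath a root directory. It is not taken from the module: `write_assets` is a hypothetical helper, and the example paths and contents are made up for the demonstration.

```python
import os
import tempfile


def write_assets(assets_dist, root):
    """Write each asset to <root>/<suggested path>, creating parent directories."""
    written = []
    for rel_path, content in sorted(assets_dist.items()):
        dest = os.path.join(root, rel_path)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        with open(dest, "w") as handle:
            handle.write(content)
        # Assets may contain credentials, so restrict access to the owner.
        os.chmod(dest, 0o600)
        written.append(dest)
    return written


# Hypothetical map in the shape described above: suggested path -> contents.
example_assets = {
    "auth/kubeconfig": "apiVersion: v1\nkind: Config\n",
    "static-manifests/kube-apiserver.yaml": "kind: Pod\n",
}

with tempfile.TemporaryDirectory() as root:
    paths = write_assets(example_assets, root)
    relative = [os.path.relpath(path, root) for path in paths]
```

Because the map is plain data, the same loop could just as well feed a remote copy step instead of local writes, which is the point of exposing assets as an output rather than as files.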
Dalton Hubble
8005052cfb Remove unused raw kubeconfig field outputs
* Remove unused `ca_cert`, `kubelet_cert`, `kubelet_key`,
and `server` outputs
* These outputs were once needed to support clusters with
managed instance groups, but that hasn't been the case for
quite some time
v0.15.0
2019-11-13 16:49:07 -08:00
Dalton Hubble
0f1f16c612 Add small CPU resource requests to static pods
* Set small CPU requests on static pods kube-apiserver,
kube-controller-manager, and kube-scheduler to align with
upstream tooling and for edge cases
* Control plane nodes are tainted to isolate them from
ordinary workloads. Even dense workloads can only compress
CPU resources on worker nodes.
* Control plane static pods use the highest priority class, so
contention favors control plane pods (over say node-exporter)
and CPU is compressible too.
* Effectively, a practical case for these requests hasn't been
observed. However, a small static pod CPU request may offer
a slight benefit if a controller became overloaded and the
above mechanisms were insufficient for some reason (bit of a
stretch, due to CPU compressibility)
* Continue to avoid setting a memory request for static pods.
It would impose a hard size requirement on controller nodes,
which isn't warranted and is handled more gently by Typhoon
default instance types across clouds and via docs
2019-11-13 16:44:33 -08:00
Dalton Hubble
43e1230c55 Update CoreDNS from v1.6.2 to v1.6.5
* Add health `lameduck` option 5s. Before CoreDNS shuts down,
it will wait and report unhealthy for 5s to allow time for
plugins to shut down cleanly
* Minor bug fixes over a few releases
* https://coredns.io/2019/08/31/coredns-1.6.3-release/
* https://coredns.io/2019/09/27/coredns-1.6.4-release/
* https://coredns.io/2019/11/05/coredns-1.6.5-release/
2019-11-13 14:33:50 -08:00
Dalton Hubble
1bba891d95 Adopt Terraform v0.12 templatefile function
* Adopt Terraform v0.12 type and templatefile function
features to replace the use of terraform-provider-template's
`template_dir`
* Use of `for_each` to write local assets requires
that consumers use Terraform v0.12.6+ (action required)
* Continue use of `template_file` as it's quite common. In
future, we may replace it as well.
* Remove outputs `id` and `content_hash` (no longer used)

Background:

* `template_dir` was added to `terraform-provider-template`
to add support for template directory rendering in CoreOS
Tectonic Kubernetes distribution (~2017)
* Terraform v0.12 introduced a native `templatefile` function
and v0.12.6 introduced native `for_each` support (July 2019)
that makes it possible to replace `template_dir` usage
2019-11-13 14:05:01 -08:00
Dalton Hubble
0daa1276c6 Update Kubernetes from v1.16.2 to v1.16.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1163
2019-11-13 13:02:01 -08:00
Dalton Hubble
a2b1dbe2c0 Update Calico from v3.10.0 to v3.10.1
* https://docs.projectcalico.org/v3.10/release-notes/
2019-11-07 11:07:15 -08:00
Dalton Hubble
3c7334ab55 Upgrade Calico from v3.9.2 to v3.10.0
* Change the calico-node livenessProbe from httpGet to an exec of
`calico-node -felix-ready`, as recommended by Calico
* Allow advertising Kubernetes service ClusterIPs
2019-10-27 01:06:09 -07:00
Dalton Hubble
e09d6bef33 Switch kube-proxy from iptables mode to ipvs mode
* Kubernetes v1.11 considered kube-proxy IPVS mode GA
* Many problems were found https://github.com/poseidon/typhoon/pull/321
* Since then, major blockers seem to have been addressed
2019-10-15 22:55:17 -07:00
Dalton Hubble
0fcc067476 Update Kubernetes from v1.16.1 to v1.16.2
* https://github.com/kubernetes/kubernetes/releases/tag/v1.16.2
2019-10-15 22:38:51 -07:00
Dalton Hubble
6f2734bb3c Update Calico from v3.9.1 to v3.9.2
* https://github.com/projectcalico/calico/releases/tag/v3.9.2
2019-10-15 22:36:37 -07:00
Dalton Hubble
10d9cec5c2 Add stricter type constraints to variables
2019-10-06 20:41:50 -07:00
Dalton Hubble
1f8b634652 Remove unneeded control plane flags
* Several flags now default to the arguments we've been
setting and are no longer needed
2019-10-06 20:25:46 -07:00
Dalton Hubble
586d6e36f6 Update Kubernetes from v1.16.0 to v1.16.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1161
2019-10-02 21:22:11 -07:00
Dalton Hubble
18b7a74d30 Update Calico from v3.8.2 to v3.9.1
* https://docs.projectcalico.org/v3.9/release-notes/
2019-09-29 11:14:20 -07:00
Dalton Hubble
539b725093 Update Kubernetes from v1.15.3 to v1.16.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md#v1160
2019-09-17 21:15:46 -07:00
Dalton Hubble
d6206abedd Replace Terraform element function with indexing
* Better to explicitly index (and error on out-of-bounds) than
use Terraform `element` (which has special wrap-around behavior)
* https://www.terraform.io/docs/configuration/functions/element.html
2019-09-14 16:46:27 -07:00
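
The difference between the two lookup styles can be modeled in a few lines of Python. This only illustrates the semantics described in the commit above and is not Terraform code; `element_wraparound` and `strict_index` are hypothetical helper names, and the controller names are invented.

```python
def element_wraparound(items, index):
    """Model of a wrap-around lookup: the index is taken modulo the list length."""
    if not items:
        raise ValueError("cannot take an element of an empty list")
    return items[index % len(items)]


def strict_index(items, index):
    """Model of direct indexing: an out-of-bounds index is an error."""
    if index < 0 or index >= len(items):
        raise IndexError(
            f"index {index} out of range for list of length {len(items)}"
        )
    return items[index]


controllers = ["controller-0", "controller-1", "controller-2"]

# Wrap-around silently returns the first item when asked for index 3.
wrapped = element_wraparound(controllers, 3)

# Strict indexing surfaces the same mistake as an error instead.
try:
    strict_index(controllers, 3)
    strict_error = None
except IndexError as err:
    strict_error = str(err)
```

With three controllers, an off-by-one index goes unnoticed under wrap-around but fails loudly under strict indexing, which is the behavior the commit prefers.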
Dalton Hubble
e839ec5a2b Fix Terraform formatting
2019-09-14 16:44:36 -07:00
Dalton Hubble
3dade188f2 Rename project to terraform-render-bootstrap
* Rename from terraform-render-bootkube to terraform-render-bootstrap
* Generated manifest and certificate assets are no longer geared
specifically for bootkube (no longer used)
2019-09-14 16:16:49 -07:00
Dalton Hubble
97bbed6c3a Rename CA organization from bootkube to typhoon
* Rename the organization in generated CA certificates for
clusters from bootkube to typhoon
* Mainly helpful to avoid confusion with bootkube CA certificates
if users inspect their CA, especially now that bootkube isn't used
(better their searches lead to Typhoon)
2019-09-14 16:08:06 -07:00
Dalton Hubble
6e59af7113 Migrate from a self-hosted to static pod control plane
* Run kube-apiserver, kube-scheduler, and kube-controller-manager
as static pods on each controller node
* Bootstrap a minimal control plane by copying `static-manifests`
to the Kubelet `--pod-manifest-path` and tls/auth secrets to
`/etc/kubernetes/bootstrap-secrets`. Then, kubectl apply Kubernetes
manifests.
* Discontinue using bootkube to bootstrap and pivot to a self-hosted
control plane.
* Remove bootkube self-hosted kube-apiserver DaemonSet and
kube-scheduler and kube-controller-manager Deployments
* Remove pod-checkpointer manifests (no longer needed)

Advantages:

* Reduce control plane bootstrapping complexity. Self-hosted pivot and
pod checkpointing worked well, but in-place edits to kube-apiserver,
kube-controller-manager, or kube-scheduler are infrequently used. The
concept was originally geared toward continuously in-place upgrading
clusters, a goal Typhoon doesn't take on (rec. blue/green clusters).
As such, the value-add isn't justifying the extra components for this
particular project.
* Static pods still provide kubectl visibility and log access

Drawbacks:

* In-place edits to kube-apiserver, kube-controller-manager, and
kube-scheduler are not possible via kubectl (non-goal)
* Assets must be copied to each controller (not just one)
* Static pods must load credentials via hostPath, which is less clean
compared with the former Kubernetes secrets and service accounts
2019-09-02 20:52:46 -07:00
Dalton Hubble
98cc19f80f Update CoreDNS from v1.5.0 to v1.6.2
* https://coredns.io/2019/06/26/coredns-1.5.1-release/
* https://coredns.io/2019/07/03/coredns-1.5.2-release/
* https://coredns.io/2019/07/28/coredns-1.6.0-release/
* https://coredns.io/2019/08/02/coredns-1.6.1-release/
* https://coredns.io/2019/08/13/coredns-1.6.2-release/
2019-08-31 15:20:55 -07:00
Dalton Hubble
248675e7a9 Update Kubernetes from v1.15.2 to v1.15.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md/#v1153
2019-08-19 14:41:54 -07:00
Dalton Hubble
8b3738b2cc Update Calico from v3.8.1 to v3.8.2
* https://docs.projectcalico.org/v3.8/release-notes/
2019-08-16 14:53:20 -07:00
Dalton Hubble
c21da02249 Update Kubernetes from v1.15.1 to v1.15.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#downloads-for-v1152
2019-08-05 08:44:54 -07:00
Dalton Hubble
83dd5a7cfc Update Calico from v3.8.0 to v3.8.1
* https://github.com/projectcalico/calico/releases/tag/v3.8.1
2019-07-27 15:17:47 -07:00
Dalton Hubble
ed94836925 Update kube-router from v0.3.1 to v0.3.2
* kube-router is experimental and not supported or validated
* Bumping so the next time kube-router is evaluated, we're on
a modern version
* https://github.com/cloudnativelabs/kube-router/releases/tag/v0.3.2
2019-07-27 15:12:43 -07:00
Dalton Hubble
5b9faa9031 Update Kubernetes from v1.15.0 to v1.15.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#downloads-for-v1151
2019-07-19 01:18:09 -07:00
Dalton Hubble
119cb00fa7 Upgrade Calico from v3.7.4 to v3.8.0
* Enable CNI bandwidth plugin for traffic shaping
* https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#support-traffic-shaping
2019-07-11 21:00:58 -07:00
Dalton Hubble
4caca47776 Run kube-apiserver as non-root user (nobody)
2019-07-06 13:51:54 -07:00
Dalton Hubble
3bfd1253ec Always run kube-apiserver on port 6443 (internally)
* Require bootstrap-kube-apiserver and kube-apiserver components
listen on port 6443 (internally) to allow kube-apiserver pods to
run with lower user privilege
* Remove variable `apiserver_port`. The kube-apiserver listen
port is no longer customizable.
* Add variable `external_apiserver_port` to allow architectures
where a load balancer fronts kube-apiserver 6443 backends, but
listens on a different port externally. For example, Google Cloud
TCP Proxy load balancers cannot listen on 6443
2019-07-06 13:50:22 -07:00
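
The split between the fixed internal port and a configurable external port can be sketched as follows. This is an illustrative Python model, not the module's logic: `apiserver_endpoints` and the hostname are hypothetical, and the default of 6443 for the external port is an assumption.

```python
# Fixed listen port for kube-apiserver backends, per the commit above.
INTERNAL_APISERVER_PORT = 6443


def apiserver_endpoints(api_server_name, external_apiserver_port=6443):
    """Return the client-facing URL and the backend port a load balancer targets.

    The default external port of 6443 is an assumption made for this sketch.
    """
    return {
        "client_url": f"https://{api_server_name}:{external_apiserver_port}",
        "backend_port": INTERNAL_APISERVER_PORT,
    }


# Hypothetical cluster whose load balancer cannot listen on 6443.
fronted = apiserver_endpoints("cluster.example.com", external_apiserver_port=443)

# Hypothetical cluster where clients reach port 6443 directly.
direct = apiserver_endpoints("cluster.example.com")
```

Only the client-facing URL changes between the two cases; the backend port stays at 6443 either way.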
Dalton Hubble
95f6fc7fa5 Update Calico from v3.7.3 to v3.7.4
* https://docs.projectcalico.org/v3.7/release-notes/
2019-07-02 20:15:53 -07:00
Dalton Hubble
62df9ad69c Update Kubernetes from v1.14.3 to v1.15.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#v1150
2019-06-23 13:04:13 -07:00
Dalton Hubble
89c3ab4e27 Update Calico from v3.7.2 to v3.7.3
* https://docs.projectcalico.org/v3.7/release-notes/
2019-06-13 23:36:35 -07:00
Dalton Hubble
0103bc06bb Define module required provider versions
2019-06-06 09:39:48 -07:00
Dalton Hubble
33d033f1a6 Migrate from Terraform v0.11.x to v0.12.x (breaking!)
* Terraform v0.12 is a major Terraform release with breaking changes
to the HCL language. In v0.11, it was required to use redundant brackets
as interpreter type hints to pass lists or concat and flatten lists and
strings. In v0.12, that work-around is no longer supported. Lists are
represented as first-class objects and the redundant brackets create
nested lists. Consequently, it's not possible to pass lists in a way that
works with both v0.11 and v0.12 at the same time. We've made the
difficult choice to pursue a hard cutover to Terraform v0.12.x
* https://www.terraform.io/upgrade-guides/0-12.html#referring-to-list-variables
* Use expression syntax instead of interpolated strings, where suggested
* Define Terraform required_version ~> v0.12.0 (>= v0.12.0, < v0.13.0)
2019-06-06 09:39:46 -07:00
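
To make the `~>` constraint concrete, here is a small Python model of pessimistic version matching, in which only the rightmost component of the constraint may increase. It is a simplified sketch that ignores pre-release suffixes; `parse_version` and `satisfies_pessimistic` are hypothetical helpers, not part of Terraform.

```python
def parse_version(text):
    """Parse 'v0.12.6' or '0.12.6' into a tuple of integers."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def satisfies_pessimistic(version, constraint):
    """Model of '~> constraint': at least the constraint, below the next
    increment of its second-to-last component."""
    candidate = parse_version(version)
    lower = parse_version(constraint)
    if len(lower) < 2:
        raise ValueError("constraint needs at least two components")
    # For 0.12.0 the exclusive upper bound is 0.13.
    upper = lower[:-2] + (lower[-2] + 1,)
    return lower <= candidate < upper


checks = {
    version: satisfies_pessimistic(version, "v0.12.0")
    for version in ["0.11.14", "0.12.0", "0.12.6", "0.13.0"]
}
```

Under this model, the v0.12.x releases satisfy the constraint, while v0.11.14 and v0.13.0 do not.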
Dalton Hubble
082921d679 Update Kubernetes from v1.14.2 to v1.14.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1143
v0.14.0
2019-05-31 01:05:00 -07:00
Dalton Hubble
efd1cfd9bf Update CoreDNS from v1.3.1 to v1.5.0
* Add `ready` plugin and change the readinessProbe to check
default port 8181 to ensure all plugins are ready
* `upstream [ADDRESS]` defines upstream resolvers for external
services. If no address is given, resolution is against CoreDNS
itself, which is the default. So `upstream` can be removed
2019-05-27 00:07:59 -07:00
Dalton Hubble
85571f6dae Update Kubernetes from v1.14.1 to v1.14.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.14.md#v1142
2019-05-17 13:00:30 +02:00
Dalton Hubble
eca7c49fe1 Update Calico from v3.7.0 to v3.7.2
* https://docs.projectcalico.org/v3.7/release-notes/
2019-05-17 12:26:02 +02:00
Dalton Hubble
42b9e782b2 Update kube-router from v0.3.0 to v0.3.1
* kube-router is experimental and not supported
* https://github.com/cloudnativelabs/kube-router/releases/tag/v0.3.1
2019-05-17 12:20:23 +02:00
Dalton Hubble
fc7a6fb20a Change flannel port from 8472 to 4789
* Change flannel port from the kernel default 8472 to the
IANA assigned VXLAN port 4789
* Requires a change to firewall rules or security groups
depending on the platform (**action required!**)
* Why now? Calico now offers its own VXLAN backend so
standardizing on the IANA port simplifies configuration
* https://github.com/coreos/flannel/blob/master/Documentation/backends.md#vxlan
2019-05-06 21:23:08 -07:00
Dalton Hubble
b96d641f6d Update Calico from v3.6.1 to v3.7.0
* Accept a `network_encapsulation` variable to choose whether the
default IPPool should use ipip (default) or vxlan encapsulation
* Use `network_mtu` as the MTU for workload interfaces for ipip
or vxlan (although Calico can have IPPools with a mix, we're
picking ipip xor vxlan)
2019-05-05 20:41:53 -07:00
Dalton Hubble
614defe090 Update kube-router from v0.2.5 to v0.3.0
* https://github.com/cloudnativelabs/kube-router/releases/tag/v0.3.0
* Recall, kube-router is experimental and not vouched for
as part of clusters
2019-05-04 11:38:19 -07:00