terraform-render-bootstrap

mirror of https://github.com/outbackdingo/terraform-render-bootstrap.git synced 2026-01-27 18:20:40 +00:00

Author	SHA1	Message	Date
Dalton Hubble	97fe45c93e	Update Calico from v3.23.1 to v3.23.3 * https://github.com/projectcalico/calico/releases/tag/v3.23.3	2022-07-30 18:10:02 -07:00
Dalton Hubble	178664d84e	Update Calico from v3.22.2 to v3.23.1 * https://github.com/projectcalico/calico/releases/tag/v3.23.1	2022-06-18 18:49:58 -07:00
Dalton Hubble	f325be5041	Update Cilium from v1.11.4 to v1.11.5 * https://github.com/cilium/cilium/releases/tag/v1.11.5	2022-05-31 15:21:36 +01:00
James Harmison	5bbca44f66	Update cilium ds name and label to align with upstream	2022-04-20 18:47:59 -07:00
Dalton Hubble	fa4745d155	Update Calico from v3.21.2 to v3.22.1 * Calico aims to fix https://github.com/projectcalico/calico/issues/5011	2022-03-11 10:57:07 -08:00
Dalton Hubble	37f45cb28b	Update Cilium from v1.10.5 to v1.11.0 * https://github.com/cilium/cilium/releases/tag/v1.11.0	2021-12-10 11:23:56 -08:00
Dalton Hubble	8add7022d1	Normalize CA certs mounts in static Pods and kube-proxy * Mount both /etc/ssl/certs and /etc/pki into control plane static pods and kube-proxy, rather than choosing one based on a variable (set based on Flatcar Linux or Fedora CoreOS) * Remove `trusted_certs_dir` variable * Remove deprecated `--port` from `kube-scheduler` static Pod	2021-12-09 09:26:28 -08:00
Dalton Hubble	362158a6d6	Add missing caliconodestatuses CRD for Calico * https://github.com/projectcalico/calico/pull/5012	2021-12-09 09:19:12 -08:00
Dalton Hubble	9f9d7708c3	Update Calico and flannel CNI providers * Update Calico from v3.20.2 to v3.21.0 * Update flannel from v0.14.0 to v0.15.0	2021-11-11 14:25:11 -08:00
Dalton Hubble	0b102c4089	Update Calico from v3.20.1 to v3.20.2 * https://github.com/projectcalico/calico/releases/tag/v3.20.2 * Add support for iptables legacy vs nft detection	2021-10-05 19:33:09 -07:00
Dalton Hubble	c6fa09bda1	Update Calico and Cilium CNI providers * Update Calico from v3.20.0 to v3.20.1 * Update Cilium from v1.10.3 to v1.10.4 * Remove Cilium wait for BPF mount	2021-09-21 09:11:49 -07:00
Dalton Hubble	bfc2fa9697	Fix ClusterIP access when using Cilium * When a router sets node(s) as next-hops in a network, ClusterIP Services should be able to respond as usual * https://github.com/cilium/cilium/issues/14581	2021-09-15 19:43:58 -07:00
Dalton Hubble	a2e1cdfd8a	Update Calico from v3.19.2 to v3.20.0 * https://github.com/projectcalico/calico/blob/v3.20.0/_includes/charts/calico/templates/calico-node.yaml	2021-08-18 19:43:40 -07:00
Dalton Hubble	b5f5d843ec	Disable kube-scheduler insecure port * Kubernetes v1.22.0 disables the kube-controller-manager insecure port, which was used internally for Prometheus metrics scraping. In Typhoon, we'll switch to using the https port, which requires that Prometheus present a bearer token * Go ahead and disable the insecure port for kube-scheduler too; we'll configure Prometheus to scrape it with a bearer token as well * Remove unused kube-apiserver `--port` flag Rel: * https://github.com/kubernetes/kubernetes/pull/96216	2021-08-10 21:11:30 -07:00
Dalton Hubble	5c0bebc1e7	Add Cilium init container to auto-mount cgroup2 * Add init container to auto-mount /sys/fs/cgroup cgroup2 at /run/cilium/cgroupv2 for the Cilium agent * Enable CNI exclusive mode, to disable other configs found in /etc/cni/net.d/ * https://github.com/cilium/cilium/pull/16259	2021-07-24 10:30:06 -07:00
Dalton Hubble	362f42a7a2	Update CoreDNS from v1.8.0 to v1.8.4 * https://coredns.io/2021/01/20/coredns-1.8.1-release/ * https://coredns.io/2021/02/23/coredns-1.8.2-release/ * https://coredns.io/2021/02/24/coredns-1.8.3-release/ * https://coredns.io/2021/05/28/coredns-1.8.4-release/	2021-06-23 23:26:27 -07:00
Dalton Hubble	7052c66882	Update Calico from v3.18.1 to v3.19.0 * https://docs.projectcalico.org/archive/v3.19/release-notes/	2021-05-13 11:17:48 -07:00
Dalton Hubble	a4ecf168df	Update static Pod manifests for Kubernetes v1.21.0 * Set `kube-apiserver` `service-account-jwks-uri` because conformance ServiceAccountIssuerDiscovery OIDC discovery will access a JWT endpoint using the kube-apiserver's advertise address by default, instead of using the intended in-cluster service (10.3.0.1) resolved by cluster DNS `kubernetes.default.svc.cluster.local`, which causes a cert SAN error * Set the authentication and authorization kubeconfig for kube-scheduler and kube-controller-manager. Here, authn/z refer to aggregated API use cases only, so it's not strictly necessary and warnings about missing `extension-apiserver-authentication` when enable_aggregation is false can be ignored * Mount `/var/lib/kubelet/volumeplugins` to the default location expected within kube-controller-manager to remove the need for a flag * Enable `tokencleaner` controller to automatically delete expired bootstrap tokens (default node token is good for 1 year, so cleanup won't really matter at that point, but enable regardless) * Remove unused `cloud-provider` flag, we never intend to use in-tree cloud providers or support custom providers	2021-04-10 17:42:18 -07:00
Dalton Hubble	f87aa7f96a	Change CNI config directory to /etc/cni/net.d * Change CNI config directory from `/etc/kubernetes/cni/net.d` to `/etc/cni/net.d` (Kubelet default)	2021-04-01 16:48:46 -07:00
Dalton Hubble	adcba1c211	Update Calico from v3.17.3 to v3.18.1 * https://docs.projectcalico.org/archive/v3.18/release-notes/	2021-03-14 10:15:39 -07:00
Dalton Hubble	84972373d4	Rename bootstrap-secrets directory to pki * Change control plane static pods to mount `/etc/kubernetes/pki`, instead of `/etc/kubernetes/bootstrap-secrets` to better reflect their purpose and match some loose conventions upstream * Require TLS assets to be placed at `/etc/kubernetes/pki`, instead of `/etc/kubernetes/bootstrap-secrets` on hosts (breaking) * Mount to `/etc/kubernetes/pki` to match the host (less surprise) * https://kubernetes.io/docs/setup/best-practices/certificates/	2020-12-02 23:13:53 -08:00
Dalton Hubble	ac5cb95774	Generate kubeconfigs for kube-scheduler and kube-controller-manager * Generate TLS client certificates for kube-scheduler and kube-controller-manager with `system:kube-scheduler` and `system:kube-controller-manager` CNs * Template separate kubeconfigs for kube-scheduler and kube-controller-manager (`scheduler.conf` and `controller-manager.conf`). Rename admin for clarity * Before v1.16.0, Typhoon scheduled a self-hosted control plane, which allowed the steady-state kube-scheduler and kube-controller-manager to use a scoped ServiceAccount. With a static pod control plane, separate CN TLS client certificates are the nearest equivalent. * https://kubernetes.io/docs/setup/best-practices/certificates/ * Remove unused Kubelet certificate, TLS bootstrap is used instead	2020-12-01 20:18:36 -08:00
Dalton Hubble	19c3ce61bd	Add TokenReview and TokenRequestProjection kube-apiserver flags * Add kube-apiserver flags for TokenReview and TokenRequestProjection (beta, defaults on) to allow using Service Account Token Volume Projection to create and mount service account tokens tied to a Pod's lifecycle * Both features will be promoted from beta to stable in v1.20 * Rename `experimental-cluster-signing-duration` to just `cluster-signing-duration` Rel: * https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection	2020-12-01 19:50:25 -08:00
Dalton Hubble	fd10b94f87	Update Calico from v3.16.5 to v3.17.0 * Consider Calico's MTU auto-detection, but leave Calico MTU variable for now (`network_mtu` ignored) * Remove SELinux level setting workaround for https://github.com/projectcalico/cni-plugin/issues/874	2020-11-25 11:18:59 -08:00
Starbuck	74c299bf2c	Restore kube-controller-manager --use-service-account-credentials * kube-controller-manager Pods can start control loops with credentials that have been granted relevant controller manager roles or using generated service accounts bound to each role * During the migration of the control plane from self-hosted to static pods (https://github.com/poseidon/terraform-render-bootstrap/pull/148) the flag for using separate service accounts was inadvertently dropped * Restore the --use-service-account-credentials flag used before v1.16 Related: * https://kubernetes.io/docs/reference/access-authn-authz/rbac/#controller-roles * https://github.com/poseidon/terraform-render-bootstrap/pull/225	2020-11-10 12:06:51 -08:00
Dalton Hubble	c6e3a2bcdc	Update Cilium from v1.8.5 to v1.9.0-rc3 * https://github.com/cilium/cilium/releases/tag/v1.9.0-rc3 * https://github.com/cilium/cilium/releases/tag/v1.9.0-rc2 * https://github.com/cilium/cilium/releases/tag/v1.9.0-rc1	2020-11-03 00:05:32 -08:00
Dalton Hubble	7988fb7159	Update Calico from v3.15.3 to v3.16.3 * https://github.com/projectcalico/calico/releases/tag/v3.16.3 * https://docs.projectcalico.org/v3.16/release-notes/	2020-10-15 20:00:41 -07:00
Nesc58	016d4ebd0c	Mount /run/xtables.lock in flannel Daemonset * Mount xtables.lock (like Calico and Cilium) since iptables may be called by other processes (kube-proxy)	2020-09-16 19:01:42 -07:00
Dalton Hubble	f2dd897d67	Change seccomp annotations to Pod seccompProfile * seccomp graduated to GA in Kubernetes v1.19. Support for seccomp alpha annotations will be removed in v1.22 * Replace seccomp annotations with the GA seccompProfile field in the PodTemplate securityContext * Switch profile from `docker/default` to `runtime/default` (no effective change, since docker is the runtime) * Verify with docker inspect SecurityOpt. Without the profile, you'd see `seccomp=unconfined` Related: * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.19.md#seccomp-graduates-to-general-availability	2020-09-10 00:28:58 -07:00
Dalton Hubble	2686d59203	Allow leader election among Cilium operator replicas * Allow Cilium operator Pods to leader elect when Deployment has more than one replica * Use topology spread constraint to keep multiple operators from running on the same node (pods bind hostNetwork ports)	2020-09-07 17:48:19 -07:00
Dalton Hubble	3675b3a539	Update from coreos/flannel-cni to poseidon/flannel-cni * Update CNI plugins from v0.6.0 to v0.8.6 to fix several CVEs * Update the base image to alpine:3.12 * Use `flannel-cni` as an init container and remove sleep * Add Linux ARM64 and multi-arch container images * https://github.com/poseidon/flannel-cni * https://quay.io/repository/poseidon/flannel-cni Background * Switch from github.com/coreos/flannel-cni v0.3.0 which was last published by me in 2017 and which is no longer accessible to me to maintain or patch * Port to the poseidon/flannel-cni rewrite, which releases v0.4.0 to continue the prior release numbering	2020-08-02 15:06:18 -07:00
Dalton Hubble	45053a62cb	Update Cilium from v1.8.1 to v1.8.2 * Drop unused option https://github.com/cilium/cilium/pull/12618	2020-07-25 15:52:19 -07:00
Dalton Hubble	1c07dfbc2a	Remove experimental kube-router CNI provider	2020-06-21 21:55:56 -07:00
Dalton Hubble	af36c53936	Add experimental Cilium CNI provider * Accept experimental CNI `networking` mode "cilium" * Run Cilium v1.8.0 with overlay vxlan tunnels and a minimal set of features. We're interested in: * IPAM: Divide pod_cidr into /24 subnets per node * CNI networking pod-to-pod, pod-to-external * BPF masquerade * NetworkPolicy as defined by Kubernetes (no L7) * Continue using kube-proxy with Cilium probe mode * Firewall changes: * Require UDP 8472 for vxlan (Linux kernel default) between nodes * Optional ICMP echo(8) between nodes for host reachability (health) * Optional TCP 4240 between nodes for host reachability (health)	2020-06-21 16:21:09 -07:00
Dalton Hubble	e75697ce35	Rename controller node label and NoSchedule taint * Use node label `node.kubernetes.io/controller` to select controller nodes (action required) * Tolerate node taint `node-role.kubernetes.io/controller` for workloads that should run on controller nodes. Don't tolerate `node-role.kubernetes.io/master` (action required)	2020-06-17 22:46:35 -07:00
Dalton Hubble	3fe903d0ac	Update Kubernetes from v1.18.3 to v1.18.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184 * Remove unused template file	2020-06-17 19:23:12 -07:00
Dalton Hubble	a83ddbb30e	Add CoreDNS "soft" nodeAffinity for controller nodes * Add nodeAffinity to CoreDNS deployment PodSpec to prefer running CoreDNS pods on controllers, while relying on podAntiAffinity for spreading. * For single master clusters, running two CoreDNS pods on the master or running one pod on a worker is permissible. * Note: It's still _possible_ to end up with CoreDNS pods all running on workers since we only express scheduling preference ("soft"), but unlikely. Plus the motivating scenario (below) is also rare. Background: * CoreDNS replicas are set to the higher of 2 or the number of control plane nodes to (at a minimum) support Deployment updates or pod restarts and match the cluster size (e.g. 5 master/controller nodes likely means a larger cluster, so run 5 CoreDNS replicas) * In the past (before v1.14), we required kube-dns (CoreDNS predecessor) to run CoreDNS pods on master nodes. With CoreDNS this node selection was relaxed. We'd like a gentler form of it now. Motivation: * On clusters using 100% preemptible/spot workers, it is possible that CoreDNS pods schedule to workers that are all preempted at the same time, causing a loss of cluster internal DNS service until a CoreDNS pod reschedules (1 min). We'd like CoreDNS to prefer controller/master nodes (which aren't preempted) to reduce the possibility of control plane disruption	2020-05-09 22:48:56 -07:00
Dalton Hubble	1dc36b58b8	Fix Calico node crash loop on Pod restart * Set a consistent MCS level/range for Calico install-cni * Note: Rebooting a node was a workaround, because Kubelet relabels /etc/kubernetes(/cni/net.d) Background: * On SELinux enforcing systems, the Calico CNI install-cni container ran with default SELinux context and a random MCS pair. install-cni places CNI configs by first creating a temporary file and then moving it into place, which means the file MCS categories depend on the container's SELinux context. * A calico-node Pod restart creates a new install-cni container with a different MCS pair that cannot access the earlier written file (it places configs every time), causing the init container to error and calico-node to crash loop * https://github.com/projectcalico/cni-plugin/issues/874 ``` mv: inter-device move failed: '/calico.conf.tmp' to '/host/etc/cni/net.d/10-calico.conflist'; unable to remove target: Permission denied Failed to mv files. This may be caused by selinux configuration on the host, or something else. ``` Note, this isn't a host SELinux configuration issue.	2020-05-09 15:20:06 -07:00
Dalton Hubble	924beb4b0c	Enable Kubelet TLS bootstrap and NodeRestriction * Enable bootstrap token authentication on kube-apiserver * Generate the bootstrap.kubernetes.io/token Secret that may be used as a bootstrap token * Generate a bootstrap kubeconfig (with a bootstrap token) to be securely distributed to nodes. Each Kubelet will use the bootstrap kubeconfig to authenticate to kube-apiserver as `system:bootstrappers` and send a node-unique CSR for kube-controller-manager to automatically approve to issue a Kubelet certificate and kubeconfig (expires in 72 hours) * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the `system:node-bootstrapper` ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr nodeclient ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr selfnodeclient ClusterRole * Enable NodeRestriction admission controller to limit the scope of Node or Pod objects a Kubelet can modify to those of the node itself * Ability for a Kubelet to delete its Node object is retained as preemptible nodes or those in auto-scaling instance groups need to be able to remove themselves on shutdown. This need continues to have precedence over any risk of a node deleting itself maliciously Security notes: 1. Issued Kubelet certificates authenticate as user `system:node:NAME` and group `system:nodes` and are limited in their authorization to perform API operations by Node authorization and NodeRestriction admission. Previously, a Kubelet's authorization was broader. This is the primary security motivation. 2. The bootstrap kubeconfig credential has the same sensitivity as the previous generated TLS client-certificate kubeconfig. It must be distributed securely to nodes. Its compromise still allows an attacker to obtain a Kubelet kubeconfig 3. Bootstrapping Kubelet kubeconfig's with a limited lifetime offers a slight security improvement. * An attacker who obtains the kubeconfig can likely obtain the bootstrap kubeconfig as well, to obtain the ability to renew their access * A compromised bootstrap kubeconfig could plausibly be handled by replacing the bootstrap token Secret, distributing the token to new nodes, and expiration. Whereas a compromised TLS-client certificate kubeconfig can't be revoked (no CRL). However, replacing a bootstrap token can be impractical in real cluster environments, so the limited lifetime is mostly a theoretical benefit. * Cluster CSR objects are visible via kubectl which is nice 4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet clients have more identity information, which can improve the utility of audits and future features Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/	2020-04-25 19:38:56 -07:00
Dalton Hubble	42723d13a6	Change default kube-system DaemonSet tolerations * Change kube-proxy, flannel, and calico-node DaemonSet tolerations to tolerate `node.kubernetes.io/not-ready` and `node-role.kubernetes.io/master` (i.e. controllers) explicitly, rather than tolerating all taints * kube-system DaemonSets will no longer tolerate custom node taints by default. Instead, custom node taints must be enumerated to opt-in to scheduling/executing the kube-system DaemonSets. Background: Tolerating all taints ruled out use-cases where certain nodes might legitimately need to keep kube-proxy or CNI networking disabled	2020-03-25 22:43:50 -07:00
Dalton Hubble	e76f0a09fa	Switch from upstream hyperkube to component images * Kubernetes plans to stop releasing the hyperkube image in the future. * Upstream will continue releasing container images for `kube-apiserver`, `kube-controller-manager`, `kube-proxy`, and `kube-scheduler`. Typhoon will use these images * Upstream will release the kubelet as a binary for distros to package, either as a traditional DEB/RPM or as a container image for container-optimized operating systems. Typhoon will take on the packaging of Kubelet and its dependencies as a new container image (alongside kubectl) Rel: https://github.com/kubernetes/kubernetes/pull/88676 See: https://github.com/poseidon/kubelet	2020-03-17 22:13:42 -07:00
Dalton Hubble	804029edd5	Update Calico from v3.12.0 to v3.13.1 * https://docs.projectcalico.org/v3.13/release-notes/	2020-03-12 22:55:57 -07:00
Dalton Hubble	ac4b7af570	Configure kube-proxy to serve /metrics on 0.0.0.0:10249 * Set kube-proxy --metrics-bind-address to 0.0.0.0 (default 127.0.0.1) so Prometheus metrics can be scraped * Add pod port list (informational only) * Require node firewall rules to be updated before scrapes can succeed	2019-12-29 11:56:52 -08:00
Dalton Hubble	4369c706e2	Restore kube-controller-manager settings lost in static pod migration * Migration from a self-hosted to a static pod control plane dropped a few kube-controller-manager customizations * Reduce kube-controller-manager --pod-eviction-timeout from 5m to 1m to move pods more quickly when nodes are preempted * Fix flex-volume-plugin-dir since the Kubernetes default points to a read-only filesystem on Container Linux / Fedora CoreOS Related: * https://github.com/poseidon/terraform-render-bootstrap/pull/148 * `7b06557b7a`	2019-12-08 22:37:36 -08:00
Dalton Hubble	7df6bd8d1e	Tune static pod CPU requests slightly lower * Reduce kube-apiserver and kube-controller-manager CPU requests from 200m to 150m. Prefer slightly lower commitment after running with the requests chosen in #161 for a while * Reduce calico-node CPU request from 150m to 100m to match CoreDNS and flannel	2019-12-08 22:25:58 -08:00
Dalton Hubble	0f1f16c612	Add small CPU resource requests to static pods * Set small CPU requests on static pods kube-apiserver, kube-controller-manager, and kube-scheduler to align with upstream tooling and for edge cases * Control plane nodes are tainted to isolate them from ordinary workloads. Even dense workloads can only compress CPU resources on worker nodes. * Control plane static pods use the highest priority class, so contention favors control plane pods (over say node-exporter) and CPU is compressible too. * Effectively, a practical case for these requests hasn't been observed. However, a small static pod CPU request may offer a slight benefit if a controller became overloaded and the above mechanisms were insufficient for some reason (bit of a stretch, due to CPU compressibility) * Continue to avoid setting a memory request for static pods. It would impose a hard size requirement on controller nodes, which isn't warranted and is handled more gently by Typhoon default instance types across clouds and via docs	2019-11-13 16:44:33 -08:00
Dalton Hubble	43e1230c55	Update CoreDNS from v1.6.2 to v1.6.5 * Add health `lameduck` option 5s. Before CoreDNS shuts down, it will wait and report unhealthy for 5s to allow time for plugins to shutdown cleanly * Minor bug fixes over a few releases * https://coredns.io/2019/08/31/coredns-1.6.3-release/ * https://coredns.io/2019/09/27/coredns-1.6.4-release/ * https://coredns.io/2019/11/05/coredns-1.6.5-release/	2019-11-13 14:33:50 -08:00
Dalton Hubble	3c7334ab55	Upgrade Calico from v3.9.2 to v3.10.0 * Change calico-node livenessProbe from httpGet to exec a calico-node -felix-ready, as recommended by Calico * Allow advertising Kubernetes service ClusterIPs	2019-10-27 01:06:09 -07:00
Dalton Hubble	e09d6bef33	Switch kube-proxy from iptables mode to ipvs mode * Kubernetes v1.11 considered kube-proxy IPVS mode GA * Many problems were found https://github.com/poseidon/typhoon/pull/321 * Since then, major blockers seem to have been addressed	2019-10-15 22:55:17 -07:00
Dalton Hubble	1f8b634652	Remove unneeded control plane flags * Several flags now default to the arguments we've been setting and are no longer needed	2019-10-06 20:25:46 -07:00

159 Commits