* Add a `coredns` variable to configure the CoreDNS manifests,
with an `enable` field to determine whether CoreDNS manifests
are applied to the cluster during provisioning (default true)
* Add a `kube-proxy` variable to configure kube-proxy manifests,
with an `enable` field to determine whether the kube-proxy
Daemonset is applied to the cluster during provisioning (default
true)
* These options allow for provisioning clusters without CoreDNS
or kube-proxy, so these components can be customized or managed
through separate plan/apply processes or automation
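For example, a cluster might skip CoreDNS and kube-proxy as in the
sketch below (module source shown for context; other required
variables are omitted and exact field names may differ):
```
module "bootstrap" {
  source = "git::https://github.com/poseidon/terraform-render-bootstrap.git"

  # ... other required cluster variables ...

  # Skip applying CoreDNS and kube-proxy during provisioning so they
  # can be customized or managed via a separate plan/apply process.
  coredns = {
    enable = false
  }

  kube-proxy = {
    enable = false
  }
}
```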
* Cilium v1.14 seems to have problems reliably accessing the
apiserver via default in-cluster service discovery (relies on
kube-proxy instead of DNS) after some time
* Configure Cilium agents to use the DNS name resolving to the
cluster's load balanced apiserver and port. Regrettably, this
relies on external DNS rather than being self-contained, but it's
what Cilium pushes users towards
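For reference, with Cilium's upstream Helm chart the equivalent
setting looks roughly like this (Typhoon renders Cilium manifests
directly rather than using Helm; the DNS name is a placeholder):
```
resource "helm_release" "cilium" {
  name       = "cilium"
  namespace  = "kube-system"
  repository = "https://helm.cilium.io"
  chart      = "cilium"

  # Point agents at the load balanced apiserver DNS name and port,
  # instead of relying on in-cluster service discovery.
  values = [yamlencode({
    k8sServiceHost = "cluster.example.com"
    k8sServicePort = 6443
  })]
}
```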
* Mount both /etc/ssl/certs and /etc/pki into control plane static
pods and kube-proxy, rather than choosing one based on a variable
(set based on Flatcar Linux or Fedora CoreOS)
* Remove `trusted_certs_dir` variable
* Remove deprecated `--port` from `kube-scheduler` static Pod
* Set `kube-apiserver` `service-account-jwks-uri` because conformance
ServiceAccountIssuerDiscovery OIDC discovery will access a JWT endpoint
using the kube-apiserver's advertise address by default, instead of
using the intended in-cluster service (10.3.0.1) resolved by cluster DNS
`kubernetes.default.svc.cluster.local`, which causes a cert SAN error
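A sketch of the relevant flags (values are illustrative, not
necessarily the exact values templated into the static pod):
```
locals {
  # Serve ServiceAccountIssuerDiscovery via the in-cluster service
  # name, which is a SAN on the apiserver certificate, rather than
  # the advertise address.
  apiserver_service_account_flags = [
    "--service-account-issuer=https://kubernetes.default.svc.cluster.local",
    "--service-account-jwks-uri=https://kubernetes.default.svc.cluster.local/openid/v1/jwks",
  ]
}
```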
* Set the authentication and authorization kubeconfig for kube-scheduler
and kube-controller-manager. Here, authn/z refer to aggregated API
use cases only, so it's not strictly necessary and warnings about
missing `extension-apiserver-authentication` when enable_aggregation
is false can be ignored
* Mount `/var/lib/kubelet/volumeplugins` to the default location
expected within kube-controller-manager to remove the need for a flag
* Enable `tokencleaner` controller to automatically delete expired
bootstrap tokens (default node token is good for 1 year, so cleanup won't
really matter at that point, but enable regardless)
* Remove unused `cloud-provider` flag; we never intend to use in-tree
cloud providers or support custom providers
* Terraform `random_password` is identical to `random_string` except
its value is marked sensitive so it isn't displayed in terraform
plan and other outputs
* Prefer marking the bootstrap token as sensitive for cases where
terraform is run in an automated CI/CD system
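A sketch of generating the token with `random_password` (resource
names are illustrative); bootstrap tokens take the form
`<6 character id>.<16 character secret>`:
```
resource "random_password" "bootstrap-token-id" {
  length  = 6
  upper   = false
  special = false
}

resource "random_password" "bootstrap-token-secret" {
  length  = 16
  upper   = false
  special = false
}

locals {
  # Both parts are sensitive, so the joined token won't be shown in
  # terraform plan output.
  bootstrap_token = "${random_password.bootstrap-token-id.result}.${random_password.bootstrap-token-secret.result}"
}
```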
* Change control plane static pods to mount `/etc/kubernetes/pki`,
instead of `/etc/kubernetes/bootstrap-secrets` to better reflect
their purpose and match some loose conventions upstream
* Require TLS assets to be placed at `/etc/kubernetes/pki`, instead
of `/etc/kubernetes/bootstrap-secrets` on hosts (breaking)
* Mount to `/etc/kubernetes/pki` to match the host (less surprise)
* https://kubernetes.io/docs/setup/best-practices/certificates/
* Originally, generated TLS certificates, manifests, and
cluster "assets" were written to local disk (`asset_dir`) during
terraform apply cluster bootstrap
* Typhoon v1.17.0 introduced bootstrapping using only Terraform
state to store cluster assets, to avoid ever writing sensitive
materials to disk and improve automated use-cases. `asset_dir`
was changed to optional and defaulted to "" (no writes)
* Typhoon v1.18.0 deprecated the `asset_dir` variable, removed
docs, and announced it would be deleted in future.
* Remove the `asset_dir` variable
Cluster assets are now stored in Terraform state only. Those
who wish to write the assets to local files can still do so
explicitly:
```
resource "local_file" "assets" {
  for_each = module.bootstrap.assets_dist
  filename = "some-assets/${each.key}"
  content  = each.value
}
```
Related:
* https://github.com/poseidon/typhoon/pull/595
* https://github.com/poseidon/typhoon/pull/678
* Enable bootstrap token authentication on kube-apiserver
* Generate the bootstrap.kubernetes.io/token Secret that
may be used as a bootstrap token
* Generate a bootstrap kubeconfig (with a bootstrap token)
to be securely distributed to nodes. Each Kubelet will use
the bootstrap kubeconfig to authenticate to kube-apiserver
as `system:bootstrappers` and send a node-unique CSR, which
kube-controller-manager automatically approves to issue a
Kubelet certificate and kubeconfig (expires in 72 hours)
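The generated Secret has roughly the following shape, shown with
the Terraform kubernetes provider for illustration (Typhoon
templates an equivalent YAML manifest; id/secret are placeholders):
```
resource "kubernetes_secret" "bootstrap-token" {
  metadata {
    # Bootstrap token Secrets must be named bootstrap-token-<token-id>
    name      = "bootstrap-token-abcdef"
    namespace = "kube-system"
  }

  type = "bootstrap.kubernetes.io/token"

  data = {
    "token-id"                       = "abcdef"
    "token-secret"                   = "0123456789abcdef"
    "usage-bootstrap-authentication" = "true"
  }
}
```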
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the `system:node-bootstrapper`
ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr nodeclient ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr selfnodeclient ClusterRole
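For example, the node-bootstrapper binding looks roughly like this
(shown with the Terraform kubernetes provider; the csr nodeclient
and selfnodeclient bindings follow the same pattern with their
respective ClusterRoles):
```
resource "kubernetes_cluster_role_binding" "bootstrap-node-bootstrapper" {
  metadata {
    name = "system-bootstrap-node-bootstrapper"
  }

  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "system:node-bootstrapper"
  }

  subject {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Group"
    name      = "system:bootstrappers"
  }
}
```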
* Enable NodeRestriction admission controller to limit the
scope of Node or Pod objects a Kubelet can modify to those of
the node itself
* Ability for a Kubelet to delete its Node object is retained
as preemptible nodes or those in auto-scaling instance groups
need to be able to remove themselves on shutdown. This need
continues to have precedence over any risk of a node deleting
itself maliciously
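NodeRestriction is enabled via the kube-apiserver admission plugins
flag; a sketch (the real plugin list includes other plugins too):
```
locals {
  # Admission plugins passed to kube-apiserver (illustrative subset)
  admission_plugins        = ["NodeRestriction"]
  apiserver_admission_flag = "--enable-admission-plugins=${join(",", local.admission_plugins)}"
}
```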
Security notes:
1. Issued Kubelet certificates authenticate as user `system:node:NAME`
and group `system:nodes` and are limited in their authorization
to perform API operations by Node authorization and NodeRestriction
admission. Previously, a Kubelet's authorization was broader. This
is the primary security motivation.
2. The bootstrap kubeconfig credential has the same sensitivity
as the previous generated TLS client-certificate kubeconfig.
It must be distributed securely to nodes. Its compromise still
allows an attacker to obtain a Kubelet kubeconfig
3. Bootstrapping Kubelet kubeconfigs with a limited lifetime offers
a slight security improvement.
* An attacker who obtains the kubeconfig can likely obtain the
bootstrap kubeconfig as well, gaining the ability to renew
their access
* A compromised bootstrap kubeconfig could plausibly be handled
by replacing the bootstrap token Secret, distributing the token
to new nodes, and expiration. Whereas a compromised TLS-client
certificate kubeconfig can't be revoked (no CRL). However,
replacing a bootstrap token can be impractical in real cluster
environments, so the limited lifetime is mostly a theoretical
benefit.
* Cluster CSR objects are visible via kubectl, which is nice
4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet
clients have more identity information, which can improve the
utility of audits and future features
Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/
* Change kube-proxy, flannel, and calico-node DaemonSet
tolerations to tolerate `node.kubernetes.io/not-ready`
and `node-role.kubernetes.io/master` (i.e. controllers)
explicitly, rather than tolerating all taints
* kube-system DaemonSets will no longer tolerate custom
node taints by default. Instead, custom node taints must
be enumerated to opt-in to scheduling/executing the
kube-system DaemonSets.
Background: Tolerating all taints ruled out use-cases
where certain nodes might legitimately need to keep
kube-proxy or CNI networking disabled
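A sketch of the explicit tolerations (expressed as a Terraform
local for illustration; custom node taints would need to be added
to this list to opt back in):
```
locals {
  daemonset_tolerations = [
    {
      # Run on nodes that are not yet ready (networking must start first)
      key      = "node.kubernetes.io/not-ready"
      operator = "Exists"
    },
    {
      # Run on controller (master) nodes
      key      = "node-role.kubernetes.io/master"
      operator = "Exists"
      effect   = "NoSchedule"
    },
  ]
}
```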
* Kubernetes plans to stop releasing the hyperkube image in
the future.
* Upstream will continue releasing container images for
`kube-apiserver`, `kube-controller-manager`, `kube-proxy`,
and `kube-scheduler`. Typhoon will use these images
* Upstream will release the kubelet as a binary for distros
to package, either as a traditional DEB/RPM or as a container
image for container-optimized operating systems. Typhoon will
take on the packaging of Kubelet and its dependencies as a new
container image (alongside kubectl)
Rel: https://github.com/kubernetes/kubernetes/pull/88676
See: https://github.com/poseidon/kubelet
* `asset_dir` is an absolute path to a directory where generated
assets from terraform-render-bootstrap are written (sensitive)
* Change `asset_dir` to default to "" so no assets are written
(favor Terraform output mechanisms). Previously, asset_dir was
required so all users set some path. To take advantage of the
new optionality, remove asset_dir or set it to ""
* Introduce a new `assets_dist` output variable that provides
a mapping from suggested asset paths to asset contents (for
assets that should be distributed to controller nodes). This
new output format is intended to align with a modified asset
distribution style in Typhoon.
* Lay the groundwork for `asset_dir` to become optional. The
output map provides output variable access to the minimal assets
that are required for bootstrap
* Assets that aren't required for bootstrap itself (e.g.
the etcd CA key) but can be used by admins may later be added
as specific output variables to further reduce asset_dir use
Background:
* `terraform-render-bootstrap` rendered assets were previously
only provided by rendering files to an `asset_dir`. This was
necessary, but created a responsibility to maintain those
assets on the machine where terraform apply was run
* Adopt Terraform v0.12 types and the `templatefile` function
to replace the use of terraform-provider-template's
`template_dir` (a sketch appears at the end of this note)
* Use of `for_each` to write local assets requires
that consumers use Terraform v0.12.6+ (action required)
* Continue use of `template_file` as it's quite common. In
future, we may replace it as well.
* Remove outputs `id` and `content_hash` (no longer used)
Background:
* `template_dir` was added to `terraform-provider-template`
to add support for template directory rendering in CoreOS
Tectonic Kubernetes distribution (~2017)
* Terraform v0.12 introduced a native `templatefile` function
and v0.12.6 introduced native `for_each` support (July 2019)
that makes it possible to replace `template_dir` usage
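As referenced above, a sketch of the `templatefile` pattern that
replaces `template_dir` (paths and variables are illustrative):
```
locals {
  # Render each manifest template with the native templatefile()
  # function, producing a map of filename => rendered content.
  manifests = {
    for name in fileset("${path.module}/resources/manifests", "**/*.yaml") :
    name => templatefile("${path.module}/resources/manifests/${name}", {
      cluster_name = "example"
    })
  }
}
```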