16 Commits

Author SHA1 Message Date
Dalton Hubble
c50071487c Add service_account_issuer variable for kube-apiserver
* Allow the service account token issuer to be adjusted or served
from a public bucket or static cache
* Output the public key used to sign service account tokens so that
it can be used to compute JWKS (JSON Web Key Sets) if desired

Docs: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-issuer-discovery
2025-02-07 10:58:54 -08:00
Dalton Hubble
1ddecb1cef Change Cilium configuration to use kube-proxy replacement
* Skip creating the kube-proxy DaemonSet when Cilium is chosen
2024-08-23 07:15:18 -07:00
Dalton Hubble
990286021a Organize CoreDNS and kube-proxy manifests so they're optional
* Add a `coredns` variable to configure the CoreDNS manifests,
with an `enable` field to determine whether CoreDNS manifests
are applied to the cluster during provisioning (default true)
* Add a `kube-proxy` variable to configure kube-proxy manifests,
with an `enable` field to determine whether the kube-proxy
Daemonset is applied to the cluster during provisioning (default
true)
* These optional allow for provisioning clusters without CoreDNS
or kube-proxy, so these components can be customized or managed
through separate plan/apply processes or automation
2024-05-12 18:05:55 -07:00
Dalton Hubble
720adbeb43 Configure Cilium agents to connect to apiserver explicitly
* Cilium v1.14 seems to have problems reliably accessing the
apiserver via default in-cluster service discovery (relies on
kube-proxy instead of DNS) after some time
* Configure Cilium agents to use the DNS name resolving to the
cluster's load balanced apiserver and port. Regrettably, this
relies on external DNS rather than being self-contained, but its
what Cilium pushes users towards
2023-10-29 16:08:21 -07:00
Dalton Hubble
8add7022d1 Normalize CA certs mounts in static Pods and kube-proxy
* Mount both /etc/ssl/certs and /etc/pki into control plane static
pods and kube-proxy, rather than choosing one based a variable
(set based on Flatcar Linux or Fedora CoreOS)
* Remove `trusted_certs_dir` variable
* Remove deprecated `--port` from `kube-scheduler` static Pod
2021-12-09 09:26:28 -08:00
Dalton Hubble
a4ecf168df Update static Pod manifests for Kubernetes v1.21.0
* Set `kube-apiserver` `service-account-jwks-uri` because conformance
ServiceAccountIssuerDiscovery OIDC discovery will access a JWT endpoint
using the kube-apiserver's advertise address by default, instead of
using the intended in-cluster service (10.3.0.1) resolved by cluster DNS
`kubernetes.default.svc.cluster.local`, which causes a cert SAN error
* Set the authentication and authorization kubeconfig for kube-scheduler
and kube-controller-manager. Here, authn/z refer to aggregated API
use cases only, so its not strictly neccessary and warnings about
missing `extension-apiserver-authentication` when enable_aggregation
is false can be ignored
* Mount `/var/lib/kubelet/volumeplugins` to to the default location
expected within kube-controller-manager to remove the need for a flag
* Enable `tokencleaner` controller to automatically delete expired
bootstrap tokens (default node token is good 1 year, so cleanup won't
really matter at that point, but enable regardless)
* Remove unused `cloud-provider` flag, we never intend to use in-tree
cloud providers or support custom providers
2021-04-10 17:42:18 -07:00
Dalton Hubble
8fc689b89c Switch bootstrap token to a random_password
* Terraform `random_password` is identical to `random_string` except
its value is marked sensitive so it isn't displayed in terraform
plan and other outputs
* Prefer marking the bootstrap token as sensitive for cases where
terraform is run in an automated CI/CD system
2021-03-14 10:45:27 -07:00
Dalton Hubble
84972373d4 Rename bootstrap-secrets directory to pki
* Change control plane static pods to mount `/etc/kubernetes/pki`,
instead of `/etc/kubernetes/bootstrap-secrets` to better reflect
their purpose and match some loose conventions upstream
* Require TLS assets to be placed at `/etc/kubernetes/pki`, instead
of `/etc/kubernetes/bootstrap-secrets` on hosts (breaking)
* Mount to `/etc/kubernetes/pki` to match the host (less surprise)
* https://kubernetes.io/docs/setup/best-practices/certificates/
2020-12-02 23:13:53 -08:00
Dalton Hubble
9037d7311b Remove asset_dir variable and optional asset writes
* Originally, generated TLS certificates, manifests, and
cluster "assets" written to local disk (`asset_dir`) during
terraform apply cluster bootstrap
* Typhoon v1.17.0 introduced bootstrapping using only Terraform
state to store cluster assets, to avoid ever writing sensitive
materials to disk and improve automated use-cases. `asset_dir`
was changed to optional and defaulted to "" (no writes)
* Typhoon v1.18.0 deprecated the `asset_dir` variable, removed
docs, and announced it would be deleted in future.
* Remove the `asset_dir` variable

Cluster assets are now stored in Terraform state only. For those
who wish to write those assets to local files, this is possible
doing so explicitly.

```
resource local_file "assets" {
  for_each = module.bootstrap.assets_dist
  filename = "some-assets/${each.key}"
  content = each.value
}
```

Related:

* https://github.com/poseidon/typhoon/pull/595
* https://github.com/poseidon/typhoon/pull/678
2020-10-17 14:57:13 -07:00
Dalton Hubble
924beb4b0c Enable Kubelet TLS bootstrap and NodeRestriction
* Enable bootstrap token authentication on kube-apiserver
* Generate the bootstrap.kubernetes.io/token Secret that
may be used as a bootstrap token
* Generate a bootstrap kubeconfig (with a bootstrap token)
to be securely distributed to nodes. Each Kubelet will use
the bootstrap kubeconfig to authenticate to kube-apiserver
as `system:bootstrappers` and send a node-unique CSR for
kube-controller-manager to automatically approve to issue
a Kubelet certificate and kubeconfig (expires in 72 hours)
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the `system:node-bootstrapper`
ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr nodeclient ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr selfnodeclient ClusterRole
* Enable NodeRestriction admission controller to limit the
scope of Node or Pod objects a Kubelet can modify to those of
the node itself
* Ability for a Kubelet to delete its Node object is retained
as preemptible nodes or those in auto-scaling instance groups
need to be able to remove themselves on shutdown. This need
continues to have precedence over any risk of a node deleting
itself maliciously

Security notes:

1. Issued Kubelet certificates authenticate as user `system:node:NAME`
and group `system:nodes` and are limited in their authorization
to perform API operations by Node authorization and NodeRestriction
admission. Previously, a Kubelet's authorization was broader. This
is the primary security motivation.

2. The bootstrap kubeconfig credential has the same sensitivity
as the previous generated TLS client-certificate kubeconfig.
It must be distributed securely to nodes. Its compromise still
allows an attacker to obtain a Kubelet kubeconfig

3. Bootstrapping Kubelet kubeconfig's with a limited lifetime offers
a slight security improvement.
  * An attacker who obtains the kubeconfig can likely obtain the
  bootstrap kubeconfig as well, to obtain the ability to renew
  their access
  * A compromised bootstrap kubeconfig could plausibly be handled
  by replacing the bootstrap token Secret, distributing the token
  to new nodes, and expiration. Whereas a compromised TLS-client
  certificate kubeconfig can't be revoked (no CRL). However,
  replacing a bootstrap token can be impractical in real cluster
  environments, so the limited lifetime is mostly a theoretical
  benefit.
  * Cluster CSR objects are visible via kubectl which is nice

4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet
clients have more identity information, which can improve the
utility of audits and future features

Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/
2020-04-25 19:38:56 -07:00
Dalton Hubble
42723d13a6 Change default kube-system DaemonSet tolerations
* Change kube-proxy, flannel, and calico-node DaemonSet
tolerations to tolerate `node.kubernetes.io/not-ready`
and `node-role.kubernetes.io/master` (i.e. controllers)
explicitly, rather than tolerating all taints
* kube-system DaemonSets will no longer tolerate custom
node taints by default. Instead, custom node taints must
be enumerated to opt-in to scheduling/executing the
kube-system DaemonSets.

Background: Tolerating all taints ruled out use-cases
where certain nodes might legitimately need to keep
kube-proxy or CNI networking disabled
2020-03-25 22:43:50 -07:00
Dalton Hubble
e76f0a09fa Switch from upstream hyperkube to component images
* Kubernetes plans to stop releasing the hyperkube image in
the future.
* Upstream will continue releasing container images for
`kube-apiserver`, `kube-controller-manager`, `kube-proxy`,
and `kube-scheduler`. Typhoon will use these images
* Upstream will release the kubelet as a binary for distros
to package, either as a traditional DEB/RPP or as a container
image for container-optimized operating systems. Typhoon will
take on the packaging of Kubelet and its dependencies as a new
container image (alongside kubectl)

Rel: https://github.com/kubernetes/kubernetes/pull/88676
See: https://github.com/poseidon/kubelet
2020-03-17 22:13:42 -07:00
Dalton Hubble
dce49114a0 Fix terraform format with fmt 2019-12-05 01:02:01 -08:00
Dalton Hubble
4d7484f72a Change asset_dir variable from required to optional
* `asset_dir` is an absolute path to a directory where generated
assets from terraform-render-bootstrap are written (sensitive)
* Change `asset_dir` to default to "" so no assets are written
(favor Terraform output mechanisms). Previously, asset_dir was
required so all users set some path. To take advantage of the
new optionality, remove asset_dir or set it to ""
2019-12-05 00:56:54 -08:00
Dalton Hubble
6c7ba3864f Introduce a Terraform output map with distribution assets
* Introduce a new `assets_dist` output variable that provides
a mapping from suggested asset paths to asset contents (for
assets that should be distributed to controller nodes). This
new output format is intended to align with a modified asset
distribution style in Typhoon.
* Lay the groundwork for `assets_dir` to become optional. The
output map provides output variable access to the minimal assets
that are required for bootstrap
* Assets that aren't required for bootstrap itself (e.g.
the etcd CA key) but can be used by admins may later be added
as specific output variables to further reduce asset_dir use

Background:

* `terraform-render-bootstrap` rendered assets were previously
only provided by rendering files to an `asset_dir`. This was
neccessary, but created a responsibility to maintain those
assets on the machine where terraform apply was run
2019-12-04 20:15:40 -08:00
Dalton Hubble
1bba891d95 Adopt Terraform v0.12 templatefile function
* Adopt Terrform v0.12 type and templatefile function
features to replace the use of terraform-provider-template's
`template_dir`
* Use of `for_each` to write local assets requires
that consumers use Terraform v0.12.6+ (action required)
* Continue use of `template_file` as its quite common. In
future, we may replace it as well.
* Remove outputs `id` and `content_hash` (no longer used)

Background:

* `template_dir` was added to `terraform-provider-template`
to add support for template directory rendering in CoreOS
Tectonic Kubernetes distribution (~2017)
* Terraform v0.12 introduced a native `templatefile` function
and v0.12.6 introduced native `for_each` support (July 2019)
that makes it possible to replace `template_dir` usage
2019-11-13 14:05:01 -08:00