The [Streaming Standby][standby] architecture requires custom tls certs
for two clusters in two regions to connect to each other.
This patch manages the custom certs following the configuration
described in the article [Using Cert Manager to Deploy TLS for Postgres
on Kubernetes][article].
NOTE: One thing not mentioned anywhere in the crunchy documentation is
how custom tls certs work with pgbouncer. The pgbouncer service uses a
tls certificate issued by the pgo root cert, not by the custom
certificate authority.
For this reason, we use kustomize to patch the zitadel Deployment and
the zitadel-init and zitadel-setup Jobs. The patch projects the ca
bundle from the `zitadel-pgbouncer` secret into the zitadel pods at
`/pgbouncer/ca.crt`.
[standby]: https://access.crunchydata.com/documentation/postgres-operator/latest/architecture/disaster-recovery#streaming-standby-with-an-external-repo
[article]: https://www.crunchydata.com/blog/using-cert-manager-to-deploy-tls-for-postgres-on-kubernetes
A full backup was taken using:
```
kubectl annotate postgrescluster zitadel postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
```
And completed with:
```
❯ k logs -f zitadel-backup-5r6v-v5jnm
time="2024-03-10T21:52:15Z" level=info msg="crunchy-pgbackrest starts"
time="2024-03-10T21:52:15Z" level=info msg="debug flag set to false"
time="2024-03-10T21:52:15Z" level=info msg="backrest backup command requested"
time="2024-03-10T21:52:15Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=2 --type=full]"
time="2024-03-10T21:55:18Z" level=info msg="crunchy-pgbackrest ends"
```
This patch verifies the point in time backup is robust in the face of
the following operations:
1. pg cluster zitadel was deleted (whole namespace emptied)
2. pg cluster zitadel was re-created _without_ a `dataSource`
3. pgo initialized a new database and backed up the blank database to
S3.
4. pg cluster zitadel was deleted again.
5. pg cluster zitadel was re-created with `dataSource` `options: ["--type=time", "--target=\"2024-03-10 21:56:00+00\""]` (Just after the full backup completed)
6. Restore completed successfully.
7. Applied the holos zitadel component.
8. Zitadel came up successfully and user login worked as expected.
- [x] Perform an in place [restore][restore] from [s3][bucket].
- [x] Set repo1-retention-full to clear warning
[restore]: https://access.crunchydata.com/documentation/postgres-operator/latest/tutorials/backups-disaster-recovery/disaster-recovery#restore-properties
[bucket]: https://access.crunchydata.com/documentation/postgres-operator/latest/tutorials/backups-disaster-recovery/disaster-recovery#cloud-based-data-source
To establish the canonical https://login.ois.run identity issuer on the
core cluster pair.
Custom resources for PGO have been imported with:
timoni mod vendor crds -f deploy/clusters/core2/components/prod-pgo-crds/prod-pgo-crds.gen.yaml
Note, the zitadel tls connection took considerable effort to get
working. We intentionally use pgo issued certs to reduce the toil of
managing certs issued by cert manager.
The default tls configuration of pgo is solid, with verify full
enabled.
The core2 cluster cannot provision pvcs because it uses the k8s-dev
pool while its credentials are valid only for the k8s-prod pool.
This patch adds an entry to the platform cluster map to configure the
pool for each cluster, with a default of k8s-dev.
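The lookup with a default can be sketched in Go as below. This is only an illustration of the behavior described above, not the CUE code in the patch. The map name, the function name, and the `some-other-cluster` entry are hypothetical; the `core2` to `k8s-prod` pairing and the `k8s-dev` default come from the text.

```go
package main

import "fmt"

// defaultPool is the fallback pool named in the text.
const defaultPool = "k8s-dev"

// clusterPools maps a cluster name to its pool (hypothetical Go stand-in
// for the platform cluster map).
var clusterPools = map[string]string{
	"core2": "k8s-prod",
}

// poolFor returns the pool configured for a cluster, or the default when
// the cluster has no entry in the map.
func poolFor(cluster string) string {
	if pool, ok := clusterPools[cluster]; ok {
		return pool
	}
	return defaultPool
}

func main() {
	fmt.Println(poolFor("core2"))
	fmt.Println(poolFor("some-other-cluster"))
}
```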
PGO uses plain yaml and kustomize as the recommended installation
method. Holos supports upstream by adding a new PlainFiles component
kind, which simply copies files into place and lets kustomize handle the
generation of the api objects.
Cue is responsible for very little in this kind of component, basically
allowing overlay resources if needed and deferring everything else to
the holos cli.
The holos cli in turn is responsible for executing `kubectl kustomize`
on the input directory to produce the rendered output, then writing the
rendered output into place.
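A minimal Go sketch of the kustomize step follows. It assumes the cli shells out to `kubectl kustomize <dir>`; the function name and the example directory are hypothetical, and the real holos code may build the command differently.

```go
package main

import (
	"fmt"
	"os/exec"
)

// kustomizeCommand returns the command that renders the kustomization
// found in dir. The caller decides when to run it and where to write
// the output.
func kustomizeCommand(dir string) *exec.Cmd {
	return exec.Command("kubectl", "kustomize", dir)
}

func main() {
	// Hypothetical input directory holding the copied plain files.
	cmd := kustomizeCommand("./components/pgo")
	fmt.Println(cmd.Args)
	// Running the command requires kubectl on PATH:
	//   out, err := cmd.Output()
}
```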
The resource names for the arc controller are too long:
❯ k get pods -n arc-systems
NAME READY STATUS RESTARTS AGE
gha-runner-scale-set-controller-gha-rs-controller-6bdf45bd6jx5n 1/1 Running 0 59m
Solve the problem by allowing components to set the release name to
`gha-rs-controller` which requires an additional field from the cue code
to differentiate from the chart name.
Multiple holos components rely on kustomize to modify the output of the
upstream helm chart, for example patching a Deployment to inject the
istio sidecar.
The new holos cue based component system did not support running
kustomize after helm template. This patch adds the kustomize execution
if two fields are defined in the helm chart kind of cue output.
The API spec is pretty loose in this patch but I'm proceeding for
expedience and to inform the final API with more use cases as more
components are migrated to cue.
This patch migrates the https redirect and the
istio-ingressgateway-loopback Service from
`holos-infra/components/core/istio/ingress/templates/deployment`
Using a list to merge dependencies through the tree from root to leaf is
challenging. This patch uses a #DependsOn struct instead then builds
the list of dependencies for flux from the struct field values.
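The idea can be sketched in Go as a rough analogy: entries keyed by name merge cleanly across layers, and the list is derived from the values at the end. The type and function names are hypothetical, and the override-on-conflict behavior is a simplification, since CUE unification would report a conflict instead.

```go
package main

import (
	"fmt"
	"sort"
)

// dependsOn is keyed by a stable field name so layers merge by key
// rather than by list position.
type dependsOn map[string]string

// merge overlays layers in order from root to leaf.
func merge(layers ...dependsOn) dependsOn {
	out := dependsOn{}
	for _, layer := range layers {
		for key, value := range layer {
			out[key] = value
		}
	}
	return out
}

// names builds the dependency list from the struct field values,
// sorted so the result is deterministic.
func names(d dependsOn) []string {
	list := make([]string, 0, len(d))
	for _, value := range d {
		list = append(list, value)
	}
	sort.Strings(list)
	return list
}

func main() {
	root := dependsOn{"namespaces": "prod-secrets-namespaces"}
	leaf := dependsOn{"eso": "prod-secrets-eso"}
	fmt.Println(names(merge(root, leaf)))
}
```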
It makes sense to manage the SecretStore along with the Namespace in the
platform namespaces holos component. Otherwise, the first component
that needs an ExternalSecret also needs to manage a SecretStore, which
creates an artificial dependency for subsequent components that also
need a SecretStore in the same namespace.
Best to just have all components depend on the namespaces component.
This patch partially adds the Let's Encrypt issuers. The platform data
expands to take a contact email and a cloudflare login email.
The external secret needs to be added next.
Straightforward helm install with no customization.
This patch also adds a "Skip" output kind which allows intermediate cue
files in the tree to signal holos to skip over the instance. This
enables constraints to be added at intermediate layers without build
errors.
Add the recommended labels mapping to holos stage, project, and
component names. Project will eventually be renamed to "collection" or
something.
Example:
app.kubernetes.io/part-of: prod
app.kubernetes.io/name: secrets
app.kubernetes.io/component: validate
app.kubernetes.io/instance: prod-secrets-validate
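The mapping in the example above can be written as a small Go function. This is a sketch: the function name is hypothetical, and the assignment of stage to `part-of` and project to `name` is inferred from the example values rather than stated in the patch.

```go
package main

import "fmt"

// recommendedLabels maps holos stage, project, and component names onto
// the Kubernetes recommended labels. The instance label joins all three.
func recommendedLabels(stage, project, component string) map[string]string {
	return map[string]string{
		"app.kubernetes.io/part-of":   stage,
		"app.kubernetes.io/name":      project,
		"app.kubernetes.io/component": component,
		"app.kubernetes.io/instance":  stage + "-" + project + "-" + component,
	}
}

func main() {
	labels := recommendedLabels("prod", "secrets", "validate")
	fmt.Println(labels["app.kubernetes.io/instance"]) // prod-secrets-validate
}
```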
Also sort the api objects produced from cue so the output of the `holos
render` command is stable for git commits.
This patch changes the interface between CUE and Holos to remove the
content field and replace it with an api object map. The map is a
`map[string]map[string]string` with the rendered yaml as the value of a
kind/name nesting.
This structure enables better error messages: cue disjunction errors
indicate the type and the name of the resource instead of just the list
index number.
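A Go sketch of consuming such a map is shown below. It assumes the outer key is the kind and the inner key is the resource name, and it walks both levels in sorted order so the rendered output stays stable between runs. The type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// apiObjectMap holds rendered yaml keyed by kind, then by name.
type apiObjectMap map[string]map[string]string

// render joins the yaml documents in kind/name order. Go map iteration
// order is random, so the keys are sorted explicitly.
func render(objects apiObjectMap) string {
	kinds := make([]string, 0, len(objects))
	for kind := range objects {
		kinds = append(kinds, kind)
	}
	sort.Strings(kinds)

	var docs []string
	for _, kind := range kinds {
		names := make([]string, 0, len(objects[kind]))
		for name := range objects[kind] {
			names = append(names, name)
		}
		sort.Strings(names)
		for _, name := range names {
			docs = append(docs, strings.TrimSpace(objects[kind][name]))
		}
	}
	return strings.Join(docs, "\n---\n") + "\n"
}

func main() {
	fmt.Print(render(apiObjectMap{
		"Secret":    {"example": "kind: Secret"},
		"Namespace": {"example": "kind: Namespace"},
	}))
}
```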
Without this patch the secret data was nested under a key with the same
name as the secret name. This caused the ceph controller to not find
the values.
This patch changes the golden path for #ExternalSecret to copy all data
keys 1:1 from the external to the target in the cluster.
Without this patch all clusters would use the same ceph secret from the
provisioner cluster. This is a problem because ceph credentials are
unique per cluster.
This patch renames the ceph secret to have a cluster name prefix.
The secret is created with:
```bash
vault kv get -format=json -field data kv/k2/kube-namespace/ceph-csi-rbd/csi-rbd-secret \
| holos create secret --namespace ceph-system k2-ceph-csi-rbd --cluster-name=k2 --data-stdin --append-hash=false
```
This patch adds the `pod-security.kubernetes.io/enforce: privileged`
label to the ceph-system namespace.
The Namespace resources are managed all over the map. It would be a good
idea to consolidate the PlatformNamespaces data into one well known
place for the entire platform. Deferring that for now.
This patch adds the ceph-csi-rbd helm chart component to the metal
cluster type. The purpose is to enable PersistentVolumeClaims on ois
metal clusters.
Cloud clusters like GKE and EKS are expected to skip rendering the metal
type.
Helm values are handled with CUE. The ceph secret is managed as an
ExternalSecret resource, appended to the rendered output by cue and the
holos cli.
Use:
❯ holos render --cluster-name=k2 ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/metal/...
2:45PM INF render.go:40 rendered prod-metal-ceph version=0.47.0 status=ok action=rendered name=prod-metal-ceph
This patch validates secrets are synced from the provisioner cluster to
a workload cluster. This verifies the eso-creds-refresher job, the
external secrets operator, and so on.
Refer to
0ae58858f5
for the corresponding commit on the k2 cluster.
This patch prints out the cue file and line numbers when a cue error
contains multiple go errors to unwrap.
For example:
```
❯ holos render --cluster-name=k2 ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/workload/...
3:31PM ERR could not execute version=0.46.0 err="could not decode: content: error in call to encoding/yaml.MarshalStream: incomplete value string (and 1 more errors)" loc=builder.go:212
content: error in call to encoding/yaml.MarshalStream: incomplete value string:
/home/jeff/workspace/holos-run/holos/docs/examples/schema.cue:199:11
/home/jeff/workspace/holos-run/holos/docs/examples/cue.mod/gen/external-secrets.io/externalsecret/v1beta1/types_gen.cue:83:14
```
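The unwrapping can be illustrated with the Go standard library alone (Go 1.20 or later). This is a sketch, not the holos implementation: `posError` is a hypothetical stand-in for an error that carries source positions, and the real code presumably reads positions from CUE's own error types.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// posError is a hypothetical error type carrying file:line:column positions.
type posError struct {
	msg       string
	positions []string
}

func (e *posError) Error() string { return e.msg }

// flatten returns every leaf error beneath err, following both
// Unwrap() error and Unwrap() []error.
func flatten(err error) []error {
	if err == nil {
		return nil
	}
	switch e := err.(type) {
	case interface{ Unwrap() []error }:
		var leaves []error
		for _, child := range e.Unwrap() {
			leaves = append(leaves, flatten(child)...)
		}
		return leaves
	case interface{ Unwrap() error }:
		if inner := e.Unwrap(); inner != nil {
			return flatten(inner)
		}
	}
	return []error{err}
}

// describe prints each leaf error followed by its positions, if any.
func describe(err error) string {
	var b strings.Builder
	for _, leaf := range flatten(err) {
		b.WriteString(leaf.Error() + "\n")
		var pe *posError
		if errors.As(leaf, &pe) {
			for _, pos := range pe.positions {
				b.WriteString("    " + pos + "\n")
			}
		}
	}
	return b.String()
}

// sampleError builds a wrapped error containing two joined errors.
func sampleError() error {
	joined := errors.Join(
		&posError{
			msg:       "incomplete value string",
			positions: []string{"schema.cue:199:11", "types_gen.cue:83:14"},
		},
		errors.New("second error"),
	)
	return fmt.Errorf("could not decode: %w", joined)
}

func main() {
	fmt.Print(describe(sampleError()))
}
```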
This patch adds the `eso-creds-refresher` CronJob which executes every 8
hours in the holos-system namespace of each workload cluster. The job
creates Secrets with a `token` field representing the id token
credential for a SecretStore to use when synchronizing secrets to and
from the provisioner cluster.
Service accounts in the provisioner cluster are selected with
selector=holos.run/job.name=eso-creds-refresher.
Each selected service account has a token issued with a 12 hour
expiration ttl and is stored in a Secret matching the service account
name in the same namespace in the workload cluster.
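The timing above can be checked with a little arithmetic, sketched here in Go. The 8 hour schedule and 12 hour ttl come from the text; the function names are hypothetical, and the model assumes a failed run is not retried before the next scheduled run.

```go
package main

import (
	"fmt"
	"time"
)

const (
	refreshInterval = 8 * time.Hour  // CronJob schedule from the text
	tokenTTL        = 12 * time.Hour // token expiration from the text
)

// margin is how long a token stays valid after the next scheduled refresh.
func margin() time.Duration {
	return tokenTTL - refreshInterval
}

// validAfterMissedRuns reports whether a token issued at time zero is
// still valid when a refresh finally succeeds after the given number of
// consecutive failed runs.
func validAfterMissedRuns(missed int) bool {
	nextSuccess := time.Duration(missed+1) * refreshInterval
	return nextSuccess < tokenTTL
}

func main() {
	fmt.Println(margin())                // 4h0m0s
	fmt.Println(validAfterMissedRuns(0)) // true
	fmt.Println(validAfterMissedRuns(1)) // false
}
```

Under these assumptions the token outlives the next run by four hours, but a single missed run leaves a window in which the stored token has expired.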
The job takes about 25 seconds to run once the image is cached on the
node.
This patch adds a ConfigMap and Pod to the eso-creds-refresher
component. The Pod executes the gcloud container, impersonates the
eso-creds-refresher iam service account using workload identity, then
authenticates to the remote provisioner cluster.
This is the foundation for a script to automatically create Secret API
objects in a workload cluster, each holding a kubernetes service account
token that ESO SecretStore resources can use to fetch secrets from the
provisioner cluster.
Once we have that script in place we can turn this Pod into a Job and
replace Vault.
The provisioner cluster is a worker-less autopilot cluster that provides
secrets to other clusters in the platform. The `eso-creds-refresher`
Job in the holos-system namespace of each other cluster refreshes
service account tokens for SecretStores.
This patch adds the IAM structure for the Job implemented by Namespace,
ServiceAccount, Role, and RoleBinding api objects.
This patch adds a holos component to deploy a SecretStore and
ExternalSecret in the default namespace to validate authentication with
Vault is configured correctly.
The default ksa is used to authenticate to vault.
This patch makes it possible to build all components for a platform with
a single command:
❯ holos render ~/workspace/holos-run/holos/docs/examples/platforms/reference/...
2:51PM INF render.go:39 rendered prod-secrets-eso version=0.42.0 status=ok action=rendered name=prod-secrets-eso
2:51PM INF render.go:39 rendered prod-secrets-namespaces version=0.42.0 status=ok action=rendered name=prod-secrets-namespaces
Note the `reference/...` path base name. Without this patch cue tried
to build an intermediate directory instance.
In helm mode, cue is responsible for producing the values.yaml file.
Holos is responsible for taking the values produced by cue and providing
them to helm to produce rendered kubernetes api objects.
This patch adds intermediate data structures to hold the output from
cue: the helm values, the flux kustomization, and the helm charts to
provide the helm values to.
Holos takes this information and orchestrates running helm template to
render the api objects and write them to the file system for git ops.
Content seems a more appropriate field name, and it makes sense since
we are likely to output formats other than yaml, probably json too. We
need to discriminate on content type, so also add a contentType field.
Semantics are meant to be the same as the http content type header, but
simpler.
The intent is for all of the output formats to share a common `name`
field, useful to construct a file name to write rendered output to for
git ops.
This is equivalent to the OrderedComponent name specified in the
platform.yaml in the prototype.
Leaf directories can output different kinds of things:
1. Platform specification. A list of components to manage.
2. Kubernetes API Objects suitable for kubectl apply -f- and friends.
3. Helm values to provide to a helm chart to render API objects.
This patch adds an output schema and a kind discriminator so the holos
cli can figure out what type of output it's working with. This makes it
possible to have a single `holos build <directory>` command that does
the right thing.
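A kind discriminator can be sketched in Go as below. It assumes the cue output reaches the cli as JSON with a top-level `kind` field. The kind values and the action strings are hypothetical placeholders for the three output types listed above, not the names used in the holos schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// output decodes only the discriminator. The remaining fields would be
// decoded into a kind-specific type once the kind is known.
type output struct {
	Kind string `json:"kind"`
}

// dispatch returns a description of the action to take for the given
// cue output. Kind names here are illustrative.
func dispatch(data []byte) (string, error) {
	var out output
	if err := json.Unmarshal(data, &out); err != nil {
		return "", fmt.Errorf("could not decode output: %w", err)
	}
	switch out.Kind {
	case "PlatformSpec":
		return "manage the listed components", nil
	case "KubernetesObjects":
		return "write api objects", nil
	case "HelmValues":
		return "run helm template with the values", nil
	default:
		return "", fmt.Errorf("unknown kind %q", out.Kind)
	}
}

func main() {
	action, err := dispatch([]byte(`{"kind": "KubernetesObjects"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(action)
}
```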