68 Commits

Author SHA1 Message Date
Jeff McCune
340715f76c (#36) Provide certs to Cockroach DB and Zitadel with ExternalSecrets
This patch switches CockroachDB to use certs provided by ExternalSecrets
instead of managing Certificate resources in-cluster from the upstream
helm chart.

This paves the way for multi-cluster replication by moving certificates
outside of the lifecycle of the workload cluster cockroach db operates
within.

Closes: #36
2024-03-06 10:38:47 -08:00
Jeff McCune
64ffacfc7a (#36) Add Cockroach Issuer for Zitadel to provisioner cluster
Issuing mtls certs for cockroach db moves to the provisioner cluster so
we can more easily support cross cluster replication in the future.
crdb certs will be synced same as public tls certs, using ExternalSecret
resources.
2024-03-06 09:36:20 -08:00
Jeff McCune
fd5a2fdbc1 (#36) Sync certs as ExternalSecrets from workload clusters
This patch replaces the httpbin and login cert on the workload clusters
with an ExternalSecret to sync the tls cert from the provisioner
cluster.
2024-03-05 17:05:10 -08:00
Jeff McCune
eb3e272612 (#36) Dynamically generate cluster certs from Platform spec
Each cluster should be more or less identical, configure certs from the
dynamic list of platform clusters.
2024-03-05 16:44:35 -08:00
Jeff McCune
2b3b5a4887 (#36) Issue login and httpbin certs
This patch uses cert manager in the provisioner cluster to provision tls
certs for https://login.example.com and https://httpbin.k2.example.com

The certs are not yet synced to the clusters.  Next step is to replace
the Certificate resources with ExternalSecret resources, then remove
cert manager from the workload clusters.
2024-03-05 14:27:37 -08:00
Jeff McCune
7426e8f867 (#36) Move cert-manager to the provisioner cluster
This patch moves certificate management to the provisioner cluster to
centralize all secrets into the highly secured cluster.  This change
also simplifies the architecture in a number of ways:

1. Certificate lives are now completely independent of cluster
   lifecycle.
2. Remove the need for bi-directional sync to save cert secrets.
3. Workload clusters no longer need access to DNS.
2024-03-05 12:51:58 -08:00
Jeff McCune
b3f682453d (#31) Inject istio sidecar into Deployment zitadel using Kustomize
Multiple holos components rely on kustomize to modify the output of the
upstream helm chart, for example patching a Deployment to inject the
istio sidecar.

The new holos cue based component system did not support running
kustomize after helm template.  This patch adds the kustomize execution
if two fields are defined in the helm chart kind of cue output.

The API spec is pretty loose in this patch but I'm proceeding for
expedience and to inform the final API with more use cases as more
components are migrated to cue.
2024-03-05 09:56:39 -08:00
Jeff McCune
0c3181ae05 (#31) Add VirtualService for Zitadel
Also import the Kustomize types using:

    cue get go sigs.k8s.io/kustomize/api/types/...
2024-03-04 17:18:46 -08:00
Jeff McCune
18cbff0c13 (#31) Add tls cert for zitadel to connect to cockroach db
Cockroach DB uses tls certs for client authentication.  Issue one for
Zitadel.

With this patch Zitadel starts up bit is not yet exposted with a
VirtualService.

Refer to https://zitadel.com/docs/self-hosting/manage/configure
2024-03-04 14:46:49 -08:00
Jeff McCune
b4fca0929c (#31) ExternalSecret for zitadel-masterkey 2024-03-04 14:31:27 -08:00
Jeff McCune
911d65bdc6 (#31) Setup login.ois.run with basic istio default Gateway
The istio default Gateway is the basis for what will become a dynamic
set of server entries specified from cue project data integrated with
extauthz.

For now we simply need to get the identity provider up and running as
the first step toward identity and access management.
2024-03-04 13:59:17 -08:00
Jeff McCune
9db4873205 (#31) Add Cockroach DB for Zitadel
Following https://github.com/zitadel/zitadel-charts/blob/main/examples/4-cockroach-secure/README.md
2024-03-04 10:31:39 -08:00
Jeff McCune
f90e83e142 (#30) Add httpbin Gateway and VirtualService
There isn't a default Gateway yet, so use a specific `httpbin` gateway
to test istio instead.
2024-03-02 21:12:03 -08:00
Jeff McCune
bdd2964edb (#30) Add httpbin Service for ns istio-ingress 2024-03-02 20:39:55 -08:00
Jeff McCune
56375b82d8 (#30) Fix httpbin Deployment selector match labels
Without this patch the deployment fails with:

```
Deployment/istio-ingress/httpbin dry-run failed, reason: Invalid:
Deployment.apps "httpbin" is invalid: spec.template.metadata.labels:
Invalid value:
map[string]string{"app.kubernetes.io/component":"httpbin",
"app.kubernetes.io/instance":"prod-mesh-httpbin",
"app.kubernetes.io/name":"mesh", "app .kubernetes.io/part-of":"prod",
"holos.run/component.name":"httpbin", "holos.run/project.name":"mesh",
"holos.run/stage.name":"prod", "sidecar.istio.io/inject":"true"}:
`selector` does not match template `labels`
```
2024-03-02 20:23:23 -08:00
Jeff McCune
dc27489249 (#30) Add httpbin Deployment in istio-ingress namespace
This patch gets the Deployment running with a restricted seccomp
profile.
2024-03-02 20:17:16 -08:00
Jeff McCune
7d8a618e25 (#30) Add httpbin Certificate to verify the mesh
Also fix certmanager which was not installing role bindings correctly
because the flux kustomization was writing over the metadata namespace
field.
2024-03-02 17:16:42 -08:00
Jeff McCune
646f6fcdb0 (#30) Add https redirect overlay resources
This patch migrates the https redirect and the
istio-ingressgateway-loopback Service from
`holos-infra/components/core/istio/ingress/templates/deployment`
2024-03-02 15:01:58 -08:00
Jeff McCune
4ce39db745 (#30) Enforce restricted pod security profile on istio-ingress namespace
This patch enforces the restricted pod security profile on the istio
ingress namespace. The istio cni to move the traffic redirection from
the init container to a cni daemon set pod.

Refer to:

 - https://istio.io/latest/docs/setup/additional-setup/pod-security-admission/
 - https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted
2024-03-02 11:16:55 -08:00
Jeff McCune
eba58d1639 (#30) Add ingress component and istio-ingressgateway Deployment
Migrated from holos-infra/components/core/istio/ingress
2024-03-02 10:22:21 -08:00
Jeff McCune
765832d90d (#30) Trim istiod 2024-03-01 16:27:49 -08:00
Jeff McCune
d1163d689a (#30) Add istiod istio controller and meshconfig
This patch adds the standard istiod controller, which depends on
istio-base.

The holos reference platform heavily customizes the meshconfig, so the
upstream istio ConfigMap is disabled in the helm chart values.  The mesh
config is generated from cue data defined in the controller holos
component.

Note: This patch adds a static configuration for the istio meshconfig in
the meshconfig.cue file.  The extauthz providers are a core piece of
functionality in the holos reference platform and a key motivation of
moving to CUE from Helm is the need to dynamically generate the
meshconfig from a platform scoped set of projects and services across
multiple clusters.

For expedience this dynamic generation is not part of this patch but is
expected to replace the static meshconfig once the cluster is more fully
configured with the new cue based holos command line interface.
2024-03-01 16:13:19 -08:00
Jeff McCune
63009ba419 (#30) Fix cue formatting 2024-03-01 10:35:32 -08:00
Jeff McCune
9c42cf9109 (#30) Import istio crds into cue definitions
❯ timoni mod vendor crds -f ~/workspace/holos-run/holos-infra/deploy/clusters/k2/components/prod-mesh-istio-base/prod-mesh-istio-base.gen.yaml
10:30AM INF schemas vendored: extensions.istio.io/wasmplugin/v1alpha1
10:30AM INF schemas vendored: install.istio.io/istiooperator/v1alpha1
10:30AM INF schemas vendored: networking.istio.io/destinationrule/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/destinationrule/v1beta1
10:30AM INF schemas vendored: networking.istio.io/envoyfilter/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/gateway/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/gateway/v1beta1
10:30AM INF schemas vendored: networking.istio.io/proxyconfig/v1beta1
10:30AM INF schemas vendored: networking.istio.io/serviceentry/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/serviceentry/v1beta1
10:30AM INF schemas vendored: networking.istio.io/sidecar/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/sidecar/v1beta1
10:30AM INF schemas vendored: networking.istio.io/virtualservice/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/virtualservice/v1beta1
10:30AM INF schemas vendored: networking.istio.io/workloadentry/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/workloadentry/v1beta1
10:30AM INF schemas vendored: networking.istio.io/workloadgroup/v1alpha3
10:30AM INF schemas vendored: networking.istio.io/workloadgroup/v1beta1
10:30AM INF schemas vendored: security.istio.io/authorizationpolicy/v1
10:30AM INF schemas vendored: security.istio.io/authorizationpolicy/v1beta1
10:30AM INF schemas vendored: security.istio.io/peerauthentication/v1beta1
10:30AM INF schemas vendored: security.istio.io/requestauthentication/v1
10:30AM INF schemas vendored: security.istio.io/requestauthentication/v1beta1
10:30AM INF schemas vendored: telemetry.istio.io/telemetry/v1alpha1
2024-03-01 10:31:52 -08:00
Jeff McCune
3fce5188a2 (#30) Add holos cue instance prod-mesh-istio-base
This patch installs the istio base helm chart from upstream which
includes the custom resource definitions.
2024-03-01 10:28:54 -08:00
Jeff McCune
fde88ad5eb (#30) Add #DependsOn struct to unify dependencies
Using a list to merge dependencies through the tree from root to leaf is
challenging.  This patch uses a #DependsOn struct instead then builds
the list of dependencies for flux from the struct field values.
2024-03-01 10:13:55 -08:00
Jeff McCune
7a8d30f833 (#30) Mesh istio-system istio-ingress namespaces
Need to be in place with privileged pod security policies.
2024-03-01 09:35:57 -08:00
Jeff McCune
8987442b91 (#27) Add cert-manager ExternalSecret cloudflare-api-token-secret
This enables the dns01 letsencrypt acme solver and is heavily used in
the reference platform.

Secret migrated from Vault using:

```bash
vault kv get -format=json -field data kv/k8s/ns/cert-manager/cloudflare-api-token-secret \
  | holos create secret --namespace cert-manager cloudflare-api-token-secret --data-stdin --append-hash=false
```
2024-03-01 08:44:06 -08:00
Jeff McCune
a6af3a46cf (#27) Manage SecretStore with platform namespaces
It makes sense to manage the SecretStore along with the Namespace in the
platform namespaces holos component.  Otherwise, the first component
that needs an ExternalSecret also needs to manage a SecretStore, which
creates an artificial dependency for subesequent components that also
need a SecretStore in the same namespace.

Best to just have all components depend on the namespaces component.
2024-03-01 08:05:00 -08:00
Jeff McCune
71d545a883 (#27) Add cert-manager LetsEncrypt issuers
This patch partially adds the Let's Encrypt issuers.  The platform data
expands to take a contact email and a cloudflare login email.

The external secret needs to be added next.
2024-02-29 21:40:55 -08:00
Jeff McCune
044d3082d9 (#27) Add cert-manager custom resource definitions
Without this patch the cert-manager component is missing the custom
resource definitions.

This patch adds them using the helm installCRDs value.
2024-02-29 20:46:42 -08:00
Jeff McCune
c2d5c4ad36 (#27) Add cert-manager to the mesh collection
Straight-forward helm install with no customization.

This patch also adds a "Skip" output kind which allows intermediate cue
files in the tree to signal holos to skip over the instance.  This
enables constraints to be added at intermediate layers without build
errors.
2024-02-29 16:50:27 -08:00
Jeff McCune
ab03ef1052 (#27) Refactor top level schema
Remove content and contentType top level keys, deprecated in favor of
apiObjects.

Clarify toward the use of #CollectionName instead of project name.
2024-02-29 15:48:54 -08:00
Jeff McCune
8c76061b0d (#27) Add recommended labels and sort output
Add the recommended labels mapping to holos stage, project, and
component names.  Project will eventually be renamed to "collection" or
something.

Example:

    app.kubernetes.io/part-of: prod
    app.kubernetes.io/name: secrets
    app.kubernetes.io/component: validate
    app.kubernetes.io/instance: prod-secrets-validate

Also sort the api objects produced from cue so the output of the `holos
render` command is stable for git commits.
2024-02-29 15:12:19 -08:00
Jeff McCune
f60db8fa1f (#25) Show name of api object in errors
This patch changes the interface between CUE and Holos to remove the
content field and replace it with an api object map.  The map is a
`map[string]map[string]string` with the rendered yaml as the value of a
kind/name nesting.

This structure enables better error messages, cue disjunction errors
indicate the type and the name of the resource instead of just the list
index number.
2024-02-29 11:23:49 -08:00
Jeff McCune
eefc092ea9 (#22) Copy external secret data files one for one
Without this patch the secret data was nested under a key with the same
name as the secret name.  This caused the ceph controller to not find
the values.

This patch changes the golden path for #ExternalSecret to copy all data
keys 1:1 from the external to the target in the cluster.
2024-02-28 16:51:26 -08:00
Jeff McCune
0860ac3409 (#22) Rename ceph secret to include ClusterName
Without this patch all clusters would use the same ceph secret from the
provisioner cluster.  This is a problem because ceph credentials are
unique per cluster.

This patch renames the ceph secret to have a cluster name prefix.

The secret is created with:

```bash
vault kv get -format=json -field data kv/k2/kube-namespace/ceph-csi-rbd/csi-rbd-secret \
  | holos create secret --namespace ceph-system k2-ceph-csi-rbd --cluster-name=k2 --data-stdin --append-hash=false
```
2024-02-28 16:14:22 -08:00
Jeff McCune
6b156e9883 (#22) Label ns ceph-system with pod-security enforce: privileged
This patch adds the `pod-security.kubernetes.io/enforce: privileged`
label to the ceph-system namespace.

The Namespace resources are managed all over the map, it would be a good
idea to consolidate the PlatformNamespaces data into one well known
place for the entire platform.  Eschewing for now.
2024-02-28 15:57:01 -08:00
Jeff McCune
4c5429b64a (#22) Ceph CSI for Metal clusters
This patch adds the ceph-csi-rbd helm chart component to the metal
cluster type.  The purpose is to enable PersistentVolumeClaims on ois
metal clusters.

Cloud clusters like GKE and EKS are expected to skip rendering the metal
type.

Helm values are handled with CUE.  The ceph secret is managed as an
ExternalSecret resource, appended to the rendered output by cue and the
holos cli.

Use:

    ❯ holos render --cluster-name=k2 ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/metal/...
    2:45PM INF render.go:40 rendered prod-metal-ceph version=0.47.0 status=ok action=rendered name=prod-metal-ceph
2024-02-28 14:46:03 -08:00
Jeff McCune
6090ab224e (#14) Validate secrets fetched from provisioner cluster
This patch validates secrets are synced from the provisioner cluster to
a workload cluster.  This verifies the eso-creds-refresher job, external
secrets operator, etc...

Refer to
0ae58858f5
for the corresponding commit on the k2 cluster.
2024-02-27 15:55:17 -08:00
Jeff McCune
10e140258d (#15) Report multiple cue errors
This patch prints out the cue file and line numbers when a cue error
contains multiple go errors to unwrap.

For example:

```
❯ holos render --cluster-name=k2 ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/workload/...
3:31PM ERR could not execute version=0.46.0 err="could not decode: content: error in call to encoding/yaml.MarshalStream: incomplete value string (and 1 more errors)" loc=builder.go:212
content: error in call to encoding/yaml.MarshalStream: incomplete value string:
    /home/jeff/workspace/holos-run/holos/docs/examples/schema.cue:199:11
    /home/jeff/workspace/holos-run/holos/docs/examples/cue.mod/gen/external-secrets.io/externalsecret/v1beta1/types_gen.cue:83:14
```
2024-02-27 15:32:11 -08:00
Jeff McCune
b4ad6425e5 (#14) Validate SecretStore works
This patch validates a SecretStore in the holos-system namespace works
after provisioner credentials are refreshed.
2024-02-27 11:25:00 -08:00
Jeff McCune
3343d226e5 (#14) Fix namespaces "external-secrets" not found
Needed for the `prod-secrets-eso` component to reconcile with flux.

NAME                                    REVISION                SUSPENDED       READY   MESSAGE
flux-system                             main@sha1:28b9ab6b      False           True    Applied revision: main@sha1:28b9ab6b
prod-secrets-eso                        main@sha1:28b9ab6b      False           True    Applied revision: main@sha1:28b9ab6b
prod-secrets-eso-creds-refresher        main@sha1:28b9ab6b      False           True    Applied revision: main@sha1:28b9ab6b
prod-secrets-namespaces                 main@sha1:28b9ab6b      False           True    Applied revision: main@sha1:28b9ab6b
2024-02-26 20:53:43 -08:00
Jeff McCune
51f22443f3 Move secrets project components to the workload cluster
Goal is to render all of the flux kustomization components with:

```
❯ holos render --cluster-name=k2 ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/workload/...
4:47PM INF render.go:39 rendered prod-secrets-eso version=0.42.1 status=ok action=rendered name=prod-secrets-eso
4:47PM INF render.go:39 rendered prod-secrets-eso-creds-refresher version=0.42.1 status=ok action=rendered name=prod-secrets-eso-creds-refresher
4:47PM INF render.go:39 rendered prod-secrets-namespaces version=0.42.1 status=ok action=rendered name=prod-secrets-namespaces
```
2024-02-21 16:45:48 -08:00
Jeff McCune
e98ee28f74 Add eso-creds-refresher CronJob
This patch adds the `eso-creds-refresher` CronJob which executes every 8
hours in the holos-system namespace of each workload cluster.  The job
creates Secrets with a `token` field representing the id token
credential for a SecretStore to use when synchronizing secrets to and
from the provisioner cluster.

Service accounts in the provisioner cluster are selected with
selector=holos.run/job.name=eso-creds-refresher.

Each selected service account has a token issued with a 12 hour
expiration ttl and is stored in a Secret matching the service account
name in the same namespace in the workload cluster.

The job takes about 25 seconds to run once the image is cached on the
node.
2024-02-21 15:09:26 -08:00
Jeff McCune
b16d3459f7 Allow eso-creds-refresher iam service account to list ksas
Without this patch the Job on a workload cluster fails with:

```
+ kubectl get serviceaccount -A --selector=holos.run/job.name=eso-creds-refresher --output=json
Error from server (Forbidden): serviceaccounts is forbidden: User
"eso-creds-refresher@holos-run.iam.gserviceaccount.com" cannot list
resource "serviceaccounts" in API group "" at the cluster scope:
requires one of ["container.serviceAccounts.list"] permission(s).
```
2024-02-21 11:13:04 -08:00
Jeff McCune
f41b883dce Add holos.run/job.name=eso-creds-refresher label to ksa
This label is intended for the Job to select which service accounts to
issue tokens for.  For example:

  kubectl get serviceaccount -A --selector=holos.run/job.name=eso-creds-refresher --output=json
2024-02-21 11:03:33 -08:00
Jeff McCune
572281914c Remove view role from eso-creds-refresher
Listing namespaces is sufficient, viewing all resources isn't necessary.
2024-02-21 10:32:41 -08:00
Jeff McCune
4cdf9d2dae Refactor eso-reader and eso-writer provisioner service accounts
Without this patch it is difficult to navigate the structure of the
configuration of the api objects because they're positional elements in
a list.

This patch extracts the configuration of the eso-reader and eso-writer
ServiceAccount, Role, and RoleBinding structs into a definition that
behaves like a function.  The individual objects are fields of the
struct instead of positional elements in a list.
2024-02-21 10:08:39 -08:00
Jeff McCune
fd306aae76 Pod eso-creds-refresher authenticates to provisioner
This patch adds a ConfigMap and Pod to the eso-creds-refresher
component.  The Pod executes the gcloud container, impersonates the
eso-creds-refresher iam service account using workload identity, then
authenticates to the remote provisioner cluster.

This is the foundation for a script to automatically create Secret API
objects in a workload cluster which have a kubernetes service account
token ESO SecretStore resources can use to fetch secrets from the
provisioner cluster.

Once we have that script in place we can turn this Pod into a Job and
replace Vault.
2024-02-20 17:45:43 -08:00