Compare commits

...

209 Commits

Author SHA1 Message Date
Jeff McCune
2e2ed398c6 (#175) Fix tests 2024-05-20 11:32:29 -07:00
Jeff McCune
34f2a52cb7 (#175) Add holos render platform command
Split holos render into component and platform.

This patch splits the previous `holos render` command into subcommands.
`holos render component ./path/to/component/` behaves as the previous
`holos render` command and renders an individual component.

The new `holos render platform ./path/to/platform/` subcommand makes
space to render the entire platform using the platform model pulled from
the PlatformService.

Starting with an empty directory:

```sh
holos register user
holos generate platform bare
holos pull platform config .
holos render platform ./platform/
```

```txt
10:01AM INF platform.go:29 ok render component version=0.80.2 path=components/configmap cluster=k1 num=1 total=1 duration=448.133038ms
```

The bare platform has a single component which refers to the platform
model pulled from the PlatformService:

```sh
cat deploy/clusters/mycluster/components/platform-configmap/platform-configmap.gen.yaml
```

```yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: platform
  namespace: default
data:
  platform: |
    spec:
      model:
        cloud:
          providers:
            - cloudflare
        cloudflare:
          email: platform@openinfrastructure.co
        org:
          displayName: Open Infrastructure Services
          name: ois
```
2024-05-20 10:41:24 -07:00
Jeff McCune
d3888a884f (#175) go mod tidy 2024-05-20 06:32:53 -07:00
Jeff McCune
3845871368 (#175) holos pull platform config
This patch adds a subcommand to pull the data necessary to construct a
PlatformConfig DTO.  The PlatformConfig message contains all of the
fields and values necessary to build a platform and the platform
components.  This is an alternative to holos passing multiple tags to
CUE.  The PlatformConfig is marshalled and passed once.

The platform config is also stored in the local filesystem in the root
directory of the platform.  This enables repeated local building and
rendering without making an rpc call.

The build / render pipeline is expected to cache the PlatformConfig once
at the start of the pipeline using the pull subcommand.
2024-05-19 08:27:21 -07:00
Jeff McCune
a3b2d19adb (#175) Render the platform with the model
The `holos render platform` command is unimplemented.  This patch
partially implements platform rendering by fetching the platform model
from the PlatformService and providing it to CUE using a tag.

CUE returns a `kind: Platform` resource to `holos` which will eventually
process a Buildlan for each platform component listed in the Platform
spec.

For now, however, it's sufficient to have the current platform model
available to CUE.
2024-05-18 11:40:30 -07:00
Jeff McCune
e4e7cd8c47 (#175) Make holos render --cluster-name flag optional
Problem:
Rendering the whole platform doesn't need a cluster name.

Solution:
Make the flag optional, do not set the cue tag if it's empty.

Result:
Holos renders the platform resource and proceeds to the point where we
need to implement the iteration over platform components, passing the
platform model to each one and rendering the component.
2024-05-17 15:48:36 -07:00
Jeff McCune
fb22e5521b (#175) Define the Platform resource in CUE
We need to output a kind: Platform resource from cue so holos can
iterate over each build plan.  The platform resource itself should also
contain a copy of the platform model obtained from the PlatformService
so holos can easily pass the model to each BuildPlan it needs to execute
to render the full platform.

This patch lays the groundwork for the Platform resource.  A future
patch will have the holos cli obtain the platform model and inject it as
a JSON encoded string to CUE.  CUE will return the Platform resource
which is a list of references to build plans.  Holos will then iterate
over each build plan, pass the model back in, and execute the build
plan.

To illustrate where we're headed, the `cue export` step will move into
`holos` with a future patch.

```
❯ holos register user
3:34PM INF register.go:77 user version=0.80.0 email=jeff@ois.run server=https://app.dev.k2.holos.run:443 user_id=018f8839-3d74-7e39-afe9-181ad2fc8abe org_id=018f8839-3d74-7e3a-918c-b36494da0115
❯ holos generate platform bare
3:34PM INF generate.go:79 wrote platform.metadata.json version=0.80.0 platform_id=018f8839-3d74-7e3b-8cb8-77a2c124d173 path=/home/jeff/holos/dev/bare/platform.metadata.json
3:34PM INF generate.go:91 generated platform bare version=0.80.0 platform_id=018f8839-3d74-7e3b-8cb8-77a2c124d173 path=/home/jeff/holos/dev/bare
❯ holos push platform form .
3:34PM INF push.go:70 pushed: https://app.dev.k2.holos.run:443/ui/platform/018f8839-3d74-7e3b-8cb8-77a2c124d173 version=0.80.0
❯ cue export ./platform/
{
    "metadata": {
        "name": "bare",
        "labels": {},
        "annotations": {}
    },
    "spec": {
        "model": {}
    },
    "kind": "Platform",
    "apiVersion": "holos.run/v1alpha1"
}
```
2024-05-17 15:34:56 -07:00
Jeff McCune
d2ae766ae3 Merge pull request #176 from holos-run/dependabot/go_modules/github.com/docker/docker-26.0.2incompatible
Bump github.com/docker/docker from 26.0.0+incompatible to 26.0.2+incompatible
2024-05-17 11:53:44 -07:00
dependabot[bot]
c0db949729 Bump github.com/docker/docker
Bumps [github.com/docker/docker](https://github.com/docker/docker) from 26.0.0+incompatible to 26.0.2+incompatible.
- [Release notes](https://github.com/docker/docker/releases)
- [Commits](https://github.com/docker/docker/compare/v26.0.0...v26.0.2)

---
updated-dependencies:
- dependency-name: github.com/docker/docker
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-17 18:52:51 +00:00
Jeff McCune
d2d4337ffd (#175) Improve url output
❯ holos push platform form .
11:49AM INF push.go:70 pushed: https://app.dev.k2.holos.run:443/ui/platform/018f87d1-7ca2-7e37-97ed-a06bcee9b442 version=0.79.0
2024-05-17 11:49:04 -07:00
Jeff McCune
b0ca04635e (#175) Update the client context when switching servers
When the holos server URL switches, we also need to update the client
context to get the correct org id.

Also improve quality of life by printing the url to the form when the
platform form is pushed to the server.

❯ holos push platform form .
11:41AM INF push.go:71 updated platform form version=0.79.0 server=https://app.dev.k2.holos.run:443 platform_id=018f87d1-7ca2-7e37-97ed-a06bcee9b442
11:41AM INF push.go:72 https://app.dev.k2.holos.run:443/ui/platform/018f87d1-7ca2-7e37-97ed-a06bcee9b442 version=0.79.0
2024-05-17 11:43:52 -07:00
Jeff McCune
198c66e6cd (#175) Fix tests
Not sure why this started failing, but it wasn't necessary.
2024-05-17 10:22:35 -07:00
Jeff McCune
24346b9a38 (#172) Deploy v0.79.0 to dev 2024-05-17 10:15:05 -07:00
Jeff McCune
0639562f1c (#175) go mod tidy 2024-05-17 10:09:40 -07:00
Jeff McCune
c1fa9cc531 (#175) Fix lint 2024-05-17 10:08:06 -07:00
Jeff McCune
18653534ad (#175) Add holos push platform form command
This sub-command renders the web app form from CUE code and updates the
form using the `holos.platform.v1alpha1.PlatformService/UpdatePlatform`
rpc method.

Example use case, starting fresh:

```
rm -rf ~/holos
mkdir ~/holos
cd ~/holos
```

Step 1: Login

```sh
holos login
```

```txt
9:53AM INF login.go:40 logged in as jeff@ois.run version=0.79.0 name="Jeff McCune" exp="2024-05-17 21:16:07 -0700 PDT" email=jeff@ois.run
```

Step 2: Register to create server side resources.

```sh
holos register user
```

```
9:52AM INF register.go:68 user version=0.79.0 email=jeff@ois.run user_id=018f826d-85a8-751d-81ee-64d0f2775b3f org_id=018f826d-85a8-751e-98dd-a6cddd9dd8f0
```

Step 3: Generate the bare platform in the local filesystem.

```sh
holos generate platform bare
```

```txt
9:52AM INF generate.go:79 wrote platform.metadata.json version=0.79.0 platform_id=018f826d-85a8-751f-96d0-0d2bf70df909 path=/home/jeff/holos/platform.metadata.json
9:52AM INF generate.go:91 generated platform bare version=0.79.0 platform_id=018f826d-85a8-751f-96d0-0d2bf70df909 path=/home/jeff/holos
```

Step 4: Push the platform form to the `holos server` web app.

```sh
holos push platform form .
```

```txt
9:52AM INF client.go:67 updated platform version=0.79.0 platform_id=018f826d-85a8-751f-96d0-0d2bf70df909 duration=73.62995ms
```

At this point the platform form is published and functions as expected
when visiting the platform web interface.
2024-05-17 09:51:36 -07:00
Jeff McCune
2b89c33067 (#175) Add holos orgid command
Makes it easier to work with grpcurl:

    grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"org_id":"'$(holos orgid)'"}' ${HOLOS_SERVER##*/} holos.platform.v1alpha1.PlatformService.ListPlatforms
2024-05-16 21:11:24 -07:00
Jeff McCune
aee26d9375 (#175) Set header User-Agent: holos/0.70.0 (go1.22.2)
Previously: User-Agent: connect-go/1.16.0 (go1.22.2)
2024-05-16 20:49:06 -07:00
Jeff McCune
7b04d492ab (#175) Set http.Server ReadHeaderTimeout
Upstream connectrpc recommends it.  Refer to
https://connectrpc.com/docs/faq#stream-error
2024-05-16 20:28:31 -07:00
Jeff McCune
8abd03e165 (#175) Log x-request-id and x-b3-trace headers
This patch logs the x-request-id header which makes it straight forward
to correlate the logs with the service mesh logs.

For example, select the request id from the gateway logs by copying the
log from the holos server logs.

```sh
kubectl -n istio-ingress logs -l app=istio-ingressgateway -f \
  | grep --line-buffered '^{' \
  | jq 'select(.request_id=="'d0867115-5795-4096-942e-5ac188cdf618'")'
```

```json
{
  "upstream_local_address": "10.244.1.51:44248",
  "x_forwarded_for": "192.168.2.21",
  "authority": "jeff.app.dev.k2.holos.run:443",
  "upstream_transport_failure_reason": null,
  "connection_termination_details": null,
  "response_code": 200,
  "duration": 6,
  "response_flags": "-",
  "upstream_service_time": "5",
  "upstream_cluster": "outbound|3000||holos.jeff-holos.svc.cluster.local",
  "upstream_host": "10.244.1.249:3000",
  "user_agent": "connect-go/1.16.0 (go1.22.2)",
  "requested_server_name": "jeff.app.dev.k2.holos.run",
  "request_id": "d0867115-5795-4096-942e-5ac188cdf618",
  "start_time": "2024-05-17T03:16:37.900Z",
  "method": "POST",
  "protocol": "HTTP/2",
  "downstream_local_address": "65.102.23.41:443",
  "path": "/holos.user.v1alpha1.UserService/GetUser",
  "bytes_sent": 159,
  "downstream_remote_address": "192.168.2.21:59564",
  "response_code_details": "via_upstream",
  "bytes_received": 0,
  "route_name": "holos-api"
}
```
2024-05-16 20:14:34 -07:00
Jeff McCune
2df843bc98 (#175) Link the generated platform to holos server
When the user generates a platform, we need to know the platform ID it's
linked to in the holos server.  If there is no platform with the same
name, the `holos generate platform` command should error out.

This is necessary because the first thing we want to show is pushing an
updated form to `holos server`.  To update the web ui the CLI needs to
know the platform ID to update.

This patch modifies the generate command to obtain a list of platforms
for the org and verify the generated name matches one of the platforms
  that already exists.

A future patch could have the `generate platform` command call the
`holos.platform.v1alpha1.PlatformService.CreatePlatform` method if the
platform isn't found.

Results:

```sh
holos generate platform bare
```

```txt
4:15PM INF generate.go:77 wrote platform.metadata.json version=0.77.1 platform_id=018f826d-85a8-751f-96d0-0d2bf70df909 path=/home/jeff/holos/platform.metadata.json
4:15PM INF generate.go:89 generated platform bare version=0.77.1 platform_id=018f826d-85a8-751f-96d0-0d2bf70df909 path=/home/jeff/holos
```

```sh
cat platform.metadata.json
```

```json
{
  "id": "018f826d-85a8-751f-96d0-0d2bf70df909",
  "name": "bare",
  "display_name": "Bare Platform"
}
```
2024-05-16 16:18:38 -07:00
Jeff McCune
be4d2c29a5 (#175) Log info message when generating a platform
holos generate platform bare
    2:11PM INF generate.go:55 generated platform bare version=0.77.1 path=/home/jeff/holos
2024-05-16 14:26:51 -07:00
Jeff McCune
8ce88bf491 (#175) Fix goreleaser
Buf was being automatically updated in the pipeline.
2024-05-16 14:00:37 -07:00
Jeff McCune
b05571a595 (#175) Go tidy and update package.json
For goreleaser
2024-05-16 13:41:47 -07:00
Jeff McCune
4edfc71d68 (#175) Log the grpc procedure at info level
This patch logs the service and rpc method of every request at Info
level.  The error code and message is also logged.  This gives a good
indication of what rpc methods are being called and by whom.
2024-05-16 11:43:20 -07:00
Jeff McCune
3049694a0a (#175) holos register user
This patch adds a `holos register user` command.  Given an authenticated
id token and no other record of the user in the database, the cli tool
use the API to:

 1. User is registered in `holos server`
 2. User is linked to one Holos Organization.
 3. Holos Organization has the `bare` platform.
 4. Holos Organization has the `reference` platform.
 5. Ensure `~/.holos/client-context.json` contains the user id and an
    org id.

The `holos.ClientContext` struct is intended as a light weight way to
save and load the current organization id to the file system for further
API calls.

The assumption is most users will have only one single org.  We can add
a more complicated config context system like kubectl uses if and when
we need it.
2024-05-16 10:51:40 -07:00
Jeff McCune
5860c5747b (#87) generate sub-command with embedded platform
This patch adds a generate subcommand that copies a platform embedded
into the executable to the local filesystem.  The purpose is to
accelerate initial setup with canned example platforms.

Two platforms are intended to start, one bare and one reference
platform.  The number of platforms embedded into holos should be kept
small (2-3) to limit our support burden.
2024-05-14 15:03:21 -07:00
Jeff McCune
d3c2d55706 (#172) Deploy v0.76.0 to dev 2024-05-14 13:28:19 -07:00
Jeff McCune
ac2ff47a9c (#172) Wire Version Info in the UI
This patch adds the GetVersion rpc method to
holos.system.v1alpha1.SystemService and wires the version information up
to the Web UI.

This is a good example to crib from later regarding fetching and
refreshing data from the web ui using grpc and field masks.
2024-05-14 11:50:06 -07:00
Jeff McCune
9a2773c618 (#171) Refactor API to use FieldMasks
This patch refactors the API following the [API Best Practices][api]
documentation.  The UpdatePlatform method is modeled after a mutating
operation described [by Netflix][nflx] instead of using a REST resource
representation.  This makes it much easier to iterate over the fields
that need to be updated as the PlatformUpdateOperation is a flat data
structure while a Platform resource may have nested fields.  Nested
fields are more complicated and less clear to handle with a FieldMask.

This patch also adds a snapckbar message on save.  Previously, the save
button didn't give any indication of success or failure.  This patch
fixes the problem by adding a snackbar message that pop up at the bottom
of the screen nicely.

When the snackbar message is dismissed or times out the save button is
re-enabled.

[api]: https://protobuf.dev/programming-guides/api/
[nflx]: https://netflixtechblog.com/practical-api-design-at-netflix-part-2-protobuf-fieldmask-for-mutation-operations-2e75e1d230e4

Examples:

FieldMask for ListPlatforms

```
grpcurl -H "x-oidc-id-token: $(holos token)" -d @ ${HOLOS_SERVER##*/} holos.platform.v1alpha1.PlatformService.ListPlatforms <<EOF
{
  "org_id": "018f36fb-e3f7-7f7f-a1c5-c85fb735d215",
  "field_mask": { "paths": ["id","name"] }
}
EOF
```

```json
{
 "platforms": [
   {
     "id": "018f36fb-e3ff-7f7f-a5d1-7ca2bf499e94",
     "name": "bare"
   },
   {
     "id": "018f6b06-9e57-7223-91a9-784e145d998c",
     "name": "gary"
   },
   {
     "id": "018f6b06-9e53-7223-8ae1-1ad53d46b158",
     "name": "jeff"
   },
   {
     "id": "018f6b06-9e5b-7223-8b8b-ea62618e8200",
     "name": "nate"
   }
 ]
}
```

Closes: #171
2024-05-13 16:20:20 -07:00
Jeff McCune
51b6575d9f (#171) Refactor to API Best Practices
This patch refactors the API to be resource-oriented around one service
per resource type.  PlatformService, OrganizationService, UserService,
etc...

Validation is improved to use CEL rules provided by [protovalidate][1].

Place holders for FieldMask and other best practices are added, but are
unimplemented as per [API Best Practices][2].

The intent is to set us up well for copying and pasting solid existing
examples as we add features.

With this patch the server and web app client are both updated to use
the refactored API, however the following are not working:

 1. Update the model.
 2. Field Masks.

[1]: https://buf.build/bufbuild/protovalidate
[2]: https://protobuf.dev/programming-guides/api/
2024-05-10 15:55:41 -07:00
Jeff McCune
68a43f0682 (#167) Add holos rpc platform-model command
This command is just a prototype of how to fetch the platform model so
we can make it available to CUE.

The idea is we take the data from the holos server and write it into a
CUE `_Platform` struct.  This will probably involve converting the data
to CUE format and nesting it under the platform struct spec field.
2024-05-08 16:34:00 -07:00
Jeff McCune
9da88c4d1b (#169) ZITADEL ServerError - PGBouncer DNS
Add runbook notes.  Root cause and permanent solution have not been
identified yet.
2024-05-08 12:04:50 -07:00
Jeff McCune
19df2ec0fb (#167) Bump dev deployment to 0.74.0 2024-05-07 16:58:03 -07:00
Jeff McCune
bac7aec0ba (#167) Restructure the bare platform
This patch restructures the bare platform in preparation for a
`Platform` kind of output from CUE in addition to the existing
`BuildPlan` kind.

This patch establishes a pattern where our own CUE defined code goes
into the two CUE module paths:

1. `internal/platforms/cue.mod/gen/github.com/holos-run/holos/api/v1alpha1`
2. `internal/platforms/cue.mod/pkg/github.com/holos-run/holos/api/v1alpha1`
3. `internal/platforms/cue.mod/usr/github.com/holos-run/holos/api/v1alpha1`

The first path is automatically generated from Go structs.  The second
path is where we override and provide additional cue level integration.

The third path is reserved for the end user to further refine and
constrain our definitions.
2024-05-07 16:53:00 -07:00
Jeff McCune
42f916af41 (#164) Use quay.io/holos/oauth2-proxy:v7.6.0-1-g77a03ae2
Custom build to set samesite=none on the csrf cookie.
2024-05-06 16:18:32 -07:00
Jeff McCune
47a5e237e0 (#162) Lint go, typescript, and proto3 files
This patch adds lint coverage for proto3 and typescript to keep our code
reasonably clean.  The go linter was already enabled.
2024-05-06 14:17:08 -07:00
Jeff McCune
1279e2351a (#162) Move Platform back to holos.v1alpha1
No need to have a separate package for the PlatformService and related
protobuf messages.
2024-05-06 13:47:37 -07:00
Jeff McCune
adb8177026 Merge pull request #166 from holos-run/jeff/165-deploy-holos
(#165) Deploy Holos to Dev
2024-05-06 11:23:48 -07:00
Jeff McCune
4e8fa5abda (#165) Bump dev deployment to 0.73.1 2024-05-06 11:22:24 -07:00
Jeff McCune
6894f45b6c (#165) Deploy Holos to Dev
This patch deploys holos to the dev environment on the k2 cluster.  It's
accessible at https://app.dev.k2.holos.run/ behind the auth proxy by
default.
2024-05-06 11:10:29 -07:00
Jeff McCune
89d25be837 (#161) Wrap main router outlet in <main></main> 2024-05-06 09:16:15 -07:00
Jeff McCune
5b33e48552 (#161) Reasonably advanced form modeling the reference platform
This form goes a good way toward capturing what we need to configure the
entire reference platform.  Elements and sections are responsive to
which cloud providers are selected, which achieves my goal of modeling a
reasonably advanced form using only JSON data produced by CUE.

To write the form via the API:

    cue export ./forms/platform/ --out json \
      | jq '{platform_id: "'${platformId}'", fields: .spec.fields}' \
      | grpcurl -H "x-oidc-id-token: $(holos token)" -d @ ${host}:443 \
      holos.platform.v1alpha1.PlatformService.PutForm
2024-05-04 20:16:09 -07:00
Jeff McCune
79e8ab639a (#161) Fix the FormGroup & Refactor API
The way we were organizing fields into section broke Formly validation.
This patch fixes the problem by using the recommended approach of
[Nested Forms][1].

This patch also refactors the PlatformService API to clean it up.
GetForm / PutForm are separated from the Platform methods.  Similarly
GetModel / PutModel are separated out and are specific to get and put
the model data.

NOTE: I'm not sure we should have separated out the platform service
into it's own protobuf package.  Seems maybe unnecessary.

❯ grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"platform_id":"018f36fb-e3ff-7f7f-a5d1-7ca2bf499e94"}' jeff.app.dev.k2.holos.run:443 holos.platform.v1alpha1.PlatformService.GetModel
{
  "model": {
    "org": {
      "contactEmail": "platform@openinfrastructure.co",
      "displayName": "Open Infrastructure Services LLC",
      "domain": "ois.run",
      "name": "ois"
    },
    "privacy": {
      "country": "earth",
      "regions": [
        "us-east-2",
        "us-west-2"
      ]
    },
    "terms": {
      "didAgree": true
    }
  }
}

[1]: https://formly.dev/docs/examples/other/nested-formly-forms
2024-05-04 10:14:37 -07:00
Jeff McCune
a0cc673736 (#150) Wire up select and multi select boxes
This patch wires up a Select and a Multi Select box.  This patch also
establishes a decision as it relates to Formly TypeScript / gRPC Proto3
/ CUE definitions of the form data structure.  The decision is to use
gRPC as a transport for any JSON to avoid friction trying to fit Formly
types into Proto3 messages.

Note when using google.protobuf.Value messages with bufbuild/connect-es,
we need to round trip them one last time through JSON to get the
original JSON on the other side.  This is because connect-es preserves
the type discriminators in the case and value fields of the message.

Refer to: [Accessing oneof
groups](https://github.com/bufbuild/protobuf-es/blob/main/docs/runtime_api.md#accessing-oneof-groups)

NOTE: On the wire, carry any JSON as field configs for expedience.  I
attempted to reflect FormlyFieldConfig in protobuf, but it was too time
consuming.  The loosely defined Formly json data API creates significant
friction when joined with a well defined protobuf API.  Therefore, we do
not specify anything about the Forms API, convey any valid JSON, and
leave it up to CUE and Formly on the sending and receiving side of the
API.

We use CUE to define our own holos form elements as a subset of the loose
Formly definitions.  We further hope Formly will move toward a better JSON
data API, but it's unlikely.  Consider replacing Formly entirely and
building on top of the strongly typed Angular Dyanmic Forms API.

Refer to: https://github.com/ngx-formly/ngx-formly/blob/v6.3.0/src/core/src/lib/models/fieldconfig.ts#L15
Consider: https://angular.io/guide/dynamic-form

Usage:

Generate the form from CUE

    cue export ./forms/platform/ --out json | jq -cM | pbcopy

Store the form JSON in the config_values column of the platforms table.

View the form, and submit some data. Then get the data back out for use rendering the platform:

    grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"platform_id":"'${platformId}'"}' $holos holos.v1alpha1.PlatformService.GetConfig

```json
{
  "platform": {
    "spec": {
      "config": {
        "user": {
          "sections": {
            "org": {
              "fields": {
                "contactEmail": "jeff@openinfrastructure.co",
                "displayName": "Open Infrastructure Services LLC",
                "domain": "ois.run",
                "name": "ois"
              }
            },
            "privacy": {
              "fields": {
                "country": "earth",
                "regions": [
                  "us-east-2",
                  "us-west-2"
                ]
              }
            },
            "terms": {
              "fields": {
                "didAgree": true
              }
            }
          }
        }
      }
    }
  }
}
```
2024-05-03 10:42:03 -07:00
Jeff McCune
d06ecfadc8 (#150) Refactor PlatformService.GetConfig for use with CUE
Problem:
The GetConfig response value isn't directly usable with CUE without some
gymnastics.

Solution:
Refactor the protobuf definition and response output to make the user
defined and supplied config values provided by the API directly usable
in the CUE code that defines the platform.

Result:

The top level platform config is directly usable in the
`internal/platforms/bare` directory:

    grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"platform_id":"'${platformID}'"}' $host \
      holos.v1alpha1.PlatformService.GetConfig \
      > platform.holos.json

Vet the user supplied data:

    cue vet ./ -d '#PlatformConfig' platform.holos.json

Build the holos component.  The ConfigMap consumes the user supplied
data:

    cue export --out yaml -t cluster=k2 ./components/configmap platform.holos.json \
      | yq .spec.components

Note the data provided by the input form is embedded into the
ConfigMap managed by Holos:

```yaml
KubernetesObjectsList:
  - metadata:
      name: platform-configmap
    apiObjectMap:
      ConfigMap:
        platform: |
          metadata:
            name: platform
            namespace: default
            labels:
              app.holos.run/managed: "true"
          data:
            platform: |
              kind: Platform
              spec:
                config:
                  user:
                    sections:
                      org:
                        fields:
                          contactEmail: jeff@openinfrastructure.co
                          displayName: Open Infrastructure Services LLC
                          domain: ois.run
                          name: ois
              apiVersion: app.holos.run/v1alpha1
              metadata:
                name: bare
                labels: {}
                annotations: {}
              holos:
                flags:
                  cluster: k2
          kind: ConfigMap
          apiVersion: v1
    Skip: false
```
2024-05-02 06:39:33 -07:00
Jeff McCune
64a117b0c3 (#150) Add PlatformService.GetConfig and refactor ConfigValues proto
Problem:
The use of google.protobuf.Any was making it awkward to work with the
data provided by the user.  The structure of the form data is defined by
the platform engineer, so the intent of Any was to wrap the data in a
way we can pass over the network and persist in the database.

The escaped JSON encoding was problematic and error prone to decode on
the other end.

Solution:
Define the Platform values as a two level map with string keys, but with
protobuf message fields "sections" and "fields" respectively.  Use
google.protobuf.Value from the struct package to encode the actual
value.

Result:
In TypeScript, google.protobuf.Value encodes and decodes easily to a
JSON value.  On the go side, connect correctly handles the value as
well.

No more ugly error prone escaping:

```
❯ grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"platform_id":"'${platformId}'"}' $host holos.v1alpha1.PlatformService.GetConfig
{
  "sections": {
    "org": {
      "fields": {
        "contactEmail": "jeff@openinfrastructure.co",
        "displayName": "Open Infrastructure Services LLC",
        "domain": "ois.run",
        "name": "ois"
      }
    }
  }
}
```

This return value is intended to be directly usable in the CUE code, so
we may further nest the values into a platform.spec key.
2024-05-01 21:30:30 -07:00
Jeff McCune
cf006be9cf (#150) Add SystemService DropTables and SeedDatabase
Makes it easier to reset the database and give Gary and Nate access to
the same organization I'm in so they can provide feedback.
2024-05-01 14:30:13 -07:00
Jeff McCune
45ad3d8e63 (#150) Fix 500 error when config values aren't provided
AddPlatform was failing with a 500 error trying to decode a nil byte
slice when adding a platform without providing any values.
2024-05-01 11:31:25 -07:00
Jeff McCune
441c968c4f (#150) Look up user by iss sub, not email.
Also log when orgs are created.
2024-05-01 10:02:08 -07:00
Jeff McCune
99f2763fdf (#150) Store Platform Config Form and Values as JSON
This patch changes the backend to store the platform config form
definition and the config values supplied by the form as JSON in the
database.

The gRPC API does not change with this patch, but may need to depending
on how this works and how easy it is to evolve the data model and add
features.
2024-05-01 09:11:53 -07:00
Jeff McCune
1312395a11 (#150) Fix platforms page links
The links were hard to click.  This makes the links a much larger click
target following the example at https://material.angular.io/components/list/overview#navigation-lists
2024-05-01 08:51:29 -07:00
Jeff McCune
615f147bcb (#150) Add PutPlatformConfig to store the config values
This patch is a work in progress wiring up the form to put the values to
the holos server using grpc.

In an effort to simplify the platform configuration, the structure is a
two level map with the top level being configuration sections and the
second level being the fields associated with the config section.

To support multiple kinds of values and field controls, the values are
serialized to JSON for rpc over the network and for storage in the
database.  When they values are used, either by the UI or by the `holos
render` command, they're to be unmarshalled and in-lined into the
Platform Config data structure.

Pick back up ensuring the Platform rpc handler correctly encodes and
decodes the structure to the database.

Consider changing the config_form and config_values fields to JSON field
types in the database.  It will likely make working with this a lot
easier.

With this patch we're ready to wire up the holos render command to fetch
the platform configuration and create the end to end demo.

Here's essentially what the render command will fetch and lay down as a
json file for CUE:

```
❯ grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"platform_id":"018f2c4e-ecde-7bcb-8b89-27a99e6cc7a1"}' jeff.app.dev.k2.holos.run:443 holos.v1alpha1.PlatformService.GetPlatform | jq .platform.config.values
{
  "sections": {
    "org": {
      "values": {
        "contactEmail": "\"platform@openinfrastructure.co\"",
        "displayName": "\"Open Infrastructure Services  LLC\"",
        "domain": "\"ois.run\"",
        "name": "\"ois\""
      }
    }
  }
}
```
2024-04-30 20:21:15 -07:00
Jeff McCune
d0ad3bfc69 (#150) Add Platform Detail to edit platform config
This patch adds a /platform/:id route path to a PlatformDetail
component.  The platform detail component calls the GetPlatform method
given the platform ID and renders the platform config form on the detail
tab.

The submit button is not yet wired up.

The API for adding platforms changes, allowing raw json bytes using the
RawConfig.  The raw bytes are not presented on the read path though,
calling GetPlatforms provides the platform and the config form inline in
the response.

Use the `raw_config` field instead of `config` when creating the form
data.

```
❯ grpcurl -H "x-oidc-id-token: $(holos token)" -d @ jeff.app.dev.k2.holos.run:443 holos.v1alpha1.PlatformService.AddPlatform <<EOF
{
  "platform": {
    "org_id": "018f27cd-e5ac-7f98-bfe1-2dbab208a48c",
    "name": "bare2",
    "raw_config": {
      "form": "$(cue export ./forms/platform/ --out json | jq -cM | base64 -w0)"
    }
  }
}
EOF
```
2024-04-30 14:02:49 -07:00
Jeff McCune
fe58a33747 (#150) Add holos.v1alpha1.PlatformService.GetForm
The GetForm method is intended for the Angular frontend to get
[FormlyFieldConfig][1] data for each section of the Platform config.

[1]: https://formly.dev/docs/api/core/#formlyfieldconfig

Steps to exercise for later testing:

Add the form definition to the database:

```
grpcurl -H "x-oidc-id-token: $(holos token)" -d @ jeff.app.dev.k2.holos.run:443 holos.v1alpha1.PlatformService.AddPlatform <<EOF
{
  "platform": {
    "org_id": "018f27cd-e5ac-7f98-bfe1-2dbab208a48c",
    "name": "bare${RANDOM}",
    "config": {
      "form": "$(cue export ./forms/platform/ --out json | jq -cM | base64 -w0)"
    }
  }
}
EOF
```

Get the form definition back out:

```

❯ grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"platform_id":"018f2bc1-6590-7670-958a-9f3bc02b658f"}' jeff.app.dev.k2.holos.run:443 holos.v1alpha1.PlatformService.GetForm
{
  "apiVersion": "forms.holos.run/v1alpha1",
  "kind": "PlatformForm",
  "metadata": {
    "name": "bare"
  },
  "spec": {
    "sections": [
      {
        "name": "org",
        "displayName": "Organization",
        "description": "Organization config values are used to derive more specific configuration values throughout the platform.",
        "fieldConfigs": [
          {
            "key": "name",
            "type": "input",
            "props": {
              "label": "Name",
              "placeholder": "example",
              "description": "DNS label, e.g. 'example'",
              "required": true
            }
          },
          {
            "key": "domain",
            "type": "input",
            "props": {
              "label": "Domain",
              "placeholder": "example.com",
              "description": "DNS domain, e.g. 'example.com'",
              "required": true
            }
          },
          {
            "key": "displayName",
            "type": "input",
            "props": {
              "label": "Display Name",
              "placeholder": "Example Organization",
              "description": "Display name, e.g. 'Example Organization'",
              "required": true
            }
          },
          {
            "key": "contactEmail",
            "type": "input",
            "props": {
              "label": "Contact Email",
              "placeholder": "platform-team@example.com",
              "description": "Technical contact email address",
              "required": true
            }
          }
        ]
      }
    ]
  }
}
```

References

```
❯ cue export ./forms/platform/ --out yaml | yq
apiVersion: forms.holos.run/v1alpha1
kind: PlatformForm
metadata:
  name: bare
spec:
  sections:
    - name: org
      displayName: Organization
      description: Organization config values are used to derive more specific configuration values throughout the platform.
      fieldConfigs:
        - key: name
          type: input
          props:
            label: Name
            placeholder: example
            description: DNS label, e.g. 'example'
            required: true
        - key: domain
          type: input
          props:
            label: Domain
            placeholder: example.com
            description: DNS domain, e.g. 'example.com'
            required: true
        - key: displayName
          type: input
          props:
            label: Display Name
            placeholder: Example Organization
            description: Display name, e.g. 'Example Organization'
            required: true
        - key: contactEmail
          type: input
          props:
            label: Contact Email
            placeholder: platform-team@example.com
            description: Technical contact email address
            required: true
```
2024-04-29 14:24:16 -07:00
Jeff McCune
26e537e768 (#150) Add platform config form, values, cue
This patch adds 4 fields to the Platform table:

 1. Config Form represents the JSON FormlyFieldConfig for the UI.
 2. Config CUE represents the CUE file containing a definition the
    Config Values must unify with.
 3. Config Definition is the CUE definition variable name used to unify
    the values with the cue code.  Should be #PlatformSpec in most
    cases.
 4. Config Values represents the JSON values provided by the UI.

The use case is the platform engineer defines the #PlatformSpec in cue,
and provides the form field config.  The platform engineer then provides
1-3 above when adding or updating a Platform.

The UI then presents the form to the end user and provides values for 4
when the user submits the form.

This patch also refactors the AddPlatform method to accept a Platform
message.  To do so we make the id field optional since it is server
assigned.

The patch also adds a database constraint to ensure platform names are
unique within the scope of an organization.

Results:

Note how the CUE representation of the Platform Form is exported to JSON
then converted to a base64 encoded string, which is the protobuf JSON
representation of a bytes[] value.

```
grpcurl -H "x-oidc-id-token: $(holos token)" -d @ jeff.app.dev.k2.holos.run:443 holos.v1alpha1.PlatformService.AddPlatform <<EOF
{
  "platform": {
    "id": "0d3dc0c0-bbc8-41f8-8c6e-75f0476509d6",
    "org_id": "018f27cd-e5ac-7f98-bfe1-2dbab208a48c",
    "name": "bare",
    "config": {
      "form": "$(cd internal/platforms/bare && cue export ./forms/platform/ --out json | jq -cM | base64 -w0)"
    }
  }
}
EOF
```

Note the requested platform ID is ignored.

```
{
  "platforms": [
    {
      "id": "018f2af9-f7ba-772a-9db6-f985ece8fed1",
      "timestamps": {
        "createdAt": "2024-04-29T17:49:36.058379Z",
        "updatedAt": "2024-04-29T17:49:36.058379Z"
      },
      "name": "bare",
      "creator": {
        "id": "018f27cd-e591-7f98-a9d2-416167282d37"
      },
      "config": {
        "form": "eyJhcGlWZXJzaW9uIjoiZm9ybXMuaG9sb3MucnVuL3YxYWxwaGExIiwia2luZCI6IlBsYXRmb3JtRm9ybSIsIm1ldGFkYXRhIjp7Im5hbWUiOiJiYXJlIn0sInNwZWMiOnsic2VjdGlvbnMiOlt7Im5hbWUiOiJvcmciLCJkaXNwbGF5TmFtZSI6Ik9yZ2FuaXphdGlvbiIsImRlc2NyaXB0aW9uIjoiT3JnYW5pemF0aW9uIGNvbmZpZyB2YWx1ZXMgYXJlIHVzZWQgdG8gZGVyaXZlIG1vcmUgc3BlY2lmaWMgY29uZmlndXJhdGlvbiB2YWx1ZXMgdGhyb3VnaG91dCB0aGUgcGxhdGZvcm0uIiwiZmllbGRDb25maWdzIjpbeyJrZXkiOiJuYW1lIiwidHlwZSI6ImlucHV0IiwicHJvcHMiOnsibGFiZWwiOiJOYW1lIiwicGxhY2Vob2xkZXIiOiJleGFtcGxlIiwiZGVzY3JpcHRpb24iOiJETlMgbGFiZWwsIGUuZy4gJ2V4YW1wbGUnIiwicmVxdWlyZWQiOnRydWV9fSx7ImtleSI6ImRvbWFpbiIsInR5cGUiOiJpbnB1dCIsInByb3BzIjp7ImxhYmVsIjoiRG9tYWluIiwicGxhY2Vob2xkZXIiOiJleGFtcGxlLmNvbSIsImRlc2NyaXB0aW9uIjoiRE5TIGRvbWFpbiwgZS5nLiAnZXhhbXBsZS5jb20nIiwicmVxdWlyZWQiOnRydWV9fSx7ImtleSI6ImRpc3BsYXlOYW1lIiwidHlwZSI6ImlucHV0IiwicHJvcHMiOnsibGFiZWwiOiJEaXNwbGF5IE5hbWUiLCJwbGFjZWhvbGRlciI6IkV4YW1wbGUgT3JnYW5pemF0aW9uIiwiZGVzY3JpcHRpb24iOiJEaXNwbGF5IG5hbWUsIGUuZy4gJ0V4YW1wbGUgT3JnYW5pemF0aW9uJyIsInJlcXVpcmVkIjp0cnVlfX0seyJrZXkiOiJjb250YWN0RW1haWwiLCJ0eXBlIjoiaW5wdXQiLCJwcm9wcyI6eyJsYWJlbCI6IkNvbnRhY3QgRW1haWwiLCJwbGFjZWhvbGRlciI6InBsYXRmb3JtLXRlYW1AZXhhbXBsZS5jb20iLCJkZXNjcmlwdGlvbiI6IlRlY2huaWNhbCBjb250YWN0IGVtYWlsIGFkZHJlc3MiLCJyZXF1aXJlZCI6dHJ1ZX19XX1dfX0K"
      }
    }
  ]
}
```
2024-04-29 10:53:23 -07:00
Jeff McCune
ad70a6c4fe (#150) Add holos.v1alpha1.PlatformService.AddPlatform
This patch adds a basic AddPlatform method that adds a platform with a
name and a display name.

Next steps are to add fields for the Platform Config Form definition and
the Platform Config values submitted from the form.
2024-04-29 09:35:49 -07:00
Jeff McCune
22a04da6bb (#150) Add holos.v1alpha1.PlatformService.GetPlatforms
Next step: AddPlatform

Also consider extracting the queries to get the requested org_id to a
helper function.  This will likely eventually move to an interceptor
because every request is org scoped and needs authorization checks
against the org.

```
grpcurl -H "x-oidc-id-token: $(holos token)" -d '{"org_id":"018f27cd-e5ac-7f98-bfe1-2dbab208a48c"}' jeff.app.dev.k2.holos.run:443 holos.v1alpha1.PlatformService.GetPlatforms
```
2024-04-28 20:21:32 -07:00
Jeff McCune
dc97fe0ff0 (#150) Define a PlatformForm for platform design
Problem:
Platform engineers need the ability to define custom input fields for
their own platform level configuration values.  The holos web UI needs
to present the platform config values in a clean way.  The values
entered on the form need to make their way into the top level
Platform.spec field for use across all components and clusters in the
platform.

Solution:
Define a Platform Form in a forms cue package.  The output of this
definition is intended to be sent to the holos server to provide to the
web UI.

Result:
Platform engineers can define their platform config input values in
their infrastructure repository.  For example, the bare platform form
inputs are defined at `platforms/bare/forms/platform/platform-form.cue`.

This cue file produces [FormlyFieldConfig][1] output.

```console
cue export ./forms/platform/ --out yaml
```

```yaml
apiVersion: forms.holos.run/v1alpha1
kind: PlatformForm
metadata:
  name: bare
spec:
  sections:
    - name: org
      displayName: Organization
      description: Organization config values are used to derive more specific configuration values throughout the platform.
      fieldConfigs:
        - key: name
          type: input
          props:
            label: Name
            placeholder: example
            description: DNS label, e.g. 'example'
            required: true
        - key: domain
          type: input
          props:
            label: Domain
            placeholder: example.com
            description: DNS domain, e.g. 'example.com'
            required: true
        - key: displayName
          type: input
          props:
            label: Display Name
            placeholder: Example Organization
            description: Display name, e.g. 'Example Organization'
            required: true
        - key: contactEmail
          type: input
          props:
            label: Contact Email
            placeholder: platform-team@example.com
            description: Technical contact email address
            required: true
```

Next Steps:
Add a holos subcommand to produce the output and store it in the
backend.  Wire the front end to fetch the form config from the backend.

[1]: https://formly.dev/docs/api/core#formlyfieldconfig
2024-04-28 11:25:06 -07:00
Jeff McCune
9ca97c6e01 Merge pull request #148 from holos-run/jeff/147-cue-oom
(#147) Add holos render --print-instances flag
2024-04-26 16:31:14 -07:00
Jeff McCune
924653e240 (#150) Bare Platform
This patch adds a bare platform that does nothing but render a configmap
containing the platform config structure itself.

The definition of the platform structure is firming up.  The platform
designer, which may be a holos customer, is responsible for defining the
structure of the `platform.spec` output field.

Us holos developers have a reserved namespace to add configuration
fields and data in the `platform.holos` output file.

Beyond these two fields, the platform config structure has TypeMeta and
ObjectMeta fields similar to a kubernetes api object to support
versioning the platform config data, naming the platform, annotating the
platform, and labeling the platform.

The path forward from here is to:

 1. Eventually move the stable definitions into a CUE module that gets
    imported into the user's package.
 2. As a platform designer, add the organization field to the
    #PlatformSpec definition as a CUE definition.
 3. As a platform designer, add the organization field Form data
    structure as a JSON file.
 4. Add an API to upload the #PlatformSpec cue file and the
    #PlatformSpec form json file to the saas backend.
 5. Wire up Angular to pull the form json from the API and render the
    form.
 6. Wire up Angular to write the form data to a gRPC service method.
 7. Wire up the `holos cli` to read the form data from a gRPC service
    method.
 8. Tie it all together where the holos cli renders the configmap.
2024-04-26 16:14:30 -07:00
Jeff McCune
59d48f8599 (#146) Platform Config Mock Up
This patch adds a mock up of the platform config.  The goal is to use
this to connect to an anemic example platform built from `holos init`.
2024-04-26 11:29:58 -07:00
Jeff McCune
90f8eab816 (#144) Tidy go.mod and package.json 2024-04-25 19:14:20 -07:00
Jeff McCune
9ae45e260d (#147) Add holos render --print-instances flag
To enumerate all of the instances that could be run in separate
processes with xargs instead of run in the for loop in the Builder Run
method.
2024-04-25 13:59:10 -07:00
Jeff McCune
aee15f95e2 Merge pull request #145 from holos-run/jeff/144-organization-selector
(#144) Profile Button
2024-04-25 09:55:55 -07:00
Jeff McCune
1c540ac375 (#144) Profile Button and Organization Selector
This patch adds an organization "selector" that's really just a place
holder.  The active organization is the last element in the list
returned by the GetCallerOrganizations method for now.

The purpose is to make sure we have the structure in place for more than
one organizations without needing to implement full support for the
feature at this early stage.

The Angular frontend is expected to call the activeOrg() method of the
OrganizationService.  In the future this could store the state of which
organization the user has selected.  The purpose is to return an org id
to send as a request parameter for other requests.

Note this patch also implements refresh behavior.  The list of orgs is
fetched once on application load.  If there is no user, or the user has
zero orgs, the user is created and an organization is added with them as
an owner.  This is accompished using observable pipes.

The pipe is tied to a refresh behavior.  Clicking the org button
triggers the refresh behavior, which executes the pipe again and
notifies all subscribers.

This works quite well and should be idiomatic angular / rxjs.  Clicking
the button automatically updates the UI after making the necessary API
calls.
2024-04-25 09:55:13 -07:00
Jeff McCune
5b0e883ac9 (#144) Get or Create the orgranization
This patch adds the OrganizationService to the Angular front end and
displays a simple list of the organizations the user is a member of in
the profile card.

There isn't a service yet to return the currently selected
organization, but that could be a simple method to return the most
recent entry in the list until we put something more complicated in
place like local storage of what the user has selected.

It may make sense to put a database constraint on the number of
organizations until we implement the feature later, it's too early to do
so now, I just want to make sure it's possible to add later.
2024-04-25 07:02:17 -07:00
Jeff McCune
9a2519af71 (#144) Make the linter happy 2024-04-24 13:41:45 -07:00
Jeff McCune
9b9ff601c0 (#144) Call GetCallerClaims once instead of multiple times
Problem:
When loading the page the GetCallerClaims rpc method is called multiple
times unnecessarily.

Solution:
Use [shareReplay][1] to replay the last observable event for all
subscribers, including subscribers coming late to the party.

Result:
Network inspector in chrome indicates GetCallerClaims is called once and
only once.

[1]: https://rxjs.dev/api/operators/shareReplay
2024-04-24 12:44:44 -07:00
Jeff McCune
2f798296dc (#144) Profile Button
This patch adds a ProfileButton component which makes a ConnectRPC gRPC
call to the `holos.v1alpha1.UserService.GetCallerClaims` method and
renders the profile button based on the claims.

Note, in the network inspector there are two API calls to
`holos.v1alpha1.UserService.GetCallerClaims` which is unfortunate.  A
follow up patch might be good to fix this.
2024-04-24 12:23:54 -07:00
Jeff McCune
2b2ff63cad (#144) Connect /ui to ng serve for hot reload
Problem:
It's slow to build the angular app, compile it into the go executable,
copy it to the pod, then restart the server.

Solution:
Configure the mesh to route /ui to `ng serve` running on my local
host.

Result:
Navigating to https://jeff.app.dev.k2.holos.run/ui gets responses from
the ng development server.

Use:

    ng serve --host 0.0.0.0
2024-04-23 20:30:02 -07:00
Jeff McCune
3b135c09f3 (#144) Make a ConnectRPC call to the GetUserClaims method
This patch wires up an Angular RxJS Observable to the result of a gRPC
call to the `holos.v1alpha1.UserService.GetCallerClaims` method.

The implementation is a combination of [this connect example][1] and the
official [angular data][2] guide.

[1]: https://github.com/connectrpc/examples-es/tree/main/angular
[2]: https://angular.io/start/start-data#configuring-the-shippingcomponent-to-use-cartservice
2024-04-23 17:18:35 -07:00
Jeff McCune
28813eba5b (#126) v0.69.0 2024-04-23 10:24:58 -07:00
Jeff McCune
02ff765f54 Merge pull request #143 from holos-run/jeff/126-registration
(#126) Minimal API to register users and organizations
2024-04-23 09:00:07 -07:00
Jeff McCune
fe8a806132 (#126) Refactor to GetCallerX / CreateCallerX
This patch simplifies the user and organization registration and query
for the UI.  The pattern clients are expected to follow is to create if
the get fails.  For example, the following pseudo-go-code is the
expected calling convention:

    var entity *ent.User
    entity, err := Get()
    if err != nil {
      if ent.MaskNotFound(err) == nil {
        entity = Create()
      } else {
        return err
      }
    }
    return entity

This patch adds the following service methods.  For initial
registration, all input data comes from the id token claims of the
authenticated user.

```
❯ grpcurl -H "x-oidc-id-token: $(holos token)" jeff.app.dev.k2.holos.run:443 list | xargs -n1 grpcurl -H "x-oidc-id-token: $(holos token)" jeff.app.dev.k2.holos.run:443 list
holos.v1alpha1.OrganizationService.CreateCallerOrganization
holos.v1alpha1.OrganizationService.GetCallerOrganizations
holos.v1alpha1.UserService.CreateCallerUser
holos.v1alpha1.UserService.GetCallerClaims
holos.v1alpha1.UserService.GetCallerUser
```
2024-04-23 08:43:23 -07:00
Jeff McCune
6626d58301 (#126) Add OrganizationService
Next step after this is to simplify the calling convention to a get
followed by a create if the get fails.
2024-04-23 05:28:07 -07:00
Jeff McCune
cb0911e890 (#126) Add holos.v1alpha1.UserService.GetUser
❯ grpcurl -H "x-oidc-id-token: $(holos token)" jeff.app.dev.k2.holos.run:443 holos.v1alpha1.UserService.GetUser
{
  "user": {
    "id": "018f07f4-4f9c-7b69-9d9e-07bf7bb4fe33",
    "email": "jeff@openinfrastructure.co",
    "name": "Jeff McCune",
    "timestamps": {
      "createdAt": "2024-04-22T22:36:42.780492Z",
      "updatedAt": "2024-04-22T22:36:42.780492Z"
    }
  }
}
2024-04-22 16:49:25 -07:00
Jeff McCune
3745a68dc5 (#126) Add unique index on user iss sub fields
The server will frequently look up the user record given the iss and sub
claims from the id token, index them and make sure the combination of
the two is unique.
2024-04-22 16:14:40 -07:00
Jeff McCune
fd64830476 (#126) Rename HolosService to UserService
Organize services by the resource they manage.
2024-04-22 16:05:32 -07:00
Jeff McCune
1ee0fa9c1f (#126) User Registration via API
With this patch user registration works with grpcurl.  Nothing in the
web UI yet.

```
grpcurl -H "x-oidc-id-token: $(holos token)" jeff.app.dev.k2.holos.run:443 holos.v1alpha1.HolosService.RegisterUser
```

Cannot register twice:

```
❯ grpcurl -H "x-oidc-id-token: $(holos token)" jeff.app.dev.k2.holos.run:443 holos.v1alpha1.HolosService.RegisterUser
ERROR:
  Code: FailedPrecondition
  Message: user.go:26: ent: constraint failed: ERROR: duplicate key value violates unique constraint "users_email_key" (SQLSTATE 23505)
```

GetUserClaims works though:

```
❯ grpcurl -H "x-oidc-id-token: $(holos token)" jeff.app.dev.k2.holos.run:443 holos.v1alpha1.HolosService.GetUserClaims
{
  "iss": "https://login.ois.run",
  "sub": "261773693724656988",
  "email": "jeff@openinfrastructure.co",
  "emailVerified": true,
  "name": "Jeff McCune"
}
```
2024-04-22 15:38:07 -07:00
Jeff McCune
8fab325b0a (#126) Add gRPC reflection
So grpcurl works as expected:

```
❯ grpcurl -H "x-oidc-id-token: $(holos token)" jeff.app.dev.k2.holos.run:443 list
holos.v1alpha1.HolosService
```
2024-04-22 15:02:08 -07:00
Jeff McCune
858ffad913 (#126) Add holos token command for grpcurl 2024-04-22 14:54:09 -07:00
Jeff McCune
62735b99e7 (#126) Update Tiltfile to use holos.run for dev
This patch updates the Tiltfile to use the holos.run domain which is
integrated with the default Gateway.
2024-04-22 13:42:18 -07:00
Jeff McCune
29ab9c6300 (#141) Install provisioner helper.rb from entrypoint
And add a script to reset the choria provisioner credentials and config.
2024-04-22 13:20:38 -07:00
Jeff McCune
debc01c7de (#141) Fix Incorrect Provisioning Token foo given
The `make-provisioner-jwt` incorrectly used the choria broker password
as the provisioning token.  In the reference [setup.sh][1] both the
token and the `broker_provisioning_password` are set to `s3cret` so I
confused the two, but they are actually different values.

This patch ensures the provisioning token configured in
`provisioner.yaml` matches the token embedded into the provisioning.jwt
file using `choria jwt provisioning` via the `make-provisioner-jwt`
script.

[1]: 6dbc8fd105/example/setup/templates/provisioner/provisioner.yaml (L6)
2024-04-22 12:31:10 -07:00
Jeff McCune
c07f35ecd6 (#141) Fix holos controller invalid websocket connection error
Problem:
When the ingress default Gateway AuthorizationPolicy/authpolicy-custom
rule is in place the choria machine room holos controller fails to
connect to the provisioner broker with the following error:

```
❯ holos controller run --config=agent.cfg
WARN[0000] Starting controller version 0.68.1 with config file /home/jeff/workspace/holos-run/holos/hack/choria/agent/agent.cfg  leader=false
WARN[0000] Switching to provisioning configuration due to build defaults and missing /home/jeff/workspace/holos-run/holos/hack/choria/agent/agent.cfg
WARN[0000] Setting anonymous TLS mode during provisioning  component=server connection=coffee.home identity=coffee.home
WARN[0000] Initial connection to the Broker failed on try 1: invalid websocket connection  component=server connection=coffee.home identity=coffee.home
WARN[0000] Initial connection to the Broker failed on try 2: invalid websocket connection  component=server connection=coffee.home identity=coffee.home
WARN[0002] Initial connection to the Broker failed on try 3: invalid websocket connection  component=server connection=coffee.home identity=coffee.home
```

This problem is caused because the provisioning token url is set to
`wss://jeff.provision.dev.k2.holos.run:443` which has the port number
specified.

Solution:
Follow the upstream istio guidance of [Writing Host Match Policies][1]
to match host headers with or without the port specified.

Result:
The controller is able to connect to the provisioner broker:

[1]: https://istio.io/latest/docs/ops/best-practices/security/#writing-host-match-policies
2024-04-22 12:31:10 -07:00
Jeff McCune
c8f528700c (#141) Fix error: do not know how to handle choria_provisioning purpose token
Solution:
remove the plugin.security.choria.ca setting
2024-04-22 12:30:16 -07:00
Jeff McCune
896248c237 (#141) Try and connect holos controller to the provisioner
Running into error:

time="2024-04-20T03:23:19Z" level=warning msg="Denying connection: verified error: do not know how to handle choria_provisioning purpose token, unverified error: <nil>" component=authentication remote="10.244.1.51:56338" stage=check
time="2024-04-20T03:23:19Z" level=error msg="192.168.2.21/10.244.1.51:56338 - wid:367 - authentication error" component=network_broker
2024-04-22 12:29:56 -07:00
Jeff McCune
74a181db21 (#133) Add missing Choria Provisioner deployment 2024-04-19 15:14:31 -07:00
Jeff McCune
ba10113342 (#133) Fix tls error when connecting to provisioner websocket
This problem fixes an error where the istio ingress gateway proxy failed
to verify the TLS certificate presented by the choria broker upstream
server.

    kubectl logs choria-broker-0

    level=error msg="websocket: TLS handshake error from 10.244.1.190:36142: remote error: tls: unknown certificate\n"

Istio ingress logs:

    kubectl -n istio-ingress logs -l app=istio-ingressgateway -f | grep --line-buffered '^{' | jq .

    "upstream_transport_failure_reason": "TLS_error:|268435581:SSL_routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED:TLS_error_end:TLS_error_end"

Client curl output:

    curl https://jeff.provision.dev.k2.holos.run

    upstream connect error or disconnect/reset before headers. retried and the latest reset reason: remote connection failure, transport failure reason: TLS_error:|268435581:SSL routines:OPENSSL_i
nternal:CERTIFICATE_VERIFY_FAILED:TLS_error_end:TLS_error_end

Explanation of error:

Istio defaults to expecting a tls certificate matching the downstream
host/authority which isn't how we've configured Choria.

Refer to [ClientTLSSettings][1]

> A list of alternate names to verify the subject identity in the
> certificate. If specified, the proxy will verify that the server
> certificate’s subject alt name matches one of the specified values. If
> specified, this list overrides the value of subject_alt_names from the
> ServiceEntry. If unspecified, automatic validation of upstream presented
> certificate for new upstream connections will be done based on the
> downstream HTTP host/authority header, provided
> VERIFY_CERTIFICATE_AT_CLIENT and ENABLE_AUTO_SNI environmental variables
> are set to true.

[1]: https://istio.io/latest/docs/reference/config/networking/destination-rule/#ClientTLSSettings
2024-04-19 13:13:09 -07:00
Jeff McCune
eb0207c92e (#133) Choria Provisioner
This patch is a work in progress to configure the provisioner to connect
to the broker.  Services and deployments are prefixed with choria for
clarity.
2024-04-19 13:13:08 -07:00
Jeff McCune
0fbcee8119 (#133) Extend the life of the Platform Issuer CA
The platform issuer root CA was set to expire after 90 days, the default
value.  This is too short.

Extend the life of the root CA beyond 100 years.
2024-04-19 11:29:17 -07:00
Jeff McCune
ce8bc798f6 (#133) Exclude nats and provision hosts from the auth proxy
Problem:
The identity aware auth proxy attached to the default gateway is
blocking access to NATS and the Choria Provisioner cluster.

Solution:
Add configuration that causes the project hosts to get added to the
exclusion list of the AuthorizationPolicy/authproxy-custom rule.

Result:
Requests bypass the auth proxy and go straight to the backend.  The
rules look like:

    kubectl get authorizationpolicy authproxy-custom -o yaml

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: authproxy-custom
  namespace: istio-ingress
  labels:
    app.kubernetes.io/name: authproxy-custom
    app.kubernetes.io/part-of: istio-ingressgateway
spec:
  action: CUSTOM
  provider:
    name: ingressauth
  rules:
  - to:
    - operation:
        notHosts:
        - login.ois.run
        - vault.core.ois.run
        - provision.holos.run
        - nats.holos.run
        - provision.dev.holos.run
        - nats.dev.holos.run
        - jeff.provision.dev.holos.run
        - jeff.nats.dev.holos.run
        - gary.provision.dev.holos.run
        - gary.nats.dev.holos.run
        - nate.provision.dev.holos.run
        - nate.nats.dev.holos.run
        - provision.k2.holos.run
        - nats.k2.holos.run
        - provision.dev.k2.holos.run
        - nats.dev.k2.holos.run
        - jeff.provision.dev.k2.holos.run
        - jeff.nats.dev.k2.holos.run
        - gary.provision.dev.k2.holos.run
        - gary.nats.dev.k2.holos.run
        - nate.provision.dev.k2.holos.run
        - nate.nats.dev.k2.holos.run
    when:
    - key: request.headers[x-oidc-id-token]
      notValues:
      - '*'
  selector:
    matchLabels:
      istio: ingressgateway
```
2024-04-19 06:32:11 -07:00
Jeff McCune
996195d651 (#137) Fix ArgoCD PKCE login small comment 2024-04-18 14:39:13 -07:00
Jeff McCune
f00b29d3a3 (#137) Fix ArgoCD PKCE login
This patch configures ArgoCD to log in via PKCE.

Note the changes are primarily in platform.site.cue and ensuring the
emailDomain is set properly.  Note too the redirect URL needs to be
`/pkce/verify` when PKCE is enabled.  Finally, if the setting is
reconfigured make sure to clear cookies otherwise the incorrect
`/auth/callback` path may be used.
2024-04-18 14:37:06 -07:00
Jeff McCune
a6756ecf11 (#132) Update go deps to fix Machine Room 2024-04-18 13:35:58 -07:00
Jeff McCune
ef7ec30037 (#132) Use machine-room main branch
Our patch to use Args got merged.
2024-04-18 11:38:51 -07:00
Jeff McCune
1642787825 (#101) Fix port names in servers must be unique: duplicate name https
Problem:
Port names in the default Gateway.spec.servers.port field must be unique
across all servers associated with the workload.

Solution:
Append the fully qualified domain name with dots replaced with hyphens.

Result:
Port name is unique.
2024-04-18 11:21:05 -07:00
Jeff McCune
f83781480f (#101) Do not add Gateway.spec.servers for other clusters
Problem:
The default gateway in one cluster gets server entries for all hosts in
the problem.  This makes the list unnecessarily large with entries for
clusters that should not be handled on the current cluster.

For example, the k2 cluster has gateway entries to route hosts for k1,
k3, k4, k5, etc...

Solution:
Add a field to the CertInfo definition representing which clusters the
host is valid on.

Result:
Hosts which are valid on all clusters, e.g. login.ois.run, have all
project clusters added to the clusters field of the CertInfo.  Hosts
which are valid on a single cluster have the coresponding single entry
added.

When building resources, holos components should check if `#ClusterName`
is a valid field of the CertInfo.clusters field.  If so, the host is
valid for the current cluster.  If not, the host should be omitted from
the current cluster.
2024-04-18 11:10:16 -07:00
Jeff McCune
9b70205855 (#101) Use letsencrypt production instead of staging
Certificates issue OK from staging, switching to production.
2024-04-18 10:36:28 -07:00
Jeff McCune
0e4bf3c144 (#101) Manage certs on all clusters
We're going to do a big re-issue so might as well do it once so we don't
have to re-issue again to add more clusters to existing projects.
2024-04-18 10:16:55 -07:00
Jeff McCune
1241c74b41 (#101) Do not add the project name as a project host
Doing so forces unnecessary hosts for some projects.  For example,
iam.ois.run is useless for the iam project, the primary project host is
login to build login.ois.run.

Some projects may not need any hosts as well.

Better to let the user specify `project: foo: hosts: foo: _` if they
want it.
2024-04-18 09:59:21 -07:00
Jeff McCune
44fea098de (#101) Manage an ExternalSecret for every Server in the default Gateway
This patch loops over every Gateway.spec.servers entry in the default
gateway and manages an ExternalSecret to sync the credential from the
provisioner cluster.
2024-04-18 09:53:39 -07:00
Jeff McCune
52286efa25 (#101) Fix duplicate certs in holos components
Problem:
A Holos Component is created for each project stage, but all hosts for
all stages in the project are added.  This creates duplicates.

Solution:
Sort project hosts by their stage and map the holos component for a
stage to the hosts for that stage.

Result:
Duplicates are eliminated, the prod certs are not in the dev holos
component and vice-versa.
2024-04-18 09:17:49 -07:00
Jeff McCune
a1b2179442 (#101) Remove holos-saas-certs holos component
No longer needed now that project host certs are using wildcards and
organized nicely.
2024-04-18 06:32:06 -07:00
Jeff McCune
cffc430738 (#101) Provision wildcard certs for all Gateway servers
This patch provisions wildcard certs in the provisioning cluster.  The
CN matches the project stage host global hostname without any cluster
qualifiers.

The use of a wildcard in place of the environment name dns segment at
the leftmost position of the fully qualified dns name enables additional
environments to be configured without reissuing certificates.

This is to avoid the 100 name per cert limit in LetsEncrypt.
2024-04-18 06:26:29 -07:00
Jeff McCune
d76454272b (#101) Simplify the GatewayServers struct
Mapping each project host fqdn to the stage is unnecessary.  The list of
gateway servers is constructed from each FQDN in the project.

This patch removes the unnecessary struct mappings.
2024-04-18 05:32:19 -07:00
Jeff McCune
9d1e77c00f (#101) Define #ProjectHosts to manage project hosts
Problem:
It's difficult to map and reduce the collection of project hosts when
configuring related Certificate, Gateway.spec.servers, VirtualService,
and auth proxy cookie domain settings.

Solution:
Define #ProjectHosts which takes a project and provides Hosts which is a
struct with a fqdn key and a #CertInfo value.  The #CertInfo definition
is intended to provide everything need to reduce the Hosts property to
structs usful for the problematic resources mentioned previously.

Result:
Gateway.spec.servers are mapped using #ProjectHosts

Next step is to map the Certificate resources on the provisioner
cluster.
2024-04-17 21:59:04 -07:00
Jeff McCune
2050abdc6c (#101) Add wildcard support to project certs
Problem:
Adding environments to a project causes certs to be re-issued.

Solution:
Enable wildcard certs for per-environment namespaces like jeff, gary,
nate, etc...

Result:
Environments can be added to a project stage without needing the cert to
be re-issued.
2024-04-17 12:32:44 -07:00
Jeff McCune
3ea013c503 (#101) Consolidate certificates by project stage
This patch avoids LetsEncrypt rate limits by consolidating multiple dns
names into one certificate.

For each project host, create a certificate for each stage in the
project.  The certificate contains the dns names for all clusters and
environments associated with that stage and host.

This can become quite a list, the limit is 100 dnsNames.

For the Holos project which has 7 clusters and 4 dev environments, the
number of dns names is 32 (4 envs + 4 envs * 7 clusters = 32 dns names).

Still, a much needed improvement because we're limited to 50 certs per
week.

It may be worth considering wildcards for the per-developer
environments, which are the ones we'll likely spin up the most
frequently.
2024-04-17 11:58:46 -07:00
Jeff McCune
309db96138 (#133) Choria Broker for Holos Controller provisioning
This patch is a partial step toward getting the choria broker up
and running in my own namespace.  The choria broker is necessary for
provisioning machine room agents such as the holos controller.
2024-04-17 08:48:31 -07:00
Jeff McCune
283b4be71c (#132) Use forked version of machine-room
Until https://github.com/choria-io/machine-room/pull/12 gets merged
2024-04-16 19:46:36 -07:00
Jeff McCune
ab9bca0750 (#132) Controller Subcommand
This patch adds an initial holos controller subcommand.  The machine
room agent starts, but doesn't yet provision because we haven't deployed
the provisioning infrastructure yet.
2024-04-16 15:40:25 -07:00
Jeff McCune
ac2be67c3c (#130) NATS deployment with operator jwt
Configure NATS in a 3 Node deployment with resolver authentication using
an Operator JWT.

The operator secret nkeys are stored in the provisioner cluster.  Get
them with:

    holos get secret -n jeff-holos nats-nsc --print-key nsc.tgz | tar -tvzf-
2024-04-15 17:02:18 -07:00
Jeff McCune
6ffafb8cca (#127) Setup Routing using Dashboard Schematic
This patch sets up basic routing and a 404 not found page.  The Home and
Clusters page are generated from the [dashboard schematic][1]

    ng generate @angular/material:dashboard home
    ng generate @angular/material:dashboard cluster-list
    ng g c error-not-found

[1]: https://material.angular.io/guide/schematics#dashboard-schematic
2024-04-15 13:48:00 -07:00
Jeff McCune
590e6b556c (#127) Generate Angular Material navigation
Instead of trying to hand-craft a navigation sidebar and toolbar from
Youtube videos, use the [navigation schematic][1] to quickly get a "good
enough" UI.

    ng generate @angular/material:navigation nav

[1]: https://material.angular.io/guide/schematics#navigation-schematic
2024-04-15 10:43:24 -07:00
Jeff McCune
5dc5c6fbdf (#127) ng add @angular/material
And start working on the sidenav and toolbar.
2024-04-14 07:03:45 -07:00
Jeff McCune
cd8c9f2c32 (#127) ConnectRPC generated code 2024-04-13 11:03:19 -07:00
Jeff McCune
3490941d4c (#127) Frontend deps from make tools
Needed to generate the connectrpc bindings and build the holos
executable.
2024-04-12 20:09:41 -07:00
Jeff McCune
3f201df0c2 (#126) Configure Angular to align with frontend.go
Angular must build output into a path compatible with the Go
http.FileServer.  We cannot easily graft an fs.FS onto a sub-path, so we
need the `./ui/` path in the output.  This requires special
configuration from the Angular default application builder behavior.
2024-04-12 20:08:37 -07:00
Jeff McCune
4c22d515bd (#127) ng new holos
ng new holos --routing --skip-git --standalone
SCSS
No SSR
2024-04-12 20:07:17 -07:00
Jeff McCune
ec0ef1c4b3 (#127) Angular - Restart again
Restart again this time with SCSS instead of CSS.
2024-04-12 20:03:45 -07:00
Jeff McCune
1e51e2d49a (#127) Angular Navigation schematic
Following [Navigation schematic][1].

    ng generate @angular/material:navigation navigation

[1]: https://material.angular.io/guide/schematics#navigation-schematic
2024-04-12 19:45:26 -07:00
Jeff McCune
5186499b90 Revert "(#127) Angular - ng add ng-matero"
This reverts commit fc275e4164.

Yuck, don't like it.
2024-04-12 17:21:26 -07:00
Jeff McCune
fc275e4164 (#127) Angular - ng add ng-matero
Trying [ng-matero][1].  Seems to exceed the max prod budget of 1mb, but
worth trying anyway.

[1]: https://github.com/ng-matero/ng-matero
2024-04-12 17:18:33 -07:00
Jeff McCune
9fa466f7cf (#126) Build the front end app when building holos
Always build the front end app bundle when rebuilding the holos cli so
we're sure things are up to date.
2024-04-12 17:04:41 -07:00
Jeff McCune
efd6f256a5 (#126) Connect generated bindings for the frontend 2024-04-12 16:57:30 -07:00
Jeff McCune
f7f9d6b5f0 (#126) Angular Material - ng add @angular/material 2024-04-12 16:57:15 -07:00
Jeff McCune
0526062ab2 (#126) Configure Angular to align with frontend.go
Angular must build output into a path compatible with the Go
http.FileServer.  We cannot easily graft an fs.FS onto a sub-path, so we
need the `./ui/` path in the output.  This requires special
configuration from the Angular default application builder behavior.
2024-04-12 16:57:15 -07:00
Jeff McCune
a1ededa722 (#126) http.FileServer serves /ui instead of /app
This fixes Angular not being served up correctly.

Note, special configuration in Angular is necessary to get the build
output into the ui/ directory.  Refer to: [Output path configuration][1]
and [browser directory created in outputPath][2].

[1]: https://angular.io/guide/workspace-config#output-path-configuration
[2]: https://github.com/angular/angular-cli/issues/26304
2024-04-12 16:51:45 -07:00
Jeff McCune
9b09a02912 (#115) Angular new project with defaults
Setup angular with the defaults.  CSS, No SSR / Static Site Generation.

    npm install -g @angular/cli
    ng new holos

```
? Which stylesheet format would you like to use? CSS             [ https://developer.mozilla.org/docs/Web/CSS                     ]
? Do you want to enable Server-Side Rendering (SSR) and Static Site Generation (SSG/Prerendering)? No
```

```
CREATE holos/README.md (1059 bytes)
CREATE holos/.editorconfig (274 bytes)
CREATE holos/.gitignore (548 bytes)
CREATE holos/angular.json (2587 bytes)
CREATE holos/package.json (1036 bytes)
CREATE holos/tsconfig.json (857 bytes)
CREATE holos/tsconfig.app.json (263 bytes)
CREATE holos/tsconfig.spec.json (273 bytes)
CREATE holos/.vscode/extensions.json (130 bytes)
CREATE holos/.vscode/launch.json (470 bytes)
CREATE holos/.vscode/tasks.json (938 bytes)
CREATE holos/src/main.ts (250 bytes)
CREATE holos/src/favicon.ico (15086 bytes)
CREATE holos/src/index.html (291 bytes)
CREATE holos/src/styles.css (80 bytes)
CREATE holos/src/app/app.component.css (0 bytes)
CREATE holos/src/app/app.component.html (19903 bytes)
CREATE holos/src/app/app.component.spec.ts (913 bytes)
CREATE holos/src/app/app.component.ts (301 bytes)
CREATE holos/src/app/app.config.ts (227 bytes)
CREATE holos/src/app/app.routes.ts (77 bytes)
CREATE holos/src/assets/.gitkeep (0 bytes)
✔ Packages installed successfully.
```
2024-04-12 15:07:38 -07:00
Jeff McCune
657a5e82a5 (#115) Remove Angular SSR
We don't want Angular Server Side Rendering, we want plain old client
side angular.
2024-04-12 14:57:39 -07:00
Jeff McCune
1eece02254 (#126) Angular Material UI
ng add @angular/material

```
❯ ng add @angular/material
Skipping installation: Package already installed
? Choose a prebuilt theme name, or "custom" for a custom theme: Indigo/Pink        [ Preview: https://material.angular.io?theme=indigo-pink ]
? Set up global Angular Material typography styles? Yes
? Include the Angular animations module? Include and enable animations Yes
```
2024-04-12 14:16:45 -07:00
Jeff McCune
c866b47dcb (#126) Check for errors decoding claims
Return an empty claims struct when there's an error.
2024-04-12 14:16:44 -07:00
Jeff McCune
ff52ec750b (#126) Try to fix golangci-lint
It's doing way too much, might want to consider something else.

Getting these errors:

```
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.dockerignore: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.envrc: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.gitattributes: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/CODEOWNERS: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/buf-logo.svg: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/dependabot.yml: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/workflows/add-to-project.yaml: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/workflows/back-to-development.yaml: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/workflows/buf-binary-size.yaml: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/workflows/buf-shadow-sync.yaml: Cannot open: File exists
/usr/bin/tar: ../../../go/pkg/mod/github.com/bufbuild/buf@v1.30.1/.github/workflows/buf.yaml: Cannot open: File exists
```
2024-04-12 14:01:16 -07:00
Jeff McCune
4184619afc (#126) Refactor pkg to internal
pkg folder is not needed.  Move everything internal for now.
2024-04-12 13:56:16 -07:00
Jeff McCune
954dbd1ec8 (#126) Refactor id token acquisition to token package
And add a logout command that deletes the token cache.

The token package is intended for subcommands that need to make API
calls to the holos api server, getting a token should be a simple matter
of calling the token.Get() method, which takes minimal dependencies.
2024-04-12 13:15:03 -07:00
Jeff McCune
30b70e76aa (#126) Add login command
This copies the login command from the previous holos cli.  Wire
dependency injection and all the rest of the unnecessary stuff from
kubelogin are removed, streamlined down into a single function that
takes a few oidc related parameters.

This will need to be extracted out into an infrastructure service so
multiple other command line tools can easily re-use it and get the ID
token into the x-oidc-id-token header.
2024-04-12 12:13:33 -07:00
Jeff McCune
ec6d112711 (#126) Remove hydra and kratos databases
No longer needed for dev.
2024-04-12 10:24:26 -07:00
Jeff McCune
e796c6a763 (#126) Default to DATABASE_URL env var 2024-04-12 10:20:13 -07:00
Jeff McCune
be32201294 (#126) Basic User and Organization Ent models
Get rid of the previous UserIdentity model, this is no longer part of
the core domain and instead handled within the context of ZITADEL.
2024-04-12 09:59:40 -07:00
Jeff McCune
5ebc54b5b7 (#124) Go Tools 2024-04-12 09:14:13 -07:00
Jeff McCune
2954a57872 (#120) Fix NATS target namespace
The upstream nats charts don't specify namespaces for each attribute.
This works with helm update, but not helm template which holos uses to
render the yaml.

The missing namespace causes flux to fail.

This patch uses the flux kustomization to add the target namespace to
all resources.
2024-04-10 21:54:58 -07:00
Jeff McCune
df705bd79f (#121) Fix Multiple Charts cause holos render to fail
When rendering a holos component which contains more than one helm chart, rendering fails.  It should succeed.

```
holos render --cluster-name=k2 /home/jeff/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/holos/... --log-level debug
```

```
9:03PM ERR could not execute version=0.64.2 err="could not rename: rename /home/jeff/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/holos/nats/envs/vendor553679311 /home/jeff/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/holos/nats/envs/vendor: file exists" loc=helm.go:145
```

This patch fixes the problem by moving each child item of the temporary
directory charts are installed into.  This avoids the problem of moving
the parent when the parent target already exists.
2024-04-10 21:27:39 -07:00
Jeff McCune
4e8ce3585d (#115) Minor clean up of cue code 2024-04-10 21:21:16 -07:00
Jeff McCune
ab5f17c3d2 (#115) Fix goreleaser
Import modules to take the direct dependency and prevent go mod tidy
from modifying go.mod and go.sum which causes goreleaser to fail.
2024-04-10 19:09:30 -07:00
Jeff McCune
a8918c74d4 (#115) Angular spike - fix make frontend
And install frontend deps.
2024-04-09 21:03:26 -07:00
Jeff McCune
ae5738d82d (#115) Angular with SSR
Executed:

    ng new
    ng add @angular/ssr

Name: holos
Style: CSS
SSR and SSG?: No

ssr added using ng add following https://angular.io/guide/prerendering
2024-04-09 20:52:42 -07:00
Jeff McCune
bb99aedffa (#115) Remove frontend
Clean up for ng new in angular spike.
2024-04-09 20:35:43 -07:00
Jeff McCune
d6ee1864c8 (#116) Tilt for development
Add Tilt back from holos server

Note with this patch the ec-creds.yaml file needs to be applied to the
provisioner and an external secret used to sync the image pull creds.

With this patch the dev instance is accessible behind the auth proxy.
pgAdmin also works from the Tilt UI.

https://jeff.holos.dev.k2.ois.run/app/start
2024-04-09 20:26:37 -07:00
Jeff McCune
8a4be66277 (#113) Fix goreleaser try 4
Please check in your pipeline what can be changing the following files:
  M go.sum
2024-04-09 16:48:21 -07:00
Jeff McCune
79ce2f8458 (#113) Fix goreleaser try 3 2024-04-09 16:35:38 -07:00
Jeff McCune
3d4ae44ddd (#113) Fix goreleaser try 2
goreleaser fails with Failure: plugin connect-query: could not find protoc plugin for name connect-query - please make sure protoc-gen-connect-query is installed and present on your $PATH
2024-04-09 16:23:35 -07:00
Jeff McCune
1efb1faa40 (#113) Fix goreleaser
git executable must come before actions checkout
2024-04-09 16:04:42 -07:00
Jeff McCune
bfd6a56397 (#113) Fix actions workflows 2024-04-09 15:57:31 -07:00
Jeff McCune
a788f6d8e8 (#112) Refactor config flag handling
Remove the server.Config struct, not needed.  Remove the app struct and
move the configuration to the main holos.Config.ServerConfig.

Add flags specific to server configuration.

With this patch logging is simplified.  Subcommands have a handle on the
top level holos.Config and can get a fully configured logger from
cfg.Logger() after flag parsing happens.
2024-04-09 11:42:24 -07:00
Jeff McCune
80fa91d74d (#112) Rename wrapper package to errors
The wrapper package name doesn't indicate what it's for.  Rename to
errors and delegate to the standard library.
2024-04-08 20:53:58 -07:00
Jeff McCune
db34562e9a (#112) Get tests passing 2024-04-08 20:53:57 -07:00
Jeff McCune
d6af089ab3 (#112) Rename package core to app
Disambiguate the term `core` which should mean the core domain.  The app
is a supporting domain concerned with logging and configuration
initialization early in the life cycle.
2024-04-08 20:53:57 -07:00
Jeff McCune
b3a70c5911 (#112) Copy holos-server to holos server subcommand
From holos-server commit da35fe966ded2098fe069293ec30864775a6c4f0

Compiles but needs cleanup
2024-04-08 20:53:25 -07:00
Jeff McCune
bf5765c9cb (#110) Update ZITADEL to v2.49.1 from v2.46.0
Attempt to resolve issue where `/oauth/v2/keys` returns `{"keys": []}`
causing id token verification failures.

Closes: #110
2024-04-07 17:20:10 -07:00
Jeff McCune
6c7697648c (#110) Add runbook to take a full database backup
This runbook documents how to write a full database backup to a blank S3
bucket given an existing postgrescluster resource with a live, running
database.

The pgo controller needs to remove and re-create the repo for the backup
to succeed, otherwise it complains about a missing file expected from a
previous backup.
2024-04-07 17:20:07 -07:00
Jeff McCune
04158485c7 (#96) Do not expire ZITADEL signing public key
The public key needs to be configured along with the signing key.
2024-04-05 10:52:36 -07:00
Jeff McCune
cf83c77280 (#96) Do not expire ZITADEL signing private key
Without this patch users encounter an error from istio because it does
not have a valid Jwks from ZITADEL to verify the request when processing
a `RequestAuthentication` policy.

Fixes error `AuthProxy JWKS Error - Jwks doesn't have key to match kid or alg from Jwt`.

Occurs when accessing a protected URL for the first time after tokens have expired.
2024-04-04 15:56:00 -07:00
Jeff McCune
6e545b13dd (#104) Deploy crunchy monitoring stack for ZITADEL
Not exposed via the ingress gateway, but accessible via

    kubectl port-forward svc/crunchy-grafana 3000

Refer to [day two monitoring][1].  This is pretty much a straight copy
of the upstream kustomize configs at [2].

[1]: https://access.crunchydata.com/documentation/postgres-operator/5.5/tutorials/day-two/monitoring
[2]: https://github.com/CrunchyData/postgres-operator-examples/tree/main/kustomize/monitoring
2024-04-04 15:40:07 -07:00
Jeff McCune
bf258a1f41 (#104) Enable monitoring for ZITADEL postgres
This patch enables the monitoring configuration for the ZITADEL postgres
cluster.

Refer to: https://access.crunchydata.com/documentation/postgres-operator/5.5/tutorials/day-two/monitoring

Integrating with:
https://github.com/CrunchyData/postgres-operator-examples/tree/main/kustomize/monitoring
which will become a separate holos component instance.
2024-04-03 22:26:38 -07:00
Jeff McCune
6f06c73d6f (#85) Initial addition of kube-prometheus-stack
Grafana does not yet have the istio sidecar.  Prometheus is accessible
through the auth proxy.  Cert manager is added to the workload clusters
so tls certs can be issued for webhooks, the kube-prom-stack helm chart
uses cert manager for this purpose.

With this patch Grafana is integrated with OIDC and I'm able to log in
as an Administrator.
2024-04-03 21:29:26 -07:00
Jeff McCune
a689c53a9c (#47) v0.62.1 - Projects v1alpha1 milestone complete 2024-04-03 15:32:34 -07:00
Jeff McCune
58cdda1d35 Merge pull request #100 from holos-run/jeff/47-iam-v2
(#47) Remove the prod-iam-zitadel namespace
2024-04-03 15:23:48 -07:00
Jeff McCune
bcb02b5c5c (#47) Remove the prod-iam-zitadel namespace
No longer needed, cluster has moved to prod-iam namespace.
2024-04-03 15:10:30 -07:00
Jeff McCune
0736c7de1a (#47) Bind ALL VirtualServices to the default gateway
Problem:
The VirtualService that catches auth routes for paths, e.g.
`/holos/authproxy/istio-ingress` is bound to the default gateway which
no longer exists because it has no hosts.

Solution:
It's unnecessary and complicated to create a Gateway for every project.
Instead, put all server entries into one `default` gateway and
consolidate the list using CUE.

Result:
It's easier to reason about this system.  There is only one ingress
gateway, `default` and everything gets added to it.  VirtualServices
need only bind to this gateway, which has a hosts entry appropriately
namespaced for the project.
2024-04-03 14:56:40 -07:00
Jeff McCune
28be9f9fbb (#47) Use the project specific Gateway
The login service is unavailable because the wrong gateway is used.
When using projects the VS needs to attach to the correct Gateway.
2024-04-03 12:59:48 -07:00
Jeff McCune
647681de38 (#99) Restore backups from prod-iam namespace
This patch configures the standby cluster to restore backups from the
prod-iam namespace instead of the prod-iam-zitadel namespace.
2024-04-03 12:30:12 -07:00
Jeff McCune
81beb5c539 (#47) Restore ZITADEL from existing backups
Problem:
The ZITADEL database isn't restoring into the prod-iam namespace after
moving from prod-iam-zitadel because no backup exists at the bucket
path.

Solution:
Hard-code the path to the old namespace to restore the database.  We'll
figure out how to move the backups to the new location in a follow up
change.
2024-04-03 11:44:16 -07:00
Jeff McCune
5c1e0a29c8 (#47) Have Ceph depend on secret stores
Another kustomization reconciling too early.
2024-04-03 11:22:15 -07:00
Jeff McCune
01ac5276a9 (#47) Have Gateway depend on secret stores
The `prod-platform-gateway` kustomization is reconciling early:

ExternalSecret/istio-ingress/argocd.ois.run dry-run failed: failed to
get API group resources: unable to retrieve the complete list of server
APIs: external-secrets.io/v1beta1: the server could not find the
requested resource
2024-04-03 11:20:15 -07:00
Jeff McCune
e40594ad8e (#47) Move ZITADEL to prod-iam project namespace
This patch moves ZITADEL from the prod-iam-zitadel namespace to the
projects managed prod-iam namespace, which is the prod environment of
the prod stage of the iam project.
2024-04-03 11:06:55 -07:00
Jeff McCune
bc9c6a622a (#97) Increase ZITADEL pgdata volume to 20Gi
Problem:

```
❯ k exec zitadel-pgha1-4npq-0 -it -- bash
Defaulted container "database" out of: database, replication-cert-copy, pgbackrest, pgbackrest-config, postgres-startup (init), nss-wrapper-init (init)
bash-4.4$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         119G   51G   68G  43% /
tmpfs            64M     0   64M   0% /dev
/dev/rbd3       9.8G  9.8G     0 100% /pgdata
/dev/sda6       119G   51G   68G  43% /tmp
tmpfs            16G   24K   16G   1% /pgconf/tls
tmpfs            16G   24K   16G   1% /etc/database-containerinfo
tmpfs            16G   16K   16G   1% /etc/patroni
tmpfs            16G     0   16G   0% /dev/shm
tmpfs            16G   28K   16G   1% /etc/pgbackrest/conf.d
tmpfs            16G   12K   16G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           7.9G     0  7.9G   0% /proc/acpi
tmpfs           7.9G     0  7.9G   0% /proc/scsi
tmpfs           7.9G     0  7.9G   0% /sys/firmware
```
2024-04-03 10:09:49 -07:00
Jeff McCune
17f22199b7 (#86) ArgoCD - Disable Dex
Not needed
2024-04-02 15:47:22 -07:00
Jeff McCune
7e93fe4535 (#86) ArgoCD
Using the Helm chart so we can inject the istio sidecar with a kustomize
patch and tweak the configs for OIDC integration.

Login works, istio sidecar is injected.  ArgoCD can only be configured
with one domain unfortunately, it's not accessible at argocd.ois.run,
only argocd.k2.ois.run (or whatever cluster it's installed into).

Ideally it would use the Host header but it does not.

RBAC is not implemented but the User Info endpoint does have group
membership so this shouldn't be a problem to implement.
2024-04-02 15:33:47 -07:00
Jeff McCune
2e98df3572 (#86) ArgoCD in prod-platform project namespace
Deploys using the official release yaml.
2024-04-02 13:34:03 -07:00
Jeff McCune
3b561de413 (#93) Custom AuthPolicy rules for vault
This patch defines a #AuthPolicyRules struct which excludes hosts from
the blanket auth policy and includes them in specialized auth policies.
The purpose is to handle special cases like vault requests which have an
`X-Vault-Token` and `X-Vault-Request` header.

Vault does not use jwts so we cannot verify them in the mesh, have to
pass them along to the backend.

Closes: #93
2024-04-02 12:54:31 -07:00
Jeff McCune
0d0dae8742 (#89) Disable project auth proxies by default
Focus on the ingress gateway auth proxy for now and see how far it gets
us.
2024-04-01 21:48:08 -07:00
Jeff McCune
61b4b5bd17 (#89) Refactor auth proxy callbacks
The ingress gateway auth proxy callback conflicts with the project stage
auth proxy callback for the same backend Host: header value.

This patch disambiguates them by the namespace the auth proxy resides
in.
2024-04-01 21:37:52 -07:00
Jeff McCune
0060740b76 (#82) ingress gateway AuthorizationPolicy
This patch adds a `RequestAuthentication` and `AuthorizationPolicy` rule
to protect all requests flowing through the default ingress gateway.

Consider a browser request for httpbin.k2.example.com representing any
arbitrary host with a valid destination inside the service mesh.  The
default ingress gateway will check if there is already an
x-oidc-id-token header, and if so validate the token is issued by
ZITADEL and the aud value contains the ZITADEL project number.

If the header is not present, the request is forwarded to oauth2-proxy
in the istio-ingress namespace.  This auth proxy is configured to start
the oidc auth flow with a redirect back to /holos/oidc/callback of the
Host: value originally provided in the browser request.

Closes: #82
2024-04-01 20:37:34 -07:00
Jeff McCune
bf8a4af579 (#82) ingressgateway ExtAuthzHttp provider
This patch adds an ingress gateway extauthz provider.  Because ZITADEL
returns all applications associated with a ZITADEL project in the aud
claim, it makes sense to have one ingress auth proxy at the initial
ingress gateway so we can get the ID token in the request header for
backend namespaces to match using `RequestAuthentication` and
`AuthorizationPolicy`.

This change likely makes the additional per-stage auth proxies
unnecessary and over-engineered.  Backend namespaces will have access to
the ID token.
2024-04-01 16:53:11 -07:00
Jeff McCune
dc057fe39d (#89) Add platform project hosts for argocd, grafana, and prometheus
Certificates are issued by the provisioner and synced to the workload
clusters.
2024-04-01 13:09:46 -07:00
Jeff McCune
9877ab131a (#89) Platform Project
This patch manages a platform project to host platform level services
like ArgoCD, Kube Prom Stack, Kiali, etc...
2024-04-01 11:46:02 -07:00
Jeff McCune
13aba64cb7 (#66) Move CUSTOM AuthorizationPolicy to env namespace
It doesn't make sense to link the stage ext authz provider to the
ingress gateway because there can be only one provider per workload.

Link it instead to the backend environment and use the
`security.holos.run/authproxy` label to match the workload.
2024-03-31 18:56:14 -07:00
Jeff McCune
fe9bc2dbfc (#81) Istio 1.21.0 2024-03-31 12:51:56 -07:00
Jeff McCune
c53b682852 (#66) Use x-oidc-id-token instead of authorization header
Problem:
Backend services and web apps expect to place their own credentials into
the Authorization header.  oauth2-proxy writes over the authorization
header creating a conflict.

Solution:
Use the alpha configuration to place the id token into the
x-oidc-id-token header and configure the service mesh to authenticate
requests that have this header in place.

Note: ZITADEL does not use a JWT for an access token, unlike Keycloak
and Dex.  The access token is not compatible with a
RequestAuthentication jwt rule so we must use the id token.
2024-03-31 11:41:23 -07:00
Jeff McCune
3aca6a9e4c (#66) configure auth proxies to set Authorization: Bearer header
Without this patch the istio RequestAuthentication resources fail to
match because the access token from ZITADEL returned by oauth2-proxy in
the x-auth-request-access-token header is not a proper jwt.

The error is:

```
Jwt is not in the form of Header.Payload.Signature with two dots and 3 sections
```

This patch works around the problem by configuring oauth2-proxy to set
the ID token, which is guaranteed to be a proper JWT in the
authorization response headers.

Unfortunately, oauth2-proxy will only place the ID token in the
Authorization header response, which will write over any header set by a
client application.  This is likely to cause problems with single page
apps.

We'll probably need to work around this issue by using the alpha
configuration to set the id token in some out-of-the-way header.  We've
done this before, it'll just take some work to setup the ConfigMap and
translate the config again.
2024-03-30 16:15:27 -07:00
Jeff McCune
40fdfc0317 (#66) Fix auth proxy provider name, stage is always first
dev-holos-authproxy not authproxy-dev-holos
2024-03-30 14:05:50 -07:00
Jeff McCune
25d9415b0a (#66) Fix redis not able to write to /data
Without this patch redis cannot write to the /data directory, which
causes oauth2-proxy to fail with a 500 server error.
2024-03-30 13:40:34 -07:00
Jeff McCune
43c8702398 (#66) Configure an ExtAuthzProxy provider for each project stage
This patch configures an istio envoyExtAuthzHttp provider for each stage
in each project.  An example provider for the dev stage of the holos
project is `authproxy-dev-holos`
2024-03-30 11:28:23 -07:00
Jeff McCune
ce94776dbb (#66) Add ZITADEL project and client ids for iam project
core1 and core2 don't render without these resource identifiers in
place.
2024-03-30 09:18:54 -07:00
Jeff McCune
78ab6cd848 (#66) Match /holos/oidc for all hosts in the project stage
This has the same effect and makes the VirtualService much more
manageable, particularly when calling `kubectl get vs -A`.
2024-03-29 22:50:17 -07:00
Jeff McCune
0a7001f868 (#66) Configure the primary domain for zitadel
This bypasses the account selection screen and automatically redirects
back to the application without user interaction.
2024-03-29 22:44:52 -07:00
Jeff McCune
2db7be671b (#66) Route prefix /holos/oidc to authproxy
This patch configures the service mesh to route all requests with a uri
path prefix of `/holos/oidc` to the auth proxy associated with the
project stage.

Consider a request to https://jeff.holos.dev.k2.ois.run/holos/oidc/sign_in

This request is usually routed to the backend app, but
VirtualService/authproxy in the dev-holos-system namespace matches the
request and routes it to the auth proxy instead.

The auth proxy matches the request Host: header against the whitelist
and cookiedomain setting, which matches the suffix
`.holos.dev.k2.ois.run`.  The auth proxy redirects to the oidc issuer
with a callback url of the request Host for a url of
`https://jeff.holos.dev.k2.ois.run/holos/oidc/callback`.

ZITADEL matches the callback against those registered with the app and
the app client id.  A code is then sent back to the auth proxy.

The auth proxy sets a cookie named `__Secure-authproxy-dev-holos` with a
domain of `.holos.dev.k2.ois.run` from the suffix match of the
`--cookiedomain` flag.

Because this all works using paths, the `auth` prefix domains have been
removed.  They're unnecessary, oauth2-proxy is available for any host
routed to the project stage at path prefix `/holos/oidc`.

Refer to https://oauth2-proxy.github.io/oauth2-proxy/features/endpoints/
for good endpoints for debuggin, replacing `/oauth2` with `/holos/oidc`
2024-03-29 21:56:46 -07:00
Jeff McCune
b51870f7bf (#66) Deploy oauth2-proxy and redis to stage namespaces
This patch deploys oauth2-proxy and redis to the system namespace of
each stage in each project.  The plan is to redirect unauthenticated
requests to the request host at the /holos/oidc/callback endpoint.

This patch removes the --redirect-uri flag, which makes the auth domain
prefix moot, so a future patch should remove those if they really are
unnecessary.

The reason to remove the --redirect-uri flag is to make sure we set the
cookie to a domain suffix of the request Host: header.
2024-03-29 20:56:26 -07:00
Jeff McCune
0227dfa7e5 (#66) Add Gateway entries for oauth2-proxy
This patch adds entries to the project stage Gateway for oauth2-proxy.
Three entries for each stage are added, one for the global endpoint plus
one for each cluster.
2024-03-29 15:30:02 -07:00
Jeff McCune
05b59d9af0 (#66) Refactor project hosts for auth proxy cookies
Without this patch the auth proxy cookie domain is difficult to manage.
This patch refactors the hosts managed for each environment in a project
to better align with security domains and auth proxy session cookies.

The convention is: `<env?>.<host>.<stage?>.<cluster?>.<domain>` where
`host` can be 0..N entries with a default value of `[projectName]`.

env may be omitted for prod or the dev env of the dev stage.  stage may
be omitted for prod.  cluster may be omitted for the global endpoint.

For a project named `holos`:

| Project | Stage | Env  | Cluster | Host                      |
| ------- | ----- | ---  | ------- | ------                    |
| holos   | dev   | jeff | k2      | jeff.holos.dev.k2.ois.run |
| holos   | dev   | jeff | global  | jeff.holos.dev.ois.run    |
| holos   | dev   | -    | k2      | holos.dev.k2.ois.run      |
| holos   | dev   | -    | global  | holos.dev.ois.run         |
| holos   | prod  | -    | k2      | holos.k2.ois.run          |
| holos   | prod  | -    | global  | holos.ois.run             |

Auth proxy:

| Project | Stage | Auth Proxy Host           | Auth Cookie Domain   |
| ------- | ----- | ------                    | ------------------   |
| holos   | dev   | auth.holos.dev.ois.run    | holos.dev.ois.run    |
| holos   | dev   | auth.holos.dev.k1.ois.run | holos.dev.k1.ois.run |
| holos   | dev   | auth.holos.dev.k2.ois.run | holos.dev.k2.ois.run |
| holos   | prod  | auth.holos.ois.run        | holos.ois.run        |
| holos   | prod  | auth.holos.k1.ois.run     | holos.k1.ois.run     |
| holos   | prod  | auth.holos.k2.ois.run     | holos.k2.ois.run     |
2024-03-29 15:30:01 -07:00
Jeff McCune
04f9f3b3a8 Merge pull request #79 from holos-run/nate/makefile_version
Show the holos version in 'make install|build'
2024-03-29 15:04:48 -07:00
Nate McCurdy
b58be8b38c Show the holos version in 'make install|build'
Prior to this, when running the 'install' or 'build' Makefile target,
the version of holos being built was not shown even though the 'build'
target attempted to show the version.

```
.PHONY: build
build: generate ## Build holos executable.
	@echo "building ${BIN_NAME} ${VERSION}"
```

For example:
```
> make install
go generate ./...
building holos
...
```

Holo's version is stored in pkg/version/embedded/{major,minor,patch},
not the `Version` const. So the fix is to change the value of `VERSION`
so that it comes from those embedded files.

Now the version of holos is shown:

```
> make install
go generate ./...
building holos 0.61.1
...
```

This also adds a new Makefile target called `show-version` which shows
the full version string (i.e. the value of `$VERSION`).
2024-03-29 15:01:33 -07:00
Jeff McCune
10493d754a (#66) Add httpbin to each project environment
The goal of this patch is to verify each project environment is wired up
to the ingress Gateway for the project stage.

This is a necessary step to eventually configure the VirtualService and
AuthorizationPolicy to only match on the `/dump/request` path of each
endpoint for troubleshooting.
2024-03-28 21:51:34 -07:00
Jeff McCune
cf28516b8b (#66) Project managed namespaces
This patch uses the existing #ManagedNamespaces definition to create and
manage namespaces on the provisioner and workload clusters so that
SecretStore and eso-creds-refresher resources are managed in the project
environment namespaces and the project stage system namespace.
2024-03-28 15:09:57 -07:00
Jeff McCune
d81e25c4e4 (#66) Project Certificates
Provisioner cluster:

This patch creates a Certificate resource in the provisioner for each
host associated with the project.  By default, one host is created for
each stage with the short hostname set to the project name.

A namespace is also created for each project for eso creds refresher to
manage service accounts for SecretStore resources in the workload
clusters.

Workload cluster:

For each env, plus one system namespace per stage:

 - Namespace per env
 - SecretStore per env
 - ExternalSecret per host in the env

Common names for the holos project, prod stage:

- holos.k1.ois.run
- holos.k2.ois.run
- holos.ois.run

Common names for the holos project, dev stage:

- holos.dev.k1.ois.run
- holos.dev.k2.ois.run
- holos.dev.ois.run
- holos.gary.k1.ois.run
- holos.gary.k2.ois.run
- holos.gary.ois.run
- holos.jeff.k1.ois.run
- holos.jeff.k2.ois.run
- holos.jeff.ois.run
- holos.nate.k1.ois.run
- holos.nate.k2.ois.run
- holos.nate.ois.run

Usage:

    holos render --cluster-name=provisioner \
      ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/provisioner/projects/...
    holos render --cluster-name=k1 \
      ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/workload/projects/...
    holos render --cluster-name=k2 \
      ~/workspace/holos-run/holos/docs/examples/platforms/reference/clusters/workload/projects/...
2024-03-27 20:54:51 -07:00
Jeff McCune
c4612ff5d2 (#64) Manage one system namespace per project
This patch introduces a new BuildPlan spec.components.resources
collection, which is a map version of
spec.components.kubernetesObjectsList.  The map version is much easier
to work with and produce in CUE than the list version.

The list version should be deprecated and removed prior to public
release.

The projects holos instance renders multiple holos components, each
containing kubernetes api objects defined directly in CUE.

<project>-system is intended for the ext auth proxy providers for all
stages.

<project>-namespaces is intended to create a namespace for each
environment in the project.

The intent is to expand the platform level definition of a project to
include the per-stage auth proxy and per-env role bindings.  Secret
Store and ESO creds refresher resources will also be defined by the
platform level definition of a project.
2024-03-26 12:23:01 -07:00
Jeff McCune
d70acbb47e ignore .vscode 2024-03-22 21:22:06 -07:00
533 changed files with 127906 additions and 870 deletions

View File

@@ -17,12 +17,33 @@ jobs:
name: lint
runs-on: gha-rs
steps:
- uses: actions/checkout@v4
- uses: actions/setup-go@v5
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Node
uses: actions/setup-node@v4
with:
node-version: 20
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: stable
cache: false
- name: Install Packages
run: sudo apt update && sudo apt -qq -y install git curl zip unzip tar bzip2 make
- name: Install Tools
run: |
set -x
make tools
make buf
go generate ./...
make frontend
go mod tidy
- name: golangci-lint
uses: golangci/golangci-lint-action@v4
with:
version: latest
skip-pkg-cache: true

View File

@@ -14,23 +14,49 @@ jobs:
goreleaser:
runs-on: gha-rs
steps:
# Must come before Checkout, otherwise goreleaser fails
- name: Provide GPG and Git
run: sudo apt update && sudo apt -qq -y install gnupg git
run: sudo apt update && sudo apt -qq -y install gnupg git curl zip unzip tar bzip2 make
# Must come after git executable is provided
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Node
uses: actions/setup-node@v4
with:
node-version: 20
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: stable
# Necessary to run these outside of goreleaser, otherwise
# /home/runner/_work/holos/holos/internal/frontend/node_modules/.bin/protoc-gen-connect-query is not in PATH
- name: Install Tools
run: |
set -x
make tools
make buf
go generate ./...
make frontend
go mod tidy
- name: Import GPG key
uses: crazy-max/ghaction-import-gpg@v6
with:
gpg_private_key: ${{ secrets.GPG_CODE_SIGNING_SECRETKEY }}
passphrase: ${{ secrets.GPG_CODE_SIGNING_PASSPHRASE }}
- name: List keys
run: gpg -K
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: stable
- name: Git diff
run: git diff
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@v5
with:

View File

@@ -18,13 +18,18 @@ jobs:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Node
uses: actions/setup-node@v4
with:
node-version: 20
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: stable
- name: Provide unzip for Helm
run: sudo apt update && sudo apt -qq -y install curl zip unzip tar bzip2
- name: Install Packages
run: sudo apt update && sudo apt -qq -y install git curl zip unzip tar bzip2 make
- name: Set up Helm
uses: azure/setup-helm@v4
@@ -32,5 +37,14 @@ jobs:
- name: Set up Kubectl
uses: azure/setup-kubectl@v3
- name: Install Tools
run: |
set -x
make tools
make buf
go generate ./...
make frontend
go mod tidy
- name: Test
run: ./scripts/test

6
.gitignore vendored
View File

@@ -1,7 +1,9 @@
bin/
/bin/
vendor/
.idea/
coverage.out
dist/
/dist/
*.hold/
/deploy/
.vscode/
tmp/

View File

@@ -10,10 +10,8 @@ version: 1
before:
hooks:
# You may remove this if you don't use go modules.
- go mod tidy
# you may remove this if you don't need go generate
- go generate ./...
- go mod tidy
builds:
- main: ./cmd/holos
@@ -23,6 +21,9 @@ builds:
- linux
- windows
- darwin
goarch:
- amd64
- arm64
signs:
- artifacts: checksum

View File

@@ -4,7 +4,7 @@ PROJ=holos
ORG_PATH=github.com/holos-run
REPO_PATH=$(ORG_PATH)/$(PROJ)
VERSION := $(shell grep "const Version " pkg/version/version.go | sed -E 's/.*"(.+)"$$/\1/')
VERSION := $(shell cat version/embedded/major version/embedded/minor version/embedded/patch | xargs printf "%s.%s.%s")
BIN_NAME := holos
DOCKER_REPO=quay.io/openinfrastructure/holos
@@ -12,11 +12,14 @@ IMAGE_NAME=$(DOCKER_REPO)
$( shell mkdir -p bin)
# For buf plugin protoc-gen-connect-es
export PATH := $(PWD)/internal/frontend/holos/node_modules/.bin:$(PATH)
GIT_COMMIT=$(shell git rev-parse HEAD)
GIT_TREE_STATE=$(shell test -n "`git status --porcelain`" && echo "dirty" || echo "clean")
BUILD_DATE=$(shell date -Iseconds)
LD_FLAGS="-w -X ${ORG_PATH}/${PROJ}/pkg/version.GitCommit=${GIT_COMMIT} -X ${ORG_PATH}/${PROJ}/pkg/version.GitTreeState=${GIT_TREE_STATE} -X ${ORG_PATH}/${PROJ}/pkg/version.BuildDate=${BUILD_DATE}"
LD_FLAGS="-w -X ${ORG_PATH}/${PROJ}/version.GitCommit=${GIT_COMMIT} -X ${ORG_PATH}/${PROJ}/version.GitTreeState=${GIT_TREE_STATE} -X ${ORG_PATH}/${PROJ}/version.BuildDate=${BUILD_DATE}"
.PHONY: default
default: test
@@ -39,6 +42,10 @@ bumpmajor: ## Bump the major version.
scripts/bump minor 0
scripts/bump patch 0
.PHONY: show-version
show-version: ## Print the full version.
@echo $(VERSION)
.PHONY: tidy
tidy: ## Tidy go module.
go mod tidy
@@ -54,14 +61,26 @@ vet: ## Vet Go code.
.PHONY: gencue
gencue: ## Generate CUE definitions
cd docs/examples && cue get go github.com/holos-run/holos/api/...
cd internal/generate/platforms && cue get go github.com/holos-run/holos/api/v1alpha1/...
.PHONY: rmgen
rmgen: ## Remove generated code
git rm -rf service/gen/ internal/frontend/holos/src/app/gen/ || true
rm -rf service/gen/ internal/frontend/holos/src/app/gen/
git rm -rf internal/ent/
rm -rf internal/ent/
git restore --staged internal/ent/generate.go internal/ent/schema/
git restore internal/ent/generate.go internal/ent/schema/
.PHONY: regenerate
regenerate: generate ## Re-generate code (delete and re-create)
.PHONY: generate
generate: ## Generate code.
generate: buf gencue ## Generate code.
go generate ./...
.PHONY: build
build: generate ## Build holos executable.
build: generate frontend ## Build holos executable.
@echo "building ${BIN_NAME} ${VERSION}"
@echo "GOPATH=${GOPATH}"
go build -trimpath -o bin/$(BIN_NAME) -ldflags $(LD_FLAGS) $(REPO_PATH)/cmd/$(BIN_NAME)
@@ -80,6 +99,8 @@ test: ## Run tests.
.PHONY: lint
lint: ## Run linters.
buf lint
cd internal/frontend/holos && ng lint
golangci-lint run
.PHONY: coverage
@@ -90,6 +111,40 @@ coverage: test ## Test coverage profile.
snapshot: ## Go release snapshot
goreleaser release --snapshot --clean
.PHONY: buf
buf: ## buf generate
cd service && buf dep update
buf generate
.PHONY: tools
tools: go-deps frontend-deps ## install tool dependencies
.PHONY: go-deps
go-deps: ## tool versions pinned in tools.go
go install github.com/bufbuild/buf/cmd/buf
go install github.com/fullstorydev/grpcurl/cmd/grpcurl
go install google.golang.org/protobuf/cmd/protoc-gen-go
go install connectrpc.com/connect/cmd/protoc-gen-connect-go
go install honnef.co/go/tools/cmd/staticcheck@latest
# curl https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | bash
.PHONY: frontend-deps
frontend-deps: ## Setup npm and vite
cd internal/frontend/holos && npm install
cd internal/frontend/holos && npm install --save-dev @bufbuild/buf @connectrpc/protoc-gen-connect-es
cd internal/frontend/holos && npm install @connectrpc/connect @connectrpc/connect-web @bufbuild/protobuf
# https://github.com/connectrpc/connect-query-es/blob/1350b6f07b6aead81793917954bdb1cc3ce09df9/packages/protoc-gen-connect-query/README.md?plain=1#L23
cd internal/frontend/holos && npm install --save-dev @connectrpc/protoc-gen-connect-query @bufbuild/protoc-gen-es
cd internal/frontend/holos && npm install @connectrpc/connect-query @bufbuild/protobuf
.PHONY: frontend
frontend: buf
cd internal/frontend/holos && rm -rf dist
mkdir -p internal/frontend/holos/dist
cd internal/frontend/holos && ng build
touch internal/frontend/frontend.go
.PHONY: help
help: ## Display this help menu.
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-20s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)

311
Tiltfile Normal file
View File

@@ -0,0 +1,311 @@
# -*- mode: Python -*-
# This Tiltfile manages a Go project with live leload in Kubernetes
listen_port = 3000
metrics_port = 9090
# Use our wrapper to set the kube namespace
if os.getenv('TILT_WRAPPER') != '1':
fail("could not run, ./hack/tilt/bin/tilt was not used to start tilt")
# AWS Account to work in
aws_account = '271053619184'
aws_region = 'us-east-2'
# Resource ids
holos_backend = 'Holos Backend'
pg_admin = 'pgAdmin'
pg_cluster = 'PostgresCluster'
pg_svc = 'Database Pod'
compile_id = 'Go Build'
auth_id = 'Auth Policy'
lint_id = 'Run Linters'
tests_id = 'Run Tests'
# PostgresCluster resource name in k8s
pg_cluster_name = 'holos'
# Database name inside the PostgresCluster
pg_database_name = 'holos'
# PGAdmin name
pg_admin_name = 'pgadmin'
# Default Registry.
# See: https://github.com/tilt-dev/tilt.build/blob/master/docs/choosing_clusters.md#manual-configuration
# Note, Tilt will append the image name to the registry uri path
default_registry('{account}.dkr.ecr.{region}.amazonaws.com/holos-run/holos-server'.format(account=aws_account, region=aws_region))
# Set a name prefix specific to the user. Multiple developers share the tilt-holos namespace.
developer = os.getenv('USER')
holos_server = 'holos'
# See ./hack/tilt/bin/tilt
namespace = os.getenv('NAMESPACE')
# We always develop against the k1 cluster.
os.putenv('KUBECONFIG', os.path.abspath('./hack/tilt/kubeconfig'))
# The context defined in ./hack/tilt/kubeconfig
allow_k8s_contexts('sso@k1')
allow_k8s_contexts('sso@k2')
allow_k8s_contexts('sso@k3')
allow_k8s_contexts('sso@k4')
allow_k8s_contexts('sso@k5')
# PG db connection for localhost -> k8s port-forward
os.putenv('PGHOST', 'localhost')
os.putenv('PGPORT', '15432')
# We always develop in the dev aws account.
os.putenv('AWS_CONFIG_FILE', os.path.abspath('./hack/tilt/aws.config'))
os.putenv('AWS_ACCOUNT', aws_account)
os.putenv('AWS_DEFAULT_REGION', aws_region)
os.putenv('AWS_PROFILE', 'dev-holos')
os.putenv('AWS_SDK_LOAD_CONFIG', '1')
# Authenticate to AWS ECR when tilt up is run by the developer
local_resource('AWS Credentials', './hack/tilt/aws-login.sh', auto_init=True)
# Extensions are open-source, pre-packaged functions that extend Tilt
#
# More info: https://github.com/tilt-dev/tilt-extensions
# More info: https://docs.tilt.dev/extensions.html
load('ext://restart_process', 'docker_build_with_restart')
load('ext://k8s_attach', 'k8s_attach')
load('ext://git_resource', 'git_checkout')
load('ext://uibutton', 'cmd_button')
# Paths edited by the developer Tilt watches to trigger compilation.
# Generated files should be excluded to avoid an infinite build loop.
developer_paths = [
'./cmd',
'./internal/server',
'./internal/ent/schema',
'./frontend/package-lock.json',
'./frontend/src',
'./go.mod',
'./pkg',
'./service/holos',
]
# Builds the holos-server executable
local_resource(compile_id, 'make build', deps=developer_paths)
# Build Docker image
# Tilt will automatically associate image builds with the resource(s)
# that reference them (e.g. via Kubernetes or Docker Compose YAML).
#
# More info: https://docs.tilt.dev/api.html#api.docker_build
#
docker_build_with_restart(
'holos',
context='.',
entrypoint=[
'/app/bin/holos',
'server',
'--listen-port={}'.format(listen_port),
'--oidc-issuer=https://login.ois.run',
'--oidc-audience=262096764402729854@holos_platform',
'--log-level=debug',
'--metrics-port={}'.format(metrics_port),
],
dockerfile='./hack/tilt/Dockerfile',
only=['./bin'],
# (Recommended) Updating a running container in-place
# https://docs.tilt.dev/live_update_reference.html
live_update=[
# Sync files from host to container
sync('./bin', '/app/bin'),
# Wait for aws-login https://github.com/tilt-dev/tilt/issues/3048
sync('./tilt/aws-login.last', '/dev/null'),
# Execute commands in the container when paths change
# run('/app/hack/codegen.sh', trigger=['./app/api'])
],
)
# Run local commands
# Local commands can be helpful for one-time tasks like installing
# project prerequisites. They can also manage long-lived processes
# for non-containerized services or dependencies.
#
# More info: https://docs.tilt.dev/local_resource.html
#
# local_resource('install-helm',
# cmd='which helm > /dev/null || brew install helm',
# # `cmd_bat`, when present, is used instead of `cmd` on Windows.
# cmd_bat=[
# 'powershell.exe',
# '-Noninteractive',
# '-Command',
# '& {if (!(Get-Command helm -ErrorAction SilentlyContinue)) {scoop install helm}}'
# ]
# )
# Teach tilt about our custom resources (Note, this may be intended for workloads)
# k8s_kind('authorizationpolicy')
# k8s_kind('requestauthentication')
# k8s_kind('virtualservice')
k8s_kind('pgadmin')
# Troubleshooting
def resource_name(id):
print('resource: {}'.format(id))
return id.name
workload_to_resource_function(resource_name)
# Apply Kubernetes manifests
# Tilt will build & push any necessary images, re-deploying your
# resources as they change.
#
# More info: https://docs.tilt.dev/api.html#api.k8s_yaml
#
def holos_yaml():
"""Return a k8s Deployment personalized for the developer."""
k8s_yaml_template = str(read_file('./hack/tilt/k8s.yaml'))
return k8s_yaml_template.format(
name=holos_server,
developer=developer,
namespace=namespace,
listen_port=listen_port,
metrics_port=metrics_port,
tz=os.getenv('TZ'),
)
# Customize a Kubernetes resource
# By default, Kubernetes resource names are automatically assigned
# based on objects in the YAML manifests, e.g. Deployment name.
#
# Tilt strives for sane defaults, so calling k8s_resource is
# optional, and you only need to pass the arguments you want to
# override.
#
# More info: https://docs.tilt.dev/api.html#api.k8s_resource
#
k8s_yaml(blob(holos_yaml()))
# Backend server process
k8s_resource(
workload=holos_server,
new_name=holos_backend,
objects=[
'{}:serviceaccount'.format(holos_server),
'{}:servicemonitor'.format(holos_server),
],
resource_deps=[compile_id],
links=[
link('https://{}.app.dev.k2.holos.run/ui/'.format(developer), "Holos Web UI")
],
)
# AuthorizationPolicy - Beyond Corp functionality
k8s_resource(
new_name=auth_id,
objects=[
'{}:virtualservice'.format(holos_server),
],
)
# Database
# Note: Tilt confuses the backup pods with the database server pods, so this code is careful to tease the pods
# apart so logs are streamed correctly.
# See: https://github.com/tilt-dev/tilt.specs/blob/master/resource_assembly.md
# pgAdmin Web UI
k8s_resource(
workload=pg_admin_name,
new_name=pg_admin,
port_forwards=[
port_forward(15050, 5050, pg_admin),
],
)
# Disabled because these don't group resources nicely
# k8s_kind('postgrescluster')
# Postgres database in-cluster
k8s_resource(
new_name=pg_cluster,
objects=['holos:postgrescluster'],
)
# Needed to select the database by label
# https://docs.tilt.dev/api.html#api.k8s_custom_deploy
k8s_custom_deploy(
pg_svc,
apply_cmd=['./hack/tilt/k8s-get-db-sts', pg_cluster_name],
delete_cmd=['echo', 'Skipping delete. Object managed by custom resource.'],
deps=[],
)
k8s_resource(
pg_svc,
port_forwards=[
port_forward(15432, 5432, 'psql'),
],
resource_deps=[pg_cluster]
)
# Run tests
local_resource(
tests_id,
'make test',
allow_parallel=True,
auto_init=False,
deps=developer_paths,
)
# Run linter
local_resource(
lint_id,
'make lint',
allow_parallel=True,
auto_init=False,
deps=developer_paths,
)
# UI Buttons for helpful things.
# Icons: https://fonts.google.com/icons
os.putenv("GH_FORCE_TTY", "80%")
cmd_button(
'{}:go-test-failfast'.format(tests_id),
argv=['./hack/tilt/go-test-failfast'],
resource=tests_id,
icon_name='quiz',
text='Fail Fast',
)
cmd_button(
'{}:issues'.format(holos_server),
argv=['./hack/tilt/gh-issues'],
resource=holos_backend,
icon_name='folder_data',
text='Issues',
)
cmd_button(
'{}:gh-issue-view'.format(holos_server),
argv=['./hack/tilt/gh-issue-view'],
resource=holos_backend,
icon_name='task',
text='View Issue',
)
cmd_button(
'{}:get-pgdb-creds'.format(holos_server),
argv=['./hack/tilt/get-pgdb-creds', pg_cluster_name, pg_database_name],
resource=pg_svc,
icon_name='lock_open_right',
text='DB Creds',
)
cmd_button(
'{}:get-pgdb-creds'.format(pg_admin_name),
argv=['./hack/tilt/get-pgdb-creds', pg_cluster_name, pg_database_name],
resource=pg_admin,
icon_name='lock_open_right',
text='DB Creds',
)
cmd_button(
'{}:get-pgadmin-creds'.format(pg_admin_name),
argv=['./hack/tilt/get-pgadmin-creds', pg_admin_name],
resource=pg_admin,
icon_name='lock_open_right',
text='pgAdmin Login',
)
print("✨ Tiltfile evaluated")

View File

@@ -19,9 +19,10 @@ type BuildPlanSpec struct {
}
type BuildPlanComponents struct {
HelmChartList []HelmChart `json:"helmChartList,omitempty" yaml:"helmChartList,omitempty"`
KubernetesObjectsList []KubernetesObjects `json:"kubernetesObjectsList,omitempty" yaml:"kubernetesObjectsList,omitempty"`
KustomizeBuildList []KustomizeBuild `json:"kustomizeBuildList,omitempty" yaml:"kustomizeBuildList,omitempty"`
HelmChartList []HelmChart `json:"helmChartList,omitempty" yaml:"helmChartList,omitempty"`
KubernetesObjectsList []KubernetesObjects `json:"kubernetesObjectsList,omitempty" yaml:"kubernetesObjectsList,omitempty"`
KustomizeBuildList []KustomizeBuild `json:"kustomizeBuildList,omitempty" yaml:"kustomizeBuildList,omitempty"`
Resources map[string]KubernetesObjects `json:"resources,omitempty" yaml:"resources,omitempty"`
}
func (bp *BuildPlan) Validate() error {
@@ -37,3 +38,14 @@ func (bp *BuildPlan) Validate() error {
}
return nil
}
func (bp *BuildPlan) ResultCapacity() (count int) {
if bp == nil {
return 0
}
count = len(bp.Spec.Components.HelmChartList) +
len(bp.Spec.Components.KubernetesObjectsList) +
len(bp.Spec.Components.KustomizeBuildList) +
len(bp.Spec.Components.Resources)
return count
}

13
api/v1alpha1/form.go Normal file
View File

@@ -0,0 +1,13 @@
package v1alpha1
import object "github.com/holos-run/holos/service/gen/holos/object/v1alpha1"
// Form represents a collection of Formly json powered form.
type Form struct {
TypeMeta `json:",inline" yaml:",inline"`
Spec FormSpec `json:"spec" yaml:"spec"`
}
type FormSpec struct {
Form object.Form `json:"form" yaml:"form"`
}

View File

@@ -8,9 +8,9 @@ import (
"strings"
"github.com/holos-run/holos"
"github.com/holos-run/holos/pkg/logger"
"github.com/holos-run/holos/pkg/util"
"github.com/holos-run/holos/pkg/wrapper"
"github.com/holos-run/holos/internal/errors"
"github.com/holos-run/holos/internal/logger"
"github.com/holos-run/holos/internal/util"
)
// A HelmChart represents a helm command to provide chart values in order to render kubernetes api objects.
@@ -35,21 +35,21 @@ type Repository struct {
URL string `json:"url"`
}
func (hc *HelmChart) Render(ctx context.Context, path holos.PathComponent) (*Result, error) {
func (hc *HelmChart) Render(ctx context.Context, path holos.InstancePath) (*Result, error) {
result := Result{HolosComponent: hc.HolosComponent}
if err := hc.helm(ctx, &result, path); err != nil {
return nil, err
}
result.addObjectMap(ctx, hc.APIObjectMap)
if err := result.kustomize(ctx); err != nil {
return nil, wrapper.Wrap(fmt.Errorf("could not kustomize: %w", err))
return nil, errors.Wrap(fmt.Errorf("could not kustomize: %w", err))
}
return &result, nil
}
// runHelm provides the values produced by CUE to helm template and returns
// the rendered kubernetes api objects in the result.
func (hc *HelmChart) helm(ctx context.Context, r *Result, path holos.PathComponent) error {
func (hc *HelmChart) helm(ctx context.Context, r *Result, path holos.InstancePath) error {
log := logger.FromContext(ctx).With("chart", hc.Chart.Name)
if hc.Chart.Name == "" {
log.WarnContext(ctx, "skipping helm: no chart name specified, use a different component type")
@@ -64,13 +64,13 @@ func (hc *HelmChart) helm(ctx context.Context, r *Result, path holos.PathCompone
out, err := util.RunCmd(ctx, "helm", "repo", "add", repo.Name, repo.URL)
if err != nil {
log.ErrorContext(ctx, "could not run helm", "stderr", out.Stderr.String(), "stdout", out.Stdout.String())
return wrapper.Wrap(fmt.Errorf("could not run helm repo add: %w", err))
return errors.Wrap(fmt.Errorf("could not run helm repo add: %w", err))
}
// Update repository
out, err = util.RunCmd(ctx, "helm", "repo", "update", repo.Name)
if err != nil {
log.ErrorContext(ctx, "could not run helm", "stderr", out.Stderr.String(), "stdout", out.Stdout.String())
return wrapper.Wrap(fmt.Errorf("could not run helm repo update: %w", err))
return errors.Wrap(fmt.Errorf("could not run helm repo update: %w", err))
}
} else {
log.DebugContext(ctx, "no chart repository url proceeding assuming oci chart")
@@ -85,13 +85,13 @@ func (hc *HelmChart) helm(ctx context.Context, r *Result, path holos.PathCompone
// Write values file
tempDir, err := os.MkdirTemp("", "holos")
if err != nil {
return wrapper.Wrap(fmt.Errorf("could not make temp dir: %w", err))
return errors.Wrap(fmt.Errorf("could not make temp dir: %w", err))
}
defer util.Remove(ctx, tempDir)
valuesPath := filepath.Join(tempDir, "values.yaml")
if err := os.WriteFile(valuesPath, []byte(hc.ValuesContent), 0644); err != nil {
return wrapper.Wrap(fmt.Errorf("could not write values: %w", err))
return errors.Wrap(fmt.Errorf("could not write values: %w", err))
}
log.DebugContext(ctx, "helm: wrote values", "path", valuesPath, "bytes", len(hc.ValuesContent))
@@ -112,7 +112,7 @@ func (hc *HelmChart) helm(ctx context.Context, r *Result, path holos.PathCompone
err = fmt.Errorf("%s: %w", line, err)
}
}
return wrapper.Wrap(fmt.Errorf("could not run helm template: %w", err))
return errors.Wrap(fmt.Errorf("could not run helm template: %w", err))
}
r.accumulatedOutput = helmOut.Stdout.String()
@@ -121,12 +121,12 @@ func (hc *HelmChart) helm(ctx context.Context, r *Result, path holos.PathCompone
}
// cacheChart stores a cached copy of Chart in the chart subdirectory of path.
func cacheChart(ctx context.Context, path holos.PathComponent, chartDir string, chart Chart) error {
func cacheChart(ctx context.Context, path holos.InstancePath, chartDir string, chart Chart) error {
log := logger.FromContext(ctx)
cacheTemp, err := os.MkdirTemp(string(path), chartDir)
if err != nil {
return wrapper.Wrap(fmt.Errorf("could not make temp dir: %w", err))
return errors.Wrap(fmt.Errorf("could not make temp dir: %w", err))
}
defer util.Remove(ctx, cacheTemp)
@@ -136,14 +136,30 @@ func cacheChart(ctx context.Context, path holos.PathComponent, chartDir string,
}
helmOut, err := util.RunCmd(ctx, "helm", "pull", "--destination", cacheTemp, "--untar=true", "--version", chart.Version, chartName)
if err != nil {
return wrapper.Wrap(fmt.Errorf("could not run helm pull: %w", err))
return errors.Wrap(fmt.Errorf("could not run helm pull: %w", err))
}
log.Debug("helm pull", "stdout", helmOut.Stdout, "stderr", helmOut.Stderr)
cachePath := filepath.Join(string(path), chartDir)
if err := os.Rename(cacheTemp, cachePath); err != nil {
return wrapper.Wrap(fmt.Errorf("could not rename: %w", err))
if err := os.MkdirAll(cachePath, 0777); err != nil {
return errors.Wrap(fmt.Errorf("could not mkdir: %w", err))
}
items, err := os.ReadDir(cacheTemp)
if err != nil {
return errors.Wrap(fmt.Errorf("could not read directory: %w", err))
}
for _, item := range items {
src := filepath.Join(cacheTemp, item.Name())
dst := filepath.Join(cachePath, item.Name())
log.DebugContext(ctx, "rename", "src", src, "dst", dst)
if err := os.Rename(src, dst); err != nil {
return errors.Wrap(fmt.Errorf("could not rename: %w", err))
}
}
log.InfoContext(ctx, "cached", "chart", chart.Name, "version", chart.Version, "path", cachePath)
return nil

View File

@@ -2,6 +2,7 @@ package v1alpha1
import (
"context"
"github.com/holos-run/holos"
)
@@ -13,7 +14,7 @@ type KubernetesObjects struct {
}
// Render produces kubernetes api objects from the APIObjectMap
func (o *KubernetesObjects) Render(ctx context.Context, path holos.PathComponent) (*Result, error) {
func (o *KubernetesObjects) Render(ctx context.Context, path holos.InstancePath) (*Result, error) {
result := Result{HolosComponent: o.HolosComponent}
result.addObjectMap(ctx, o.APIObjectMap)
return &result, nil

View File

@@ -2,10 +2,11 @@ package v1alpha1
import (
"context"
"github.com/holos-run/holos"
"github.com/holos-run/holos/pkg/logger"
"github.com/holos-run/holos/pkg/util"
"github.com/holos-run/holos/pkg/wrapper"
"github.com/holos-run/holos/internal/errors"
"github.com/holos-run/holos/internal/logger"
"github.com/holos-run/holos/internal/util"
)
const KustomizeBuildKind = "KustomizeBuild"
@@ -29,14 +30,14 @@ type KustomizeBuild struct {
// Render produces a Result by executing kubectl kustomize on the holos
// component path. Useful for processing raw yaml files.
func (kb *KustomizeBuild) Render(ctx context.Context, path holos.PathComponent) (*Result, error) {
func (kb *KustomizeBuild) Render(ctx context.Context, path holos.InstancePath) (*Result, error) {
log := logger.FromContext(ctx)
result := Result{HolosComponent: kb.HolosComponent}
// Run kustomize.
kOut, err := util.RunCmd(ctx, "kubectl", "kustomize", string(path))
if err != nil {
log.ErrorContext(ctx, kOut.Stderr.String())
return nil, wrapper.Wrap(err)
return nil, errors.Wrap(err)
}
// Replace the accumulated output
result.accumulatedOutput = kOut.Stdout.String()

View File

@@ -1,8 +1,14 @@
package v1alpha1
// Label is an arbitrary unique identifier. Defined as a type for clarity and type checking.
type Label string
// Kind is a kubernetes api object kind. Defined as a type for clarity and type checking.
type Kind string
// APIObjectMap is the shape of marshalled api objects returned from cue to the
// holos cli. A map is used to improve the clarity of error messages from cue.
type APIObjectMap map[string]map[string]string
type APIObjectMap map[Kind]map[Label]string
// FileContentMap is a map of file names to file contents.
type FileContentMap map[string]string

32
api/v1alpha1/platform.go Normal file
View File

@@ -0,0 +1,32 @@
package v1alpha1
import "google.golang.org/protobuf/types/known/structpb"
// Platform represents a platform to manage. A Platform resource informs holos
// which components to build. The platform resource also acts as a container
// for the platform model form values provided by the PlatformService. The
// primary use case is to collect the cluster names, cluster types, platform
// model, and holos components to build into one resource.
type Platform struct {
TypeMeta `json:",inline" yaml:",inline"`
Metadata ObjectMeta `json:"metadata" yaml:"metadata"`
Spec PlatformSpec `json:"spec" yaml:"spec"`
}
// PlatformSpec represents the platform build plan specification.
type PlatformSpec struct {
// Model represents the platform model holos gets from from the
// holos.platform.v1alpha1.PlatformService.GetPlatform method and provides to
// CUE using a tag.
Model structpb.Struct `json:"model" yaml:"model"`
Components []PlatformSpecComponent `json:"components" yaml:"components"`
}
// PlatformSpecComponent represents a component to build or render with flags to
// pass, for example the cluster name.
type PlatformSpecComponent struct {
// Path is the path of the component relative to the platform root.
Path string `json:"path" yaml:"path"`
// Cluster is the cluster name to use when building the component.
Cluster string `json:"cluster" yaml:"cluster"`
}

View File

@@ -2,12 +2,13 @@ package v1alpha1
import (
"context"
"github.com/holos-run/holos"
)
type Renderer interface {
GetKind() string
Render(ctx context.Context, path holos.PathComponent) (*Result, error)
Render(ctx context.Context, path holos.InstancePath) (*Result, error)
}
// Render produces a Result representing the kubernetes api objects to
@@ -16,6 +17,6 @@ type Renderer interface {
// conceptualized as a data pipeline, for example a component may render a
// result by first calling helm template, then passing the result through
// kustomize, then mixing in overlay api objects.
func Render(ctx context.Context, r Renderer, path holos.PathComponent) (*Result, error) {
func Render(ctx context.Context, r Renderer, path holos.InstancePath) (*Result, error) {
return r.Render(ctx, path)
}

View File

@@ -3,12 +3,13 @@ package v1alpha1
import (
"context"
"fmt"
"github.com/holos-run/holos/pkg/logger"
"github.com/holos-run/holos/pkg/util"
"github.com/holos-run/holos/pkg/wrapper"
"os"
"path/filepath"
"slices"
"github.com/holos-run/holos/internal/errors"
"github.com/holos-run/holos/internal/logger"
"github.com/holos-run/holos/internal/util"
)
// Result is the build result for display or writing. Holos components Render the Result as a data pipeline.
@@ -18,6 +19,14 @@ type Result struct {
accumulatedOutput string
}
// Continue returns true if Skip is true indicating the result is to be skipped over.
func (r *Result) Continue() bool {
if r == nil {
return false
}
return r.Skip
}
func (r *Result) Name() string {
return r.Metadata.Name
}
@@ -31,6 +40,11 @@ func (r *Result) KustomizationFilename(writeTo string, cluster string) string {
return filepath.Join(writeTo, "clusters", cluster, "holos", "components", r.Metadata.Name+"-kustomization.gen.yaml")
}
// KustomizationContent returns the kustomization file contents to write.
func (r *Result) KustomizationContent() string {
return r.KsContent
}
// AccumulatedOutput returns the accumulated rendered output.
func (r *Result) AccumulatedOutput() string {
return r.accumulatedOutput
@@ -40,7 +54,7 @@ func (r *Result) AccumulatedOutput() string {
func (r *Result) addObjectMap(ctx context.Context, objectMap APIObjectMap) {
log := logger.FromContext(ctx)
b := []byte(r.AccumulatedOutput())
kinds := make([]string, 0, len(objectMap))
kinds := make([]Kind, 0, len(objectMap))
// Sort the keys
for kind := range objectMap {
kinds = append(kinds, kind)
@@ -50,7 +64,7 @@ func (r *Result) addObjectMap(ctx context.Context, objectMap APIObjectMap) {
for _, kind := range kinds {
v := objectMap[kind]
// Sort the keys
names := make([]string, 0, len(v))
names := make([]Label, 0, len(v))
for name := range v {
names = append(names, name)
}
@@ -81,7 +95,7 @@ func (r *Result) kustomize(ctx context.Context) error {
}
tempDir, err := os.MkdirTemp("", "holos.kustomize")
if err != nil {
return wrapper.Wrap(err)
return errors.Wrap(err)
}
defer util.Remove(ctx, tempDir)
@@ -90,7 +104,7 @@ func (r *Result) kustomize(ctx context.Context) error {
b := []byte(r.AccumulatedOutput())
b = util.EnsureNewline(b)
if err := os.WriteFile(target, b, 0644); err != nil {
return wrapper.Wrap(fmt.Errorf("could not write resources: %w", err))
return errors.Wrap(fmt.Errorf("could not write resources: %w", err))
}
log.DebugContext(ctx, "wrote: "+target, "op", "write", "path", target, "bytes", len(b))
@@ -98,12 +112,12 @@ func (r *Result) kustomize(ctx context.Context) error {
for file, content := range r.KustomizeFiles {
target := filepath.Join(tempDir, file)
if err := os.MkdirAll(filepath.Dir(target), 0755); err != nil {
return wrapper.Wrap(err)
return errors.Wrap(err)
}
b := []byte(content)
b = util.EnsureNewline(b)
if err := os.WriteFile(target, b, 0644); err != nil {
return wrapper.Wrap(fmt.Errorf("could not write: %w", err))
return errors.Wrap(fmt.Errorf("could not write: %w", err))
}
log.DebugContext(ctx, "wrote: "+target, "op", "write", "path", target, "bytes", len(b))
}
@@ -112,7 +126,7 @@ func (r *Result) kustomize(ctx context.Context) error {
kOut, err := util.RunCmd(ctx, "kubectl", "kustomize", tempDir)
if err != nil {
log.ErrorContext(ctx, kOut.Stderr.String())
return wrapper.Wrap(err)
return errors.Wrap(err)
}
// Replace the accumulated output
r.accumulatedOutput = kOut.Stdout.String()
@@ -125,12 +139,12 @@ func (r *Result) Save(ctx context.Context, path string, content string) error {
dir := filepath.Dir(path)
if err := os.MkdirAll(dir, os.FileMode(0775)); err != nil {
log.WarnContext(ctx, "could not mkdir", "path", dir, "err", err)
return wrapper.Wrap(err)
return errors.Wrap(err)
}
// Write the kube api objects
if err := os.WriteFile(path, []byte(content), os.FileMode(0644)); err != nil {
log.WarnContext(ctx, "could not write", "path", path, "err", err)
return wrapper.Wrap(err)
return errors.Wrap(err)
}
log.DebugContext(ctx, "out: wrote "+path, "action", "write", "path", path, "status", "ok")
return nil

View File

@@ -8,3 +8,13 @@ type TypeMeta struct {
func (tm *TypeMeta) GetKind() string {
return tm.Kind
}
func (tm *TypeMeta) GetAPIVersion() string {
return tm.Kind
}
// Discriminator is an interface to discriminate the kind api object.
type Discriminator interface {
GetKind() string
GetAPIVersion() string
}

20
buf.gen.yaml Normal file
View File

@@ -0,0 +1,20 @@
# Generates gRPC and ConnectRPC bindings for Go and TypeScript
#
# Note: protoc-gen-connect-query is the primary method of wiring up the React
# frontend.
version: v1
plugins:
- plugin: go
out: service/gen
opt: paths=source_relative
- plugin: connect-go
out: service/gen
opt: paths=source_relative
- plugin: es
out: internal/frontend/holos/src/app/gen
opt:
- target=ts
- plugin: connect-es
out: internal/frontend/holos/src/app/gen
opt:
- target=ts

8
buf.lock Normal file
View File

@@ -0,0 +1,8 @@
# Generated by buf. DO NOT EDIT.
version: v1
deps:
- remote: buf.build
owner: bufbuild
repository: protovalidate
commit: b983156c5e994cc9892e0ce3e64e17e0
digest: shake256:fb47a62989d38c2529bcc5cd86ded43d800eb84cee82b42b9e8a9e815d4ee8134a0fb9d0ce8299b27c2d2bbb7d6ade0c4ad5a8a4d467e1e2c7ca619ae9f634e2

3
buf.work.yaml Normal file
View File

@@ -0,0 +1,3 @@
version: v1
directories:
- service

View File

@@ -1,8 +1,9 @@
package main
import (
"github.com/holos-run/holos/pkg/cli"
"os"
"github.com/holos-run/holos/internal/cli"
)
func main() {

View File

@@ -1,10 +1,11 @@
package main
import (
"github.com/holos-run/holos/pkg/cli"
"github.com/rogpeppe/go-internal/testscript"
"os"
"testing"
"github.com/holos-run/holos/internal/cli"
"github.com/rogpeppe/go-internal/testscript"
)
func TestMain(m *testing.M) {

View File

@@ -2,6 +2,8 @@
exec holos build ./foo/... --log-level debug
stdout '^bf2bc7f9-9ba0-4f9e-9bd2-9a205627eb0b$'
-- platform.config.json --
{}
-- cue.mod --
package holos
-- foo/constraints.cue --
@@ -20,6 +22,7 @@ spec: components: KubernetesObjectsList: [
package holos
_cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
#KubernetesObjects: {
apiVersion: "holos.run/v1alpha1"

View File

@@ -3,12 +3,15 @@
stderr 'apiObjectMap.foo.bar: cannot convert incomplete value'
stderr '/component.cue:\d+:\d+$'
-- platform.config.json --
{}
-- cue.mod --
package holos
-- component.cue --
package holos
_cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
apiVersion: "holos.run/v1alpha1"
kind: "BuildPlan"

View File

@@ -3,6 +3,8 @@ exec holos build .
stdout '^kind: SecretStore$'
stdout '# Source: CUE apiObjects.SecretStore.default'
-- platform.config.json --
{}
-- cue.mod --
package holos
-- component.cue --
@@ -13,6 +15,7 @@ kind: "BuildPlan"
spec: components: KubernetesObjectsList: [{apiObjectMap: #APIObjects.apiObjectMap}]
_cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
#SecretStore: {
kind: string

View File

@@ -4,6 +4,8 @@ stdout '^kind: SecretStore$'
stdout '# Source: CUE apiObjects.SecretStore.default'
stderr 'skipping helm: no chart name specified'
-- platform.config.json --
{}
-- cue.mod --
package holos
-- component.cue --
@@ -14,6 +16,7 @@ kind: "BuildPlan"
spec: components: HelmChartList: [{apiObjectMap: #APIObjects.apiObjectMap}]
_cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
#SecretStore: {
kind: string

View File

@@ -2,6 +2,8 @@
! exec holos build .
stderr 'apiObjects.secretstore.default.foo: field not allowed'
-- platform.config.json --
{}
-- cue.mod --
package holos
-- component.cue --
@@ -10,6 +12,7 @@ package holos
apiVersion: "holos.run/v1alpha1"
kind: "KubernetesObjects"
cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
#SecretStore: {
metadata: name: string

View File

@@ -2,6 +2,8 @@
! exec holos build .
stderr 'Error: execution error at \(zitadel/templates/secret_zitadel-masterkey.yaml:2:4\): Either set .Values.zitadel.masterkey xor .Values.zitadel.masterkeySecretName'
-- platform.config.json --
{}
-- cue.mod --
package holos
-- zitadel.cue --
@@ -12,6 +14,7 @@ kind: "BuildPlan"
spec: components: HelmChartList: [_HelmChart]
_cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
_HelmChart: {
apiVersion: "holos.run/v1alpha1"

View File

@@ -1,15 +1,18 @@
# Kustomize is a supported holos component kind
exec holos render --cluster-name=mycluster . --log-level=debug
exec holos render component --cluster-name=mycluster . --log-level=debug
# Want generated output
cmp want.yaml deploy/clusters/mycluster/components/kstest/kstest.gen.yaml
-- platform.config.json --
{}
-- cue.mod --
package holos
-- component.cue --
package holos
_cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
apiVersion: "holos.run/v1alpha1"
kind: "BuildPlan"

View File

@@ -3,11 +3,14 @@
! exec holos build .
stderr 'unknown field \\"TypoKubernetesObjectsList\\"'
-- platform.config.json --
{}
-- cue.mod --
package holos
-- component.cue --
package holos
_cluster: string @tag(cluster, string)
_platform_config: string @tag(platform_config, string)
apiVersion: "holos.run/v1alpha1"
kind: "BuildPlan"

View File

@@ -1,5 +1,3 @@
exec holos --version
# want version with no v on stdout
stdout -count=1 '^\d+\.\d+\.\d+$'
# want nothing on stderr
! stderr .

View File

@@ -0,0 +1,10 @@
{
"org_id": "018f36fb-e3f7-7f7f-a1c5-c85fb735d215",
"field_mask": {
"paths": [
"id",
"name",
"displayName"
]
}
}

View File

@@ -0,0 +1,8 @@
{
"update_mask": {
"paths": ["form"]
},
"update": {
"platform_id": "018f36fb-e3ff-7f7f-a5d1-7ca2bf499e94"
}
}

View File

@@ -0,0 +1,11 @@
{
"update_mask": {
"paths": ["model","name","display_name"]
},
"update": {
"platform_id": "018f36fb-e3ff-7f7f-a5d1-7ca2bf499e94",
"name": "bareplatform",
"display_name": "Bare Platform",
"model": {}
}
}

View File

@@ -0,0 +1,6 @@
{
"update": {
"platform_id": "018f36fb-e3ff-7f7f-a5d1-7ca2bf499e94",
"model": {}
}
}

View File

@@ -0,0 +1,45 @@
package holos
import ap "security.istio.io/authorizationpolicy/v1"
// #AuthPolicyRules represents AuthorizationPolicy rules for hosts that need
// specialized treatment. Entries in this struct are excluded from
// AuthorizationPolicy/authproxy-custom in the istio-ingress namespace. Entries
// are added to their own AuthorizationPolicy.
#AuthPolicyRules: {
// AuthProxySpec represents the identity provider configuration
AuthProxySpec: #AuthProxySpec & #Platform.authproxy
// Hosts are hosts that need specialized treatment
hosts: {
[Name=_]: {
// name is the fully qualifed hostname, a Host: header value.
name: Name
// slug is the resource name prefix
slug: string
// NoAuthorizationPolicy disables an AuthorizationPolicy for the host
NoAuthorizationPolicy: true | *false
// Refer to https://istio.io/latest/docs/reference/config/security/authorization-policy/#Rule
spec: ap.#AuthorizationPolicySpec & {
action: "CUSTOM"
provider: name: AuthProxySpec.provider
selector: matchLabels: istio: "ingressgateway"
}
}
}
objects: #APIObjects & {
for Host in hosts {
if Host.NoAuthorizationPolicy == false {
apiObjects: {
AuthorizationPolicy: "\(Host.slug)-custom": {
metadata: namespace: "istio-ingress"
metadata: name: "\(Host.slug)-custom"
spec: Host.spec
}
}
}
}
}
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,189 @@
// Code generated by timoni. DO NOT EDIT.
//timoni:generate timoni vendor crd -f /home/jeff/workspace/holos-run/holos-infra/deploy/clusters/k2/components/prod-platform-argocd/prod-platform-argocd.gen.yaml
package v1alpha1
import "strings"
// AppProject provides a logical grouping of applications,
// providing controls for: * where the apps may deploy to
// (cluster whitelist) * what may be deployed (repository
// whitelist, resource whitelist/blacklist) * who can access
// these applications (roles, OIDC group claims bindings) * and
// what they can do (RBAC policies) * automation access to these
// roles (JWT tokens)
#AppProject: {
// APIVersion defines the versioned schema of this representation
// of an object. Servers should convert recognized schemas to the
// latest internal value, and may reject unrecognized values.
// More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
apiVersion: "argoproj.io/v1alpha1"
// Kind is a string value representing the REST resource this
// object represents. Servers may infer this from the endpoint
// the client submits requests to. Cannot be updated. In
// CamelCase. More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
kind: "AppProject"
metadata: {
name!: strings.MaxRunes(253) & strings.MinRunes(1) & {
string
}
namespace!: strings.MaxRunes(63) & strings.MinRunes(1) & {
string
}
labels?: {
[string]: string
}
annotations?: {
[string]: string
}
}
// AppProjectSpec is the specification of an AppProject
spec!: #AppProjectSpec
}
// AppProjectSpec is the specification of an AppProject
#AppProjectSpec: {
// ClusterResourceBlacklist contains list of blacklisted cluster
// level resources
clusterResourceBlacklist?: [...{
group: string
kind: string
}]
// ClusterResourceWhitelist contains list of whitelisted cluster
// level resources
clusterResourceWhitelist?: [...{
group: string
kind: string
}]
// Description contains optional project description
description?: string
// Destinations contains list of destinations available for
// deployment
destinations?: [...{
// Name is an alternate way of specifying the target cluster by
// its symbolic name. This must be set if Server is not set.
name?: string
// Namespace specifies the target namespace for the application's
// resources. The namespace will only be set for namespace-scoped
// resources that have not set a value for .metadata.namespace
namespace?: string
// Server specifies the URL of the target cluster's Kubernetes
// control plane API. This must be set if Name is not set.
server?: string
}]
// NamespaceResourceBlacklist contains list of blacklisted
// namespace level resources
namespaceResourceBlacklist?: [...{
group: string
kind: string
}]
// NamespaceResourceWhitelist contains list of whitelisted
// namespace level resources
namespaceResourceWhitelist?: [...{
group: string
kind: string
}]
// OrphanedResources specifies if controller should monitor
// orphaned resources of apps in this project
orphanedResources?: {
// Ignore contains a list of resources that are to be excluded
// from orphaned resources monitoring
ignore?: [...{
group?: string
kind?: string
name?: string
}]
// Warn indicates if warning condition should be created for apps
// which have orphaned resources
warn?: bool
}
// PermitOnlyProjectScopedClusters determines whether destinations
// can only reference clusters which are project-scoped
permitOnlyProjectScopedClusters?: bool
// Roles are user defined RBAC roles associated with this project
roles?: [...{
// Description is a description of the role
description?: string
// Groups are a list of OIDC group claims bound to this role
groups?: [...string]
// JWTTokens are a list of generated JWT tokens bound to this role
jwtTokens?: [...{
exp?: int
iat: int
id?: string
}]
// Name is a name for this role
name: string
// Policies Stores a list of casbin formatted strings that define
// access policies for the role in the project
policies?: [...string]
}]
// SignatureKeys contains a list of PGP key IDs that commits in
// Git must be signed with in order to be allowed for sync
signatureKeys?: [...{
// The ID of the key in hexadecimal notation
keyID: string
}]
// SourceNamespaces defines the namespaces application resources
// are allowed to be created in
sourceNamespaces?: [...string]
// SourceRepos contains list of repository URLs which can be used
// for deployment
sourceRepos?: [...string]
// SyncWindows controls when syncs can be run for apps in this
// project
syncWindows?: [...{
// Applications contains a list of applications that the window
// will apply to
applications?: [...string]
// Clusters contains a list of clusters that the window will apply
// to
clusters?: [...string]
// Duration is the amount of time the sync window will be open
duration?: string
// Kind defines if the window allows or blocks syncs
kind?: string
// ManualSync enables manual syncs when they would otherwise be
// blocked
manualSync?: bool
// Namespaces contains a list of namespaces that the window will
// apply to
namespaces?: [...string]
// Schedule is the time the window will begin, specified in cron
// format
schedule?: string
// TimeZone of the sync that will be applied to the schedule
timeZone?: string
}]
}

View File

@@ -19,7 +19,8 @@ package v1alpha1
}
#BuildPlanComponents: {
helmCharts?: [...#HelmChart] @go(HelmCharts,[]HelmChart)
kubernetesObjects?: [...#KubernetesObjects] @go(KubernetesObjects,[]KubernetesObjects)
kustomizeBuilds?: [...#KustomizeBuild] @go(KustomizeBuilds,[]KustomizeBuild)
helmChartList?: [...#HelmChart] @go(HelmChartList,[]HelmChart)
kubernetesObjectsList?: [...#KubernetesObjects] @go(KubernetesObjectsList,[]KubernetesObjects)
kustomizeBuildList?: [...#KustomizeBuild] @go(KustomizeBuildList,[]KustomizeBuild)
resources?: {[string]: #KubernetesObjects} @go(Resources,map[string]KubernetesObjects)
}

View File

@@ -11,5 +11,5 @@ package v1alpha1
// ChartDir is the directory name created in the holos component directory to cache a chart.
#ChartDir: "vendor"
// ResourcesFile is the file name used to store component output when post processing with kustomize.
// ResourcesFile is the file name used to store component output when post-processing with kustomize.
#ResourcesFile: "resources.yaml"

View File

@@ -4,6 +4,12 @@
package v1alpha1
// Label is an arbitrary unique identifier. Defined as a type for clarity and type checking.
#Label: string
// Kind is a kubernetes api object kind. Defined as a type for clarity and type checking.
#Kind: string
// APIObjectMap is the shape of marshalled api objects returned from cue to the
// holos cli. A map is used to improve the clarity of error messages from cue.
#APIObjectMap: {[string]: [string]: string}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,546 @@
// Code generated by timoni. DO NOT EDIT.
//timoni:generate timoni vendor crd -f /home/jeff/workspace/holos-run/holos-infra/deploy/clusters/k2/components/prod-platform-monitoring/prod-platform-monitoring.gen.yaml
package v1
import "strings"
// PodMonitor defines monitoring for a set of pods.
#PodMonitor: {
// APIVersion defines the versioned schema of this representation
// of an object. Servers should convert recognized schemas to the
// latest internal value, and may reject unrecognized values.
// More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
apiVersion: "monitoring.coreos.com/v1"
// Kind is a string value representing the REST resource this
// object represents. Servers may infer this from the endpoint
// the client submits requests to. Cannot be updated. In
// CamelCase. More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
kind: "PodMonitor"
metadata!: {
name!: strings.MaxRunes(253) & strings.MinRunes(1) & {
string
}
namespace!: strings.MaxRunes(63) & strings.MinRunes(1) & {
string
}
labels?: {
[string]: string
}
annotations?: {
[string]: string
}
}
// Specification of desired Pod selection for target discovery by
// Prometheus.
spec!: #PodMonitorSpec
}
// Specification of desired Pod selection for target discovery by
// Prometheus.
#PodMonitorSpec: {
attachMetadata?: {
// When set to true, Prometheus must have the `get` permission on
// the `Nodes` objects.
node?: bool
}
// The label to use to retrieve the job name from. `jobLabel`
// selects the label from the associated Kubernetes `Pod` object
// which will be used as the `job` label for all metrics.
// For example if `jobLabel` is set to `foo` and the Kubernetes
// `Pod` object is labeled with `foo: bar`, then Prometheus adds
// the `job="bar"` label to all ingested metrics.
// If the value of this field is empty, the `job` label of the
// metrics defaults to the namespace and name of the PodMonitor
// object (e.g. `<namespace>/<name>`).
jobLabel?: string
// Per-scrape limit on the number of targets dropped by relabeling
// that will be kept in memory. 0 means no limit.
// It requires Prometheus >= v2.47.0.
keepDroppedTargets?: int
// Per-scrape limit on number of labels that will be accepted for
// a sample.
// It requires Prometheus >= v2.27.0.
labelLimit?: int
// Per-scrape limit on length of labels name that will be accepted
// for a sample.
// It requires Prometheus >= v2.27.0.
labelNameLengthLimit?: int
// Per-scrape limit on length of labels value that will be
// accepted for a sample.
// It requires Prometheus >= v2.27.0.
labelValueLengthLimit?: int
// Selector to select which namespaces the Kubernetes `Pods`
// objects are discovered from.
namespaceSelector?: {
// Boolean describing whether all namespaces are selected in
// contrast to a list restricting them.
any?: bool
// List of namespace names to select from.
matchNames?: [...string]
}
// List of endpoints part of this PodMonitor.
podMetricsEndpoints?: [...{
// `authorization` configures the Authorization header credentials
// to use when scraping the target.
// Cannot be set at the same time as `basicAuth`, or `oauth2`.
authorization?: {
// Selects a key of a Secret in the namespace that contains the
// credentials for authentication.
credentials?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// Defines the authentication type. The value is case-insensitive.
// "Basic" is not a supported value.
// Default: "Bearer"
type?: string
}
// `basicAuth` configures the Basic Authentication credentials to
// use when scraping the target.
// Cannot be set at the same time as `authorization`, or `oauth2`.
basicAuth?: {
// `password` specifies a key of a Secret containing the password
// for authentication.
password?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `username` specifies a key of a Secret containing the username
// for authentication.
username?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// `bearerTokenSecret` specifies a key of a Secret containing the
// bearer token for scraping targets. The secret needs to be in
// the same namespace as the PodMonitor object and readable by
// the Prometheus Operator.
// Deprecated: use `authorization` instead.
bearerTokenSecret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `enableHttp2` can be used to disable HTTP2 when scraping the
// target.
enableHttp2?: bool
// When true, the pods which are not running (e.g. either in
// Failed or Succeeded state) are dropped during the target
// discovery.
// If unset, the filtering is enabled.
// More info:
// https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase
filterRunning?: bool
// `followRedirects` defines whether the scrape requests should
// follow HTTP 3xx redirects.
followRedirects?: bool
// When true, `honorLabels` preserves the metric's labels when
// they collide with the target's labels.
honorLabels?: bool
// `honorTimestamps` controls whether Prometheus preserves the
// timestamps when exposed by the target.
honorTimestamps?: bool
// Interval at which Prometheus scrapes the metrics from the
// target.
// If empty, Prometheus uses the global scrape interval.
interval?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// `metricRelabelings` configures the relabeling rules to apply to
// the samples before ingestion.
metricRelabelings?: [...{
// Action to perform based on the regex matching.
// `Uppercase` and `Lowercase` actions require Prometheus >=
// v2.36.0. `DropEqual` and `KeepEqual` actions require
// Prometheus >= v2.41.0.
// Default: "Replace"
action?: "replace" | "Replace" | "keep" | "Keep" | "drop" | "Drop" | "hashmod" | "HashMod" | "labelmap" | "LabelMap" | "labeldrop" | "LabelDrop" | "labelkeep" | "LabelKeep" | "lowercase" | "Lowercase" | "uppercase" | "Uppercase" | "keepequal" | "KeepEqual" | "dropequal" | "DropEqual" | *"replace"
// Modulus to take of the hash of the source label values.
// Only applicable when the action is `HashMod`.
modulus?: int
// Regular expression against which the extracted value is
// matched.
regex?: string
// Replacement value against which a Replace action is performed
// if the regular expression matches.
// Regex capture groups are available.
replacement?: string
// Separator is the string between concatenated SourceLabels.
separator?: string
// The source labels select values from existing labels. Their
// content is concatenated using the configured Separator and
// matched against the configured regular expression.
sourceLabels?: [...=~"^[a-zA-Z_][a-zA-Z0-9_]*$"]
// Label to which the resulting string is written in a
// replacement.
// It is mandatory for `Replace`, `HashMod`, `Lowercase`,
// `Uppercase`, `KeepEqual` and `DropEqual` actions.
// Regex capture groups are available.
targetLabel?: string
}]
// `oauth2` configures the OAuth2 settings to use when scraping
// the target.
// It requires Prometheus >= 2.27.0.
// Cannot be set at the same time as `authorization`, or
// `basicAuth`.
oauth2?: {
// `clientId` specifies a key of a Secret or ConfigMap containing
// the OAuth2 client's ID.
clientId: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// `clientSecret` specifies a key of a Secret containing the
// OAuth2 client's secret.
clientSecret: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `endpointParams` configures the HTTP parameters to append to
// the token URL.
endpointParams?: {
[string]: string
}
// `scopes` defines the OAuth2 scopes used for the token request.
scopes?: [...string]
// `tokenURL` configures the URL to fetch the token from.
tokenUrl: strings.MinRunes(1)
}
// `params` define optional HTTP URL parameters.
params?: {
[string]: [...string]
}
// HTTP path from which to scrape for metrics.
// If empty, Prometheus uses the default value (e.g. `/metrics`).
path?: string
// Name of the Pod port which this endpoint refers to.
// It takes precedence over `targetPort`.
port?: string
// `proxyURL` configures the HTTP Proxy URL (e.g.
// "http://proxyserver:2195") to go through when scraping the
// target.
proxyUrl?: string
// `relabelings` configures the relabeling rules to apply the
// target's metadata labels.
// The Operator automatically adds relabelings for a few standard
// Kubernetes fields.
// The original scrape job's name is available via the
// `__tmp_prometheus_job_name` label.
// More info:
// https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
relabelings?: [...{
// Action to perform based on the regex matching.
// `Uppercase` and `Lowercase` actions require Prometheus >=
// v2.36.0. `DropEqual` and `KeepEqual` actions require
// Prometheus >= v2.41.0.
// Default: "Replace"
action?: "replace" | "Replace" | "keep" | "Keep" | "drop" | "Drop" | "hashmod" | "HashMod" | "labelmap" | "LabelMap" | "labeldrop" | "LabelDrop" | "labelkeep" | "LabelKeep" | "lowercase" | "Lowercase" | "uppercase" | "Uppercase" | "keepequal" | "KeepEqual" | "dropequal" | "DropEqual" | *"replace"
// Modulus to take of the hash of the source label values.
// Only applicable when the action is `HashMod`.
modulus?: int
// Regular expression against which the extracted value is
// matched.
regex?: string
// Replacement value against which a Replace action is performed
// if the regular expression matches.
// Regex capture groups are available.
replacement?: string
// Separator is the string between concatenated SourceLabels.
separator?: string
// The source labels select values from existing labels. Their
// content is concatenated using the configured Separator and
// matched against the configured regular expression.
sourceLabels?: [...=~"^[a-zA-Z_][a-zA-Z0-9_]*$"]
// Label to which the resulting string is written in a
// replacement.
// It is mandatory for `Replace`, `HashMod`, `Lowercase`,
// `Uppercase`, `KeepEqual` and `DropEqual` actions.
// Regex capture groups are available.
targetLabel?: string
}]
// HTTP scheme to use for scraping.
// `http` and `https` are the expected values unless you rewrite
// the `__scheme__` label via relabeling.
// If empty, Prometheus uses the default value `http`.
scheme?: "http" | "https"
// Timeout after which Prometheus considers the scrape to be
// failed.
// If empty, Prometheus uses the global scrape timeout unless it
// is less than the target's scrape interval value in which the
// latter is used.
scrapeTimeout?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// Name or number of the target port of the `Pod` object behind
// the Service, the port must be specified with container port
// property.
// Deprecated: use 'port' instead.
targetPort?: (int | string) & {
string
}
// TLS configuration to use when scraping the target.
tlsConfig?: {
// Certificate authority used when verifying server certificates.
ca?: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// Client certificate to present when doing client-authentication.
cert?: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// Disable target certificate validation.
insecureSkipVerify?: bool
// Secret containing the client key file for the targets.
keySecret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// Used to verify the hostname for the targets.
serverName?: string
}
// `trackTimestampsStaleness` defines whether Prometheus tracks
// staleness of the metrics that have an explicit timestamp
// present in scraped data. Has no effect if `honorTimestamps` is
// false.
// It requires Prometheus >= v2.48.0.
trackTimestampsStaleness?: bool
}]
// `podTargetLabels` defines the labels which are transferred from
// the associated Kubernetes `Pod` object onto the ingested
// metrics.
podTargetLabels?: [...string]
// `sampleLimit` defines a per-scrape limit on the number of
// scraped samples that will be accepted.
sampleLimit?: int
// The scrape class to apply.
scrapeClass?: strings.MinRunes(1)
// `scrapeProtocols` defines the protocols to negotiate during a
// scrape. It tells clients the protocols supported by Prometheus
// in order of preference (from most to least preferred).
// If unset, Prometheus uses its default value.
// It requires Prometheus >= v2.49.0.
scrapeProtocols?: [..."PrometheusProto" | "OpenMetricsText0.0.1" | "OpenMetricsText1.0.0" | "PrometheusText0.0.4"]
// Label selector to select the Kubernetes `Pod` objects.
selector: {
// matchExpressions is a list of label selector requirements. The
// requirements are ANDed.
matchExpressions?: [...{
// key is the label key that the selector applies to.
key: string
// operator represents a key's relationship to a set of values.
// Valid operators are In, NotIn, Exists and DoesNotExist.
operator: string
// values is an array of string values. If the operator is In or
// NotIn, the values array must be non-empty. If the operator is
// Exists or DoesNotExist, the values array must be empty. This
// array is replaced during a strategic merge patch.
values?: [...string]
}]
// matchLabels is a map of {key,value} pairs. A single {key,value}
// in the matchLabels map is equivalent to an element of
// matchExpressions, whose key field is "key", the operator is
// "In", and the values array contains only "value". The
// requirements are ANDed.
matchLabels?: {
[string]: string
}
}
// `targetLimit` defines a limit on the number of scraped targets
// that will be accepted.
targetLimit?: int
}

View File

@@ -0,0 +1,536 @@
// Code generated by timoni. DO NOT EDIT.
//timoni:generate timoni vendor crd -f /home/jeff/workspace/holos-run/holos-infra/deploy/clusters/k2/components/prod-platform-monitoring/prod-platform-monitoring.gen.yaml
package v1
import "strings"
// Probe defines monitoring for a set of static targets or
// ingresses.
#Probe: {
// APIVersion defines the versioned schema of this representation
// of an object. Servers should convert recognized schemas to the
// latest internal value, and may reject unrecognized values.
// More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
apiVersion: "monitoring.coreos.com/v1"
// Kind is a string value representing the REST resource this
// object represents. Servers may infer this from the endpoint
// the client submits requests to. Cannot be updated. In
// CamelCase. More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
kind: "Probe"
metadata!: {
name!: strings.MaxRunes(253) & strings.MinRunes(1) & {
string
}
namespace!: strings.MaxRunes(63) & strings.MinRunes(1) & {
string
}
labels?: {
[string]: string
}
annotations?: {
[string]: string
}
}
// Specification of desired Ingress selection for target discovery
// by Prometheus.
spec!: #ProbeSpec
}
// Specification of desired Ingress selection for target discovery
// by Prometheus.
#ProbeSpec: {
// Authorization section for this endpoint
authorization?: {
// Selects a key of a Secret in the namespace that contains the
// credentials for authentication.
credentials?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// Defines the authentication type. The value is case-insensitive.
// "Basic" is not a supported value.
// Default: "Bearer"
type?: string
}
// BasicAuth allow an endpoint to authenticate over basic
// authentication. More info:
// https://prometheus.io/docs/operating/configuration/#endpoint
basicAuth?: {
// `password` specifies a key of a Secret containing the password
// for authentication.
password?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `username` specifies a key of a Secret containing the username
// for authentication.
username?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// Secret to mount to read bearer token for scraping targets. The
// secret needs to be in the same namespace as the probe and
// accessible by the Prometheus Operator.
bearerTokenSecret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// Interval at which targets are probed using the configured
// prober. If not specified Prometheus' global scrape interval is
// used.
interval?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// The job name assigned to scraped metrics by default.
jobName?: string
// Per-scrape limit on the number of targets dropped by relabeling
// that will be kept in memory. 0 means no limit.
// It requires Prometheus >= v2.47.0.
keepDroppedTargets?: int
// Per-scrape limit on number of labels that will be accepted for
// a sample. Only valid in Prometheus versions 2.27.0 and newer.
labelLimit?: int
// Per-scrape limit on length of labels name that will be accepted
// for a sample. Only valid in Prometheus versions 2.27.0 and
// newer.
labelNameLengthLimit?: int
// Per-scrape limit on length of labels value that will be
// accepted for a sample. Only valid in Prometheus versions
// 2.27.0 and newer.
labelValueLengthLimit?: int
// MetricRelabelConfigs to apply to samples before ingestion.
metricRelabelings?: [...{
// Action to perform based on the regex matching.
// `Uppercase` and `Lowercase` actions require Prometheus >=
// v2.36.0. `DropEqual` and `KeepEqual` actions require
// Prometheus >= v2.41.0.
// Default: "Replace"
action?: "replace" | "Replace" | "keep" | "Keep" | "drop" | "Drop" | "hashmod" | "HashMod" | "labelmap" | "LabelMap" | "labeldrop" | "LabelDrop" | "labelkeep" | "LabelKeep" | "lowercase" | "Lowercase" | "uppercase" | "Uppercase" | "keepequal" | "KeepEqual" | "dropequal" | "DropEqual" | *"replace"
// Modulus to take of the hash of the source label values.
// Only applicable when the action is `HashMod`.
modulus?: int
// Regular expression against which the extracted value is
// matched.
regex?: string
// Replacement value against which a Replace action is performed
// if the regular expression matches.
// Regex capture groups are available.
replacement?: string
// Separator is the string between concatenated SourceLabels.
separator?: string
// The source labels select values from existing labels. Their
// content is concatenated using the configured Separator and
// matched against the configured regular expression.
sourceLabels?: [...=~"^[a-zA-Z_][a-zA-Z0-9_]*$"]
// Label to which the resulting string is written in a
// replacement.
// It is mandatory for `Replace`, `HashMod`, `Lowercase`,
// `Uppercase`, `KeepEqual` and `DropEqual` actions.
// Regex capture groups are available.
targetLabel?: string
}]
// The module to use for probing specifying how to probe the
// target. Example module configuring in the blackbox exporter:
// https://github.com/prometheus/blackbox_exporter/blob/master/example.yml
module?: string
// OAuth2 for the URL. Only valid in Prometheus versions 2.27.0
// and newer.
oauth2?: {
// `clientId` specifies a key of a Secret or ConfigMap containing
// the OAuth2 client's ID.
clientId: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// `clientSecret` specifies a key of a Secret containing the
// OAuth2 client's secret.
clientSecret: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `endpointParams` configures the HTTP parameters to append to
// the token URL.
endpointParams?: {
[string]: string
}
// `scopes` defines the OAuth2 scopes used for the token request.
scopes?: [...string]
// `tokenURL` configures the URL to fetch the token from.
tokenUrl: strings.MinRunes(1)
}
// Specification for the prober to use for probing targets. The
// prober.URL parameter is required. Targets cannot be probed if
// left empty.
prober?: {
// Path to collect metrics from. Defaults to `/probe`.
path?: string | *"/probe"
// Optional ProxyURL.
proxyUrl?: string
// HTTP scheme to use for scraping. `http` and `https` are the
// expected values unless you rewrite the `__scheme__` label via
// relabeling. If empty, Prometheus uses the default value
// `http`.
scheme?: "http" | "https"
// Mandatory URL of the prober.
url: string
}
// SampleLimit defines per-scrape limit on number of scraped
// samples that will be accepted.
sampleLimit?: int
// The scrape class to apply.
scrapeClass?: strings.MinRunes(1)
// `scrapeProtocols` defines the protocols to negotiate during a
// scrape. It tells clients the protocols supported by Prometheus
// in order of preference (from most to least preferred).
// If unset, Prometheus uses its default value.
// It requires Prometheus >= v2.49.0.
scrapeProtocols?: [..."PrometheusProto" | "OpenMetricsText0.0.1" | "OpenMetricsText1.0.0" | "PrometheusText0.0.4"]
// Timeout for scraping metrics from the Prometheus exporter. If
// not specified, the Prometheus global scrape timeout is used.
scrapeTimeout?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// TargetLimit defines a limit on the number of scraped targets
// that will be accepted.
targetLimit?: int
// Targets defines a set of static or dynamically discovered
// targets to probe.
targets?: {
// ingress defines the Ingress objects to probe and the relabeling
// configuration. If `staticConfig` is also defined,
// `staticConfig` takes precedence.
ingress?: {
// From which namespaces to select Ingress objects.
namespaceSelector?: {
// Boolean describing whether all namespaces are selected in
// contrast to a list restricting them.
any?: bool
// List of namespace names to select from.
matchNames?: [...string]
}
// RelabelConfigs to apply to the label set of the target before
// it gets scraped. The original ingress address is available via
// the `__tmp_prometheus_ingress_address` label. It can be used
// to customize the probed URL. The original scrape job's name is
// available via the `__tmp_prometheus_job_name` label. More
// info:
// https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
relabelingConfigs?: [...{
// Action to perform based on the regex matching.
// `Uppercase` and `Lowercase` actions require Prometheus >=
// v2.36.0. `DropEqual` and `KeepEqual` actions require
// Prometheus >= v2.41.0.
// Default: "Replace"
action?: "replace" | "Replace" | "keep" | "Keep" | "drop" | "Drop" | "hashmod" | "HashMod" | "labelmap" | "LabelMap" | "labeldrop" | "LabelDrop" | "labelkeep" | "LabelKeep" | "lowercase" | "Lowercase" | "uppercase" | "Uppercase" | "keepequal" | "KeepEqual" | "dropequal" | "DropEqual" | *"replace"
// Modulus to take of the hash of the source label values.
// Only applicable when the action is `HashMod`.
modulus?: int
// Regular expression against which the extracted value is
// matched.
regex?: string
// Replacement value against which a Replace action is performed
// if the regular expression matches.
// Regex capture groups are available.
replacement?: string
// Separator is the string between concatenated SourceLabels.
separator?: string
// The source labels select values from existing labels. Their
// content is concatenated using the configured Separator and
// matched against the configured regular expression.
sourceLabels?: [...=~"^[a-zA-Z_][a-zA-Z0-9_]*$"]
// Label to which the resulting string is written in a
// replacement.
// It is mandatory for `Replace`, `HashMod`, `Lowercase`,
// `Uppercase`, `KeepEqual` and `DropEqual` actions.
// Regex capture groups are available.
targetLabel?: string
}]
// Selector to select the Ingress objects.
selector?: {
// matchExpressions is a list of label selector requirements. The
// requirements are ANDed.
matchExpressions?: [...{
// key is the label key that the selector applies to.
key: string
// operator represents a key's relationship to a set of values.
// Valid operators are In, NotIn, Exists and DoesNotExist.
operator: string
// values is an array of string values. If the operator is In or
// NotIn, the values array must be non-empty. If the operator is
// Exists or DoesNotExist, the values array must be empty. This
// array is replaced during a strategic merge patch.
values?: [...string]
}]
// matchLabels is a map of {key,value} pairs. A single {key,value}
// in the matchLabels map is equivalent to an element of
// matchExpressions, whose key field is "key", the operator is
// "In", and the values array contains only "value". The
// requirements are ANDed.
matchLabels?: {
[string]: string
}
}
}
// staticConfig defines the static list of targets to probe and
// the relabeling configuration. If `ingress` is also defined,
// `staticConfig` takes precedence. More info:
// https://prometheus.io/docs/prometheus/latest/configuration/configuration/#static_config.
staticConfig?: {
// Labels assigned to all metrics scraped from the targets.
labels?: {
[string]: string
}
// RelabelConfigs to apply to the label set of the targets before
// it gets scraped. More info:
// https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
relabelingConfigs?: [...{
// Action to perform based on the regex matching.
// `Uppercase` and `Lowercase` actions require Prometheus >=
// v2.36.0. `DropEqual` and `KeepEqual` actions require
// Prometheus >= v2.41.0.
// Default: "Replace"
action?: "replace" | "Replace" | "keep" | "Keep" | "drop" | "Drop" | "hashmod" | "HashMod" | "labelmap" | "LabelMap" | "labeldrop" | "LabelDrop" | "labelkeep" | "LabelKeep" | "lowercase" | "Lowercase" | "uppercase" | "Uppercase" | "keepequal" | "KeepEqual" | "dropequal" | "DropEqual" | *"replace"
// Modulus to take of the hash of the source label values.
// Only applicable when the action is `HashMod`.
modulus?: int
// Regular expression against which the extracted value is
// matched.
regex?: string
// Replacement value against which a Replace action is performed
// if the regular expression matches.
// Regex capture groups are available.
replacement?: string
// Separator is the string between concatenated SourceLabels.
separator?: string
// The source labels select values from existing labels. Their
// content is concatenated using the configured Separator and
// matched against the configured regular expression.
sourceLabels?: [...=~"^[a-zA-Z_][a-zA-Z0-9_]*$"]
// Label to which the resulting string is written in a
// replacement.
// It is mandatory for `Replace`, `HashMod`, `Lowercase`,
// `Uppercase`, `KeepEqual` and `DropEqual` actions.
// Regex capture groups are available.
targetLabel?: string
}]
// The list of hosts to probe.
static?: [...string]
}
}
// TLS configuration to use when scraping the endpoint.
tlsConfig?: {
// Certificate authority used when verifying server certificates.
ca?: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// Client certificate to present when doing client-authentication.
cert?: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// Disable target certificate validation.
insecureSkipVerify?: bool
// Secret containing the client key file for the targets.
keySecret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// Used to verify the hostname for the targets.
serverName?: string
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,100 @@
// Code generated by timoni. DO NOT EDIT.
//timoni:generate timoni vendor crd -f /home/jeff/workspace/holos-run/holos-infra/deploy/clusters/k2/components/prod-platform-monitoring/prod-platform-monitoring.gen.yaml
package v1
import "strings"
// PrometheusRule defines recording and alerting rules for a
// Prometheus instance
#PrometheusRule: {
// APIVersion defines the versioned schema of this representation
// of an object. Servers should convert recognized schemas to the
// latest internal value, and may reject unrecognized values.
// More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
apiVersion: "monitoring.coreos.com/v1"
// Kind is a string value representing the REST resource this
// object represents. Servers may infer this from the endpoint
// the client submits requests to. Cannot be updated. In
// CamelCase. More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
kind: "PrometheusRule"
metadata!: {
name!: strings.MaxRunes(253) & strings.MinRunes(1) & {
string
}
namespace!: strings.MaxRunes(63) & strings.MinRunes(1) & {
string
}
labels?: {
[string]: string
}
annotations?: {
[string]: string
}
}
// Specification of desired alerting rule definitions for
// Prometheus.
spec!: #PrometheusRuleSpec
}
#PrometheusRuleSpec: {
// Content of Prometheus rule file
groups?: [...{
// Interval determines how often rules in the group are evaluated.
interval?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// Limit the number of alerts an alerting rule and series a
// recording rule can produce. Limit is supported starting with
// Prometheus >= 2.31 and Thanos Ruler >= 0.24.
limit?: int
// Name of the rule group.
name: strings.MinRunes(1)
// PartialResponseStrategy is only used by ThanosRuler and will be
// ignored by Prometheus instances. More info:
// https://github.com/thanos-io/thanos/blob/main/docs/components/rule.md#partial-response
partial_response_strategy?: =~"^(?i)(abort|warn)?$"
// List of alerting and recording rules.
rules?: [...{
// Name of the alert. Must be a valid label value. Only one of
// `record` and `alert` must be set.
alert?: string
// Annotations to add to each alert. Only valid for alerting
// rules.
annotations?: {
[string]: string
}
// PromQL expression to evaluate.
expr: (int | string) & {
string
}
// Alerts are considered firing once they have been returned for
// this long.
for?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// KeepFiringFor defines how long an alert will continue firing
// after the condition that triggered it has cleared.
keep_firing_for?: strings.MinRunes(1) & {
=~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
}
// Labels to add or overwrite.
labels?: {
[string]: string
}
// Name of the time series to output to. Must be a valid metric
// name. Only one of `record` and `alert` must be set.
record?: string
}]
}]
}

View File

@@ -0,0 +1,566 @@
// Code generated by timoni. DO NOT EDIT.
//timoni:generate timoni vendor crd -f /home/jeff/workspace/holos-run/holos-infra/deploy/clusters/k2/components/prod-platform-monitoring/prod-platform-monitoring.gen.yaml
package v1
import "strings"
// ServiceMonitor defines monitoring for a set of services.
#ServiceMonitor: {
// APIVersion defines the versioned schema of this representation
// of an object. Servers should convert recognized schemas to the
// latest internal value, and may reject unrecognized values.
// More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
apiVersion: "monitoring.coreos.com/v1"
// Kind is a string value representing the REST resource this
// object represents. Servers may infer this from the endpoint
// the client submits requests to. Cannot be updated. In
// CamelCase. More info:
// https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
kind: "ServiceMonitor"
metadata!: {
name!: strings.MaxRunes(253) & strings.MinRunes(1) & {
string
}
namespace!: strings.MaxRunes(63) & strings.MinRunes(1) & {
string
}
labels?: {
[string]: string
}
annotations?: {
[string]: string
}
}
// Specification of desired Service selection for target discovery
// by Prometheus.
spec!: #ServiceMonitorSpec
}
// Specification of desired Service selection for target discovery
// by Prometheus.
#ServiceMonitorSpec: {
attachMetadata?: {
// When set to true, Prometheus must have the `get` permission on
// the `Nodes` objects.
node?: bool
}
// List of endpoints part of this ServiceMonitor.
endpoints?: [...{
// `authorization` configures the Authorization header credentials
// to use when scraping the target.
// Cannot be set at the same time as `basicAuth`, or `oauth2`.
authorization?: {
// Selects a key of a Secret in the namespace that contains the
// credentials for authentication.
credentials?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// Defines the authentication type. The value is case-insensitive.
// "Basic" is not a supported value.
// Default: "Bearer"
type?: string
}
// `basicAuth` configures the Basic Authentication credentials to
// use when scraping the target.
// Cannot be set at the same time as `authorization`, or `oauth2`.
basicAuth?: {
// `password` specifies a key of a Secret containing the password
// for authentication.
password?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `username` specifies a key of a Secret containing the username
// for authentication.
username?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// File to read bearer token for scraping the target.
// Deprecated: use `authorization` instead.
bearerTokenFile?: string
// `bearerTokenSecret` specifies a key of a Secret containing the
// bearer token for scraping targets. The secret needs to be in
// the same namespace as the ServiceMonitor object and readable
// by the Prometheus Operator.
// Deprecated: use `authorization` instead.
bearerTokenSecret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `enableHttp2` can be used to disable HTTP2 when scraping the
// target.
enableHttp2?: bool
// When true, the pods which are not running (e.g. either in
// Failed or Succeeded state) are dropped during the target
// discovery.
// If unset, the filtering is enabled.
// More info:
// https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase
filterRunning?: bool
// `followRedirects` defines whether the scrape requests should
// follow HTTP 3xx redirects.
followRedirects?: bool
// When true, `honorLabels` preserves the metric's labels when
// they collide with the target's labels.
honorLabels?: bool
// `honorTimestamps` controls whether Prometheus preserves the
// timestamps when exposed by the target.
honorTimestamps?: bool
// Interval at which Prometheus scrapes the metrics from the
// target.
// If empty, Prometheus uses the global scrape interval.
interval?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// `metricRelabelings` configures the relabeling rules to apply to
// the samples before ingestion.
metricRelabelings?: [...{
// Action to perform based on the regex matching.
// `Uppercase` and `Lowercase` actions require Prometheus >=
// v2.36.0. `DropEqual` and `KeepEqual` actions require
// Prometheus >= v2.41.0.
// Default: "Replace"
action?: "replace" | "Replace" | "keep" | "Keep" | "drop" | "Drop" | "hashmod" | "HashMod" | "labelmap" | "LabelMap" | "labeldrop" | "LabelDrop" | "labelkeep" | "LabelKeep" | "lowercase" | "Lowercase" | "uppercase" | "Uppercase" | "keepequal" | "KeepEqual" | "dropequal" | "DropEqual" | *"replace"
// Modulus to take of the hash of the source label values.
// Only applicable when the action is `HashMod`.
modulus?: int
// Regular expression against which the extracted value is
// matched.
regex?: string
// Replacement value against which a Replace action is performed
// if the regular expression matches.
// Regex capture groups are available.
replacement?: string
// Separator is the string between concatenated SourceLabels.
separator?: string
// The source labels select values from existing labels. Their
// content is concatenated using the configured Separator and
// matched against the configured regular expression.
sourceLabels?: [...=~"^[a-zA-Z_][a-zA-Z0-9_]*$"]
// Label to which the resulting string is written in a
// replacement.
// It is mandatory for `Replace`, `HashMod`, `Lowercase`,
// `Uppercase`, `KeepEqual` and `DropEqual` actions.
// Regex capture groups are available.
targetLabel?: string
}]
// `oauth2` configures the OAuth2 settings to use when scraping
// the target.
// It requires Prometheus >= 2.27.0.
// Cannot be set at the same time as `authorization`, or
// `basicAuth`.
oauth2?: {
// `clientId` specifies a key of a Secret or ConfigMap containing
// the OAuth2 client's ID.
clientId: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// `clientSecret` specifies a key of a Secret containing the
// OAuth2 client's secret.
clientSecret: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// `endpointParams` configures the HTTP parameters to append to
// the token URL.
endpointParams?: {
[string]: string
}
// `scopes` defines the OAuth2 scopes used for the token request.
scopes?: [...string]
// `tokenURL` configures the URL to fetch the token from.
tokenUrl: strings.MinRunes(1)
}
// params define optional HTTP URL parameters.
params?: {
[string]: [...string]
}
// HTTP path from which to scrape for metrics.
// If empty, Prometheus uses the default value (e.g. `/metrics`).
path?: string
// Name of the Service port which this endpoint refers to.
// It takes precedence over `targetPort`.
port?: string
// `proxyURL` configures the HTTP Proxy URL (e.g.
// "http://proxyserver:2195") to go through when scraping the
// target.
proxyUrl?: string
// `relabelings` configures the relabeling rules to apply the
// target's metadata labels.
// The Operator automatically adds relabelings for a few standard
// Kubernetes fields.
// The original scrape job's name is available via the
// `__tmp_prometheus_job_name` label.
// More info:
// https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config
relabelings?: [...{
// Action to perform based on the regex matching.
// `Uppercase` and `Lowercase` actions require Prometheus >=
// v2.36.0. `DropEqual` and `KeepEqual` actions require
// Prometheus >= v2.41.0.
// Default: "Replace"
action?: "replace" | "Replace" | "keep" | "Keep" | "drop" | "Drop" | "hashmod" | "HashMod" | "labelmap" | "LabelMap" | "labeldrop" | "LabelDrop" | "labelkeep" | "LabelKeep" | "lowercase" | "Lowercase" | "uppercase" | "Uppercase" | "keepequal" | "KeepEqual" | "dropequal" | "DropEqual" | *"replace"
// Modulus to take of the hash of the source label values.
// Only applicable when the action is `HashMod`.
modulus?: int
// Regular expression against which the extracted value is
// matched.
regex?: string
// Replacement value against which a Replace action is performed
// if the regular expression matches.
// Regex capture groups are available.
replacement?: string
// Separator is the string between concatenated SourceLabels.
separator?: string
// The source labels select values from existing labels. Their
// content is concatenated using the configured Separator and
// matched against the configured regular expression.
sourceLabels?: [...=~"^[a-zA-Z_][a-zA-Z0-9_]*$"]
// Label to which the resulting string is written in a
// replacement.
// It is mandatory for `Replace`, `HashMod`, `Lowercase`,
// `Uppercase`, `KeepEqual` and `DropEqual` actions.
// Regex capture groups are available.
targetLabel?: string
}]
// HTTP scheme to use for scraping.
// `http` and `https` are the expected values unless you rewrite
// the `__scheme__` label via relabeling.
// If empty, Prometheus uses the default value `http`.
scheme?: "http" | "https"
// Timeout after which Prometheus considers the scrape to be
// failed.
// If empty, Prometheus uses the global scrape timeout unless it
// is less than the target's scrape interval value in which the
// latter is used.
scrapeTimeout?: =~"^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$"
// Name or number of the target port of the `Pod` object behind
// the Service. The port must be specified with the container's
// port property.
targetPort?: (int | string) & {
string
}
// TLS configuration to use when scraping the target.
tlsConfig?: {
// Certificate authority used when verifying server certificates.
ca?: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// Path to the CA cert in the Prometheus container to use for the
// targets.
caFile?: string
// Client certificate to present when doing client-authentication.
cert?: {
// ConfigMap containing data to use for the targets.
configMap?: {
// The key to select.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the ConfigMap or its key must be defined
optional?: bool
}
// Secret containing data to use for the targets.
secret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
}
// Path to the client cert file in the Prometheus container for
// the targets.
certFile?: string
// Disable target certificate validation.
insecureSkipVerify?: bool
// Path to the client key file in the Prometheus container for the
// targets.
keyFile?: string
// Secret containing the client key file for the targets.
keySecret?: {
// The key of the secret to select from. Must be a valid secret
// key.
key: string
// Name of the referent. More info:
// https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
// TODO: Add other useful fields. apiVersion, kind, uid?
name?: string
// Specify whether the Secret or its key must be defined
optional?: bool
}
// Used to verify the hostname for the targets.
serverName?: string
}
// `trackTimestampsStaleness` defines whether Prometheus tracks
// staleness of the metrics that have an explicit timestamp
// present in scraped data. Has no effect if `honorTimestamps` is
// false.
// It requires Prometheus >= v2.48.0.
trackTimestampsStaleness?: bool
}]
// `jobLabel` selects the label from the associated Kubernetes
// `Service` object which will be used as the `job` label for all
// metrics.
// For example if `jobLabel` is set to `foo` and the Kubernetes
// `Service` object is labeled with `foo: bar`, then Prometheus
// adds the `job="bar"` label to all ingested metrics.
// If the value of this field is empty or if the label doesn't
// exist for the given Service, the `job` label of the metrics
// defaults to the name of the associated Kubernetes `Service`.
jobLabel?: string
// Per-scrape limit on the number of targets dropped by relabeling
// that will be kept in memory. 0 means no limit.
// It requires Prometheus >= v2.47.0.
keepDroppedTargets?: int
// Per-scrape limit on number of labels that will be accepted for
// a sample.
// It requires Prometheus >= v2.27.0.
labelLimit?: int
// Per-scrape limit on length of labels name that will be accepted
// for a sample.
// It requires Prometheus >= v2.27.0.
labelNameLengthLimit?: int
// Per-scrape limit on length of labels value that will be
// accepted for a sample.
// It requires Prometheus >= v2.27.0.
labelValueLengthLimit?: int
// Selector to select which namespaces the Kubernetes `Endpoints`
// objects are discovered from.
namespaceSelector?: {
// Boolean describing whether all namespaces are selected in
// contrast to a list restricting them.
any?: bool
// List of namespace names to select from.
matchNames?: [...string]
}
// `podTargetLabels` defines the labels which are transferred from
// the associated Kubernetes `Pod` object onto the ingested
// metrics.
podTargetLabels?: [...string]
// `sampleLimit` defines a per-scrape limit on the number of
// scraped samples that will be accepted.
sampleLimit?: int
// The scrape class to apply.
scrapeClass?: strings.MinRunes(1)
// `scrapeProtocols` defines the protocols to negotiate during a
// scrape. It tells clients the protocols supported by Prometheus
// in order of preference (from most to least preferred).
// If unset, Prometheus uses its default value.
// It requires Prometheus >= v2.49.0.
scrapeProtocols?: [..."PrometheusProto" | "OpenMetricsText0.0.1" | "OpenMetricsText1.0.0" | "PrometheusText0.0.4"]
// Label selector to select the Kubernetes `Endpoints` objects.
selector: {
// matchExpressions is a list of label selector requirements. The
// requirements are ANDed.
matchExpressions?: [...{
// key is the label key that the selector applies to.
key: string
// operator represents a key's relationship to a set of values.
// Valid operators are In, NotIn, Exists and DoesNotExist.
operator: string
// values is an array of string values. If the operator is In or
// NotIn, the values array must be non-empty. If the operator is
// Exists or DoesNotExist, the values array must be empty. This
// array is replaced during a strategic merge patch.
values?: [...string]
}]
// matchLabels is a map of {key,value} pairs. A single {key,value}
// in the matchLabels map is equivalent to an element of
// matchExpressions, whose key field is "key", the operator is
// "In", and the values array contains only "value". The
// requirements are ANDed.
matchLabels?: {
[string]: string
}
}
// `targetLabels` defines the labels which are transferred from
// the associated Kubernetes `Service` object onto the ingested
// metrics.
targetLabels?: [...string]
// `targetLimit` defines a limit on the number of scraped targets
// that will be accepted.
targetLimit?: int
}

File diff suppressed because it is too large Load Diff

View File

@@ -306,19 +306,10 @@ import "strings"
// "value"` for prefix-based match - `regex: "value"` for RE2
// style regex-based match
// (https://github.com/google/re2/wiki/Syntax).
uri?: ({} | {
exact: _
} | {
prefix: _
} | {
regex: _
}) & {
uri?: {
exact?: string
prefix?: string
// RE2 style regex-based match
// (https://github.com/google/re2/wiki/Syntax).
regex?: string
regex?: string
}
// withoutHeader has the same syntax with the header, but has

View File

@@ -4,3 +4,8 @@ package v1
apiVersion: "apps/v1"
kind: "Deployment"
}
#StatefulSet: {
apiVersion: "apps/v1"
kind: "StatefulSet"
}

View File

@@ -7,14 +7,23 @@ import "encoding/yaml"
// apiObjects holds each the api objects produced by cue.
apiObjects: {
[Kind=_]: {
[Name=_]: {
[string]: {
kind: Kind
...
}
}
Namespace?: [Name=_]: #Namespace & {metadata: name: Name}
ExternalSecret?: [Name=_]: #ExternalSecret & {_name: Name}
VirtualService?: [Name=_]: #VirtualService & {metadata: name: Name}
Issuer?: [Name=_]: #Issuer & {metadata: name: Name}
Gateway?: [Name=_]: #Gateway & {metadata: name: Name}
ConfigMap?: [Name=_]: #ConfigMap & {metadata: name: Name}
ServiceAccount?: [Name=_]: #ServiceAccount & {metadata: name: Name}
Deployment?: [_]: #Deployment
StatefulSet?: [_]: #StatefulSet
RequestAuthentication?: [_]: #RequestAuthentication
AuthorizationPolicy?: [_]: #AuthorizationPolicy
}
// apiObjectMap holds the marshalled representation of apiObjects

View File

@@ -0,0 +1,26 @@
package holos
// NOTE: Beyond the base reference platform, services should typically be added to #OptionalServices instead of directly to a managed namespace.
// ManagedNamespace is a namespace to manage across all clusters in the holos platform.
#ManagedNamespace: {
namespace: {
metadata: {
name: string
labels: [string]: string
}
}
// clusterNames represents the set of clusters the namespace is managed on. Usually all clusters.
clusterNames: [...string]
for cluster in clusterNames {
clusters: (cluster): name: cluster
}
}
// #ManagedNamepsaces is the union of all namespaces across all cluster types and optional services.
// Holos adopts the namespace sameness position of SIG Multicluster, refer to https://github.com/kubernetes/community/blob/dd4c8b704ef1c9c3bfd928c6fa9234276d61ad18/sig-multicluster/namespace-sameness-position-statement.md
#ManagedNamespaces: {
[Name=_]: #ManagedNamespace & {
namespace: metadata: name: Name
}
}

View File

@@ -0,0 +1,54 @@
package holos
// #MeshConfig provides the istio meshconfig in the config key given projects.
#MeshConfig: {
projects: #Projects
// clusterName is the value of the --cluster-name flag, the cluster currently being manged / rendered.
clusterName: string | *#ClusterName
// for extAuthzHttp extension providers
extensionProviderMap: [Name=_]: #ExtAuthzProxy & {name: Name}
// for other extension providers like zipkin
extensionProviderExtraMap: [Name=_]: {name: Name}
config: {
accessLogEncoding: string | *"JSON"
accessLogFile: string | *"/dev/stdout"
defaultConfig: {
discoveryAddress: string | *"istiod.istio-system.svc:15012"
tracing: zipkin: address: string | *"zipkin.istio-system:9411"
}
defaultProviders: metrics: [...string] | *["prometheus"]
enablePrometheusMerge: false | *true
rootNamespace: string | *"istio-system"
trustDomain: string | *"cluster.local"
extensionProviders: [
for x in extensionProviderMap {x},
for y in extensionProviderExtraMap {y},
]
}
}
// #ExtAuthzProxy defines the provider configuration for an istio external authorization auth proxy.
#ExtAuthzProxy: {
name: string
envoyExtAuthzHttp: {
headersToDownstreamOnDeny: [
"content-type",
"set-cookie",
]
headersToUpstreamOnAllow: [
"authorization",
"path",
"x-oidc-id-token",
]
includeAdditionalHeadersInCheck: "X-Auth-Request-Redirect": "%REQ(x-forwarded-proto)%://%REQ(:authority)%%REQ(:path)%%REQ(:query)%"
includeRequestHeadersInCheck: [
"authorization",
"cookie",
"x-forwarded-for",
]
port: 4180
service: string
}
}

View File

@@ -1,6 +1,8 @@
// Controls optional feature flags for services distributed across multiple holos components.
// For example, enable issuing certificates in the provisioner cluster when an optional service is
// enabled for a workload cluster.
// enabled for a workload cluster. Another example is NATS, which isn't necessary on all clusters,
// but is necessary on clusters with a project like holos which depends on NATS.
package holos
import "list"

View File

@@ -0,0 +1,45 @@
package holos
let Namespace = "jeff-holos"
let Broker = "choria-broker"
spec: components: KubernetesObjectsList: [
#KubernetesObjects & {
_dependsOn: "prod-platform-issuer": _
metadata: name: "\(Namespace)-\(Broker)"
apiObjectMap: OBJECTS.apiObjectMap
},
]
let SelectorLabels = {
"app.kubernetes.io/instance": Broker
"app.kubernetes.io/name": Broker
}
let OBJECTS = #APIObjects & {
apiObjects: {
Certificate: "\(Broker)-tls": #Certificate & {
metadata: {
name: "\(Broker)-tls"
namespace: Namespace
labels: SelectorLabels
}
spec: {
commonName: "\(Broker).\(Namespace).svc.cluster.local"
dnsNames: [
Broker,
"\(Broker).\(Namespace).svc",
"\(Broker).\(Namespace).svc.cluster.local",
"*.\(Broker)",
"*.\(Broker).\(Namespace).svc",
"*.\(Broker).\(Namespace).svc.cluster.local",
]
issuerRef: kind: "ClusterIssuer"
issuerRef: name: "platform-issuer"
secretName: metadata.name
usages: ["signing", "key encipherment", "server auth", "client auth"]
}
}
}
}

View File

@@ -0,0 +1,45 @@
package holos
let Namespace = "jeff-holos"
let Provisioner = "choria-provisioner"
spec: components: KubernetesObjectsList: [
#KubernetesObjects & {
_dependsOn: "prod-platform-issuer": _
metadata: name: "\(Namespace)-\(Provisioner)"
apiObjectMap: OBJECTS.apiObjectMap
},
]
let SelectorLabels = {
"app.kubernetes.io/instance": Provisioner
"app.kubernetes.io/name": Provisioner
}
let OBJECTS = #APIObjects & {
apiObjects: {
Certificate: "\(Provisioner)-tls": #Certificate & {
metadata: {
name: "\(Provisioner)-tls"
namespace: Namespace
labels: SelectorLabels
}
spec: {
commonName: "\(Provisioner).\(Namespace).svc.cluster.local"
dnsNames: [
Provisioner,
"\(Provisioner).\(Namespace).svc",
"\(Provisioner).\(Namespace).svc.cluster.local",
"*.\(Provisioner)",
"*.\(Provisioner).\(Namespace).svc",
"*.\(Provisioner).\(Namespace).svc.cluster.local",
]
issuerRef: kind: "ClusterIssuer"
issuerRef: name: "platform-issuer"
secretName: metadata.name
usages: ["signing", "key encipherment", "server auth", "client auth"]
}
}
}
}

View File

@@ -0,0 +1,177 @@
package holos
let Namespace = "jeff-holos"
let Broker = "choria-broker"
spec: components: KubernetesObjectsList: [
#KubernetesObjects & {
_dependsOn: "prod-secrets-stores": _
metadata: name: "\(Namespace)-\(Broker)"
apiObjectMap: OBJECTS.apiObjectMap
},
]
let SelectorLabels = {
"app.kubernetes.io/part-of": "choria"
"app.kubernetes.io/name": Broker
}
let Metadata = {
name: Broker
namespace: Namespace
labels: SelectorLabels
}
let OBJECTS = #APIObjects & {
apiObjects: {
ExternalSecret: "\(Broker)-tls": #ExternalSecret & {
metadata: name: "\(Broker)-tls"
metadata: namespace: Namespace
}
ExternalSecret: "\(Broker)": #ExternalSecret & {
metadata: name: Broker
metadata: namespace: Namespace
}
StatefulSet: "\(Broker)": {
metadata: Metadata
spec: {
selector: matchLabels: SelectorLabels
serviceName: Broker
template: metadata: labels: SelectorLabels
template: spec: {
containers: [
{
name: Broker
command: ["choria", "broker", "run", "--config", "/etc/choria/broker.conf"]
image: "registry.choria.io/choria/choria:0.28.0"
imagePullPolicy: "IfNotPresent"
ports: [
{
containerPort: 4222
name: "tcp-nats"
protocol: "TCP"
},
{
containerPort: 4333
name: "https-wss"
protocol: "TCP"
},
{
containerPort: 5222
name: "tcp-cluster"
protocol: "TCP"
},
{
containerPort: 8222
name: "http-stats"
protocol: "TCP"
},
]
livenessProbe: httpGet: {
path: "/healthz"
port: "http-stats"
}
readinessProbe: livenessProbe
resources: {}
securityContext: {}
volumeMounts: [
{
mountPath: "/etc/choria"
name: Broker
},
{
mountPath: "/etc/choria-tls"
name: "\(Broker)-tls"
},
]
},
]
securityContext: {}
serviceAccountName: Broker
volumes: [
{
name: Broker
secret: secretName: Broker
},
{
name: "\(Broker)-tls"
secret: secretName: "\(Broker)-tls"
},
]
}
}
}
ServiceAccount: "\(Broker)": #ServiceAccount & {
metadata: Metadata
}
Service: "\(Broker)": #Service & {
metadata: Metadata
spec: {
type: "ClusterIP"
clusterIP: "None"
selector: SelectorLabels
ports: [
{
name: "tcp-nats"
appProtocol: "tcp"
port: 4222
protocol: "TCP"
targetPort: "tcp-nats"
},
{
name: "tcp-cluster"
appProtocol: "tcp"
port: 5222
protocol: "TCP"
targetPort: "tcp-cluster"
},
{
name: "https-wss"
appProtocol: "https"
port: 443
protocol: "TCP"
targetPort: "https-wss"
},
]
}
}
DestinationRule: "\(Broker)-wss": #DestinationRule & {
_decriptions: "Configures Istio to connect to Choria using a cert issued by the Platform Issuer"
metadata: Metadata
spec: host: "\(Broker).\(Namespace).svc.cluster.local"
spec: trafficPolicy: tls: {
credentialName: "istio-ingress-mtls-cert"
mode: "MUTUAL"
// subjectAltNames is important, otherwise istio will fail to verify the
// choria broker upstream server. make sure this matches a value
// present in the choria broker's cert.
//
// kubectl get secret choria-broker-tls -o json | jq --exit-status
// '.data | map_values(@base64d)' | jq .\"tls.crt\" -r | openssl x509
// -text -noout -in -
subjectAltNames: [spec.host]
}
}
VirtualService: "\(Broker)-wss": #VirtualService & {
metadata: name: "\(Broker)-wss"
metadata: namespace: Namespace
spec: {
gateways: ["istio-ingress/default"]
hosts: ["jeff.provision.dev.\(#ClusterName).holos.run"]
http: [
{
route: [
{
destination: {
host: "\(Broker).\(Namespace).svc.cluster.local"
port: "number": 443
}
},
]
},
]
}
}
}
}

View File

@@ -0,0 +1,18 @@
FROM registry.choria.io/choria/provisioner:latest
RUN curl -Lo nsc.zip https://github.com/nats-io/nsc/releases/download/v2.8.6/nsc-linux-amd64.zip &&\
unzip nsc.zip && \
mv nsc /usr/local/bin/nsc && \
chmod 755 /usr/local/bin/nsc && \
rm -f nsc.zip
# TODO: Add jwt executable
# TODO: Add helper executable
USER choria
ENV USER=choria
ENTRYPOINT ["/usr/sbin/choria-provisioner"]
# These two files are expected to be in the provisioner secret.
CMD ["--config=/etc/provisioner/provisioner.yaml", "--choria-config=/etc/provisioner/choria.cfg"]

View File

@@ -0,0 +1,82 @@
package holos
let Namespace = "jeff-holos"
let Provisioner = "choria-provisioner"
spec: components: KubernetesObjectsList: [
#KubernetesObjects & {
_dependsOn: "prod-secrets-stores": _
metadata: name: "\(Namespace)-\(Provisioner)"
apiObjectMap: OBJECTS.apiObjectMap
},
]
let SelectorLabels = {
"app.kubernetes.io/instance": Provisioner
"app.kubernetes.io/name": Provisioner
}
let Metadata = {
name: Provisioner
namespace: Namespace
labels: SelectorLabels
}
let OBJECTS = #APIObjects & {
apiObjects: {
ExternalSecret: "\(Provisioner)-tls": #ExternalSecret & {
metadata: name: "\(Provisioner)-tls"
metadata: namespace: Namespace
}
ExternalSecret: "\(Provisioner)": #ExternalSecret & {
metadata: name: Provisioner
metadata: namespace: Namespace
}
ServiceAccount: "\(Provisioner)": #ServiceAccount & {
metadata: Metadata
}
Deployment: "\(Provisioner)": {
metadata: Metadata
spec: {
selector: matchLabels: SelectorLabels
template: metadata: labels: SelectorLabels
template: spec: {
containers: [
{
name: Provisioner
command: ["bash", "/etc/provisioner/entrypoint"]
// skopeo inspect docker://registry.choria.io/choria/provisioner | jq .RepoTags
image: "registry.choria.io/choria/provisioner:0.15.1"
imagePullPolicy: "IfNotPresent"
resources: {}
securityContext: {}
volumeMounts: [
{
mountPath: "/etc/provisioner"
name: Provisioner
},
{
mountPath: "/etc/provisioner-tls"
name: "\(Provisioner)-tls"
},
]
},
]
securityContext: {}
serviceAccountName: Provisioner
volumes: [
{
name: Provisioner
secret: secretName: name
},
{
name: "\(Provisioner)-tls"
secret: secretName: name
},
]
}
}
}
}
}

View File

@@ -0,0 +1,8 @@
# Machine Room Provisioner
This sub-tree contains Holos Components to manage a [Choria Provisioner][1]
system for the use case of provisioning `holos controller` instances. These
instances are implementations of Machine Room which are in turn implementations
of Choria Server, hence why we use Choria Provisioner.
[1]: https://choria-io.github.io/provisioner/

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# curl -LO https://github.com/nats-io/nack/releases/latest/download/crds.yml
resources:
- crds.yml

View File

@@ -0,0 +1,8 @@
package holos
// NATS NetStream Controller (NACK)
spec: components: KustomizeBuildList: [
#KustomizeBuild & {
metadata: name: "prod-nack-crds"
},
]

View File

@@ -0,0 +1,62 @@
package holos
// for Project in _Projects {
// spec: components: resources: (#ProjectTemplate & {project: Project}).workload.resources
// }
let Namespace = "jeff-holos"
#Kustomization: spec: targetNamespace: Namespace
spec: components: HelmChartList: [
#HelmChart & {
metadata: name: "jeff-holos-nats"
namespace: Namespace
_dependsOn: "prod-secrets-stores": _
chart: {
name: "nats"
version: "1.1.10"
repository: NatsRepository
}
_values: #NatsValues & {
config: {
// https://github.com/nats-io/k8s/tree/main/helm/charts/nats#operator-mode-with-nats-resolver
resolver: enabled: true
resolver: merge: {
type: "full"
interval: "2m"
timeout: "1.9s"
}
merge: {
operator: "eyJ0eXAiOiJKV1QiLCJhbGciOiJlZDI1NTE5LW5rZXkifQ.eyJqdGkiOiJUSElBTDM2NUtOS0lVVVJDMzNLNFJGQkJVRlFBSTRLS0NQTDJGVDZYVjdNQVhWU1dFNElRIiwiaWF0IjoxNzEzMjIxMzE1LCJpc3MiOiJPREtQM0RZTzc3T1NBRU5IU0FFR0s3WUNFTFBYT1FFWUI3RVFSTVBLWlBNQUxINE5BRUVLSjZDRyIsIm5hbWUiOiJIb2xvcyIsInN1YiI6Ik9ES1AzRFlPNzdPU0FFTkhTQUVHSzdZQ0VMUFhPUUVZQjdFUVJNUEtaUE1BTEg0TkFFRUtKNkNHIiwibmF0cyI6eyJ0eXBlIjoib3BlcmF0b3IiLCJ2ZXJzaW9uIjoyfX0.dQURTb-zIQMc-OYd9328oY887AEnvog6gOXY1-VCsDG3L89nq5x_ks4ME7dJ4Pn-Pvm2eyBi1Jx6ubgkthHgCQ"
system_account: "ADIQCYK4K3OKTPODGCLI4PDQ6XBO52MISBPTAIDESEJMLZCMNULDKCCY"
resolver_preload: {
// NOTEL: Make sure you do not include the trailing , in the SYS_ACCOUNT_JWT
"ADIQCYK4K3OKTPODGCLI4PDQ6XBO52MISBPTAIDESEJMLZCMNULDKCCY": "eyJ0eXAiOiJKV1QiLCJhbGciOiJlZDI1NTE5LW5rZXkifQ.eyJqdGkiOiI2SEVMNlhKSUdWUElMNFBURVI1MkUzTkFITjZLWkVUUUdFTlFVS0JWRzNUWlNLRzVLT09RIiwiaWF0IjoxNzEzMjIxMzE1LCJpc3MiOiJPREtQM0RZTzc3T1NBRU5IU0FFR0s3WUNFTFBYT1FFWUI3RVFSTVBLWlBNQUxINE5BRUVLSjZDRyIsIm5hbWUiOiJTWVMiLCJzdWIiOiJBRElRQ1lLNEszT0tUUE9ER0NMSTRQRFE2WEJPNTJNSVNCUFRBSURFU0VKTUxaQ01OVUxES0NDWSIsIm5hdHMiOnsibGltaXRzIjp7InN1YnMiOi0xLCJkYXRhIjotMSwicGF5bG9hZCI6LTEsImltcG9ydHMiOi0xLCJleHBvcnRzIjotMSwid2lsZGNhcmRzIjp0cnVlLCJjb25uIjotMSwibGVhZiI6LTF9LCJkZWZhdWx0X3Blcm1pc3Npb25zIjp7InB1YiI6e30sInN1YiI6e319LCJhdXRob3JpemF0aW9uIjp7fSwidHlwZSI6ImFjY291bnQiLCJ2ZXJzaW9uIjoyfX0.TiGIk8XON394D9SBEowGHY_nTeOyHiM-ihyw6HZs8AngOnYPFXH9OVjsaAf8Poa2k_V84VtH7yVNgNdjBgduDA"
}
}
cluster: enabled: true
jetstream: enabled: true
websocket: enabled: true
monitor: enabled: true
}
promExporter: enabled: true
promExporter: podMonitor: enabled: true
}
},
#HelmChart & {
metadata: name: "jeff-holos-nack"
namespace: Namespace
_dependsOn: "jeff-holos-nats": _
chart: {
name: "nack"
version: "0.25.2"
repository: NatsRepository
}
},
]
let NatsRepository = {
name: "nats"
url: "https://nats-io.github.io/k8s/helm/charts/"
}

View File

@@ -0,0 +1,722 @@
package holos
#NatsValues: {
//###############################################################################
// Global options
//###############################################################################
global: {
image: {
// global image pull policy to use for all container images in the chart
// can be overridden by individual image pullPolicy
pullPolicy: null
// global list of secret names to use as image pull secrets for all pod specs in the chart
// secrets must exist in the same namespace
// https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
pullSecretNames: []
// global registry to use for all container images in the chart
// can be overridden by individual image registry
registry: null
}
// global labels will be applied to all resources deployed by the chart
labels: {}
}
//###############################################################################
// Common options
//###############################################################################
// override name of the chart
nameOverride: null
// override full name of the chart+release
fullnameOverride: null
// override the namespace that resources are installed into
namespaceOverride: null
// reference a common CA Certificate or Bundle in all nats config `tls` blocks and nats-box contexts
// note: `tls.verify` still must be set in the appropriate nats config `tls` blocks to require mTLS
tlsCA: {
enabled: false
// set configMapName in order to mount an existing configMap to dir
configMapName: null
// set secretName in order to mount an existing secretName to dir
secretName: null
// directory to mount the configMap or secret to
dir: "/etc/nats-ca-cert"
// key in the configMap or secret that contains the CA Certificate or Bundle
key: "ca.crt"
}
//###############################################################################
// NATS Stateful Set and associated resources
//###############################################################################
//###########################################################
// NATS config
//###########################################################
config: {
cluster: {
enabled: true | *false
port: 6222
// must be 2 or higher when jetstream is enabled
replicas: 3
// apply to generated route URLs that connect to other pods in the StatefulSet
routeURLs: {
// if both user and password are set, they will be added to route URLs
// and the cluster authorization block
user: null
password: null
// set to true to use FQDN in route URLs
useFQDN: false
k8sClusterDomain: "cluster.local"
}
tls: {
enabled: true | *false
// set secretName in order to mount an existing secret to dir
secretName: null
dir: "/etc/nats-certs/cluster"
cert: "tls.crt"
key: "tls.key"
// merge or patch the tls config
// https://docs.nats.io/running-a-nats-service/configuration/securing_nats/tls
merge: {}
patch: []
}
// merge or patch the cluster config
// https://docs.nats.io/running-a-nats-service/configuration/clustering/cluster_config
merge: {}
patch: []
}
jetstream: {
enabled: true | *false
fileStore: {
enabled: true
dir: "/data"
//###########################################################
// stateful set -> volume claim templates -> jetstream pvc
//###########################################################
pvc: {
enabled: true
size: "10Gi"
storageClassName: null
// merge or patch the jetstream pvc
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#persistentvolumeclaim-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-js"
name: null
}
// defaults to the PVC size
maxSize: null
}
memoryStore: {
enabled: false
// ensure that container has a sufficient memory limit greater than maxSize
maxSize: "1Gi"
}
// merge or patch the jetstream config
// https://docs.nats.io/running-a-nats-service/configuration#jetstream
merge: {}
patch: []
}
nats: {
port: 4222
tls: {
enabled: false
// set secretName in order to mount an existing secret to dir
secretName: null
dir: "/etc/nats-certs/nats"
cert: "tls.crt"
key: "tls.key"
// merge or patch the tls config
// https://docs.nats.io/running-a-nats-service/configuration/securing_nats/tls
merge: {}
patch: []
}
}
leafnodes: {
enabled: false
port: 7422
tls: {
enabled: false
// set secretName in order to mount an existing secret to dir
secretName: null
dir: "/etc/nats-certs/leafnodes"
cert: "tls.crt"
key: "tls.key"
// merge or patch the tls config
// https://docs.nats.io/running-a-nats-service/configuration/securing_nats/tls
merge: {}
patch: []
}
// merge or patch the leafnodes config
// https://docs.nats.io/running-a-nats-service/configuration/leafnodes/leafnode_conf
merge: {}
patch: []
}
websocket: {
enabled: true | *false
port: 8080
tls: {
enabled: false
// set secretName in order to mount an existing secret to dir
secretName: null
dir: "/etc/nats-certs/websocket"
cert: "tls.crt"
key: "tls.key"
// merge or patch the tls config
// https://docs.nats.io/running-a-nats-service/configuration/securing_nats/tls
merge: {}
patch: []
}
//###########################################################
// ingress
//###########################################################
// service must be enabled also
ingress: {
enabled: false
// must contain at least 1 host otherwise ingress will not be created
hosts: []
path: "/"
pathType: "Exact"
// sets to the ingress class name
className: null
// set to an existing secret name to enable TLS on the ingress; applies to all hosts
tlsSecretName: null
// merge or patch the ingress
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#ingress-v1-networking-k8s-io
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-ws"
name: null
}
// merge or patch the websocket config
// https://docs.nats.io/running-a-nats-service/configuration/websocket/websocket_conf
merge: {}
patch: []
}
mqtt: {
enabled: false
port: 1883
tls: {
enabled: false
// set secretName in order to mount an existing secret to dir
secretName: null
dir: "/etc/nats-certs/mqtt"
cert: "tls.crt"
key: "tls.key"
// merge or patch the tls config
// https://docs.nats.io/running-a-nats-service/configuration/securing_nats/tls
merge: {}
patch: []
}
// merge or patch the mqtt config
// https://docs.nats.io/running-a-nats-service/configuration/mqtt/mqtt_config
merge: {}
patch: []
}
gateway: {
enabled: false
port: 7222
tls: {
enabled: false
// set secretName in order to mount an existing secret to dir
secretName: null
dir: "/etc/nats-certs/gateway"
cert: "tls.crt"
key: "tls.key"
// merge or patch the tls config
// https://docs.nats.io/running-a-nats-service/configuration/securing_nats/tls
merge: {}
patch: []
}
// merge or patch the gateway config
// https://docs.nats.io/running-a-nats-service/configuration/gateways/gateway#gateway-configuration-block
merge: {}
patch: []
}
monitor: {
enabled: true
port: 8222
tls: {
// config.nats.tls must be enabled also
// when enabled, monitoring port will use HTTPS with the options from config.nats.tls
enabled: false
}
}
profiling: {
enabled: false
port: 65432
}
resolver: {
enabled: true | *false
dir: "/data/resolver"
//###########################################################
// stateful set -> volume claim templates -> resolver pvc
//###########################################################
pvc: {
enabled: true
size: "1Gi"
storageClassName: null
// merge or patch the pvc
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#persistentvolumeclaim-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-resolver"
name: null
}
// merge or patch the resolver
// https://docs.nats.io/running-a-nats-service/configuration/securing_nats/auth_intro/jwt/resolver
merge: {
type?: string
interval?: string
timeout?: string
}
patch: []
}
// adds a prefix to the server name, which defaults to the pod name
// helpful for ensuring server name is unique in a super cluster
serverNamePrefix: ""
// merge or patch the nats config
// https://docs.nats.io/running-a-nats-service/configuration
// following special rules apply
// 1. strings that start with << and end with >> will be unquoted
// use this for variables and numbers with units
// 2. keys ending in $include will be switched to include directives
// keys are sorted alphabetically, use prefix before $includes to control includes ordering
// paths should be relative to /etc/nats-config/nats.conf
// example:
//
// merge:
// $include: ./my-config.conf
// zzz$include: ./my-config-last.conf
// server_name: nats
// authorization:
// token: << $TOKEN >>
// jetstream:
// max_memory_store: << 1GB >>
//
// will yield the config:
// {
// include ./my-config.conf;
// "authorization": {
// "token": $TOKEN
// },
// "jetstream": {
// "max_memory_store": 1GB
// },
// "server_name": "nats",
// include ./my-config-last.conf;
// }
merge: {
operator?: string
system_account?: string
resolver_preload?: [string]: string
}
patch: []
}
//###########################################################
// stateful set -> pod template -> nats container
//###########################################################
container: {
image: {
repository: "nats"
tag: "2.10.12-alpine"
pullPolicy: null
registry: null
}
// container port options
// must be enabled in the config section also
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#containerport-v1-core
ports: {
nats: {}
leafnodes: {}
websocket: {}
mqtt: {}
cluster: {}
gateway: {}
monitor: {}
profiling: {}
}
// map with key as env var name, value can be string or map
// example:
//
// env:
// GOMEMLIMIT: 7GiB
// TOKEN:
// valueFrom:
// secretKeyRef:
// name: nats-auth
// key: token
env: {}
// merge or patch the container
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#container-v1-core
merge: {}
patch: []
}
//###########################################################
// stateful set -> pod template -> reloader container
//###########################################################
reloader: {
enabled: true
image: {
repository: "natsio/nats-server-config-reloader"
tag: "0.14.1"
pullPolicy: null
registry: null
}
// env var map, see nats.env for an example
env: {}
// all nats container volume mounts with the following prefixes
// will be mounted into the reloader container
natsVolumeMountPrefixes: ["/etc/"]
// merge or patch the container
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#container-v1-core
merge: {}
patch: []
}
//###########################################################
// stateful set -> pod template -> prom-exporter container
//###########################################################
// config.monitor must be enabled
promExporter: {
enabled: true | *false
image: {
repository: "natsio/prometheus-nats-exporter"
tag: "0.14.0"
pullPolicy: null
registry: null
}
port: 7777
// env var map, see nats.env for an example
env: {}
// merge or patch the container
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#container-v1-core
merge: {}
patch: []
//###########################################################
// prometheus pod monitor
//###########################################################
podMonitor: {
enabled: true | *false
// merge or patch the pod monitor
// https://prometheus-operator.dev/docs/operator/api/#monitoring.coreos.com/v1.PodMonitor
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}"
name: null
}
}
//###########################################################
// service
//###########################################################
service: {
enabled: true
// service port options
// additional boolean field enable to control whether port is exposed in the service
// must be enabled in the config section also
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#serviceport-v1-core
ports: {
nats: enabled: true
leafnodes: enabled: true
websocket: enabled: true
mqtt: enabled: true
cluster: enabled: false
gateway: enabled: false
monitor: enabled: false
profiling: enabled: false
}
// merge or patch the service
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#service-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}"
name: null
}
//###########################################################
// other nats extension points
//###########################################################
// stateful set
statefulSet: {
// merge or patch the stateful set
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#statefulset-v1-apps
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}"
name: null
}
// stateful set -> pod template
podTemplate: {
// adds a hash of the ConfigMap as a pod annotation
// this will cause the StatefulSet to roll when the ConfigMap is updated
configChecksumAnnotation: true
// map of topologyKey: topologySpreadConstraint
// labelSelector will be added to match StatefulSet pods
//
// topologySpreadConstraints:
// kubernetes.io/hostname:
// maxSkew: 1
//
topologySpreadConstraints: {}
// merge or patch the pod template
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#pod-v1-core
merge: {}
patch: []
}
// headless service
headlessService: {
// merge or patch the headless service
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#service-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-headless"
name: null
}
// config map
configMap: {
// merge or patch the config map
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#configmap-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-config"
name: null
}
// pod disruption budget
podDisruptionBudget: {
enabled: true
// merge or patch the pod disruption budget
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#poddisruptionbudget-v1-policy
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}"
name: null
}
// service account
serviceAccount: {
enabled: false
// merge or patch the service account
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#serviceaccount-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}"
name: null
}
//###########################################################
// natsBox
//
// NATS Box Deployment and associated resources
//###########################################################
natsBox: {
enabled: true
//###########################################################
// NATS contexts
//###########################################################
contexts: {
default: {
creds: {
// set contents in order to create a secret with the creds file contents
contents: null
// set secretName in order to mount an existing secret to dir
secretName: null
// defaults to /etc/nats-creds/<context-name>
dir: null
key: "nats.creds"
}
nkey: {
// set contents in order to create a secret with the nkey file contents
contents: null
// set secretName in order to mount an existing secret to dir
secretName: null
// defaults to /etc/nats-nkeys/<context-name>
dir: null
key: "nats.nk"
}
// used to connect with client certificates
tls: {
// set secretName in order to mount an existing secret to dir
secretName: null
// defaults to /etc/nats-certs/<context-name>
dir: null
cert: "tls.crt"
key: "tls.key"
}
// merge or patch the context
// https://docs.nats.io/using-nats/nats-tools/nats_cli#nats-contexts
merge: {}
patch: []
}
}
// name of context to select by default
defaultContextName: "default"
//###########################################################
// deployment -> pod template -> nats-box container
//###########################################################
container: {
image: {
repository: "natsio/nats-box"
tag: "0.14.2"
pullPolicy: null
registry: null
}
// env var map, see nats.env for an example
env: {}
// merge or patch the container
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#container-v1-core
merge: {}
patch: []
}
//###########################################################
// other nats-box extension points
//###########################################################
// deployment
deployment: {
// merge or patch the deployment
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#deployment-v1-apps
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-box"
name: null
}
// deployment -> pod template
podTemplate: {
// merge or patch the pod template
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#pod-v1-core
merge: {}
patch: []
}
// contexts secret
contextsSecret: {
// merge or patch the context secret
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#secret-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-box-contexts"
name: null
}
// contents secret
contentsSecret: {
// merge or patch the contents secret
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#secret-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-box-contents"
name: null
}
// service account
serviceAccount: {
enabled: false
// merge or patch the service account
// https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#serviceaccount-v1-core
merge: {}
patch: []
// defaults to "{{ include "nats.fullname" $ }}-box"
name: null
}
}
//###############################################################################
// Extra user-defined resources
//###############################################################################
//
// add arbitrary user-generated resources
// example:
//
// config:
// websocket:
// enabled: true
// extraResources:
// - apiVersion: networking.istio.io/v1beta1
// kind: VirtualService
// metadata:
// name:
// $tplYaml: >
// {{ include "nats.fullname" $ | quote }}
// labels:
// $tplYaml: |
// {{ include "nats.labels" $ }}
// spec:
// hosts:
// - demo.nats.io
// gateways:
// - my-gateway
// http:
// - name: default
// match:
// - name: root
// uri:
// exact: /
// route:
// - destination:
// host:
// $tplYaml: >
// {{ .Values.service.name | quote }}
// port:
// number:
// $tplYaml: >
// {{ .Values.config.websocket.port }}
//
extraResources: []
}

View File

@@ -0,0 +1,81 @@
#! /bin/bash
#
# This script initializes authorization for a nats cluster. The process is:
#
# Locally:
# 1. Generate the nats operator jwt.
# 2. Generate a SYS account jwt issued by the operator.
# 3. Store both into vault
#
# When nats is deployed an ExternalSecret populates auth.conf which is included
# into nats.conf. This approach allows helm values to be used for most things
# except for secrets.
#
# Clean up by removing the nsc directory.
set -euo pipefail
tmpdir="$(mktemp -d)"
finish() {
[[ -d "$tmpdir" ]] && rm -rf "$tmpdir"
}
trap finish EXIT
PARENT="$(cd "$(dirname $0)" && pwd)"
: "${OPERATOR_NAME:="Holos"}"
: "${OIX_NAMESPACE:=$(kubectl config view --minify --flatten -ojsonpath='{.contexts[0].context.namespace}')}"
nsc="${HOME}/.bin/nsc"
ROOT="${PARENT}/${OIX_NAMESPACE}/nsc"
export NKEYS_PATH="${ROOT}/nkeys"
export NSC_HOME="${ROOT}/accounts"
mkdir -p "$NKEYS_PATH"
mkdir -p "$NSC_HOME"
# Install nsc if not already installed
if ! [[ -x $nsc ]]; then
platform="$(kubectl version --output=json | jq .clientVersion.platform -r)"
platform="${platform//\//-}"
curl -fSLo "${tmpdir}/nsc.zip" "https://github.com/nats-io/nsc/releases/download/v2.8.6/nsc-${platform}.zip"
(cd "${tmpdir}" && unzip nsc.zip)
sudo install -o 0 -g 0 -m 0755 "${tmpdir}/nsc" $nsc
fi
echo "export NKEYS_PATH='${NKEYS_PATH}'" > "${ROOT}/nsc.env"
echo "export NSC_HOME='${NSC_HOME}'" >> "${ROOT}/nsc.env"
# use kubectl port-forward nats-headless 4222
echo "export NATS_URL='nats://localhost:4222'" >> "${ROOT}/nsc.env"
echo "export NATS_CREDS='${ROOT}/nkeys/creds/${OPERATOR_NAME}/SYS/sys.creds'" >> "${ROOT}/nsc.env"
echo "export NATS_CA='${ROOT}/ca.crt'" >> "${ROOT}/nsc.env"
echo "export NATS_CERT='${ROOT}/tls.crt'" >> "${ROOT}/nsc.env"
echo "export NATS_KEY='${ROOT}/tls.key'" >> "${ROOT}/nsc.env"
$nsc --data-dir="${ROOT}/stores" list operators
# Create operator
$nsc add operator --name "${OPERATOR_NAME}"
# Create system account
$nsc add account --name SYS
$nsc add user --name sys
# Create account for STAN purposes.
$nsc add account --name STAN
$nsc add user --name stan
# Generate an auth config compatible with the StatefulSet mounting the
# nats-jwt-pvc PersistentVolumeClaim at path /data/accounts
$nsc generate config --sys-account SYS --nats-resolver \
| sed "s,dir.*jwt',dir: '/data/accounts'" \
> "${ROOT}/auth.conf"
# Store the auth config in vault.
# vault kv put kv/${OIX_CLUSTER_NAME}/kube-namespace/holos-dev/nats-auth-config "auth.conf=@${tmpdir}/auth.conf"
# Store the SYS creds in vault for use by the nack controller.
# vault kv put kv/${OIX_CLUSTER_NAME}/kube-namespace/holos-dev/nats-sys-creds "sys.creds=@${OIX_CLUSTER_NAME}/nsc/nkeys/creds/${OPERATOR_NAME}/SYS/sys.creds"
echo "After deploying the nats component, use the get-cert command to fetch the client cert."
echo "Use kubectl port-forward svc/nats-headless 4222" >&2
echo "source ${ROOT}/nsc.env to make it all work." >&2

View File

@@ -0,0 +1,5 @@
# Holos
This subtree contains holos components for holos itself. We strive for minimal dependencies, so this is likely going to contain NATS and/or Postgres resources.
Components depend on the holos project and may iterate over the defined environments in the project stages.

View File

@@ -1,5 +1,21 @@
package holos
#PlatformServers: {
for cluster in #Platform.clusters {
(cluster.name): {
"https-istio-ingress-httpbin": {
let cert = #PlatformCerts[cluster.name+"-httpbin"]
hosts: [for host in cert.spec.dnsNames {"istio-ingress/\(host)"}]
port: name: "https-istio-ingress-httpbin"
port: number: 443
port: protocol: "HTTPS"
tls: credentialName: cert.spec.secretName
tls: mode: "SIMPLE"
}
}
}
}
#PlatformCerts: {
// Globally scoped platform services are defined here.
login: #PlatformCert & {

View File

@@ -4,4 +4,4 @@ package holos
#InputKeys: project: "iam"
// Shared dependencies for all components in this collection.
#DependsOn: namespaces: name: "\(#StageName)-secrets-namespaces"
#DependsOn: namespaces: name: "\(#StageName)-secrets-stores"

View File

@@ -0,0 +1,10 @@
To deploy monitoring:
> **_NOTE:_** For more detailed instructions on deploying, see the [documentation on installing Monitoring](https://access.crunchydata.com/documentation/postgres-operator/latest/installation/monitoring/kustomize).
1. verify the namespace is correct in kustomization.yaml
2. If you are deploying in openshift, comment out the fsGroup line under securityContext in the following files:
- `alertmanager/deployment.yaml`
- `grafana/deployment.yaml`
- `prometheus/deployment.yaml`
3. kubectl apply -k .

View File

@@ -0,0 +1,78 @@
###
#
# Copyright © 2017-2024 Crunchy Data Solutions, Inc. All Rights Reserved.
#
###
# Based on upstream example file found here: https://github.com/prometheus/alertmanager/blob/master/doc/examples/simple.yml
global:
smtp_smarthost: 'localhost: 25'
smtp_require_tls: false
smtp_from: 'Alertmanager <abc@yahoo.com>'
# smtp_smarthost: 'smtp.example.com:587'
# smtp_from: 'Alertmanager <abc@yahoo.com>'
# smtp_auth_username: '<username>'
# smtp_auth_password: '<password>'
# templates:
# - '/etc/alertmanager/template/*.tmpl'
inhibit_rules:
# Apply inhibition of warning if the alertname for the same system and service is already critical
- source_match:
severity: 'critical'
target_match:
severity: 'warning'
equal: ['alertname', 'job', 'service']
receivers:
- name: 'default-receiver'
email_configs:
- to: 'example@crunchydata.com'
send_resolved: true
## Examples of alternative alert receivers. See documentation for more info on how to configure these fully
#- name: 'pagerduty-dba'
# pagerduty_configs:
# - service_key: <RANDOMKEYSTUFF>
#- name: 'pagerduty-sre'
# pagerduty_configs:
# - service_key: <RANDOMKEYSTUFF>
#- name: 'dba-team'
# email_configs:
# - to: 'example-dba-team@crunchydata.com'
# send_resolved: true
#- name: 'sre-team'
# email_configs:
# - to: 'example-sre-team@crunchydata.com'
# send_resolved: true
route:
receiver: default-receiver
group_by: [severity, service, job, alertname]
group_wait: 30s
group_interval: 5m
repeat_interval: 24h
## Example routes to show how to route outgoing alerts based on the content of that alert
# routes:
# - match_re:
# service: ^(postgresql|mysql|oracle)$
# receiver: dba-team
# # sub route to send critical dba alerts to pagerduty
# routes:
# - match:
# severity: critical
# receiver: pagerduty-dba
#
# - match:
# service: system
# receiver: sre-team
# # sub route to send critical sre alerts to pagerduty
# routes:
# - match:
# severity: critical
# receiver: pagerduty-sre

View File

@@ -0,0 +1,46 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: crunchy-alertmanager
spec:
selector: {}
template:
spec:
containers:
- name: alertmanager
image: prom/alertmanager:v0.24.0
args:
- --config.file=/etc/alertmanager/alertmanager.yml
- --storage.path=/alertmanager
- --log.level=info
- --cluster.advertise-address=0.0.0.0:9093
livenessProbe:
httpGet:
path: /-/healthy
port: 9093
initialDelaySeconds: 25
periodSeconds: 20
ports:
- containerPort: 9093
readinessProbe:
httpGet:
path: /-/ready
port: 9093
volumeMounts:
- mountPath: /etc/alertmanager
name: alertmanagerconf
- mountPath: /alertmanager
name: alertmanagerdata
securityContext:
fsGroup: 26
# supplementalGroups:
# - 65534
serviceAccountName: alertmanager
volumes:
- name: alertmanagerdata
persistentVolumeClaim:
claimName: alertmanagerdata
- name: alertmanagerconf
configMap:
defaultMode: 420
name: alertmanager-config

View File

@@ -0,0 +1,21 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
labels:
- includeSelectors: true
pairs:
app.kubernetes.io/component: crunchy-alertmanager
resources:
- deployment.yaml
- pvc.yaml
- service.yaml
- serviceaccount.yaml
configMapGenerator:
- name: alertmanager-config
files:
- config/alertmanager.yml
generatorOptions:
disableNameSuffixHash: true

View File

@@ -0,0 +1,10 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: alertmanagerdata
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi

View File

@@ -0,0 +1,9 @@
apiVersion: v1
kind: Service
metadata:
name: crunchy-alertmanager
spec:
type: ClusterIP
ports:
- name: alertmanager
port: 9093

View File

@@ -0,0 +1,4 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: alertmanager

View File

@@ -0,0 +1,16 @@
###
#
# Copyright © 2017-2024 Crunchy Data Solutions, Inc. All Rights Reserved.
#
###
apiVersion: 1
providers:
- name: 'crunchy_dashboards'
orgId: 1
folder: ''
type: file
disableDeletion: false
updateIntervalSeconds: 3 #how often Grafana will scan for changed dashboards
options:
path: /etc/grafana/provisioning/dashboards

View File

@@ -0,0 +1,18 @@
###
#
# Copyright © 2017-2024 Crunchy Data Solutions, Inc. All Rights Reserved.
#
###
# config file version
apiVersion: 1
datasources:
- name: PROMETHEUS
type: prometheus
access: proxy
url: http://$PROM_HOST:$PROM_PORT
isDefault: True
editable: False
orgId: 1
version: 1

View File

@@ -0,0 +1,16 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
configMapGenerator:
- name: grafana-dashboards
files:
- pgbackrest.json
- pod_details.json
- postgresql_overview.json
- postgresql_details.json
- postgresql_service_health.json
- prometheus_alerts.json
- query_statistics.json
generatorOptions:
disableNameSuffixHash: true

View File

@@ -0,0 +1,687 @@
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "PROMETHEUS",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"__requires": [
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "7.4.5"
},
{
"type": "panel",
"id": "graph",
"name": "Graph",
"version": ""
},
{
"type": "datasource",
"id": "prometheus",
"name": "Prometheus",
"version": "1.0.0"
},
{
"type": "panel",
"id": "stat",
"name": "Stat",
"version": ""
}
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": false,
"gnetId": null,
"graphTooltip": 0,
"id": null,
"iteration": 1625069660860,
"links": [
{
"asDropdown": false,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"vendor=crunchydata"
],
"title": "",
"type": "dashboards"
}
],
"panels": [
{
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "semi-dark-blue",
"value": null
}
]
},
"unit": "dtdhms"
},
"overrides": []
},
"gridPos": {
"h": 3,
"w": 24,
"x": 0,
"y": 0
},
"id": 8,
"options": {
"colorMode": "background",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"last"
],
"fields": "/^Value$/",
"values": false
},
"text": {
"valueSize": 45
},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "time()-ccp_backrest_oldest_full_backup_time_seconds{pg_cluster=\"[[cluster]]\", role=\"master\"}",
"format": "table",
"instant": true,
"interval": "",
"legendFormat": "Recovery window",
"refId": "A"
}
],
"title": "Recovery Window",
"type": "stat"
},
{
"aliasColors": {
"Differential": "dark-blue",
"Differential Backup": "dark-blue",
"Full": "dark-green",
"Full Backup": "dark-green",
"Incremental": "light-blue",
"Incremental Backup": "light-blue"
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 3
},
"hiddenSeries": false,
"id": 2,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": false
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "min(ccp_backrest_last_incr_backup_time_since_completion_seconds{pg_cluster=\"[[cluster]]\", role=\"master\"}) without(deployment,instance,ip,pod)",
"format": "time_series",
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Incremental Backup",
"refId": "A"
},
{
"expr": "min(ccp_backrest_last_diff_backup_time_since_completion_seconds{pg_cluster=\"[[cluster]]\", role=\"master\"}) without(deployment, instance,ip,pod)",
"hide": false,
"interval": "",
"legendFormat": "Differential Backup",
"refId": "B"
},
{
"expr": "min(ccp_backrest_last_full_backup_time_since_completion_seconds{pg_cluster=\"[[cluster]]\", role=\"master\"}) without(deployment, instance,ip,pod)",
"hide": false,
"interval": "",
"legendFormat": "Full Backup",
"refId": "C"
},
{
"expr": "min(ccp_archive_command_status_seconds_since_last_archive{pg_cluster=\"[[cluster]]\", role=\"master\"}) without(deployment, instance,ip,pod)",
"hide": false,
"interval": "",
"legendFormat": "WAL Archive",
"refId": "D"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Time Since",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "s",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {
"Differential": "dark-blue",
"Full": "dark-green",
"Incremental": "light-blue"
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 3
},
"hiddenSeries": false,
"id": 4,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "min(ccp_backrest_last_info_backup_runtime_seconds{pg_cluster=\"[[cluster]]\", role=\"master\", backup_type=\"incr\"}) without (deployment,instance,pod,ip)",
"format": "time_series",
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Incremental",
"refId": "A"
},
{
"expr": "min(ccp_backrest_last_info_backup_runtime_seconds{pg_cluster=\"[[cluster]]\", role=\"master\", backup_type=\"diff\"}) without (deployment,instance,pod,ip)",
"hide": false,
"interval": "",
"legendFormat": "Differential",
"refId": "B"
},
{
"expr": "min(ccp_backrest_last_info_backup_runtime_seconds{pg_cluster=\"[[cluster]]\", role=\"master\", backup_type=\"full\"}) without (deployment,instance,pod,ip)",
"hide": false,
"interval": "",
"legendFormat": "Full",
"refId": "C"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Backup Runtimes",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "s",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 2,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {
"Differential": "dark-blue",
"Full": "dark-green",
"Incremental": "light-blue"
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"description": "",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 10
},
"hiddenSeries": false,
"id": 5,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "min(ccp_backrest_last_info_repo_backup_size_bytes{pg_cluster=\"[[cluster]]\", role=\"master\", backup_type=\"incr\"}) without (deployment, instance,pod,ip)",
"format": "time_series",
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Incremental",
"refId": "A"
},
{
"expr": "min(ccp_backrest_last_info_repo_backup_size_bytes{pg_cluster=\"[[cluster]]\", role=\"master\", backup_type=\"diff\"}) without (deployment,instance,pod,ip)",
"hide": false,
"interval": "",
"legendFormat": "Differential",
"refId": "B"
},
{
"expr": "min(ccp_backrest_last_info_repo_backup_size_bytes{pg_cluster=\"[[cluster]]\", role=\"master\", backup_type=\"full\"}) without (deployment,instance,pod,ip)",
"hide": false,
"interval": "",
"legendFormat": "Full",
"refId": "C"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Backup Size",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "bytes",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 2,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {
"Archive age": "blue",
"Archive count": "green",
"Differential": "dark-blue",
"Failed count": "red",
"Full": "dark-green",
"Incremental": "light-blue"
},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"description": "",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 3,
"fillGradient": 0,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 10
},
"hiddenSeries": false,
"id": 6,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"hideEmpty": false,
"hideZero": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "avg(idelta(ccp_archive_command_status_failed_count{pg_cluster=\"[[cluster]]\", role=\"master\"}[1m])) without (instance,ip)",
"format": "time_series",
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Failed count",
"refId": "A"
},
{
"expr": "avg(idelta(ccp_archive_command_status_archived_count{pg_cluster=\"[[cluster]]\", role=\"master\"}[1m])) without (instance,pod, ip)",
"hide": false,
"interval": "",
"legendFormat": "Archive count",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "WAL Stats",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": "",
"logBase": 1,
"max": null,
"min": "0",
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": "0",
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
],
"refresh": "5m",
"schemaVersion": 27,
"style": "dark",
"tags": [
"vendor=crunchydata"
],
"templating": {
"list": [
{
"allValue": null,
"current": {},
"datasource": "PROMETHEUS",
"definition": "label_values(pg_cluster)",
"description": null,
"error": null,
"hide": 0,
"includeAll": false,
"label": "cluster",
"multi": false,
"name": "cluster",
"options": [],
"query": {
"query": "label_values(pg_cluster)",
"refId": "PROMETHEUS-cluster-Variable-Query"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 1,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-2w",
"to": "now"
},
"timepicker": {
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "pgBackRest",
"uid": "2fcFZ6PGk",
"version": 1
}

View File

@@ -0,0 +1,237 @@
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "PROMETHEUS",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"__requires": [
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "7.4.5"
},
{
"type": "datasource",
"id": "prometheus",
"name": "Prometheus",
"version": "1.0.0"
},
{
"type": "panel",
"id": "stat",
"name": "Stat",
"version": ""
}
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": false,
"gnetId": null,
"graphTooltip": 0,
"id": null,
"iteration": 1625069480601,
"links": [],
"panels": [
{
"cacheTimeout": null,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"links": [
{
"targetBlank": true,
"title": "Cluster Details",
"url": "d/fMip0cuMk/postgresqldetails?$__all_variables"
},
{
"targetBlank": true,
"title": "Backup Details",
"url": "d/2fcFZ6PGk/pgbackrest?$__all_variables"
},
{
"targetBlank": true,
"title": "Pod Details",
"url": "d/4auP6Mk7k/pod-details?$__all_variables"
},
{
"targetBlank": true,
"title": "Query Statistics",
"url": "d/ZKoTOHDGk/query-statistics?$__all_variables"
},
{
"targetBlank": true,
"title": "Service Health",
"url": "d/dhG1wgsMz/postgresql-service-health?$__all_variables"
}
],
"mappings": [
{
"from": "0",
"id": 0,
"text": "DOWN",
"to": "99",
"type": 2
},
{
"from": "100",
"id": 1,
"text": "Standalone Cluster",
"to": "199",
"type": 2
},
{
"from": "200",
"id": 2,
"text": "HA CLUSTER",
"to": "1000",
"type": 2
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "#bf1b00",
"value": null
},
{
"color": "#eab839",
"value": 10
},
{
"color": "#56A64B",
"value": 100
}
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": {
"h": 2,
"w": 12,
"x": 0,
"y": 0
},
"id": 1,
"interval": null,
"links": [],
"maxDataPoints": 100,
"maxPerRow": 2,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {
"valueSize": 30
},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"repeat": "cluster",
"repeatDirection": "h",
"targets": [
{
"$hashKey": "object:243",
"expr": "sum(pg_up{pg_cluster=~\"$cluster\"})*100+sum(ccp_is_in_recovery_status{pg_cluster=~\"$cluster\"})",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "{{cluster}}",
"metric": "up",
"refId": "A",
"step": 2
}
],
"title": "$cluster - Overview",
"type": "stat"
}
],
"refresh": "5m",
"schemaVersion": 27,
"style": "dark",
"tags": [],
"templating": {
"list": [
{
"allFormat": "glob",
"allValue": null,
"current": {},
"datasource": "PROMETHEUS",
"definition": "label_values(pg_cluster)",
"description": null,
"error": null,
"hide": 1,
"includeAll": true,
"label": "cluster",
"multi": true,
"name": "cluster",
"options": [],
"query": {
"query": "label_values(pg_cluster)",
"refId": "PROMETHEUS-cluster-Variable-Query"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-5m",
"to": "now"
},
"timepicker": {
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "PostgreSQL Overview",
"uid": "D2X39SlGk",
"version": 1
}

View File

@@ -0,0 +1,649 @@
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "PROMETHEUS",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"__requires": [
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "7.4.5"
},
{
"type": "panel",
"id": "graph",
"name": "Graph",
"version": ""
},
{
"type": "datasource",
"id": "prometheus",
"name": "Prometheus",
"version": "1.0.0"
}
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": false,
"gnetId": null,
"graphTooltip": 0,
"id": null,
"iteration": 1625069909806,
"links": [
{
"asDropdown": false,
"icon": "external link",
"includeVars": true,
"keepTime": true,
"tags": [
"vendor=crunchydata"
],
"title": "",
"type": "dashboards"
}
],
"panels": [
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 5,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 0
},
"hiddenSeries": false,
"id": 6,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(ccp_connection_stats_total{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without (pod,instance,ip) / sum(ccp_connection_stats_max_connections{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without (pod,instance,ip)",
"format": "time_series",
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Connections",
"refId": "C"
},
{
"expr": "100 - 100 * avg(ccp_nodemx_data_disk_available_bytes{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without (pod,instance,ip) / avg(ccp_nodemx_data_disk_total_bytes{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without (pod,instance,ip)",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "Mount:{{mount_point}}",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Saturation (pct used)",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"decimals": null,
"format": "percent",
"label": null,
"logBase": 1,
"max": "100",
"min": "0",
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"cacheTimeout": null,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 5,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 0
},
"hiddenSeries": false,
"id": 18,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"exemplar": false,
"expr": " sum(irate(ccp_stat_database_xact_commit{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}[1m])) \n+ sum(irate(ccp_stat_database_xact_rollback{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}[1m]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "Transactions",
"refId": "A"
},
{
"expr": "max(ccp_connection_stats_active{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without (pod,instance,ip,dbname)",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
"legendFormat": "Active connections",
"refId": "C"
},
{
"expr": "sum(irate(ccp_pg_stat_statements_total_calls_count{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}[1m]))",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Queries",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Traffic",
"tooltip": {
"shared": true,
"sort": 2,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "short",
"label": "",
"logBase": 1,
"max": null,
"min": "0.001",
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"description": "Errors",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 5,
"gridPos": {
"h": 7,
"w": 12,
"x": 0,
"y": 7
},
"hiddenSeries": false,
"id": 4,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(irate(ccp_stat_database_xact_rollback{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}[1m]) without(pod,instance,ip))",
"format": "time_series",
"hide": true,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Rollbacks",
"refId": "A"
},
{
"expr": "sum(irate(ccp_stat_database_deadlocks{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}[1m])) without(pod,instance,ip,dbname)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Deadlock ",
"refId": "D"
},
{
"expr": "sum(irate(ccp_stat_database_conflicts{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}[1m])) without(pod,instance,ip,dbname)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Conflicts",
"refId": "B"
},
{
"expr": "max(pg_exporter_last_scrape_error{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without(pod,instance,ip,dbname)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "scrape error",
"refId": "C"
},
{
"expr": "max(clamp_max(ccp_archive_command_status_seconds_since_last_fail{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"},1)) without (instance,pod,ip)",
"format": "time_series",
"hide": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "archive error",
"refId": "E"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Errors",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"decimals": null,
"format": "short",
"label": "",
"logBase": 2,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"custom": {},
"links": []
},
"overrides": []
},
"fill": 1,
"fillGradient": 1,
"gridPos": {
"h": 7,
"w": 12,
"x": 12,
"y": 7
},
"hiddenSeries": false,
"id": 10,
"legend": {
"alignAsTable": true,
"avg": false,
"current": false,
"max": false,
"min": false,
"rightSide": true,
"show": true,
"sideWidth": 150,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {
"alertThreshold": true
},
"percentage": false,
"pluginVersion": "7.4.5",
"pointradius": 5,
"points": false,
"renderer": "flot",
"seriesOverrides": [
{
"alias": "/Max:/",
"color": "#E02F44",
"nullPointMode": "null as zero"
},
{
"alias": "/Avg:/",
"color": "#8AB8FF"
}
],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "max(ccp_pg_stat_statements_total_mean_exec_time_ms{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without (pod,instance,ip)",
"format": "time_series",
"hide": false,
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Avg: {{exported_role}}({{dbname}})",
"refId": "A"
},
{
"expr": "max(ccp_pg_stat_statements_top_max_exec_time_ms{pg_cluster=\"[[cluster]]\",role=\"[[role]]\"}) without (pod,instance,ip,query,queryid)",
"format": "time_series",
"hide": false,
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Max: {{exported_role}}({{dbname}})",
"refId": "B"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "Query Duration",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"decimals": null,
"format": "ms",
"label": null,
"logBase": 2,
"max": null,
"min": "0",
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": false
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
}
],
"refresh": "5m",
"schemaVersion": 27,
"style": "dark",
"tags": [
"vendor=crunchydata"
],
"templating": {
"list": [
{
"allValue": null,
"current": {},
"datasource": "PROMETHEUS",
"definition": "label_values(pg_cluster)",
"description": null,
"error": null,
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "cluster",
"options": [],
"query": {
"query": "label_values(pg_cluster)",
"refId": "PROMETHEUS-cluster-Variable-Query"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
},
{
"allValue": null,
"current": {},
"datasource": "PROMETHEUS",
"definition": "label_values({pg_cluster=\"[[cluster]]\"},role)",
"description": null,
"error": null,
"hide": 0,
"includeAll": false,
"label": null,
"multi": false,
"name": "role",
"options": [],
"query": {
"query": "label_values({pg_cluster=\"[[cluster]]\"},role)",
"refId": "PROMETHEUS-role-Variable-Query"
},
"refresh": 1,
"regex": "",
"skipUrlSync": false,
"sort": 0,
"tagValuesQuery": "",
"tags": [],
"tagsQuery": "",
"type": "query",
"useTags": false
}
]
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "PostgreSQL Service Health",
"uid": "dhG1wgsMz",
"version": 1
}

View File

@@ -0,0 +1,961 @@
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "PROMETHEUS",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"__requires": [
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "7.4.5"
},
{
"type": "datasource",
"id": "prometheus",
"name": "Prometheus",
"version": "1.0.0"
},
{
"type": "panel",
"id": "stat",
"name": "Stat",
"version": ""
},
{
"type": "panel",
"id": "table",
"name": "Table",
"version": ""
}
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": "-- Grafana --",
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"description": "Show current firing and pending alerts, and severity alert counts.",
"editable": false,
"gnetId": 4181,
"graphTooltip": 0,
"id": null,
"links": [
{
"icon": "external link",
"tags": [
"vendor=crunchydata"
],
"type": "dashboards"
}
],
"panels": [
{
"collapsed": false,
"datasource": "PROMETHEUS",
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 0
},
"id": 10,
"panels": [],
"repeat": null,
"title": "Environment Summary",
"type": "row"
},
{
"cacheTimeout": null,
"datasource": "PROMETHEUS",
"description": "",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"mappings": [
{
"id": 0,
"op": "=",
"text": "N/A",
"type": 1,
"value": "null"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "semi-dark-blue",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 2,
"w": 4,
"x": 0,
"y": 1
},
"id": 6,
"interval": null,
"links": [],
"maxDataPoints": 100,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [],
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "count(count by (kubernetes_namespace) (pg_up))",
"format": "time_series",
"instant": true,
"interval": "",
"intervalFactor": 2,
"legendFormat": "Namespaces",
"refId": "A"
}
],
"title": "Namespaces",
"type": "stat"
},
{
"cacheTimeout": null,
"datasource": "PROMETHEUS",
"description": "",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"mappings": [
{
"id": 0,
"op": "=",
"text": "N/A",
"type": 1,
"value": "null"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "semi-dark-blue",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 2,
"w": 4,
"x": 4,
"y": 1
},
"id": 13,
"interval": null,
"links": [],
"maxDataPoints": 100,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"mean"
],
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "count(count by (pg_cluster) (pg_up))",
"format": "time_series",
"instant": true,
"interval": "",
"intervalFactor": 2,
"legendFormat": "PostgreSQL Clusters",
"refId": "A"
}
],
"title": "PG Clusters",
"type": "stat"
},
{
"cacheTimeout": null,
"datasource": "PROMETHEUS",
"description": "",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"mappings": [
{
"id": 0,
"op": "=",
"text": "N/A",
"type": 1,
"value": "null"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "semi-dark-blue",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 2,
"w": 4,
"x": 8,
"y": 1
},
"id": 14,
"interval": null,
"links": [],
"maxDataPoints": 100,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"mean"
],
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "count(pg_up)",
"format": "time_series",
"instant": true,
"interval": "",
"intervalFactor": 2,
"legendFormat": "PostgreSQL Clusters",
"refId": "A"
}
],
"title": "PG Instances",
"type": "stat"
},
{
"collapsed": false,
"datasource": "PROMETHEUS",
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 3
},
"id": 11,
"panels": [],
"repeat": null,
"title": "Alert Summary",
"type": "row"
},
{
"cacheTimeout": null,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"mappings": [
{
"id": 0,
"op": "=",
"text": "N/A",
"type": 1,
"value": "null"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "semi-dark-red",
"value": null
},
{
"color": "#F2495C",
"value": 1
},
{
"color": "#F2495C"
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 2,
"w": 4,
"x": 0,
"y": 4
},
"id": 2,
"interval": null,
"links": [],
"maxDataPoints": 100,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"mean"
],
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"targets": [
{
"bucketAggs": [
{
"id": "2",
"settings": {
"interval": "auto",
"min_doc_count": 0,
"trimEdges": 0
},
"type": "date_histogram"
}
],
"dsType": "elasticsearch",
"expr": "sum(ALERTS{alertstate=\"firing\",severity=\"critical\"} > 0) OR on() vector(0)",
"format": "time_series",
"instant": true,
"interval": "",
"intervalFactor": 1,
"legendFormat": "Critical",
"metrics": [
{
"field": "select field",
"id": "1",
"type": "count"
}
],
"refId": "A"
}
],
"title": "Critical",
"type": "stat"
},
{
"cacheTimeout": null,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"mappings": [
{
"id": 0,
"op": "=",
"text": "N/A",
"type": 1,
"value": "null"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "semi-dark-orange",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 2,
"w": 4,
"x": 4,
"y": 4
},
"id": 5,
"interval": null,
"links": [],
"maxDataPoints": 100,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [],
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "sum(ALERTS{alertstate=\"firing\",severity=\"warning\"} > 0) OR on() vector(0)",
"format": "time_series",
"instant": true,
"interval": "",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"title": "Warning",
"type": "stat"
},
{
"cacheTimeout": null,
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {},
"mappings": [
{
"id": 0,
"op": "=",
"text": "N/A",
"type": 1,
"value": "null"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "#299c46",
"value": null
}
]
},
"unit": "none"
},
"overrides": []
},
"gridPos": {
"h": 2,
"w": 4,
"x": 8,
"y": 4
},
"id": 9,
"interval": null,
"links": [],
"maxDataPoints": 100,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "horizontal",
"reduceOptions": {
"calcs": [
"mean"
],
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "sum(ALERTS{alertstate=\"firing\",severity=\"info\"} > 0) OR on() vector(0)",
"format": "time_series",
"interval": "",
"intervalFactor": 2,
"legendFormat": "",
"refId": "A"
}
],
"title": "Info",
"type": "stat"
},
{
"collapsed": false,
"datasource": "PROMETHEUS",
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 6
},
"id": 12,
"panels": [],
"repeat": null,
"title": "Alerts",
"type": "row"
},
{
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": null,
"displayMode": "auto",
"filterable": true
},
"decimals": 2,
"displayName": "",
"mappings": [
{
"from": "",
"id": 1,
"text": "",
"to": "",
"type": 1,
"value": ""
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "blue",
"value": 100
},
{
"color": "#EAB839",
"value": 200
},
{
"color": "red",
"value": 300
}
]
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "severity_num"
},
"properties": [
{
"id": "custom.displayMode",
"value": "color-background"
},
{
"id": "custom.width",
"value": 124
}
]
},
{
"matcher": {
"id": "byName",
"options": "Time"
},
"properties": [
{
"id": "custom.width",
"value": 170
}
]
},
{
"matcher": {
"id": "byName",
"options": "severity"
},
"properties": [
{
"id": "custom.width",
"value": 119
}
]
},
{
"matcher": {
"id": "byName",
"options": "alertname"
},
"properties": [
{
"id": "custom.width",
"value": 206
}
]
},
{
"matcher": {
"id": "byName",
"options": "alertstate"
},
"properties": [
{
"id": "custom.width",
"value": 128
}
]
}
]
},
"gridPos": {
"h": 5,
"w": 24,
"x": 0,
"y": 7
},
"id": 1,
"links": [],
"options": {
"showHeader": true,
"sortBy": []
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "ALERTS{alertstate='firing'} > 0",
"format": "table",
"instant": true,
"interval": "2s",
"intervalFactor": 1,
"legendFormat": "",
"refId": "A"
}
],
"title": "Firing",
"transformations": [
{
"id": "merge",
"options": {
"reducers": []
}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Value": true,
"__name__": true,
"alertstate": false,
"deployment": false,
"exp_type": true,
"fs_type": true,
"instance": true,
"job": true,
"kubernetes_namespace": true,
"mount_point": true,
"server": true,
"service": true,
"severity_num": false
},
"indexByName": {
"Time": 0,
"Value": 16,
"__name__": 3,
"alertname": 4,
"alertstate": 5,
"deployment": 7,
"exp_type": 9,
"instance": 10,
"ip": 11,
"job": 12,
"kubernetes_namespace": 13,
"pg_cluster": 6,
"pod": 8,
"role": 14,
"service": 15,
"severity": 2,
"severity_num": 1
},
"renameByName": {
"Time": "",
"__name__": "",
"severity": "",
"severity_num": ""
}
}
}
],
"type": "table"
},
{
"datasource": "PROMETHEUS",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": null,
"filterable": true
},
"decimals": 2,
"displayName": "",
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
},
"unit": "short"
},
"overrides": [
{
"matcher": {
"id": "byRegexp",
"options": "/(instance|__name__|Time|alertstate|job|type|Value)/"
},
"properties": [
{
"id": "unit",
"value": "short"
},
{
"id": "decimals",
"value": 2
},
{
"id": "custom.align",
"value": null
}
]
},
{
"matcher": {
"id": "byName",
"options": "Time"
},
"properties": [
{
"id": "custom.width",
"value": null
}
]
},
{
"matcher": {
"id": "byName",
"options": "severity_num"
},
"properties": [
{
"id": "custom.width",
"value": 126
}
]
},
{
"matcher": {
"id": "byName",
"options": "severity"
},
"properties": [
{
"id": "custom.width",
"value": 115
}
]
},
{
"matcher": {
"id": "byName",
"options": "alertname"
},
"properties": [
{
"id": "custom.width",
"value": 207
}
]
},
{
"matcher": {
"id": "byName",
"options": "alertstate"
},
"properties": [
{
"id": "custom.width",
"value": 131
}
]
}
]
},
"gridPos": {
"h": 7,
"w": 24,
"x": 0,
"y": 12
},
"id": 3,
"links": [],
"options": {
"showHeader": true,
"sortBy": []
},
"pluginVersion": "7.4.5",
"targets": [
{
"expr": "ALERTS{alertstate=\"pending\"}",
"format": "table",
"instant": false,
"interval": "",
"intervalFactor": 1,
"legendFormat": "",
"refId": "A"
}
],
"title": "Alerts (1 week)",
"transformations": [
{
"id": "organize",
"options": {
"excludeByName": {
"Value": true,
"__name__": true,
"exp_type": true,
"instance": true,
"job": true,
"kubernetes_namespace": true,
"service": true
},
"indexByName": {
"Time": 0,
"Value": 16,
"__name__": 3,
"alertname": 4,
"alertstate": 5,
"deployment": 7,
"exp_type": 8,
"instance": 9,
"ip": 11,
"job": 12,
"kubernetes_namespace": 13,
"pg_cluster": 6,
"pod": 10,
"role": 14,
"service": 15,
"severity": 2,
"severity_num": 1
},
"renameByName": {}
}
}
],
"type": "table"
}
],
"refresh": "15m",
"schemaVersion": 27,
"style": "dark",
"tags": [
"vendor=crunchydata"
],
"templating": {
"list": []
},
"time": {
"from": "now-1h",
"to": "now"
},
"timepicker": {
"time_options": [
"5m",
"15m",
"1h",
"6h",
"12h",
"24h",
"2d",
"7d",
"30d"
]
},
"timezone": "browser",
"title": "Prometheus Alerts",
"uid": "lwxXsZsMk",
"version": 1
}

View File

@@ -0,0 +1,64 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: crunchy-grafana
spec:
selector: {}
template:
spec:
containers:
- name: grafana
image: grafana/grafana:9.2.20
ports:
- containerPort: 3000
env:
- name: GF_PATHS_DATA
value: /data/grafana/data
- name: GF_SECURITY_ADMIN_USER__FILE
value: /conf/admin/username
- name: GF_SECURITY_ADMIN_PASSWORD__FILE
value: /conf/admin/password
- name: PROM_HOST
value: crunchy-prometheus
- name: PROM_PORT
value: "9090"
livenessProbe:
httpGet:
path: /api/health
port: 3000
initialDelaySeconds: 25
periodSeconds: 20
readinessProbe:
httpGet:
path: /api/health
port: 3000
volumeMounts:
- mountPath: /data
name: grafanadata
- mountPath: /conf/admin
name: grafana-admin
- mountPath: /etc/grafana/provisioning/datasources
name: grafana-datasources
- mountPath: /etc/grafana/provisioning/dashboards
name: grafana-dashboards
securityContext:
fsGroup: 26
# supplementalGroups:
# - 65534
serviceAccountName: grafana
volumes:
- name: grafanadata
persistentVolumeClaim:
claimName: grafanadata
- name: grafana-admin
secret:
defaultMode: 420
secretName: grafana-admin
- name: grafana-datasources
configMap:
defaultMode: 420
name: grafana-datasources
- name: grafana-dashboards
configMap:
defaultMode: 420
name: grafana-dashboards

View File

@@ -0,0 +1,33 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
labels:
- includeSelectors: true
pairs:
app.kubernetes.io/component: crunchy-grafana
resources:
- deployment.yaml
- pvc.yaml
- service.yaml
- serviceaccount.yaml
- dashboards
configMapGenerator:
- name: grafana-datasources
files:
- config/crunchy_grafana_datasource.yml
- name: grafana-dashboards
behavior: merge
files:
- config/crunchy_grafana_dashboards.yml
secretGenerator:
- name: grafana-admin
literals:
- password=admin
- username=admin
type: Opaque
generatorOptions:
disableNameSuffixHash: true

View File

@@ -0,0 +1,10 @@
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: grafanadata
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi

View File

@@ -0,0 +1,9 @@
apiVersion: v1
kind: Service
metadata:
name: crunchy-grafana
spec:
type: ClusterIP
ports:
- name: grafana
port: 3000

View File

@@ -0,0 +1,4 @@
apiVersion: v1
kind: ServiceAccount
metadata:
name: grafana

View File

@@ -0,0 +1,14 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: prod-iam
labels:
- includeSelectors: true
pairs:
app.kubernetes.io/name: prod-iam-obs
vendor: holos
resources:
- grafana
- prometheus
- alertmanager

View File

@@ -0,0 +1,9 @@
package holos
spec: components: KustomizeBuildList: [
#KustomizeBuild & {
_dependsOn: "prod-secrets-stores": _
metadata: name: "prod-iam-obs"
},
]

View File

@@ -0,0 +1,13 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: prometheus
rules:
- resources:
- pods
apiGroups:
- ""
verbs:
- get
- list
- watch

View File

@@ -0,0 +1,11 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: prometheus
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: prometheus
subjects:
- kind: ServiceAccount
name: prometheus

View File

@@ -0,0 +1,418 @@
###
#
# Copyright © 2017-2024 Crunchy Data Solutions, Inc. All Rights Reserved.
#
###
groups:
- name: alert-rules
rules:
########## EXPORTER RULES ##########
- alert: PGExporterScrapeError
expr: pg_exporter_last_scrape_error > 0
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
summary: 'Postgres Exporter running on {{ $labels.job }} (instance: {{ $labels.instance }}) is encountering scrape errors processing queries. Error count: ( {{ $value }} )'
########## SYSTEM RULES ##########
- alert: ExporterDown
expr: avg_over_time(up[5m]) < 0.5
for: 10s
labels:
service: system
severity: critical
severity_num: 300
annotations:
description: 'Metrics exporter service for {{ $labels.job }} running on {{ $labels.instance }} has been down at least 50% of the time for the last 5 minutes. Service may be flapping or down.'
summary: 'Prometheus Exporter Service Down'
########## POSTGRESQL RULES ##########
- alert: PGIsUp
expr: pg_up < 1
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
summary: 'postgres_exporter running on {{ $labels.job }} is unable to communicate with the configured database'
# Example to check for current version of PostgreSQL. Metric returns the version that the exporter is running on, so you can set a rule to check for the minimum version you'd like all systems to be on. Number returned is the 6 digit integer representation contained in the setting "server_version_num".
#
# - alert: PGMinimumVersion
# expr: ccp_postgresql_version_current < 110005
# for: 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# summary: '{{ $labels.job }} is not running at least version 11.5 of PostgreSQL'
# Whether a system switches from primary to replica or vice versa must be configured per named job.
# No way to tell what value a system is supposed to be without a rule expression for that specific system
# 2 to 1 means it changed from primary to replica. 1 to 2 means it changed from replica to primary
# Set this alert for each system that you want to monitor a recovery status change
# Below is an example for a target job called "Replica" and watches for the value to change above 1 which means it's no longer a replica
#
# - alert: PGRecoveryStatusSwitch_Replica
# expr: ccp_is_in_recovery_status{job="Replica"} > 1
# for: 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# summary: '{{ $labels.job }} has changed from replica to primary'
# Absence alerts must be configured per named job, otherwise there's no way to know which job is down
# Below is an example for a target job called "Prod"
# - alert: PGConnectionAbsent_Prod
# expr: absent(ccp_connection_stats_max_connections{job="Prod"})
# for: 10s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# description: 'Connection metric is absent from target (Prod). Check that postgres_exporter can connect to PostgreSQL.'
# Optional monitor for changes to pg_settings (postgresql.conf) system catalog.
# A similar metric is available for monitoring pg_hba.conf. See ccp_hba_settings_checksum().
# If metric returns 0, then NO settings have changed for either pg_settings since last known valid state
# If metric returns 1, then pg_settings have changed since last known valid state
# To see what may have changed, check the monitor.pg_settings_checksum table for a history of config state.
# - alert: PGSettingsChecksum
# expr: ccp_pg_settings_checksum > 0
# for 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# description: 'Configuration settings on {{ $labels.job }} have changed from previously known valid state. To reset current config to a valid state after alert fires, run monitor.pg_settings_checksum_set_valid().'
# summary: 'PGSQL Instance settings checksum'
# Monitor for data block checksum failures. Only works in PG12+
# - alert: PGDataChecksum
# expr: ccp_data_checksum_failure > 0
# for 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# description: '{{ $labels.job }} has at least one data checksum failure in database {{ $labels.dbname }}. See pg_stat_database system catalog for more information.'
# summary: 'PGSQL Data Checksum failure'
- alert: PGIdleTxn
expr: ccp_connection_stats_max_idle_in_txn_time > 300
for: 60s
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
description: '{{ $labels.job }} has at least one session idle in transaction for over 5 minutes.'
summary: 'PGSQL Instance idle transactions'
- alert: PGIdleTxn
expr: ccp_connection_stats_max_idle_in_txn_time > 900
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: '{{ $labels.job }} has at least one session idle in transaction for over 15 minutes.'
summary: 'PGSQL Instance idle transactions'
- alert: PGQueryTime
expr: ccp_connection_stats_max_query_time > 43200
for: 60s
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
description: '{{ $labels.job }} has at least one query running for over 12 hours.'
summary: 'PGSQL Max Query Runtime'
- alert: PGQueryTime
expr: ccp_connection_stats_max_query_time > 86400
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: '{{ $labels.job }} has at least one query running for over 1 day.'
summary: 'PGSQL Max Query Runtime'
- alert: PGConnPerc
expr: 100 * (ccp_connection_stats_total / ccp_connection_stats_max_connections) > 75
for: 60s
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
description: '{{ $labels.job }} is using 75% or more of available connections ({{ $value }}%)'
summary: 'PGSQL Instance connections'
- alert: PGConnPerc
expr: 100 * (ccp_connection_stats_total / ccp_connection_stats_max_connections) > 90
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: '{{ $labels.job }} is using 90% or more of available connections ({{ $value }}%)'
summary: 'PGSQL Instance connections'
- alert: DiskFillPredict
expr: predict_linear(ccp_nodemx_data_disk_available_bytes{mount_point!~"tmpfs"}[1h], 24 * 3600) < 0 and 100 * ((ccp_nodemx_data_disk_total_bytes - ccp_nodemx_data_disk_available_bytes) / ccp_nodemx_data_disk_total_bytes) > 70
for: 5m
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
summary: 'Disk predicted to be full in 24 hours'
description: 'Disk on {{ $labels.pg_cluster }}:{{ $labels.kubernetes_pod_name }} is predicted to fill in 24 hrs based on current usage'
- alert: PGClusterRoleChange
expr: count by (pg_cluster) (ccp_is_in_recovery_status != ignoring(instance,ip,pod,role) (ccp_is_in_recovery_status offset 5m)) >= 1
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
summary: '{{ $labels.pg_cluster }} has had a switchover/failover event. Please check this cluster for more details'
- alert: PGDiskSize
expr: 100 * ((ccp_nodemx_data_disk_total_bytes - ccp_nodemx_data_disk_available_bytes) / ccp_nodemx_data_disk_total_bytes) > 75
for: 60s
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
description: 'PGSQL Instance {{ $labels.deployment }} over 75% disk usage at mount point "{{ $labels.mount_point }}": {{ $value }}%'
summary: PGSQL Instance usage warning
- alert: PGDiskSize
expr: 100 * ((ccp_nodemx_data_disk_total_bytes - ccp_nodemx_data_disk_available_bytes) / ccp_nodemx_data_disk_total_bytes) > 90
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'PGSQL Instance {{ $labels.deployment }} over 90% disk usage at mount point "{{ $labels.mount_point }}": {{ $value }}%'
summary: 'PGSQL Instance size critical'
- alert: PGReplicationByteLag
expr: ccp_replication_lag_size_bytes > 5.24288e+07
for: 60s
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
description: 'PGSQL Instance {{ $labels.job }} has at least one replica lagging over 50MB behind.'
summary: 'PGSQL Instance replica lag warning'
- alert: PGReplicationByteLag
expr: ccp_replication_lag_size_bytes > 1.048576e+08
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'PGSQL Instance {{ $labels.job }} has at least one replica lagging over 100MB behind.'
summary: 'PGSQL Instance replica lag warning'
- alert: PGReplicationSlotsInactive
expr: ccp_replication_slots_active == 0
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'PGSQL Instance {{ $labels.job }} has one or more inactive replication slots'
summary: 'PGSQL Instance inactive replication slot'
- alert: PGXIDWraparound
expr: ccp_transaction_wraparound_percent_towards_wraparound > 50
for: 60s
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
description: 'PGSQL Instance {{ $labels.job }} is over 50% towards transaction id wraparound.'
summary: 'PGSQL Instance {{ $labels.job }} transaction id wraparound imminent'
- alert: PGXIDWraparound
expr: ccp_transaction_wraparound_percent_towards_wraparound > 75
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'PGSQL Instance {{ $labels.job }} is over 75% towards transaction id wraparound.'
summary: 'PGSQL Instance transaction id wraparound imminent'
- alert: PGEmergencyVacuum
expr: ccp_transaction_wraparound_percent_towards_emergency_autovac > 110
for: 60s
labels:
service: postgresql
severity: warning
severity_num: 200
annotations:
description: 'PGSQL Instance {{ $labels.job }} is over 110% beyond autovacuum_freeze_max_age value. Autovacuum may need tuning to better keep up.'
summary: 'PGSQL Instance emergency vacuum imminent'
- alert: PGEmergencyVacuum
expr: ccp_transaction_wraparound_percent_towards_emergency_autovac > 125
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'PGSQL Instance {{ $labels.job }} is over 125% beyond autovacuum_freeze_max_age value. Autovacuum needs tuning to better keep up.'
summary: 'PGSQL Instance emergency vacuum imminent'
- alert: PGArchiveCommandStatus
expr: ccp_archive_command_status_seconds_since_last_fail > 300
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'PGSQL Instance {{ $labels.job }} has a recent failing archive command'
summary: 'Seconds since the last recorded failure of the archive_command'
- alert: PGSequenceExhaustion
expr: ccp_sequence_exhaustion_count > 0
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'Count of sequences on instance {{ $labels.job }} at over 75% usage: {{ $value }}. Run following query to see full sequence status: SELECT * FROM monitor.sequence_status() WHERE percent >= 75'
- alert: PGSettingsPendingRestart
expr: ccp_settings_pending_restart_count > 0
for: 60s
labels:
service: postgresql
severity: critical
severity_num: 300
annotations:
description: 'One or more settings in the pg_settings system catalog on system {{ $labels.job }} are in a pending_restart state. Check the system catalog for which settings are pending and review postgresql.conf for changes.'
########## PGBACKREST RULES ##########
#
# Uncomment and customize one or more of these rules to monitor your pgbackrest backups.
# Full backups are considered the equivalent of both differentials and incrementals since both are based on the last full
# And differentials are considered incrementals since incrementals will be based off the last diff if one exists
# This avoid false alerts, for example when you don't run diff/incr backups on the days that you run a full
# Stanza should also be set if different intervals are expected for each stanza.
# Otherwise rule will be applied to all stanzas returned on target system if not set.
#
# Relevant metric names are:
# ccp_backrest_last_full_backup_time_since_completion_seconds
# ccp_backrest_last_incr_backup_time_since_completion_seconds
# ccp_backrest_last_diff_backup_time_since_completion_seconds
#
# To avoid false positives on backup time alerts, 12 hours are added onto each threshold to allow a buffer if the backup runtime varies from day to day.
# Further adjustment may be needed depending on your backup runtimes/schedule.
#
# - alert: PGBackRestLastCompletedFull_main
# expr: ccp_backrest_last_full_backup_time_since_completion_seconds{stanza="main"} > 648000
# for: 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# summary: 'Full backup for stanza [main] on system {{ $labels.job }} has not completed in the last week.'
#
# - alert: PGBackRestLastCompletedIncr_main
# expr: ccp_backrest_last_incr_backup_time_since_completion_seconds{stanza="main"} > 129600
# for: 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# summary: 'Incremental backup for stanza [main] on system {{ $labels.job }} has not completed in the last 24 hours.'
#
#
# Runtime monitoring is handled with a single metric:
#
# ccp_backrest_last_info_backup_runtime_seconds
#
# Runtime monitoring should have the "backup_type" label set.
# Otherwise the rule will apply to the last run of all backup types returned (full, diff, incr)
# Stanza should also be set if runtimes per stanza have different expected times
#
# - alert: PGBackRestLastRuntimeFull_main
# expr: ccp_backrest_last_info_backup_runtime_seconds{backup_type="full", stanza="main"} > 14400
# for: 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# summary: 'Expected runtime of full backup for stanza [main] has exceeded 4 hours'
#
# - alert: PGBackRestLastRuntimeDiff_main
# expr: ccp_backrest_last_info_backup_runtime_seconds{backup_type="diff", stanza="main"} > 3600
# for: 60s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# summary: 'Expected runtime of diff backup for stanza [main] has exceeded 1 hour'
##
#
## If the pgbackrest command fails to run, the metric disappears from the exporter output and the alert never fires.
## An absence alert must be configured explicitly for each target (job) that backups are being monitored.
## Checking for absence of just the full backup type should be sufficient (no need for diff/incr).
## Note that while the backrest check command failing will likely also cause a scrape error alert, the addition of this
## check gives a clearer answer as to what is causing it and that something is wrong with the backups.
#
# - alert: PGBackrestAbsentFull_Prod
# expr: absent(ccp_backrest_last_full_backup_time_since_completion_seconds{job="Prod"})
# for: 10s
# labels:
# service: postgresql
# severity: critical
# severity_num: 300
# annotations:
# description: 'Backup Full status missing for Prod. Check that pgbackrest info command is working on target system.'

Some files were not shown because too many files have changed in this diff Show More