Compare commits

...

421 Commits

Author SHA1 Message Date
Andrei Kvapil
102ec2a6f2 [ingress] bump version
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-07 14:32:14 +02:00
Andrei Kvapil
b55db668d1 [ingress] Refactor cdiUploadProxy ingress
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-07 14:31:39 +02:00
Andrei Kvapil
49984e64a0 [ingress] Fix vmExportProxy ingress
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-07 14:31:39 +02:00
Andrei Kvapil
7897190c3f [ingress] Introduce Kubernetes API proxy
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-07 14:31:39 +02:00
klinch0
29b49496f2 [platform] delete extra dependencies for piraeus operator (#856)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated dependency configuration so that piraeus-operator no longer
depends on victoria-metrics-operator.
- **Refactor**
- Improved compatibility by ensuring certain resources (VMPodScrape and
alert definitions) are only rendered if the required API versions are
available in the Kubernetes cluster.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-07 12:30:31 +03:00
kklinch0
3c27192d3e [platform] delete extra dependencies for piraeus operator
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-05-05 16:56:12 +03:00
klinch0
dca732cde0 [platform] add hr reconciler (#870)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new controller to synchronize tenant HelmReleases and
propagate configuration changes.
- Added dynamic host value overrides in multiple Helm templates by
conditionally retrieving values from the "tenant-root" HelmRelease.
- Updated RBAC permissions to allow management of HelmRelease resources.

- **Improvements**
  - Added support for Helm v2 API integration.
- Enhanced HelmRelease reconciliation logic and configuration
propagation for tenant environments.

- **Bug Fixes**
- Fixed periodic reconciliation for the "tenant-root" HelmRelease by
setting its interval to zero.

- **Version Updates**
  - Incremented version numbers for the "info" and "ingress" packages.

- **Chores**
  - Updated version mappings and commit references.
  - Improved .gitignore to exclude the .vscode directory.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-05 16:41:34 +03:00
Timofei Larkin
0346dc05bb Enable user-added params in tenant cluster Cilium (#917)
Users requested the possibility of passing custom values to the Cilium
HelmRelease in tenant k8s clusters to enable its latest features, such
as support for the Gateway API. This customization is now available via
the `valuesOverride` field under `addons.cilium` in the kubernetes' app
values.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for custom override values for the Cilium addon,
allowing users to configure Cilium settings via the values file.
- **Chores**
  - Updated the Kubernetes chart version to 0.20.0.
  - Updated version mappings to reflect the new chart version.
- **Documentation**
- Updated Kubernetes managed service docs to include configuration
details for Cilium addon overrides.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-05 16:55:17 +04:00
Timofei Larkin
a03cdeff04 Enable user-added params in tenant cluster Cilium
Users requested the possibility of passing custom values to the Cilium
HelmRelease in tenant k8s clusters to enable its latest features, such
as support for the Gateway API. This customization is now available via
the `valuesOverride` field under `addons.cilium` in the kubernetes' app
values.

Additionally add dummy schema for S3 bucket, as it breaks the pre-commit
checks.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-05 15:37:34 +03:00
Nick Volynkin
062d72805a [docs] Update release policy: Release Candidate versions (#897)
*Documentation**
- Expanded the release documentation with a new section explaining
Cozystack's staged release process, including details on Release
Candidates, Regular Releases, and Patch Releases.
- Clarified the workflow and purpose of Release Candidates and updated
the explanation of how regular releases are created.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-05 16:15:57 +07:00
Nick Volynkin
70fed8148d [ci] Run pre-commit checks even on doc changes
Pre-commit is now required to merge PRs, so let it run even on documentation updates.
An alternative is to merge with administrator permissions, bypassing rules,
which is not a good practice.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-05 11:57:02 +03:00
Nick Volynkin
12c6df83f5 [docs] Update release policy: Release Candidate versions
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-05 11:52:06 +03:00
kklinch0
f61a7817e6 [platform] add hr reconciler
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-05-05 09:26:50 +03:00
Timofei Larkin
c482289b14 Make kubevirt's CPU allocation ratio configurable (#905)
Kubevirt's default cpu-to-vcpu ration is 1:10, which might be a bit
extreme for some users. This patch introduces a new key in the Cozystack
configmap, "cpu-allocation-ratio" where admins of Cozystack can specify
an alternative value, if needed.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added support for optionally configuring a CPU allocation ratio for
KubeVirt deployments when the relevant setting is provided.
- **Chores**
- Improved configuration flexibility for KubeVirt by allowing dynamic
injection of CPU allocation settings.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-30 10:53:42 +04:00
Timofei Larkin
1e59e5fbb6 Fix virtual machine resource tracking (#904)
* Count Workload resources for pods by requests, not limits
* Do not count init container requests
* Prefix Workloads for pods with `pod-`, just like the other types to
prevent possible name collisions (closes #787)

The previous version of the WorkloadMonitor controller incorrectly
summed resource limits on pods, rather than requests. This prevented it
from tracking the resource allocation for pods, which only had requests
specified, which is particularly the case for kubevirt's virtual machine
pods. Additionally, it counted the limits for all containers, including
init containers, which are short-lived and do not contribute much to the
total resource usage.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved handling of workloads with unrecognized prefixes by ensuring
they are properly deleted and not processed further.
- Corrected resource aggregation for Pods to sum container resource
requests instead of limits, and now only includes normal containers.

- **New Features**
	- Added support for monitoring workloads with names prefixed by "pod-".

- **Tests**
- Introduced unit tests to verify correct handling of workload name
prefixes and monitored object creation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-30 10:52:17 +04:00
Timofei Larkin
6106a9fe51 Make kubevirt's CPU allocation ratio configurable
Kubevirt's default cpu-to-vcpu ration is 1:10, which might be a bit
extreme for some users. This patch introduces a new key in the Cozystack
configmap, "cpu-allocation-ratio" where admins of Cozystack can specify
an alternative value, if needed.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-29 16:13:18 +03:00
Timofei Larkin
ec9e26c054 Fix virtual machine resource tracking
* Count Workload resources for pods by requests, not limits
* Do not count init container requests
* Prefix Workloads for pods with `pod-`, just like the other types to
  prevent possible name collisions (closes #787)

The previous version of the WorkloadMonitor controller incorrectly
summed resource limits on pods, rather than requests. This prevented it
from tracking the resource allocation for pods, which only had requests
specified, which is particularly the case for kubevirt's virtual machine
pods. Additionally, it counted the limits for all containers, including
init containers, which are short-lived and do not contribute much to the
total resource usage.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-29 15:22:46 +03:00
Andrei Kvapil
108fc647ea [ci] Use dots in release candidtate versions, as per SemVer (#901)
This change also fixes `finalizing release` workflow
https://github.com/cozystack/cozystack/pull/890#issuecomment-2830525103

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated release tag validation to require a dot between "rc" and the
number (e.g., `v0.31.5-rc.1` instead of `v0.31.5-rc1`).
  - Adjusted error messages to reflect the new release tag format.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 17:01:13 +02:00
Andrei Kvapil
a9b235048d [ci] Use dots in release candidtate versions, as per SemVer
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 16:54:41 +02:00
Andrei Kvapil
e1c14619d2 Revert "[ci] automatically trigger tests in releasing PR" (#900)
Revert https://github.com/cozystack/cozystack/pull/894 due to fact this
logic does not trigger checks in pull requests

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Removed support for manually triggering the pull request release
workflow.
- Simplified release workflow to run automatically only on labeled pull
requests.
- Eliminated the step in the tags workflow that triggered release
verification via manual dispatch.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 16:50:06 +02:00
Andrei Kvapil
f644bf20c5 Revert "[ci] automatically trigger tests in releasing PR"
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 16:47:45 +02:00
Andrei Kvapil
93bdf41144 Release v0.31.0-rc.1 (#895)
This PR prepares the release `v0.31.0-rc.1`.
2025-04-25 14:56:48 +02:00
Andrei Kvapil
bacf15f037 [e2e] Fix device_ownership_from_security_context CRI (#896)
Currently, you can't create VMDisk or VMInstance. The importer pod in
Error state with logs

`kubectl -n tenant-root logs
importer-prime-84b44042-c0ac-4e52-8fbd-a0313f4701a6`

```
I0422 07:37:02.928787       1 importer.go:107] Starting importer
E0422 07:37:02.929473       1 importer.go:137] exit status 1, blockdev: cannot open /dev/cdi-block-volume: Permission denied

kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceBlock
        pkg/util/file.go:135
kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceByVolumeMode
        pkg/util/util.go:99
main.main
        cmd/cdi-importer/importer.go:135
runtime.main
        GOROOT/src/runtime/proc.go:271
runtime.goexit
        src/runtime/asm_amd64.s:1695
```

This change solves the issue with importer pod

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
  - Improved formatting of script commands for better readability.
  - Updated container runtime configuration for enhanced customization.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 14:54:56 +02:00
dtrdnk
9239852ec8 Update permissions version for CRI containerd
Signed-off-by: dtrdnk <4demenko@gmail.com>
2025-04-25 15:45:07 +03:00
github-actions
87a286fc74 Prepare release v0.31.0-rc.1
Signed-off-by: github-actions <github-actions@github.com>
2025-04-25 12:37:42 +00:00
Andrei Kvapil
6d253b937b [ci] fix triggering releasing pr tests (#898)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 14:33:57 +02:00
Andrei Kvapil
255176c321 [ci] fix triggering releasing pr tests
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 14:33:29 +02:00
Andrei Kvapil
fa341deaac [ci] automatically trigger tests in releasing PR (#894)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added the ability to manually trigger the release verification
workflow with a specific commit SHA.
- The release verification workflow now supports both pull request
events and manual triggers.
- **Chores**
- Automated triggering of release verification tests from the tags
workflow when a new release is detected.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 14:00:26 +02:00
Nick Volynkin
f08566d3f1 [ci] Use dots in release candidtate versions, as per SemVer (#890)
Before: 0.31.0-rc1
After:  0.31.0-rc.1

Why this matters: we want to do things the right way from the start.
Version patten affects how versions are parsed and sorted.
For example, we have release candidates number 9 and 10:

* In 'rc.9' and 'rc.10', the numeric parts are compared as numbers,
  so 9 comes before 10.
* In 'rc9' and 'rc10', versions are compared lexicographically,
  so 10 comes before 9, which is wrong.

Reference: SemVer items 9–11. https://semver.org/#spec-item-9
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-25 14:13:57 +03:00
Andrei Kvapil
a29040faf7 [ci] automatically trigger tests in releasing PR
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:59:46 +02:00
Nick Volynkin
637551eb33 [ci] Use dots in release candidtate versions, as per SemVer
Before: 0.31.0-rc1
After:  0.31.0-rc.1

Why this matters: we want to do things the right way from the start.
Version patten affects how versions are parsed and sorted.
For example, we have release candidates number 9 and 10:

* In 'rc.9' and 'rc.10', the numeric parts are compared as numbers,
  so 9 comes before 10.
* In 'rc9' and 'rc10', versions are compared lexicographically,
  so 10 comes before 9, which is wrong.

Reference: SemVer items 9–11. https://semver.org/#spec-item-9
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-25 13:57:03 +03:00
Andrei Kvapil
58d959b305 [tests] refactor tests and remove e2e.applications (#893)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:55:09 +02:00
Andrei Kvapil
fcc7056e5c [platform] Fix installing release candidate versions (#891)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated version constraints for multiple HelmRelease resources to use
an explicit semantic version range (>= 0.0.0-0) instead of a wildcard or
unspecified value, clarifying eligible chart versions for deployment.
- Renamed and updated version variable in build scripts to improve
version tagging and packaging consistency.
- Enhanced deployment verification by adding readiness checks for
HelmReleases, with failure detection and reporting for non-ready
releases.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 12:42:20 +02:00
Andrei Kvapil
5d7e56bffe [tests] refactor tests and remove e2e.applications
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:28:43 +02:00
Andrei Kvapil
69b3ddf717 [e2e] Better output in case of failed HelmReleases
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:21:49 +02:00
Andrei Kvapil
79b5c6b5af [platform] Use devel versions notation for HelmCharts
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:13:40 +02:00
Andrei Kvapil
076128c783 [platform] Fix installing release candidate versions
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:07:30 +02:00
Andrei Kvapil
894cb14d49 [kubernetes] Fix ubuntu-container-disk tag (#887)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 16:40:26 +02:00
Andrei Kvapil
a0935e9ae4 [kubernetes] Fix ubuntu-container-disk tag
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 16:38:42 +02:00
Andrei Kvapil
f2c248acbd [ci] Create long‑lived maintenance branch after release published (#886)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated release workflows to ensure maintenance branches are created
during release finalization instead of during tag creation.
- Removed maintenance branch creation from the tag workflow and added it
to the release finalization process.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 16:23:01 +02:00
Andrei Kvapil
590f14a614 [ci] Create long‑lived maintenance branch after release published
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 16:22:22 +02:00
Andrei Kvapil
4c8dba880a [ci] fix release branch creation (#884)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:57:23 +02:00
Andrei Kvapil
de0c7b94f4 [ci] fix release branch creation
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:55:46 +02:00
Andrei Kvapil
2682a6e674 [kube-ovn] fix versions mapping in Makefile (#883)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:37:10 +02:00
Andrei Kvapil
e3e0b21612 [kube-ovn] fix versions mapping in Makefile
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:36:25 +02:00
Andrei Kvapil
455d66fbe4 [ci] Do not run tests in release building pipeline (#882)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Removed the "Test" step from the release workflow, so tests will no
longer run as part of this process.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 15:27:27 +02:00
Andrei Kvapil
7db7277636 [ci] Do not run tests in release building pipeline
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:00:06 +02:00
Andrei Kvapil
7be5db8cff [fluxcd] update to flux-operator 0.19.0 (#880)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced configurable API priority and fairness settings for the
Flux Operator, allowing prioritization of API requests and inclusion of
extra service accounts.
- Added support for a new `skip` field in the `ResourceSetInputProvider`
CRD to control update skipping based on label conditions.

- **Bug Fixes**
- Updated service account reference in admin ClusterRoleBinding to use
the dedicated service account name for improved accuracy.

- **Documentation**
- Updated Helm chart and app version numbers to 0.19.0 in documentation
and metadata.
- Added documentation for the new `apiPriority` configuration option in
the Flux Operator Helm chart.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:45:43 +02:00
Andrei Kvapil
249950d94b [kubernetes] Update tenant Kubernetes to v1.32 (#871)
This PR also updates ubuntu-container-disk image to latest 24.04 LTS
(Noble Numbat)

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated Kubernetes version references from v1.30.1 to v1.32 in build
and deployment configurations.
	- Changed the base image for Ubuntu container disk to Ubuntu 24.04.
	- Made the Kubernetes version configurable during build processes.
- Updated the kubectl container image in pre-delete jobs to use the
latest tag.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:43:59 +02:00
Kingdon B
44565dca88 [fluxcd] update to flux-operator 0.19.0
Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-04-24 08:25:29 -04:00
Andrei Kvapil
cefcd24ebb [ci] Fix uploading assets to release (#876)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated release workflow to use the full tag string when uploading
assets.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:25:29 +02:00
Andrei Kvapil
13d7df47d7 [kubernetes] Fix merging valuesOverride for tenant clusters (#879) 2025-04-24 14:24:53 +02:00
Andrei Kvapil
1ccd3074dc [kubernetes] Fix merging valuesOverride for tenant clusters
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 14:24:07 +02:00
Andrei Kvapil
70d3591ed2 [kubernetes] Refactor controlPlane settings (#866)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Updated documentation to rename and restructure the control plane
resource configuration section, replacing the old naming with a unified
"Kubernetes control plane configuration" and updated parameter prefixes.
- **Refactor**
- Consolidated and renamed control plane configuration from
`kamajiControlPlane` to `controlPlane` across configuration files.
- Flattened configuration structure and updated all related parameter
references and hierarchy for improved clarity and consistency.
- **New Features**
- Enhanced resource preset options with expanded enum values for control
plane components.
- **Bug Fixes**
- Simplified HelmRelease manifests by embedding override values inline,
removing dependency on external Secret resources for addons including
cert-manager, GPU operator, ingress-nginx, and vertical-pod-autoscaler.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:08:54 +02:00
Andrei Kvapil
700991f4fa [ci] let CI to cancel previus job if new one is scheduled (#873)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved reliability of GitHub Actions workflows by ensuring only one
job per pull request or branch runs at a time. If a new workflow run is
triggered, any previous in-progress runs for the same group will be
automatically canceled, preventing overlapping executions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 13:58:23 +02:00
Andrei Kvapil
d89acbf44d [ci] get rid of ok-to-test label (#875)
Github requires approval for external users anyway:


https://docs.github.com/en/actions/managing-workflow-runs-and-deployments/managing-workflow-runs/approving-workflow-runs-from-public-forks

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Simplified conditions for running GitHub Actions workflows on pull
requests, removing dependencies on the "ok-to-test" label and repository
origin.
  - Updated comments to reflect the new workflow logic.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 13:58:12 +02:00
Andrei Kvapil
59ef3296f0 [ci] Fix uploading assets to release
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:57:24 +02:00
Andrei Kvapil
3ed0cdee1c [kubernetes] Update tenant Kubernetes to v1.32
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:43:56 +02:00
Andrei Kvapil
9f5230a342 [kubernetes] Refactor controlPlane settings
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:35:10 +02:00
Andrei Kvapil
b895ccfdeb [cluster-api] Update operator, providers, remove Kamaji workaround (#867)
- Update Cluster API operator to v0.19.0
- Update Cluster API Kamaji control-plane provider to v0.14.2.
- This change includes [upstream
fix](https://github.com/clastix/cluster-api-control-plane-provider-kamaji/pull/175),
so our workaround get removed
- Update Cluster API KubeVirt infrastructure provider to v0.1.10
- Update Cluster API core provider to v1.10.0
- Update Cluster API kubeadm config provider to v1.10.0



Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:30:15 +02:00
Andrei Kvapil
d54a407d68 [ci] Disable pre-commit for release branches
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 11:46:26 +02:00
Andrei Kvapil
f9ec630509 [ci] get rid of ok-to-test label
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 11:39:45 +02:00
Andrei Kvapil
3f47181c10 [postgres] remove douplicated template from backup manifest
Resolves https://github.com/cozystack/cozystack/issues/869



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Updated backup cron job configuration for improved clarity and
structure. No changes to backup behavior or scheduling.
- **Chores**
  - Incremented the application chart version to 0.10.1.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 11:34:36 +02:00
Ian Simon
19409d801d [postgres] remove douplicated template from backup manifest
Signed-off-by: Ian Simon <cheatmaster114@gmail.com>
2025-04-24 11:29:30 +02:00
Andrei Kvapil
8a4793d571 [ci] let CI to cancel previus job if new one is scheduled
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 11:25:10 +02:00
Andrei Kvapil
0fc3fdcb3d Update Kube-OVN to v1.13.10 (#847)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
	- Updated kube-ovn chart and container image to version v1.13.10.
- **Bug Fixes**
- Adjusted volume mount paths in the ovncni DaemonSet for improved
configuration consistency.
- **Chores**
	- Streamlined Dockerfile to use the official kube-ovn image directly.
- Automated version synchronization between chart files and Dockerfile
for better maintainability.
- **Improvements**
- Removed NetworkManager synchronization to optimize controller runtime
behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 00:24:59 +02:00
Andrei Kvapil
04e2b3952b Update Kube-OVN to v1.13.10
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 23:25:19 +02:00
Andrei Kvapil
b56624a781 [cluster-api] Update operator, providers, remove Kamaji workaround
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 17:19:29 +02:00
Timofei Larkin
07d7fadb1a Suppress wget progress bar (#865)
In our CI wget spams thousands of lines of the progress bar into the
output, making it hard to read. Turns out, it doesn't have an option to
just remove the progress bar, but explicitly directing wget's log to
stdout and invoking --show-progress sends that to stderr which we
redirect to dev/null. The downloaded size is still reported at regular
intervals, but --progress=dot:giga shortens that to one line per 32M
which is manageable.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved file download process to display clearer progress updates
during downloads.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 19:02:12 +04:00
Andrei Kvapil
8db92d53d1 [kubernetes] Add gpu-operator and introduce GPU support for tenant Kubernetes clusters (#834)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for GPU resources in Kubernetes clusters, including the
ability to specify GPUs per node group and deploy the NVIDIA GPU
Operator as an optional addon.
- Introduced new configuration options for customizing Kamaji control
plane resources and presets.
- Added support for vertical pod autoscaler customization via override
values.

- **Bug Fixes**
- Corrected typographical errors in label keys across multiple
HelmRelease manifests to ensure consistent labeling.

- **Documentation**
- Updated documentation to describe new GPU and control plane
configuration options, removed the instance type feature matrix, and
added detailed parameter explanations.

- **Chores**
- Incremented Kubernetes app chart version to 0.19.0 and updated version
mappings.
  - Fixed typos in parameter descriptions and comments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 16:44:01 +02:00
Andrei Kvapil
7537235f43 [kubernetes] Add gpu-operator and introduce GPU support for tenant Kubernetes clusters
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 16:39:10 +02:00
Timofei Larkin
4bb524e53d Suppress wget progress bar
In our CI wget spams thousands of lines of the progress bar into the
output, making it hard to read. Turns out, it doesn't have an option to
just remove the progress bar, but explicitly directing wget's log to
stdout and invoking --show-progress sends that to stderr which we
redirect to dev/null. The downloaded size is still reported at regular
intervals, but --progress=dot:giga shortens that to one line per 32M
which is manageable.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-23 17:37:57 +03:00
Andrei Kvapil
e7ded52f93 [virtual-machine] Fix: Add GPU names to virtual machines spec (#862)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Each GPU device entry now includes a unique identifier alongside its
device name in both VirtualMachine and VM Instance templates.

- **Configuration**
- The default GPU configuration now includes a specific GPU entry by
default, instead of being empty.

- **Version Updates**
- Chart versions for VirtualMachine and VM Instance applications have
been incremented.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 16:26:13 +02:00
Andrei Kvapil
8547dc3b21 [virtual-machine] Fix: Add GPU names to virtual machines spec
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 15:27:47 +02:00
Andrei Kvapil
c22603bf7e [tenant] Fix networkpolicy for accessing externalIPs from the cluster (#854)
This PR fixes an issue with accessing external IPs of cluster from
cluster itself

```
Policy verdict log: flow 0x6c9bf32e local EP ID 1155, remote ID remote-node, proto 6, ingress, action deny, auth: disabled, match none, 172.27.88.13:46124 -> 10.244.4.174:30274 tcp SYN
xx drop (Policy denied) flow 0x6c9bf32e to endpoint 1155, ifindex 247, file bpf_lxc.c:2181, , identity remote-node->56986: 172.27.88.13:46124 -> 10.244.4.174:30274 tcp SYN
```

related doc:
https://docs.cilium.io/en/stable/security/policy/language/#entities-based


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Expanded network access for the tenant application to allow
connections from both external sources and within the cluster.

- **Chores**
	- Updated the tenant application to version 1.9.2.
	- Adjusted version mappings to reflect the latest release.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 14:30:55 +02:00
Andrei Kvapil
89525dedb5 [e2e] fix timeouts for capi and keycloak (#858)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Increased timeout durations for waiting on certain Kubernetes
resources to improve reliability during environment setup.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 14:26:17 +02:00
Andrei Kvapil
1c53a6f9f6 [e2e] fix timeouts for capi and keycloak
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 14:01:51 +02:00
Andrei Kvapil
16ee0f2c3a [platform]: add vpa for cozy etcd operator (#850)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added support for Vertical Pod Autoscaler (VPA) configuration in the
etcd-operator Helm chart, allowing automatic scaling of CPU and memory
resources for both the operator and kube-rbac-proxy components.
- Introduced new configuration options for enabling VPA, setting
resource limits, and specifying update policies.
- **Documentation**
- Updated documentation to describe the new VPA configuration options
and usage.
- **Chores**
  - Incremented chart version to 0.4.2.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 13:58:47 +02:00
Andrei Kvapil
72d0394475 Revert "[platform] Hash tenant config and store in configmap" (#855)
Reverts cozystack/cozystack#818, according to decicion made in
https://github.com/cozystack/cozystack/issues/802#issuecomment-2823950243

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Removed configuration hash ConfigMaps and related logic from the
system.
- Updated resource templates to no longer reference configuration hash
values.
- Cleaned up internal constants and code related to configuration hash
handling.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 13:24:34 +02:00
Andrei Kvapil
0a998c8b49 Revert "[platform] Hash tenant config and store in configmap"
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 13:24:14 +02:00
Andrei Kvapil
7bfad655c2 Fix: networkpolicy for tenant to access from cluster
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 12:18:40 +02:00
Andrei Kvapil
e81cbf780c [ci] Enable release-candidates and backport functionality (#841)
This PR includes refactored pipeline:
- Automatcially create long-term releasing branch `release-X.Y` after
any tag `vX.Y.*` has publushed
- Allow only tags with names `vX.Y.Z` or `vX.Y.Z-rcN`
- Automatically set `prerelease` option for the release if release is
candidate
- Automatically set `latest` option for the release according to semver
- Add a new workflow to backport PRs with `backport` label into current
feature release
- Do not requrie `ok-to-test` label for internal PRs
2025-04-23 12:06:50 +02:00
kklinch0
e8cc44450a [platform]: add vpa for cozy etcd operator
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 22:48:47 +03:00
Andrei Kvapil
d3a8a4a7de Update Cilium to v1.17.3 (#848)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 20:02:06 +02:00
Andrei Kvapil
fc2c5a0f6b [kubevirt] Enable VMExport feature (#808)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new configuration option to control the virtual export
proxy service (default disabled).
- Deployed a dedicated ingress configuration to support flexible routing
for the virtual export proxy.
- Enabled a feature toggle for VM export capabilities in KubeVirt
deployments.
- **Documentation**
- Updated user documentation to include details about the new virtual
export proxy parameter.
- **Chores**
- Upgraded the associated ingress component from version 1.4.0 to 1.5.0.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 20:01:47 +02:00
Andrei Kvapil
0f8b8e1744 Update LINSTOR to v1.31.0 (#846)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated Helm chart version and container image tags for Piraeus
Operator and related components to newer releases. This includes updates
for controller, satellite, CSI, DRBD, and sig-storage images. No other
configuration changes were made.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 19:59:50 +02:00
Andrei Kvapil
197434ff94 [platform] Hash tenant config and store in configmap (#818)
Every tenant now creates a configmap in its __tenant__ namespace with a
sha256 of its values. Tenants (and eventually all other apps), watch the
configmap in their __release__ namespace, by referencing it in the
valuesFrom part of the HelmRelease. `tenant-root` is an exception, since
it is the only tenant where the release namespace is the same as the
tenant namespace. It references a different configmap in its valesFrom,
created and reconciled by the cozystack installer script. Part of #802.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Summary by CodeRabbit

- **New Features**
- Introduced ConfigMaps that provide SHA256 hashes representing
aggregated tenant and system configurations for improved configuration
tracking.
- Configuration hashes are now injected into application releases,
including a special system configuration hash for the root tenant.

- **Chores**
- Added new constants for configuration hash naming to improve
consistency and maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 19:38:37 +02:00
Andrei Kvapil
703073a164 Update Cilium to v1.17.3
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 19:30:30 +02:00
Andrei Kvapil
6a0fc64475 Update LINSTOR to v1.31.0
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 19:12:44 +02:00
Timofei Larkin
f1624353ef Hash tenant config and store in configmap
Every tenant now creates a configmap in its __tenant__ namespace with a
sha256 of its values. Tenants (and eventually all other apps), watch the
configmap in their __release__ namespace, by referencing it in the
valuesFrom part of the HelmRelease. `tenant-root` is an exception, since
it is the only tenant where the release namespace is the same as the
tenant namespace. It references a different configmap in its valesFrom,
created and reconciled by the cozystack installer script. Part of #802.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-22 18:57:18 +02:00
Andrei Kvapil
277b438f68 [monitoring] Drop legacy label condition. (#826)
ref: https://github.com/deckhouse/deckhouse/pull/960/files

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Updated dashboard metrics filters to exclude containers with empty
names instead of specifically excluding containers named "POD". This
change applies to all relevant CPU, memory, network, and storage metrics
across capacity planning, controller, namespace, namespaces, and pod
dashboards. No other dashboard functionality or structure was changed.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 18:55:47 +02:00
Andrei Kvapil
405863cb11 Drop legacy label condition also for FluxCD.
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:53:05 +02:00
Andrei Kvapil
63ebab5c2a [ci] Enable release-candidates and backport functionality
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:49:40 +02:00
Andrei Kvapil
0ddaff9380 [kubevirt] Enable VMExport feature
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:40:01 +02:00
Andrei Kvapil
a6b02bf381 [ci] Fix checkout and improve error output for gen_versions_map.sh (#845)
Third attempt to fix https://github.com/cozystack/cozystack/pull/842 and
https://github.com/cozystack/cozystack/pull/836

tested in
https://github.com/cozystack/cozystack/actions/runs/14599981710/job/40955508728?pr=808

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved GitHub Actions workflow to fetch full git history and tags
during pre-commit checks.
- **Refactor**
- Updated script behavior to display error messages when version
extraction from git fails, making troubleshooting easier.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 18:38:08 +02:00
Andrei Kvapil
39ede77fec [ci] Fix checkout and improve error output for gen_versions_map.sh
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:34:50 +02:00
Andrei Kvapil
e505857832 [ci] Fix escaping for gen_versions_map.sh script (#842)
second attept of https://github.com/cozystack/cozystack/pull/836

fixes errors like this:

-
https://github.com/cozystack/cozystack/actions/runs/14591720553/job/40928276862?pr=835

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Improved reliability of version generation by handling empty or
special values safely in the process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 17:53:36 +02:00
Andrei Kvapil
d8f3547db7 [ci] Fix escaping for gen_versions_map.sh script
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 17:52:53 +02:00
Denis Seleznev
6d8a99269b Drop legacy label condition.
Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-22 17:42:15 +02:00
klinch0
b9112a398e [platform]: fix migrations (#840)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
	- Updated installer image to include additional system utilities.
- Migration scripts now update Kubernetes ConfigMap with the current
stack version for improved version tracking.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 18:11:24 +03:00
kklinch0
719fdd29cc [platform]: fix migrations
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 17:40:59 +03:00
Timofei Larkin
9e1376f709 Indicate the IP address pool and storage class (#831)
When populating the WorkloadMonitor objects, the status field is now
populated with a specially formatted string, mimicking the keys of
ResourceQuota.spec.hard, e.g.
`<storageclassname>.storageclass.storage.k8s.io/requests.storage` or
`<ipaddresspoolname>.ipaddresspool.metallb.io/requests.ipaddresses`
so the storage class or IP pool in use can be tracked. Part of #788.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Improved labeling of resource usage in workload status by using more
descriptive, context-based keys for IP addresses and storage resources.
This enhances clarity when viewing resource allocation details.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 17:48:51 +04:00
klinch0
7a9a1fcba4 [ci] Fix escaping for gen_versions_map.sh script (#836)
fixes errors like this:

-
https://github.com/cozystack/cozystack/actions/runs/14591720553/job/40928276862?pr=835

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Improved reliability of version generation by handling empty or
special values safely in the process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 16:35:54 +03:00
kklinch0
2def9f4e83 [ci] Fix escaping for gen_versions_map.sh script
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 16:33:40 +03:00
klinch0
c1046aae6a [github] Add @klinch0 to CODEOWNERS (#838) 2025-04-22 16:31:08 +03:00
klinch0
53cf1c537c [dx] automatically detect version for migrations in installer.sh (#837)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated migration versioning to automatically determine the next
version based on existing migration scripts, removing the need for
manual updates.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 16:24:01 +03:00
klinch0
ccedcb7419 [kubernetes] Fix tenant addons removal (#835)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Expanded the pre-delete operation to target additional components,
including cert-manager and vertical pod autoscaler resources.
- **Chores**
- Updated chart version to 0.18.1 and revised version mappings for
improved tracking.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 16:07:54 +03:00
Timofei Larkin
f94a01febd Indicate the IP address pool and storage class
When populating the WorkloadMonitor objects, the status field is now
populated with a specially formatted string, mimicking the keys of
ResourceQuota.spec.hard, e.g.
`<storageclassname>.storageclass.storage.k8s.io/requests.storage` or
`<ipaddresspoolname>.ipaddresspool.metallb.io/requests.ipaddresses`
so the storage class or IP pool in use can be tracked. Part of #788.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-22 15:59:17 +03:00
Andrei Kvapil
495e584313 [github] Add @klinch0 to CODEOWNERS
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 12:47:42 +02:00
Andrei Kvapil
172e660cd1 [dx] automatically detect version for migrations in installer.sh
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 12:46:54 +02:00
Andrei Kvapil
14262cdd2a [platform]: add migration for kube-rbac-proxy daemonset (#830)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Introduced a migration script to update monitoring resources, ensuring
refreshed configurations and pod restarts for improved system stability.
	- Updated installer version tracking to support the latest migration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 12:44:56 +02:00
Andrei Kvapil
80576cb757 [platform]: add VerticalPodAutoscaler for Cozystack dashboard (#828)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced automated resource management for dashboard components
using Kubernetes VerticalPodAutoscaler, enabling dynamic adjustment of
CPU and memory resources.
- **Chores**
- Updated configuration to explicitly set resource presets to "none" for
dashboard, frontend, and related components.
- Added a migration script to ensure Keycloak configuration is properly
reconciled in managed environments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 12:44:27 +02:00
kklinch0
fde6e9cc73 [platform]: add migration for kube-rbac-proxy daemonset
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 13:05:48 +03:00
Timofei Larkin
57ca60c5a5 [platform] Fix installing HelmReleases on initial setup (#833)
fixes https://github.com/cozystack/cozystack/issues/832

This PR fixes regression on installing helmreleases, also some refactor

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 14:01:32 +04:00
Andrei Kvapil
1d0ee15948 [kubernetes] Fix tenant addons removal
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 11:42:40 +02:00
kklinch0
eeaa1b4517 [platform]: add migration for kube-rbac-proxy daemonset
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 12:38:49 +03:00
Andrei Kvapil
a14bcf98dd [platform]: make lower resource request for capi-kamaji-controller-manager (#825)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated resource specifications for the "kamaji" provider to include
CPU and memory requests in addition to existing limits.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 11:22:33 +02:00
Andrei Kvapil
be84fc6e4e Fix: installing HelmReleases on initial setup
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 09:48:53 +02:00
kklinch0
73a3f481bc (platform): make lower resource request for capi-kamaji-controller-manager
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-18 15:00:52 +03:00
Andrei Kvapil
5903bbc64a [ci] Fix: do not run tests in case of release skipped (#822)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-17 23:31:07 +02:00
Andrei Kvapil
f204809e43 [ci] Revert: Workflows: Use real username to commit changes and fix assets (#823)
Let's revert 3c511023f3, because DCO don't
like such commits
2025-04-17 23:30:51 +02:00
Andrei Kvapil
fe4806ce49 [ci] Revert: Workflows: Use real username to commit changes and fix assets
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-17 23:29:41 +02:00
Andrei Kvapil
8f535acc3f [ci] Fix: do not run tests in case of release skipped
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-17 23:24:20 +02:00
Andrei Kvapil
53cbb4ae12 [monitoring] fix vpa for vmagent delete resources (#820)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated resource allocation settings for monitoring agents by removing
predefined CPU and memory limits.
- Added an option to specify separate resource settings for the config
reloader component.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-17 23:16:12 +02:00
kklinch0
4e9446d934 [monitoring] fix vpa for vmagent delete resources
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-17 21:38:28 +03:00
Andrei Kvapil
acbfb6ad64 [docs] Describe the Cozystack release workflow (#817)
See preview in
https://github.com/cozystack/cozystack/blob/127-document-release-workflow/docs/release.md

Resolves #127

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Added a comprehensive "Release Workflow" section detailing steps for
regular and patch releases, including tagging, CI workflows, pull
request management, artifact building, and publication.
- Included diagrams illustrating branching and release flows for
improved clarity.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-17 09:14:47 +02:00
Andrei Kvapil
8570449080 [ci] Update pipeline for patch releases (#816)
This PR includes the following changes:

* Do not remove version tag as part of releasing pipeline
* Overwrite tag only by fact of merging releasing pull request
* Automatically detect merge base and prepare pull request for this base
* Allow to run pipeline only for tags created on `main` and
`release-X.Y` branches


Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved workflow reliability by forcing Git tag creation and push to
overwrite existing tags if necessary.
- Enhanced workflow documentation with detailed, numbered comments for
greater clarity.
- Updated tag-based workflow to dynamically determine the base branch,
ensuring only valid branches are used.
	- Removed the automatic deletion of pushed tags in the workflow.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-17 09:14:28 +02:00
Nick Volynkin
ffe6109dfb [docs] Describe the Cozystack release workflow
Resolves #127

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-16 19:31:58 +03:00
Andrei Kvapil
7dbb8a1d75 [ci] Update pipeline for patch releases
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-16 16:54:19 +02:00
Andrei Kvapil
86210c1fc1 Release v0.30.2 (#813)
This PR prepares the release `v0.30.2`.
(Please merge it before releasing draft)
2025-04-16 09:45:47 +02:00
kvaps
e96f15773d Prepare release v0.30.2
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-15 07:42:59 +00:00
klinch0
bc5635dd8e [monitoring] add vpa for users k8s clusters (#806)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the application version to 0.18.0 with refined version
tracking for improved deployment clarity.
  
- **New Features**
- Enhanced the monitoring agents integration with updated dependency
management.
- Introduced new deployment configurations for the vertical pod
autoscaler and its custom resource definitions, offering customizable
override options and improved reconciliation strategies.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-15 09:38:38 +02:00
Andrei Kvapil
5d71c90f0a [platform] Another logic for deleting components (#811)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Streamlined the internal deployment process by consolidating deletion
operations and simplifying task dependencies.
- **New Features**
- Enhanced release management with updated logic that automatically
determines whether to deploy or remove components based on their enabled
status.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 17:34:28 +02:00
Andrei Kvapil
05d6ab9516 [platform] Another logic for deleting components
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-14 17:02:50 +02:00
Andrei Kvapil
ccb001ee97 [platform] revert API_VERSIONS_FLAGS (#810)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Improved the deployment process to better incorporate API version
settings, enhancing the consistency and accuracy of resource generation
during deployment.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 14:40:32 +02:00
kklinch0
5a5cf91742 (platform): revert API_VERSIONS_FLAGS
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-14 15:36:16 +03:00
klinch0
6a0d4913f2 [platform] fix deleting bundles (#809)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced the container image with an additional YAML processing tool
for improved configuration management.
- Introduced new workflow commands that streamline deployment operations
by reconciling resource changes and automating cleanup.
- Enabled management of disabled components by automatically suspending
and flagging inactive deployments for optimized system performance.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 14:28:08 +03:00
klinch0
685e50bf6c [monitoring] add vpa for users k8s clusters (#806)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the application version to 0.18.0 with refined version
tracking for improved deployment clarity.
  
- **New Features**
- Enhanced the monitoring agents integration with updated dependency
management.
- Introduced new deployment configurations for the vertical pod
autoscaler and its custom resource definitions, offering customizable
override options and improved reconciliation strategies.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 14:07:35 +03:00
kklinch0
f90fc6f681 [platform] fix deleting bundles
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-14 13:22:33 +03:00
Andrei Kvapil
d8f3f2dee1 [ci] Fix matching tag for release branch (#805)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the automated release process to format version tags with a
"v" prefix for consistent version naming.
  - Performed minor cleanup to improve overall code clarity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 09:04:57 +02:00
kklinch0
da8100965f [monitoring] add vpa for users k8s clusters
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-11 14:52:26 +03:00
Andrei Kvapil
6d2ea1295e [ci] Fix matching tag for release branch
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-11 12:41:33 +02:00
Andrei Kvapil
d7914ff9aa Release v0.30.1 (#804)
This PR prepares the release `v0.30.1`.
(Please merge it before releasing draft)
2025-04-11 12:36:18 +02:00
kvaps
7f4af5ebbc Prepare release v0.30.1
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-11 10:07:16 +00:00
Andrei Kvapil
8f575c455c [monitoring] Refactor management etcd monitoring config (#799)
* Reuse the vmagent's serviceaccount
* Mount the serviceaccount token instead of manually creating secrets
* Give the kube-rbac-proxy a unique labelset to avoid targeting wrong
pods

Resolves #789 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new proxy component within the monitoring configuration.

- **Refactor**
  - Updated resource labeling for improved consistency.
- Revised service account references and authentication settings for a
more streamlined operation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-11 10:57:30 +02:00
Andrei Kvapil
819166eb35 [monitoring] create a new version to include fix for vlogs image (#803)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
  - Updated the monitoring application version from 1.9.1 to 1.9.2.
- Refined version reference details by shifting from a dynamic reference
to a fixed commit identifier for improved tracking and reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-11 10:55:54 +02:00
kklinch0
f507802ec9 (monitoring) patch for fix vlogs image
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-11 11:51:33 +03:00
Andrei Kvapil
62267811cb [ci] Fix release branch matching (#800)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the branch naming convention for release management to remove
the 'v' prefix.
  - Revised the displayed error messages to match the new format.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-11 10:46:32 +02:00
Andrei Kvapil
12bedef2d3 [ci] Fix release branch matching 2025-04-11 08:55:11 +02:00
Andrei Kvapil
fd240701f8 Release v0.30.0 (#783)
This PR prepares the release `v0.30.0`.
(Please merge it before releasing draft)
2025-04-10 16:35:41 +02:00
Timofei Larkin
60b96e0a62 Refactor management etcd monitoring config
* Reuse the vmagent's serviceaccount
* Mount the serviceaccount token instead of manually creating secrets
* Give the kube-rbac-proxy a unique labelset to avoid targeting wrong
  pods

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-10 16:59:43 +03:00
kvaps
1d377bab9d Prepare release v0.30.0
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-10 12:50:47 +00:00
Andrei Kvapil
799690dc07 [ci] Allow tests to run in paralel (#786)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:36:01 +02:00
Andrei Kvapil
aa02d0c5e6 [tests] fix tests and add retry for monitoring hr tests (#791)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced explicit versioning for monitoring components, now using
version *v1.17.0-victorialogs*.
- Enhanced the Docker setup by including Flux CD support for improved
deployments.

- **Chores**
- Optimized deployment automation to streamline service readiness checks
and improve overall reliability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:35:50 +02:00
Andrei Kvapil
6b8ecf3953 Upd: Kube-OVN to v1.13.8 (#797) 2025-04-10 14:35:22 +02:00
Andrei Kvapil
e5b81f367e Upd: Cilium to v1.17.2 (#796)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:35:07 +02:00
Andrei Kvapil
ba4798464d Upd: Kamaji to edge-25.3.2 (#795)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:34:50 +02:00
Andrei Kvapil
655e8be382 Upd: Keycloak-operator to v1.25.0 (#794)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Upgraded to a new release version, offering enhanced integration and
secure client configuration options.
- Expanded realm settings now support advanced user profile
customization and robust email configuration for streamlined operations.
- Improved administrative views deliver clearer insights for managing
your system.

- **Documentation**
- Installation and release details have been updated to accurately
reflect the latest version.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:34:32 +02:00
Andrei Kvapil
fd9a5b0d7b Update Cluster-API operator to v0.18.1 (#793)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Upgraded the Cluster API Operator release to version 0.18.1 with an
updated manager image.
- Introduced a new configuration option that enables conditional
deployment hooks, allowing for custom post-install and post-upgrade
actions.
- Enhanced resource synchronization with dynamic feature gate settings,
providing more flexible and coordinated deployments.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:34:16 +02:00
Andrei Kvapil
d09314bbb5 Upd: victoria-metrics operator to v0.55.0 (#792)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Upgraded operator and chart versions with additional configuration
options for enhanced metrics scraping, security, and resource
management.
- Introduced new settings to fine-tune monitoring endpoints and resource
definitions.

- **Documentation**
  - Updated release notes and metadata for clearer change tracking.
  - Streamlined user documentation by removing legacy files.

- **Improvements**
- Enhanced error handling and validation logic for smoother deployments.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:33:58 +02:00
Andrei Kvapil
371791215a Upd: Kube-OVN to v1.13.8
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:32:34 +02:00
Andrei Kvapil
4c220bb443 Upd: Cilium to v1.17.2
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:28:51 +02:00
Andrei Kvapil
e27611d45a Upd: Kamaji to edge-25.3.2
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:28:09 +02:00
Andrei Kvapil
a3e647c547 Upd: Keycloak-operator to v1.25.0
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:26:58 +02:00
Andrei Kvapil
be52fe5461 Update Cluster-API operator to v0.18.1
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:26:09 +02:00
kklinch0
1d639fda0d [tests] fix tests and add retry for monitoring hr tests
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-10 15:23:31 +03:00
Andrei Kvapil
bbdde79428 Upd: victoria-metrics operator to v0.55.0
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:23:12 +02:00
Andrei Kvapil
1966f86120 [ci] Allow to run tests in paralel
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 11:35:06 +02:00
Andrei Kvapil
fa7c98bc38 [ci] Automatically run tests as part of release pipepline (#785)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 11:15:38 +02:00
Andrei Kvapil
6671869acc [ci] Automatically run tests as part of release pipepline
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 11:13:51 +02:00
Andrei Kvapil
434c5d1b9c [ci] Add talos-kernel and talos-initramfs to assets (#784)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 10:55:23 +02:00
Andrei Kvapil
cc9abfe03f [ci] Add talos-kernel and talos-initramfs to assets
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 10:54:54 +02:00
Andrei Kvapil
e02fd14a3c Fix: versions_map, use awk instead of grep (#780)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Enhanced the internal version verification process to ensure improved
precision and reliability in version validation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 10:42:52 +02:00
Andrei Kvapil
559eb8dea9 Fix: versions_map, use awk instead of grep
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 10:42:31 +02:00
Andrei Kvapil
9e6478b9c9 [linstor] Add plunger check for disconnected DRBD peers. (#707)
Sometimes DRBD devices get stuck in "Connecting" state, probably due to
some
race conditions. This scriptlet provides a workaround for such
situations.
2025-04-10 10:38:29 +02:00
Andrei Kvapil
3a295c4474 Add guard against empty cloudInit in vm-instance app (#646)
Prevent the VM resource from referencing a non-existent secret when
`sshKeys` are set and `cloudInit` is set to empty.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Improved cloud-init configuration handling with conditional logic and
clearer error messaging when expected configuration values are missing.

- **Documentation**
- Refined virtual machine configuration guides by reformatting parameter
tables and correcting typographical errors in parameter descriptions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 10:37:50 +02:00
Andrei Kvapil
f8dfc43cae [e2e] Add mirror.gcr.io as default mirror for docker.io (#782)
related issues:
- https://github.com/cozystack/talm/pull/48
- https://github.com/cozystack/website/pull/154
- https://github.com/cozystack/cozystack/pull/782


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced additional configuration options that enable using Docker
image mirrors. This enhancement can improve image retrieval performance
and provide redundancy while maintaining the existing functionality.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 10:36:23 +02:00
Andrei Kvapil
3e19bc74d4 [tests] Add mirror.gcr.io as default mirror for docker.io
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 10:36:02 +02:00
klinch0
2966922c0b feat(vpa): separate-crds (#781)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Improved autoscaling deployment by integrating an additional component
for managing custom resource definitions.
- Enhanced dependency management now ensures critical prerequisites are
deployed in the correct order.
- Introduced an automated update mechanism to keep resource definitions
current.
- Added a new configuration option, giving users the flexibility to
enable or disable custom resource definitions as needed.
- Introduced two new Custom Resource Definitions:
`VerticalPodAutoscalerCheckpoint` and `VerticalPodAutoscaler`.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 11:35:36 +03:00
Denis Seleznev
991c7e1943 Handle empty cloudInit.
Add a no-op user-data when sshKeys are specified.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-10 10:07:43 +02:00
kklinch0
c31a7710ad feat(vpa): separate-crds
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-10 10:57:50 +03:00
Andrei Kvapil
f4cace093c Add a setting to VMs that allows users to trigger cloud-init full reconfiguration. (#767)
This will trigger cloud-init reinitialization, including ssh keys update
and static network config refresh.
2025-04-10 09:20:55 +02:00
Denis Seleznev
01e417d436 Add Linstor plunger scriptlet to fix DRBD devices that are stuck disconnected.
Sometimes DRBD devices get stuck in "Connecting" state, probably due to some
race conditions. This scriptlet provides a workaround for such situations.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-10 03:49:23 +02:00
Denis Seleznev
261ce4278f Add a setting to VMs that allows users to trigger cloud-init full reconfiguration.
Changing `cloudInitSeed`  will trigger cloud-init reinitialization, including ssh keys update and static network config refresh.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-09 20:48:18 +02:00
Timofei Larkin
785898b507 Delete a Workload if the related object is absent (#779)
Workload object counts were previously getting out of control as the recreation of a related Pod would spawn a new workload, while the old one would never get deleted (except for StatefulSets, where the names of Pods are stable). Workloads without a matching object are now deleted.
2025-04-09 21:02:22 +04:00
Timofei Larkin
47a2cf7cd5 Track public IP usage (#769)
Like the existing behavior for Pods and the recently merged behavior for PVCs, the WorkloadMonitor controller now creates Workload objects for Services with Type==LoadBalancer to keep track of public IP reservations.
2025-04-09 21:00:35 +04:00
Timofei Larkin
1f19793613 Merge branch 'main' into 176-track-ips 2025-04-09 20:26:29 +04:00
Timofei Larkin
a0df2989af Track public IP usage
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-09 19:24:36 +03:00
Timofei Larkin
bdb538ab42 Track PVCs with WorkloadMonitor (#768)
The WorkloadMonitor controller now also watches PVCs, just like it has been watching Pods and creates Workloads per PVC according to the `spec.selector` field to track the used storage space.
2025-04-09 20:01:13 +04:00
klinch0
c844a4fb2b Fix: versions_map, include only versions from tags (#777)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Improved error handling so that missing chart versions no longer halt
processing, ensuring smoother operations.

- **Chores**
- Simplified the version tag lookup to rely solely on remote repository
tags for increased consistency and reliability.
  - Updated the Kafka application version from `0.5.0` to `0.5.2`.
- Adjusted versioning information for the Kafka package to reflect fixed
commit references.
- Streamlined the pre-commit workflow by removing unnecessary steps and
logging.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-09 18:46:58 +03:00
Timofei Larkin
fea142774a Delete a Workload if the related object is absent
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-09 18:36:11 +03:00
Andrei Kvapil
4b575299bc Log verbose state for DRBD devices that are not healthy. (#771)
This will help troubleshoot issues that occurred in the past but have
already been resolved.
2025-04-09 14:16:53 +02:00
Andrei Kvapil
4eec016f7d Merge pull request #757 from jokeOps/main
kubevirt for able to run CX or RT type of instances.
2025-04-09 14:12:31 +02:00
Andrei Kvapil
4078b21ac6 Merge branch 'main' into main 2025-04-09 14:09:59 +02:00
Andrei Kvapil
1721d397a7 Fix: versions_map, include only versions from tags
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-09 14:07:07 +02:00
Andrei Kvapil
558a0572f5 Merge pull request #776 from cozystack/bugfix/fix_version_map
fix version map
2025-04-09 14:06:31 +02:00
kklinch0
d60b81c8a0 fix version map
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-09 14:44:42 +03:00
Timofei Larkin
cc14c1fbab Track PVCs with WorkloadMonitor
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-09 14:09:36 +03:00
Nick Volynkin
80aee1354b Merge pull request #774 from cozystack/update-readme
*   [docs] Update links after restructuring docs
    
    Follow-up to cozystack/website#138
*   [docs] Proofread the readme and contributing
    
    Fix a few errors here and there.

*   [ci] Run pre-commit checks once on PRs
    
    Pre-commit checks used to trigger twice on PRs: for `push` and `pull_request`
    triggers. Now they will only run on `push` to the main branch and on regular
    updates to pull requests, except for those that only change the documentation.
    
    Note that pushes to feature branches will not trigger this check until
    a PR was opened.
2025-04-09 11:47:43 +03:00
Andrei Kvapil
332d69259b Merge pull request #766 from cozystack/vm-gpu
[virtual-machine] Add GPU support
2025-04-09 10:40:52 +02:00
Andrei Kvapil
9ad6b0d726 [virtual-machine] Add GPU support
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-09 10:39:49 +02:00
Andrei Kvapil
ea9df9e371 Merge pull request #765 from cozystack/gpu-operator
[gpu-operator] Introduce GPU-operator
2025-04-09 10:37:52 +02:00
Nick Volynkin
d69a9c4862 [docs] Proofread the readme and contributing
Fix a few errors here and there.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-09 11:09:19 +03:00
Nick Volynkin
6270a11bb1 [docs] Update links after restructuring docs
Follow-up to cozystack/website#138

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-09 11:09:18 +03:00
Nick Volynkin
18726483a6 [ci] Run pre-commit checks once on PRs
Pre-commit checks used to trigger twice on PRs: for `push` and `pull_request`
triggers. Now they will only run on `push` to the main branch and on regular
updates to pull requests, except for those that only change the documentation.

Note that pushes to feature branches will not trigger this check until
a PR was opened.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-09 11:07:38 +03:00
Denis Seleznev
aed184f6ef Log verbose state for DRBD devices that are not healthy.
This will help troubleshoot issues that occurred in the past but have already been resolved.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-09 03:46:37 +02:00
Andrei Kvapil
f688a57132 Merge pull request #773 from cozystack/upload-vmlinuz-and-initramfs
Upload kernel and initramfs to release assets
2025-04-08 23:31:34 +02:00
Andrei Kvapil
e954ab7f8b Upload kernel and initramfs to release assets
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-08 23:27:37 +02:00
Timofei Larkin
c9c8235c64 Merge pull request #772 from klinch0/monitoring-add-vpa-for-vmagent
[monitoring] add vpa for vmagent
2025-04-08 17:48:38 +04:00
kklinch0
8e2e77da56 [monitoring] add vpa for vmagent
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-08 16:40:39 +03:00
Andrei Kvapil
1e27dedde5 [gpu-operator] Introduce GPU-operator
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-08 14:03:52 +02:00
Timofei Larkin
e947805c15 Track PVCs with WorkloadMonitor
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-08 11:44:36 +03:00
Pavlo Gaponuk
7a1c3b6209 Need for CX or RX type of instances
Signed-off-by: Pavlo Gaponuk <pashagaponuk@gmail.com>
2025-04-07 19:03:04 +02:00
Andrei Kvapil
49b5b510ee Merge pull request #758 from klinch0/k8s-change-CP-default-resourcesPreset
[k8s] change CP default resourcesPreset
2025-04-05 21:35:11 +02:00
kklinch0
3cf850c2c4 [k8s] change CP default resourcesPreset
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-05 21:31:17 +03:00
Andrei Kvapil
1fbbfcd063 [ci] Rename workflows
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 17:05:19 +02:00
Andrei Kvapil
de19450f44 Merge pull request #751 from cozystack/release-0.29.1
Release v0.29.1
2025-04-03 16:38:59 +02:00
Andrei Kvapil
09c94cc1a0 Finalize workflows
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 16:38:27 +02:00
kvaps
da301373fa Prepare release v0.29.1
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-03 14:27:23 +00:00
Andrei Kvapil
1f558baa9b add release workflows
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 16:24:35 +02:00
Andrei Kvapil
3c511023f3 Workflows: Use real username to commit changes and fix assets
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 15:42:38 +02:00
Andrei Kvapil
d10a9ad4e6 Workflows Fix uploading assets 2025-04-03 15:37:48 +02:00
Andrei Kvapil
9ff9f8f601 Workflows fix DCO 2025-04-03 15:34:58 +02:00
Andrei Kvapil
05a1099fd0 Allow workflow to upload assets 2025-04-03 15:28:43 +02:00
Andrei Kvapil
b2980afcd1 Allow workflow to create pull requests 2025-04-03 15:21:31 +02:00
Andrei Kvapil
6980dc59c5 Add workflow to run e2e tests using GitHub CI
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 15:02:52 +02:00
Andrei Kvapil
a9c8133fd4 fix dependencies for kafka-operator and clickhouse-operator (#748)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 00:58:16 +02:00
Andrei Kvapil
065abdd95a [linstor] fix reloader patch (#747)
this PR fixes problem

```
* admission webhook "vlinstorcluster.kb.io" denied the request: LinstorCluster.piraeus.io "linstorcluster" is invalid: spec.patches.0.patch: Invalid value: "apiVersion: apps/v1\nkind: Deployment\nmetadata:\n  annotations:\n    secret.reloader.stakater.com/auto: \"true\"\n": Failed to parse patch as either Strategic Merge Patch (missing metadata.name in object {{apps/v1 Deployment} {{ } map[] map[secret.reloader.stakater.com/auto:true]}}) or JSON Patch (json: cannot unmarshal object into Go value of type jsonpatch.Patch)
```

used solution from
-
https://github.com/piraeusdatastore/piraeus-operator/issues/701#issuecomment-2377702085

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 00:45:27 +02:00
Andrei Kvapil
cd8c6a8b9a Fix dependency for clickhouse-operator (#746)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 00:36:25 +02:00
Andrei Kvapil
459673f764 Fix CiliumNetworkPolicy depends on cilium (#745) 2025-04-03 00:21:13 +02:00
Nick Volynkin
c795e4fb68 Prepare release v0.29.0 (#740)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Streamlined the asset release process to automatically replace
existing files during uploads.
  
- **Container Image Updates**
- Upgraded versions across multiple components—including backup,
caching, autoscaling, API, dashboard, monitoring, and more—to align with
the latest release (e.g., updating from v0.28.0 to v0.29.0 and other
minor version increments).
- Updated specific images for Grafana, PostgreSQL, MariaDB, ClickHouse,
and others to their latest versions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 23:45:25 +02:00
Andrei Kvapil
7c98248e45 Update Talos Linux to v1.9.5 (#744)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 22:56:41 +02:00
Andrei Kvapil
16c771aa77 [vm-disk] disable immediate bind for non-upload disks (#742)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 22:42:51 +02:00
Andrei Kvapil
d971f2ff29 Enhance versions_map generator logic (#741)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Enhanced version mapping with improved error reporting and clearer
version resolution, ensuring more accurate and reliable version
displays.
- Updated version references for multiple packages to maintain
consistency and stability across the system.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 21:43:17 +02:00
Nick Volynkin
ead0cca5af Fix bootbox version commit (#739)
Fixup to 116196b4d4

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded the Bootbox package to the latest release, ensuring continued
stability and a solid foundation for future improvements.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-02 15:35:41 +02:00
Nick Volynkin
47ff861f00 Upload build assets with make upload_assets (#737)
# Upload build assets with make upload_assets; keep image digests

* Keep image digests in cozystack-installer.yml
* Remove manifests/cozysаtack-installer.yaml, which is a build asset,
not source file.
* Requires `gh`, the GitHub CLI app.

Co-authored-by:  Andrei Kvapil <kvapss@gmail.com>
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced an automated process that uploads release assets during
each update cycle, ensuring smoother and more reliable deployments.
	- Added a new target for uploading assets in the build process.

- **Refactor**
- Streamlined the organization and generation of deployment artifacts to
improve efficiency and consistency in our release workflow.
	- Adjusted output file management for better structure.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 14:46:02 +02:00
Andrei Kvapil
dbc1fb8a09 Enable Cilium host firewall (#738)
This commit enables Cilium's host firewall feature and makes use of it
to deny external connections to two exporters running as daemonset pods
in the host network namespace.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
Co-authored-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-02 14:42:19 +02:00
Timofei Larkin
d9c6fb7625 Enable Cilium host firewall (#736)
This commit enables Cilium's host firewall feature and makes use of it
to deny external connections to two exporters running as daemonset pods
in the host network namespace.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Host firewall is now enabled by default, adding an extra layer of
security.
  - Enhanced network traffic management with new policies:
    - One policy tightens access to critical service ports.
- Another secures monitoring endpoints by restricting unauthorized
external access.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-02 13:16:15 +02:00
Andrei Kvapil
cd23a30e76 [kubernetes] Fix patches for kubevirt-cloud-provider (#735)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
	- Updated the cloud provider’s container image to a new stable version.
- **New Features**
- Enhanced multi-cluster support to ensure services are correctly scoped
to their respective clusters.
- **Refactor**
	- Removed legacy service management logic for streamlined processing.
- **Tests**
- Adjusted test coverage to validate the improved multi-cluster and
service filtering behaviors.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 11:02:29 +02:00
Timur Tukaev
f4fba7924b Create GOVERNANCE.md (#733)
This is a first draft of Cozystack project governance

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Introduced a new governance document outlining the project's community
roles and responsibilities.
- Clarified roles for various community members including users,
contributors, and leadership.
- Detailed the decision-making process and guidelines to promote
transparent engagement and active participation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
Co-authored-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-01 18:48:14 +02:00
Timofei Larkin
01b3a82ee2 [linstor] Introduce Reloader to automatically reload certificates (#715)
* Add stakater/Reloader to the storage-enabled bundles.
* Add annotations to Linstor components to reload when secrets change.

Closes #456 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new reloader component that triggers automatic rolling
updates when configuration or secret changes are detected.
- Delivered a fully customizable Helm chart and configuration schema,
including a reload strategy based on annotations for enhanced deployment
control.
  
- **Tests**
- Added test cases to validate container security settings and
environment variable propagation, ensuring robust high-availability
configurations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-01 18:47:18 +02:00
Andrei Kvapil
0403f30cd6 [kubernetes] Fix race-condition between two KubeVirt CCMs working with the services with identifical names (#722)
This PR fixes race condition when you have two clusters with two
services with simillar names, two eps-controllers might continiusly
conflict between each other.

Upstream issue
https://github.com/kubevirt/cloud-provider-kubevirt/pull/341

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced multi-cluster support with a new configuration option to
ensure that services and deployments are correctly scoped across
different environments.

- **Bug Fixes**
- Refined endpoint management by improving service port handling and
eliminating duplicate entries, resulting in more consistent and reliable
service routing.

- **Chores**
- Updated the image versioning strategy to a dynamic tag, streamlining
deployments and simplifying future upgrades.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-01 18:47:01 +02:00
Andrei Kvapil
116196b4d4 [bootbox] Automatically lowercase mac addresses (#732)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- MAC addresses are now consistently displayed in lowercase for uniform
formatting.
- **Chores**
- Updated the package version to 0.1.1 and refined version tracking for
improved package management.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-01 18:45:26 +02:00
xy2
bdc3525c9b [linstor] Make satellite plunger script continue on errors. (#728)
Do not restart the container if something goes wrong while plunger tries
to fix the linstor state.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved error notifications during system operations to ensure any
issues are promptly reported.
- **Documentation**
- Updated descriptive information on secondary volume management to help
clarify operational conditions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-03-31 19:47:26 +02:00
Andrei Kvapil
9a0f459655 [linstor] Plunger: detach also / devices (#731)
I just noticed that many devices are still attached to `/`, they are not
deleted but still blocking DRBD operations

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved detection of inactive loop devices by broadening the
filtering criteria, ensuring more accurate identification in various
scenarios.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-31 15:51:10 +02:00
Timofei Larkin
59aac50ebd Merge pull request #711 from cozystack/fix-regression
Fix dependency for piraeus-operator
2025-03-28 20:06:12 +04:00
Timofei Larkin
5f1b14ab53 Merge pull request #720 from cozystack/709-backport-ingress-nginx
Use backported ingress controller
2025-03-28 20:00:22 +04:00
Timofei Larkin
5c900a7467 Revert ingress nginx chart to v4.11.2
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-03-28 17:26:24 +03:00
Timofei Larkin
cc9abbc505 Use backported ingress controller
Due to upstream compat issues we backport the security patches to
v1.11.2 of the ingress controller and do not rebuild the existing
protobuf exporter.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-03-28 16:49:31 +03:00
Timofei Larkin
c66eb9f94c Merge pull request #710 from cozystack/709-update-ingress-nginx
Update ingress-nginx to mitigate CVE-2025-1974
2025-03-26 14:11:44 +04:00
Timofei Larkin
0045ddc757 Update ingress-nginx to mitigate CVE-2025-1974
Closes #709

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-03-26 12:12:06 +03:00
Andrei Kvapil
209a3ef181 Fix dependency for piraeus-operator
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-25 12:58:21 +01:00
Nick Volynkin
92e2173fa5 Fix typo in VirtualPodAutoscaler Makefile
Makefile was copied from VictoriaMetrics Operator, some lines were not
changed.

Follow-up to #676
Resolves #705

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-03-25 12:51:01 +02:00
xy2
2d997a4f8d Remove workaround for ZFS tools. (#704)
The mounting of /run parts was fixed in a consistent way upstream.

fixes #699

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Streamlined satellite configurations by removing an unnecessary volume
setting from a container.
- Eliminated an unneeded container entry along with its related
configuration, reducing overall complexity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-03-23 21:22:26 +01:00
Kingdon Barrett
b1baaa7d98 Update Flux Operator to 0.18.0 (#703)
Released early this week


https://github.com/controlplaneio-fluxcd/flux-operator/releases/tag/v0.18.0

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
  - Upgraded the charts to version 0.18.0.
  - Added options for custom pod scheduling using node selectors.
- Introduced a reporting configuration with a customizable interval
through an environment variable.

- **Documentation**
- Updated release information and configuration details to reflect the
new options and version update.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-03-23 21:22:03 +01:00
Timofei Larkin
869bd4f12c Merge pull request #698 from klinch0/bugfix/fix-vpa
bugfix/fix-vpa
2025-03-20 16:12:52 +04:00
kklinch0
f6d4541db3 fix cpu
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
kklinch0
5729666e72 fix cpu
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
kklinch0
aa3a36831c fix mi
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
kklinch0
1e03ba4a02 revert image
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
kklinch0
d12dd0e117 fix vm and k8s resources
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
kklinch0
077045b094 fix apps resources
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
kklinch0
85ec09b8de bugfix/add-limits-for-extra-packages
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
kklinch0
e0a63c32b0 bugfix/fix-monitoring-resources
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-20 14:59:47 +03:00
klinch0
a2af07d1dc bugfix/fix-longterm (#697)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Updated the remote write configuration to support multiple endpoints,
allowing data ingestion from both short-term and long-term services for
improved flexibility.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-03-14 10:58:16 +01:00
Timofei Larkin
0cb9e72f99 Merge pull request #695 from klinch0/feature/add-presets
feature/add-presets
2025-03-13 19:25:55 +04:00
kklinch0
b4584b4d17 fix rebbit template 2025-03-13 18:22:40 +03:00
kklinch0
ea3b092128 feature/add-presets 2025-03-13 17:03:00 +03:00
Timofei Larkin
728743db07 Merge pull request #688 from cozystack/release-v0.28.0
Prepare release v0.28.0
2025-03-13 17:17:59 +04:00
Andrei Kvapil
3d03b22775 Prepare release v0.28.0
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-13 16:02:07 +03:00
Andrei Kvapil
d2210df9ec Fix conflict of deploying KubeVirt Instancetypes (#691)
We have common-instance-types deployed by kubevirt-operator and separate
helm-chart.
In this case operator always overrides the resource by default ones.

This PR disables common-instance-types deployment for kubevirt-operator

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the package’s internal identifier to enhance consistency
across integration points.

- **New Features**
- Introduced a new configuration option that enables a deployment toggle
for managing instance types (disabled by default).

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-12 15:28:20 +01:00
xy2
423514b338 Add missing permissions to the Linstor plunger. (#693)
The Linstor satellite creates problems with admin privileges, so the
plunger needs the same privileges to fix those problems.

Also, use the native `losetup`. The Linstor image has a wrapper with an
additional function that we do not need here.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Improved the management of unused loop devices with clearer feedback
and refined error handling.
  
- **New Features**
- Enhanced container configuration by adding elevated system
permissions, allowing the container to perform higher-level operations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-12 15:28:04 +01:00
Andrei Kvapil
0e10f95293 Merge pull request #687 from cozystack/kube-ovn-webhook
Move source-ip validation from cilium to kube-ovn side
2025-03-11 12:02:33 +01:00
Andrei Kvapil
478a3f5191 Apply suggestions from code review
Co-authored-by: Timofei Larkin <lllamnyp@gmail.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-11 11:54:22 +01:00
Andrei Kvapil
750e452abc Move source-ip validation from cilium to kube-ovn side
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-11 00:27:27 +01:00
Andrei Kvapil
d266bde1ad Merge pull request #690 from cozystack/kube-ovn-v1.13.3
Upd: kubeovn v1.13.3
2025-03-11 00:17:23 +01:00
Andrei Kvapil
e904f6bb16 Upd: kubeovn v1.13.3
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-11 00:16:54 +01:00
Andrei Kvapil
2c33c4d78c Merge pull request #689 from cozystack/kube-ovn-fix
kube-ovn: Fix flush ip xfrm policy rule error
2025-03-10 23:33:38 +01:00
Andrei Kvapil
cc2a1db651 kube-ovn: Fix flush ip xfrm policy rule error 2025-03-10 23:32:10 +01:00
Andrei Kvapil
d18f4d311f Add label to link repository for the packeges 2025-03-10 22:51:47 +01:00
Andrei Kvapil
588c491f4c Add label to link repository for the packeges 2025-03-10 22:15:18 +01:00
Andrei Kvapil
abf4ea129c Merge pull request #686 from cozystack/enable-isolated-by-default
Enable tenant isolation by default
2025-03-10 21:17:52 +01:00
Andrei Kvapil
5778a68501 Merge pull request #684 from cozystack/neutral-repository
Move project from aenix-io to cozystack repository
2025-03-10 21:17:18 +01:00
Andrei Kvapil
3d962685ce Move project from aenix-io to cozystack repository
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-10 21:16:58 +01:00
Andrei Kvapil
99a53b8f9a Enable tenant isolation by default
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-10 21:09:23 +01:00
Andrei Kvapil
e7fa940139 Merge pull request #676 from klinch0/feature/add-vpa
feature/add-vpa-for-monitoring
2025-03-10 21:04:48 +01:00
Andrei Kvapil
e836b62a1e Merge pull request #681 from kingdonb/flux-oper-0.17
Update Flux Operator
2025-03-10 21:02:20 +01:00
Andrei Kvapil
023058d3a1 Merge pull request #683 from cozystack/kubeovn-v1.13.3
kubeovn v1.13.3
2025-03-10 21:01:55 +01:00
Andrei Kvapil
9daf4c90df Merge branch 'main' into kubeovn-v1.13.3 2025-03-10 21:01:36 +01:00
Andrei Kvapil
618b04e634 Merge pull request #682 from cozystack/cilium-v1.17.1
Upd: cilium v1.17.1
2025-03-10 21:01:00 +01:00
kklinch0
8a030058eb fix image 2025-03-10 12:39:18 +03:00
kklinch0
a6a0752feb add metric-labels-allowlist 2025-03-10 12:35:58 +03:00
kklinch0
6354b564b4 update monitoring-agents stack 2025-03-10 12:04:50 +03:00
kklinch0
40aa65caba f 2025-03-10 10:02:43 +03:00
kklinch0
fd5ab80f2e fix-values 2025-03-10 10:02:12 +03:00
kklinch0
aa084b4635 feature/add-vpa-for-monitoring 2025-03-10 10:02:12 +03:00
Andrei Kvapil
16e40372bd Upd: kubeovn v1.13.3 2025-03-07 12:56:11 +01:00
Andrei Kvapil
b1a969b1d0 Upd: cilium v1.17.1 2025-03-07 12:50:02 +01:00
Kingdon B
b4973945eb Update Flux Operator
Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-03-06 23:04:24 -05:00
Timofei Larkin
8267072da2 Merge pull request #678 from aenix-io/release-v0.27.0
Prepare release v0.27.0
2025-03-06 22:17:33 +04:00
Timofei Larkin
8dd8a718a7 Prepare release v0.27.0 2025-03-06 18:54:54 +03:00
Timofei Larkin
63358e3e6c Recreate etcd pods and certificates on update (#675)
Since updating from 2.5.0 to 2.6.0 renews all certificates including the
trusted CAs for etcd, all etcd pods need a restart. As many people had
problems with their etcds post-update, we decided to script the
procedure, so the renewal of certs and pods is done automatically and in
a predictable order. The hook fires when upgrading from <2.6.0 to
>=2.6.1.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
	- Introduced pre- and post-upgrade tasks for the etcd package.
- Added supporting resources, including roles, role bindings, service
accounts, and a ConfigMap to facilitate smoother upgrade operations.
- **Chores**
	- Updated the application version from 2.6.0 to 2.6.1.
	- Revised version mapping entries to reflect the updated release.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-06 15:24:12 +01:00
xy2
063439ac94 Raise maxLabelsPerTimeseries for VictoriaMetrics vminsert. (#677) 2025-03-06 14:44:58 +01:00
xy2
a7425b0caf Add linstor plunger scripts as sidecars (#672)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced new management scripts for automating controller
maintenance tasks and satellite operations, including automatic cleanup
and reconnection of resources.
- Added dynamic configuration templates to streamline container image
management and refine deployment settings for enhanced security and
monitoring.
- Rolled out updated configurations for satellite deployments, improving
network performance and resource handling.
- New YAML configurations for Linstor satellites have been added,
enhancing TLS management and pod specifications.

- **Revert**
- Removed legacy configuration resources to consolidate and simplify
system management.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-05 18:49:05 +01:00
Timofei Larkin
9ad81f2577 Patch kamaji to hardcode ControlplaneEndpoint port (#674)
When the port number is omitted in
TenantControlPlane.spec.controlPlane.ingress.hostname, Kamaji defaults
to using the assigned port number from TenantControlPlane.status. But
usually the internal port is 6443, while we expect to have 443 for
external clients of the k8s API. If we explicitly do hostname:443, we
run into clastix/kamaji#679, so for a quick hotfix we are temporarily
hardcoding 443.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated the control plane connection configuration to consistently use
the secure port 443, ensuring reliable and predictable endpoint
behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-05 18:46:11 +01:00
Andrei Kvapil
485b1dffb7 Merge pull request #673 from aenix-io/KubevirtMachineTemplate
Kubernetes: fix namespace for KubevirtMachineTemplate
2025-03-05 17:55:36 +01:00
Andrei Kvapil
1877f17ca1 Kubernetes: fix namespace for KubevirtMachineTemplate
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-05 16:28:38 +01:00
Andrei Kvapil
43e593c72d Merge pull request #670 from aenix-io/upd-etcd-operator0.4.1
Update etcd-operator v0.4.1
2025-03-05 15:26:02 +01:00
Andrei Kvapil
159d0a2294 Update etcd-operator v0.4.1 2025-03-05 15:25:38 +01:00
Andrei Kvapil
6765f66e11 Merge pull request #669 from aenix-io/upd-cozy-proxy-0.1.3
Update cozy-proxy v0.1.3
2025-03-05 15:04:04 +01:00
Andrei Kvapil
73215dca16 Update cozy-proxy v0.1.3 2025-03-05 15:03:20 +01:00
Andrei Kvapil
85499e2bdc Merge pull request #668 from aenix-io/bump-monitoring2
bump monitoring chart version
2025-03-05 15:00:41 +01:00
Andrei Kvapil
06daf34102 bump monitoring chart version 2025-03-05 15:00:14 +01:00
Andrei Kvapil
47dfaaafe1 Update Cluster-API and providers (#667)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
  - Introduced dynamic IP address management support.
- Enabled comprehensive lifecycle hooks that trigger during both
installation and upgrades.
- Expanded configuration options with new fields for flexible
deployments and customizations.

- **Chores**
  - Upgraded the application and chart versions.
- Improved deployment settings with enhanced health checks, diagnostic
endpoints, and service account management.
- Updated provider versions to enhance overall stability and
performance.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-05 14:52:23 +01:00
xy2
c60b7c0730 Import Piraeus dashboard and alerts. (#658)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Expanded the monitored dashboards with a new storage dashboard entry.
- Introduced proactive alert configurations that cover key storage
components.
- Added templated alert management to streamline dynamic configuration.
- Enhanced metric collection by integrating monitoring endpoints for
storage components.
- Delivered a comprehensive dashboard offering real-time insights into
storage performance.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-05 14:51:23 +01:00
Andrei Kvapil
266d097cab Fix regression for updating Kamaji (#665)
This fix introduced Kamaji update
https://github.com/aenix-io/cozystack/pull/633
But helm chart didn't actually updated

This affected issue with creating new clusters.
Ref https://github.com/clastix/kamaji/issues/623

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Revised application and chart version information alongside updated
dependency requirements.
- **New Features**
- Added new configuration options for tenant control planes, including
enhanced network and load balancer settings.
- **Documentation**
- Updated version indicators and clarified configuration details for
default datastore behavior.
- **Bug Fixes**
- Improved deployment stability by conditionally applying the default
datastore setting to avoid potential errors.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-05 14:14:41 +01:00
Andrei Kvapil
d4452ea708 Merge pull request #660 from xy2/169-victoria-limits
Increase VMSelect default cpu limit
2025-03-05 14:13:32 +01:00
Andrei Kvapil
ec603bc3ef CAPI-operator: Remove the invalid caBundle (#666)
Upstream:
- https://github.com/kubernetes-sigs/cluster-api-operator/issues/590
- https://github.com/kubernetes-sigs/cluster-api-operator/pull/591

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Removed an outdated internal configuration setting for webhook
communication. This cleanup streamlines the system’s setup while keeping
public functionality unchanged.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-05 14:05:47 +01:00
Timofei Larkin
48af411878 Merge pull request #664 from klinch0/feature/change-severity-cert-alerts
feature/change-severity-for-kube-client-certificate-expiration
2025-03-05 13:56:23 +04:00
Timofei Larkin
57d0a236df Merge pull request #656 from klinch0/feature/add-workloads
feature/add-workloads
2025-03-05 13:46:19 +04:00
kklinch0
554d5dbbca feature/change-severity-for-kube-client-certificate-expiration 2025-03-05 12:41:26 +03:00
kklinch0
0793b1eaf6 feature/add-workload-monitors 2025-03-05 12:15:23 +03:00
Timofei Larkin
425ce77f60 Merge pull request #655 from klinch0/feature/add-multi-dc
feature/add-multi-dc-for-pg
2025-03-05 12:51:36 +04:00
kklinch0
88729e4124 rename globalAppTopologySpreadConstraints 2025-03-05 11:39:41 +03:00
Timofei Larkin
48f6a248c8 Merge pull request #663 from klinch0/bugfix/mv-ds-var
bugfix/mv-ds-var-for-goldpinger
2025-03-05 12:08:32 +04:00
kklinch0
9714b130a8 bugfix/mv-ds-var-for-goldpinger 2025-03-05 11:05:59 +03:00
kklinch0
4cce138d31 feature/add-topologyspreadconstraints-pg 2025-03-05 10:41:43 +03:00
Timofei Larkin
e7d6f2dfa3 Merge pull request #661 from klinch0/feature/add-ch-monitoring
feature/add-ch-dashboard
2025-03-04 20:22:15 +04:00
Timofei Larkin
b68a72614a Add TL as codeowner (#662)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated internal team management roles to include an additional
responsible contributor.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-04 17:19:21 +01:00
kklinch0
36b66a681d feature/add-ch-dashboard 2025-03-04 10:53:51 +03:00
Denis Seleznev
3e273c03b6 Increase the default cpu limit for vminsert. 2025-03-03 19:31:27 +01:00
Denis Seleznev
da0437a774 Make it possible to set cpu limit too. 2025-03-03 19:31:05 +01:00
Denis Seleznev
78cff8c223 Change defaults calculation logic. 2025-03-03 19:18:24 +01:00
Andrei Kvapil
8c4605284c Prepare release v0.26.1 (#659)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
  - Upgraded core platform components to version **v0.26.1**.
- Refreshed container images for key services including backups,
caching, autoscaling, dashboard integrations, and cloud providers.
- These updates improve overall stability, consistency, and performance
across the system.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-01 21:04:40 +01:00
Andrei Kvapil
f708dc2043 VirtualMachine: Fix WholeIP enum check (#657)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the virtual machine component to version 0.8.2, ensuring more
reliable version references.
- Standardized a configuration option's casing to maintain consistency.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-01 11:08:03 +01:00
Andrei Kvapil
160e4e2a32 Update installation manifests
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-27 14:06:50 +01:00
xy2
79eadda494 Escape mustaches in prometheus rules for Helm. (#645)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a dynamic alert configuration system that aggregates
multiple alert settings into a single, streamlined document for easier
management.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-27 13:16:54 +01:00
Timofei Larkin
3da1a4ed92 Merge pull request #654 from aenix-io/release-v0.26.0
Prepare release v0.26.0
2025-02-27 15:59:11 +04:00
Timofei Larkin
a5dc2d5382 Prepare release v0.26.0 2025-02-27 11:51:46 +03:00
Timofei Larkin
705eb06078 Merge pull request #651 from aenix-io/linstor-snapshots
linstor: add basic snapshot functionality
2025-02-27 11:16:26 +04:00
Andrei Kvapil
e735f96555 kubevirt: Enable live-migration by default (#652)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Expanded configuration options now include the ability to enable live
migration for virtual machine management, offering smoother transitions
and enhanced flexibility.
- Introduced a new eviction strategy for managing virtual machine
evictions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-26 23:18:01 +01:00
Andrei Kvapil
f976ff8ed3 Upd cilium v1.16.7 (#653)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a configurable option for adjusting the Envoy access log
buffer size, allowing users to better tune log handling.
	- Improved startup feedback with more prompt service restarts.

- **Chores**
	- Upgraded all core components to version 1.16.7.
- Updated documentation and configuration settings to reflect the latest
release.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-26 23:17:34 +01:00
Andrei Kvapil
9ae6b2b0da linstor: add basic snapshot functionality
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-26 19:44:42 +01:00
Andrei Kvapil
86bb64000e Add new info logo in common style (#649)
New info icon for Cozystack

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Viktoriia Kvapil <159528100+kvapsova@users.noreply.github.com>
2025-02-25 15:12:06 +01:00
Kingdon Barrett
19e0e4c2dc Flux Operator v0.15 (#631)
A new release of the Flux Operator (v0.15.0) - to go with the newly
created Flux v2.5.0 release

(And to go with that, a new version of the flux-instance chart.)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Summary by CodeRabbit

- **New Features**
- Introduced enhanced operator capabilities by adding new resource
types, including `ResourceSetInputProvider` and `ResourceSet`.
- Expanded configuration options for deployments, including settings for
artifact pull secrets and customizable synchronization intervals.
- Added support for multitenancy and role-based access control
configurations.

- **Documentation**
- Updated version information and badges to reflect the upgrade to
version 0.15.0.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-02-25 14:57:49 +01:00
Kingdon Barrett
86724a6860 Upgrade to Flux 2.5.0 (#640)
Flux v2.5 is out:

* https://github.com/fluxcd/flux2/releases/tag/v2.5.0

* https://fluxcd.io/blog/2025/02/flux-v2.5.0/

🎉 🏆 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded the FluxCD system from version 2.4.x to 2.5.x for improved
integration and performance.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-02-25 14:56:48 +01:00
klinch0
a226fdd242 bugfix/fix-nil-pointer (#643)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced dashboard and identity management displays with updated
branding and localization settings, ensuring a refreshed user interface
and experience.
  
- **Style**
- Streamlined dashboard appearance by removing legacy custom styling,
resulting in a more consistent and contemporary look.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-25 14:54:23 +01:00
klinch0
e2369bae68 feature/add-quota (#644)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new configurable parameter for tenant resource quotas,
enabling flexible CPU and memory management.
	- Added a new YAML template for Kubernetes ResourceQuota configuration.
	- Updated application version to 1.8.0.
- **Documentation**
- Added documentation for the new `resourceQuotas` parameter in tenant
configuration.
- **Chores**
	- Updated versioning entries for the tenant application.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-25 14:53:52 +01:00
Timofei Larkin
46f0bb2078 Merge pull request #620 from aenix-io/chore/improve-etcd-tls
Improve TLS handling in etcd helm chart
2025-02-25 17:29:54 +04:00
Timofei Larkin
6ff8b527ea Merge branch 'main' into chore/improve-etcd-tls 2025-02-25 13:38:58 +03:00
Timofei Larkin
0f87c73051 Improve TLS handling in etcd helm chart
1. Add a `commonName` to every certificate.
2. Move 127.0.0.1 from DNS names to IP Addresses in the certificate
   spec.
3. Add **client** auth usage to the etcd-**server** certificate (yes,
   that's necessary), because etcd queries itself using its
   [server cert as a client cert](https://github.com/etcd-io/etcd/issues/9785#issuecomment-432438748).
4. Default all CA certificates' durations to 10 years.
5. Set subject org to release namespace and OU to name so that subjects
   are unique
2025-02-25 13:36:46 +03:00
klinch0
d0d62e8847 feature/add-goldpinger (#648)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive Grafana dashboard for Goldpinger, offering
real-time insights into node health, error occurrences, and response
times with intuitive filtering.
- Expanded deployment configurations to include Goldpinger across
environments, streamlining release management and dependency handling.
- Launched a dedicated deployment package featuring customizable
templates for secure, efficient Kubernetes deployments—including
workloads, services, ingress, and monitoring integrations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-25 10:08:08 +01:00
xy2
439381e474 Allow lookup function in 'make diff'. (#647)
Many applications require the lookup function on the live server. Allow
its usage as well as in `make show`.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the release diff operation to simulate the upgrade process on
the server side, ensuring a safer preview without applying changes.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-25 09:54:18 +01:00
Timofei Larkin
a6a95b0091 Merge pull request #633 from aenix-io/119-update-kamaji
Update kamaji version, fix kubernetes chart for compat with new kamaji version
2025-02-25 12:50:28 +04:00
Timofei Larkin
392cd862e9 Update scripts/migrations/9
Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-25 10:38:17 +03:00
Timofei Larkin
b32106484f New schema version 10
BREAKING: all kuberneteses will be upgraded to chart version 0.15.1
2025-02-24 16:33:21 +03:00
Timofei Larkin
77df31e105 Merge branch 'main' into 119-update-kamaji 2025-02-24 13:15:28 +03:00
Timofei Larkin
24fa722276 Merge pull request #642 from aenix-io/release-0.25.3
Prepare release v0.25.3
2025-02-22 11:41:53 +04:00
Timofei Larkin
0211c57bed Prepare release v0.25.3 2025-02-22 10:33:32 +03:00
Timofei Larkin
135b0609b4 Merge pull request #638 from klinch0/feature/move-kubeconfig
feature/mv-kubeconfig
2025-02-21 13:57:33 +04:00
Floppy Disk
6c73e3f3ae feature/mv-kubeconfig 2025-02-20 15:23:54 +03:00
Timofei Larkin
bc95159a80 Merge pull request #634 from aenix-io/release/v0.25.2
Prepare release v0.25.2
2025-02-18 21:03:29 +04:00
Timofei Larkin
0f68db6793 Merge pull request #635 from klinch0/feature/update-limits
feature/add-more-resources
2025-02-18 20:01:09 +03:00
Floppy Disk
9a55747885 add more resources 2025-02-18 17:40:54 +03:00
Timofei Larkin
bd90eb267f Prepare release v0.25.2 2025-02-18 17:22:41 +03:00
Timofei Larkin
a31c3a5796 Update kamaji version
* Stripped port number from KamajiControlPlane hostname due to clastix/kamaji#679
* Bumped versions for kamaji and dependent charts
2025-02-18 10:52:15 +03:00
Timofei Larkin
7d5b22e662 Merge pull request #632 from klinch0/feature/add-white-label
feature/add-wl
2025-02-17 14:03:25 +04:00
Floppy Disk
42f1dabc31 add wl 2025-02-14 17:47:37 +03:00
Timofei Larkin
eefef8b09f Merge pull request #626 from klinch0/feature/add-workloadmonitors-roles
feature/add-workloadmonitors-roles
2025-02-13 17:58:33 +04:00
Timofei Larkin
93c4616115 Merge pull request #630 from aenix-io/release-0.25.1
Prepare release v0.25.1
2025-02-13 17:32:52 +04:00
Andrei Kvapil
1f6ea333b6 Prepare release v0.25.1
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-13 16:00:02 +03:00
Floppy Disk
4cc48e6f34 add-workloadmonitors-roles 2025-02-13 13:33:35 +03:00
Timofei Larkin
ecfb02a76f Merge pull request #625 from klinch0/feature/add-kafka-monitoring
feature/add-kafka-monitoring
2025-02-13 14:21:52 +04:00
Floppy Disk
cc0222aa11 fix dashboard 2025-02-13 13:09:34 +03:00
Andrei Kvapil
65036e8145 Upd cozy-proxy to fix reconciliation logic (#629) 2025-02-12 18:41:27 +01:00
Andrei Kvapil
e2e32096a3 Fix VM services selector (#627)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated deployment configurations with the latest application versions
(0.8.1 and 0.5.1) to ensure improved stability and compatibility.
- **Bug Fixes**
- Enhanced service connectivity by refining the criteria used for
routing requests to the correct application endpoints.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-12 14:03:10 +01:00
Floppy Disk
84a23947b0 fix 2025-02-12 11:45:16 +03:00
Floppy Disk
d234d58a16 update kafka operator version 2025-02-10 13:29:59 +03:00
Floppy Disk
b75aaf177b add kafka monitoring 2025-02-10 13:29:17 +03:00
klinch0
87328a6ff3 Merge branch 'aenix-io:main' into main 2025-02-10 12:58:03 +03:00
Andrei Kvapil
3fa4dd3af9 Prepare release v0.25.0 (#622)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded multiple system components to the latest version, ensuring
improved performance, stability, and enhanced security.
- Updated deployment and testing configurations across the platform for
a more reliable user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-09 11:41:28 +01:00
Andrei Kvapil
6245976d3e Fix bootbox chartname (#623) 2025-02-08 22:07:24 +01:00
Andrei Kvapil
dacabe6317 Update cozy-proxy v0.1.1 (#624) 2025-02-08 22:07:12 +01:00
Andrei Kvapil
bf68404c53 Update Talos v1.9.3 (#617)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded the core installer and related system images from version
v1.9.2 to v1.9.3.
- Refreshed firmware and driver references for improved consistency
across all installation profiles.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-07 14:47:45 +01:00
Andrei Kvapil
5f40685161 fix: running=false for the VMs (#621)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Revised Virtual Machine configuration to require explicit confirmation
for the running state. The system no longer auto-activates instances by
default, giving users more direct control over instance activation.
Existing validations continue to ensure that only valid configurations
are applied, resulting in a more reliable deployment process.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-07 13:53:56 +01:00
Andrei Kvapil
f768dc1632 Introduce externalMethod=WholeIP for VMs (#616)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new configuration option for specifying the method to
handle external traffic. Users can now choose between "WholeIP" and
"PortList" (default) across virtual machine and instance deployments.
- Service settings now adjust automatically based on the selected
external traffic method.

- **Documentation**
- Updated configuration guides to include details on the new
`externalMethod` parameter and its usage for managing external traffic.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-07 08:40:49 +01:00
Andrei Kvapil
1a88883a3b Update cilium v1.16.6 (#618)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced proxy configuration with dedicated endpoints for metrics,
administration, and health checks.

- **Documentation**
- Updated displayed version number and badge to v1.16.6 for improved
clarity.

- **Chores**
- Upgraded component versions and image digests from v1.16.5 to v1.16.6.
- Streamlined configuration by removing legacy conditional settings and
obsolete CORS directives.
- Refined formatting of tag filters for clearer configuration
management.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-06 13:51:46 +01:00
klinch0
a42f98e04c Merge branch 'aenix-io:main' into main 2025-02-06 15:51:19 +03:00
klinch0
842d3e55bc feature/add-flux-dashboards (#619)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new dashboard for Flux Control Plane monitoring that
visualizes key performance metrics like CPU, memory, API requests, and
more.
- Added a second dashboard for Flux Cluster Stats to display resource
reconciliation, operation durations, and readiness indicators.
- Seamlessly integrated these dashboards into the monitoring workflow
with dynamic querying and periodic refresh options.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-06 13:50:57 +01:00
klinch0
f02397aab5 Merge branch 'aenix-io:main' into main 2025-02-06 15:41:38 +03:00
klinch0
5a47754a92 feature/add-etcd-vm-node-scrape (#614)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced system monitoring with a new configuration option to collect
etcd metrics. Users can now enable the scraping of etcd metrics via
updated settings, which improves observability.
- Introduced a secure proxy mechanism that conditionally routes metrics
data from etcd, offering administrators greater control over monitoring
capabilities.
- New configuration sections added to various bundles to support etcd
metrics scraping.
  
- **Bug Fixes**
- Removed outdated configuration for VMNodeScrape resource, ensuring
clarity and accuracy in monitoring configurations.

- **Chores**
- Added new service accounts, roles, and bindings to facilitate secure
access for monitoring etcd metrics.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-06 13:40:30 +01:00
Andrei Kvapil
d91bc52594 Introduce cozy-proxy (#615)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added a new proxy component to enhance deployment orchestration and
dependency management.
- Introduced dynamic update capabilities for fetching and deploying the
latest assets.
- Enabled configurable settings for container images, networking, and
access control.
- Incorporated streamlined resource naming and labeling for improved
management.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-06 12:11:01 +01:00
klinch0
f67816e2d3 Merge branch 'aenix-io:main' into main 2025-02-05 15:38:21 +03:00
klinch0
861e6c464b feature/add-client-etcd-monitoring (#613)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced etcd monitoring with new metrics exposure, pod scraping
configuration, and comprehensive alert rules for proactive
observability.
- Introduced a new `VMPodScrape` resource for improved pod metrics
collection.
- Added a new PrometheusRule configuration for monitoring etcd clusters
with various alert conditions.
- **Chores**
  - Upgraded the etcd release from version 2.4.0 to 2.5.0.
- Consolidated and renamed monitoring dashboard references for better
consistency.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-05 13:13:28 +01:00
Andrei Kvapil
835ee117f7 Rebiild matchobx (#612)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded the deployment Docker image to version 0.24.1, ensuring
improved stability and potential performance enhancements.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-03 16:42:04 +01:00
klinch0
e5e14722b8 Merge branch 'aenix-io:main' into main 2025-01-30 00:27:55 +03:00
Andrei Kvapil
af48519d65 Prepare release v0.24.1 (#611)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-29 11:39:30 +01:00
Andrei Kvapil
d6e9765604 Revert kamaji edge-24.12.1 (#610)
Due to upstream issue: https://github.com/clastix/kamaji/issues/679

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
	- Updated Kamaji application version to v1.0.0
	- Modified dependency version constraints for kamaji-etcd

- **Documentation**
	- Updated README with new version information
- Clarified configuration descriptions for DataStore and network
profiles

- **Chores**
	- Updated Chart version to 2.0.0
	- Simplified configuration management in deployment templates
	- Updated Dockerfile to use a different source code version

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-29 10:48:12 +01:00
Andrei Kvapil
0ab39f207c Prepare release v0.24.0 (#606)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes for CozyStack v0.24.0

- **Image Updates**
  - Upgraded CozyStack core components to version v0.24.0
- Updated multiple system images, including cluster-autoscaler, kubevirt
cloud provider, and CSI driver
  - Refreshed images for dashboard, API, and controller components
  - Updated Grafana image to version 1.8.0

- **Infrastructure Changes**
- Replaced `darkhttpd` container with new `assets` container in
deployment configuration
  - Updated image digests across various system components

- **Version Bump**
  - Incremented CozyStack version from v0.23.1 to v0.24.0
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-27 19:44:54 +01:00
Andrei Kvapil
ef2e065c77 Update Grafana, build plugins (#607)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Updated Grafana to version 11.4.0
- Added new Grafana plugins: VictoriaMetrics logs datasource, Natel
Discrete Panel, and Worldmap Panel

- **Improvements**
  - Enhanced Grafana image build process
  - Dynamically manage Grafana image versioning
  - Updated plugin installation method

- **Version Update**
  - Monitoring package version bumped from 1.7.0 to 1.8.0

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-27 18:45:48 +01:00
Andrei Kvapil
80b4c151bd Replace darkhttpd with cozystack-assets-server (#596)
fixes https://github.com/aenix-io/cozystack/issues/602
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
	- Introduced a new custom assets server for serving static files
	- Replaced `darkhttpd` with a custom Go-based file server

- **Improvements**
	- Updated base images to Alpine Linux 3.21
	- Simplified container dependencies
	- Enhanced server configuration with command-line flags

- **Infrastructure**
	- Rebuilt Kubernetes deployment configuration for assets service
	- Updated server startup parameters and container settings
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-27 13:57:33 +01:00
klinch0
719cedde02 Merge branch 'aenix-io:main' into main 2025-01-27 15:15:11 +03:00
Andrei Kvapil
cc5eb4765c Introduce BootBox (#601)
- Introduce tinkerbell essentials
- Introduce bootbox


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

# Release Notes: BootBox Package (v0.1.0)

## New Features
- Added BootBox, a PXE hardware provisioning service.
- Introduced network boot configuration with Matchbox and Smee.
- Enabled hardware management through Kubernetes Custom Resource
Definitions.
- Added support for managing physical machine specifications and
configurations.
- New HelmRelease configuration for streamlined deployment.
- Added new application entry for BootBox in the configuration.

## Configuration
- Supports configuring physical machine instances.
- Provides flexible network boot and DHCP settings.
- Includes role-based access control (RBAC) configurations.
- New parameters for trusted proxies and syslog settings.
- Enhanced configuration options for deployment parameters and resource
allocations.
- Introduced new schema for validating configuration values.

## Deployment
- Deployed in `tenant-root` namespace.
- Optional and privileged installation.
- Depends on Cilium and KubeOVN networking components.
- Configurable deployment strategies and resource allocations.
- Introduced new Service and Ingress resources for improved traffic
management.
- Added support for host networking and public IP configurations.

## Compatibility
- Supports single-node and multi-node cluster configurations.
- Compatible with Kubernetes environments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-27 10:56:23 +01:00
Andrei Kvapil
d557050eca Fix vm-update hook access to api (#597)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Version Updates**
	- Updated `virtual-machine` package version from `0.7.0` to `0.7.1`
	- Updated `vm-instance` package version from `0.4.0` to `0.4.1`

- **Configuration Changes**
- Added new policy annotation `policy.cozystack.io/allow-to-apiserver:
"true"` to update hook templates for both virtual machine and VM
instance

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-27 10:56:04 +01:00
klinch0
469d1e9801 Merge branch 'aenix-io:main' into main 2025-01-23 17:48:38 +03:00
Sergio
05857b954d Fix kamaji make update (#598)
Fix makefile update target for kamaji
Fix Dockerfile for kamaji (golang:1.23 as builder)
kamaji Chart updated

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

- **New Features**
- Enhanced DataStore configuration with more flexible inheritance and
schema definition
  - Added support for advanced network profile settings

- **Improvements**
  - Updated Kamaji application to version `edge-24.12.1`
  - Upgraded Go runtime to version 1.23
  - Improved documentation for DataStore and configuration settings

- **Dependency Updates**
  - Updated `kamaji-etcd` dependency to version `>=0.8.1`

- **Version Changes**
  - Reset application and chart versions to `0.0.0`

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: SSR <sergey.rabinovich@nexign.com>
2025-01-23 15:28:46 +01:00
klinch0
81819661dc Merge branch 'aenix-io:main' into main 2025-01-21 16:31:57 +03:00
klinch0
06afcf27a3 feature/fix-k8s-config-with-OIDC (#594)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
	- Updated tenant application version from 1.6.6 to 1.6.7
	- Updated version tracking in package management system
	- Minor configuration adjustments in kubeconfig template
- Enhanced logic for determining API server endpoint based on kubeconfig
presence
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-20 14:13:57 +01:00
Timofei Larkin
9587caa4f7 Update cert-manager (#595)
Bumped the embedded cert-manager chart to the latest upstream version.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

Based on the comprehensive changes across multiple files in the
cert-manager Helm chart, here are the release notes:

- **New Features**
	- Added support for dynamic TLS serving certificates for metrics
- Enhanced Prometheus monitoring configuration with ServiceMonitor and
PodMonitor options
	- Introduced more flexible IP family configuration for services

- **Improvements**
	- Updated cert-manager to version v1.16.3
- Expanded configuration options for controller, webhook, and CA
injector
	- Improved RBAC permissions and service account management
	- Enhanced documentation and configuration guidance

- **Bug Fixes**
	- Deprecated `installCRDs` option in favor of more explicit settings
	- Refined namespace and resource selection for webhooks

- **Chores**
	- Updated Helm chart dependencies and compatibility
	- Improved template rendering and configuration management

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-20 14:13:18 +01:00
klinch0
2f0d0924a7 Merge branch 'aenix-io:main' into main 2025-01-20 12:05:14 +03:00
Andrei Kvapil
2a976afe99 Prepare release v0.23.1 (#593)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-18 15:41:43 +01:00
Andrei Kvapil
fb723bc650 Fix dashboard error: Unable to get installed package (#592)
This PR includes fix for dashboard

dd02680d79

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-18 15:18:28 +01:00
Andrei Kvapil
e23286a336 Prepare release v0.23.0 (#591)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes for Cozystack v0.23.0

- **Image Updates**
  - Upgraded core Cozystack components to version v0.23.0
- Updated multiple system and application images across various packages
- Refreshed image digests for components like Kubernetes, backup, and
infrastructure tools

- **Version Bump**
  - Incremented overall system version from v0.22.0 to v0.23.0
  - Updated configuration and deployment manifests accordingly

- **System Components**
  - Updated Cozystack API, Controller, and Dashboard configurations
- Refreshed image references for Kamaji, KubeOVN, and other system
services

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-17 18:23:53 +01:00
Andrei Kvapil
2f5336388c Add hooks to update instanceType, instanceProfile, and storage (#590)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
	- Added update hook for Virtual Machine configurations
- Enhanced version management for virtual machine and VM instance
packages

- **Version Updates**
	- Virtual Machine package version updated from 0.6.0 to 0.7.0
	- VM Instance package version updated from 0.3.0 to 0.4.0

- **Improvements**
- Introduced dynamic configuration update mechanisms for Kubernetes
deployments
- Added service account and role permissions for VM configuration
management

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-17 17:06:39 +01:00
klinch0
af58018a1e Bugfix/fix kk configure reconciliation (#589)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Configuration Update**
- Added a new `configHash` field in the `keycloak-configure` release for
both `paas-full` and `paas-hosted` configurations.
- Introduced a SHA256 checksum mechanism for the `cozyConfig` data to
enhance configuration integrity checks.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-17 17:05:48 +01:00
Andrei Kvapil
cfb171b000 Update Talos Linux v1.9.2 (#588)
fixes https://github.com/aenix-io/cozystack/issues/541
2025-01-17 14:50:54 +01:00
klinch0
e037cb0e3e feature/add-tg-severity-setting (#585)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added option to disable Telegram alerts for specific severity levels
in the Monitoring Hub.

- **Documentation**
- Updated README with new parameter
`alerta.alerts.telegram.disabledSeverity`.

- **Chores**
  - Bumped monitoring package version from 1.6.1 to 1.7.0.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-17 14:49:33 +01:00
Kingdon Barrett
749110aaa2 Update fluxcd-operator to 0.13.0 (#586)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added support for common metadata (annotations and labels) in Flux
instance configuration
  - Introduced a `name` field for sync configuration in Flux instance

- **Version Updates**
  - Upgraded Flux Operator chart from v0.12.0 to v0.13.0
  - Upgraded Flux Instance chart from v0.12.0 to v0.13.0

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-01-17 14:48:00 +01:00
klinch0
59b4a0fb91 bugfix/monitoring add nil checker (#587)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Version Update**
  - Monitoring application version updated from 1.6.1 to 1.6.2
- **Configuration Improvements**
  - Enhanced resource configuration checks for VM cluster components
- Improved handling of resource definitions to prevent potential errors

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-17 14:47:29 +01:00
klinch0
191c8b4061 Merge branch 'aenix-io:main' into main 2025-01-16 15:26:08 +03:00
Floppy Disk
9de782e719 fix 2025-01-16 13:23:56 +03:00
937 changed files with 95991 additions and 22834 deletions

2
.github/CODEOWNERS vendored
View File

@@ -1 +1 @@
* @kvaps
* @kvaps @lllamnyp @klinch0

53
.github/workflows/backport.yaml vendored Normal file
View File

@@ -0,0 +1,53 @@
name: Automatic Backport
on:
pull_request_target:
types: [closed] # fires when PR is closed (merged)
concurrency:
group: backport-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
permissions:
contents: write
pull-requests: write
jobs:
backport:
if: |
github.event.pull_request.merged == true &&
contains(github.event.pull_request.labels.*.name, 'backport')
runs-on: [self-hosted]
steps:
# 1. Decide which maintenance branch should receive the backport
- name: Determine target maintenance branch
id: target
uses: actions/github-script@v7
with:
script: |
let rel;
try {
rel = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
} catch (e) {
core.setFailed('No existing releases found; cannot determine backport target.');
return;
}
const [maj, min] = rel.data.tag_name.replace(/^v/, '').split('.');
const branch = `release-${maj}.${min}`;
core.setOutput('branch', branch);
console.log(`Latest release ${rel.data.tag_name}; backporting to ${branch}`);
# 2. Checkout (required by backportaction)
- name: Checkout repository
uses: actions/checkout@v4
# 3. Create the backport pull request
- name: Create backport PR
uses: korthout/backport-action@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
label_pattern: '' # don't read labels for targets
target_branches: ${{ steps.target.outputs.branch }}

View File

@@ -1,6 +1,12 @@
name: Pre-Commit Checks
on: [push, pull_request]
on:
pull_request:
types: [labeled, opened, synchronize, reopened]
concurrency:
group: pre-commit-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
pre-commit:
@@ -8,6 +14,9 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: 0
fetch-tags: true
- name: Set up Python
uses: actions/setup-python@v4

View File

@@ -0,0 +1,172 @@
name: Releasing PR
on:
pull_request:
types: [labeled, opened, synchronize, reopened, closed]
concurrency:
group: pull-requests-release-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
verify:
name: Test Release
runs-on: [self-hosted]
permissions:
contents: read
packages: write
# Run only when the PR carries the "release" label and not closed.
if: |
contains(github.event.pull_request.labels.*.name, 'release') &&
github.event.action != 'closed'
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
fetch-tags: true
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
registry: ghcr.io
- name: Run tests
run: make test
finalize:
name: Finalize Release
runs-on: [self-hosted]
permissions:
contents: write
if: |
github.event.pull_request.merged == true &&
contains(github.event.pull_request.labels.*.name, 'release')
steps:
# Extract tag from branch name (branch = release-X.Y.Z*)
- name: Extract tag from branch name
id: get_tag
uses: actions/github-script@v7
with:
script: |
const branch = context.payload.pull_request.head.ref;
const m = branch.match(/^release-(\d+\.\d+\.\d+(?:[-\w\.]+)?)$/);
if (!m) {
core.setFailed(`Branch '${branch}' does not match 'release-X.Y.Z[-suffix]'`);
return;
}
const tag = `v${m[1]}`;
core.setOutput('tag', tag);
console.log(`✅ Tag to publish: ${tag}`);
# Checkout repo & create / push annotated tag
- name: Checkout repo
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Create tag on merge commit
run: |
git tag -f ${{ steps.get_tag.outputs.tag }} ${{ github.sha }}
git push -f origin ${{ steps.get_tag.outputs.tag }}
# Ensure maintenance branch release-X.Y
- name: Ensure maintenance branch release-X.Y
uses: actions/github-script@v7
with:
script: |
const tag = '${{ steps.get_tag.outputs.tag }}'; // e.g. v0.1.3 or v0.1.3-rc3
const match = tag.match(/^v(\d+)\.(\d+)\.\d+(?:[-\w\.]+)?$/);
if (!match) {
core.setFailed(`❌ tag '${tag}' must match 'vX.Y.Z' or 'vX.Y.Z-suffix'`);
return;
}
const line = `${match[1]}.${match[2]}`;
const branch = `release-${line}`;
try {
await github.rest.repos.getBranch({
owner: context.repo.owner,
repo: context.repo.repo,
branch
});
console.log(`Branch '${branch}' already exists`);
} catch (_) {
await github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: `refs/heads/${branch}`,
sha: context.sha
});
console.log(`✅ Branch '${branch}' created at ${context.sha}`);
}
# Get the latest published release
- name: Get the latest published release
id: latest_release
uses: actions/github-script@v7
with:
script: |
try {
const rel = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
core.setOutput('tag', rel.data.tag_name);
} catch (_) {
core.setOutput('tag', '');
}
# Compare current tag vs latest using semver-utils
- name: Semver compare
id: semver
uses: madhead/semver-utils@v4.3.0
with:
version: ${{ steps.get_tag.outputs.tag }}
compare-to: ${{ steps.latest_release.outputs.tag }}
# Derive flags: prerelease? make_latest?
- name: Calculate publish flags
id: flags
uses: actions/github-script@v7
with:
script: |
const tag = '${{ steps.get_tag.outputs.tag }}'; // v0.31.5-rc.1
const m = tag.match(/^v(\d+\.\d+\.\d+)(-rc\.\d+)?$/);
if (!m) {
core.setFailed(`❌ tag '${tag}' must match 'vX.Y.Z' or 'vX.Y.Z-rc.N'`);
return;
}
const version = m[1] + (m[2] ?? ''); // 0.31.5rc.1
const isRc = Boolean(m[2]);
core.setOutput('is_rc', isRc);
const outdated = '${{ steps.semver.outputs.comparison-result }}' === '<';
core.setOutput('make_latest', isRc || outdated ? 'false' : 'legacy');
# Publish draft release with correct flags
- name: Publish draft release
uses: actions/github-script@v7
with:
script: |
const tag = '${{ steps.get_tag.outputs.tag }}';
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo
});
const draft = releases.data.find(r => r.tag_name === tag && r.draft);
if (!draft) throw new Error(`Draft release for ${tag} not found`);
await github.rest.repos.updateRelease({
owner: context.repo.owner,
repo: context.repo.repo,
release_id: draft.id,
draft: false,
prerelease: ${{ steps.flags.outputs.is_rc }},
make_latest: '${{ steps.flags.outputs.make_latest }}'
});
console.log(`🚀 Published release for ${tag}`);

41
.github/workflows/pull-requests.yaml vendored Normal file
View File

@@ -0,0 +1,41 @@
name: Pull Request
on:
pull_request:
types: [labeled, opened, synchronize, reopened]
concurrency:
group: pull-requests-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
e2e:
name: Build and Test
runs-on: [self-hosted]
permissions:
contents: read
packages: write
# Never run when the PR carries the "release" label.
if: |
!contains(github.event.pull_request.labels.*.name, 'release')
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
fetch-tags: true
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
registry: ghcr.io
- name: Build
run: make build
- name: Test
run: make test

227
.github/workflows/tags.yaml vendored Normal file
View File

@@ -0,0 +1,227 @@
name: Versioned Tag
on:
push:
tags:
- 'v*.*.*' # vX.Y.Z
- 'v*.*.*-rc.*' # vX.Y.Z-rc.N
concurrency:
group: tags-${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
prepare-release:
name: Prepare Release
runs-on: [self-hosted]
permissions:
contents: write
packages: write
pull-requests: write
actions: write
steps:
# Check if a non-draft release with this tag already exists
- name: Check if release already exists
id: check_release
uses: actions/github-script@v7
with:
script: |
const tag = context.ref.replace('refs/tags/', '');
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo
});
const exists = releases.data.some(r => r.tag_name === tag && !r.draft);
core.setOutput('skip', exists);
console.log(exists ? `Release ${tag} already published` : `No published release ${tag}`);
# If a published release already exists, skip the rest of the workflow
- name: Skip if release already exists
if: steps.check_release.outputs.skip == 'true'
run: echo "Release already exists, skipping workflow."
# Parse tag metadata (rc?, maintenance line, etc.)
- name: Parse tag
if: steps.check_release.outputs.skip == 'false'
id: tag
uses: actions/github-script@v7
with:
script: |
const ref = context.ref.replace('refs/tags/', ''); // e.g. v0.31.5-rc.1
const m = ref.match(/^v(\d+\.\d+\.\d+)(-rc\.\d+)?$/); // ['0.31.5', '-rc.1']
if (!m) {
core.setFailed(`❌ tag '${ref}' must match 'vX.Y.Z' or 'vX.Y.Z-rc.N'`);
return;
}
const version = m[1] + (m[2] ?? ''); // 0.31.5rc.1
const isRc = Boolean(m[2]);
const [maj, min] = m[1].split('.');
core.setOutput('tag', ref); // v0.31.5-rc.1
core.setOutput('version', version); // 0.31.5-rc.1
core.setOutput('is_rc', isRc); // true
core.setOutput('line', `${maj}.${min}`); // 0.31
# Detect base branch (main or releaseX.Y) the tag was pushed from
- name: Get base branch
if: steps.check_release.outputs.skip == 'false'
id: get_base
uses: actions/github-script@v7
with:
script: |
const baseRef = context.payload.base_ref;
if (!baseRef) {
core.setFailed(`❌ base_ref is empty. Push the tag via 'git push origin HEAD:refs/tags/<tag>'.`);
return;
}
const branch = baseRef.replace('refs/heads/', '');
const ok = branch === 'main' || /^release-\d+\.\d+$/.test(branch);
if (!ok) {
core.setFailed(`❌ Tagged commit must belong to 'main' or 'release-X.Y'. Got '${branch}'`);
return;
}
core.setOutput('branch', branch);
# Checkout & login once
- name: Checkout code
if: steps.check_release.outputs.skip == 'false'
uses: actions/checkout@v4
with:
fetch-depth: 0
fetch-tags: true
- name: Login to GHCR
if: steps.check_release.outputs.skip == 'false'
uses: docker/login-action@v3
with:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
registry: ghcr.io
# Build project artifacts
- name: Build
if: steps.check_release.outputs.skip == 'false'
run: make build
# Commit built artifacts
- name: Commit release artifacts
if: steps.check_release.outputs.skip == 'false'
run: |
git config user.name "github-actions"
git config user.email "github-actions@github.com"
git add .
git commit -m "Prepare release ${GITHUB_REF#refs/tags/}" -s || echo "No changes to commit"
git push origin HEAD || true
# Get `latest_version` from latest published release
- name: Get latest published release
if: steps.check_release.outputs.skip == 'false'
id: latest_release
uses: actions/github-script@v7
with:
script: |
try {
const rel = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
core.setOutput('tag', rel.data.tag_name);
} catch (_) {
core.setOutput('tag', '');
}
# Compare tag (A) with latest (B)
- name: Semver compare
if: steps.check_release.outputs.skip == 'false'
id: semver
uses: madhead/semver-utils@v4.3.0
with:
version: ${{ steps.tag.outputs.tag }} # A
compare-to: ${{ steps.latest_release.outputs.tag }} # B
# Create or reuse DRAFT GitHub Release
- name: Create / reuse draft release
if: steps.check_release.outputs.skip == 'false'
id: release
uses: actions/github-script@v7
with:
script: |
const tag = '${{ steps.tag.outputs.tag }}';
const isRc = ${{ steps.tag.outputs.is_rc }};
const outdated = '${{ steps.semver.outputs.comparison-result }}' === '<';
const makeLatest = outdated ? false : 'legacy';
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo
});
let rel = releases.data.find(r => r.tag_name === tag);
if (!rel) {
rel = await github.rest.repos.createRelease({
owner: context.repo.owner,
repo: context.repo.repo,
tag_name: tag,
name: tag,
draft: true,
prerelease: isRc,
make_latest: makeLatest
});
console.log(`Draft release created for ${tag}`);
} else {
console.log(`Reusing existing release ${tag}`);
}
core.setOutput('upload_url', rel.upload_url);
# Build + upload assets (optional)
- name: Build & upload assets
if: steps.check_release.outputs.skip == 'false'
run: |
make assets
make upload_assets VERSION=${{ steps.tag.outputs.tag }}
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Create releaseX.Y.Z branch and push (forceupdate)
- name: Create release branch
if: steps.check_release.outputs.skip == 'false'
run: |
BRANCH="release-${GITHUB_REF#refs/tags/v}"
git branch -f "$BRANCH"
git push -f origin "$BRANCH"
# Create pull request into original base branch (if absent)
- name: Create pull request if not exists
if: steps.check_release.outputs.skip == 'false'
uses: actions/github-script@v7
with:
script: |
const version = context.ref.replace('refs/tags/v', '');
const base = '${{ steps.get_base.outputs.branch }}';
const head = `release-${version}`;
const prs = await github.rest.pulls.list({
owner: context.repo.owner,
repo: context.repo.repo,
head: `${context.repo.owner}:${head}`,
base
});
if (prs.data.length === 0) {
const pr = await github.rest.pulls.create({
owner: context.repo.owner,
repo: context.repo.repo,
head,
base,
title: `Release v${version}`,
body: `This PR prepares the release \`v${version}\`.`,
draft: false
});
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: pr.data.number,
labels: ['release']
});
console.log(`Created PR #${pr.data.number}`);
} else {
console.log(`PR already exists from ${head} to ${base}`);
}

3
.gitignore vendored
View File

@@ -1,6 +1,7 @@
_out
.git
.idea
.vscode
# User-specific stuff
.idea/**/workspace.xml
@@ -75,4 +76,4 @@ fabric.properties
.idea/caches/build_file_checksums.ser
.DS_Store
**/.DS_Store
**/.DS_Store

View File

@@ -13,8 +13,8 @@ but it means a lot to us.
To add your organization to this list, you can either:
- [open a pull request](https://github.com/aenix-io/cozystack/pulls) to directly update this file, or
- [edit this file](https://github.com/aenix-io/cozystack/blob/main/ADOPTERS.md) directly in GitHub
- [open a pull request](https://github.com/cozystack/cozystack/pulls) to directly update this file, or
- [edit this file](https://github.com/cozystack/cozystack/blob/main/ADOPTERS.md) directly in GitHub
Feel free to ask in the Slack chat if you any questions and/or require
assistance with updating this list.

View File

@@ -6,13 +6,13 @@ As you get started, you are in the best position to give us feedbacks on areas o
* Problems found while setting up the development environment
* Gaps in our documentation
* Bugs in our Github actions
* Bugs in our GitHub actions
First, though, it is important that you read the [code of conduct](CODE_OF_CONDUCT.md).
First, though, it is important that you read the [CNCF Code of Conduct](https://github.com/cncf/foundation/blob/master/code-of-conduct.md).
The guidelines below are a starting point. We don't want to limit your
creativity, passion, and initiative. If you think there's a better way, please
feel free to bring it up in a Github discussion, or open a pull request. We're
feel free to bring it up in a GitHub discussion, or open a pull request. We're
certain there are always better ways to do things, we just need to start some
constructive dialogue!
@@ -23,9 +23,9 @@ We welcome many types of contributions including:
* New features
* Builds, CI/CD
* Bug fixes
* [Documentation](https://github.com/aenix-io/cozystack-website/tree/main)
* [Documentation](https://GitHub.com/cozystack/cozystack-website/tree/main)
* Issue Triage
* Answering questions on Slack or Github Discussions
* Answering questions on Slack or GitHub Discussions
* Web design
* Communications / Social Media / Blog Posts
* Events participation
@@ -34,7 +34,7 @@ We welcome many types of contributions including:
## Ask for Help
The best way to reach us with a question when contributing is to drop a line in
our [Telegram channel](https://t.me/cozystack), or start a new Github discussion.
our [Telegram channel](https://t.me/cozystack), or start a new GitHub discussion.
## Raising Issues

91
GOVERNANCE.md Normal file
View File

@@ -0,0 +1,91 @@
# Cozystack Governance
This document defines the governance structure of the Cozystack community, outlining how members collaborate to achieve shared goals.
## Overview
**Cozystack**, a Cloud Native Computing Foundation (CNCF) project, is committed
to building an open, inclusive, productive, and self-governing open source
community focused on building a high-quality open source PaaS and framework for building clouds.
## Code Repositories
The following code repositories are governed by the Cozystack community and
maintained under the `cozystack` namespace:
* **[Cozystack](https://github.com/cozystack/cozystack):** Main Cozystack codebase
* **[website](https://github.com/cozystack/website):** Cozystack website and documentation sources
* **[Talm](https://github.com/cozystack/talm):** Tool for managing Talos Linux the GitOps way
* **[cozy-proxy](https://github.com/cozystack/cozy-proxy):** A simple kube-proxy addon for 1:1 NAT services in Kubernetes with NFT backend
* **[cozystack-telemetry-server](https://github.com/cozystack/cozystack-telemetry-server):** Cozystack telemetry
* **[talos-bootstrap](https://github.com/cozystack/talos-bootstrap):** An interactive Talos Linux installer
* **[talos-meta-tool](https://github.com/cozystack/talos-meta-tool):** Tool for writing network metadata into META partition
## Community Roles
* **Users:** Members that engage with the Cozystack community via any medium, including Slack, Telegram, GitHub, and mailing lists.
* **Contributors:** Members contributing to the projects by contributing and reviewing code, writing documentation,
responding to issues, participating in proposal discussions, and so on.
* **Directors:** Non-technical project leaders.
* **Maintainers**: Technical project leaders.
## Contributors
Cozystack is for everyone. Anyone can become a Cozystack contributor simply by
contributing to the project, whether through code, documentation, blog posts,
community management, or other means.
As with all Cozystack community members, contributors are expected to follow the
[Cozystack Code of Conduct](https://github.com/cozystack/cozystack/blob/main/CODE_OF_CONDUCT.md).
All contributions to Cozystack code, documentation, or other components in the
Cozystack GitHub organisation must follow the
[contributing guidelines](https://github.com/cozystack/cozystack/blob/main/CONTRIBUTING.md).
Whether these contributions are merged into the project is the prerogative of the maintainers.
## Directors
Directors are responsible for non-technical leadership functions within the project.
This includes representing Cozystack and its maintainers to the community, to the press,
and to the outside world; interfacing with CNCF and other governance entities;
and participating in project decision-making processes when appropriate.
Directors are elected by a majority vote of the maintainers.
## Maintainers
Maintainers have the right to merge code into the project.
Anyone can become a Cozystack maintainer (see "Becoming a maintainer" below).
### Expectations
Cozystack maintainers are expected to:
* Review pull requests, triage issues, and fix bugs in their areas of
expertise, ensuring that all changes go through the project's code review
and integration processes.
* Monitor cncf-cozystack-* emails, the Cozystack Slack channels in Kubernetes
and CNCF Slack workspaces, Telegram groups, and help out when possible.
* Rapidly respond to any time-sensitive security release processes.
* Attend Cozystack community meetings.
If a maintainer is no longer interested in or cannot perform the duties
listed above, they should move themselves to emeritus status.
If necessary, this can also occur through the decision-making process outlined below.
### Becoming a Maintainer
Anyone can become a Cozystack maintainer. Maintainers should be extremely
proficient in cloud native technologies and/or Go; have relevant domain expertise;
have the time and ability to meet the maintainer's expectations above;
and demonstrate the ability to work with the existing maintainers and project processes.
To become a maintainer, start by expressing interest to existing maintainers.
Existing maintainers will then ask you to demonstrate the qualifications above
by contributing PRs, doing code reviews, and other such tasks under their guidance.
After several months of working together, maintainers will decide whether to grant maintainer status.
## Project Decision-making Process
Ideally, all project decisions are resolved by consensus of maintainers and directors.
If this is not possible, a vote will be called.
The voting process is a simple majority in which each maintainer and director receives one vote.

View File

@@ -1,15 +1,24 @@
.PHONY: manifests repos assets
build:
build-deps:
@command -V find docker skopeo jq gh helm > /dev/null
@yq --version | grep -q "mikefarah" || (echo "mikefarah/yq is required" && exit 1)
@tar --version | grep -q GNU || (echo "GNU tar is required" && exit 1)
@sed --version | grep -q GNU || (echo "GNU sed is required" && exit 1)
@awk --version | grep -q GNU || (echo "GNU awk is required" && exit 1)
build: build-deps
make -C packages/apps/http-cache image
make -C packages/apps/postgres image
make -C packages/apps/mysql image
make -C packages/apps/clickhouse image
make -C packages/apps/kubernetes image
make -C packages/extra/monitoring image
make -C packages/system/cozystack-api image
make -C packages/system/cozystack-controller image
make -C packages/system/cilium image
make -C packages/system/kubeovn image
make -C packages/system/kubeovn-webhook image
make -C packages/system/dashboard image
make -C packages/system/kamaji image
make -C packages/system/bucket image
@@ -17,10 +26,6 @@ build:
make -C packages/core/installer image
make manifests
manifests:
(cd packages/core/installer/; helm template -n cozy-installer installer .) > manifests/cozystack-installer.yaml
sed -i 's|@sha256:[^"]\+||' manifests/cozystack-installer.yaml
repos:
rm -rf _out
make -C packages/apps check-version-map
@@ -31,13 +36,20 @@ repos:
mkdir -p _out/logos
cp ./packages/apps/*/logos/*.svg ./packages/extra/*/logos/*.svg _out/logos/
manifests:
mkdir -p _out/assets
(cd packages/core/installer/; helm template -n cozy-installer installer .) > _out/assets/cozystack-installer.yaml
assets:
make -C packages/core/installer/ assets
test:
make -C packages/core/testing apply
make -C packages/core/testing test
make -C packages/core/testing test-applications
generate:
hack/update-codegen.sh
upload_assets: manifests
hack/upload-assets.sh

View File

@@ -2,30 +2,31 @@
![Cozystack](img/cozystack-logo-white.svg#gh-dark-mode-only)
[![Open Source](https://img.shields.io/badge/Open-Source-brightgreen)](https://opensource.org/)
[![Apache-2.0 License](https://img.shields.io/github/license/aenix-io/cozystack)](https://opensource.org/licenses/)
[![Support](https://img.shields.io/badge/$-support-12a0df.svg?style=flat)](https://aenix.io/contact-us/#meet)
[![Active](http://img.shields.io/badge/Status-Active-green.svg)](https://aenix.io/cozystack/)
[![GitHub Release](https://img.shields.io/github/release/aenix-io/cozystack.svg?style=flat)](https://github.com/aenix-io/cozystack)
[![GitHub Commit](https://img.shields.io/github/commit-activity/y/aenix-io/cozystack)](https://github.com/aenix-io/cozystack)
[![Apache-2.0 License](https://img.shields.io/github/license/cozystack/cozystack)](https://opensource.org/licenses/)
[![Support](https://img.shields.io/badge/$-support-12a0df.svg?style=flat)](https://cozystack.io/support/)
[![Active](http://img.shields.io/badge/Status-Active-green.svg)](https://github.com/cozystack/cozystack)
[![GitHub Release](https://img.shields.io/github/release/cozystack/cozystack.svg?style=flat)](https://github.com/cozystack/cozystack/releases/latest)
[![GitHub Commit](https://img.shields.io/github/commit-activity/y/cozystack/cozystack)](https://github.com/cozystack/cozystack/graphs/contributors)
# Cozystack
**Cozystack** is a free PaaS platform and framework for building clouds.
With Cozystack, you can transform your bunch of servers into an intelligent system with a simple REST API for spawning Kubernetes clusters, Database-as-a-Service, virtual machines, load balancers, HTTP caching services, and other services with ease.
With Cozystack, you can transform a bunch of servers into an intelligent system with a simple REST API for spawning Kubernetes clusters,
Database-as-a-Service, virtual machines, load balancers, HTTP caching services, and other services with ease.
You can use Cozystack to build your own cloud or to provide a cost-effective development environments.
Use Cozystack to build your own cloud or provide a cost-effective development environment.
## Use-Cases
* [**Using Cozystack to build public cloud**](https://cozystack.io/docs/use-cases/public-cloud/)
You can use Cozystack as backend for a public cloud
* [**Using Cozystack to build a public cloud**](https://cozystack.io/docs/guides/use-cases/public-cloud/)
You can use Cozystack as a backend for a public cloud
* [**Using Cozystack to build private cloud**](https://cozystack.io/docs/use-cases/private-cloud/)
You can use Cozystack as platform to build a private cloud powered by Infrastructure-as-Code approach
* [**Using Cozystack to build a private cloud**](https://cozystack.io/docs/guides/use-cases/private-cloud/)
You can use Cozystack as a platform to build a private cloud powered by Infrastructure-as-Code approach
* [**Using Cozystack as Kubernetes distribution**](https://cozystack.io/docs/use-cases/kubernetes-distribution/)
You can use Cozystack as Kubernetes distribution for Bare Metal
* [**Using Cozystack as a Kubernetes distribution**](https://cozystack.io/docs/guides/use-cases/kubernetes-distribution/)
You can use Cozystack as a Kubernetes distribution for Bare Metal
## Screenshot
@@ -33,32 +34,32 @@ You can use Cozystack as Kubernetes distribution for Bare Metal
## Documentation
The documentation is located on official [cozystack.io](https://cozystack.io) website.
The documentation is located on the [cozystack.io](https://cozystack.io) website.
Read [Get Started](https://cozystack.io/docs/get-started/) section for a quick start.
Read the [Getting Started](https://cozystack.io/docs/getting-started/) section for a quick start.
If you encounter any difficulties, start with the [troubleshooting guide](https://cozystack.io/docs/troubleshooting/), and work your way through the process that we've outlined.
If you encounter any difficulties, start with the [troubleshooting guide](https://cozystack.io/docs/operations/troubleshooting/) and work your way through the process that we've outlined.
## Versioning
Versioning adheres to the [Semantic Versioning](http://semver.org/) principles.
A full list of the available releases is available in the GitHub repository's [Release](https://github.com/aenix-io/cozystack/releases) section.
A full list of the available releases is available in the GitHub repository's [Release](https://github.com/cozystack/cozystack/releases) section.
- [Roadmap](https://github.com/orgs/aenix-io/projects/2)
- [Roadmap](https://cozystack.io/docs/roadmap/)
## Contributions
Contributions are highly appreciated and very welcomed!
In case of bugs, please, check if the issue has been already opened by checking the [GitHub Issues](https://github.com/aenix-io/cozystack/issues) section.
In case it isn't, you can open a new one: a detailed report will help us to replicate it, assess it, and work on a fix.
In case of bugs, please check if the issue has already been opened by checking the [GitHub Issues](https://github.com/cozystack/cozystack/issues) section.
If it isn't, you can open a new one. A detailed report will help us replicate it, assess it, and work on a fix.
You can express your intention in working on the fix on your own.
You can express your intention to on the fix on your own.
Commits are used to generate the changelog, and their author will be referenced in it.
In case of **Feature Requests** please use the [Discussion's Feature Request section](https://github.com/aenix-io/cozystack/discussions/categories/feature-requests).
If you have **Feature Requests** please use the [Discussion's Feature Request section](https://github.com/cozystack/cozystack/discussions/categories/feature-requests).
You can join our weekly community meetings (just add this events to your [Google Calendar](https://calendar.google.com/calendar?cid=ZTQzZDIxZTVjOWI0NWE5NWYyOGM1ZDY0OWMyY2IxZTFmNDMzZTJlNjUzYjU2ZGJiZGE3NGNhMzA2ZjBkMGY2OEBncm91cC5jYWxlbmRhci5nb29nbGUuY29t) or [iCal](https://calendar.google.com/calendar/ical/e43d21e5c9b45a95f28c5d649c2cb1e1f433e2e653b56dbbda74ca306f0d0f68%40group.calendar.google.com/public/basic.ics)) or [Telegram group](https://t.me/cozystack).
You are welcome to join our weekly community meetings (just add this events to your [Google Calendar](https://calendar.google.com/calendar?cid=ZTQzZDIxZTVjOWI0NWE5NWYyOGM1ZDY0OWMyY2IxZTFmNDMzZTJlNjUzYjU2ZGJiZGE3NGNhMzA2ZjBkMGY2OEBncm91cC5jYWxlbmRhci5nb29nbGUuY29t) or [iCal](https://calendar.google.com/calendar/ical/e43d21e5c9b45a95f28c5d649c2cb1e1f433e2e653b56dbbda74ca306f0d0f68%40group.calendar.google.com/public/basic.ics)) or [Telegram group](https://t.me/cozystack).
## License
@@ -67,8 +68,4 @@ The code is provided as-is with no warranties.
## Commercial Support
[**Ænix**](https://aenix.io) offers enterprise-grade support, available 24/7.
We provide all types of assistance, including consultations, development of missing features, design, assistance with installation, and integration.
[Contact us](https://aenix.io/contact/)
A list of companies providing commercial support for this project can be found on [official site](https://cozystack.io/support/).

View File

@@ -1,4 +1,4 @@
API rule violation: list_type_missing,github.com/aenix-io/cozystack/pkg/apis/apps/v1alpha1,ApplicationStatus,Conditions
API rule violation: list_type_missing,github.com/cozystack/cozystack/pkg/apis/apps/v1alpha1,ApplicationStatus,Conditions
API rule violation: names_match,k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1,JSONSchemaProps,Ref
API rule violation: names_match,k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1,JSONSchemaProps,Schema
API rule violation: names_match,k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1,JSONSchemaProps,XEmbeddedResource

View File

@@ -19,7 +19,7 @@ package main
import (
"os"
"github.com/aenix-io/cozystack/pkg/cmd/server"
"github.com/cozystack/cozystack/pkg/cmd/server"
genericapiserver "k8s.io/apiserver/pkg/server"
"k8s.io/component-base/cli"
)

View File

@@ -0,0 +1,29 @@
package main
import (
"flag"
"log"
"net/http"
"path/filepath"
)
func main() {
addr := flag.String("address", ":8123", "Address to listen on")
dir := flag.String("dir", "/cozystack/assets", "Directory to serve files from")
flag.Parse()
absDir, err := filepath.Abs(*dir)
if err != nil {
log.Fatalf("Error getting absolute path for %s: %v", *dir, err)
}
fs := http.FileServer(http.Dir(absDir))
http.Handle("/", fs)
log.Printf("Server starting on %s, serving directory %s", *addr, absDir)
err = http.ListenAndServe(*addr, nil)
if err != nil {
log.Fatalf("Server failed to start: %v", err)
}
}

View File

@@ -36,9 +36,11 @@ import (
metricsserver "sigs.k8s.io/controller-runtime/pkg/metrics/server"
"sigs.k8s.io/controller-runtime/pkg/webhook"
cozystackiov1alpha1 "github.com/aenix-io/cozystack/api/v1alpha1"
"github.com/aenix-io/cozystack/internal/controller"
"github.com/aenix-io/cozystack/internal/telemetry"
cozystackiov1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
"github.com/cozystack/cozystack/internal/controller"
"github.com/cozystack/cozystack/internal/telemetry"
helmv2 "github.com/fluxcd/helm-controller/api/v2"
// +kubebuilder:scaffold:imports
)
@@ -51,6 +53,7 @@ func init() {
utilruntime.Must(clientgoscheme.AddToScheme(scheme))
utilruntime.Must(cozystackiov1alpha1.AddToScheme(scheme))
utilruntime.Must(helmv2.AddToScheme(scheme))
// +kubebuilder:scaffold:scheme
}
@@ -178,6 +181,23 @@ func main() {
setupLog.Error(err, "unable to create controller", "controller", "WorkloadMonitor")
os.Exit(1)
}
if err = (&controller.WorkloadReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "WorkloadReconciler")
os.Exit(1)
}
if err = (&controller.TenantHelmReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "Workload")
os.Exit(1)
}
// +kubebuilder:scaffold:builder
if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -450,7 +450,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval])))\n / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval])))\n / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -520,7 +520,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"})) / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"memory\",unit=\"byte\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"})) / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"memory\",unit=\"byte\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -590,7 +590,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval]))) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval]))) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -660,7 +660,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"} )) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"} )) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "__auto",
"range": true,
@@ -1128,7 +1128,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0\n",
"expr": "sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0\n",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -1143,7 +1143,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0)",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0)",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -1527,7 +1527,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0\n",
"expr": "(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0\n",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -1542,7 +1542,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum((sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0)",
"expr": "sum((sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0)",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -1909,7 +1909,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) * -1 > 0)\n",
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) * -1 > 0)\n",
"format": "table",
"instant": true,
"range": false,
@@ -2037,7 +2037,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) * -1 >0)\n",
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) * -1 >0)\n",
"format": "table",
"instant": true,
"range": false,
@@ -2160,7 +2160,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) > 0)\n",
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) > 0)\n",
"format": "table",
"instant": true,
"range": false,
@@ -2288,7 +2288,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) >0)\n",
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) >0)\n",
"format": "table",
"instant": true,
"range": false,

View File

@@ -684,7 +684,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -710,7 +710,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -723,7 +723,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -736,7 +736,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) \n (\n (\n (\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) or sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) \n (\n (\n (\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) or sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -762,7 +762,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -786,7 +786,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -798,7 +798,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n )\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n )\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -810,7 +810,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) or sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) or sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -848,7 +848,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -860,7 +860,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -1315,7 +1315,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -1488,7 +1488,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -1502,7 +1502,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -1642,7 +1642,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by (pod)\n (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (pod)\n (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -1779,7 +1779,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": " (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n )\n or\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n )\n) > 0",
"expr": " (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n )\n or\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n )\n) > 0",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -2095,7 +2095,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Usage",
@@ -2109,7 +2109,7 @@
"refId": "D"
},
{
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n)",
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
@@ -2295,7 +2295,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "System",
@@ -2306,7 +2306,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2468,7 +2468,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -2653,7 +2653,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
@@ -2666,7 +2666,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Cache",
@@ -2679,7 +2679,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
@@ -2692,7 +2692,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
@@ -2705,7 +2705,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2837,7 +2837,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )\n)",
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -2974,7 +2974,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )\n)",
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -3290,56 +3290,56 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Cache",
"refId": "B"
},
{
"expr": "sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
"refId": "C"
},
{
"expr": "sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{namespace=\"$namespace\", container!=\"POD\", resource=\"memory\"}[$__rate_interval]))\n)",
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{namespace=\"$namespace\", container!=\"\", resource=\"memory\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
"refId": "E"
},
{
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "F"
},
{
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "G"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3834,7 +3834,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -3972,7 +3972,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",

View File

@@ -656,7 +656,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -680,7 +680,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n ) \nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n ) \nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -692,7 +692,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -704,7 +704,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -728,7 +728,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -740,7 +740,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (pod) group_left()\n sum by (namespace, pod)\n (\n avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])\n )\n )\n or\n count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (pod) group_left()\n sum by (namespace, pod)\n (\n avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])\n )\n )\n or\n count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -752,7 +752,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n ) \n or \ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n ) \n or \ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -764,7 +764,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -776,7 +776,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -814,7 +814,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -826,7 +826,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -877,7 +877,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -1475,7 +1475,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -1646,7 +1646,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))))",
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "System",
@@ -1657,7 +1657,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))))",
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -1798,7 +1798,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -1939,7 +1939,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2257,28 +2257,28 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Usage",
"refId": "D"
},
{
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "C"
},
{
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "E"
},
{
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
@@ -2458,7 +2458,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2470,7 +2470,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2622,7 +2622,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -2799,14 +2799,14 @@
"pluginVersion": "8.5.13",
"targets": [
{
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2814,7 +2814,7 @@
"refId": "B"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2822,14 +2822,14 @@
"refId": "C"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2955,7 +2955,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -3091,7 +3091,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n or\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n +\n sum by(namespace, pod, container) (avg_over_time(container_memory:kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n or\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n +\n sum by(namespace, pod, container) (avg_over_time(container_memory:kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n ) > 0\n )\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -3408,14 +3408,14 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3423,7 +3423,7 @@
"refId": "B"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3431,35 +3431,35 @@
"refId": "C"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n ) ",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n ) ",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "E"
},
{
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "F"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
"refId": "G"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3910,7 +3910,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -4049,7 +4049,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",

View File

@@ -869,7 +869,7 @@
"refId": "A"
},
{
"expr": "100 * count by (namespace) (\n sum by (namespace, verticalpodautoscaler) ( \n count by (namespace, controller_name, verticalpodautoscaler) (avg_over_time(vpa_target_recommendation{namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\n / on (controller_name, namespace) group_left\n count by (namespace, controller_name) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range]))\n )\n) \n/ count by (namespace) (sum by (namespace, controller) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "100 * count by (namespace) (\n sum by (namespace, verticalpodautoscaler) ( \n count by (namespace, controller_name, verticalpodautoscaler) (avg_over_time(vpa_target_recommendation{namespace=~\"$namespace\", container!=\"\"}[$__range]))\n / on (controller_name, namespace) group_left\n count by (namespace, controller_name) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range]))\n )\n) \n/ count by (namespace) (sum by (namespace, controller) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -878,7 +878,7 @@
"refId": "B"
},
{
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -895,7 +895,7 @@
"refId": "D"
},
{
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -903,7 +903,7 @@
"refId": "E"
},
{
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -919,7 +919,7 @@
"refId": "G"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -935,7 +935,7 @@
"refId": "I"
},
{
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor\ncount(avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor\ncount(avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -943,7 +943,7 @@
"refId": "J"
},
{
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -968,7 +968,7 @@
"refId": "M"
},
{
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -977,7 +977,7 @@
"refId": "N"
},
{
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -1449,7 +1449,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -1616,7 +1616,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "System",
@@ -1627,7 +1627,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -1764,7 +1764,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -1901,7 +1901,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -2210,7 +2210,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2218,21 +2218,21 @@
"refId": "A"
},
{
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "B"
},
{
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "C"
},
{
"expr": "sum by (namespace) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
@@ -2407,7 +2407,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2419,7 +2419,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2572,7 +2572,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -2754,14 +2754,14 @@
"pluginVersion": "8.5.13",
"targets": [
{
"expr": "sum (avg_over_time(container_memory_rss{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_rss{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum (avg_over_time(container_memory_cache{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_cache{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2769,7 +2769,7 @@
"refId": "B"
},
{
"expr": "sum (avg_over_time(container_memory_swap{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_swap{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2777,14 +2777,14 @@
"refId": "C"
},
{
"expr": "sum (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2910,7 +2910,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -3046,7 +3046,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -3370,14 +3370,14 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (namespace) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3385,7 +3385,7 @@
"refId": "B"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3393,35 +3393,35 @@
"refId": "C"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by(namespace) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))",
"expr": "sum by(namespace) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
"refId": "E"
},
{
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "F"
},
{
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "G"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3873,7 +3873,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -4008,7 +4008,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",

View File

@@ -686,7 +686,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"expr": "sum by (container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -759,7 +759,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"expr": "sum by (container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -847,7 +847,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by(container) (rate(container_fs_reads_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__range]))",
"expr": "sum by(container) (rate(container_fs_reads_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__range]))",
"format": "table",
"hide": false,
"instant": true,
@@ -860,7 +860,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by(container) (rate(container_fs_writes_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__range]))",
"expr": "sum by(container) (rate(container_fs_writes_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__range]))",
"format": "table",
"hide": false,
"instant": true,
@@ -899,7 +899,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"expr": "sum by (container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -1503,7 +1503,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -1669,7 +1669,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -1681,7 +1681,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -1820,7 +1820,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace, pod, container)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!=\"POD\", namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (namespace, pod, container)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -2269,7 +2269,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Usage",
@@ -2476,7 +2476,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_cpu_system_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_system_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2488,7 +2488,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_cpu_user_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_user_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2639,7 +2639,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2816,7 +2816,7 @@
"pluginVersion": "8.5.13",
"targets": [
{
"expr": "sum by(pod) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2824,28 +2824,28 @@
"refId": "A"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Cache",
"refId": "B"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
"refId": "C"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2974,7 +2974,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (container)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (container)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -3110,7 +3110,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (container)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (container)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -3431,7 +3431,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by(container) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -3439,7 +3439,7 @@
"refId": "A"
},
{
"expr": "sum by(container) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3447,28 +3447,28 @@
"refId": "B"
},
{
"expr": "sum by(container) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
"refId": "C"
},
{
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "E"
},
{
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
@@ -3482,7 +3482,7 @@
"refId": "G"
},
{
"expr": "sum by(container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3930,7 +3930,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_fs_reads_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_fs_reads_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ container }}",
@@ -4068,7 +4068,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_fs_writes_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_fs_writes_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ container }}",

File diff suppressed because it is too large Load Diff

166
docs/release.md Normal file
View File

@@ -0,0 +1,166 @@
# Release Workflow
This document describes Cozystacks release process.
## Introduction
Cozystack uses a staged release process to ensure stability and flexibility during development.
There are three types of releases:
- **Release Candidates (RC)** Preview versions (e.g., `v0.42.0-rc.1`) used for final testing and validation.
- **Regular Releases** Final versions (e.g., `v0.42.0`) that are feature-complete and thoroughly tested.
- **Patch Releases** Bugfix-only updates (e.g., `v0.42.1`) made after a stable release, based on a dedicated release branch.
Each type plays a distinct role in delivering reliable and tested updates while allowing ongoing development to continue smoothly.
## Release Candidates
Release candidates are Cozystack versions that introduce new features and are published before a stable release.
Their purpose is to help validate stability before finalizing a new feature release.
They allow for final rounds of testing and bug fixes without freezing development.
Release candidates are given numbers `vX.Y.0-rc.N`, for example, `v0.42.0-rc.1`.
They are created directly in the `main` branch.
An RC is typically tagged when all major features for the upcoming release have been merged into main and the release enters its testing phase.
However, new features and changes can still be added before the regular release `vX.Y.0`.
Each RC contributes to a cumulative set of release notes that will be finalized when `vX.Y.0` is released.
After testing, if no critical issues remain, the regular release (`vX.Y.0`) is tagged from the last RC or a later commit in main.
This begins the regular release process, creates a dedicated `release-X.Y` branch, and opens the way for patch releases.
## Regular Releases
When making a regular release, we tag the latest RC or a subsequent minimal-change commit as `vX.Y.0`.
In this explanation, we'll use version `v0.42.0` as an example:
```mermaid
gitGraph
commit id: "feature"
commit id: "feature 2"
commit id: "feature 3" tag: "v0.42.0"
```
A regular release sequence starts in the following way:
1. Maintainer tags a commit in `main` with `v0.42.0` and pushes it to GitHub.
2. CI workflow triggers on tag push:
1. Creates a draft page for release `v0.42.0`, if it wasn't created before.
2. Takes code from tag `v0.42.0`, builds images, and pushes them to ghcr.io.
3. Makes a new commit `Prepare release v0.42.0` with updated digests, pushes it to the new branch `release-0.42.0`, and opens a PR to `main`.
4. Builds Cozystack release assets from the new commit `Prepare release v0.42.0` and uploads them to the release draft page.
3. Maintainer reviews PR, tests build artifacts, and edits changelogs on the release draft page.
```mermaid
gitGraph
commit id: "feature"
commit id: "feature 2"
commit id: "feature 3" tag: "v0.42.0"
branch release-0.42.0
checkout release-0.42.0
commit id: "Prepare release v0.42.0"
checkout main
merge release-0.42.0 id: "Pull Request"
```
When testing and editing are completed, the sequence goes on.
4. Maintainer merges the PR. GitHub removes the merged branch `release-0.42.0`.
5. CI workflow triggers on merge:
1. Moves the tag `v0.42.0` to the newly created merge commit by force-pushing a tag to GitHub.
2. Publishes the release page (`draft` → `latest`).
6. The maintainer can now announce the release to the community.
```mermaid
gitGraph
commit id: "feature"
commit id: "feature 2"
commit id: "feature 3"
branch release-0.42.0
checkout release-0.42.0
commit id: "Prepare release v0.42.0"
checkout main
merge release-0.42.0 id: "Release v0.42.0" tag: "v0.42.0"
```
## Patch Releases
Making a patch release has a lot in common with a regular release, with a couple of differences:
* A release branch is used instead of `main`
* Patch commits are cherry-picked to the release branch.
* A pull request is opened against the release branch.
Let's assume that we've released `v0.42.0` and that development is ongoing.
We have introduced a couple of new features and some fixes to features that we have released
in `v0.42.0`.
Once problems were found and fixed, a patch release is due.
```mermaid
gitGraph
commit id: "Release v0.42.0" tag: "v0.42.0"
checkout main
commit id: "feature 4"
commit id: "patch 1"
commit id: "feature 5"
commit id: "patch 2"
```
1. The maintainer creates a release branch, `release-0.42,` and cherry-picks patch commits from `main` to `release-0.42`.
These must be only patches to features that were present in version `v0.42.0`.
Cherry-picking can be done as soon as each patch is merged into `main`,
or directly before the release.
```mermaid
gitGraph
commit id: "Release v0.42.0" tag: "v0.42.0"
branch release-0.42
checkout main
commit id: "feature 4"
commit id: "patch 1"
commit id: "feature 5"
commit id: "patch 2"
checkout release-0.42
cherry-pick id: "patch 1"
cherry-pick id: "patch 2"
```
When all relevant patch commits are cherry-picked, the branch is ready for release.
2. The maintainer tags the `HEAD` commit of branch `release-0.42` as `v0.42.1` and then pushes it to GitHub.
3. CI workflow triggers on tag push:
1. Creates a draft page for release `v0.42.1`, if it wasn't created before.
2. Takes code from tag `v0.42.1`, builds images, and pushes them to ghcr.io.
3. Makes a new commit `Prepare release v0.42.1` with updated digests, pushes it to the new branch `release-0.42.1`, and opens a PR to `release-0.42`.
4. Builds Cozystack release assets from the new commit `Prepare release v0.42.1` and uploads them to the release draft page.
4. Maintainer reviews PR, tests build artifacts, and edits changelogs on the release draft page.
```mermaid
gitGraph
commit id: "Release v0.42.0" tag: "v0.42.0"
branch release-0.42
checkout main
commit id: "feature 4"
commit id: "patch 1"
commit id: "feature 5"
commit id: "patch 2"
checkout release-0.42
cherry-pick id: "patch 1"
cherry-pick id: "patch 2" tag: "v0.42.1"
branch release-0.42.1
commit id: "Prepare release v0.42.1"
checkout release-0.42
merge release-0.42.1 id: "Pull request"
```
Finally, when release is confirmed, the release sequence goes on.
5. Maintainer merges the PR. GitHub removes the merged branch `release-0.42.1`.
6. CI workflow triggers on merge:
1. Moves the tag `v0.42.1` to the newly created merge commit by force-pushing a tag to GitHub.
2. Publishes the release page (`draft` → `latest`).
7. The maintainer can now announce the release to the community.

2
go.mod
View File

@@ -1,6 +1,6 @@
// This is a generated file. Do not edit directly.
module github.com/aenix-io/cozystack
module github.com/cozystack/cozystack
go 1.23.0

View File

@@ -21,7 +21,7 @@ fix_d8() {
}
swap_pvc_overview() {
jq '(.panels[] | select(.title=="PVC Detailed") | .panels[] | select(.title=="Overview")) as $a | del(.panels[] | select(.title=="PVC Detailed").panels[] | select(.title=="Overview")) | ( (.panels[] | select(.title=="PVC Detailed"))) as $b | del( .panels[] | select(.title=="PVC Detailed")) | (.panels[.panels|length]=($a|.gridPos.y=$b.gridPos.y)) | (.panels[.panels|length]=($b|.gridPos.y=$a.gridPos.y))'
jq '(.panels[] | select(.title=="PVC Detailed") | .panels[] | select(.title=="Overview")) as $a | del(.panels[] | select(.title=="PVC Detailed").panels[] | select(.title=="Overview")) | ( (.panels[] | select(.title=="PVC Detailed"))) as $b | del( .panels[] | select(.title=="PVC Detailed")) | (.panels[.panels|length]=($a|.gridPos.y=$b.gridPos.y)) | (.panels[.panels|length]=($b|.gridPos.y=$a.gridPos.y))'
}
deprectaed_remove_faq() {
@@ -68,7 +68,7 @@ modules/402-ingress-nginx/monitoring/grafana-dashboards/ingress-nginx/namespace/
modules/402-ingress-nginx/monitoring/grafana-dashboards/ingress-nginx/vhost/vhost_detail.json
modules/402-ingress-nginx/monitoring/grafana-dashboards/ingress-nginx/vhost/vhosts.json
modules/340-monitoring-kubernetes-control-plane/monitoring/grafana-dashboards/kubernetes-cluster/control-plane-status.json
modules/340-monitoring-kubernetes-control-plane/monitoring/grafana-dashboards/kubernetes-cluster/kube-etcd3.json #TODO
modules/340-monitoring-kubernetes-control-plane/monitoring/grafana-dashboards/kubernetes-cluster/kube-etcd.json #TODO
modules/340-monitoring-kubernetes-control-plane/monitoring/grafana-dashboards/kubernetes-cluster/deprecated-resources.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//kubernetes-cluster/nodes/ntp.json #TODO
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//kubernetes-cluster/nodes/nodes.json
@@ -78,6 +78,10 @@ modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//main/pod.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//main/namespace/namespaces.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//main/namespace/namespace.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//main/capacity-planning/capacity-planning.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//flux/flux-control-plane.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//flux/flux-stats.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//kafka/strimzi-kafka.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//goldpinger/goldpinger.json
EOT
@@ -109,4 +113,3 @@ done <<\EOT
https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-views-namespaces.json
https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-views-pods.json
EOT

View File

@@ -1,165 +0,0 @@
#!/bin/bash
RED='\033[0;31m'
GREEN='\033[0;32m'
RESET='\033[0m'
YELLOW='\033[0;33m'
ROOT_NS="tenant-root"
TEST_TENANT="tenant-e2e"
values_base_path="/hack/testdata/"
checks_base_path="/hack/testdata/"
function delete_hr() {
local release_name="$1"
local namespace="$2"
if [[ -z "$release_name" ]]; then
echo -e "${RED}Error: Release name is required.${RESET}"
exit 1
fi
if [[ -z "$namespace" ]]; then
echo -e "${RED}Error: Namespace name is required.${RESET}"
exit 1
fi
if [[ "$release_name" == "tenant-e2e" ]]; then
echo -e "${YELLOW}Skipping deletion for release tenant-e2e.${RESET}"
return 0
fi
kubectl delete helmrelease $release_name -n $namespace
}
function install_helmrelease() {
local release_name="$1"
local namespace="$2"
local chart_path="$3"
local repo_name="$4"
local repo_ns="$5"
local values_file="$6"
if [[ -z "$release_name" ]]; then
echo -e "${RED}Error: Release name is required.${RESET}"
exit 1
fi
if [[ -z "$namespace" ]]; then
echo -e "${RED}Error: Namespace name is required.${RESET}"
exit 1
fi
if [[ -z "$chart_path" ]]; then
echo -e "${RED}Error: Chart path name is required.${RESET}"
exit 1
fi
if [[ -n "$values_file" && -f "$values_file" ]]; then
local values_section
values_section=$(echo " values:" && sed 's/^/ /' "$values_file")
fi
local helmrelease_file=$(mktemp /tmp/HelmRelease.XXXXXX.yaml)
{
echo "apiVersion: helm.toolkit.fluxcd.io/v2"
echo "kind: HelmRelease"
echo "metadata:"
echo " labels:"
echo " cozystack.io/ui: \"true\""
echo " name: \"$release_name\""
echo " namespace: \"$namespace\""
echo "spec:"
echo " chart:"
echo " spec:"
echo " chart: \"$chart_path\""
echo " reconcileStrategy: Revision"
echo " sourceRef:"
echo " kind: HelmRepository"
echo " name: \"$repo_name\""
echo " namespace: \"$repo_ns\""
echo " version: '*'"
echo " interval: 1m0s"
echo " timeout: 5m0s"
[[ -n "$values_section" ]] && echo "$values_section"
} > "$helmrelease_file"
kubectl apply -f "$helmrelease_file"
rm -f "$helmrelease_file"
}
function install_tenant (){
local release_name="$1"
local namespace="$2"
local values_file="${values_base_path}tenant/values.yaml"
local repo_name="cozystack-apps"
local repo_ns="cozy-public"
install_helmrelease "$release_name" "$namespace" "tenant" "$repo_name" "$repo_ns" "$values_file"
}
function make_extra_checks(){
local checks_file="$1"
echo "after exec make $checks_file"
if [[ -n "$checks_file" && -f "$checks_file" ]]; then
echo -e "${YELLOW}Start extra checks with file: ${checks_file}${RESET}"
fi
}
function check_helmrelease_status() {
local release_name="$1"
local namespace="$2"
local checks_file="$3"
local timeout=300 # Timeout in seconds
local interval=5 # Interval between checks in seconds
local elapsed=0
while [[ $elapsed -lt $timeout ]]; do
local status_output
status_output=$(kubectl get helmrelease "$release_name" -n "$namespace" -o json | jq -r '.status.conditions[-1].reason')
if [[ "$status_output" == "InstallSucceeded" || "$status_output" == "UpgradeSucceeded" ]]; then
echo -e "${GREEN}Helm release '$release_name' is ready.${RESET}"
make_extra_checks "$checks_file"
delete_hr $release_name $namespace
return 0
elif [[ "$status_output" == "InstallFailed" ]]; then
echo -e "${RED}Helm release '$release_name': InstallFailed${RESET}"
exit 1
else
echo -e "${YELLOW}Helm release '$release_name' is not ready. Current status: $status_output${RESET}"
fi
sleep "$interval"
elapsed=$((elapsed + interval))
done
echo -e "${RED}Timeout reached. Helm release '$release_name' is still not ready after $timeout seconds.${RESET}"
exit 1
}
chart_name="$1"
if [ -z "$chart_name" ]; then
echo -e "${RED}No chart name provided. Exiting...${RESET}"
exit 1
fi
checks_file="${checks_base_path}${chart_name}/check.sh"
repo_name="cozystack-apps"
repo_ns="cozy-public"
release_name="$chart_name-e2e"
values_file="${values_base_path}${chart_name}/values.yaml"
install_tenant $TEST_TENANT $ROOT_NS
check_helmrelease_status $TEST_TENANT $ROOT_NS "${checks_base_path}tenant/check.sh"
echo -e "${YELLOW}Running tests for chart: $chart_name${RESET}"
install_helmrelease $release_name $TEST_TENANT $chart_name $repo_name $repo_ns $values_file
check_helmrelease_status $release_name $TEST_TENANT $checks_file

View File

@@ -60,7 +60,8 @@ done
# Prepare system drive
if [ ! -f nocloud-amd64.raw ]; then
wget https://github.com/aenix-io/cozystack/releases/latest/download/nocloud-amd64.raw.xz -O nocloud-amd64.raw.xz
wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz \
-O nocloud-amd64.raw.xz --show-progress --output-file /dev/stdout --progress=dot:giga 2>/dev/null
rm -f nocloud-amd64.raw
xz --decompress nocloud-amd64.raw.xz
fi
@@ -84,8 +85,9 @@ done
# Start VMs
for i in 1 2 3; do
qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp 4 -m 8192 \
-device virtio-net,netdev=net0,mac=52:54:00:12:34:5$i -netdev tap,id=net0,ifname=cozy-srv$i,script=no,downscript=no \
qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp 8 -m 16384 \
-device virtio-net,netdev=net0,mac=52:54:00:12:34:5$i \
-netdev tap,id=net0,ifname=cozy-srv$i,script=no,downscript=no \
-drive file=srv$i/system.img,if=virtio,format=raw \
-drive file=srv$i/seed.img,if=virtio,format=raw \
-drive file=srv$i/data.img,if=virtio,format=raw \
@@ -113,10 +115,15 @@ machine:
- usermode_helper=disabled
- name: zfs
- name: spl
registries:
mirrors:
docker.io:
endpoints:
- https://mirror.gcr.io
files:
- content: |
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
[plugins."io.containerd.cri.v1.runtime"]
device_ownership_from_security_context = true
path: /etc/cri/conf.d/20-customization.part
op: create
@@ -226,11 +233,18 @@ timeout 60 sh -c 'until kubectl get hr -A | grep cozy; do sleep 1; done'
sleep 5
# Wait for all HelmReleases to be installed
kubectl get hr -A | awk 'NR>1 {print "kubectl wait --timeout=15m --for=condition=ready -n " $1 " hr/" $2 " &"} END{print "wait"}' | sh -x
failed_hrs=$(kubectl get hr -A | grep -v True)
if [ -n "$(echo "$failed_hrs" | grep -v NAME)" ]; then
printf 'Failed HelmReleases:\n%s\n' "$failed_hrs" >&2
exit 1
fi
# Wait for Cluster-API providers
timeout 30 sh -c 'until kubectl get deploy -n cozy-cluster-api capi-controller-manager capi-kamaji-controller-manager capi-kubeadm-bootstrap-controller-manager capi-operator-cluster-api-operator capk-controller-manager; do sleep 1; done'
kubectl wait deploy --timeout=30s --for=condition=available -n cozy-cluster-api capi-controller-manager capi-kamaji-controller-manager capi-kubeadm-bootstrap-controller-manager capi-operator-cluster-api-operator capk-controller-manager
timeout 60 sh -c 'until kubectl get deploy -n cozy-cluster-api capi-controller-manager capi-kamaji-controller-manager capi-kubeadm-bootstrap-controller-manager capi-operator-cluster-api-operator capk-controller-manager; do sleep 1; done'
kubectl wait deploy --timeout=1m --for=condition=available -n cozy-cluster-api capi-controller-manager capi-kamaji-controller-manager capi-kubeadm-bootstrap-controller-manager capi-operator-cluster-api-operator capk-controller-manager
# Wait for linstor controller
kubectl wait deploy --timeout=5m --for=condition=available -n cozy-linstor linstor-controller
@@ -313,7 +327,12 @@ kubectl patch -n tenant-root tenants.apps.cozystack.io root --type=merge -p '{"s
timeout 60 sh -c 'until kubectl get hr -n tenant-root etcd ingress monitoring tenant-root; do sleep 1; done'
# Wait for HelmReleases be installed
kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr etcd ingress monitoring tenant-root
kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr etcd ingress tenant-root
if ! kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr monitoring; then
flux reconcile hr monitoring -n tenant-root --force
kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr monitoring
fi
kubectl patch -n tenant-root ingresses.apps.cozystack.io ingress --type=merge -p '{"spec":{
"dashboard": true
@@ -328,7 +347,7 @@ kubectl wait --timeout=5m --for=jsonpath=.status.readyReplicas=3 -n tenant-root
# Wait for Victoria metrics
kubectl wait --timeout=5m --for=jsonpath=.status.updateStatus=operational -n tenant-root vmalert/vmalert-shortterm vmalertmanager/alertmanager
kubectl wait --timeout=5m --for=jsonpath=.status.status=operational -n tenant-root vlogs/generic
kubectl wait --timeout=5m --for=jsonpath=.status.updateStatus=operational -n tenant-root vlogs/generic
kubectl wait --timeout=5m --for=jsonpath=.status.clusterStatus=operational -n tenant-root vmcluster/shortterm vmcluster/longterm
# Wait for grafana
@@ -347,5 +366,5 @@ kubectl patch -n cozy-system cm/cozystack --type=merge -p '{"data":{
"oidc-enabled": "true"
}}'
timeout 60 sh -c 'until kubectl get hr -n cozy-keycloak keycloak keycloak-configure keycloak-operator; do sleep 1; done'
timeout 120 sh -c 'until kubectl get hr -n cozy-keycloak keycloak keycloak-configure keycloak-operator; do sleep 1; done'
kubectl wait --timeout=10m --for=condition=ready -n cozy-keycloak hr keycloak keycloak-configure keycloak-operator

View File

@@ -1,12 +1,13 @@
#!/bin/sh
set -e
file=versions_map
charts=$(find . -mindepth 2 -maxdepth 2 -name Chart.yaml | awk 'sub("/Chart.yaml", "")')
# <chart> <version> <commit>
new_map=$(
for chart in $charts; do
awk '/^name:/ {chart=$2} /^version:/ {version=$2} END{printf "%s %s %s\n", chart, version, "HEAD"}' $chart/Chart.yaml
awk '/^name:/ {chart=$2} /^version:/ {version=$2} END{printf "%s %s %s\n", chart, version, "HEAD"}' "$chart/Chart.yaml"
done
)
@@ -15,47 +16,46 @@ if [ ! -f "$file" ] || [ ! -s "$file" ]; then
exit 0
fi
miss_map=$(echo "$new_map" | awk 'NR==FNR { new_map[$1 " " $2] = $3; next } { if (!($1 " " $2 in new_map)) print $1, $2, $3}' - $file)
miss_map=$(echo "$new_map" | awk 'NR==FNR { nm[$1 " " $2] = $3; next } { if (!($1 " " $2 in nm)) print $1, $2, $3}' - "$file")
# search accross all tags sorted by version
search_commits=$(git ls-remote --tags origin | awk -F/ '$3 ~ /v[0-9]+.[0-9]+.[0-9]+/ {print}' | sort -k2,2 -rV | awk '{print $1}')
resolved_miss_map=$(
echo "$miss_map" | while read chart version commit; do
if [ "$commit" = HEAD ]; then
line=$(awk '/^version:/ {print NR; exit}' "./$chart/Chart.yaml")
change_commit=$(git --no-pager blame -L"$line",+1 -- "$chart/Chart.yaml" | awk '{print $1}')
if [ "$change_commit" = "00000000" ]; then
# Not committed yet, use previous commit
line=$(git show HEAD:"./$chart/Chart.yaml" | awk '/^version:/ {print NR; exit}')
commit=$(git --no-pager blame -L"$line",+1 HEAD -- "$chart/Chart.yaml" | awk '{print $1}')
if [ $(echo $commit | cut -c1) = "^" ]; then
# Previous commit not exists
commit=$(echo $commit | cut -c2-)
fi
else
# Committed, but version_map wasn't updated
line=$(git show HEAD:"./$chart/Chart.yaml" | awk '/^version:/ {print NR; exit}')
change_commit=$(git --no-pager blame -L"$line",+1 HEAD -- "$chart/Chart.yaml" | awk '{print $1}')
if [ $(echo $change_commit | cut -c1) = "^" ]; then
# Previous commit not exists
commit=$(echo $change_commit | cut -c2-)
else
commit=$(git describe --always "$change_commit~1")
fi
echo "$miss_map" | while read -r chart version commit; do
# if version is found in HEAD, it's HEAD
if [ "$(awk '$1 == "version:" {print $2}' ./${chart}/Chart.yaml)" = "${version}" ]; then
echo "$chart $version HEAD"
continue
fi
# if commit is not HEAD, check if it's valid
if [ "$commit" != "HEAD" ]; then
if [ "$(git show "${commit}:./${chart}/Chart.yaml" | awk '$1 == "version:" {print $2}')" != "${version}" ]; then
echo "Commit $commit for $chart $version is not valid" >&2
exit 1
fi
# Check if the commit belongs to the main branch
if ! git merge-base --is-ancestor "$commit" main; then
# Find the closest parent commit that belongs to main
commit_in_main=$(git log --pretty=format:"%h" main -- "$chart" | head -n 1)
if [ -n "$commit_in_main" ]; then
commit="$commit_in_main"
else
# No valid commit found in main branch for $chart, skipping..."
continue
fi
fi
commit=$(git rev-parse --short "$commit")
echo "$chart $version $commit"
continue
fi
echo "$chart $version $commit"
# if commit is HEAD, but version is not found in HEAD, check all tags
found_tag=""
for tag in $search_commits; do
if [ "$(git show "${tag}:./${chart}/Chart.yaml" | awk '$1 == "version:" {print $2}')" = "${version}" ]; then
found_tag=$(git rev-parse --short "${tag}")
break
fi
done
if [ -z "$found_tag" ]; then
echo "Can't find $chart $version in any version tag, removing it" >&2
continue
fi
echo "$chart $version $found_tag"
done
)

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,2 +0,0 @@
endpoints:
- 8.8.8.8:443

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,62 +0,0 @@
## @section Common parameters
## @param host The hostname used to access the Kubernetes cluster externally (defaults to using the cluster name as a subdomain for the tenant host).
## @param controlPlane.replicas Number of replicas for Kubernetes contorl-plane components
## @param storageClass StorageClass used to store user data
##
host: ""
controlPlane:
replicas: 2
storageClass: replicated
## @param nodeGroups [object] nodeGroups configuration
##
nodeGroups:
md0:
minReplicas: 0
maxReplicas: 10
instanceType: "u1.medium"
ephemeralStorage: 20Gi
roles:
- ingress-nginx
resources:
cpu: ""
memory: ""
## @section Cluster Addons
##
addons:
## Cert-manager: automatically creates and manages SSL/TLS certificate
##
certManager:
## @param addons.certManager.enabled Enables the cert-manager
## @param addons.certManager.valuesOverride Custom values to override
enabled: true
valuesOverride: {}
## Ingress-NGINX Controller
##
ingressNginx:
## @param addons.ingressNginx.enabled Enable Ingress-NGINX controller (expect nodes with 'ingress-nginx' role)
## @param addons.ingressNginx.valuesOverride Custom values to override
##
enabled: true
## @param addons.ingressNginx.hosts List of domain names that should be passed through to the cluster by upper cluster
## e.g:
## hosts:
## - example.org
## - foo.example.net
##
hosts: []
valuesOverride: {}
## Flux CD
##
fluxcd:
## @param addons.fluxcd.enabled Enables Flux CD
## @param addons.fluxcd.valuesOverride Custom values to override
##
enabled: true
valuesOverride: {}

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,10 +0,0 @@
## @section Common parameters
## @param external Enable external access from outside the cluster
## @param replicas Persistent Volume size for NATS
## @param storageClass StorageClass used to store the data
##
external: false
replicas: 2
storageClass: ""

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,6 +0,0 @@
host: ""
etcd: false
monitoring: false
ingress: false
seaweedfs: false
isolated: true

11
hack/upload-assets.sh Executable file
View File

@@ -0,0 +1,11 @@
#!/bin/bash
set -xe
version=${VERSION:-$(git describe --tags)}
gh release upload --clobber $version _out/assets/cozystack-installer.yaml
gh release upload --clobber $version _out/assets/metal-amd64.iso
gh release upload --clobber $version _out/assets/metal-amd64.raw.xz
gh release upload --clobber $version _out/assets/nocloud-amd64.raw.xz
gh release upload --clobber $version _out/assets/kernel-amd64
gh release upload --clobber $version _out/assets/initramfs-metal-amd64.xz

View File

@@ -33,7 +33,7 @@ import (
logf "sigs.k8s.io/controller-runtime/pkg/log"
"sigs.k8s.io/controller-runtime/pkg/log/zap"
cozystackiov1alpha1 "github.com/aenix-io/cozystack/api/v1alpha1"
cozystackiov1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
// +kubebuilder:scaffold:imports
)

View File

@@ -0,0 +1,158 @@
package controller
import (
"context"
"fmt"
"strings"
"time"
e "errors"
helmv2 "github.com/fluxcd/helm-controller/api/v2"
"gopkg.in/yaml.v2"
corev1 "k8s.io/api/core/v1"
apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/log"
)
type TenantHelmReconciler struct {
client.Client
Scheme *runtime.Scheme
}
func (r *TenantHelmReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
logger := log.FromContext(ctx)
hr := &helmv2.HelmRelease{}
if err := r.Get(ctx, req.NamespacedName, hr); err != nil {
if errors.IsNotFound(err) {
return ctrl.Result{}, nil
}
logger.Error(err, "unable to fetch HelmRelease")
return ctrl.Result{}, err
}
if !strings.HasPrefix(hr.Name, "tenant-") {
return ctrl.Result{}, nil
}
if len(hr.Status.Conditions) == 0 || hr.Status.Conditions[0].Type != "Ready" {
return ctrl.Result{}, nil
}
if len(hr.Status.History) == 0 {
logger.Info("no history in HelmRelease status", "name", hr.Name)
return ctrl.Result{}, nil
}
if hr.Status.History[0].Status != "deployed" {
return ctrl.Result{}, nil
}
newDigest := hr.Status.History[0].Digest
var hrList helmv2.HelmReleaseList
childNamespace := getChildNamespace(hr.Namespace, hr.Name)
if childNamespace == "tenant-root" && hr.Name == "tenant-root" {
if hr.Spec.Values == nil {
logger.Error(e.New("hr.Spec.Values is nil"), "cant annotate tenant-root ns")
return ctrl.Result{}, nil
}
err := annotateTenantRootNs(*hr.Spec.Values, r.Client)
if err != nil {
logger.Error(err, "cant annotate tenant-root ns")
return ctrl.Result{}, nil
}
logger.Info("namespace 'tenant-root' annotated")
}
if err := r.List(ctx, &hrList, client.InNamespace(childNamespace)); err != nil {
logger.Error(err, "unable to list HelmReleases in namespace", "namespace", hr.Name)
return ctrl.Result{}, err
}
for _, item := range hrList.Items {
if item.Name == hr.Name {
continue
}
oldDigest := item.GetAnnotations()["cozystack.io/tenant-config-digest"]
if oldDigest == newDigest {
continue
}
patchTarget := item.DeepCopy()
if patchTarget.Annotations == nil {
patchTarget.Annotations = map[string]string{}
}
ts := time.Now().Format(time.RFC3339Nano)
patchTarget.Annotations["cozystack.io/tenant-config-digest"] = newDigest
patchTarget.Annotations["reconcile.fluxcd.io/forceAt"] = ts
patchTarget.Annotations["reconcile.fluxcd.io/requestedAt"] = ts
patch := client.MergeFrom(item.DeepCopy())
if err := r.Patch(ctx, patchTarget, patch); err != nil {
logger.Error(err, "failed to patch HelmRelease", "name", patchTarget.Name)
continue
}
logger.Info("patched HelmRelease with new digest", "name", patchTarget.Name, "digest", newDigest, "version", hr.Status.History[0].Version)
}
return ctrl.Result{}, nil
}
func (r *TenantHelmReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&helmv2.HelmRelease{}).
Complete(r)
}
func getChildNamespace(currentNamespace, hrName string) string {
tenantName := strings.TrimPrefix(hrName, "tenant-")
switch {
case currentNamespace == "tenant-root" && hrName == "tenant-root":
// 1) root tenant inside root namespace
return "tenant-root"
case currentNamespace == "tenant-root":
// 2) any other tenant in root namespace
return fmt.Sprintf("tenant-%s", tenantName)
default:
// 3) tenant in a dedicated namespace
return fmt.Sprintf("%s-%s", currentNamespace, tenantName)
}
}
func annotateTenantRootNs(values apiextensionsv1.JSON, c client.Client) error {
var data map[string]interface{}
if err := yaml.Unmarshal(values.Raw, &data); err != nil {
return fmt.Errorf("failed to parse HelmRelease values: %w", err)
}
host, ok := data["host"].(string)
if !ok || host == "" {
return fmt.Errorf("host field not found or not a string")
}
var ns corev1.Namespace
if err := c.Get(context.TODO(), client.ObjectKey{Name: "tenant-root"}, &ns); err != nil {
return fmt.Errorf("failed to get namespace tenant-root: %w", err)
}
if ns.Annotations == nil {
ns.Annotations = map[string]string{}
}
ns.Annotations["namespace.cozystack.io/host"] = host
if err := c.Update(context.TODO(), &ns); err != nil {
return fmt.Errorf("failed to update namespace: %w", err)
}
return nil
}

View File

@@ -0,0 +1,99 @@
package controller
import (
"context"
"strings"
corev1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/types"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/log"
cozyv1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
)
// WorkloadMonitorReconciler reconciles a WorkloadMonitor object
type WorkloadReconciler struct {
client.Client
Scheme *runtime.Scheme
}
func (r *WorkloadReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
logger := log.FromContext(ctx)
w := &cozyv1alpha1.Workload{}
err := r.Get(ctx, req.NamespacedName, w)
if err != nil {
if apierrors.IsNotFound(err) {
return ctrl.Result{}, nil
}
logger.Error(err, "Unable to fetch Workload")
return ctrl.Result{}, err
}
// it's being deleted, nothing to handle
if w.DeletionTimestamp != nil {
return ctrl.Result{}, nil
}
t := getMonitoredObject(w)
if t == nil {
err = r.Delete(ctx, w)
if err != nil {
logger.Error(err, "failed to delete workload")
}
return ctrl.Result{}, err
}
err = r.Get(ctx, types.NamespacedName{Name: t.GetName(), Namespace: t.GetNamespace()}, t)
// found object, nothing to do
if err == nil {
return ctrl.Result{}, nil
}
// error getting object but not 404 -- requeue
if !apierrors.IsNotFound(err) {
logger.Error(err, "failed to get dependent object", "kind", t.GetObjectKind(), "dependent-object-name", t.GetName())
return ctrl.Result{}, err
}
err = r.Delete(ctx, w)
if err != nil {
logger.Error(err, "failed to delete workload")
}
return ctrl.Result{}, err
}
// SetupWithManager registers our controller with the Manager and sets up watches.
func (r *WorkloadReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
// Watch WorkloadMonitor objects
For(&cozyv1alpha1.Workload{}).
Complete(r)
}
func getMonitoredObject(w *cozyv1alpha1.Workload) client.Object {
switch {
case strings.HasPrefix(w.Name, "pvc-"):
obj := &corev1.PersistentVolumeClaim{}
obj.Name = strings.TrimPrefix(w.Name, "pvc-")
obj.Namespace = w.Namespace
return obj
case strings.HasPrefix(w.Name, "svc-"):
obj := &corev1.Service{}
obj.Name = strings.TrimPrefix(w.Name, "svc-")
obj.Namespace = w.Namespace
return obj
case strings.HasPrefix(w.Name, "pod-"):
obj := &corev1.Pod{}
obj.Name = strings.TrimPrefix(w.Name, "pod-")
obj.Namespace = w.Namespace
return obj
}
var obj client.Object
return obj
}

View File

@@ -0,0 +1,26 @@
package controller
import (
"testing"
cozyv1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
corev1 "k8s.io/api/core/v1"
)
func TestUnprefixedMonitoredObjectReturnsNil(t *testing.T) {
w := &cozyv1alpha1.Workload{}
w.Name = "unprefixed-name"
obj := getMonitoredObject(w)
if obj != nil {
t.Errorf(`getMonitoredObject(&Workload{Name: "%s"}) == %v, want nil`, w.Name, obj)
}
}
func TestPodMonitoredObject(t *testing.T) {
w := &cozyv1alpha1.Workload{}
w.Name = "pod-mypod"
obj := getMonitoredObject(w)
if pod, ok := obj.(*corev1.Pod); !ok || pod.Name != "mypod" {
t.Errorf(`getMonitoredObject(&Workload{Name: "%s"}) == %v, want &Pod{Name: "mypod"}`, w.Name, obj)
}
}

View File

@@ -3,6 +3,7 @@ package controller
import (
"context"
"encoding/json"
"fmt"
"sort"
apierrors "k8s.io/apimachinery/pkg/api/errors"
@@ -19,7 +20,7 @@ import (
"k8s.io/apimachinery/pkg/api/resource"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
cozyv1alpha1 "github.com/aenix-io/cozystack/api/v1alpha1"
cozyv1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
)
// WorkloadMonitorReconciler reconciles a WorkloadMonitor object
@@ -33,6 +34,17 @@ type WorkloadMonitorReconciler struct {
// +kubebuilder:rbac:groups=cozystack.io,resources=workloads,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=cozystack.io,resources=workloads/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch
// +kubebuilder:rbac:groups=core,resources=persistentvolumeclaims,verbs=get;list;watch
// isServiceReady checks if the service has an external IP bound
func (r *WorkloadMonitorReconciler) isServiceReady(svc *corev1.Service) bool {
return len(svc.Status.LoadBalancer.Ingress) > 0
}
// isPVCReady checks if the PVC is bound
func (r *WorkloadMonitorReconciler) isPVCReady(pvc *corev1.PersistentVolumeClaim) bool {
return pvc.Status.Phase == corev1.ClaimBound
}
// isPodReady checks if the Pod is in the Ready condition.
func (r *WorkloadMonitorReconciler) isPodReady(pod *corev1.Pod) bool {
@@ -88,6 +100,110 @@ func updateOwnerReferences(obj metav1.Object, monitor client.Object) {
obj.SetOwnerReferences(owners)
}
// reconcileServiceForMonitor creates or updates a Workload object for the given Service and WorkloadMonitor.
func (r *WorkloadMonitorReconciler) reconcileServiceForMonitor(
ctx context.Context,
monitor *cozyv1alpha1.WorkloadMonitor,
svc corev1.Service,
) error {
logger := log.FromContext(ctx)
workload := &cozyv1alpha1.Workload{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("svc-%s", svc.Name),
Namespace: svc.Namespace,
},
}
resources := make(map[string]resource.Quantity)
quantity := resource.MustParse("0")
for _, ing := range svc.Status.LoadBalancer.Ingress {
if ing.IP != "" {
quantity.Add(resource.MustParse("1"))
}
}
var resourceLabel string
if svc.Annotations != nil {
var ok bool
resourceLabel, ok = svc.Annotations["metallb.universe.tf/ip-allocated-from-pool"]
if !ok {
resourceLabel = "default"
}
}
resourceLabel = fmt.Sprintf("%s.ipaddresspool.metallb.io/requests.ipaddresses", resourceLabel)
resources[resourceLabel] = quantity
_, err := ctrl.CreateOrUpdate(ctx, r.Client, workload, func() error {
// Update owner references with the new monitor
updateOwnerReferences(workload.GetObjectMeta(), monitor)
workload.Labels = svc.Labels
// Fill Workload status fields:
workload.Status.Kind = monitor.Spec.Kind
workload.Status.Type = monitor.Spec.Type
workload.Status.Resources = resources
workload.Status.Operational = r.isServiceReady(&svc)
return nil
})
if err != nil {
logger.Error(err, "Failed to CreateOrUpdate Workload", "workload", workload.Name)
return err
}
return nil
}
// reconcilePVCForMonitor creates or updates a Workload object for the given PVC and WorkloadMonitor.
func (r *WorkloadMonitorReconciler) reconcilePVCForMonitor(
ctx context.Context,
monitor *cozyv1alpha1.WorkloadMonitor,
pvc corev1.PersistentVolumeClaim,
) error {
logger := log.FromContext(ctx)
workload := &cozyv1alpha1.Workload{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("pvc-%s", pvc.Name),
Namespace: pvc.Namespace,
},
}
resources := make(map[string]resource.Quantity)
for resourceName, resourceQuantity := range pvc.Status.Capacity {
storageClass := "default"
if pvc.Spec.StorageClassName != nil || *pvc.Spec.StorageClassName == "" {
storageClass = *pvc.Spec.StorageClassName
}
resourceLabel := fmt.Sprintf("%s.storageclass.storage.k8s.io/requests.%s", storageClass, resourceName.String())
resources[resourceLabel] = resourceQuantity
}
_, err := ctrl.CreateOrUpdate(ctx, r.Client, workload, func() error {
// Update owner references with the new monitor
updateOwnerReferences(workload.GetObjectMeta(), monitor)
workload.Labels = pvc.Labels
// Fill Workload status fields:
workload.Status.Kind = monitor.Spec.Kind
workload.Status.Type = monitor.Spec.Type
workload.Status.Resources = resources
workload.Status.Operational = r.isPVCReady(&pvc)
return nil
})
if err != nil {
logger.Error(err, "Failed to CreateOrUpdate Workload", "workload", workload.Name)
return err
}
return nil
}
// reconcilePodForMonitor creates or updates a Workload object for the given Pod and WorkloadMonitor.
func (r *WorkloadMonitorReconciler) reconcilePodForMonitor(
ctx context.Context,
@@ -96,15 +212,12 @@ func (r *WorkloadMonitorReconciler) reconcilePodForMonitor(
) error {
logger := log.FromContext(ctx)
// Combine both init containers and normal containers to sum resources properly
combinedContainers := append(pod.Spec.InitContainers, pod.Spec.Containers...)
// totalResources will store the sum of all container resource limits
// totalResources will store the sum of all container resource requests
totalResources := make(map[string]resource.Quantity)
// Iterate over all containers to aggregate their Limits
for _, container := range combinedContainers {
for name, qty := range container.Resources.Limits {
// Iterate over all containers to aggregate their requests
for _, container := range pod.Spec.Containers {
for name, qty := range container.Resources.Requests {
if existing, exists := totalResources[name.String()]; exists {
existing.Add(qty)
totalResources[name.String()] = existing
@@ -133,7 +246,7 @@ func (r *WorkloadMonitorReconciler) reconcilePodForMonitor(
workload := &cozyv1alpha1.Workload{
ObjectMeta: metav1.ObjectMeta{
Name: pod.Name,
Name: fmt.Sprintf("pod-%s", pod.Name),
Namespace: pod.Namespace,
},
}
@@ -205,6 +318,45 @@ func (r *WorkloadMonitorReconciler) Reconcile(ctx context.Context, req ctrl.Requ
}
}
pvcList := &corev1.PersistentVolumeClaimList{}
if err := r.List(
ctx,
pvcList,
client.InNamespace(monitor.Namespace),
client.MatchingLabels(monitor.Spec.Selector),
); err != nil {
logger.Error(err, "Unable to list PVCs for WorkloadMonitor", "monitor", monitor.Name)
return ctrl.Result{}, err
}
for _, pvc := range pvcList.Items {
if err := r.reconcilePVCForMonitor(ctx, monitor, pvc); err != nil {
logger.Error(err, "Failed to reconcile Workload for PVC", "PVC", pvc.Name)
continue
}
}
svcList := &corev1.ServiceList{}
if err := r.List(
ctx,
svcList,
client.InNamespace(monitor.Namespace),
client.MatchingLabels(monitor.Spec.Selector),
); err != nil {
logger.Error(err, "Unable to list Services for WorkloadMonitor", "monitor", monitor.Name)
return ctrl.Result{}, err
}
for _, svc := range svcList.Items {
if svc.Spec.Type != corev1.ServiceTypeLoadBalancer {
continue
}
if err := r.reconcileServiceForMonitor(ctx, monitor, svc); err != nil {
logger.Error(err, "Failed to reconcile Workload for Service", "Service", svc.Name)
continue
}
}
// Update WorkloadMonitor status based on observed pods
monitor.Status.ObservedReplicas = observedReplicas
monitor.Status.AvailableReplicas = availableReplicas
@@ -233,41 +385,51 @@ func (r *WorkloadMonitorReconciler) SetupWithManager(mgr ctrl.Manager) error {
// Also watch Pod objects and map them back to WorkloadMonitor if labels match
Watches(
&corev1.Pod{},
handler.EnqueueRequestsFromMapFunc(func(ctx context.Context, obj client.Object) []reconcile.Request {
pod, ok := obj.(*corev1.Pod)
if !ok {
return nil
}
var monitorList cozyv1alpha1.WorkloadMonitorList
// List all WorkloadMonitors in the same namespace
if err := r.List(ctx, &monitorList, client.InNamespace(pod.Namespace)); err != nil {
return nil
}
// Match each monitor's selector with the Pod's labels
var requests []reconcile.Request
for _, m := range monitorList.Items {
matches := true
for k, v := range m.Spec.Selector {
if podVal, exists := pod.Labels[k]; !exists || podVal != v {
matches = false
break
}
}
if matches {
requests = append(requests, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: m.Namespace,
Name: m.Name,
},
})
}
}
return requests
}),
handler.EnqueueRequestsFromMapFunc(mapObjectToMonitor(&corev1.Pod{}, r.Client)),
).
// Watch PVCs as well
Watches(
&corev1.PersistentVolumeClaim{},
handler.EnqueueRequestsFromMapFunc(mapObjectToMonitor(&corev1.PersistentVolumeClaim{}, r.Client)),
).
// Watch for changes to Workload objects we create (owned by WorkloadMonitor)
Owns(&cozyv1alpha1.Workload{}).
Complete(r)
}
func mapObjectToMonitor[T client.Object](_ T, c client.Client) func(ctx context.Context, obj client.Object) []reconcile.Request {
return func(ctx context.Context, obj client.Object) []reconcile.Request {
concrete, ok := obj.(T)
if !ok {
return nil
}
var monitorList cozyv1alpha1.WorkloadMonitorList
// List all WorkloadMonitors in the same namespace
if err := c.List(ctx, &monitorList, client.InNamespace(concrete.GetNamespace())); err != nil {
return nil
}
labels := concrete.GetLabels()
// Match each monitor's selector with the Pod's labels
var requests []reconcile.Request
for _, m := range monitorList.Items {
matches := true
for k, v := range m.Spec.Selector {
if labelVal, exists := labels[k]; !exists || labelVal != v {
matches = false
break
}
}
if matches {
requests = append(requests, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: m.Namespace,
Name: m.Name,
},
})
}
}
return requests
}
}

View File

@@ -16,7 +16,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/log"
cozyv1alpha1 "github.com/aenix-io/cozystack/api/v1alpha1"
cozyv1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
)
// Collector handles telemetry data collection and sending

View File

@@ -1,105 +0,0 @@
---
# Source: cozy-installer/templates/cozystack.yaml
apiVersion: v1
kind: Namespace
metadata:
name: cozy-system
labels:
pod-security.kubernetes.io/enforce: privileged
---
# Source: cozy-installer/templates/cozystack.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: cozystack
namespace: cozy-system
---
# Source: cozy-installer/templates/cozystack.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cozystack
subjects:
- kind: ServiceAccount
name: cozystack
namespace: cozy-system
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
---
# Source: cozy-installer/templates/cozystack.yaml
apiVersion: v1
kind: Service
metadata:
name: cozystack
namespace: cozy-system
spec:
ports:
- name: http
port: 80
targetPort: 8123
selector:
app: cozystack
type: ClusterIP
---
# Source: cozy-installer/templates/cozystack.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: cozystack
namespace: cozy-system
spec:
replicas: 1
selector:
matchLabels:
app: cozystack
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 0
maxUnavailable: 1
template:
metadata:
labels:
app: cozystack
spec:
hostNetwork: true
serviceAccountName: cozystack
containers:
- name: cozystack
image: "ghcr.io/aenix-io/cozystack/cozystack:v0.22.0"
env:
- name: KUBERNETES_SERVICE_HOST
value: localhost
- name: KUBERNETES_SERVICE_PORT
value: "7445"
- name: K8S_AWAIT_ELECTION_ENABLED
value: "1"
- name: K8S_AWAIT_ELECTION_NAME
value: cozystack
- name: K8S_AWAIT_ELECTION_LOCK_NAME
value: cozystack
- name: K8S_AWAIT_ELECTION_LOCK_NAMESPACE
value: cozy-system
- name: K8S_AWAIT_ELECTION_IDENTITY
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: darkhttpd
image: "ghcr.io/aenix-io/cozystack/cozystack:v0.22.0"
command:
- /usr/bin/darkhttpd
- /cozystack/assets
- --port
- "8123"
ports:
- name: http
containerPort: 8123
tolerations:
- key: "node.kubernetes.io/not-ready"
operator: "Exists"
effect: "NoSchedule"
- key: "node.cilium.io/agent-not-ready"
operator: "Exists"
effect: "NoSchedule"

View File

@@ -0,0 +1,3 @@
# S3 bucket
## Parameters

View File

@@ -11,7 +11,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '*'
version: '>= 0.0.0-0'
interval: 1m0s
timeout: 5m0s
values:

View File

@@ -0,0 +1,5 @@
{
"title": "Chart Values",
"type": "object",
"properties": {}
}

View File

@@ -0,0 +1 @@
{}

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.6.1
version: 0.7.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -14,6 +14,7 @@ image:
--cache-to type=inline \
--metadata-file images/clickhouse-backup.json \
--push=$(PUSH) \
--label "org.opencontainers.image.source=https://github.com/cozystack/cozystack" \
--load=$(LOAD)
echo "$(REGISTRY)/clickhouse-backup:$(call settag,$(CLICKHOUSE_BACKUP_TAG))@$$(yq e '."containerimage.digest"' images/clickhouse-backup.json -o json -r)" \
> images/clickhouse-backup.tag

View File

@@ -36,13 +36,15 @@ more details:
### Backup parameters
| Name | Description | Value |
| ------------------------ | ---------------------------------------------- | ------------------------------------------------------ |
| `backup.enabled` | Enable pereiodic backups | `false` |
| `backup.s3Region` | The AWS S3 region where backups are stored | `us-east-1` |
| `backup.s3Bucket` | The S3 bucket used for storing backups | `s3.example.org/clickhouse-backups` |
| `backup.schedule` | Cron schedule for automated backups | `0 2 * * *` |
| `backup.cleanupStrategy` | The strategy for cleaning up old backups | `--keep-last=3 --keep-daily=3 --keep-within-weekly=1m` |
| `backup.s3AccessKey` | The access key for S3, used for authentication | `oobaiRus9pah8PhohL1ThaeTa4UVa7gu` |
| `backup.s3SecretKey` | The secret key for S3, used for authentication | `ju3eum4dekeich9ahM1te8waeGai0oog` |
| `backup.resticPassword` | The password for Restic backup encryption | `ChaXoveekoh6eigh4siesheeda2quai0` |
| Name | Description | Value |
| ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ |
| `backup.enabled` | Enable pereiodic backups | `false` |
| `backup.s3Region` | The AWS S3 region where backups are stored | `us-east-1` |
| `backup.s3Bucket` | The S3 bucket used for storing backups | `s3.example.org/clickhouse-backups` |
| `backup.schedule` | Cron schedule for automated backups | `0 2 * * *` |
| `backup.cleanupStrategy` | The strategy for cleaning up old backups | `--keep-last=3 --keep-daily=3 --keep-within-weekly=1m` |
| `backup.s3AccessKey` | The access key for S3, used for authentication | `oobaiRus9pah8PhohL1ThaeTa4UVa7gu` |
| `backup.s3SecretKey` | The secret key for S3, used for authentication | `ju3eum4dekeich9ahM1te8waeGai0oog` |
| `backup.resticPassword` | The password for Restic backup encryption | `ChaXoveekoh6eigh4siesheeda2quai0` |
| `resources` | Resources | `{}` |
| `resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `nano` |

View File

@@ -1 +1 @@
ghcr.io/aenix-io/cozystack/clickhouse-backup:0.6.1@sha256:7a99cabdfd541f863aa5d1b2f7b49afd39838fb94c8448986634a1dc9050751c
ghcr.io/cozystack/cozystack/clickhouse-backup:0.7.0@sha256:3faf7a4cebf390b9053763107482de175aa0fdb88c1e77424fd81100b1c3a205

View File

@@ -0,0 +1,50 @@
{{/*
Copyright Broadcom, Inc. All Rights Reserved.
SPDX-License-Identifier: APACHE-2.0
*/}}
{{/* vim: set filetype=mustache: */}}
{{/*
Return a resource request/limit object based on a given preset.
These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}
{{- index $presets .type | toYaml -}}
{{- else -}}
{{- printf "ERROR: Preset key '%s' invalid. Allowed values are %s" .type (join "," (keys $presets)) | fail -}}
{{- end -}}
{{- end -}}

View File

@@ -121,6 +121,11 @@ spec:
containers:
- name: clickhouse
image: clickhouse/clickhouse-server:24.9.2.42
{{- if .Values.resources }}
resources: {{- toYaml .Values.resources | nindent 16 }}
{{- else if ne .Values.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.resourcesPreset "Release" .Release) | nindent 16 }}
{{- end }}
volumeMounts:
- name: data-volume-template
mountPath: /var/lib/clickhouse

View File

@@ -17,3 +17,10 @@ rules:
resourceNames:
- {{ .Release.Name }}-credentials
verbs: ["get", "list", "watch"]
- apiGroups:
- cozystack.io
resources:
- workloadmonitors
resourceNames:
- {{ .Release.Name }}
verbs: ["get", "list", "watch"]

View File

@@ -0,0 +1,13 @@
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
metadata:
name: {{ $.Release.Name }}
spec:
replicas: {{ .Values.replicas }}
minReplicas: 1
kind: clickhouse
type: clickhouse
selector:
clickhouse.altinity.com/chi: {{ $.Release.Name }}
version: {{ $.Chart.Version }}

View File

@@ -76,6 +76,16 @@
"default": "ChaXoveekoh6eigh4siesheeda2quai0"
}
}
},
"resources": {
"type": "object",
"description": "Resources",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).",
"default": "nano"
}
}
}

View File

@@ -46,3 +46,16 @@ backup:
s3AccessKey: oobaiRus9pah8PhohL1ThaeTa4UVa7gu
s3SecretKey: ju3eum4dekeich9ahM1te8waeGai0oog
resticPassword: ChaXoveekoh6eigh4siesheeda2quai0
## @param resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
resourcesPreset: "nano"

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.4.1
version: 0.5.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -21,15 +21,17 @@
### Backup parameters
| Name | Description | Value |
| ------------------------ | ---------------------------------------------- | ------------------------------------------------------ |
| `backup.enabled` | Enable pereiodic backups | `false` |
| `backup.s3Region` | The AWS S3 region where backups are stored | `us-east-1` |
| `backup.s3Bucket` | The S3 bucket used for storing backups | `s3.example.org/postgres-backups` |
| `backup.schedule` | Cron schedule for automated backups | `0 2 * * *` |
| `backup.cleanupStrategy` | The strategy for cleaning up old backups | `--keep-last=3 --keep-daily=3 --keep-within-weekly=1m` |
| `backup.s3AccessKey` | The access key for S3, used for authentication | `oobaiRus9pah8PhohL1ThaeTa4UVa7gu` |
| `backup.s3SecretKey` | The secret key for S3, used for authentication | `ju3eum4dekeich9ahM1te8waeGai0oog` |
| `backup.resticPassword` | The password for Restic backup encryption | `ChaXoveekoh6eigh4siesheeda2quai0` |
| Name | Description | Value |
| ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ |
| `backup.enabled` | Enable pereiodic backups | `false` |
| `backup.s3Region` | The AWS S3 region where backups are stored | `us-east-1` |
| `backup.s3Bucket` | The S3 bucket used for storing backups | `s3.example.org/postgres-backups` |
| `backup.schedule` | Cron schedule for automated backups | `0 2 * * *` |
| `backup.cleanupStrategy` | The strategy for cleaning up old backups | `--keep-last=3 --keep-daily=3 --keep-within-weekly=1m` |
| `backup.s3AccessKey` | The access key for S3, used for authentication | `oobaiRus9pah8PhohL1ThaeTa4UVa7gu` |
| `backup.s3SecretKey` | The secret key for S3, used for authentication | `ju3eum4dekeich9ahM1te8waeGai0oog` |
| `backup.resticPassword` | The password for Restic backup encryption | `ChaXoveekoh6eigh4siesheeda2quai0` |
| `resources` | Resources | `{}` |
| `resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `nano` |

View File

@@ -1 +1 @@
ghcr.io/aenix-io/cozystack/postgres-backup:0.8.0@sha256:6a8ec7e7052f2d02ec5457d7cbac6ee52b3ed93a883988a192d1394fc7c88117
ghcr.io/cozystack/cozystack/postgres-backup:0.10.1@sha256:10179ed56457460d95cd5708db2a00130901255fa30c4dd76c65d2ef5622b61f

View File

@@ -0,0 +1,50 @@
{{/*
Copyright Broadcom, Inc. All Rights Reserved.
SPDX-License-Identifier: APACHE-2.0
*/}}
{{/* vim: set filetype=mustache: */}}
{{/*
Return a resource request/limit object based on a given preset.
These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}
{{- index $presets .type | toYaml -}}
{{- else -}}
{{- printf "ERROR: Preset key '%s' invalid. Allowed values are %s" .type (join "," (keys $presets)) | fail -}}
{{- end -}}
{{- end -}}

View File

@@ -17,3 +17,10 @@ rules:
resourceNames:
- {{ .Release.Name }}-credentials
verbs: ["get", "list", "watch"]
- apiGroups:
- cozystack.io
resources:
- workloadmonitors
resourceNames:
- {{ .Release.Name }}
verbs: ["get", "list", "watch"]

View File

@@ -6,10 +6,20 @@ metadata:
spec:
instances: {{ .Values.replicas }}
enableSuperuserAccess: true
{{- $configMap := lookup "v1" "ConfigMap" "cozy-system" "cozystack-scheduling" }}
{{- if $configMap }}
{{- $rawConstraints := get $configMap.data "globalAppTopologySpreadConstraints" }}
{{- if $rawConstraints }}
{{- $rawConstraints | fromYaml | toYaml | nindent 2 }}
{{- end }}
{{- end }}
minSyncReplicas: {{ .Values.quorum.minSyncReplicas }}
maxSyncReplicas: {{ .Values.quorum.maxSyncReplicas }}
{{- if .Values.resources }}
resources: {{- toYaml .Values.resources | nindent 4 }}
{{- else if ne .Values.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.resourcesPreset "Release" .Release) | nindent 4 }}
{{- end }}
monitoring:
enablePodMonitor: true

View File

@@ -0,0 +1,13 @@
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
metadata:
name: {{ $.Release.Name }}
spec:
replicas: {{ .Values.replicas }}
minReplicas: 1
kind: ferretdb
type: ferretdb
selector:
app: {{ $.Release.Name }}
version: {{ $.Chart.Version }}

View File

@@ -81,6 +81,16 @@
"default": "ChaXoveekoh6eigh4siesheeda2quai0"
}
}
},
"resources": {
"type": "object",
"description": "Resources",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).",
"default": "nano"
}
}
}

View File

@@ -48,3 +48,16 @@ backup:
s3AccessKey: oobaiRus9pah8PhohL1ThaeTa4UVa7gu
s3SecretKey: ju3eum4dekeich9ahM1te8waeGai0oog
resticPassword: ChaXoveekoh6eigh4siesheeda2quai0
## @param resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
resourcesPreset: "nano"

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.3.1
version: 0.4.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -13,6 +13,7 @@ image-nginx:
--cache-to type=inline \
--metadata-file images/nginx-cache.json \
--push=$(PUSH) \
--label "org.opencontainers.image.source=https://github.com/cozystack/cozystack" \
--load=$(LOAD)
echo "$(REGISTRY)/nginx-cache:$(call settag,$(NGINX_CACHE_TAG))@$$(yq e '."containerimage.digest"' images/nginx-cache.json -o json -r)" \
> images/nginx-cache.tag

View File

@@ -60,13 +60,17 @@ VTS module shows wrong upstream resonse time
### Common parameters
| Name | Description | Value |
| ------------------ | ----------------------------------------------- | ------- |
| `external` | Enable external access from outside the cluster | `false` |
| `size` | Persistent Volume size | `10Gi` |
| `storageClass` | StorageClass used to store the data | `""` |
| `haproxy.replicas` | Number of HAProxy replicas | `2` |
| `nginx.replicas` | Number of Nginx replicas | `2` |
| Name | Description | Value |
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `external` | Enable external access from outside the cluster | `false` |
| `size` | Persistent Volume size | `10Gi` |
| `storageClass` | StorageClass used to store the data | `""` |
| `haproxy.replicas` | Number of HAProxy replicas | `2` |
| `nginx.replicas` | Number of Nginx replicas | `2` |
| `haproxy.resources` | Resources | `{}` |
| `haproxy.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `nano` |
| `nginx.resources` | Resources | `{}` |
| `nginx.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `nano` |
### Configuration parameters

View File

@@ -1 +1 @@
ghcr.io/aenix-io/cozystack/nginx-cache:0.3.1@sha256:a3c25199acb8e8426e6952658ccc4acaadb50fe2cfa6359743b64e5166b3fc70
ghcr.io/cozystack/cozystack/nginx-cache:0.4.0@sha256:4e1f5153d2673a399b315252238f4dc3eb5d6c59295aef594691710cc5b72eb4

View File

@@ -0,0 +1,50 @@
{{/*
Copyright Broadcom, Inc. All Rights Reserved.
SPDX-License-Identifier: APACHE-2.0
*/}}
{{/* vim: set filetype=mustache: */}}
{{/*
Return a resource request/limit object based on a given preset.
These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}
{{- index $presets .type | toYaml -}}
{{- else -}}
{{- printf "ERROR: Preset key '%s' invalid. Allowed values are %s" .type (join "," (keys $presets)) | fail -}}
{{- end -}}
{{- end -}}

View File

@@ -33,6 +33,11 @@ spec:
containers:
- image: haproxy:latest
name: haproxy
{{- if .Values.haproxy.resources }}
resources: {{- toYaml .Values.haproxy.resources | nindent 10 }}
{{- else if ne .Values.haproxy.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.haproxy.resourcesPreset "Release" .Release) | nindent 10 }}
{{- end }}
ports:
- containerPort: 8080
name: http

View File

@@ -52,6 +52,11 @@ spec:
shareProcessNamespace: true
containers:
- name: nginx
{{- if $.Values.nginx.resources }}
resources: {{- toYaml $.Values.nginx.resources | nindent 10 }}
{{- else if ne $.Values.nginx.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" $.Values.nginx.resourcesPreset "Release" $.Release) | nindent 10 }}
{{- end }}
image: "{{ $.Files.Get "images/nginx-cache.tag" | trim }}"
readinessProbe:
httpGet:
@@ -83,6 +88,13 @@ spec:
- name: reloader
image: "{{ $.Files.Get "images/nginx-cache.tag" | trim }}"
command: ["/usr/bin/nginx-reloader.sh"]
resources:
limits:
cpu: 50m
memory: 50Mi
requests:
cpu: 50m
memory: 50Mi
#command: ["sleep", "infinity"]
volumeMounts:
- mountPath: /etc/nginx/nginx.conf

View File

@@ -24,6 +24,16 @@
"type": "number",
"description": "Number of HAProxy replicas",
"default": 2
},
"resources": {
"type": "object",
"description": "Resources",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).",
"default": "nano"
}
}
},
@@ -34,6 +44,16 @@
"type": "number",
"description": "Number of Nginx replicas",
"default": 2
},
"resources": {
"type": "object",
"description": "Resources",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).",
"default": "nano"
}
}
},

View File

@@ -12,8 +12,32 @@ size: 10Gi
storageClass: ""
haproxy:
replicas: 2
## @param haproxy.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param haproxy.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
resourcesPreset: "nano"
nginx:
replicas: 2
## @param nginx.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param nginx.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
resourcesPreset: "nano"
## @section Configuration parameters

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.3.1
version: 0.5.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -4,15 +4,19 @@
### Common parameters
| Name | Description | Value |
| ------------------------ | ----------------------------------------------- | ------- |
| `external` | Enable external access from outside the cluster | `false` |
| `kafka.size` | Persistent Volume size for Kafka | `10Gi` |
| `kafka.replicas` | Number of Kafka replicas | `3` |
| `kafka.storageClass` | StorageClass used to store the Kafka data | `""` |
| `zookeeper.size` | Persistent Volume size for ZooKeeper | `5Gi` |
| `zookeeper.replicas` | Number of ZooKeeper replicas | `3` |
| `zookeeper.storageClass` | StorageClass used to store the ZooKeeper data | `""` |
| Name | Description | Value |
| --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `external` | Enable external access from outside the cluster | `false` |
| `kafka.size` | Persistent Volume size for Kafka | `10Gi` |
| `kafka.replicas` | Number of Kafka replicas | `3` |
| `kafka.storageClass` | StorageClass used to store the Kafka data | `""` |
| `zookeeper.size` | Persistent Volume size for ZooKeeper | `5Gi` |
| `zookeeper.replicas` | Number of ZooKeeper replicas | `3` |
| `zookeeper.storageClass` | StorageClass used to store the ZooKeeper data | `""` |
| `kafka.resources` | Resources | `{}` |
| `kafka.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `nano` |
| `zookeeper.resources` | Resources | `{}` |
| `zookeeper.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `nano` |
### Configuration parameters

View File

@@ -0,0 +1,50 @@
{{/*
Copyright Broadcom, Inc. All Rights Reserved.
SPDX-License-Identifier: APACHE-2.0
*/}}
{{/* vim: set filetype=mustache: */}}
{{/*
Return a resource request/limit object based on a given preset.
These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}
{{- index $presets .type | toYaml -}}
{{- else -}}
{{- printf "ERROR: Preset key '%s' invalid. Allowed values are %s" .type (join "," (keys $presets)) | fail -}}
{{- end -}}
{{- end -}}

View File

@@ -17,3 +17,11 @@ rules:
resourceNames:
- {{ .Release.Name }}-clients-ca
verbs: ["get", "list", "watch"]
- apiGroups:
- cozystack.io
resources:
- workloadmonitors
resourceNames:
- {{ .Release.Name }}
- {{ $.Release.Name }}-zookeeper
verbs: ["get", "list", "watch"]

View File

@@ -8,6 +8,11 @@ metadata:
spec:
kafka:
replicas: {{ .Values.kafka.replicas }}
{{- if .Values.kafka.resources }}
resources: {{- toYaml .Values.kafka.resources | nindent 6 }}
{{- else if ne .Values.kafka.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.kafka.resourcesPreset "Release" .Release) | nindent 6 }}
{{- end }}
listeners:
- name: plain
port: 9092
@@ -57,8 +62,19 @@ spec:
class: {{ . }}
{{- end }}
deleteClaim: true
metricsConfig:
type: jmxPrometheusExporter
valueFrom:
configMapKeyRef:
name: {{ .Release.Name }}-metrics
key: kafka-metrics-config.yml
zookeeper:
replicas: {{ .Values.zookeeper.replicas }}
{{- if .Values.zookeeper.resources }}
resources: {{- toYaml .Values.zookeeper.resources | nindent 6 }}
{{- else if ne .Values.zookeeper.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.zookeeper.resourcesPreset "Release" .Release) | nindent 6 }}
{{- end }}
storage:
type: persistent-claim
{{- with .Values.zookeeper.size }}
@@ -68,6 +84,12 @@ spec:
class: {{ . }}
{{- end }}
deleteClaim: false
metricsConfig:
type: jmxPrometheusExporter
valueFrom:
configMapKeyRef:
name: {{ .Release.Name }}-metrics
key: kafka-metrics-config.yml
entityOperator:
topicOperator: {}
userOperator: {}

View File

@@ -0,0 +1,198 @@
kind: ConfigMap
apiVersion: v1
metadata:
name: {{ .Release.Name }}-metrics
data:
kafka-metrics-config.yml: |
# See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
lowercaseOutputName: true
rules:
# Special cases and very specific rules
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
name: kafka_server_$1_$2
type: GAUGE
labels:
clientId: "$3"
topic: "$4"
partition: "$5"
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
name: kafka_server_$1_$2
type: GAUGE
labels:
clientId: "$3"
broker: "$4:$5"
- pattern: kafka.server<type=(.+), cipher=(.+), protocol=(.+), listener=(.+), networkProcessor=(.+)><>connections
name: kafka_server_$1_connections_tls_info
type: GAUGE
labels:
cipher: "$2"
protocol: "$3"
listener: "$4"
networkProcessor: "$5"
- pattern: kafka.server<type=(.+), clientSoftwareName=(.+), clientSoftwareVersion=(.+), listener=(.+), networkProcessor=(.+)><>connections
name: kafka_server_$1_connections_software
type: GAUGE
labels:
clientSoftwareName: "$2"
clientSoftwareVersion: "$3"
listener: "$4"
networkProcessor: "$5"
- pattern: "kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+-total):"
name: kafka_server_$1_$4
type: COUNTER
labels:
listener: "$2"
networkProcessor: "$3"
- pattern: "kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+):"
name: kafka_server_$1_$4
type: GAUGE
labels:
listener: "$2"
networkProcessor: "$3"
- pattern: kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+-total)
name: kafka_server_$1_$4
type: COUNTER
labels:
listener: "$2"
networkProcessor: "$3"
- pattern: kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+)
name: kafka_server_$1_$4
type: GAUGE
labels:
listener: "$2"
networkProcessor: "$3"
# Some percent metrics use MeanRate attribute
# Ex) kafka.server<type=(KafkaRequestHandlerPool), name=(RequestHandlerAvgIdlePercent)><>MeanRate
- pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>MeanRate
name: kafka_$1_$2_$3_percent
type: GAUGE
# Generic gauges for percents
- pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>Value
name: kafka_$1_$2_$3_percent
type: GAUGE
- pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*, (.+)=(.+)><>Value
name: kafka_$1_$2_$3_percent
type: GAUGE
labels:
"$4": "$5"
# Generic per-second counters with 0-2 key/value pairs
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
# Generic gauges with 0-2 key/value pairs
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
# Emulate Prometheus 'Summary' metrics for the exported 'Histogram's.
# Note that these are missing the '_sum' metric!
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*), (.+)=(.+)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
"$6": "$7"
quantile: "0.$8"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
quantile: "0.$6"
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: GAUGE
labels:
quantile: "0.$4"
# KRaft overall related metrics
# distinguish between always increasing COUNTER (total and max) and variable GAUGE (all others) metrics
- pattern: "kafka.server<type=raft-metrics><>(.+-total|.+-max):"
name: kafka_server_raftmetrics_$1
type: COUNTER
- pattern: "kafka.server<type=raft-metrics><>(current-state): (.+)"
name: kafka_server_raftmetrics_$1
value: 1
type: UNTYPED
labels:
$1: "$2"
- pattern: "kafka.server<type=raft-metrics><>(.+):"
name: kafka_server_raftmetrics_$1
type: GAUGE
# KRaft "low level" channels related metrics
# distinguish between always increasing COUNTER (total and max) and variable GAUGE (all others) metrics
- pattern: "kafka.server<type=raft-channel-metrics><>(.+-total|.+-max):"
name: kafka_server_raftchannelmetrics_$1
type: COUNTER
- pattern: "kafka.server<type=raft-channel-metrics><>(.+):"
name: kafka_server_raftchannelmetrics_$1
type: GAUGE
# Broker metrics related to fetching metadata topic records in KRaft mode
- pattern: "kafka.server<type=broker-metadata-metrics><>(.+):"
name: kafka_server_brokermetadatametrics_$1
type: GAUGE
zookeeper-metrics-config.yml: |
# See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
lowercaseOutputName: true
rules:
# replicated Zookeeper
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+)><>(\\w+)"
name: "zookeeper_$2"
type: GAUGE
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+)><>(\\w+)"
name: "zookeeper_$3"
type: GAUGE
labels:
replicaId: "$2"
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(Packets\\w+)"
name: "zookeeper_$4"
type: COUNTER
labels:
replicaId: "$2"
memberType: "$3"
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(\\w+)"
name: "zookeeper_$4"
type: GAUGE
labels:
replicaId: "$2"
memberType: "$3"
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+), name3=(\\w+)><>(\\w+)"
name: "zookeeper_$4_$5"
type: GAUGE
labels:
replicaId: "$2"
memberType: "$3"

View File

@@ -0,0 +1,40 @@
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
name: {{ .Release.Name }}
spec:
podMetricsEndpoints:
- port: tcp-prometheus
scheme: http
relabelConfigs:
- separator: ;
regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
replacement: $1
action: labelmap
- sourceLabels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
targetLabel: namespace
replacement: $1
action: replace
- sourceLabels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
targetLabel: pod
replacement: $1
action: replace
- sourceLabels: [__meta_kubernetes_pod_node_name]
separator: ;
regex: (.*)
targetLabel: node
replacement: $1
action: replace
- sourceLabels: [__meta_kubernetes_pod_host_ip]
separator: ;
regex: (.*)
targetLabel: node_ip
replacement: $1
action: replace
selector:
matchLabels:
app.kubernetes.io/instance: {{ .Release.Name }}

View File

@@ -0,0 +1,30 @@
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
metadata:
name: {{ $.Release.Name }}
spec:
replicas: {{ .Values.replicas }}
minReplicas: 1
kind: kafka
type: kafka
selector:
app.kubernetes.io/instance: {{ $.Release.Name }}
app.kubernetes.io/name: kafka
version: {{ $.Chart.Version }}
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
metadata:
name: {{ $.Release.Name }}-zookeeper
spec:
replicas: {{ .Values.replicas }}
minReplicas: 1
kind: kafka
type: zookeeper
selector:
app.kubernetes.io/instance: {{ $.Release.Name }}
app.kubernetes.io/name: zookeeper
version: {{ $.Chart.Version }}

View File

@@ -24,6 +24,16 @@
"type": "string",
"description": "StorageClass used to store the Kafka data",
"default": ""
},
"resources": {
"type": "object",
"description": "Resources",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).",
"default": "nano"
}
}
},
@@ -44,6 +54,16 @@
"type": "string",
"description": "StorageClass used to store the ZooKeeper data",
"default": ""
},
"resources": {
"type": "object",
"description": "Resources",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).",
"default": "nano"
}
}
},

View File

@@ -14,10 +14,35 @@ kafka:
size: 10Gi
replicas: 3
storageClass: ""
## @param kafka.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param kafka.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
resourcesPreset: "nano"
zookeeper:
size: 5Gi
replicas: 3
storageClass: ""
## @param zookeeper.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param zookeeper.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
resourcesPreset: "nano"
## @section Configuration parameters

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.15.0
version: 0.20.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -1,4 +1,4 @@
UBUNTU_CONTAINER_DISK_TAG = v1.30.1
KUBERNETES_VERSION = v1.32
KUBERNETES_PKG_TAG = $(shell awk '$$1 == "version:" {print $$2}' Chart.yaml)
include ../../../scripts/common-envs.mk
@@ -6,20 +6,26 @@ include ../../../scripts/package.mk
generate:
readme-generator -v values.yaml -s values.schema.json -r README.md
yq -o json -i '.properties.controlPlane.properties.apiServer.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
yq -o json -i '.properties.controlPlane.properties.controllerManager.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
yq -o json -i '.properties.controlPlane.properties.scheduler.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
yq -o json -i '.properties.controlPlane.properties.konnectivity.properties.server.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
image: image-ubuntu-container-disk image-kubevirt-cloud-provider image-kubevirt-csi-driver image-cluster-autoscaler
image-ubuntu-container-disk:
docker buildx build --platform linux/amd64 --build-arg ARCH=amd64 images/ubuntu-container-disk \
--provenance false \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(UBUNTU_CONTAINER_DISK_TAG)) \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(UBUNTU_CONTAINER_DISK_TAG)-$(TAG)) \
--build-arg KUBERNETES_VERSION=${KUBERNETES_VERSION} \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(KUBERNETES_VERSION)) \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(KUBERNETES_VERSION)-$(TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/ubuntu-container-disk:latest \
--cache-to type=inline \
--metadata-file images/ubuntu-container-disk.json \
--push=$(PUSH) \
--label "org.opencontainers.image.source=https://github.com/cozystack/cozystack" \
--load=$(LOAD)
echo "$(REGISTRY)/ubuntu-container-disk:$(call settag,$(UBUNTU_CONTAINER_DISK_TAG))@$$(yq e '."containerimage.digest"' images/ubuntu-container-disk.json -o json -r)" \
echo "$(REGISTRY)/ubuntu-container-disk:$(call settag,$(KUBERNETES_VERSION))@$$(yq e '."containerimage.digest"' images/ubuntu-container-disk.json -o json -r)" \
> images/ubuntu-container-disk.tag
rm -f images/ubuntu-container-disk.json
@@ -32,6 +38,7 @@ image-kubevirt-cloud-provider:
--cache-to type=inline \
--metadata-file images/kubevirt-cloud-provider.json \
--push=$(PUSH) \
--label "org.opencontainers.image.source=https://github.com/cozystack/cozystack" \
--load=$(LOAD)
echo "$(REGISTRY)/kubevirt-cloud-provider:$(call settag,$(KUBERNETES_PKG_TAG))@$$(yq e '."containerimage.digest"' images/kubevirt-cloud-provider.json -o json -r)" \
> images/kubevirt-cloud-provider.tag
@@ -46,6 +53,7 @@ image-kubevirt-csi-driver:
--cache-to type=inline \
--metadata-file images/kubevirt-csi-driver.json \
--push=$(PUSH) \
--label "org.opencontainers.image.source=https://github.com/cozystack/cozystack" \
--load=$(LOAD)
echo "$(REGISTRY)/kubevirt-csi-driver:$(call settag,$(KUBERNETES_PKG_TAG))@$$(yq e '."containerimage.digest"' images/kubevirt-csi-driver.json -o json -r)" \
> images/kubevirt-csi-driver.tag
@@ -61,6 +69,7 @@ image-cluster-autoscaler:
--cache-to type=inline \
--metadata-file images/cluster-autoscaler.json \
--push=$(PUSH) \
--label "org.opencontainers.image.source=https://github.com/cozystack/cozystack" \
--load=$(LOAD)
echo "$(REGISTRY)/cluster-autoscaler:$(call settag,$(KUBERNETES_PKG_TAG))@$$(yq e '."containerimage.digest"' images/cluster-autoscaler.json -o json -r)" \
> images/cluster-autoscaler.tag

View File

@@ -27,20 +27,47 @@ How to access to deployed cluster:
kubectl get secret -n <namespace> kubernetes-<clusterName>-admin-kubeconfig -o go-template='{{ printf "%s\n" (index .data "super-admin.conf" | base64decode) }}' > test
```
# Series
## Parameters
<!-- source: https://github.com/kubevirt/common-instancetypes/blob/main/README.md -->
### Common parameters
. | U | O | CX | M | RT
----------------------------|-----|-----|------|-----|------
*Has GPUs* | | | | |
*Hugepages* | | | | ✓ | ✓
*Overcommitted Memory* | | | | |
*Dedicated CPU* | | | | | ✓
*Burstable CPU performance* | ✓ | ✓ | | ✓ |
*Isolated emulator threads* | | | ✓ | | ✓
*vNUMA* | | | ✓ | | ✓
*vCPU-To-Memory Ratio* | 1:4 | 1:4 | 1:2 | 1:8 | 1:4
| Name | Description | Value |
| ----------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | ------------ |
| `host` | The hostname used to access the Kubernetes cluster externally (defaults to using the cluster name as a subdomain for the tenant host). | `""` |
| `controlPlane.replicas` | Number of replicas for Kubernetes control-plane components | `2` |
| `storageClass` | StorageClass used to store user data | `replicated` |
| `nodeGroups` | nodeGroups configuration | `{}` |
### Cluster Addons
| Name | Description | Value |
| --------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `addons.certManager.enabled` | Enables the cert-manager | `false` |
| `addons.certManager.valuesOverride` | Custom values to override | `{}` |
| `addons.cilium.valuesOverride` | Custom values to override | `{}` |
| `addons.ingressNginx.enabled` | Enable Ingress-NGINX controller (expect nodes with 'ingress-nginx' role) | `false` |
| `addons.ingressNginx.valuesOverride` | Custom values to override | `{}` |
| `addons.ingressNginx.hosts` | List of domain names that should be passed through to the cluster by upper cluster | `[]` |
| `addons.gpuOperator.enabled` | Enables the gpu-operator | `false` |
| `addons.gpuOperator.valuesOverride` | Custom values to override | `{}` |
| `addons.fluxcd.enabled` | Enables Flux CD | `false` |
| `addons.fluxcd.valuesOverride` | Custom values to override | `{}` |
| `addons.monitoringAgents.enabled` | Enables MonitoringAgents (fluentbit, vmagents for sending logs and metrics to storage) if tenant monitoring enabled, send to tenant storage, else to root storage | `false` |
| `addons.monitoringAgents.valuesOverride` | Custom values to override | `{}` |
| `addons.verticalPodAutoscaler.valuesOverride` | Custom values to override | `{}` |
### Kubernetes control plane configuration
| Name | Description | Value |
| -------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `controlPlane.apiServer.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `small` |
| `controlPlane.apiServer.resources` | Resources | `{}` |
| `controlPlane.controllerManager.resources` | Resources | `{}` |
| `controlPlane.controllerManager.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `micro` |
| `controlPlane.scheduler.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `micro` |
| `controlPlane.scheduler.resources` | Resources | `{}` |
| `controlPlane.konnectivity.server.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production). | `micro` |
| `controlPlane.konnectivity.server.resources` | Resources | `{}` |
## U Series

View File

@@ -1 +1 @@
ghcr.io/aenix-io/cozystack/cluster-autoscaler:0.15.0@sha256:973dc89e1fe1c9beb109d74a48297426ed5d340b43d0102b8e16f63dc2eb4016
ghcr.io/cozystack/cozystack/cluster-autoscaler:0.19.0@sha256:85371c6aabf5a7fea2214556deac930c600e362f92673464fe2443784e2869c3

View File

@@ -1 +1 @@
ghcr.io/aenix-io/cozystack/kubevirt-cloud-provider:0.15.0@sha256:3a94fe11523b1411eab33bd72b26d6df42dda83086249ba72ad6f2aa1b209c1e
ghcr.io/cozystack/cozystack/kubevirt-cloud-provider:0.19.0@sha256:795d8e1ef4b2b0df2aa1e09d96cd13476ebb545b4bf4b5779b7547a70ef64cf9

View File

@@ -3,12 +3,11 @@ FROM --platform=linux/amd64 golang:1.20.6 AS builder
RUN git clone https://github.com/kubevirt/cloud-provider-kubevirt /go/src/kubevirt.io/cloud-provider-kubevirt \
&& cd /go/src/kubevirt.io/cloud-provider-kubevirt \
&& git checkout da9e0cf
&& git checkout 443a1fe
WORKDIR /go/src/kubevirt.io/cloud-provider-kubevirt
# see: https://github.com/kubevirt/cloud-provider-kubevirt/pull/335
# see: https://github.com/kubevirt/cloud-provider-kubevirt/pull/336
ADD patches /patches
RUN git apply /patches/*.diff
RUN go get 'k8s.io/endpointslice/util@v0.28' 'k8s.io/apiserver@v0.28'

View File

@@ -1,20 +0,0 @@
diff --git a/pkg/controller/kubevirteps/kubevirteps_controller.go b/pkg/controller/kubevirteps/kubevirteps_controller.go
index a3c1aa33..95c31438 100644
--- a/pkg/controller/kubevirteps/kubevirteps_controller.go
+++ b/pkg/controller/kubevirteps/kubevirteps_controller.go
@@ -412,11 +412,11 @@ func (c *Controller) reconcileByAddressType(service *v1.Service, tenantSlices []
// Create the desired port configuration
var desiredPorts []discovery.EndpointPort
- for _, port := range service.Spec.Ports {
+ for i := range service.Spec.Ports {
desiredPorts = append(desiredPorts, discovery.EndpointPort{
- Port: &port.TargetPort.IntVal,
- Protocol: &port.Protocol,
- Name: &port.Name,
+ Port: &service.Spec.Ports[i].TargetPort.IntVal,
+ Protocol: &service.Spec.Ports[i].Protocol,
+ Name: &service.Spec.Ports[i].Name,
})
}

Some files were not shown because too many files have changed in this diff Show More