Compare commits

...

337 Commits

Author SHA1 Message Date
kklinch0
f785f3eae5 f 2025-06-09 00:11:35 +03:00
klinch0
730ea4d5ef [fix] CloudInit (#1019)
If ssh key provided - deploy
If cloudinit provided - deploy
If ssh key and cloudinit provided - deploy both
If none provided - init empty to avoid issues w/
network

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Improved handling of SSH keys and cloud-init data in the Virtual
Machine setup, clearly distinguishing cases when SSH keys, cloud-init,
or both are provided.
  - Enhanced template readability with added spacing for better clarity.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-05 15:53:08 +03:00
klinch0
13fccdc465 bump tenant version (#1028) 2025-06-05 15:44:53 +03:00
kklinch0
f1b66c80f6 bump tenant version
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-06-05 15:40:45 +03:00
klinch0
f34f140d49 Add RBAC rules to allow portforward in kubevirt for SSH via virtctl (#1027)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Expanded user permissions to allow port forwarding for virtual machine
instances, enabling enhanced remote access capabilities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-05 11:07:53 +03:00
mattia
520fbfb2e4 Add RBAC rules to allow portforward in kubevirt for SSH via virtctl
Signed-off-by: mattia <mattia@hidora.io>
2025-06-05 09:38:40 +02:00
klinch0
25016580c1 (k8s) configure containerd for client k8s cluster (#979)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced granular Helm charts for Cluster API providers: bootstrap,
core, control plane, and infrastructure, each with dedicated
configuration, metadata, and compressed component packaging.
- Added a new configuration option to the Kubernetes app to enable using
a custom secret for patching containerd.
- Enhanced Kubernetes deployment to conditionally manage containerd
registry certificates and configuration using custom or copied secrets.

- **Documentation**
- Updated Kubernetes app documentation to include the new containerd
patching secret configuration option.

- **Chores**
- Updated version mappings and chart versions for Kubernetes and Cluster
API-related components.
- Decomposed the monolithic Cluster API provider release into multiple,
more manageable releases with explicit namespaces and dependencies.

- **Refactor**
- Removed the previous unified Cluster API provider template in favor of
new, separate provider resource definitions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-04 11:07:58 +03:00
kklinch0
f10f8455fc (k8s) configure containerd for client k8s cluster 2025-06-04 10:40:10 +03:00
Timofei Larkin
974581d39b [monitoring-agents] Add events and audit inputs (#948)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced log monitoring by adding support for Kubernetes events and
audit logs.
  - Introduced custom log parsers for improved log format handling.
  - Added log source tagging for easier identification of log origins.

- **Improvements**
- Refined log filtering and output formatting for better log
organization and delivery.
- Updated log outputs to support compressed JSON lines and ISO8601 date
formatting.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-04 10:33:58 +03:00
Timofei Larkin
7e24297913 Use library chart for resource management (#1025)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a shared library for resource configuration management
across multiple application charts.

- **Refactor**
- Updated resource configuration handling in several application charts
to use new centralized helpers for improved consistency and
sanitization.

- **Chores**
- Added references to the shared library in various application chart
directories.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-04 09:42:31 +03:00
Timofei Larkin
b6142cd4f5 Use library chart for resource management
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-06-04 09:05:21 +03:00
Timofei Larkin
e87994c769 Capture all resources by WorkloadMonitors (#1024)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced WorkloadMonitor resources for tcp-balancer, vm-disk, and
VPN applications, enabling enhanced workload monitoring capabilities.

- **Bug Fixes**
- Standardized Kubernetes resource labels across multiple applications
for improved consistency and compatibility.

- **Chores**
- Updated chart versions for several applications, including ClickHouse,
FerretDB, http-cache, MySQL, Postgres, Redis, tcp-balancer,
virtual-machine, vm-disk, vm-instance, and VPN.
- Updated Docker image reference for the installer to use the latest
version.
  - Refreshed internal version mappings for multiple packages.
- Added standardized instance labels to Kubernetes resources across
multiple applications for better tracking and management.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-03 16:15:46 +03:00
Timofei Larkin
b140f1b57f Capture all resources by WorkloadMonitors
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-06-03 15:40:27 +03:00
Timofei Larkin
64936021d2 Release v0.31.0-rc.3 (#991)
This PR prepares the release `v0.31.0-rc.3`.

Signed-off by: Timofei Larkin <lllamnyp@gmail.com>
2025-06-03 14:10:11 +06:00
Andrei Kvapil
a887e19e6c Capture all resources by WorkloadMonitors (#1018)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced monitoring resources for HAProxy, NGINX, and generic HTTP
cache workloads, allowing improved workload observability.
- **Enhancements**
- Added standardized labels to MariaDB, Postgres, and Redis resources
for better integration and management within Kubernetes environments.
- Updated label selectors in Postgres resources to use standardized
Kubernetes app labels.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-02 09:28:30 +02:00
Andrei Kvapil
92b97a569e Fixed Gateway API manifest (#1016)
In current version of Cozystack, flux's HelmRelease will refuse to
install cozy-gateway-api-crds when gatewayAPI enabled, complaining
version '*'not found and breaking install of entire kubernetes app. This
patch adds working version match.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated configuration to allow compatibility with all available
versions of the gateway-api-crds chart.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-02 09:25:25 +02:00
Timofei Larkin
0e22358b30 Capture all resources by WorkloadMonitors
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-06-02 09:44:27 +03:00
Zdenek Deu Janda
7429daf99c Fixed Gateway API manifest
Signed-off-by: Zdenek Deu Janda <zdenek.janda@cloudevelops.com>
2025-06-01 23:49:42 +02:00
Andrei Kvapil
b470b82e2a [tests] Fix concurrency for docker loing action (#1014)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated workflow steps to use a job-specific temporary directory for
Docker configuration during build and container registry login
processes. This enhances isolation of Docker credentials in CI jobs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-30 16:59:11 +02:00
Andrei Kvapil
a0700e7399 [tests] Fix concurrency for docker loing action
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-30 16:49:21 +02:00
Andrei Kvapil
228e1983bc Let users specify CPU requests in VCPUs (#972)
With this change a request for a virtual machine with 3 vCPUs will
reserve exactly the same amount of physical compute, as a request for a
Clickhouse instance with `{"resources": {"cpu": "3"}}` in its values,
with the scaling factor being KubeVirt's CPU allocation ratio.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced configurable CPU allocation ratio for resource management,
allowing CPU requests to be scaled relative to limits.
- Added new templates for input validation and automatic loading of
configuration from Kubernetes ConfigMaps.

- **Bug Fixes**
- Improved resource sanitization and preset logic to handle CPU and
memory requests/limits more accurately.

- **Chores**
- Updated chart dependencies and versioning to reflect changes in
library usage.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-30 12:50:21 +02:00
Timofei Larkin
7023abdba7 Let users specify CPU requests in VCPUs
With this change a request for a virtual machine with 3 vCPUs will
reserve exactly the same amount of physical compute, as a request for a
Clickhouse instance with `{"resources": {"cpu": "3"}}` in its values,
with the scaling factor being KubeVirt's CPU allocation ratio.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-30 00:55:19 +02:00
Timofei Larkin
1b43a5f160 Remove user-facing config of limits and requests
This patch introduces reusable library charts that provide
backward-compatibility for users that specify their resources as
explicit requests and limits for cpu, however this input is processed so
that limits are set equal to requests except for CPU which only gets
requests. Users can now embrace the new form by directly specifying
resources in the first level of nesting (e.g. resources.cpu=100m instead
of .resources.requests.cpu=100m). The order of precedence is top-level,
then requests, then limits, ensuring that nothing will break in terms of
scheduling, however workloads that specified limits much higher than
requests might get a performance hit, now that they cannot use all this
excess capacity. This should only affect memory-hungry workloads in
low-contention environments.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-30 00:55:19 +02:00
Andrei Kvapil
20f4066c16 [tests] fix: increase qemu system disk size (#1011)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-30 00:54:48 +02:00
Andrei Kvapil
ea0dd68e84 [tests] fix: increase qemu system disk size
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-30 00:54:09 +02:00
Andrei Kvapil
e0c3d2324f [ci] Split build artefacts (#1010)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved workflow for pull requests by separating artifact uploads and
downloads, resulting in clearer and more organized handling of installer
and image files during build and test processes.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-30 00:53:54 +02:00
Andrei Kvapil
cb303d694c [ci] Split build artefacts
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-30 00:48:20 +02:00
Andrei Kvapil
6130f43d06 Release v0.31.1 (#1008)
This PR prepares the release `v0.31.1`.
2025-05-30 00:18:28 +02:00
Andrei Kvapil
4db55ac5eb [ci] Add Github token to fetch draft releases
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-30 00:16:03 +02:00
github-actions
bfd20a5e0e Prepare release v0.31.1
Signed-off-by: github-actions <github-actions@github.com>
2025-05-29 23:44:58 +02:00
Andrei Kvapil
977141bed3 [ci] Fix download released artifacts (#1009)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-29 23:42:32 +02:00
Andrei Kvapil
c4f8d6a251 [ci] Fix download released artifacts
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-29 23:42:21 +02:00
Andrei Kvapil
9633ca4d25 Update Talos Linux v1.10.3 and fix assets (#1006)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Installer artifacts now include an additional asset, improving the
completeness of installation resources.

- **Bug Fixes**
- End-to-end tests and cluster setup now verify the presence of all
required installer asset files, reducing setup errors.

- **Chores**
- Updated installer and system extension images to newer versions for
improved stability and compatibility.
- Improved build and test workflows to handle multiple installer assets
and streamline artifact management.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-29 23:27:12 +02:00
Andrei Kvapil
f798cbd9f9 Update Talos Linux v1.10.3
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-29 23:18:53 +02:00
Andrei Kvapil
cf87779f7b [ci] separate build and testing jobs (#1005)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Improved pull request workflow by separating build and test phases,
enhancing reliability and maintainability of automated checks.
- Updated testing process to use a pre-generated installer artifact,
streamlining test execution and environment setup.
- Enhanced release workflow to generate manifests before running tests,
ensuring up-to-date configurations during verification.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-29 18:41:40 +02:00
Andrei Kvapil
c69135e0e5 [ci] separate build and testing jobs
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-29 17:44:50 +02:00
Nick Volynkin
a9c3a4c601 [docs] Write a full release post for v0.31.0 (#999)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Expanded and restructured the changelog for v0.31.0 to provide
detailed information on new features, improvements, bug fixes, testing
updates, CI/CD changes, and community contributions. The changelog now
offers clearer insight into the release contents and lifecycle.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-29 15:34:02 +07:00
Nick Volynkin
d1081c86b3 [docs] Write a full release post for v0.31.0
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-29 10:05:53 +03:00
Andrei Kvapil
beadc80778 Release v0.31.0 (#1003)
This PR prepares the release `v0.31.0`.
2025-05-29 01:24:13 +02:00
github-actions
5bbb5a6266 Prepare release v0.31.0
Signed-off-by: github-actions <github-actions@github.com>
2025-05-28 21:40:20 +00:00
Andrei Kvapil
0664370218 [apps] Add topologySpreadConstraints for managed PostgreSQL and tenant Kubernetes clusters. (#995)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for injecting custom topology spread constraints into
virtual machine templates, PostgreSQL clusters, and monitoring
components based on a ConfigMap in the cluster.

- **Chores**
- Updated chart versions for Kubernetes (0.21.0), Postgres (0.12.0), and
Monitoring (1.10.0).
- Updated version mappings for Kubernetes, Postgres, and Monitoring
packages.
- Increased memory allocation for QEMU virtual machine tests from 8 GB
to 14 GB.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-28 23:20:11 +02:00
kklinch0
225d103509 [k8s] add topologySpreadConstraints for client k8s cluster
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-05-28 23:17:00 +02:00
Andrei Kvapil
0e22e3c12c [virtual-machine] fix: specify ports even for wholeIP mode (#1000)
There is an issue with wholeIP services: internal communication from
pods doesn't work as expected.

Cilium intercepts pod-to-pod traffic, preventing cozy-proxy from
rewriting the source IP in return packets.

This PR allows Cilium to handle specified ports, enabling hairpin
traffic to work correctly at least for these cases.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved service port configuration to ensure explicit port
definitions are respected when using the "WholeIP" method. Now, custom
external ports will not be overridden, providing more accurate and
expected service exposure.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-28 20:53:20 +02:00
Andrei Kvapil
7b8e7e40ce [virtual-machine] fix: specify ports even for wholeIP mode
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-28 20:12:30 +02:00
Nick Volynkin
c941e487fb [docs] Review the tenant Kubernetes cluster docs (#969)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Completely overhauled and expanded the Managed Kubernetes Service
guide for Cozystack.
- Added detailed explanations of service architecture, tenant isolation,
and use cases.
- Included step-by-step instructions for accessing tenant clusters and
kubeconfig files.
- Expanded configuration parameters with clear tables and
recommendations.
- Introduced a comprehensive resource reference and improved
descriptions of instance types and series.
- Enhanced configuration schema descriptions for clearer resource
specification and standardized addon settings.
- Updated configuration file comments for improved clarity and
consistency without changing functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-28 21:00:06 +07:00
Nick Volynkin
8386e985f2 [docs] Review the tenant Kubernetes cluster docs
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-28 15:15:03 +03:00
Andrei Kvapil
e4c944488f [dx] remove version_map and building for library charts (#998)
We do not build helm charts directly for library, since in run-time they
are useless.
Let's remove version_map for them as well

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Simplified project build scripts by removing obsolete version mapping
and related checks.
  - Deleted the outdated versions mapping file for the library package.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-28 13:10:50 +02:00
Andrei Kvapil
99a7754c00 [virtual-machine] Set PortList method by default (#996)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Updated default traffic passthrough method for virtual machine and VM
instance apps to use specific port forwarding instead of whole IP
forwarding.

- **Documentation**
- Updated documentation to reflect the new default passthrough method
for both virtual machine and VM instance apps.

- **Chores**
- Incremented version numbers for virtual machine and VM instance apps
to reflect recent updates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-28 13:02:06 +02:00
Andrei Kvapil
6cbfab9b2a [dx] remove version_map and building for library charts
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-28 12:01:31 +02:00
Andrei Kvapil
461f756c88 [virtual-machine] Set PortList method by default
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-28 11:55:52 +02:00
Andrei Kvapil
50932ba49e [tests] Introduce cozytest - a new testing framework (#982)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced separate end-to-end test scripts for cluster and
application provisioning, improving test clarity and modularity.
- Added a new test runner script with enhanced output formatting and
live tracing for easier debugging of test runs.

- **Bug Fixes**
  - None.

- **Chores**
  - Removed the legacy end-to-end test automation script.
- Updated testing workflow to use new modular test scripts and runner
for improved maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-28 09:19:56 +02:00
Timofei Larkin
f9f8bb2f11 Release v0.31.0-rc.3 (#994)
This PR prepares the release `v0.31.0-rc.3`.

Signed-off by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-27 15:38:54 +03:00
github-actions
2ae8f2aa19 Prepare release v0.31.0-rc.3
Signed-off-by: github-actions <github-actions@github.com>
2025-05-27 12:01:54 +00:00
Timofei Larkin
1a872ca95c Revert experiments with workflows
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-27 14:50:39 +03:00
Timofei Larkin
3e379e9697 Add manual workflow temporarily for quicker feedback
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-27 14:17:28 +03:00
Timofei Larkin
7746974644 Add manual workflow temporarily for quicker feedback
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-27 13:57:37 +03:00
Timofei Larkin
d989a8865d Release v0.31.0-rc.3 (#993)
This PR prepares the release `v0.31.0-rc.3`.

Signed-off by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-27 13:33:28 +03:00
github-actions
4aad0fc8f2 Prepare release v0.31.0-rc.3
Signed-off-by: github-actions <github-actions@github.com>
2025-05-27 10:08:18 +00:00
Timofei Larkin
0e5ac5ed7c Detail errors for workflows (#992)
Signed-off by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-27 12:45:28 +03:00
Timofei Larkin
c267c7eb9a Update .github/workflows/pull-requests-release.yaml
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-27 12:20:00 +03:00
Timofei Larkin
7792e29065 Detail errors for workflows
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-26 16:44:54 +03:00
Timofei Larkin
d35ff17de8 Release v0.31.0-rc.3 (#991)
This PR prepares the release `v0.31.0-rc.3`.

Signed-off by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-26 15:07:17 +03:00
github-actions
3a7d4c24ee Prepare release v0.31.0-rc.3
Signed-off-by: github-actions <github-actions@github.com>
2025-05-26 11:40:21 +00:00
Timofei Larkin
ff2638ef66 Fix regression in release workflow (#990)
Signed-off by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-26 14:32:10 +03:00
Timofei Larkin
bc294a0fe6 Fix regression in release workflow
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-26 14:06:20 +03:00
Timofei Larkin
bf5bccb7d9 Release v0.31.0-rc.3 (#988)
This PR prepares the release `v0.31.0-rc.3`.

Signed-off by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-26 13:41:41 +03:00
Timofei Larkin
f00364037e [docs] Update release notes for Cozystack v0.31.0-rc.3 (#989) 2025-05-26 13:39:46 +03:00
Nick Volynkin
e83bf379ba [docs] Update release notes for Cozystack v0.31.0-rc.3
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-26 13:09:31 +03:00
github-actions
ae0549f78b Prepare release v0.31.0-rc.3
Signed-off-by: github-actions <github-actions@github.com>
2025-05-26 08:26:01 +00:00
Andrei Kvapil
74e7e5cdfb [tests] Add tests for tenant Kubernetes cluster
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-25 09:30:26 +02:00
Andrei Kvapil
2bf4032d5b [tests] Introduce cozytest - a new testing framework
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-25 09:29:58 +02:00
Andrei Kvapil
ee1763cb85 [cert-manager] Update Cert-manager to v1.17.2 (#975)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added support for specifying a literal password in keystore
configurations, alongside existing secret reference options.
- Introduced a new optional tenant ID field for Azure DNS managed
identity in ACME DNS01 solver configuration.

- **Improvements**
  - Updated cert-manager Helm chart and documentation to version 1.17.2.
- Expanded feature gate configuration options with detailed default
values and stability levels.
- Enhanced documentation and examples for templating service account
annotations.
- Improved conditional logic for resource creation and image pull
secrets handling in deployments and services.

- **Bug Fixes**
- Made password fields in keystore configurations mutually exclusive and
optional, improving flexibility and clarity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 22:36:51 +02:00
Andrei Kvapil
d497be9e95 [build] system/metallb: multiarch support (#970)
Add support for metallb multiarch build.

Part of #519 and a follow-up to PR #945 (issue #909)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved Docker build process for image-controller and image-speaker
to allow dynamic control over image loading and enhanced build
configuration consistency. No changes to user-facing features.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 22:31:15 +02:00
Andrei Kvapil
6176a18a12 [ci] Support alpha and beta pre-releases (#978)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Expanded support for prerelease tags to include "alpha" and "beta"
suffixes (e.g., `-alpha.1`, `-beta.2`) in addition to "rc".
- **Style**
  - Improved formatting and consistency in comments and log messages.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 22:30:20 +02:00
Andrei Kvapil
5789f12f3f [ci] Force-update release branch on tagged main commits (#977)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved the process for updating or creating maintenance branches to
ensure they always point to the latest tagged release commit.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 22:29:52 +02:00
Andrei Kvapil
6279873a35 [kubernetes] Fix Ingress-NGINX depends on Cert-Manager (#976)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved configuration to automatically disable admission webhooks for
cert-manager when the cert-manager addon is not enabled, preventing
unnecessary webhook setup.

- **Chores**
  - Updated Kubernetes chart version to 0.20.1.
  - Updated version mapping for the Kubernetes package.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 22:29:18 +02:00
Andrei Kvapil
79c441acb7 [virtual-machine] Add support for various storages (#974)
remove specification:

```
      pvc:
        volumeMode: Block
        accessModes:
        - ReadWriteMany
```

with `storage` it will be filled automatcially from storageprofile for
specific storage provider

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
	- Updated the virtual machine app to version 0.9.2.
- **Refactor**
- Changed the data volume configuration to use a simplified storage
specification instead of a persistent volume claim.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 22:28:56 +02:00
Andrei Kvapil
7864811016 [docs] Tenants cannot have dashes in the names (#980)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
	- Improved and reorganized tenant documentation for better clarity.
- Added explicit rules for tenant naming, including restrictions on
dashes and required alphanumeric names.
	- Clarified how tenant domains are structured and inherited.
- Expanded explanations on nesting tenants and sharing parent services,
with updated examples and clearer formatting.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 22:28:32 +02:00
Andrei Kvapil
13938f34fd Re-update kamaji to latest version (#983)
It was updated:
4ecf492cd4

Then partially reverted during merge:
d550a67f19

Please take a look if it should be updated.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
	- Updated the Kamaji component to use version edge-25.4.1.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-24 17:56:22 +02:00
nbykov0
9fb6b41e03 Re-update kamaji to latest version
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-23 22:22:12 +03:00
Timofei Larkin
a8ba6b1328 Remove user-facing config of limits and requests (#935)
This patch introduces reusable library charts that provide
backward-compatibility for users that specify their resources as
explicit requests and limits for cpu, however this input is processed so
that limits are set equal to requests except for CPU which only gets
requests. Users can now embrace the new form by directly specifying
resources in the first level of nesting (e.g. resources.cpu=100m instead
of .resources.requests.cpu=100m). The order of precedence is top-level,
then requests, then limits, ensuring that nothing will break in terms of
scheduling, however workloads that specified limits much higher than
requests might get a performance hit, now that they cannot use all this
excess capacity. This should only affect memory-hungry workloads in
low-contention environments.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a reusable Helm library chart, "cozy-lib", providing common
templates and resource helpers for other charts.
- Added resource preset and sanitization templates to standardize
Kubernetes resource configurations.
- ClickHouse chart now depends on "cozy-lib" for improved resource
handling.
- Added a new packaging script and streamlined Helm chart packaging
processes across multiple packages.

- **Bug Fixes**
- Resource configuration logic in the ClickHouse deployment was updated
to use the new library templates, ensuring more consistent resource
definitions.

- **Chores**
- Added new Makefiles and version mapping for streamlined Helm chart
packaging and validation.
- Updated ClickHouse chart version to 0.9.0 and reflected this in
version mapping files.
- Refactored Makefile targets to consolidate packaging logic and improve
maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-23 18:11:27 +03:00
Timofei Larkin
9592f7fe46 Remove user-facing config of limits and requests
This patch introduces reusable library charts that provide
backward-compatibility for users that specify their resources as
explicit requests and limits for cpu, however this input is processed so
that limits are set equal to requests except for CPU which only gets
requests. Users can now embrace the new form by directly specifying
resources in the first level of nesting (e.g. resources.cpu=100m instead
of .resources.requests.cpu=100m). The order of precedence is top-level,
then requests, then limits, ensuring that nothing will break in terms of
scheduling, however workloads that specified limits much higher than
requests might get a performance hit, now that they cannot use all this
excess capacity. This should only affect memory-hungry workloads in
low-contention environments.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-23 17:32:42 +03:00
Nick Volynkin
119d582379 [docs] Review the Tenant app documentation
* Brush up some formatting
* Explain the relations of nested tenants in more detail

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-23 14:50:24 +03:00
Nick Volynkin
451267310b [docs] Tenants cannot have dashes in the names
Gave examples of tenant naming.

Part of cozystack/cozystack#971

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-23 14:50:23 +03:00
Andrei Kvapil
0fee3f280b [ci] Support alpha and beta pre-releases
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-22 16:09:16 +02:00
Andrei Kvapil
2461fcd531 [ci] Force-update release branch on tagged main commits
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-22 15:12:51 +02:00
Andrei Kvapil
866b6e0a5a [kubernetes] Fix Ingress-NGINX depends on Cert-Manager
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-22 13:16:08 +02:00
Andrei Kvapil
1dccf96506 [cert-manager] Update Cert-manager to v1.17.2
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-22 12:17:06 +02:00
Andrei Kvapil
bca27dcfdc [virtual-machine] Add support for various storages
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-22 11:55:48 +02:00
nbykov0
5407ee01ee system/metallb: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-22 02:48:21 +03:00
kevin880202
fc8b52d73d reset and add audit/event monitoring in fluentbit values
Signed-off-by: kevin880202 <dytoponts11@gmail.com>
2025-05-21 22:07:27 +08:00
Timofei Larkin
609e7ede86 Release v0.31.0-rc.2 (#958)
This PR prepares the release `v0.31.0-rc.2`.
2025-05-21 16:18:38 +03:00
Timofei Larkin
289853e661 Release Notes for 0.31.0-RC.2 (#965)
- **[docs] Add Release Notes from v0.31.0-RC.1**
- **[docs] Update changelogs for v0.31.0-RC.2**
2025-05-21 16:12:35 +03:00
Nick Volynkin
013e1936ec [docs] Update changelogs for v0.31.0-RC.2
Add the changes accumulated since RC.1 to the changelog.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-21 15:52:39 +03:00
Nick Volynkin
893818f64e [docs] Add Release Notes from v0.31.0-RC.1
Add to v0.31.0.md, as we're acuumulating changelogs until a stable release.
See https://github.com/cozystack/cozystack/blob/main/docs/release.md#release-candidates

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-21 15:32:22 +03:00
github-actions
56bdaae2c9 Prepare release v0.31.0-rc.2
Signed-off-by: github-actions <github-actions@github.com>
2025-05-21 11:52:29 +00:00
Timofei Larkin
775ecb7b11 (air gapped): disable fluxcd artifact by default (#964)
Default `artifact` field for FluxCD operator to empty string, so it uses embedded manifests and is compatible with air-gapped environments.
2025-05-20 15:46:44 +03:00
Timofei Larkin
8d74a35e9c [e2e] Return genisoimage to the e2e-test Dockerfile (#962)
commit 7a7512da30 removed genisoimage
package installation from Dockerfile which leds to test fail due to the
fact that genisoimage is missing and runner enable to create image.
[issue
reference](https://github.com/cozystack/cozystack/actions/runs/15084476654/job/42406141954).
restored genisoimage package installation in Dockerfile.
2025-05-20 15:21:20 +03:00
kklinch0
b753fd9fa8 (air gapped): disable fluxcd artifact by default
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-05-20 13:10:33 +03:00
Ahmad Murzahmatov
9698f3c9f4 commit 7a7512da30 removed genisoimage package installation from Dockerfile which leds to test fail due to the fact that genisoimage is missing and runner enable to create image. issue reference - https://github.com/cozystack/cozystack/actions/runs/15084476654/job/42406141954. restored genisoimage package installation in Dockerfile
Signed-off-by: Ahmad Murzahmatov <gwynbleidd2106@yandex.com>
2025-05-20 03:54:03 +06:00
Andrei Kvapil
31b110cd39 Revert "[ingress] avoid invalid externalIPs when config value is empty" (#959)
Reverts cozystack/cozystack#957. This was already fixed by
https://github.com/cozystack/cozystack/pull/952
2025-05-17 13:32:12 +02:00
Andrei Kvapil
b4da00f96f Revert "[ingress] avoid invalid externalIPs when config value is empty" 2025-05-17 13:28:31 +02:00
Andrei Kvapil
0369852035 [ingress] avoid invalid externalIPs when config value is empty (#957)
Fix regression introduced by
https://github.com/cozystack/cozystack/pull/929#discussion_r2090992853

becasue of `splitList "," "" == [""]`

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-17 12:35:34 +02:00
Andrei Kvapil
115497b73f [ingress] avoid invalid externalIPs when config value is empty
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-17 12:31:57 +02:00
Andrei Kvapil
4f78b133c2 [build] Cross-arch builds: components (#932)
Components with existing dockerfiles will be updated in this PR.

Part of #519 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added support for multi-architecture and cross-platform Docker image
builds across various components, enabling builds for different
operating systems and CPU architectures.

- **Chores**
- Updated Docker build commands in multiple Makefiles to use
configurable builder and platform variables, improving build
flexibility.
- Standardized Dockerfile build arguments and environment variables for
cross-compilation.
- Improved package installation commands for quieter and more minimal
installs in Dockerfiles.
- Changed the default bucket name configuration to "cozystack" in system
bucket settings.
- Updated some maintenance targets and manual update reminders in
Makefiles.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-17 12:16:59 +02:00
Andrei Kvapil
d550a67f19 Merge branch 'main' into 519-cross-arch-components 2025-05-17 12:16:49 +02:00
Andrei Kvapil
8e6941dfbd [cluster-api] Update capi-providers (#947)
v0.10.1 version fixes Bootstrap: Make
joinConfiguration.discovery.bootstrapToken.token optional
(https://github.com/kubernetes-sigs/cluster-api/pull/12136)

ref https://github.com/cozystack/cozystack/issues/939 and
https://github.com/clastix/cluster-api-control-plane-provider-kamaji/issues/212

fixes https://github.com/cozystack/cozystack/issues/939

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated the versions of cluster-api CoreProvider and kubeadm
BootstrapProvider from v1.10.0 to v1.10.1.
- Updated the version of kamaji ControlPlaneProvider from v0.14.2 to
v0.15.1.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-17 12:15:10 +02:00
Andrei Kvapil
c54567ab45 Revert "Downgrade CAPI operator" (#946)
Reverts cozystack/cozystack#942

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for specifying manifest patches and additional manifests
for all provider types, enabling more flexible customization.
- Introduced an optional property to pass additional arguments to
provider controller managers.
  - Added a JSON schema for validating chart values.

- **Enhancements**
- Provider configuration now uses structured maps instead of strings,
simplifying customization and reducing errors.
- Improved validation and descriptions for condition fields in resource
schemas.

- **Updates**
  - Upgraded Cluster API Operator chart and app versions to 0.19.0.
  - Updated default image tag for the manager container to v0.19.0.

- **Documentation**
  - Added example configurations in the values file for easier setup.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-17 12:14:46 +02:00
Andrei Kvapil
dd592ca676 [kubernetes] Update Kubernetes v1.32.4 (#949)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
  - Updated the application version in the Kubernetes chart to 1.32.4.
- Made version fields in Kubernetes cluster templates dynamically
reference the chart's application version, ensuring consistency during
deployments.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-17 12:14:08 +02:00
Andrei Kvapil
5273722769 Update Kamaji to edge-25.4.1 (#953)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added new validation rules to enforce stricter configuration
requirements for datastore drivers and authentication fields.
- Introduced a new field to specify stop signals for containers and a
new status field to track terminating pods.
  - Added a new "Sleeping" status for version reporting.

- **Improvements**
- Updated and clarified field descriptions for environment variable
sources, volume types, and deployment status.
  - Removed outdated beta feature gate notes from documentation.

- **Bug Fixes**
- Improved handling and validation of sensitive configuration fields
based on driver type.

- **Chores**
  - Updated Go base image and Kamaji version in the Dockerfile.
  - Changed Kamaji image tag to use the latest version.

- **Refactor**
- Moved imagePullSecrets configuration from the deployment to the
ServiceAccount manifest for better management.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-17 12:13:57 +02:00
Andrei Kvapil
fb26e3e9b7 [kubernetes] fix regression: return port specification (#956)
This PR fixes regression from
https://github.com/cozystack/cozystack/pull/867

We have updated Kamaji, removed workaround, but didn't return the port
specification

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Updated network configuration to explicitly include port 443 in
hostnames for ingress.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-17 12:13:44 +02:00
Andrei Kvapil
5e0b0167fc [kubernetes] fix regression: return port specification
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-16 16:25:39 +02:00
Timofei Larkin
73fdc5ded7 Build patched MetalLB (#945)
Since it's taking a while for metallb/metallb#2726 to get released, the
binaries with the fix are recompiled in-tree. Workaround for #909.
2025-05-16 15:15:32 +03:00
Timofei Larkin
5fe7b3bf16 Build patched MetalLB
Since it's taking a while for metallb/metallb#2726 to get released, the
binaries with the fix are recompiled in-tree. Workaround for #909.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-16 14:57:58 +03:00
Andrei Kvapil
4ecf492cd4 Update Kamaji to edge-25.4.1
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-16 13:47:04 +02:00
Timofei Larkin
c42a50229f Hotfix: error in template (#952)
Resolves regressions introduced in #928 and #929
2025-05-16 14:42:11 +03:00
Timofei Larkin
6f55a66328 Hotfix: error in template
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-16 14:21:08 +03:00
Andrei Kvapil
9d551cc69b [kubernetes] Update Kubernetes v1.32.4
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 16:49:40 +02:00
Andrei Kvapil
93b8dbb9ab [cluster-api] Update capi-providers
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 16:43:56 +02:00
Andrei Kvapil
8ad010d331 Revert "Downgrade CAPI operator"
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 14:54:30 +02:00
Andrei Kvapil
404579c361 [platform] refactor dashboard values (#928)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 14:16:38 +02:00
Andrei Kvapil
f8210cf276 [platform] Introduce expose-services, expose-ingress and expose-external-ips options (#929)
docs update: https://github.com/cozystack/website/pull/197

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Summary by CodeRabbit

- **New Features**
- Added automated migration script to transition configuration from
HelmRelease to ConfigMap for service exposure and external IPs.
- Introduced new ingress templates for API, CDI upload proxy, and VM
export proxy services, enabling dynamic exposure based on centralized
configuration.

- **Bug Fixes**
  - Updated NGINX Ingress Controller Helm chart version to 1.6.0.

- **Refactor**
- Centralized ingress configuration using a ConfigMap, simplifying and
unifying service exposure and ingress class management.
- Removed legacy parameters and templates for dashboard, CDI upload
proxy, and VM export proxy from values and schema files.
- Simplified ingress templates for dashboard and Keycloak to rely on
centralized ConfigMap data and exposure lists.
- Adjusted ingress controller service to conditionally use external IPs
based on centralized configuration.

- **Documentation**
- Updated documentation to reflect the removal of deprecated parameters
and clarify current configuration options.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-15 14:15:56 +02:00
Andrei Kvapil
545e256695 [platform] refactor dashboard values
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 14:13:57 +02:00
Andrei Kvapil
e9c463c867 [platform] Add migration for expose-* options
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 13:45:19 +02:00
Andrei Kvapil
798ca12e43 [platform] Introduce expose-external-ips option
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 13:45:15 +02:00
Andrei Kvapil
3780925a68 [platform] Introduce expose-services and expose-ingress options
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-15 12:35:02 +02:00
Andrei Kvapil
a240c0b6ed [talos] Update Talos Linux v1.10.1 (#931)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated installer and system component versions to v1.10.1 across all
profiles.
- Refreshed system extension images to newer releases, including updated
versions for drbd and zfs.
- Applied recent date-based updates to firmware and extension images for
improved support and compatibility.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-15 12:33:02 +02:00
Andrei Kvapil
de1b38c64b Update Flux Operator to 0.20.0 (#934)
Now includes a Flux MCP server

(docs: https://fluxcd.control-plane.io/mcp/ - NB: it is not running in
the cluster by default, and I haven't tried it yet)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated Helm chart and app version numbers for Flux Operator and Flux
Instance to 0.20.0.
- **Documentation**
- Updated version badges in the README files to reflect the new 0.20.0
release.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-15 12:31:53 +02:00
nbykov0
15d7b6d99e extra/monitoring: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-14 18:30:15 +03:00
Timofei Larkin
9377f55000 Downgrade CAPI operator (#942)
Resolves #940.
2025-05-14 17:37:54 +03:00
Timofei Larkin
d002879b0b Downgrade CAPI operator
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-14 15:16:14 +03:00
Andrei Kvapil
2c6338a2ef Don't overcommit memory (#913)
This patch recreates the resource presets with a non-burstable memory
allocation (request==limit) and without CPU limits. With the new presets
the difference between the larger presets became meaningless, so their
values were adjusted.

Resolves #912 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated resource presets across all application charts to remove CPU
limits, align memory limits with requests, and standardize memory units
for consistency.
- Adjusted CPU and memory request values for larger presets in several
applications.
  - Updated chart versions for all affected applications.
  - Refreshed version mappings to reflect latest commit hashes.
- Added explicit resource configuration for Redis in the dashboard
configuration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-13 17:19:27 +02:00
Kingdon B
fd72d7c486 Flux Operator 0.20.0
Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-05-12 10:15:58 -04:00
Timofei Larkin
db34f31175 Don't overcommit memory or throttle CPU
This patch recreates the resource presets with a non-burstable memory
allocation (request==limit) and without CPU limits. With the new presets
the difference between the larger presets became meaningless, so their
values were adjusted.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-12 15:59:28 +03:00
Timofei Larkin
653e2bc774 [519] Cross-arch builds: builders variables (#907)
Added PLATFORM variable to `common-envs.mk`: if not defined, it is
calculated based on docker daemon arch.
May be overridden by e.g. `make -e PLATFORM='linux/arm64' ...`
Added the variable to a single Dockerfile for now.
2025-05-12 13:08:00 +04:00
nbykov0
31ea5eeeb2 system/kubeovn-webhook: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 03:06:57 +03:00
nbykov0
4a2c67e045 apps/kubernetes: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
68fb7570f7 apps/postgres: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
56fc08fab4 apps/mysql: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
b00ba53171 apps/clickhouse: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
4dd52290ea apps/mysql: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
492aff5265 apps/clickhouse: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
395cdc3af1 apps/http-cache: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
e6f3000b3c apps/postgres: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
e21c38c103 extra/monitoring: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
7a7512da30 core/testing: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
58b5f6610d system/cozystack-controller: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
e81053f7dd system/dashboard: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
424aab4a83 system/kubeovn: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
77e6db3381 system/kamaji: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
f6e3188ab8 system/cozystack-api: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
1ca0594060 system/cilium: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
ac59b4540b system/bucket: add meaningful default to values.yaml
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
d0bd4b1329 system/bucket: multiarch support
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
nbykov0
ccbcaf6331 system/cozystack-controller: add multiarch options
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-12 02:50:11 +03:00
Andrei Kvapil
1ad1b15a5b [talos] Update Talos Linux v1.10.1
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-09 14:56:27 +02:00
Ubuntu
2349ff61c1 scripts/common-envs.mk: add PLATFORM calculation with json parsing
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
Ubuntu
13139dd71d Revert "Makefile: add buildx version requirement"
This reverts commit 8d367533550236fc587bd5f236046c15f6b7609a.
The check it introduced is not needed.

Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
57ac614865 Makefile: add buildx version requirement
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
bbb93c647d scripts/common-envs.mk: commit suggestions after a review
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
951ba75d93 scripts/common-envs.mk: add --bootsrap flag to inspects
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
15c9c4a068 system/cozystack-controller: add PLATFORM and BUILDER variables to Makefile
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
57fefde732 scrips/common-envs.mk: add BUILDER and PLATFORM calculation
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
b4a04df6f3 system/cozystack-controller: add PLATFORM variable to Makefile: syntax
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
1e63b5e8ce system/cozystack-controller: add PLATFORM variable to Makefile
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
nbykov0
6ad30915eb Add PLATFORM make variable; calculate it if undefined
Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>
2025-05-08 19:04:48 +03:00
Timofei Larkin
557ffa536f Update kube-ovn to latest version (#922)
This commit bumps kube-ovn to 1.13.11 and does away with patching the
code now that the fixes necessary for kube-ovn to work properly in Talos
have been released in the upstream.
2025-05-08 16:33:17 +04:00
Andrei Kvapil
ae05d2f545 [kubernetes] Enable Cilium Gateway API #923 (#924)
Implementation of Cilium Gateway API

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added optional Gateway API addon for Kubernetes clusters, controlled
by a new configuration flag.
- Introduced automated deployment of Gateway API CRDs when the addon is
enabled.
- **Documentation**
- Updated documentation to describe the new Gateway API addon and its
configuration.
- **Chores**
- Added chart metadata and automation files for managing Gateway API
CRDs.
  - Updated chart version to reflect new features.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-08 12:21:17 +02:00
Andrei Kvapil
563c643813 [kubernetes] refactor gatewayAPI option
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-05-08 12:19:07 +02:00
Zdenek Deu Janda
68c85ac9ef [kubernetes] Enable Cilium Gateway API
Signed-off-by: Zdenek Deu Janda <zdenek.janda@cloudevelops.com>
2025-05-08 12:18:40 +02:00
Timofei Larkin
3ac00ea4ec Update kube-ovn to latest version
This commit bumps kube-ovn to 1.13.11 and does away with patching the
code now that the fixes necessary for kube-ovn to work properly in Talos
have been released in the upstream.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-07 14:35:00 +03:00
klinch0
29b49496f2 [platform] delete extra dependencies for piraeus operator (#856)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated dependency configuration so that piraeus-operator no longer
depends on victoria-metrics-operator.
- **Refactor**
- Improved compatibility by ensuring certain resources (VMPodScrape and
alert definitions) are only rendered if the required API versions are
available in the Kubernetes cluster.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-07 12:30:31 +03:00
kklinch0
3c27192d3e [platform] delete extra dependencies for piraeus operator
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-05-05 16:56:12 +03:00
klinch0
dca732cde0 [platform] add hr reconciler (#870)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new controller to synchronize tenant HelmReleases and
propagate configuration changes.
- Added dynamic host value overrides in multiple Helm templates by
conditionally retrieving values from the "tenant-root" HelmRelease.
- Updated RBAC permissions to allow management of HelmRelease resources.

- **Improvements**
  - Added support for Helm v2 API integration.
- Enhanced HelmRelease reconciliation logic and configuration
propagation for tenant environments.

- **Bug Fixes**
- Fixed periodic reconciliation for the "tenant-root" HelmRelease by
setting its interval to zero.

- **Version Updates**
  - Incremented version numbers for the "info" and "ingress" packages.

- **Chores**
  - Updated version mappings and commit references.
  - Improved .gitignore to exclude the .vscode directory.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-05 16:41:34 +03:00
Timofei Larkin
0346dc05bb Enable user-added params in tenant cluster Cilium (#917)
Users requested the possibility of passing custom values to the Cilium
HelmRelease in tenant k8s clusters to enable its latest features, such
as support for the Gateway API. This customization is now available via
the `valuesOverride` field under `addons.cilium` in the kubernetes' app
values.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for custom override values for the Cilium addon,
allowing users to configure Cilium settings via the values file.
- **Chores**
  - Updated the Kubernetes chart version to 0.20.0.
  - Updated version mappings to reflect the new chart version.
- **Documentation**
- Updated Kubernetes managed service docs to include configuration
details for Cilium addon overrides.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-05 16:55:17 +04:00
Timofei Larkin
a03cdeff04 Enable user-added params in tenant cluster Cilium
Users requested the possibility of passing custom values to the Cilium
HelmRelease in tenant k8s clusters to enable its latest features, such
as support for the Gateway API. This customization is now available via
the `valuesOverride` field under `addons.cilium` in the kubernetes' app
values.

Additionally add dummy schema for S3 bucket, as it breaks the pre-commit
checks.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-05-05 15:37:34 +03:00
Nick Volynkin
062d72805a [docs] Update release policy: Release Candidate versions (#897)
*Documentation**
- Expanded the release documentation with a new section explaining
Cozystack's staged release process, including details on Release
Candidates, Regular Releases, and Patch Releases.
- Clarified the workflow and purpose of Release Candidates and updated
the explanation of how regular releases are created.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-05 16:15:57 +07:00
Nick Volynkin
70fed8148d [ci] Run pre-commit checks even on doc changes
Pre-commit is now required to merge PRs, so let it run even on documentation updates.
An alternative is to merge with administrator permissions, bypassing rules,
which is not a good practice.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-05 11:57:02 +03:00
Nick Volynkin
12c6df83f5 [docs] Update release policy: Release Candidate versions
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-05-05 11:52:06 +03:00
kklinch0
f61a7817e6 [platform] add hr reconciler
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-05-05 09:26:50 +03:00
Timofei Larkin
c482289b14 Make kubevirt's CPU allocation ratio configurable (#905)
Kubevirt's default cpu-to-vcpu ration is 1:10, which might be a bit
extreme for some users. This patch introduces a new key in the Cozystack
configmap, "cpu-allocation-ratio" where admins of Cozystack can specify
an alternative value, if needed.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added support for optionally configuring a CPU allocation ratio for
KubeVirt deployments when the relevant setting is provided.
- **Chores**
- Improved configuration flexibility for KubeVirt by allowing dynamic
injection of CPU allocation settings.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-30 10:53:42 +04:00
Timofei Larkin
1e59e5fbb6 Fix virtual machine resource tracking (#904)
* Count Workload resources for pods by requests, not limits
* Do not count init container requests
* Prefix Workloads for pods with `pod-`, just like the other types to
prevent possible name collisions (closes #787)

The previous version of the WorkloadMonitor controller incorrectly
summed resource limits on pods, rather than requests. This prevented it
from tracking the resource allocation for pods, which only had requests
specified, which is particularly the case for kubevirt's virtual machine
pods. Additionally, it counted the limits for all containers, including
init containers, which are short-lived and do not contribute much to the
total resource usage.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved handling of workloads with unrecognized prefixes by ensuring
they are properly deleted and not processed further.
- Corrected resource aggregation for Pods to sum container resource
requests instead of limits, and now only includes normal containers.

- **New Features**
	- Added support for monitoring workloads with names prefixed by "pod-".

- **Tests**
- Introduced unit tests to verify correct handling of workload name
prefixes and monitored object creation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-30 10:52:17 +04:00
Timofei Larkin
6106a9fe51 Make kubevirt's CPU allocation ratio configurable
Kubevirt's default cpu-to-vcpu ration is 1:10, which might be a bit
extreme for some users. This patch introduces a new key in the Cozystack
configmap, "cpu-allocation-ratio" where admins of Cozystack can specify
an alternative value, if needed.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-29 16:13:18 +03:00
Timofei Larkin
ec9e26c054 Fix virtual machine resource tracking
* Count Workload resources for pods by requests, not limits
* Do not count init container requests
* Prefix Workloads for pods with `pod-`, just like the other types to
  prevent possible name collisions (closes #787)

The previous version of the WorkloadMonitor controller incorrectly
summed resource limits on pods, rather than requests. This prevented it
from tracking the resource allocation for pods, which only had requests
specified, which is particularly the case for kubevirt's virtual machine
pods. Additionally, it counted the limits for all containers, including
init containers, which are short-lived and do not contribute much to the
total resource usage.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-29 15:22:46 +03:00
Andrei Kvapil
108fc647ea [ci] Use dots in release candidtate versions, as per SemVer (#901)
This change also fixes `finalizing release` workflow
https://github.com/cozystack/cozystack/pull/890#issuecomment-2830525103

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated release tag validation to require a dot between "rc" and the
number (e.g., `v0.31.5-rc.1` instead of `v0.31.5-rc1`).
  - Adjusted error messages to reflect the new release tag format.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 17:01:13 +02:00
Andrei Kvapil
a9b235048d [ci] Use dots in release candidtate versions, as per SemVer
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 16:54:41 +02:00
Andrei Kvapil
e1c14619d2 Revert "[ci] automatically trigger tests in releasing PR" (#900)
Revert https://github.com/cozystack/cozystack/pull/894 due to fact this
logic does not trigger checks in pull requests

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Removed support for manually triggering the pull request release
workflow.
- Simplified release workflow to run automatically only on labeled pull
requests.
- Eliminated the step in the tags workflow that triggered release
verification via manual dispatch.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 16:50:06 +02:00
Andrei Kvapil
f644bf20c5 Revert "[ci] automatically trigger tests in releasing PR"
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 16:47:45 +02:00
Andrei Kvapil
93bdf41144 Release v0.31.0-rc.1 (#895)
This PR prepares the release `v0.31.0-rc.1`.
2025-04-25 14:56:48 +02:00
Andrei Kvapil
bacf15f037 [e2e] Fix device_ownership_from_security_context CRI (#896)
Currently, you can't create VMDisk or VMInstance. The importer pod in
Error state with logs

`kubectl -n tenant-root logs
importer-prime-84b44042-c0ac-4e52-8fbd-a0313f4701a6`

```
I0422 07:37:02.928787       1 importer.go:107] Starting importer
E0422 07:37:02.929473       1 importer.go:137] exit status 1, blockdev: cannot open /dev/cdi-block-volume: Permission denied

kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceBlock
        pkg/util/file.go:135
kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceByVolumeMode
        pkg/util/util.go:99
main.main
        cmd/cdi-importer/importer.go:135
runtime.main
        GOROOT/src/runtime/proc.go:271
runtime.goexit
        src/runtime/asm_amd64.s:1695
```

This change solves the issue with importer pod

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
  - Improved formatting of script commands for better readability.
  - Updated container runtime configuration for enhanced customization.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 14:54:56 +02:00
dtrdnk
9239852ec8 Update permissions version for CRI containerd
Signed-off-by: dtrdnk <4demenko@gmail.com>
2025-04-25 15:45:07 +03:00
github-actions
87a286fc74 Prepare release v0.31.0-rc.1
Signed-off-by: github-actions <github-actions@github.com>
2025-04-25 12:37:42 +00:00
Andrei Kvapil
6d253b937b [ci] fix triggering releasing pr tests (#898)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 14:33:57 +02:00
Andrei Kvapil
255176c321 [ci] fix triggering releasing pr tests
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 14:33:29 +02:00
Andrei Kvapil
fa341deaac [ci] automatically trigger tests in releasing PR (#894)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added the ability to manually trigger the release verification
workflow with a specific commit SHA.
- The release verification workflow now supports both pull request
events and manual triggers.
- **Chores**
- Automated triggering of release verification tests from the tags
workflow when a new release is detected.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 14:00:26 +02:00
Nick Volynkin
f08566d3f1 [ci] Use dots in release candidtate versions, as per SemVer (#890)
Before: 0.31.0-rc1
After:  0.31.0-rc.1

Why this matters: we want to do things the right way from the start.
Version patten affects how versions are parsed and sorted.
For example, we have release candidates number 9 and 10:

* In 'rc.9' and 'rc.10', the numeric parts are compared as numbers,
  so 9 comes before 10.
* In 'rc9' and 'rc10', versions are compared lexicographically,
  so 10 comes before 9, which is wrong.

Reference: SemVer items 9–11. https://semver.org/#spec-item-9
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-25 14:13:57 +03:00
Andrei Kvapil
a29040faf7 [ci] automatically trigger tests in releasing PR
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:59:46 +02:00
Nick Volynkin
637551eb33 [ci] Use dots in release candidtate versions, as per SemVer
Before: 0.31.0-rc1
After:  0.31.0-rc.1

Why this matters: we want to do things the right way from the start.
Version patten affects how versions are parsed and sorted.
For example, we have release candidates number 9 and 10:

* In 'rc.9' and 'rc.10', the numeric parts are compared as numbers,
  so 9 comes before 10.
* In 'rc9' and 'rc10', versions are compared lexicographically,
  so 10 comes before 9, which is wrong.

Reference: SemVer items 9–11. https://semver.org/#spec-item-9
Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-25 13:57:03 +03:00
Andrei Kvapil
58d959b305 [tests] refactor tests and remove e2e.applications (#893)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:55:09 +02:00
Andrei Kvapil
fcc7056e5c [platform] Fix installing release candidate versions (#891)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated version constraints for multiple HelmRelease resources to use
an explicit semantic version range (>= 0.0.0-0) instead of a wildcard or
unspecified value, clarifying eligible chart versions for deployment.
- Renamed and updated version variable in build scripts to improve
version tagging and packaging consistency.
- Enhanced deployment verification by adding readiness checks for
HelmReleases, with failure detection and reporting for non-ready
releases.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-25 12:42:20 +02:00
Andrei Kvapil
5d7e56bffe [tests] refactor tests and remove e2e.applications
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:28:43 +02:00
Andrei Kvapil
69b3ddf717 [e2e] Better output in case of failed HelmReleases
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:21:49 +02:00
Andrei Kvapil
79b5c6b5af [platform] Use devel versions notation for HelmCharts
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:13:40 +02:00
Andrei Kvapil
076128c783 [platform] Fix installing release candidate versions
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-25 12:07:30 +02:00
Andrei Kvapil
894cb14d49 [kubernetes] Fix ubuntu-container-disk tag (#887)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 16:40:26 +02:00
Andrei Kvapil
a0935e9ae4 [kubernetes] Fix ubuntu-container-disk tag
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 16:38:42 +02:00
Andrei Kvapil
f2c248acbd [ci] Create long‑lived maintenance branch after release published (#886)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated release workflows to ensure maintenance branches are created
during release finalization instead of during tag creation.
- Removed maintenance branch creation from the tag workflow and added it
to the release finalization process.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 16:23:01 +02:00
Andrei Kvapil
590f14a614 [ci] Create long‑lived maintenance branch after release published
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 16:22:22 +02:00
Andrei Kvapil
4c8dba880a [ci] fix release branch creation (#884)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:57:23 +02:00
Andrei Kvapil
de0c7b94f4 [ci] fix release branch creation
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:55:46 +02:00
Andrei Kvapil
2682a6e674 [kube-ovn] fix versions mapping in Makefile (#883)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:37:10 +02:00
Andrei Kvapil
e3e0b21612 [kube-ovn] fix versions mapping in Makefile
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:36:25 +02:00
Andrei Kvapil
455d66fbe4 [ci] Do not run tests in release building pipeline (#882)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Removed the "Test" step from the release workflow, so tests will no
longer run as part of this process.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 15:27:27 +02:00
Andrei Kvapil
7db7277636 [ci] Do not run tests in release building pipeline
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 15:00:06 +02:00
Andrei Kvapil
7be5db8cff [fluxcd] update to flux-operator 0.19.0 (#880)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced configurable API priority and fairness settings for the
Flux Operator, allowing prioritization of API requests and inclusion of
extra service accounts.
- Added support for a new `skip` field in the `ResourceSetInputProvider`
CRD to control update skipping based on label conditions.

- **Bug Fixes**
- Updated service account reference in admin ClusterRoleBinding to use
the dedicated service account name for improved accuracy.

- **Documentation**
- Updated Helm chart and app version numbers to 0.19.0 in documentation
and metadata.
- Added documentation for the new `apiPriority` configuration option in
the Flux Operator Helm chart.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:45:43 +02:00
Andrei Kvapil
249950d94b [kubernetes] Update tenant Kubernetes to v1.32 (#871)
This PR also updates ubuntu-container-disk image to latest 24.04 LTS
(Noble Numbat)

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated Kubernetes version references from v1.30.1 to v1.32 in build
and deployment configurations.
	- Changed the base image for Ubuntu container disk to Ubuntu 24.04.
	- Made the Kubernetes version configurable during build processes.
- Updated the kubectl container image in pre-delete jobs to use the
latest tag.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:43:59 +02:00
Kingdon B
44565dca88 [fluxcd] update to flux-operator 0.19.0
Signed-off-by: Kingdon B <kingdon@urmanac.com>
2025-04-24 08:25:29 -04:00
Andrei Kvapil
cefcd24ebb [ci] Fix uploading assets to release (#876)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated release workflow to use the full tag string when uploading
assets.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:25:29 +02:00
Andrei Kvapil
13d7df47d7 [kubernetes] Fix merging valuesOverride for tenant clusters (#879) 2025-04-24 14:24:53 +02:00
Andrei Kvapil
1ccd3074dc [kubernetes] Fix merging valuesOverride for tenant clusters
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 14:24:07 +02:00
Andrei Kvapil
70d3591ed2 [kubernetes] Refactor controlPlane settings (#866)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Updated documentation to rename and restructure the control plane
resource configuration section, replacing the old naming with a unified
"Kubernetes control plane configuration" and updated parameter prefixes.
- **Refactor**
- Consolidated and renamed control plane configuration from
`kamajiControlPlane` to `controlPlane` across configuration files.
- Flattened configuration structure and updated all related parameter
references and hierarchy for improved clarity and consistency.
- **New Features**
- Enhanced resource preset options with expanded enum values for control
plane components.
- **Bug Fixes**
- Simplified HelmRelease manifests by embedding override values inline,
removing dependency on external Secret resources for addons including
cert-manager, GPU operator, ingress-nginx, and vertical-pod-autoscaler.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 14:08:54 +02:00
Andrei Kvapil
700991f4fa [ci] let CI to cancel previus job if new one is scheduled (#873)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved reliability of GitHub Actions workflows by ensuring only one
job per pull request or branch runs at a time. If a new workflow run is
triggered, any previous in-progress runs for the same group will be
automatically canceled, preventing overlapping executions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 13:58:23 +02:00
Andrei Kvapil
d89acbf44d [ci] get rid of ok-to-test label (#875)
Github requires approval for external users anyway:


https://docs.github.com/en/actions/managing-workflow-runs-and-deployments/managing-workflow-runs/approving-workflow-runs-from-public-forks

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Simplified conditions for running GitHub Actions workflows on pull
requests, removing dependencies on the "ok-to-test" label and repository
origin.
  - Updated comments to reflect the new workflow logic.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 13:58:12 +02:00
Andrei Kvapil
59ef3296f0 [ci] Fix uploading assets to release
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:57:24 +02:00
Andrei Kvapil
3ed0cdee1c [kubernetes] Update tenant Kubernetes to v1.32
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:43:56 +02:00
Andrei Kvapil
9f5230a342 [kubernetes] Refactor controlPlane settings
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:35:10 +02:00
Andrei Kvapil
b895ccfdeb [cluster-api] Update operator, providers, remove Kamaji workaround (#867)
- Update Cluster API operator to v0.19.0
- Update Cluster API Kamaji control-plane provider to v0.14.2.
- This change includes [upstream
fix](https://github.com/clastix/cluster-api-control-plane-provider-kamaji/pull/175),
so our workaround get removed
- Update Cluster API KubeVirt infrastructure provider to v0.1.10
- Update Cluster API core provider to v1.10.0
- Update Cluster API kubeadm config provider to v1.10.0



Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 13:30:15 +02:00
Andrei Kvapil
d54a407d68 [ci] Disable pre-commit for release branches
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 11:46:26 +02:00
Andrei Kvapil
f9ec630509 [ci] get rid of ok-to-test label
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 11:39:45 +02:00
Andrei Kvapil
3f47181c10 [postgres] remove douplicated template from backup manifest
Resolves https://github.com/cozystack/cozystack/issues/869



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Updated backup cron job configuration for improved clarity and
structure. No changes to backup behavior or scheduling.
- **Chores**
  - Incremented the application chart version to 0.10.1.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 11:34:36 +02:00
Ian Simon
19409d801d [postgres] remove douplicated template from backup manifest
Signed-off-by: Ian Simon <cheatmaster114@gmail.com>
2025-04-24 11:29:30 +02:00
Andrei Kvapil
8a4793d571 [ci] let CI to cancel previus job if new one is scheduled
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-24 11:25:10 +02:00
Andrei Kvapil
0fc3fdcb3d Update Kube-OVN to v1.13.10 (#847)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
	- Updated kube-ovn chart and container image to version v1.13.10.
- **Bug Fixes**
- Adjusted volume mount paths in the ovncni DaemonSet for improved
configuration consistency.
- **Chores**
	- Streamlined Dockerfile to use the official kube-ovn image directly.
- Automated version synchronization between chart files and Dockerfile
for better maintainability.
- **Improvements**
- Removed NetworkManager synchronization to optimize controller runtime
behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-24 00:24:59 +02:00
Andrei Kvapil
04e2b3952b Update Kube-OVN to v1.13.10
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 23:25:19 +02:00
Andrei Kvapil
b56624a781 [cluster-api] Update operator, providers, remove Kamaji workaround
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 17:19:29 +02:00
Timofei Larkin
07d7fadb1a Suppress wget progress bar (#865)
In our CI wget spams thousands of lines of the progress bar into the
output, making it hard to read. Turns out, it doesn't have an option to
just remove the progress bar, but explicitly directing wget's log to
stdout and invoking --show-progress sends that to stderr which we
redirect to dev/null. The downloaded size is still reported at regular
intervals, but --progress=dot:giga shortens that to one line per 32M
which is manageable.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved file download process to display clearer progress updates
during downloads.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 19:02:12 +04:00
Andrei Kvapil
8db92d53d1 [kubernetes] Add gpu-operator and introduce GPU support for tenant Kubernetes clusters (#834)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for GPU resources in Kubernetes clusters, including the
ability to specify GPUs per node group and deploy the NVIDIA GPU
Operator as an optional addon.
- Introduced new configuration options for customizing Kamaji control
plane resources and presets.
- Added support for vertical pod autoscaler customization via override
values.

- **Bug Fixes**
- Corrected typographical errors in label keys across multiple
HelmRelease manifests to ensure consistent labeling.

- **Documentation**
- Updated documentation to describe new GPU and control plane
configuration options, removed the instance type feature matrix, and
added detailed parameter explanations.

- **Chores**
- Incremented Kubernetes app chart version to 0.19.0 and updated version
mappings.
  - Fixed typos in parameter descriptions and comments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 16:44:01 +02:00
Andrei Kvapil
7537235f43 [kubernetes] Add gpu-operator and introduce GPU support for tenant Kubernetes clusters
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 16:39:10 +02:00
Timofei Larkin
4bb524e53d Suppress wget progress bar
In our CI wget spams thousands of lines of the progress bar into the
output, making it hard to read. Turns out, it doesn't have an option to
just remove the progress bar, but explicitly directing wget's log to
stdout and invoking --show-progress sends that to stderr which we
redirect to dev/null. The downloaded size is still reported at regular
intervals, but --progress=dot:giga shortens that to one line per 32M
which is manageable.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-23 17:37:57 +03:00
Andrei Kvapil
e7ded52f93 [virtual-machine] Fix: Add GPU names to virtual machines spec (#862)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Each GPU device entry now includes a unique identifier alongside its
device name in both VirtualMachine and VM Instance templates.

- **Configuration**
- The default GPU configuration now includes a specific GPU entry by
default, instead of being empty.

- **Version Updates**
- Chart versions for VirtualMachine and VM Instance applications have
been incremented.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 16:26:13 +02:00
Andrei Kvapil
8547dc3b21 [virtual-machine] Fix: Add GPU names to virtual machines spec
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 15:27:47 +02:00
Andrei Kvapil
c22603bf7e [tenant] Fix networkpolicy for accessing externalIPs from the cluster (#854)
This PR fixes an issue with accessing external IPs of cluster from
cluster itself

```
Policy verdict log: flow 0x6c9bf32e local EP ID 1155, remote ID remote-node, proto 6, ingress, action deny, auth: disabled, match none, 172.27.88.13:46124 -> 10.244.4.174:30274 tcp SYN
xx drop (Policy denied) flow 0x6c9bf32e to endpoint 1155, ifindex 247, file bpf_lxc.c:2181, , identity remote-node->56986: 172.27.88.13:46124 -> 10.244.4.174:30274 tcp SYN
```

related doc:
https://docs.cilium.io/en/stable/security/policy/language/#entities-based


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Expanded network access for the tenant application to allow
connections from both external sources and within the cluster.

- **Chores**
	- Updated the tenant application to version 1.9.2.
	- Adjusted version mappings to reflect the latest release.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 14:30:55 +02:00
Andrei Kvapil
89525dedb5 [e2e] fix timeouts for capi and keycloak (#858)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Increased timeout durations for waiting on certain Kubernetes
resources to improve reliability during environment setup.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 14:26:17 +02:00
Andrei Kvapil
1c53a6f9f6 [e2e] fix timeouts for capi and keycloak
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 14:01:51 +02:00
Andrei Kvapil
16ee0f2c3a [platform]: add vpa for cozy etcd operator (#850)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added support for Vertical Pod Autoscaler (VPA) configuration in the
etcd-operator Helm chart, allowing automatic scaling of CPU and memory
resources for both the operator and kube-rbac-proxy components.
- Introduced new configuration options for enabling VPA, setting
resource limits, and specifying update policies.
- **Documentation**
- Updated documentation to describe the new VPA configuration options
and usage.
- **Chores**
  - Incremented chart version to 0.4.2.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 13:58:47 +02:00
Andrei Kvapil
72d0394475 Revert "[platform] Hash tenant config and store in configmap" (#855)
Reverts cozystack/cozystack#818, according to decicion made in
https://github.com/cozystack/cozystack/issues/802#issuecomment-2823950243

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Removed configuration hash ConfigMaps and related logic from the
system.
- Updated resource templates to no longer reference configuration hash
values.
- Cleaned up internal constants and code related to configuration hash
handling.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-23 13:24:34 +02:00
Andrei Kvapil
0a998c8b49 Revert "[platform] Hash tenant config and store in configmap"
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 13:24:14 +02:00
Andrei Kvapil
7bfad655c2 Fix: networkpolicy for tenant to access from cluster
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-23 12:18:40 +02:00
Andrei Kvapil
e81cbf780c [ci] Enable release-candidates and backport functionality (#841)
This PR includes refactored pipeline:
- Automatcially create long-term releasing branch `release-X.Y` after
any tag `vX.Y.*` has publushed
- Allow only tags with names `vX.Y.Z` or `vX.Y.Z-rcN`
- Automatically set `prerelease` option for the release if release is
candidate
- Automatically set `latest` option for the release according to semver
- Add a new workflow to backport PRs with `backport` label into current
feature release
- Do not requrie `ok-to-test` label for internal PRs
2025-04-23 12:06:50 +02:00
kklinch0
e8cc44450a [platform]: add vpa for cozy etcd operator
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 22:48:47 +03:00
Andrei Kvapil
d3a8a4a7de Update Cilium to v1.17.3 (#848)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 20:02:06 +02:00
Andrei Kvapil
fc2c5a0f6b [kubevirt] Enable VMExport feature (#808)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new configuration option to control the virtual export
proxy service (default disabled).
- Deployed a dedicated ingress configuration to support flexible routing
for the virtual export proxy.
- Enabled a feature toggle for VM export capabilities in KubeVirt
deployments.
- **Documentation**
- Updated user documentation to include details about the new virtual
export proxy parameter.
- **Chores**
- Upgraded the associated ingress component from version 1.4.0 to 1.5.0.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 20:01:47 +02:00
Andrei Kvapil
0f8b8e1744 Update LINSTOR to v1.31.0 (#846)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated Helm chart version and container image tags for Piraeus
Operator and related components to newer releases. This includes updates
for controller, satellite, CSI, DRBD, and sig-storage images. No other
configuration changes were made.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 19:59:50 +02:00
Andrei Kvapil
197434ff94 [platform] Hash tenant config and store in configmap (#818)
Every tenant now creates a configmap in its __tenant__ namespace with a
sha256 of its values. Tenants (and eventually all other apps), watch the
configmap in their __release__ namespace, by referencing it in the
valuesFrom part of the HelmRelease. `tenant-root` is an exception, since
it is the only tenant where the release namespace is the same as the
tenant namespace. It references a different configmap in its valesFrom,
created and reconciled by the cozystack installer script. Part of #802.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Summary by CodeRabbit

- **New Features**
- Introduced ConfigMaps that provide SHA256 hashes representing
aggregated tenant and system configurations for improved configuration
tracking.
- Configuration hashes are now injected into application releases,
including a special system configuration hash for the root tenant.

- **Chores**
- Added new constants for configuration hash naming to improve
consistency and maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 19:38:37 +02:00
Andrei Kvapil
703073a164 Update Cilium to v1.17.3
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 19:30:30 +02:00
Andrei Kvapil
6a0fc64475 Update LINSTOR to v1.31.0
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 19:12:44 +02:00
Timofei Larkin
f1624353ef Hash tenant config and store in configmap
Every tenant now creates a configmap in its __tenant__ namespace with a
sha256 of its values. Tenants (and eventually all other apps), watch the
configmap in their __release__ namespace, by referencing it in the
valuesFrom part of the HelmRelease. `tenant-root` is an exception, since
it is the only tenant where the release namespace is the same as the
tenant namespace. It references a different configmap in its valesFrom,
created and reconciled by the cozystack installer script. Part of #802.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-22 18:57:18 +02:00
Andrei Kvapil
277b438f68 [monitoring] Drop legacy label condition. (#826)
ref: https://github.com/deckhouse/deckhouse/pull/960/files

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Updated dashboard metrics filters to exclude containers with empty
names instead of specifically excluding containers named "POD". This
change applies to all relevant CPU, memory, network, and storage metrics
across capacity planning, controller, namespace, namespaces, and pod
dashboards. No other dashboard functionality or structure was changed.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 18:55:47 +02:00
Andrei Kvapil
405863cb11 Drop legacy label condition also for FluxCD.
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:53:05 +02:00
Andrei Kvapil
63ebab5c2a [ci] Enable release-candidates and backport functionality
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:49:40 +02:00
Andrei Kvapil
0ddaff9380 [kubevirt] Enable VMExport feature
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:40:01 +02:00
Andrei Kvapil
a6b02bf381 [ci] Fix checkout and improve error output for gen_versions_map.sh (#845)
Third attempt to fix https://github.com/cozystack/cozystack/pull/842 and
https://github.com/cozystack/cozystack/pull/836

tested in
https://github.com/cozystack/cozystack/actions/runs/14599981710/job/40955508728?pr=808

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved GitHub Actions workflow to fetch full git history and tags
during pre-commit checks.
- **Refactor**
- Updated script behavior to display error messages when version
extraction from git fails, making troubleshooting easier.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 18:38:08 +02:00
Andrei Kvapil
39ede77fec [ci] Fix checkout and improve error output for gen_versions_map.sh
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 18:34:50 +02:00
Andrei Kvapil
e505857832 [ci] Fix escaping for gen_versions_map.sh script (#842)
second attept of https://github.com/cozystack/cozystack/pull/836

fixes errors like this:

-
https://github.com/cozystack/cozystack/actions/runs/14591720553/job/40928276862?pr=835

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Improved reliability of version generation by handling empty or
special values safely in the process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 17:53:36 +02:00
Andrei Kvapil
d8f3547db7 [ci] Fix escaping for gen_versions_map.sh script
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 17:52:53 +02:00
Denis Seleznev
6d8a99269b Drop legacy label condition.
Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-22 17:42:15 +02:00
klinch0
b9112a398e [platform]: fix migrations (#840)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
	- Updated installer image to include additional system utilities.
- Migration scripts now update Kubernetes ConfigMap with the current
stack version for improved version tracking.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 18:11:24 +03:00
kklinch0
719fdd29cc [platform]: fix migrations
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 17:40:59 +03:00
Timofei Larkin
9e1376f709 Indicate the IP address pool and storage class (#831)
When populating the WorkloadMonitor objects, the status field is now
populated with a specially formatted string, mimicking the keys of
ResourceQuota.spec.hard, e.g.
`<storageclassname>.storageclass.storage.k8s.io/requests.storage` or
`<ipaddresspoolname>.ipaddresspool.metallb.io/requests.ipaddresses`
so the storage class or IP pool in use can be tracked. Part of #788.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Improved labeling of resource usage in workload status by using more
descriptive, context-based keys for IP addresses and storage resources.
This enhances clarity when viewing resource allocation details.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 17:48:51 +04:00
klinch0
7a9a1fcba4 [ci] Fix escaping for gen_versions_map.sh script (#836)
fixes errors like this:

-
https://github.com/cozystack/cozystack/actions/runs/14591720553/job/40928276862?pr=835

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Improved reliability of version generation by handling empty or
special values safely in the process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 16:35:54 +03:00
kklinch0
2def9f4e83 [ci] Fix escaping for gen_versions_map.sh script
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 16:33:40 +03:00
klinch0
c1046aae6a [github] Add @klinch0 to CODEOWNERS (#838) 2025-04-22 16:31:08 +03:00
klinch0
53cf1c537c [dx] automatically detect version for migrations in installer.sh (#837)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated migration versioning to automatically determine the next
version based on existing migration scripts, removing the need for
manual updates.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 16:24:01 +03:00
klinch0
ccedcb7419 [kubernetes] Fix tenant addons removal (#835)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Expanded the pre-delete operation to target additional components,
including cert-manager and vertical pod autoscaler resources.
- **Chores**
- Updated chart version to 0.18.1 and revised version mappings for
improved tracking.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 16:07:54 +03:00
Timofei Larkin
f94a01febd Indicate the IP address pool and storage class
When populating the WorkloadMonitor objects, the status field is now
populated with a specially formatted string, mimicking the keys of
ResourceQuota.spec.hard, e.g.
`<storageclassname>.storageclass.storage.k8s.io/requests.storage` or
`<ipaddresspoolname>.ipaddresspool.metallb.io/requests.ipaddresses`
so the storage class or IP pool in use can be tracked. Part of #788.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-22 15:59:17 +03:00
Andrei Kvapil
495e584313 [github] Add @klinch0 to CODEOWNERS
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 12:47:42 +02:00
Andrei Kvapil
172e660cd1 [dx] automatically detect version for migrations in installer.sh
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 12:46:54 +02:00
Andrei Kvapil
14262cdd2a [platform]: add migration for kube-rbac-proxy daemonset (#830)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Introduced a migration script to update monitoring resources, ensuring
refreshed configurations and pod restarts for improved system stability.
	- Updated installer version tracking to support the latest migration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 12:44:56 +02:00
Andrei Kvapil
80576cb757 [platform]: add VerticalPodAutoscaler for Cozystack dashboard (#828)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced automated resource management for dashboard components
using Kubernetes VerticalPodAutoscaler, enabling dynamic adjustment of
CPU and memory resources.
- **Chores**
- Updated configuration to explicitly set resource presets to "none" for
dashboard, frontend, and related components.
- Added a migration script to ensure Keycloak configuration is properly
reconciled in managed environments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 12:44:27 +02:00
kklinch0
fde6e9cc73 [platform]: add migration for kube-rbac-proxy daemonset
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 13:05:48 +03:00
Timofei Larkin
57ca60c5a5 [platform] Fix installing HelmReleases on initial setup (#833)
fixes https://github.com/cozystack/cozystack/issues/832

This PR fixes regression on installing helmreleases, also some refactor

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 14:01:32 +04:00
Andrei Kvapil
1d0ee15948 [kubernetes] Fix tenant addons removal
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 11:42:40 +02:00
kklinch0
eeaa1b4517 [platform]: add migration for kube-rbac-proxy daemonset
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-22 12:38:49 +03:00
Andrei Kvapil
a14bcf98dd [platform]: make lower resource request for capi-kamaji-controller-manager (#825)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated resource specifications for the "kamaji" provider to include
CPU and memory requests in addition to existing limits.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 11:22:33 +02:00
Andrei Kvapil
be84fc6e4e Fix: installing HelmReleases on initial setup
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-22 09:48:53 +02:00
kklinch0
73a3f481bc (platform): make lower resource request for capi-kamaji-controller-manager
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-18 15:00:52 +03:00
Andrei Kvapil
5903bbc64a [ci] Fix: do not run tests in case of release skipped (#822)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-17 23:31:07 +02:00
Andrei Kvapil
f204809e43 [ci] Revert: Workflows: Use real username to commit changes and fix assets (#823)
Let's revert 3c511023f3, because DCO don't
like such commits
2025-04-17 23:30:51 +02:00
Andrei Kvapil
fe4806ce49 [ci] Revert: Workflows: Use real username to commit changes and fix assets
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-17 23:29:41 +02:00
Andrei Kvapil
8f535acc3f [ci] Fix: do not run tests in case of release skipped
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-17 23:24:20 +02:00
Andrei Kvapil
53cbb4ae12 [monitoring] fix vpa for vmagent delete resources (#820)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated resource allocation settings for monitoring agents by removing
predefined CPU and memory limits.
- Added an option to specify separate resource settings for the config
reloader component.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-17 23:16:12 +02:00
kklinch0
4e9446d934 [monitoring] fix vpa for vmagent delete resources
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-17 21:38:28 +03:00
Andrei Kvapil
acbfb6ad64 [docs] Describe the Cozystack release workflow (#817)
See preview in
https://github.com/cozystack/cozystack/blob/127-document-release-workflow/docs/release.md

Resolves #127

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Added a comprehensive "Release Workflow" section detailing steps for
regular and patch releases, including tagging, CI workflows, pull
request management, artifact building, and publication.
- Included diagrams illustrating branching and release flows for
improved clarity.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-17 09:14:47 +02:00
Andrei Kvapil
8570449080 [ci] Update pipeline for patch releases (#816)
This PR includes the following changes:

* Do not remove version tag as part of releasing pipeline
* Overwrite tag only by fact of merging releasing pull request
* Automatically detect merge base and prepare pull request for this base
* Allow to run pipeline only for tags created on `main` and
`release-X.Y` branches


Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved workflow reliability by forcing Git tag creation and push to
overwrite existing tags if necessary.
- Enhanced workflow documentation with detailed, numbered comments for
greater clarity.
- Updated tag-based workflow to dynamically determine the base branch,
ensuring only valid branches are used.
	- Removed the automatic deletion of pushed tags in the workflow.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-17 09:14:28 +02:00
Nick Volynkin
ffe6109dfb [docs] Describe the Cozystack release workflow
Resolves #127

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-16 19:31:58 +03:00
Andrei Kvapil
7dbb8a1d75 [ci] Update pipeline for patch releases
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-16 16:54:19 +02:00
Andrei Kvapil
86210c1fc1 Release v0.30.2 (#813)
This PR prepares the release `v0.30.2`.
(Please merge it before releasing draft)
2025-04-16 09:45:47 +02:00
kvaps
e96f15773d Prepare release v0.30.2
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-15 07:42:59 +00:00
klinch0
bc5635dd8e [monitoring] add vpa for users k8s clusters (#806)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the application version to 0.18.0 with refined version
tracking for improved deployment clarity.
  
- **New Features**
- Enhanced the monitoring agents integration with updated dependency
management.
- Introduced new deployment configurations for the vertical pod
autoscaler and its custom resource definitions, offering customizable
override options and improved reconciliation strategies.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-15 09:38:38 +02:00
Andrei Kvapil
5d71c90f0a [platform] Another logic for deleting components (#811)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Streamlined the internal deployment process by consolidating deletion
operations and simplifying task dependencies.
- **New Features**
- Enhanced release management with updated logic that automatically
determines whether to deploy or remove components based on their enabled
status.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 17:34:28 +02:00
Andrei Kvapil
05d6ab9516 [platform] Another logic for deleting components
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-14 17:02:50 +02:00
Andrei Kvapil
ccb001ee97 [platform] revert API_VERSIONS_FLAGS (#810)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Improved the deployment process to better incorporate API version
settings, enhancing the consistency and accuracy of resource generation
during deployment.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 14:40:32 +02:00
kklinch0
5a5cf91742 (platform): revert API_VERSIONS_FLAGS
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-14 15:36:16 +03:00
klinch0
6a0d4913f2 [platform] fix deleting bundles (#809)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced the container image with an additional YAML processing tool
for improved configuration management.
- Introduced new workflow commands that streamline deployment operations
by reconciling resource changes and automating cleanup.
- Enabled management of disabled components by automatically suspending
and flagging inactive deployments for optimized system performance.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 14:28:08 +03:00
klinch0
685e50bf6c [monitoring] add vpa for users k8s clusters (#806)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the application version to 0.18.0 with refined version
tracking for improved deployment clarity.
  
- **New Features**
- Enhanced the monitoring agents integration with updated dependency
management.
- Introduced new deployment configurations for the vertical pod
autoscaler and its custom resource definitions, offering customizable
override options and improved reconciliation strategies.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 14:07:35 +03:00
kklinch0
f90fc6f681 [platform] fix deleting bundles
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-14 13:22:33 +03:00
Andrei Kvapil
d8f3f2dee1 [ci] Fix matching tag for release branch (#805)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the automated release process to format version tags with a
"v" prefix for consistent version naming.
  - Performed minor cleanup to improve overall code clarity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-14 09:04:57 +02:00
kklinch0
da8100965f [monitoring] add vpa for users k8s clusters
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-11 14:52:26 +03:00
Andrei Kvapil
6d2ea1295e [ci] Fix matching tag for release branch
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-11 12:41:33 +02:00
Andrei Kvapil
d7914ff9aa Release v0.30.1 (#804)
This PR prepares the release `v0.30.1`.
(Please merge it before releasing draft)
2025-04-11 12:36:18 +02:00
kvaps
7f4af5ebbc Prepare release v0.30.1
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-11 10:07:16 +00:00
Andrei Kvapil
8f575c455c [monitoring] Refactor management etcd monitoring config (#799)
* Reuse the vmagent's serviceaccount
* Mount the serviceaccount token instead of manually creating secrets
* Give the kube-rbac-proxy a unique labelset to avoid targeting wrong
pods

Resolves #789 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new proxy component within the monitoring configuration.

- **Refactor**
  - Updated resource labeling for improved consistency.
- Revised service account references and authentication settings for a
more streamlined operation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-11 10:57:30 +02:00
Andrei Kvapil
819166eb35 [monitoring] create a new version to include fix for vlogs image (#803)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
  - Updated the monitoring application version from 1.9.1 to 1.9.2.
- Refined version reference details by shifting from a dynamic reference
to a fixed commit identifier for improved tracking and reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-11 10:55:54 +02:00
kklinch0
f507802ec9 (monitoring) patch for fix vlogs image
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-11 11:51:33 +03:00
Andrei Kvapil
62267811cb [ci] Fix release branch matching (#800)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the branch naming convention for release management to remove
the 'v' prefix.
  - Revised the displayed error messages to match the new format.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-11 10:46:32 +02:00
Andrei Kvapil
12bedef2d3 [ci] Fix release branch matching 2025-04-11 08:55:11 +02:00
Andrei Kvapil
fd240701f8 Release v0.30.0 (#783)
This PR prepares the release `v0.30.0`.
(Please merge it before releasing draft)
2025-04-10 16:35:41 +02:00
Timofei Larkin
60b96e0a62 Refactor management etcd monitoring config
* Reuse the vmagent's serviceaccount
* Mount the serviceaccount token instead of manually creating secrets
* Give the kube-rbac-proxy a unique labelset to avoid targeting wrong
  pods

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-10 16:59:43 +03:00
kvaps
1d377bab9d Prepare release v0.30.0
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-10 12:50:47 +00:00
Andrei Kvapil
799690dc07 [ci] Allow tests to run in paralel (#786)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:36:01 +02:00
Andrei Kvapil
aa02d0c5e6 [tests] fix tests and add retry for monitoring hr tests (#791)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced explicit versioning for monitoring components, now using
version *v1.17.0-victorialogs*.
- Enhanced the Docker setup by including Flux CD support for improved
deployments.

- **Chores**
- Optimized deployment automation to streamline service readiness checks
and improve overall reliability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:35:50 +02:00
Andrei Kvapil
6b8ecf3953 Upd: Kube-OVN to v1.13.8 (#797) 2025-04-10 14:35:22 +02:00
Andrei Kvapil
e5b81f367e Upd: Cilium to v1.17.2 (#796)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:35:07 +02:00
Andrei Kvapil
ba4798464d Upd: Kamaji to edge-25.3.2 (#795)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:34:50 +02:00
Andrei Kvapil
655e8be382 Upd: Keycloak-operator to v1.25.0 (#794)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Upgraded to a new release version, offering enhanced integration and
secure client configuration options.
- Expanded realm settings now support advanced user profile
customization and robust email configuration for streamlined operations.
- Improved administrative views deliver clearer insights for managing
your system.

- **Documentation**
- Installation and release details have been updated to accurately
reflect the latest version.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:34:32 +02:00
Andrei Kvapil
fd9a5b0d7b Update Cluster-API operator to v0.18.1 (#793)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Upgraded the Cluster API Operator release to version 0.18.1 with an
updated manager image.
- Introduced a new configuration option that enables conditional
deployment hooks, allowing for custom post-install and post-upgrade
actions.
- Enhanced resource synchronization with dynamic feature gate settings,
providing more flexible and coordinated deployments.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:34:16 +02:00
Andrei Kvapil
d09314bbb5 Upd: victoria-metrics operator to v0.55.0 (#792)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Upgraded operator and chart versions with additional configuration
options for enhanced metrics scraping, security, and resource
management.
- Introduced new settings to fine-tune monitoring endpoints and resource
definitions.

- **Documentation**
  - Updated release notes and metadata for clearer change tracking.
  - Streamlined user documentation by removing legacy files.

- **Improvements**
- Enhanced error handling and validation logic for smoother deployments.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 14:33:58 +02:00
Andrei Kvapil
371791215a Upd: Kube-OVN to v1.13.8
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:32:34 +02:00
Andrei Kvapil
4c220bb443 Upd: Cilium to v1.17.2
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:28:51 +02:00
Andrei Kvapil
e27611d45a Upd: Kamaji to edge-25.3.2
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:28:09 +02:00
Andrei Kvapil
a3e647c547 Upd: Keycloak-operator to v1.25.0
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:26:58 +02:00
Andrei Kvapil
be52fe5461 Update Cluster-API operator to v0.18.1
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:26:09 +02:00
kklinch0
1d639fda0d [tests] fix tests and add retry for monitoring hr tests
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-10 15:23:31 +03:00
Andrei Kvapil
bbdde79428 Upd: victoria-metrics operator to v0.55.0
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 14:23:12 +02:00
Andrei Kvapil
1966f86120 [ci] Allow to run tests in paralel
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 11:35:06 +02:00
Andrei Kvapil
fa7c98bc38 [ci] Automatically run tests as part of release pipepline (#785)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 11:15:38 +02:00
Andrei Kvapil
6671869acc [ci] Automatically run tests as part of release pipepline
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-10 11:13:51 +02:00
435 changed files with 76967 additions and 11492 deletions

2
.github/CODEOWNERS vendored
View File

@@ -1 +1 @@
* @kvaps @lllamnyp
* @kvaps @lllamnyp @klinch0

53
.github/workflows/backport.yaml vendored Normal file
View File

@@ -0,0 +1,53 @@
name: Automatic Backport
on:
pull_request_target:
types: [closed] # fires when PR is closed (merged)
concurrency:
group: backport-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
permissions:
contents: write
pull-requests: write
jobs:
backport:
if: |
github.event.pull_request.merged == true &&
contains(github.event.pull_request.labels.*.name, 'backport')
runs-on: [self-hosted]
steps:
# 1. Decide which maintenance branch should receive the backport
- name: Determine target maintenance branch
id: target
uses: actions/github-script@v7
with:
script: |
let rel;
try {
rel = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
} catch (e) {
core.setFailed('No existing releases found; cannot determine backport target.');
return;
}
const [maj, min] = rel.data.tag_name.replace(/^v/, '').split('.');
const branch = `release-${maj}.${min}`;
core.setOutput('branch', branch);
console.log(`Latest release ${rel.data.tag_name}; backporting to ${branch}`);
# 2. Checkout (required by backportaction)
- name: Checkout repository
uses: actions/checkout@v4
# 3. Create the backport pull request
- name: Create backport PR
uses: korthout/backport-action@v3
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
label_pattern: '' # don't read labels for targets
target_branches: ${{ steps.target.outputs.branch }}

View File

@@ -1,18 +1,22 @@
name: Pre-Commit Checks
on:
push:
branches:
- main
pull_request:
paths-ignore:
- '**.md'
types: [labeled, opened, synchronize, reopened]
concurrency:
group: pre-commit-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
pre-commit:
runs-on: ubuntu-22.04
steps:
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: 0
fetch-tags: true
- name: Set up Python
uses: actions/setup-python@v4

View File

@@ -4,6 +4,10 @@ on:
pull_request:
types: [labeled, opened, synchronize, reopened, closed]
concurrency:
group: pull-requests-release-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
verify:
name: Test Release
@@ -13,7 +17,6 @@ jobs:
packages: write
if: |
contains(github.event.pull_request.labels.*.name, 'ok-to-test') &&
contains(github.event.pull_request.labels.*.name, 'release') &&
github.event.action != 'closed'
@@ -31,6 +34,64 @@ jobs:
password: ${{ secrets.GITHUB_TOKEN }}
registry: ghcr.io
- name: Extract tag from PR branch
id: get_tag
uses: actions/github-script@v7
with:
script: |
const branch = context.payload.pull_request.head.ref;
const m = branch.match(/^release-(\d+\.\d+\.\d+(?:[-\w\.]+)?)$/);
if (!m) {
core.setFailed(`❌ Branch '${branch}' does not match 'release-X.Y.Z[-suffix]'`);
return;
}
const tag = `v${m[1]}`;
core.setOutput('tag', tag);
- name: Find draft release and get asset IDs
id: fetch_assets
uses: actions/github-script@v7
with:
github-token: ${{ secrets.GH_PAT }}
script: |
const tag = '${{ steps.get_tag.outputs.tag }}';
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo,
per_page: 100
});
const draft = releases.data.find(r => r.tag_name === tag && r.draft);
if (!draft) {
core.setFailed(`Draft release '${tag}' not found`);
return;
}
const findAssetId = (name) =>
draft.assets.find(a => a.name === name)?.id;
const installerId = findAssetId("cozystack-installer.yaml");
const diskId = findAssetId("nocloud-amd64.raw.xz");
if (!installerId || !diskId) {
core.setFailed("Missing required assets");
return;
}
core.setOutput("installer_id", installerId);
core.setOutput("disk_id", diskId);
- name: Download assets from GitHub API
run: |
mkdir -p _out/assets
curl -sSL \
-H "Authorization: token ${GH_PAT}" \
-H "Accept: application/octet-stream" \
-o _out/assets/cozystack-installer.yaml \
"https://api.github.com/repos/${GITHUB_REPOSITORY}/releases/assets/${{ steps.fetch_assets.outputs.installer_id }}"
curl -sSL \
-H "Authorization: token ${GH_PAT}" \
-H "Accept: application/octet-stream" \
-o _out/assets/nocloud-amd64.raw.xz \
"https://api.github.com/repos/${GITHUB_REPOSITORY}/releases/assets/${{ steps.fetch_assets.outputs.disk_id }}"
env:
GH_PAT: ${{ secrets.GH_PAT }}
- name: Run tests
run: make test
@@ -39,38 +100,137 @@ jobs:
runs-on: [self-hosted]
permissions:
contents: write
if: |
github.event.pull_request.merged == true &&
contains(github.event.pull_request.labels.*.name, 'release')
steps:
# Extract tag from branch name (branch = release-X.Y.Z*)
- name: Extract tag from branch name
id: get_tag
uses: actions/github-script@v7
with:
script: |
const branch = context.payload.pull_request.head.ref;
const match = branch.match(/^release-(v\d+\.\d+\.\d+(?:[-\w\.]+)?)$/);
if (!match) {
core.setFailed(`Branch '${branch}' does not match expected format 'release-vX.Y.Z[-suffix]'`);
} else {
const tag = match[1];
core.setOutput('tag', tag);
console.log(`✅ Extracted tag: ${tag}`);
const m = branch.match(/^release-(\d+\.\d+\.\d+(?:[-\w\.]+)?)$/);
if (!m) {
core.setFailed(`Branch '${branch}' does not match 'release-X.Y.Z[-suffix]'`);
return;
}
const tag = `v${m[1]}`;
core.setOutput('tag', tag);
console.log(`✅ Tag to publish: ${tag}`);
# Checkout repo & create / push annotated tag
- name: Checkout repo
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Create tag on merged commit
- name: Create tag on merge commit
run: |
git tag ${{ steps.get_tag.outputs.tag }} ${{ github.sha }}
git push origin ${{ steps.get_tag.outputs.tag }}
git tag -f ${{ steps.get_tag.outputs.tag }} ${{ github.sha }}
git push -f origin ${{ steps.get_tag.outputs.tag }}
# Ensure maintenance branch release-X.Y
- name: Ensure maintenance branch release-X.Y
uses: actions/github-script@v7
with:
github-token: ${{ secrets.GH_PAT }}
script: |
const tag = '${{ steps.get_tag.outputs.tag }}'; // e.g. v0.1.3 or v0.1.3-rc3
const match = tag.match(/^v(\d+)\.(\d+)\.\d+(?:[-\w\.]+)?$/);
if (!match) {
core.setFailed(`❌ tag '${tag}' must match 'vX.Y.Z' or 'vX.Y.Z-suffix'`);
return;
}
const line = `${match[1]}.${match[2]}`;
const branch = `release-${line}`;
// Get main branch commit for the tag
const ref = await github.rest.git.getRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: `tags/${tag}`
});
const commitSha = ref.data.object.sha;
try {
await github.rest.repos.getBranch({
owner: context.repo.owner,
repo: context.repo.repo,
branch
});
await github.rest.git.updateRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: `heads/${branch}`,
sha: commitSha,
force: true
});
console.log(`🔁 Force-updated '${branch}' to ${commitSha}`);
} catch (err) {
if (err.status === 404) {
await github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: `refs/heads/${branch}`,
sha: commitSha
});
console.log(`✅ Created branch '${branch}' at ${commitSha}`);
} else {
console.error('Unexpected error --', err);
core.setFailed(`Unexpected error creating/updating branch: ${err.message}`);
throw err;
}
}
# Get the latest published release
- name: Get the latest published release
id: latest_release
uses: actions/github-script@v7
with:
script: |
try {
const rel = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
core.setOutput('tag', rel.data.tag_name);
} catch (_) {
core.setOutput('tag', '');
}
# Compare current tag vs latest using semver-utils
- name: Semver compare
id: semver
uses: madhead/semver-utils@v4.3.0
with:
version: ${{ steps.get_tag.outputs.tag }}
compare-to: ${{ steps.latest_release.outputs.tag }}
# Derive flags: prerelease? make_latest?
- name: Calculate publish flags
id: flags
uses: actions/github-script@v7
with:
script: |
const tag = '${{ steps.get_tag.outputs.tag }}'; // v0.31.5-rc.1
const m = tag.match(/^v(\d+\.\d+\.\d+)(-(?:alpha|beta|rc)\.\d+)?$/);
if (!m) {
core.setFailed(`❌ tag '${tag}' must match 'vX.Y.Z' or 'vX.Y.Z-(alpha|beta|rc).N'`);
return;
}
const version = m[1] + (m[2] ?? ''); // 0.31.5-rc.1
const isRc = Boolean(m[2]);
core.setOutput('is_rc', isRc);
const outdated = '${{ steps.semver.outputs.comparison-result }}' === '<';
core.setOutput('make_latest', isRc || outdated ? 'false' : 'legacy');
# Publish draft release with correct flags
- name: Publish draft release
uses: actions/github-script@v7
with:
@@ -78,19 +238,17 @@ jobs:
const tag = '${{ steps.get_tag.outputs.tag }}';
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo
repo: context.repo.repo
});
const release = releases.data.find(r => r.tag_name === tag && r.draft);
if (!release) {
throw new Error(`Draft release with tag ${tag} not found`);
}
const draft = releases.data.find(r => r.tag_name === tag && r.draft);
if (!draft) throw new Error(`Draft release for ${tag} not found`);
await github.rest.repos.updateRelease({
owner: context.repo.owner,
repo: context.repo.repo,
release_id: release.id,
draft: false
owner: context.repo.owner,
repo: context.repo.repo,
release_id: draft.id,
draft: false,
prerelease: ${{ steps.flags.outputs.is_rc }},
make_latest: '${{ steps.flags.outputs.make_latest }}'
});
console.log(` Published release for ${tag}`);
console.log(`🚀 Published release for ${tag}`);

View File

@@ -4,16 +4,20 @@ on:
pull_request:
types: [labeled, opened, synchronize, reopened]
concurrency:
group: pull-requests-${{ github.workflow }}-${{ github.event.pull_request.number }}
cancel-in-progress: true
jobs:
e2e:
name: Build and Test
build:
name: Build
runs-on: [self-hosted]
permissions:
contents: read
packages: write
# Never run when the PR carries the "release" label.
if: |
contains(github.event.pull_request.labels.*.name, 'ok-to-test') &&
!contains(github.event.pull_request.labels.*.name, 'release')
steps:
@@ -29,11 +33,50 @@ jobs:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
registry: ghcr.io
env:
DOCKER_CONFIG: ${{ runner.temp }}/.docker
- name: make build
run: |
make build
- name: Build
run: make build
env:
DOCKER_CONFIG: ${{ runner.temp }}/.docker
- name: make test
run: |
make test
- name: Build Talos image
run: make -C packages/core/installer talos-nocloud
- name: Upload installer
uses: actions/upload-artifact@v4
with:
name: cozystack-installer
path: _out/assets/cozystack-installer.yaml
- name: Upload Talos image
uses: actions/upload-artifact@v4
with:
name: talos-image
path: _out/assets/nocloud-amd64.raw.xz
test:
name: Test
runs-on: [self-hosted]
needs: build
# Never run when the PR carries the "release" label.
if: |
!contains(github.event.pull_request.labels.*.name, 'release')
steps:
- name: Download installer
uses: actions/download-artifact@v4
with:
name: cozystack-installer
path: _out/assets/
- name: Download Talos image
uses: actions/download-artifact@v4
with:
name: talos-image
path: _out/assets/
- name: Test
run: make test

View File

@@ -3,7 +3,14 @@ name: Versioned Tag
on:
push:
tags:
- 'v*.*.*'
- 'v*.*.*' # vX.Y.Z
- 'v*.*.*-rc.*' # vX.Y.Z-rc.N
- 'v*.*.*-beta.*' # vX.Y.Z-beta.N
- 'v*.*.*-alpha.*' # vX.Y.Z-alpha.N
concurrency:
group: tags-${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
prepare-release:
@@ -13,8 +20,10 @@ jobs:
contents: write
packages: write
pull-requests: write
actions: write
steps:
# Check if a non-draft release with this tag already exists
- name: Check if release already exists
id: check_release
uses: actions/github-script@v7
@@ -23,137 +32,201 @@ jobs:
const tag = context.ref.replace('refs/tags/', '');
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo
repo: context.repo.repo
});
const exists = releases.data.some(r => r.tag_name === tag && !r.draft);
core.setOutput('skip', exists);
console.log(exists ? `Release ${tag} already published` : `No published release ${tag}`);
const existing = releases.data.find(r => r.tag_name === tag && !r.draft);
if (existing) {
core.setOutput('skip', 'true');
} else {
core.setOutput('skip', 'false');
}
# If a published release already exists, skip the rest of the workflow
- name: Skip if release already exists
if: steps.check_release.outputs.skip == 'true'
run: echo "Release already exists, skipping workflow."
# Parse tag meta-data (rc?, maintenance line, etc.)
- name: Parse tag
if: steps.check_release.outputs.skip == 'false'
id: tag
uses: actions/github-script@v7
with:
script: |
const ref = context.ref.replace('refs/tags/', ''); // e.g. v0.31.5-rc.1
const m = ref.match(/^v(\d+\.\d+\.\d+)(-(?:alpha|beta|rc)\.\d+)?$/); // ['0.31.5', '-rc.1' | '-beta.1' | …]
if (!m) {
core.setFailed(`❌ tag '${ref}' must match 'vX.Y.Z' or 'vX.Y.Z-(alpha|beta|rc).N'`);
return;
}
const version = m[1] + (m[2] ?? ''); // 0.31.5-rc.1
const isRc = Boolean(m[2]);
const [maj, min] = m[1].split('.');
core.setOutput('tag', ref); // v0.31.5-rc.1
core.setOutput('version', version); // 0.31.5-rc.1
core.setOutput('is_rc', isRc); // true
core.setOutput('line', `${maj}.${min}`); // 0.31
# Detect base branch (main or release-X.Y) the tag was pushed from
- name: Get base branch
if: steps.check_release.outputs.skip == 'false'
id: get_base
uses: actions/github-script@v7
with:
script: |
const baseRef = context.payload.base_ref;
if (!baseRef) {
core.setFailed(`❌ base_ref is empty. Push the tag via 'git push origin HEAD:refs/tags/<tag>'.`);
return;
}
const branch = baseRef.replace('refs/heads/', '');
const ok = branch === 'main' || /^release-\d+\.\d+$/.test(branch);
if (!ok) {
core.setFailed(`❌ Tagged commit must belong to 'main' or 'release-X.Y'. Got '${branch}'`);
return;
}
core.setOutput('branch', branch);
# Checkout & login once
- name: Checkout code
if: steps.check_release.outputs.skip == 'false'
uses: actions/checkout@v4
with:
fetch-depth: 0
fetch-tags: true
fetch-tags: true
- name: Login to GitHub Container Registry
- name: Login to GHCR
if: steps.check_release.outputs.skip == 'false'
uses: docker/login-action@v3
with:
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
registry: ghcr.io
env:
DOCKER_CONFIG: ${{ runner.temp }}/.docker
# Build project artifacts
- name: Build
if: steps.check_release.outputs.skip == 'false'
run: make build
env:
DOCKER_CONFIG: ${{ runner.temp }}/.docker
# Commit built artifacts
- name: Commit release artifacts
if: steps.check_release.outputs.skip == 'false'
env:
GIT_AUTHOR_NAME: ${{ github.actor }}
GIT_AUTHOR_EMAIL: ${{ github.actor }}@users.noreply.github.com
run: |
git config user.name "$GIT_AUTHOR_NAME"
git config user.email "$GIT_AUTHOR_EMAIL"
git config user.name "github-actions"
git config user.email "github-actions@github.com"
git add .
git commit -m "Prepare release ${GITHUB_REF#refs/tags/}" -s || echo "No changes to commit"
git push origin HEAD || true
# Get `latest_version` from latest published release
- name: Get latest published release
if: steps.check_release.outputs.skip == 'false'
id: latest_release
uses: actions/github-script@v7
with:
script: |
try {
const rel = await github.rest.repos.getLatestRelease({
owner: context.repo.owner,
repo: context.repo.repo
});
core.setOutput('tag', rel.data.tag_name);
} catch (_) {
core.setOutput('tag', '');
}
# Compare tag (A) with latest (B)
- name: Semver compare
if: steps.check_release.outputs.skip == 'false'
id: semver
uses: madhead/semver-utils@v4.3.0
with:
version: ${{ steps.tag.outputs.tag }} # A
compare-to: ${{ steps.latest_release.outputs.tag }} # B
# Create or reuse DRAFT GitHub Release
- name: Create / reuse draft release
if: steps.check_release.outputs.skip == 'false'
id: release
uses: actions/github-script@v7
with:
script: |
const tag = '${{ steps.tag.outputs.tag }}';
const isRc = ${{ steps.tag.outputs.is_rc }};
const outdated = '${{ steps.semver.outputs.comparison-result }}' === '<';
const makeLatest = outdated ? false : 'legacy';
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo
});
let rel = releases.data.find(r => r.tag_name === tag);
if (!rel) {
rel = await github.rest.repos.createRelease({
owner: context.repo.owner,
repo: context.repo.repo,
tag_name: tag,
name: tag,
draft: true,
prerelease: isRc,
make_latest: makeLatest
});
console.log(`Draft release created for ${tag}`);
} else {
console.log(`Re-using existing release ${tag}`);
}
core.setOutput('upload_url', rel.upload_url);
# Build + upload assets (optional)
- name: Build & upload assets
if: steps.check_release.outputs.skip == 'false'
run: |
make assets
make upload_assets VERSION=${{ steps.tag.outputs.tag }}
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# Create release-X.Y.Z branch and push (force-update)
- name: Create release branch
if: steps.check_release.outputs.skip == 'false'
run: |
BRANCH_NAME="release-${GITHUB_REF#refs/tags/v}"
git branch -f "$BRANCH_NAME"
git push origin "$BRANCH_NAME" --force
BRANCH="release-${GITHUB_REF#refs/tags/v}"
git branch -f "$BRANCH"
git push -f origin "$BRANCH"
# Create pull request into original base branch (if absent)
- name: Create pull request if not exists
if: steps.check_release.outputs.skip == 'false'
uses: actions/github-script@v7
with:
script: |
const version = context.ref.replace('refs/tags/v', '');
const branch = `release-${version}`;
const base = 'main';
const base = '${{ steps.get_base.outputs.branch }}';
const head = `release-${version}`;
const prs = await github.rest.pulls.list({
owner: context.repo.owner,
repo: context.repo.repo,
head: `${context.repo.owner}:${branch}`,
repo: context.repo.repo,
head: `${context.repo.owner}:${head}`,
base
});
if (prs.data.length === 0) {
const newPr = await github.rest.pulls.create({
const pr = await github.rest.pulls.create({
owner: context.repo.owner,
repo: context.repo.repo,
head: branch,
base: base,
repo: context.repo.repo,
head,
base,
title: `Release v${version}`,
body:
`This PR prepares the release \`v${version}\`.\n` +
`(Please merge it before releasing draft)`,
body: `This PR prepares the release \`v${version}\`.`,
draft: false
});
console.log(`Created pull request #${newPr.data.number} from ${branch} to ${base}`);
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: newPr.data.number,
labels: ['release', 'ok-to-test']
repo: context.repo.repo,
issue_number: pr.data.number,
labels: ['release']
});
console.log(`Created PR #${pr.data.number}`);
} else {
console.log(`Pull request already exists from ${branch} to ${base}`);
console.log(`PR already exists from ${head} to ${base}`);
}
- name: Create or reuse draft release
if: steps.check_release.outputs.skip == 'false'
id: create_release
uses: actions/github-script@v7
with:
script: |
const tag = context.ref.replace('refs/tags/', '');
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo
});
let release = releases.data.find(r => r.tag_name === tag);
if (!release) {
release = await github.rest.repos.createRelease({
owner: context.repo.owner,
repo: context.repo.repo,
tag_name: tag,
name: `${tag}`,
draft: true,
prerelease: false
});
}
core.setOutput('upload_url', release.upload_url);
- name: Build assets
if: steps.check_release.outputs.skip == 'false'
run: make assets
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Upload assets
if: steps.check_release.outputs.skip == 'false'
run: make upload_assets VERSION=${GITHUB_REF#refs/tags/}
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Delete pushed tag
if: steps.check_release.outputs.skip == 'false'
run: |
git push --delete origin ${GITHUB_REF#refs/tags/}

3
.gitignore vendored
View File

@@ -1,6 +1,7 @@
_out
.git
.idea
.vscode
# User-specific stuff
.idea/**/workspace.xml
@@ -75,4 +76,4 @@ fabric.properties
.idea/caches/build_file_checksums.ser
.DS_Store
**/.DS_Store
**/.DS_Store

View File

@@ -18,6 +18,7 @@ repos:
(cd "$dir" && make generate)
fi
done
git diff --color=always | cat
'
language: script
files: ^.*$

View File

@@ -20,6 +20,7 @@ build: build-deps
make -C packages/system/kubeovn image
make -C packages/system/kubeovn-webhook image
make -C packages/system/dashboard image
make -C packages/system/metallb image
make -C packages/system/kamaji image
make -C packages/system/bucket image
make -C packages/core/testing image
@@ -42,12 +43,11 @@ manifests:
(cd packages/core/installer/; helm template -n cozy-installer installer .) > _out/assets/cozystack-installer.yaml
assets:
make -C packages/core/installer/ assets
make -C packages/core/installer assets
test:
make -C packages/core/testing apply
make -C packages/core/testing test
#make -C packages/core/testing test-applications
generate:
hack/update-codegen.sh

View File

@@ -39,6 +39,8 @@ import (
cozystackiov1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
"github.com/cozystack/cozystack/internal/controller"
"github.com/cozystack/cozystack/internal/telemetry"
helmv2 "github.com/fluxcd/helm-controller/api/v2"
// +kubebuilder:scaffold:imports
)
@@ -51,6 +53,7 @@ func init() {
utilruntime.Must(clientgoscheme.AddToScheme(scheme))
utilruntime.Must(cozystackiov1alpha1.AddToScheme(scheme))
utilruntime.Must(helmv2.AddToScheme(scheme))
// +kubebuilder:scaffold:scheme
}
@@ -182,6 +185,14 @@ func main() {
if err = (&controller.WorkloadReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "WorkloadReconciler")
os.Exit(1)
}
if err = (&controller.TenantHelmReconciler{
Client: mgr.GetClient(),
Scheme: mgr.GetScheme(),
}).SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "Workload")
os.Exit(1)

View File

@@ -626,7 +626,7 @@
"datasource": {
"uid": "${DS_PROMETHEUS}"
},
"expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"POD\",container!=\"\",pod=~\".*-controller-.*\"}) by (pod)",
"expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"\",pod=~\".*-controller-.*\"}) by (pod)",
"hide": false,
"interval": "",
"legendFormat": "{{pod}}",

View File

@@ -450,7 +450,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval])))\n / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval])))\n / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -520,7 +520,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"})) / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"memory\",unit=\"byte\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"})) / sum(sum by (node) (avg_over_time(kube_node_status_allocatable{resource=\"memory\",unit=\"byte\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -590,7 +590,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval]))) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval]))) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -660,7 +660,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"} )) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"}[$__rate_interval])))",
"expr": "sum(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"} )) / sum(sum by (node) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"}[$__rate_interval])))",
"hide": false,
"legendFormat": "__auto",
"range": true,
@@ -1128,7 +1128,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0\n",
"expr": "sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0\n",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -1143,7 +1143,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0)",
"expr": "sum(sum by (node) (rate(container_cpu_usage_seconds_total{container!=\"\",node=~\"$node\"}[$__rate_interval]) - on (namespace,pod,container,node) group_left avg by (namespace,pod,container, node)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"})) * -1 > 0)",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -1527,7 +1527,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0\n",
"expr": "(sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0\n",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -1542,7 +1542,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum((sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0)",
"expr": "sum((sum by (node) (container_memory_working_set_bytes:without_kmem{container!=\"\",node=~\"$node\"} ) - sum by (node) (kube_pod_container_resource_requests{resource=\"memory\",node=~\"$node\"})) * -1 > 0)",
"hide": false,
"legendFormat": "Total",
"range": true,
@@ -1909,7 +1909,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) * -1 > 0)\n",
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) * -1 > 0)\n",
"format": "table",
"instant": true,
"range": false,
@@ -2037,7 +2037,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) * -1 >0)\n",
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) * -1 >0)\n",
"format": "table",
"instant": true,
"range": false,
@@ -2160,7 +2160,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"POD\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) > 0)\n",
"expr": "topk(10, (sum by (namespace,pod,container)((rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\",container!=\"\",node=~\"$node\"}[$__rate_interval])) - on (namespace,pod,container) group_left avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"cpu\",node=~\"$node\"}))) > 0)\n",
"format": "table",
"instant": true,
"range": false,
@@ -2288,7 +2288,7 @@
},
"editorMode": "code",
"exemplar": false,
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"POD\",container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) >0)\n",
"expr": "topk(10, (sum by (namespace,container,pod) (container_memory_working_set_bytes:without_kmem{container!=\"\",namespace=~\"$namespace\",node=~\"$node\"}) - on (namespace,pod,container) avg by (namespace,pod,container)(kube_pod_container_resource_requests{resource=\"memory\",namespace=~\"$namespace\",node=~\"$node\"})) >0)\n",
"format": "table",
"instant": true,
"range": false,

View File

@@ -684,7 +684,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -710,7 +710,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -723,7 +723,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -736,7 +736,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) \n (\n (\n (\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) or sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) \n (\n (\n (\n sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) or sum by (namespace, pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -762,7 +762,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -786,7 +786,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "sum by (pod)\n(\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -798,7 +798,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n )\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n )\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -810,7 +810,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) or sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n -\n sum by (namespace, pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) or sum by (namespace, pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__range]))\n ) > 0\n )\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -848,7 +848,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -860,7 +860,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"expr": "(\n sum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range])) \n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__range]))\n)\nor\nsum by (pod) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -1315,7 +1315,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -1488,7 +1488,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -1502,7 +1502,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -1642,7 +1642,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by (pod)\n (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (pod)\n (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -1779,7 +1779,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": " (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n )\n or\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n )\n) > 0",
"expr": " (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n )\n or\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))\n )\n) > 0",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -2095,7 +2095,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Usage",
@@ -2109,7 +2109,7 @@
"refId": "D"
},
{
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n)",
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
@@ -2295,7 +2295,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "System",
@@ -2306,7 +2306,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2468,7 +2468,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -2653,7 +2653,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
@@ -2666,7 +2666,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Cache",
@@ -2679,7 +2679,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
@@ -2692,7 +2692,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
@@ -2705,7 +2705,7 @@
"uid": "${ds_prometheus}"
},
"editorMode": "code",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))\n)",
"expr": "sum (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}) by(pod)\n * on (pod)\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2837,7 +2837,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )\n)",
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -2974,7 +2974,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )\n)",
"expr": "(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (pod) group_left()\n sum by (pod)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -3290,56 +3290,56 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Cache",
"refId": "B"
},
{
"expr": "sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
"refId": "C"
},
{
"expr": "sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{namespace=\"$namespace\", container!=\"POD\", resource=\"memory\"}[$__rate_interval]))\n)",
"expr": "sum by (pod)\n(\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\", pod=~\"$pod\"}\n * on (controller_type, controller_name) group_left()\n sum by (controller_type, controller_name) (avg_over_time(vpa_target_recommendation{namespace=\"$namespace\", container!=\"\", resource=\"memory\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
"refId": "E"
},
{
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "F"
},
{
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "G"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", pod=~\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3834,7 +3834,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",
@@ -3972,7 +3972,7 @@
"uid": "$ds_prometheus"
},
"editorMode": "code",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"expr": "sum by(pod) (\n max(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}) by(pod)\n * on (pod)\n sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", pod=~\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))\n)",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ pod }}",

View File

@@ -656,7 +656,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -680,7 +680,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n ) \nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"cpu\"}[$__range]))\n ) \nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -692,7 +692,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -704,7 +704,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -728,7 +728,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -740,7 +740,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (pod) group_left()\n sum by (namespace, pod)\n (\n avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])\n )\n )\n or\n count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (pod) group_left()\n sum by (namespace, pod)\n (\n avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])\n )\n )\n or\n count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -752,7 +752,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n ) \n or \ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"memory\"}[$__range]))\n ) \n or \ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -764,7 +764,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -776,7 +776,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller)\n (\n avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range]))\n ) > 0\n )\n )\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -814,7 +814,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -826,7 +826,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -877,7 +877,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"expr": "sum by (controller) (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range]) * on (pod) group_left() sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}[$__range])) by (controller) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -1475,7 +1475,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -1646,7 +1646,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))))",
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "System",
@@ -1657,7 +1657,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))))",
"expr": "sum (sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -1798,7 +1798,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -1939,7 +1939,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2257,28 +2257,28 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Usage",
"refId": "D"
},
{
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "C"
},
{
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "E"
},
{
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
@@ -2458,7 +2458,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2470,7 +2470,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2622,7 +2622,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -2799,14 +2799,14 @@
"pluginVersion": "8.5.13",
"targets": [
{
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2814,7 +2814,7 @@
"refId": "B"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2822,14 +2822,14 @@
"refId": "C"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2955,7 +2955,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n ) > 0\n )\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -3091,7 +3091,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n or\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n +\n sum by(namespace, pod, container) (avg_over_time(container_memory:kmem{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n ) > 0\n )\n )",
"expr": "sum by (controller)\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"}\n * on (namespace, pod) group_left()\n sum by (namespace, pod)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n or\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n +\n sum by(namespace, pod, container) (avg_over_time(container_memory:kmem{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))\n )\n ) > 0\n )\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -3408,14 +3408,14 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3423,7 +3423,7 @@
"refId": "B"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left() \n sum by (pod) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3431,35 +3431,35 @@
"refId": "C"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n ) ",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by(pod) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n ) ",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "E"
},
{
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"expr": "sum\n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"} \n * on (pod) group_left() \n sum by(pod) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "F"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (controller_type, controller_name) group_left()\n sum by(controller_type, controller_name) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
"refId": "G"
},
{
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))\n )",
"expr": "sum \n (\n kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller=\"$controller\"}\n * on (pod) group_left()\n sum by (pod) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3910,7 +3910,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_reads_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",
@@ -4049,7 +4049,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])))",
"expr": "sum by (controller) (kube_controller_pod{node=~\"$node\", namespace=\"$namespace\", controller_type=~\"$controller_type\", controller=~\"$controller\"} * on (pod) group_left() sum by (pod) (rate(container_fs_writes_total{node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ controller }}",

View File

@@ -869,7 +869,7 @@
"refId": "A"
},
{
"expr": "100 * count by (namespace) (\n sum by (namespace, verticalpodautoscaler) ( \n count by (namespace, controller_name, verticalpodautoscaler) (avg_over_time(vpa_target_recommendation{namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\n / on (controller_name, namespace) group_left\n count by (namespace, controller_name) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range]))\n )\n) \n/ count by (namespace) (sum by (namespace, controller) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "100 * count by (namespace) (\n sum by (namespace, verticalpodautoscaler) ( \n count by (namespace, controller_name, verticalpodautoscaler) (avg_over_time(vpa_target_recommendation{namespace=~\"$namespace\", container!=\"\"}[$__range]))\n / on (controller_name, namespace) group_left\n count by (namespace, controller_name) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range]))\n )\n) \n/ count by (namespace) (sum by (namespace, controller) (avg_over_time(kube_controller_pod{namespace=~\"$namespace\"}[$__range])))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -878,7 +878,7 @@
"refId": "B"
},
{
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -895,7 +895,7 @@
"refId": "D"
},
{
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -903,7 +903,7 @@
"refId": "E"
},
{
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -919,7 +919,7 @@
"refId": "G"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -935,7 +935,7 @@
"refId": "I"
},
{
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor\ncount(avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n ) > 0\n )\nor\ncount(avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -943,7 +943,7 @@
"refId": "J"
},
{
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__range]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__range]))\n )\n > 0\n )\nor count (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -968,7 +968,7 @@
"refId": "M"
},
{
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -977,7 +977,7 @@
"refId": "N"
},
{
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__range]))\nor\ncount (avg_over_time(kube_controller_pod{node=~\"$node\", namespace=~\"$namespace\"}[$__range])) by (namespace) * 0",
"format": "table",
"hide": false,
"instant": true,
@@ -1449,7 +1449,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -1616,7 +1616,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "System",
@@ -1627,7 +1627,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -1764,7 +1764,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -1901,7 +1901,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -2210,7 +2210,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_usage_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2218,21 +2218,21 @@
"refId": "A"
},
{
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "B"
},
{
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"expr": "sum by (namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"cpu\",unit=\"core\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval])* on (uid) group_left(phase) kube_pod_status_phase{phase=\"Running\"})",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "C"
},
{
"expr": "sum by (namespace) (avg_over_time(vpa_target_recommendation{container!=\"POD\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(vpa_target_recommendation{container!=\"\", namespace=\"$namespace\", resource=\"cpu\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
@@ -2407,7 +2407,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_system_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2419,7 +2419,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_cpu_user_seconds_total{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2572,7 +2572,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -2754,14 +2754,14 @@
"pluginVersion": "8.5.13",
"targets": [
{
"expr": "sum (avg_over_time(container_memory_rss{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_rss{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum (avg_over_time(container_memory_cache{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_cache{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2769,7 +2769,7 @@
"refId": "B"
},
{
"expr": "sum (avg_over_time(container_memory_swap{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_swap{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -2777,14 +2777,14 @@
"refId": "C"
},
{
"expr": "sum (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2910,7 +2910,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (namespace)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -3046,7 +3046,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"POD\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"expr": "sum by (namespace)\n (\n (\n (\n sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", namespace=~\"$namespace\"}[$__rate_interval]))\n ) or sum by(namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", container!=\"\", namespace=~\"$namespace\"}[$__rate_interval]))\n )\n > 0\n )",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -3370,14 +3370,14 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by (namespace) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_rss{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "RSS",
"refId": "A"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_cache{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3385,7 +3385,7 @@
"refId": "B"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_swap{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3393,35 +3393,35 @@
"refId": "C"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory_working_set_bytes:without_kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by(namespace) (avg_over_time(vpa_target_recommendation{container!=\"POD\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))",
"expr": "sum by(namespace) (avg_over_time(vpa_target_recommendation{container!=\"\",namespace=\"$namespace\", resource=\"memory\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "VPA Target",
"refId": "E"
},
{
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
"refId": "F"
},
{
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"POD\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(namespace) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",node=~\"$node\", container!=\"\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "G"
},
{
"expr": "sum by (namespace) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (avg_over_time(container_memory:kmem{node=~\"$node\", namespace=\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3873,7 +3873,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_fs_reads_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",
@@ -4008,7 +4008,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by (namespace) (rate(container_fs_writes_total{node=~\"$node\", namespace=~\"$namespace\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ namespace }}",

View File

@@ -686,7 +686,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"expr": "sum by (container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"format": "table",
"hide": false,
"instant": true,
@@ -759,7 +759,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"expr": "sum by (container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -847,7 +847,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by(container) (rate(container_fs_reads_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__range]))",
"expr": "sum by(container) (rate(container_fs_reads_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__range]))",
"format": "table",
"hide": false,
"instant": true,
@@ -860,7 +860,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by(container) (rate(container_fs_writes_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__range]))",
"expr": "sum by(container) (rate(container_fs_writes_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__range]))",
"format": "table",
"hide": false,
"instant": true,
@@ -899,7 +899,7 @@
"type": "prometheus",
"uid": "${ds_prometheus}"
},
"expr": "sum by (container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"expr": "sum by (container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=~\"$container\"}[$__range]))\nor\nsum by (container) (avg_over_time(kube_pod_container_info{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__range]) * 0)",
"format": "table",
"instant": true,
"intervalFactor": 1,
@@ -1503,7 +1503,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -1669,7 +1669,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_system_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -1681,7 +1681,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(pod) (rate(container_cpu_user_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -1820,7 +1820,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (namespace, pod, container)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!=\"POD\", namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (namespace, pod, container)\n (\n (\n sum by(namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"cpu\",unit=\"core\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by(namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -2269,7 +2269,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_usage_seconds_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Usage",
@@ -2476,7 +2476,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_cpu_system_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_system_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2488,7 +2488,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_cpu_user_seconds_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_cpu_user_seconds_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "User",
@@ -2639,7 +2639,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2816,7 +2816,7 @@
"pluginVersion": "8.5.13",
"targets": [
{
"expr": "sum by(pod) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -2824,28 +2824,28 @@
"refId": "A"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Cache",
"refId": "B"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
"refId": "C"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\"}[$__rate_interval]))",
"expr": "sum by(pod) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -2974,7 +2974,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (container)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (container)\n (\n (\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -3110,7 +3110,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by (container)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"POD\"}[$__rate_interval]))\n ) > 0\n )",
"expr": "sum by (container)\n (\n (\n (\n sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\"}[$__rate_interval]))\n -\n sum by (namespace, pod, container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"\"}[$__rate_interval]))\n ) or sum by (namespace, pod, container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container=~\"$container\", container!=\"\"}[$__rate_interval]))\n ) > 0\n )",
"format": "time_series",
"hide": false,
"intervalFactor": 1,
@@ -3431,7 +3431,7 @@
"repeatDirection": "h",
"targets": [
{
"expr": "sum by(container) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"instant": false,
"intervalFactor": 1,
@@ -3439,7 +3439,7 @@
"refId": "A"
},
{
"expr": "sum by(container) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"interval": "",
"intervalFactor": 1,
@@ -3447,28 +3447,28 @@
"refId": "B"
},
{
"expr": "sum by(container) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Swap",
"refId": "C"
},
{
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory_working_set_bytes:without_kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Working set bytes without kmem",
"refId": "D"
},
{
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_limits{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Limits",
"refId": "E"
},
{
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(kube_pod_container_resource_requests{resource=\"memory\",unit=\"byte\",namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Requests",
@@ -3482,7 +3482,7 @@
"refId": "G"
},
{
"expr": "sum by(container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"POD\", container=\"$container\"}[$__rate_interval]))",
"expr": "sum by(container) (avg_over_time(container_memory:kmem{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", container=\"$container\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "Kmem",
@@ -3930,7 +3930,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_fs_reads_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_fs_reads_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ container }}",
@@ -4068,7 +4068,7 @@
"type": "prometheus",
"uid": "$ds_prometheus"
},
"expr": "sum by(container) (rate(container_fs_writes_total{container!=\"POD\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"expr": "sum by(container) (rate(container_fs_writes_total{container!=\"\", pod=\"$pod\", namespace=\"$namespace\"}[$__rate_interval]))",
"format": "time_series",
"intervalFactor": 1,
"legendFormat": "{{ container }}",

243
docs/changelogs/v0.31.0.md Normal file
View File

@@ -0,0 +1,243 @@
Cozystack v0.31.0 is a significant release that brings new features, key fixes, and updates to underlying components.
This version enhances GPU support, improves many components of Cozystack, and introduces a more robust release process to improve stability.
Below, we'll go over the highlights in each area for current users, developers, and our community.
## Major Features and Improvements
### GPU support for tenant Kubernetes clusters
Cozystack now integrates NVIDIA GPU Operator support for tenant Kubernetes clusters.
This enables platform users to run GPU-powered AI/ML applications in their own clusters.
To enable GPU Operator, set `addons.gpuOperator.enabled: true` in the cluster configuration.
(@kvaps in https://github.com/cozystack/cozystack/pull/834)
Check out Andrei Kvapil's CNCF webinar [showcasing the GPU support by running Stable Diffusion in Cozystack](https://www.youtube.com/watch?v=S__h_QaoYEk).
<!--
* [kubernetes] Introduce GPU support for tenant Kubernetes clusters. (@kvaps in https://github.com/cozystack/cozystack/pull/834)
-->
### Cilium Improvements
Cozystacks Cilium integration received two significant enhancements.
First, Gateway API support in Cilium is now enabled, allowing advanced L4/L7 routing features via Kubernetes Gateway API.
We thank Zdenek Janda @zdenekjanda for contributing this feature in https://github.com/cozystack/cozystack/pull/924.
Second, Cozystack now permits custom user-provided parameters in the tenant clusters Cilium configuration.
(@lllamnyp in https://github.com/cozystack/cozystack/pull/917)
<!--
* [cilium] Enable Cilium Gateway API. (@zdenekjanda in https://github.com/cozystack/cozystack/pull/924)
* [cilium] Enable user-added parameters in a tenant cluster Cilium. (@lllamnyp in https://github.com/cozystack/cozystack/pull/917)
-->
### Cross-Architecture Builds (ARM Support Beta)
Cozystack's build system was refactored to support multi-architecture binaries and container images.
This paves the road to running Cozystack on ARM64 servers.
Changes include Makefile improvements (https://github.com/cozystack/cozystack/pull/907)
and multi-arch Docker image builds (https://github.com/cozystack/cozystack/pull/932 and https://github.com/cozystack/cozystack/pull/970).
We thank Nikita Bykov @nbykov0 for his ongoing work on ARM support!
<!--
* Introduce support for cross-architecture builds and Cozystack on ARM:
* [build] Refactor Makefiles introducing build variables. (@nbykov0 in https://github.com/cozystack/cozystack/pull/907)
* [build] Add support for multi-architecture and cross-platform image builds. (@nbykov0 in https://github.com/cozystack/cozystack/pull/932 and https://github.com/cozystack/cozystack/pull/970)
-->
### VerticalPodAutoscaler (VPA) Expansion
The VerticalPodAutoscaler is now enabled for more Cozystack components to automate resource tuning.
Specifically, VPA was added for tenant Kubernetes control planes (@klinch0 in https://github.com/cozystack/cozystack/pull/806),
the Cozystack Dashboard (https://github.com/cozystack/cozystack/pull/828),
and the Cozystack etcd-operator (https://github.com/cozystack/cozystack/pull/850).
All Cozystack components that have VPA enabled can automatically adjust their CPU and memory requests based on usage, improving platform and application stability.
<!--
* Add VerticalPodAutoscaler to a few more components:
* [kubernetes] Kubernetes clusters in user tenants. (@klinch0 in https://github.com/cozystack/cozystack/pull/806)
* [platform] Cozystack dashboard. (@klinch0 in https://github.com/cozystack/cozystack/pull/828)
* [platform] Cozystack etcd-operator (@klinch0 in https://github.com/cozystack/cozystack/pull/850)
-->
### Tenant HelmRelease Reconcile Controller
A new controller was introduced to monitor and synchronize HelmRelease resources across tenants.
This controller propagates configuration changes to tenant workloads and ensures that any HelmRelease defined in a tenant
stays in sync with platform updates.
It improves the reliability of deploying managed applications in Cozystack.
(@klinch0 in https://github.com/cozystack/cozystack/pull/870)
<!--
* [platform] Introduce a new controller to synchronize tenant HelmReleases and propagate configuration changes. (@klinch0 in https://github.com/cozystack/cozystack/pull/870)
-->
### Virtual Machine Improvements
**Configurable KubeVirt CPU Overcommit**: The CPU allocation ratio in KubeVirt (how virtual CPUs are overcommitted relative to physical) is now configurable
via the `cpu-allocation-ratio` value in the Cozystack configmap.
This means Cozystack administrators can now tune CPU overcommitment for VMs to balance performance vs. density.
(@lllamnyp in https://github.com/cozystack/cozystack/pull/905)
**KubeVirt VM Export**: Cozystack now allows exporting KubeVirt virtual machines.
This feature, enabled via KubeVirt's `VirtualMachineExport` capability, lets users snapshot or back up VM images.
(@kvaps in https://github.com/cozystack/cozystack/pull/808)
**Support for various storage classes in Virtual Machines**: The `virtual-machine` application (since version 0.9.2) lets you pick any StorageClass for a VM's
system disk instead of relying on a hard-coded PVC.
Refer to values `systemDisk.storage` and `systemDisk.storageClass` in the [application's configs](https://cozystack.io/docs/reference/applications/virtual-machine/#common-parameters).
(@kvaps in https://github.com/cozystack/cozystack/pull/974)
<!--
* [kubevirt] Enable exporting VMs. (@kvaps in https://github.com/cozystack/cozystack/pull/808)
* [kubevirt] Make KubeVirt's CPU allocation ratio configurable. (@lllamnyp in https://github.com/cozystack/cozystack/pull/905)
* [virtual-machine] Add support for various storages. (@kvaps in https://github.com/cozystack/cozystack/pull/974)
-->
### Other Features and Improvements
* [platform] Introduce options `expose-services`, `expose-ingress`, and `expose-external-ips` to the ingress service. (@kvaps in https://github.com/cozystack/cozystack/pull/929)
* [cozystack-controller] Record the IP address pool and storage class in Workload objects. (@lllamnyp in https://github.com/cozystack/cozystack/pull/831)
* [apps] Remove user-facing config of limits and requests. (@lllamnyp in https://github.com/cozystack/cozystack/pull/935)
## New Release Lifecycle
Cozystack release lifecycle is changing to provide a more stable and predictable lifecycle to customers running Cozystack in mission-critical environments.
* **Gradual Release with Alpha, Beta, and Release Candidates**: Cozystack will now publish pre-release versions (alpha, beta, release candidates) before a stable release.
Starting with v0.31.0, the team made three release candidates before releasing version v0.31.0.
This allows more testing and feedback before marking a release as stable.
* **Prolonged Release Support with Patch Versions**: After the initial `vX.Y.0` release, a long-lived branch `release-X.Y` will be created to backport fixes.
For example, with 0.31.0s release, a `release-0.31` branch will track patch fixes (`0.31.x`).
This strategy lets Cozystack users receive timely patch releases and updates with minimal risks.
To implement these new changes, we have rebuilt our CI/CD workflows and introduced automation, enabling automatic backports.
You can read more about how it's implemented in the Development section below.
For more information, read the [Cozystack Release Workflow](https://github.com/cozystack/cozystack/blob/main/docs/release.md) documentation.
## Fixes
* [virtual-machine] Add GPU names to the virtual machine specifications. (@kvaps in https://github.com/cozystack/cozystack/pull/862)
* [virtual-machine] Count Workload resources for pods by requests, not limits. Other improvements to VM resource tracking. (@lllamnyp in https://github.com/cozystack/cozystack/pull/904)
* [virtual-machine] Set PortList method by default. (@kvaps in https://github.com/cozystack/cozystack/pull/996)
* [virtual-machine] Specify ports even for wholeIP mode. (@kvaps in https://github.com/cozystack/cozystack/pull/1000)
* [platform] Fix installing HelmReleases on initial setup. (@kvaps in https://github.com/cozystack/cozystack/pull/833)
* [platform] Migration scripts update Kubernetes ConfigMap with the current stack version for improved version tracking. (@klinch0 in https://github.com/cozystack/cozystack/pull/840)
* [platform] Reduce requested CPU and RAM for the `kamaji` provider. (@klinch0 in https://github.com/cozystack/cozystack/pull/825)
* [platform] Improve the reconciliation loop for the Cozystack system HelmReleases logic. (@klinch0 in https://github.com/cozystack/cozystack/pull/809 and https://github.com/cozystack/cozystack/pull/810, @kvaps in https://github.com/cozystack/cozystack/pull/811)
* [platform] Remove extra dependencies for the Piraeus operator. (@klinch0 in https://github.com/cozystack/cozystack/pull/856)
* [platform] Refactor dashboard values. (@kvaps in https://github.com/cozystack/cozystack/pull/928, patched by @llamnyp in https://github.com/cozystack/cozystack/pull/952)
* [platform] Make FluxCD artifact disabled by default. (@klinch0 in https://github.com/cozystack/cozystack/pull/964)
* [kubernetes] Update garbage collection of HelmReleases in tenant Kubernetes clusters. (@kvaps in https://github.com/cozystack/cozystack/pull/835)
* [kubernetes] Fix merging `valuesOverride` for tenant clusters. (@kvaps in https://github.com/cozystack/cozystack/pull/879)
* [kubernetes] Fix `ubuntu-container-disk` tag. (@kvaps in https://github.com/cozystack/cozystack/pull/887)
* [kubernetes] Refactor Helm manifests for tenant Kubernetes clusters. (@kvaps in https://github.com/cozystack/cozystack/pull/866)
* [kubernetes] Fix Ingress-NGINX depends on Cert-Manager. (@kvaps in https://github.com/cozystack/cozystack/pull/976)
* [kubernetes, apps] Enable `topologySpreadConstraints` for tenant Kubernetes clusters and fix it for managed PostgreSQL. (@klinch0 in https://github.com/cozystack/cozystack/pull/995)
* [tenant] Fix an issue with accessing external IPs of a cluster from the cluster itself. (@kvaps in https://github.com/cozystack/cozystack/pull/854)
* [cluster-api] Remove the no longer necessary workaround for Kamaji. (@kvaps in https://github.com/cozystack/cozystack/pull/867, patched in https://github.com/cozystack/cozystack/pull/956)
* [monitoring] Remove legacy label "POD" from the exclude filter in metrics. (@xy2 in https://github.com/cozystack/cozystack/pull/826)
* [monitoring] Refactor management etcd monitoring config. Introduce a migration script for updating monitoring resources (`kube-rbac-proxy` daemonset). (@lllamnyp in https://github.com/cozystack/cozystack/pull/799 and https://github.com/cozystack/cozystack/pull/830)
* [monitoring] Fix VerticalPodAutoscaler resource allocation for VMagent. (@klinch0 in https://github.com/cozystack/cozystack/pull/820)
* [postgres] Remove duplicated `template` entry from backup manifest. (@etoshutka in https://github.com/cozystack/cozystack/pull/872)
* [kube-ovn] Fix versions mapping in Makefile. (@kvaps in https://github.com/cozystack/cozystack/pull/883)
* [dx] Automatically detect version for migrations in the installer.sh. (@kvaps in https://github.com/cozystack/cozystack/pull/837)
* [dx] remove version_map and building for library charts. (@kvaps in https://github.com/cozystack/cozystack/pull/998)
* [docs] Review the tenant Kubernetes cluster docs. (@NickVolynkin in https://github.com/cozystack/cozystack/pull/969)
* [docs] Explain that tenants cannot have dashes in their names. (@NickVolynkin in https://github.com/cozystack/cozystack/pull/980)
## Dependencies
* MetalLB images are now built in-tree based on version 0.14.9 with additional critical patches. (@lllamnyp in https://github.com/cozystack/cozystack/pull/945)
* Update Kubernetes to v1.32.4. (@kvaps in https://github.com/cozystack/cozystack/pull/949)
* Update Talos Linux to v1.10.1. (@kvaps in https://github.com/cozystack/cozystack/pull/931)
* Update Cilium to v1.17.3. (@kvaps in https://github.com/cozystack/cozystack/pull/848)
* Update LINSTOR to v1.31.0. (@kvaps in https://github.com/cozystack/cozystack/pull/846)
* Update Kube-OVN to v1.13.11. (@kvaps in https://github.com/cozystack/cozystack/pull/847, @lllamnyp in https://github.com/cozystack/cozystack/pull/922)
* Update tenant Kubernetes to v1.32. (@kvaps in https://github.com/cozystack/cozystack/pull/871)
* Update flux-operator to 0.20.0. (@kingdonb in https://github.com/cozystack/cozystack/pull/880 and https://github.com/cozystack/cozystack/pull/934)
* Update multiple Cluster API components. (@kvaps in https://github.com/cozystack/cozystack/pull/867 and https://github.com/cozystack/cozystack/pull/947)
* Update KamajiControlPlane to edge-25.4.1. (@kvaps in https://github.com/cozystack/cozystack/pull/953, fixed by @nbykov0 in https://github.com/cozystack/cozystack/pull/983)
* Update cert-manager to v1.17.2. (@kvaps in https://github.com/cozystack/cozystack/pull/975)
## Documentation
* [Installing Talos in Air-Gapped Environment](https://cozystack.io/docs/operations/talos/configuration/air-gapped/):
new guide for configuring and bootstrapping Talos Linux clusters in air-gapped environments.
(@klinch0 in https://github.com/cozystack/website/pull/203)
* [Cozystack Bundles](https://cozystack.io/docs/guides/bundles/): new page in the learning section explaining how Cozystack bundles work and how to choose a bundle.
(@NickVolynkin in https://github.com/cozystack/website/pull/188, https://github.com/cozystack/website/pull/189, and others;
updated by @kvaps in https://github.com/cozystack/website/pull/192 and https://github.com/cozystack/website/pull/193)
* [Managed Application Reference](https://cozystack.io/docs/reference/applications/): A set of new pages in the docs, mirroring application docs from the Cozystack dashboard.
(@NickVolynkin in https://github.com/cozystack/website/pull/198, https://github.com/cozystack/website/pull/202, and https://github.com/cozystack/website/pull/204)
* **LINSTOR Networking**: Guides on [configuring dedicated network for LINSTOR](https://cozystack.io/docs/operations/storage/dedicated-network/)
and [configuring network for distributed storage in multi-datacenter setup](https://cozystack.io/docs/operations/stretched/linstor-dedicated-network/).
(@xy2, edited by @NickVolynkin in https://github.com/cozystack/website/pull/171, https://github.com/cozystack/website/pull/182, and https://github.com/cozystack/website/pull/184)
### Fixes
* Correct error in the doc for the command to edit the configmap. (@lb0o in https://github.com/cozystack/website/pull/207)
* Fix group name in OIDC docs (@kingdonb in https://github.com/cozystack/website/pull/179)
* A bit more explanation of Docker buildx builders. (@nbykov0 in https://github.com/cozystack/website/pull/187)
## Development, Testing, and CI/CD
### Testing
Improvements:
* Introduce `cozytest` — a new [BATS-based](https://github.com/bats-core/bats-core) testing framework. (@kvaps in https://github.com/cozystack/cozystack/pull/982)
Fixes:
* Fix `device_ownership_from_security_context` CRI. (@dtrdnk in https://github.com/cozystack/cozystack/pull/896)
* Increase timeout durations for `capi` and `keycloak` to improve reliability during e2e-tests. (@kvaps in https://github.com/cozystack/cozystack/pull/858)
* Return `genisoimage` to the e2e-test Dockerfile (@gwynbleidd2106 in https://github.com/cozystack/cozystack/pull/962)
### CI/CD Changes
Improvements:
* Use release branches `release-X.Y` for gathering and releasing fixes after initial `vX.Y.0` release. (@kvaps in https://github.com/cozystack/cozystack/pull/816)
* Automatically create release branches after initial `vX.Y.0` release is published. (@kvaps in https://github.com/cozystack/cozystack/pull/886)
* Introduce Release Candidate versions. Automate patch backporting by applying patches from pull requests labeled `[backport]` to the current release branch. (@kvaps in https://github.com/cozystack/cozystack/pull/841 and https://github.com/cozystack/cozystack/pull/901, @nickvolynkin in https://github.com/cozystack/cozystack/pull/890)
* Support alpha and beta pre-releases. (@kvaps in https://github.com/cozystack/cozystack/pull/978)
* Commit changes in release pipelines under `github-actions <github-actions@github.com>`. (@kvaps in https://github.com/cozystack/cozystack/pull/823)
* Describe the Cozystack release workflow. (@NickVolynkin in https://github.com/cozystack/cozystack/pull/817 and https://github.com/cozystack/cozystack/pull/897)
Fixes:
* Improve the check for `versions_map` running on pull requests. (@kvaps and @klinch0 in https://github.com/cozystack/cozystack/pull/836, https://github.com/cozystack/cozystack/pull/842, and https://github.com/cozystack/cozystack/pull/845)
* If the release step was skipped on a tag, skip tests as well. (@kvaps in https://github.com/cozystack/cozystack/pull/822)
* Allow CI to cancel the previous job if a new one is scheduled. (@kvaps in https://github.com/cozystack/cozystack/pull/873)
* Use the correct version name when uploading build assets to the release page. (@kvaps in https://github.com/cozystack/cozystack/pull/876)
* Stop using `ok-to-test` label to trigger CI in pull requests. (@kvaps in https://github.com/cozystack/cozystack/pull/875)
* Do not run tests in the release building pipeline. (@kvaps in https://github.com/cozystack/cozystack/pull/882)
* Fix release branch creation. (@kvaps in https://github.com/cozystack/cozystack/pull/884)
* Reduce noise in the test logs by suppressing the `wget` progress bar. (@lllamnyp in https://github.com/cozystack/cozystack/pull/865)
* Revert "automatically trigger tests in releasing PR". (@kvaps in https://github.com/cozystack/cozystack/pull/900)
* Force-update release branch on tagged main commits. (@kvaps in https://github.com/cozystack/cozystack/pull/977)
* Show detailed errors in the `pull-request-release` workflow. (@lllamnyp in https://github.com/cozystack/cozystack/pull/992)
## Community and Maintenance
### Repository Maintenance
Added @klinch0 to CODEOWNERS. (@kvaps in https://github.com/cozystack/cozystack/pull/838)
### New Contributors
* @etoshutka made their first contribution in https://github.com/cozystack/cozystack/pull/872
* @dtrdnk made their first contribution in https://github.com/cozystack/cozystack/pull/896
* @zdenekjanda made their first contribution in https://github.com/cozystack/cozystack/pull/924
* @gwynbleidd2106 made their first contribution in https://github.com/cozystack/cozystack/pull/962
## Full Changelog
See https://github.com/cozystack/cozystack/compare/v0.30.0...v0.31.0

166
docs/release.md Normal file
View File

@@ -0,0 +1,166 @@
# Release Workflow
This document describes Cozystacks release process.
## Introduction
Cozystack uses a staged release process to ensure stability and flexibility during development.
There are three types of releases:
- **Release Candidates (RC)** Preview versions (e.g., `v0.42.0-rc.1`) used for final testing and validation.
- **Regular Releases** Final versions (e.g., `v0.42.0`) that are feature-complete and thoroughly tested.
- **Patch Releases** Bugfix-only updates (e.g., `v0.42.1`) made after a stable release, based on a dedicated release branch.
Each type plays a distinct role in delivering reliable and tested updates while allowing ongoing development to continue smoothly.
## Release Candidates
Release candidates are Cozystack versions that introduce new features and are published before a stable release.
Their purpose is to help validate stability before finalizing a new feature release.
They allow for final rounds of testing and bug fixes without freezing development.
Release candidates are given numbers `vX.Y.0-rc.N`, for example, `v0.42.0-rc.1`.
They are created directly in the `main` branch.
An RC is typically tagged when all major features for the upcoming release have been merged into main and the release enters its testing phase.
However, new features and changes can still be added before the regular release `vX.Y.0`.
Each RC contributes to a cumulative set of release notes that will be finalized when `vX.Y.0` is released.
After testing, if no critical issues remain, the regular release (`vX.Y.0`) is tagged from the last RC or a later commit in main.
This begins the regular release process, creates a dedicated `release-X.Y` branch, and opens the way for patch releases.
## Regular Releases
When making a regular release, we tag the latest RC or a subsequent minimal-change commit as `vX.Y.0`.
In this explanation, we'll use version `v0.42.0` as an example:
```mermaid
gitGraph
commit id: "feature"
commit id: "feature 2"
commit id: "feature 3" tag: "v0.42.0"
```
A regular release sequence starts in the following way:
1. Maintainer tags a commit in `main` with `v0.42.0` and pushes it to GitHub.
2. CI workflow triggers on tag push:
1. Creates a draft page for release `v0.42.0`, if it wasn't created before.
2. Takes code from tag `v0.42.0`, builds images, and pushes them to ghcr.io.
3. Makes a new commit `Prepare release v0.42.0` with updated digests, pushes it to the new branch `release-0.42.0`, and opens a PR to `main`.
4. Builds Cozystack release assets from the new commit `Prepare release v0.42.0` and uploads them to the release draft page.
3. Maintainer reviews PR, tests build artifacts, and edits changelogs on the release draft page.
```mermaid
gitGraph
commit id: "feature"
commit id: "feature 2"
commit id: "feature 3" tag: "v0.42.0"
branch release-0.42.0
checkout release-0.42.0
commit id: "Prepare release v0.42.0"
checkout main
merge release-0.42.0 id: "Pull Request"
```
When testing and editing are completed, the sequence goes on.
4. Maintainer merges the PR. GitHub removes the merged branch `release-0.42.0`.
5. CI workflow triggers on merge:
1. Moves the tag `v0.42.0` to the newly created merge commit by force-pushing a tag to GitHub.
2. Publishes the release page (`draft` → `latest`).
6. The maintainer can now announce the release to the community.
```mermaid
gitGraph
commit id: "feature"
commit id: "feature 2"
commit id: "feature 3"
branch release-0.42.0
checkout release-0.42.0
commit id: "Prepare release v0.42.0"
checkout main
merge release-0.42.0 id: "Release v0.42.0" tag: "v0.42.0"
```
## Patch Releases
Making a patch release has a lot in common with a regular release, with a couple of differences:
* A release branch is used instead of `main`
* Patch commits are cherry-picked to the release branch.
* A pull request is opened against the release branch.
Let's assume that we've released `v0.42.0` and that development is ongoing.
We have introduced a couple of new features and some fixes to features that we have released
in `v0.42.0`.
Once problems were found and fixed, a patch release is due.
```mermaid
gitGraph
commit id: "Release v0.42.0" tag: "v0.42.0"
checkout main
commit id: "feature 4"
commit id: "patch 1"
commit id: "feature 5"
commit id: "patch 2"
```
1. The maintainer creates a release branch, `release-0.42,` and cherry-picks patch commits from `main` to `release-0.42`.
These must be only patches to features that were present in version `v0.42.0`.
Cherry-picking can be done as soon as each patch is merged into `main`,
or directly before the release.
```mermaid
gitGraph
commit id: "Release v0.42.0" tag: "v0.42.0"
branch release-0.42
checkout main
commit id: "feature 4"
commit id: "patch 1"
commit id: "feature 5"
commit id: "patch 2"
checkout release-0.42
cherry-pick id: "patch 1"
cherry-pick id: "patch 2"
```
When all relevant patch commits are cherry-picked, the branch is ready for release.
2. The maintainer tags the `HEAD` commit of branch `release-0.42` as `v0.42.1` and then pushes it to GitHub.
3. CI workflow triggers on tag push:
1. Creates a draft page for release `v0.42.1`, if it wasn't created before.
2. Takes code from tag `v0.42.1`, builds images, and pushes them to ghcr.io.
3. Makes a new commit `Prepare release v0.42.1` with updated digests, pushes it to the new branch `release-0.42.1`, and opens a PR to `release-0.42`.
4. Builds Cozystack release assets from the new commit `Prepare release v0.42.1` and uploads them to the release draft page.
4. Maintainer reviews PR, tests build artifacts, and edits changelogs on the release draft page.
```mermaid
gitGraph
commit id: "Release v0.42.0" tag: "v0.42.0"
branch release-0.42
checkout main
commit id: "feature 4"
commit id: "patch 1"
commit id: "feature 5"
commit id: "patch 2"
checkout release-0.42
cherry-pick id: "patch 1"
cherry-pick id: "patch 2" tag: "v0.42.1"
branch release-0.42.1
commit id: "Prepare release v0.42.1"
checkout release-0.42
merge release-0.42.1 id: "Pull request"
```
Finally, when release is confirmed, the release sequence goes on.
5. Maintainer merges the PR. GitHub removes the merged branch `release-0.42.1`.
6. CI workflow triggers on merge:
1. Moves the tag `v0.42.1` to the newly created merge commit by force-pushing a tag to GitHub.
2. Publishes the release page (`draft` → `latest`).
7. The maintainer can now announce the release to the community.

117
hack/cozytest.sh Executable file
View File

@@ -0,0 +1,117 @@
#!/bin/sh
###############################################################################
# cozytest.sh - Bats-compatible test runner with live trace and enhanced #
# output, written in pure shell #
###############################################################################
set -eu
TEST_FILE=${1:?Usage: ./cozytest.sh <file.bats> [pattern]}
PATTERN=${2:-*}
LINE='----------------------------------------------------------------'
cols() { stty size 2>/dev/null | awk '{print $2}' || echo 80; }
MAXW=$(( $(cols) - 12 )); [ "$MAXW" -lt 40 ] && MAXW=70
BEGIN=$(date +%s)
timestamp() { s=$(( $(date +%s) - BEGIN )); printf '[%02d:%02d]' $((s/60)) $((s%60)); }
###############################################################################
# run_one <fn> <title> #
###############################################################################
run_one() {
fn=$1 title=$2
tmp=$(mktemp -d) || { echo "Failed to create temp directory" >&2; exit 1; }
log="$tmp/log"
echo "╭ » Run test: $title"
START=$(date +%s)
skip_next="+ $fn" # первую строку трассировки с именем функции пропустим
{
(
PS4='+ ' # prefix for set -x
set -eu -x # strict + trace
"$fn"
)
printf '__RC__%s\n' "$?"
} 2>&1 | tee "$log" | while IFS= read -r line; do
case "$line" in
'__RC__'*) : ;;
'+ '*) cmd=${line#'+ '}
[ "$cmd" = "${skip_next#+ }" ] && continue
case "$cmd" in
'set -e'|'set -x'|'set -u'|'return 0') continue ;;
esac
out=$cmd ;;
*) out=$line ;;
esac
now=$(( $(date +%s) - START ))
[ ${#out} -gt "$MAXW" ] && out="$(printf '%.*s…' "$MAXW" "$out")"
printf '┊[%02d:%02d] %s\n' $((now/60)) $((now%60)) "$out"
done
rc=$(awk '/^__RC__/ {print substr($0,7)}' "$log" | tail -n1)
[ -z "$rc" ] && rc=1
now=$(( $(date +%s) - START ))
if [ "$rc" -eq 0 ]; then
printf '╰[%02d:%02d] ✅ Test OK: %s\n' $((now/60)) $((now%60)) "$title"
else
printf '╰[%02d:%02d] ❌ Test failed: %s (exit %s)\n' \
$((now/60)) $((now%60)) "$title" "$rc"
echo "----- captured output -----------------------------------------"
grep -v '^__RC__' "$log"
echo "$LINE"
exit "$rc"
fi
rm -rf "$tmp"
}
###############################################################################
# convert .bats -> shell-functions #
###############################################################################
TMP_SH=$(mktemp) || { echo "Failed to create temp file" >&2; exit 1; }
trap 'rm -f "$TMP_SH"' EXIT
awk '
/^@test[[:space:]]+"/ {
line = substr($0, index($0, "\"") + 1)
title = substr(line, 1, index(line, "\"") - 1)
fname = "test_"
for (i = 1; i <= length(title); i++) {
c = substr(title, i, 1)
fname = fname (c ~ /[A-Za-z0-9]/ ? c : "_")
}
printf("### %s\n", title)
printf("%s() {\n", fname)
print " set -e" # ошибка → падение теста
next
}
/^}$/ {
print " return 0" # если автор не сделал exit 1 — тест ОК
print "}"
next
}
{ print }
' "$TEST_FILE" > "$TMP_SH"
[ -f "$TMP_SH" ] || { echo "Failed to generate test functions" >&2; exit 1; }
# shellcheck disable=SC1090
. "$TMP_SH"
###############################################################################
# run selected tests #
###############################################################################
awk -v pat="$PATTERN" '
/^### / {
title = substr($0, 5)
name = "test_"
for (i = 1; i <= length(title); i++) {
c = substr(title, i, 1)
name = name (c ~ /[A-Za-z0-9]/ ? c : "_")
}
if (pat == "*" || index(title, pat) > 0)
printf("%s %s\n", name, title)
}
' "$TMP_SH" | while IFS=' ' read -r fn title; do
run_one "$fn" "$title"
done

94
hack/e2e-apps.bats Executable file
View File

@@ -0,0 +1,94 @@
#!/usr/bin/env bats
# -----------------------------------------------------------------------------
# Cozystack endtoend provisioning test (Bats)
# -----------------------------------------------------------------------------
@test "Create tenant with isolated mode enabled" {
kubectl create -f - <<EOF
apiVersion: apps.cozystack.io/v1alpha1
kind: Tenant
metadata:
name: test
namespace: tenant-root
spec:
etcd: false
host: ""
ingress: false
isolated: true
monitoring: false
resourceQuotas: {}
seaweedfs: false
EOF
kubectl wait hr/tenant-test -n tenant-root --timeout=1m --for=condition=ready
kubectl wait namespace tenant-test --timeout=20s --for=jsonpath='{.status.phase}'=Active
}
@test "Create a tenant Kubernetes control plane" {
kubectl create -f - <<EOF
apiVersion: apps.cozystack.io/v1alpha1
kind: Kubernetes
metadata:
name: test
namespace: tenant-test
spec:
addons:
certManager:
enabled: false
valuesOverride: {}
cilium:
valuesOverride: {}
fluxcd:
enabled: false
valuesOverride: {}
gatewayAPI:
enabled: false
gpuOperator:
enabled: false
valuesOverride: {}
ingressNginx:
enabled: true
hosts: []
valuesOverride: {}
monitoringAgents:
enabled: false
valuesOverride: {}
verticalPodAutoscaler:
valuesOverride: {}
controlPlane:
apiServer:
resources: {}
resourcesPreset: small
controllerManager:
resources: {}
resourcesPreset: micro
konnectivity:
server:
resources: {}
resourcesPreset: micro
replicas: 2
scheduler:
resources: {}
resourcesPreset: micro
host: ""
nodeGroups:
md0:
ephemeralStorage: 20Gi
gpus: []
instanceType: u1.medium
maxReplicas: 10
minReplicas: 0
resources:
cpu: ""
memory: ""
roles:
- ingress-nginx
storageClass: replicated
EOF
kubectl wait namespace tenant-test --timeout=20s --for=jsonpath='{.status.phase}'=Active
timeout 10 sh -ec 'until kubectl get kamajicontrolplane -n tenant-test kubernetes-test; do sleep 1; done'
kubectl wait --for=condition=TenantControlPlaneCreated kamajicontrolplane -n tenant-test kubernetes-test --timeout=4m
kubectl wait tcp -n tenant-test kubernetes-test --timeout=2m --for=jsonpath='{.status.kubernetesResources.version.status}'=Ready
kubectl wait deploy --timeout=4m --for=condition=available -n tenant-test kubernetes-test kubernetes-test-cluster-autoscaler kubernetes-test-kccm kubernetes-test-kcsi-controller
kubectl wait machinedeployment kubernetes-test-md0 -n tenant-test --timeout=1m --for=jsonpath='{.status.replicas}'=2
kubectl wait machinedeployment kubernetes-test-md0 -n tenant-test --timeout=5m --for=jsonpath='{.status.v1beta2.readyReplicas}'=2
}

432
hack/e2e-cluster.bats Executable file
View File

@@ -0,0 +1,432 @@
#!/usr/bin/env bats
# -----------------------------------------------------------------------------
# Cozystack endtoend provisioning test (Bats)
# -----------------------------------------------------------------------------
@test "Required installer assets exist" {
if [ ! -f _out/assets/cozystack-installer.yaml ]; then
echo "Missing: _out/assets/cozystack-installer.yaml" >&2
exit 1
fi
if [ ! -f _out/assets/nocloud-amd64.raw.xz ]; then
echo "Missing: _out/assets/nocloud-amd64.raw.xz" >&2
exit 1
fi
}
@test "IPv4 forwarding is enabled" {
if [ "$(cat /proc/sys/net/ipv4/ip_forward)" != 1 ]; then
echo "IPv4 forwarding is disabled!" >&2
echo >&2
echo "Enable it with:" >&2
echo " echo 1 > /proc/sys/net/ipv4/ip_forward" >&2
exit 1
fi
}
@test "Clean previous VMs" {
kill $(cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid srv4/qemu.pid 2>/dev/null) 2>/dev/null || true
rm -rf srv1 srv2 srv3 srv4
}
@test "Prepare networking and masquerading" {
ip link del cozy-br0 2>/dev/null || true
ip link add cozy-br0 type bridge
ip link set cozy-br0 up
ip address add 192.168.123.1/24 dev cozy-br0
# Masquerading rule idempotent (delete first, then add)
iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true
iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE
}
@test "Prepare cloudinit drive for VMs" {
mkdir -p srv1 srv2 srv3 srv4
# Generate cloudinit ISOs
for i in 1 2 3; do
echo "hostname: srv${i}" > "srv${i}/meta-data"
cat > "srv${i}/user-data" <<'EOF'
#cloud-config
EOF
cat > "srv${i}/network-config" <<EOF
version: 2
ethernets:
eth0:
dhcp4: false
addresses:
- "192.168.123.1${i}/26"
gateway4: "192.168.123.1"
nameservers:
search: [cluster.local]
addresses: [8.8.8.8]
EOF
( cd "srv${i}" && genisoimage \
-output seed.img \
-volid cidata -rational-rock -joliet \
user-data meta-data network-config )
done
cat > "srv4/meta-data" <<EOT
hostname: srv4
instance-id: srv4
local-hostname: srv4
EOT
cat > "srv4/user-data" <<EOT
#cloud-config
password: mysecret
chpasswd: { expire: False }
ssh_pwauth: True
EOT
cat > "srv4/network-config" <<EOT
version: 2
ethernets:
eth0:
dhcp4: false
addresses:
- "192.168.123.14/26"
gateway4: "192.168.123.1"
nameservers:
search: [cluster.local]
addresses: [8.8.8.8]
EOT
cd srv4 && genisoimage \
-output seed.img \
-volid cidata -rational-rock -joliet \
user-data meta-data network-config
cd ..
}
@test "Use Talos NoCloud image from assets" {
if [ ! -f _out/assets/nocloud-amd64.raw.xz ]; then
echo "Missing _out/assets/nocloud-amd64.raw.xz" 2>&1
exit 1
fi
rm -f nocloud-amd64.raw
cp _out/assets/nocloud-amd64.raw.xz .
xz --decompress nocloud-amd64.raw.xz
}
@test "Prepare VM disks" {
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img -O ubuntu.img
qemu-img convert -f qcow2 -O raw ubuntu.img srv4/system.img
qemu-img resize srv4/system.img 20G
qemu-img create srv4/data.img 100G
for i in 1 2 3; do
cp nocloud-amd64.raw srv${i}/system.img
qemu-img resize srv${i}/system.img 50G
qemu-img create srv${i}/data.img 100G
done
}
@test "Create tap devices" {
for i in 1 2 3 4; do
ip link del cozy-srv${i} 2>/dev/null || true
ip tuntap add dev cozy-srv${i} mode tap
ip link set cozy-srv${i} up
ip link set cozy-srv${i} master cozy-br0
done
}
@test "Boot QEMU VMs" {
for i in 1 2 3 4; do
qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp 8 -m 16384 \
-device virtio-net,netdev=net0,mac=52:54:00:12:34:5${i} \
-netdev tap,id=net0,ifname=cozy-srv${i},script=no,downscript=no \
-drive file=srv${i}/system.img,if=virtio,format=raw \
-drive file=srv${i}/seed.img,if=virtio,format=raw \
-drive file=srv${i}/data.img,if=virtio,format=raw \
-display none -daemonize -pidfile srv${i}/qemu.pid
done
# Give qemu a few seconds to start up networking
sleep 5
}
@test "Wait until Talos API port 50000 is reachable on all machines" {
timeout 60 sh -ec 'until nc -nz 192.168.123.11 50000 && nc -nz 192.168.123.12 50000 && nc -nz 192.168.123.13 50000; do sleep 1; done'
}
@test "Generate Talos cluster configuration" {
# Clusterwide patches
cat > patch.yaml <<'EOF'
machine:
kubelet:
nodeIP:
validSubnets:
- 192.168.123.0/24
extraConfig:
maxPods: 512
kernel:
modules:
- name: openvswitch
- name: drbd
parameters:
- usermode_helper=disabled
- name: zfs
- name: spl
registries:
mirrors:
docker.io:
endpoints:
- https://mirror.gcr.io
files:
- content: |
[plugins]
[plugins."io.containerd.cri.v1.runtime"]
device_ownership_from_security_context = true
path: /etc/cri/conf.d/20-customization.part
op: create
cluster:
apiServer:
extraArgs:
oidc-issuer-url: "https://keycloak.example.org/realms/cozy"
oidc-client-id: "kubernetes"
oidc-username-claim: "preferred_username"
oidc-groups-claim: "groups"
network:
cni:
name: none
dnsDomain: cozy.local
podSubnets:
- 10.244.0.0/16
serviceSubnets:
- 10.96.0.0/16
EOF
# Controlplaneonly patches
cat > patch-controlplane.yaml <<'EOF'
machine:
nodeLabels:
node.kubernetes.io/exclude-from-external-load-balancers:
$patch: delete
network:
interfaces:
- interface: eth0
vip:
ip: 192.168.123.10
cluster:
allowSchedulingOnControlPlanes: true
controllerManager:
extraArgs:
bind-address: 0.0.0.0
scheduler:
extraArgs:
bind-address: 0.0.0.0
apiServer:
certSANs:
- 127.0.0.1
proxy:
disabled: true
discovery:
enabled: false
etcd:
advertisedSubnets:
- 192.168.123.0/24
EOF
# Generate secrets once
if [ ! -f secrets.yaml ]; then
talosctl gen secrets
fi
rm -f controlplane.yaml worker.yaml talosconfig kubeconfig
talosctl gen config --with-secrets secrets.yaml cozystack https://192.168.123.10:6443 \
--config-patch=@patch.yaml --config-patch-control-plane @patch-controlplane.yaml
}
@test "Apply Talos configuration to the node" {
# Apply the configuration to all three nodes
for node in 11 12 13; do
talosctl apply -f controlplane.yaml -n 192.168.123.${node} -e 192.168.123.${node} -i
done
# Wait for Talos services to come up again
timeout 60 sh -ec 'until nc -nz 192.168.123.11 50000 && nc -nz 192.168.123.12 50000 && nc -nz 192.168.123.13 50000; do sleep 1; done'
}
@test "Bootstrap Talos cluster" {
# Bootstrap etcd on the first node
timeout 10 sh -ec 'until talosctl bootstrap -n 192.168.123.11 -e 192.168.123.11; do sleep 1; done'
# Wait until etcd is healthy
timeout 180 sh -ec 'until talosctl etcd members -n 192.168.123.11,192.168.123.12,192.168.123.13 -e 192.168.123.10 >/dev/null 2>&1; do sleep 1; done'
timeout 60 sh -ec 'while talosctl etcd members -n 192.168.123.11,192.168.123.12,192.168.123.13 -e 192.168.123.10 2>&1 | grep -q "rpc error"; do sleep 1; done'
# Retrieve kubeconfig
rm -f kubeconfig
talosctl kubeconfig kubeconfig -e 192.168.123.10 -n 192.168.123.10
# Wait until all three nodes register in Kubernetes
timeout 60 sh -ec 'until [ $(kubectl get node --no-headers | wc -l) -eq 3 ]; do sleep 1; done'
}
@test "Install Cozystack" {
# Create namespace & configmap required by installer
kubectl create namespace cozy-system --dry-run=client -o yaml | kubectl apply -f -
kubectl create configmap cozystack -n cozy-system \
--from-literal=bundle-name=paas-full \
--from-literal=ipv4-pod-cidr=10.244.0.0/16 \
--from-literal=ipv4-pod-gateway=10.244.0.1 \
--from-literal=ipv4-svc-cidr=10.96.0.0/16 \
--from-literal=ipv4-join-cidr=100.64.0.0/16 \
--from-literal=root-host=example.org \
--from-literal=api-server-endpoint=https://192.168.123.10:6443 \
--dry-run=client -o yaml | kubectl apply -f -
# Apply installer manifests from file
kubectl apply -f _out/assets/cozystack-installer.yaml
# Wait for the installer deployment to become available
kubectl wait deployment/cozystack -n cozy-system --timeout=1m --for=condition=Available
# Wait until HelmReleases appear & reconcile them
timeout 60 sh -ec 'until kubectl get hr -A | grep -q cozys; do sleep 1; done'
sleep 5
kubectl get hr -A | awk 'NR>1 {print "kubectl wait --timeout=15m --for=condition=ready -n "$1" hr/"$2" &"} END {print "wait"}' | sh -ex
# Fail the test if any HelmRelease is not Ready
if kubectl get hr -A | grep -v " True " | grep -v NAME; then
kubectl get hr -A
fail "Some HelmReleases failed to reconcile"
fi
}
@test "Wait for ClusterAPI provider deployments" {
# Wait for ClusterAPI provider deployments
timeout 60 sh -ec 'until kubectl get deploy -n cozy-cluster-api capi-controller-manager capi-kamaji-controller-manager capi-kubeadm-bootstrap-controller-manager capi-operator-cluster-api-operator capk-controller-manager >/dev/null 2>&1; do sleep 1; done'
kubectl wait deployment/capi-controller-manager deployment/capi-kamaji-controller-manager deployment/capi-kubeadm-bootstrap-controller-manager deployment/capi-operator-cluster-api-operator deployment/capk-controller-manager -n cozy-cluster-api --timeout=1m --for=condition=available
}
@test "Wait for LINSTOR and configure storage" {
# Linstor controller and nodes
kubectl wait deployment/linstor-controller -n cozy-linstor --timeout=5m --for=condition=available
timeout 60 sh -ec 'until [ $(kubectl exec -n cozy-linstor deploy/linstor-controller -- linstor node list | grep -c Online) -eq 3 ]; do sleep 1; done'
for node in srv1 srv2 srv3; do
kubectl exec -n cozy-linstor deploy/linstor-controller -- linstor ps cdp zfs ${node} /dev/vdc --pool-name data --storage-pool data
done
# Storage classes
kubectl apply -f - <<'EOF'
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: linstor.csi.linbit.com
parameters:
linstor.csi.linbit.com/storagePool: "data"
linstor.csi.linbit.com/layerList: "storage"
linstor.csi.linbit.com/allowRemoteVolumeAccess: "false"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: replicated
provisioner: linstor.csi.linbit.com
parameters:
linstor.csi.linbit.com/storagePool: "data"
linstor.csi.linbit.com/autoPlace: "3"
linstor.csi.linbit.com/layerList: "drbd storage"
linstor.csi.linbit.com/allowRemoteVolumeAccess: "true"
property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: suspend-io
property.linstor.csi.linbit.com/DrbdOptions/Resource/on-no-data-accessible: suspend-io
property.linstor.csi.linbit.com/DrbdOptions/Resource/on-suspended-primary-outdated: force-secondary
property.linstor.csi.linbit.com/DrbdOptions/Net/rr-conflict: retry-connect
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF
}
@test "Wait for MetalLB and configure address pool" {
# MetalLB address pool
kubectl apply -f - <<'EOF'
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: cozystack
namespace: cozy-metallb
spec:
ipAddressPools: [cozystack]
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: cozystack
namespace: cozy-metallb
spec:
addresses: [192.168.123.200-192.168.123.250]
autoAssign: true
avoidBuggyIPs: false
EOF
}
@test "Check Cozystack API service" {
kubectl wait --for=condition=Available apiservices/v1alpha1.apps.cozystack.io --timeout=2m
}
@test "Configure Tenant and wait for applications" {
return 0
# Patch root tenant and wait for its releases
kubectl patch tenants/root -n tenant-root --type merge -p '{"spec":{"host":"example.org","ingress":true,"monitoring":true,"etcd":true,"isolated":true}}'
timeout 60 sh -ec 'until kubectl get hr -n tenant-root etcd ingress monitoring tenant-root >/dev/null 2>&1; do sleep 1; done'
kubectl wait hr/etcd hr/ingress hr/tenant-root -n tenant-root --timeout=2m --for=condition=ready
if ! kubectl wait hr/monitoring -n tenant-root --timeout=2m --for=condition=ready; then
flux reconcile hr monitoring -n tenant-root --force
kubectl wait hr/monitoring -n tenant-root --timeout=2m --for=condition=ready
fi
# Expose Cozystack services through ingress
kubectl patch configmap/cozystack -n cozy-system --type merge -p '{"data":{"expose-services":"api,dashboard,cdi-uploadproxy,vm-exportproxy,keycloak"}}'
# NGINX ingress controller
timeout 60 sh -ec 'until kubectl get deploy root-ingress-controller -n tenant-root >/dev/null 2>&1; do sleep 1; done'
kubectl wait deploy/root-ingress-controller -n tenant-root --timeout=5m --for=condition=available
# etcd statefulset
kubectl wait sts/etcd -n tenant-root --for=jsonpath='{.status.readyReplicas}'=3 --timeout=5m
# VictoriaMetrics components
kubectl wait vmalert/vmalert-shortterm vmalertmanager/alertmanager -n tenant-root --for=jsonpath='{.status.updateStatus}'=operational --timeout=5m
kubectl wait vlogs/generic -n tenant-root --for=jsonpath='{.status.updateStatus}'=operational --timeout=5m
kubectl wait vmcluster/shortterm vmcluster/longterm -n tenant-root --for=jsonpath='{.status.clusterStatus}'=operational --timeout=5m
# Grafana
kubectl wait clusters.postgresql.cnpg.io/grafana-db -n tenant-root --for=condition=ready --timeout=5m
kubectl wait deploy/grafana-deployment -n tenant-root --for=condition=available --timeout=5m
# Verify Grafana via ingress
ingress_ip=$(kubectl get svc root-ingress-controller -n tenant-root -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
if ! curl -sS -k "https://${ingress_ip}" -H 'Host: grafana.example.org' --max-time 30 | grep -q Found; then
echo "Failed to access Grafana via ingress at ${ingress_ip}" >&2
exit 1
fi
}
@test "Keycloak OIDC stack is healthy" {
return 0
kubectl patch configmap/cozystack -n cozy-system --type merge -p '{"data":{"oidc-enabled":"true"}}'
timeout 120 sh -ec 'until kubectl get hr -n cozy-keycloak keycloak keycloak-configure keycloak-operator >/dev/null 2>&1; do sleep 1; done'
kubectl wait hr/keycloak hr/keycloak-configure hr/keycloak-operator -n cozy-keycloak --timeout=10m --for=condition=ready
}

View File

@@ -1,165 +0,0 @@
#!/bin/bash
RED='\033[0;31m'
GREEN='\033[0;32m'
RESET='\033[0m'
YELLOW='\033[0;33m'
ROOT_NS="tenant-root"
TEST_TENANT="tenant-e2e"
values_base_path="/hack/testdata/"
checks_base_path="/hack/testdata/"
function delete_hr() {
local release_name="$1"
local namespace="$2"
if [[ -z "$release_name" ]]; then
echo -e "${RED}Error: Release name is required.${RESET}"
exit 1
fi
if [[ -z "$namespace" ]]; then
echo -e "${RED}Error: Namespace name is required.${RESET}"
exit 1
fi
if [[ "$release_name" == "tenant-e2e" ]]; then
echo -e "${YELLOW}Skipping deletion for release tenant-e2e.${RESET}"
return 0
fi
kubectl delete helmrelease $release_name -n $namespace
}
function install_helmrelease() {
local release_name="$1"
local namespace="$2"
local chart_path="$3"
local repo_name="$4"
local repo_ns="$5"
local values_file="$6"
if [[ -z "$release_name" ]]; then
echo -e "${RED}Error: Release name is required.${RESET}"
exit 1
fi
if [[ -z "$namespace" ]]; then
echo -e "${RED}Error: Namespace name is required.${RESET}"
exit 1
fi
if [[ -z "$chart_path" ]]; then
echo -e "${RED}Error: Chart path name is required.${RESET}"
exit 1
fi
if [[ -n "$values_file" && -f "$values_file" ]]; then
local values_section
values_section=$(echo " values:" && sed 's/^/ /' "$values_file")
fi
local helmrelease_file=$(mktemp /tmp/HelmRelease.XXXXXX.yaml)
{
echo "apiVersion: helm.toolkit.fluxcd.io/v2"
echo "kind: HelmRelease"
echo "metadata:"
echo " labels:"
echo " cozystack.io/ui: \"true\""
echo " name: \"$release_name\""
echo " namespace: \"$namespace\""
echo "spec:"
echo " chart:"
echo " spec:"
echo " chart: \"$chart_path\""
echo " reconcileStrategy: Revision"
echo " sourceRef:"
echo " kind: HelmRepository"
echo " name: \"$repo_name\""
echo " namespace: \"$repo_ns\""
echo " version: '*'"
echo " interval: 1m0s"
echo " timeout: 5m0s"
[[ -n "$values_section" ]] && echo "$values_section"
} > "$helmrelease_file"
kubectl apply -f "$helmrelease_file"
rm -f "$helmrelease_file"
}
function install_tenant (){
local release_name="$1"
local namespace="$2"
local values_file="${values_base_path}tenant/values.yaml"
local repo_name="cozystack-apps"
local repo_ns="cozy-public"
install_helmrelease "$release_name" "$namespace" "tenant" "$repo_name" "$repo_ns" "$values_file"
}
function make_extra_checks(){
local checks_file="$1"
echo "after exec make $checks_file"
if [[ -n "$checks_file" && -f "$checks_file" ]]; then
echo -e "${YELLOW}Start extra checks with file: ${checks_file}${RESET}"
fi
}
function check_helmrelease_status() {
local release_name="$1"
local namespace="$2"
local checks_file="$3"
local timeout=300 # Timeout in seconds
local interval=5 # Interval between checks in seconds
local elapsed=0
while [[ $elapsed -lt $timeout ]]; do
local status_output
status_output=$(kubectl get helmrelease "$release_name" -n "$namespace" -o json | jq -r '.status.conditions[-1].reason')
if [[ "$status_output" == "InstallSucceeded" || "$status_output" == "UpgradeSucceeded" ]]; then
echo -e "${GREEN}Helm release '$release_name' is ready.${RESET}"
make_extra_checks "$checks_file"
delete_hr $release_name $namespace
return 0
elif [[ "$status_output" == "InstallFailed" ]]; then
echo -e "${RED}Helm release '$release_name': InstallFailed${RESET}"
exit 1
else
echo -e "${YELLOW}Helm release '$release_name' is not ready. Current status: $status_output${RESET}"
fi
sleep "$interval"
elapsed=$((elapsed + interval))
done
echo -e "${RED}Timeout reached. Helm release '$release_name' is still not ready after $timeout seconds.${RESET}"
exit 1
}
chart_name="$1"
if [ -z "$chart_name" ]; then
echo -e "${RED}No chart name provided. Exiting...${RESET}"
exit 1
fi
checks_file="${checks_base_path}${chart_name}/check.sh"
repo_name="cozystack-apps"
repo_ns="cozy-public"
release_name="$chart_name-e2e"
values_file="${values_base_path}${chart_name}/values.yaml"
install_tenant $TEST_TENANT $ROOT_NS
check_helmrelease_status $TEST_TENANT $ROOT_NS "${checks_base_path}tenant/check.sh"
echo -e "${YELLOW}Running tests for chart: $chart_name${RESET}"
install_helmrelease $release_name $TEST_TENANT $chart_name $repo_name $repo_ns $values_file
check_helmrelease_status $release_name $TEST_TENANT $checks_file

View File

@@ -1,356 +0,0 @@
#!/bin/bash
if [ "$COZYSTACK_INSTALLER_YAML" = "" ]; then
echo 'COZYSTACK_INSTALLER_YAML variable is not set!' >&2
echo 'please set it with following command:' >&2
echo >&2
echo 'export COZYSTACK_INSTALLER_YAML=$(helm template -n cozy-system installer packages/core/installer)' >&2
echo >&2
exit 1
fi
if [ "$(cat /proc/sys/net/ipv4/ip_forward)" != 1 ]; then
echo "IPv4 forwarding is not enabled!" >&2
echo 'please enable forwarding with the following command:' >&2
echo >&2
echo 'echo 1 > /proc/sys/net/ipv4/ip_forward' >&2
echo >&2
exit 1
fi
set -x
set -e
kill `cat srv1/qemu.pid srv2/qemu.pid srv3/qemu.pid` || true
ip link del cozy-br0 || true
ip link add cozy-br0 type bridge
ip link set cozy-br0 up
ip addr add 192.168.123.1/24 dev cozy-br0
# Enable masquerading
iptables -t nat -D POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE 2>/dev/null || true
iptables -t nat -A POSTROUTING -s 192.168.123.0/24 ! -d 192.168.123.0/24 -j MASQUERADE
rm -rf srv1 srv2 srv3
mkdir -p srv1 srv2 srv3
# Prepare cloud-init
for i in 1 2 3; do
echo "hostname: srv$i" > "srv$i/meta-data"
echo '#cloud-config' > "srv$i/user-data"
cat > "srv$i/network-config" <<EOT
version: 2
ethernets:
eth0:
dhcp4: false
addresses:
- "192.168.123.1$i/26"
gateway4: "192.168.123.1"
nameservers:
search: [cluster.local]
addresses: [8.8.8.8]
EOT
( cd srv$i && genisoimage \
-output seed.img \
-volid cidata -rational-rock -joliet \
user-data meta-data network-config
)
done
# Prepare system drive
if [ ! -f nocloud-amd64.raw ]; then
wget https://github.com/cozystack/cozystack/releases/latest/download/nocloud-amd64.raw.xz -O nocloud-amd64.raw.xz
rm -f nocloud-amd64.raw
xz --decompress nocloud-amd64.raw.xz
fi
for i in 1 2 3; do
cp nocloud-amd64.raw srv$i/system.img
qemu-img resize srv$i/system.img 20G
done
# Prepare data drives
for i in 1 2 3; do
qemu-img create srv$i/data.img 100G
done
# Prepare networking
for i in 1 2 3; do
ip link del cozy-srv$i || true
ip tuntap add dev cozy-srv$i mode tap
ip link set cozy-srv$i up
ip link set cozy-srv$i master cozy-br0
done
# Start VMs
for i in 1 2 3; do
qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -smp 8 -m 16384 \
-device virtio-net,netdev=net0,mac=52:54:00:12:34:5$i -netdev tap,id=net0,ifname=cozy-srv$i,script=no,downscript=no \
-drive file=srv$i/system.img,if=virtio,format=raw \
-drive file=srv$i/seed.img,if=virtio,format=raw \
-drive file=srv$i/data.img,if=virtio,format=raw \
-display none -daemonize -pidfile srv$i/qemu.pid
done
sleep 5
# Wait for VM to start up
timeout 60 sh -c 'until nc -nzv 192.168.123.11 50000 && nc -nzv 192.168.123.12 50000 && nc -nzv 192.168.123.13 50000; do sleep 1; done'
cat > patch.yaml <<\EOT
machine:
kubelet:
nodeIP:
validSubnets:
- 192.168.123.0/24
extraConfig:
maxPods: 512
kernel:
modules:
- name: openvswitch
- name: drbd
parameters:
- usermode_helper=disabled
- name: zfs
- name: spl
registries:
mirrors:
docker.io:
endpoints:
- https://mirror.gcr.io
files:
- content: |
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
device_ownership_from_security_context = true
path: /etc/cri/conf.d/20-customization.part
op: create
cluster:
apiServer:
extraArgs:
oidc-issuer-url: "https://keycloak.example.org/realms/cozy"
oidc-client-id: "kubernetes"
oidc-username-claim: "preferred_username"
oidc-groups-claim: "groups"
network:
cni:
name: none
dnsDomain: cozy.local
podSubnets:
- 10.244.0.0/16
serviceSubnets:
- 10.96.0.0/16
EOT
cat > patch-controlplane.yaml <<\EOT
machine:
nodeLabels:
node.kubernetes.io/exclude-from-external-load-balancers:
$patch: delete
network:
interfaces:
- interface: eth0
vip:
ip: 192.168.123.10
cluster:
allowSchedulingOnControlPlanes: true
controllerManager:
extraArgs:
bind-address: 0.0.0.0
scheduler:
extraArgs:
bind-address: 0.0.0.0
apiServer:
certSANs:
- 127.0.0.1
proxy:
disabled: true
discovery:
enabled: false
etcd:
advertisedSubnets:
- 192.168.123.0/24
EOT
# Gen configuration
if [ ! -f secrets.yaml ]; then
talosctl gen secrets
fi
rm -f controlplane.yaml worker.yaml talosconfig kubeconfig
talosctl gen config --with-secrets secrets.yaml cozystack https://192.168.123.10:6443 --config-patch=@patch.yaml --config-patch-control-plane @patch-controlplane.yaml
export TALOSCONFIG=$PWD/talosconfig
# Apply configuration
talosctl apply -f controlplane.yaml -n 192.168.123.11 -e 192.168.123.11 -i
talosctl apply -f controlplane.yaml -n 192.168.123.12 -e 192.168.123.12 -i
talosctl apply -f controlplane.yaml -n 192.168.123.13 -e 192.168.123.13 -i
# Wait for VM to be configured
timeout 60 sh -c 'until nc -nzv 192.168.123.11 50000 && nc -nzv 192.168.123.12 50000 && nc -nzv 192.168.123.13 50000; do sleep 1; done'
# Bootstrap
timeout 10 sh -c 'until talosctl bootstrap -n 192.168.123.11 -e 192.168.123.11; do sleep 1; done'
# Wait for etcd
timeout 180 sh -c 'until timeout -s 9 2 talosctl etcd members -n 192.168.123.11,192.168.123.12,192.168.123.13 -e 192.168.123.10 2>&1; do sleep 1; done'
timeout 60 sh -c 'while talosctl etcd members -n 192.168.123.11,192.168.123.12,192.168.123.13 -e 192.168.123.10 2>&1 | grep "rpc error"; do sleep 1; done'
rm -f kubeconfig
talosctl kubeconfig kubeconfig -e 192.168.123.10 -n 192.168.123.10
export KUBECONFIG=$PWD/kubeconfig
# Wait for kubernetes nodes appear
timeout 60 sh -c 'until [ $(kubectl get node -o name | wc -l) = 3 ]; do sleep 1; done'
kubectl create ns cozy-system -o yaml | kubectl apply -f -
kubectl create -f - <<\EOT
apiVersion: v1
kind: ConfigMap
metadata:
name: cozystack
namespace: cozy-system
data:
bundle-name: "paas-full"
ipv4-pod-cidr: "10.244.0.0/16"
ipv4-pod-gateway: "10.244.0.1"
ipv4-svc-cidr: "10.96.0.0/16"
ipv4-join-cidr: "100.64.0.0/16"
root-host: example.org
api-server-endpoint: https://192.168.123.10:6443
EOT
#
echo "$COZYSTACK_INSTALLER_YAML" | kubectl apply -f -
# wait for cozystack pod to start
kubectl wait deploy --timeout=1m --for=condition=available -n cozy-system cozystack
# wait for helmreleases appear
timeout 60 sh -c 'until kubectl get hr -A | grep cozy; do sleep 1; done'
sleep 5
kubectl get hr -A | awk 'NR>1 {print "kubectl wait --timeout=15m --for=condition=ready -n " $1 " hr/" $2 " &"} END{print "wait"}' | sh -x
# Wait for Cluster-API providers
timeout 30 sh -c 'until kubectl get deploy -n cozy-cluster-api capi-controller-manager capi-kamaji-controller-manager capi-kubeadm-bootstrap-controller-manager capi-operator-cluster-api-operator capk-controller-manager; do sleep 1; done'
kubectl wait deploy --timeout=30s --for=condition=available -n cozy-cluster-api capi-controller-manager capi-kamaji-controller-manager capi-kubeadm-bootstrap-controller-manager capi-operator-cluster-api-operator capk-controller-manager
# Wait for linstor controller
kubectl wait deploy --timeout=5m --for=condition=available -n cozy-linstor linstor-controller
# Wait for all linstor nodes become Online
timeout 60 sh -c 'until [ $(kubectl exec -n cozy-linstor deploy/linstor-controller -- linstor node list | grep -c Online) = 3 ]; do sleep 1; done'
kubectl exec -n cozy-linstor deploy/linstor-controller -- linstor ps cdp zfs srv1 /dev/vdc --pool-name data --storage-pool data
kubectl exec -n cozy-linstor deploy/linstor-controller -- linstor ps cdp zfs srv2 /dev/vdc --pool-name data --storage-pool data
kubectl exec -n cozy-linstor deploy/linstor-controller -- linstor ps cdp zfs srv3 /dev/vdc --pool-name data --storage-pool data
kubectl create -f- <<EOT
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: linstor.csi.linbit.com
parameters:
linstor.csi.linbit.com/storagePool: "data"
linstor.csi.linbit.com/layerList: "storage"
linstor.csi.linbit.com/allowRemoteVolumeAccess: "false"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: replicated
provisioner: linstor.csi.linbit.com
parameters:
linstor.csi.linbit.com/storagePool: "data"
linstor.csi.linbit.com/autoPlace: "3"
linstor.csi.linbit.com/layerList: "drbd storage"
linstor.csi.linbit.com/allowRemoteVolumeAccess: "true"
property.linstor.csi.linbit.com/DrbdOptions/auto-quorum: suspend-io
property.linstor.csi.linbit.com/DrbdOptions/Resource/on-no-data-accessible: suspend-io
property.linstor.csi.linbit.com/DrbdOptions/Resource/on-suspended-primary-outdated: force-secondary
property.linstor.csi.linbit.com/DrbdOptions/Net/rr-conflict: retry-connect
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOT
kubectl create -f- <<EOT
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: cozystack
namespace: cozy-metallb
spec:
ipAddressPools:
- cozystack
---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: cozystack
namespace: cozy-metallb
spec:
addresses:
- 192.168.123.200-192.168.123.250
autoAssign: true
avoidBuggyIPs: false
EOT
# Wait for cozystack-api
kubectl wait --for=condition=Available apiservices v1alpha1.apps.cozystack.io --timeout=2m
kubectl patch -n tenant-root tenants.apps.cozystack.io root --type=merge -p '{"spec":{
"host": "example.org",
"ingress": true,
"monitoring": true,
"etcd": true,
"isolated": true
}}'
# Wait for HelmRelease be created
timeout 60 sh -c 'until kubectl get hr -n tenant-root etcd ingress monitoring tenant-root; do sleep 1; done'
# Wait for HelmReleases be installed
kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr etcd ingress monitoring tenant-root
kubectl patch -n tenant-root ingresses.apps.cozystack.io ingress --type=merge -p '{"spec":{
"dashboard": true
}}'
# Wait for nginx-ingress-controller
timeout 60 sh -c 'until kubectl get deploy -n tenant-root root-ingress-controller; do sleep 1; done'
kubectl wait --timeout=5m --for=condition=available -n tenant-root deploy root-ingress-controller
# Wait for etcd
kubectl wait --timeout=5m --for=jsonpath=.status.readyReplicas=3 -n tenant-root sts etcd
# Wait for Victoria metrics
kubectl wait --timeout=5m --for=jsonpath=.status.updateStatus=operational -n tenant-root vmalert/vmalert-shortterm vmalertmanager/alertmanager
kubectl wait --timeout=5m --for=jsonpath=.status.status=operational -n tenant-root vlogs/generic
kubectl wait --timeout=5m --for=jsonpath=.status.clusterStatus=operational -n tenant-root vmcluster/shortterm vmcluster/longterm
# Wait for grafana
kubectl wait --timeout=5m --for=condition=ready -n tenant-root clusters.postgresql.cnpg.io grafana-db
kubectl wait --timeout=5m --for=condition=available -n tenant-root deploy grafana-deployment
# Get IP of nginx-ingress
ip=$(kubectl get svc -n tenant-root root-ingress-controller -o jsonpath='{.status.loadBalancer.ingress..ip}')
# Check Grafana
curl -sS -k "https://$ip" -H 'Host: grafana.example.org' | grep Found
# Test OIDC
kubectl patch -n cozy-system cm/cozystack --type=merge -p '{"data":{
"oidc-enabled": "true"
}}'
timeout 60 sh -c 'until kubectl get hr -n cozy-keycloak keycloak keycloak-configure keycloak-operator; do sleep 1; done'
kubectl wait --timeout=10m --for=condition=ready -n cozy-keycloak hr keycloak keycloak-configure keycloak-operator

View File

@@ -16,22 +16,24 @@ if [ ! -f "$file" ] || [ ! -s "$file" ]; then
exit 0
fi
miss_map=$(echo "$new_map" | awk 'NR==FNR { nm[$1 " " $2] = $3; next } { if (!($1 " " $2 in nm)) print $1, $2, $3}' - "$file")
miss_map=$(mktemp)
trap 'rm -f "$miss_map"' EXIT
echo -n "$new_map" | awk 'NR==FNR { nm[$1 " " $2] = $3; next } { if (!($1 " " $2 in nm)) print $1, $2, $3}' - "$file" > $miss_map
# search accross all tags sorted by version
search_commits=$(git ls-remote --tags origin | awk -F/ '$3 ~ /v[0-9]+.[0-9]+.[0-9]+/ {print}' | sort -k2,2 -rV | awk '{print $1}')
resolved_miss_map=$(
echo "$miss_map" | while read -r chart version commit; do
while read -r chart version commit; do
# if version is found in HEAD, it's HEAD
if [ $(awk '$1 == "version:" {print $2}' ./${chart}/Chart.yaml) = "${version}" ]; then
if [ "$(awk '$1 == "version:" {print $2}' ./${chart}/Chart.yaml)" = "${version}" ]; then
echo "$chart $version HEAD"
continue
fi
# if commit is not HEAD, check if it's valid
if [ $commit != "HEAD" ]; then
if [ $(git show "${commit}:./${chart}/Chart.yaml" 2>/dev/null | awk '$1 == "version:" {print $2}') != "${version}" ]; then
if [ "$commit" != "HEAD" ]; then
if [ "$(git show "${commit}:./${chart}/Chart.yaml" | awk '$1 == "version:" {print $2}')" != "${version}" ]; then
echo "Commit $commit for $chart $version is not valid" >&2
exit 1
fi
@@ -44,7 +46,7 @@ resolved_miss_map=$(
# if commit is HEAD, but version is not found in HEAD, check all tags
found_tag=""
for tag in $search_commits; do
if [ $(git show "${tag}:./${chart}/Chart.yaml" 2>/dev/null | awk '$1 == "version:" {print $2}') = "${version}" ]; then
if [ "$(git show "${tag}:./${chart}/Chart.yaml" | awk '$1 == "version:" {print $2}')" = "${version}" ]; then
found_tag=$(git rev-parse --short "${tag}")
break
fi
@@ -56,7 +58,7 @@ resolved_miss_map=$(
fi
echo "$chart $version $found_tag"
done
done < $miss_map
)
printf "%s\n" "$new_map" "$resolved_miss_map" | sort -k1,1 -k2,2 -V | awk '$1' > "$file"

65
hack/package_chart.sh Executable file
View File

@@ -0,0 +1,65 @@
#!/bin/sh
set -e
usage() {
printf "%s\n" "Usage:" >&2 ;
printf -- "%s\n" '---' >&2 ;
printf "%s %s\n" "$0" "INPUT_DIR OUTPUT_DIR TMP_DIR [DEPENDENCY_DIR]" >&2 ;
printf -- "%s\n" '---' >&2 ;
printf "%s\n" "Takes a helm repository from INPUT_DIR, with an optional library repository in" >&2 ;
printf "%s\n" "DEPENDENCY_DIR, prepares a view of the git archive at select points in history" >&2 ;
printf "%s\n" "in TMP_DIR and packages helm charts, outputting the tarballs to OUTPUT_DIR" >&2 ;
}
if [ "x$(basename $PWD)" != "xpackages" ]
then
echo "Error: This script must run from the ./packages/ directory" >&2
echo >&2
usage
exit 1
fi
if [ "x$#" != "x3" ] && [ "x$#" != "x4" ]
then
echo "Error: This script takes 3 or 4 arguments" >&2
echo "Got $# arguments:" "$@" >&2
echo >&2
usage
exit 1
fi
input_dir=$1
output_dir=$2
tmp_dir=$3
if [ "x$#" = "x4" ]
then
dependency_dir=$4
fi
rm -rf "${output_dir:?}"
mkdir -p "${output_dir}"
while read package _ commit
do
# this lets devs build the packages from a dirty repo for quick local testing
if [ "x$commit" = "xHEAD" ]
then
helm package "${input_dir}/${package}" -d "${output_dir}"
continue
fi
git archive --format tar "${commit}" "${input_dir}/${package}" | tar -xf- -C "${tmp_dir}/"
# the library chart is not present in older commits and git archive doesn't fail gracefully if the path is not found
if [ "x${dependency_dir}" != "x" ] && git ls-tree --name-only "${commit}" "${dependency_dir}" | grep -qx "${dependency_dir}"
then
git archive --format tar "${commit}" "${dependency_dir}" | tar -xf- -C "${tmp_dir}/"
fi
helm package "${tmp_dir}/${input_dir}/${package}" -d "${output_dir}"
rm -rf "${tmp_dir:?}/${input_dir:?}/${package:?}"
if [ "x${dependency_dir}" != "x" ]
then
rm -rf "${tmp_dir:?}/${dependency_dir:?}"
fi
done < "${input_dir}/versions_map"
helm repo index "${output_dir}"

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,2 +0,0 @@
endpoints:
- 8.8.8.8:443

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,62 +0,0 @@
## @section Common parameters
## @param host The hostname used to access the Kubernetes cluster externally (defaults to using the cluster name as a subdomain for the tenant host).
## @param controlPlane.replicas Number of replicas for Kubernetes contorl-plane components
## @param storageClass StorageClass used to store user data
##
host: ""
controlPlane:
replicas: 2
storageClass: replicated
## @param nodeGroups [object] nodeGroups configuration
##
nodeGroups:
md0:
minReplicas: 0
maxReplicas: 10
instanceType: "u1.medium"
ephemeralStorage: 20Gi
roles:
- ingress-nginx
resources:
cpu: ""
memory: ""
## @section Cluster Addons
##
addons:
## Cert-manager: automatically creates and manages SSL/TLS certificate
##
certManager:
## @param addons.certManager.enabled Enables the cert-manager
## @param addons.certManager.valuesOverride Custom values to override
enabled: true
valuesOverride: {}
## Ingress-NGINX Controller
##
ingressNginx:
## @param addons.ingressNginx.enabled Enable Ingress-NGINX controller (expect nodes with 'ingress-nginx' role)
## @param addons.ingressNginx.valuesOverride Custom values to override
##
enabled: true
## @param addons.ingressNginx.hosts List of domain names that should be passed through to the cluster by upper cluster
## e.g:
## hosts:
## - example.org
## - foo.example.net
##
hosts: []
valuesOverride: {}
## Flux CD
##
fluxcd:
## @param addons.fluxcd.enabled Enables Flux CD
## @param addons.fluxcd.valuesOverride Custom values to override
##
enabled: true
valuesOverride: {}

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,10 +0,0 @@
## @section Common parameters
## @param external Enable external access from outside the cluster
## @param replicas Persistent Volume size for NATS
## @param storageClass StorageClass used to store the data
##
external: false
replicas: 2
storageClass: ""

View File

@@ -1 +0,0 @@
return 0

View File

@@ -1,6 +0,0 @@
host: ""
etcd: false
monitoring: false
ingress: false
seaweedfs: false
isolated: true

View File

@@ -0,0 +1,158 @@
package controller
import (
"context"
"fmt"
"strings"
"time"
e "errors"
helmv2 "github.com/fluxcd/helm-controller/api/v2"
"gopkg.in/yaml.v2"
corev1 "k8s.io/api/core/v1"
apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
"k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/runtime"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/log"
)
type TenantHelmReconciler struct {
client.Client
Scheme *runtime.Scheme
}
func (r *TenantHelmReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
logger := log.FromContext(ctx)
hr := &helmv2.HelmRelease{}
if err := r.Get(ctx, req.NamespacedName, hr); err != nil {
if errors.IsNotFound(err) {
return ctrl.Result{}, nil
}
logger.Error(err, "unable to fetch HelmRelease")
return ctrl.Result{}, err
}
if !strings.HasPrefix(hr.Name, "tenant-") {
return ctrl.Result{}, nil
}
if len(hr.Status.Conditions) == 0 || hr.Status.Conditions[0].Type != "Ready" {
return ctrl.Result{}, nil
}
if len(hr.Status.History) == 0 {
logger.Info("no history in HelmRelease status", "name", hr.Name)
return ctrl.Result{}, nil
}
if hr.Status.History[0].Status != "deployed" {
return ctrl.Result{}, nil
}
newDigest := hr.Status.History[0].Digest
var hrList helmv2.HelmReleaseList
childNamespace := getChildNamespace(hr.Namespace, hr.Name)
if childNamespace == "tenant-root" && hr.Name == "tenant-root" {
if hr.Spec.Values == nil {
logger.Error(e.New("hr.Spec.Values is nil"), "cant annotate tenant-root ns")
return ctrl.Result{}, nil
}
err := annotateTenantRootNs(*hr.Spec.Values, r.Client)
if err != nil {
logger.Error(err, "cant annotate tenant-root ns")
return ctrl.Result{}, nil
}
logger.Info("namespace 'tenant-root' annotated")
}
if err := r.List(ctx, &hrList, client.InNamespace(childNamespace)); err != nil {
logger.Error(err, "unable to list HelmReleases in namespace", "namespace", hr.Name)
return ctrl.Result{}, err
}
for _, item := range hrList.Items {
if item.Name == hr.Name {
continue
}
oldDigest := item.GetAnnotations()["cozystack.io/tenant-config-digest"]
if oldDigest == newDigest {
continue
}
patchTarget := item.DeepCopy()
if patchTarget.Annotations == nil {
patchTarget.Annotations = map[string]string{}
}
ts := time.Now().Format(time.RFC3339Nano)
patchTarget.Annotations["cozystack.io/tenant-config-digest"] = newDigest
patchTarget.Annotations["reconcile.fluxcd.io/forceAt"] = ts
patchTarget.Annotations["reconcile.fluxcd.io/requestedAt"] = ts
patch := client.MergeFrom(item.DeepCopy())
if err := r.Patch(ctx, patchTarget, patch); err != nil {
logger.Error(err, "failed to patch HelmRelease", "name", patchTarget.Name)
continue
}
logger.Info("patched HelmRelease with new digest", "name", patchTarget.Name, "digest", newDigest, "version", hr.Status.History[0].Version)
}
return ctrl.Result{}, nil
}
func (r *TenantHelmReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&helmv2.HelmRelease{}).
Complete(r)
}
func getChildNamespace(currentNamespace, hrName string) string {
tenantName := strings.TrimPrefix(hrName, "tenant-")
switch {
case currentNamespace == "tenant-root" && hrName == "tenant-root":
// 1) root tenant inside root namespace
return "tenant-root"
case currentNamespace == "tenant-root":
// 2) any other tenant in root namespace
return fmt.Sprintf("tenant-%s", tenantName)
default:
// 3) tenant in a dedicated namespace
return fmt.Sprintf("%s-%s", currentNamespace, tenantName)
}
}
func annotateTenantRootNs(values apiextensionsv1.JSON, c client.Client) error {
var data map[string]interface{}
if err := yaml.Unmarshal(values.Raw, &data); err != nil {
return fmt.Errorf("failed to parse HelmRelease values: %w", err)
}
host, ok := data["host"].(string)
if !ok || host == "" {
return fmt.Errorf("host field not found or not a string")
}
var ns corev1.Namespace
if err := c.Get(context.TODO(), client.ObjectKey{Name: "tenant-root"}, &ns); err != nil {
return fmt.Errorf("failed to get namespace tenant-root: %w", err)
}
if ns.Annotations == nil {
ns.Annotations = map[string]string{}
}
ns.Annotations["namespace.cozystack.io/host"] = host
if err := c.Update(context.TODO(), &ns); err != nil {
return fmt.Errorf("failed to update namespace: %w", err)
}
return nil
}

View File

@@ -39,6 +39,15 @@ func (r *WorkloadReconciler) Reconcile(ctx context.Context, req ctrl.Request) (c
}
t := getMonitoredObject(w)
if t == nil {
err = r.Delete(ctx, w)
if err != nil {
logger.Error(err, "failed to delete workload")
}
return ctrl.Result{}, err
}
err = r.Get(ctx, types.NamespacedName{Name: t.GetName(), Namespace: t.GetNamespace()}, t)
// found object, nothing to do
@@ -68,20 +77,23 @@ func (r *WorkloadReconciler) SetupWithManager(mgr ctrl.Manager) error {
}
func getMonitoredObject(w *cozyv1alpha1.Workload) client.Object {
if strings.HasPrefix(w.Name, "pvc-") {
switch {
case strings.HasPrefix(w.Name, "pvc-"):
obj := &corev1.PersistentVolumeClaim{}
obj.Name = strings.TrimPrefix(w.Name, "pvc-")
obj.Namespace = w.Namespace
return obj
}
if strings.HasPrefix(w.Name, "svc-") {
case strings.HasPrefix(w.Name, "svc-"):
obj := &corev1.Service{}
obj.Name = strings.TrimPrefix(w.Name, "svc-")
obj.Namespace = w.Namespace
return obj
case strings.HasPrefix(w.Name, "pod-"):
obj := &corev1.Pod{}
obj.Name = strings.TrimPrefix(w.Name, "pod-")
obj.Namespace = w.Namespace
return obj
}
obj := &corev1.Pod{}
obj.Name = w.Name
obj.Namespace = w.Namespace
var obj client.Object
return obj
}

View File

@@ -0,0 +1,26 @@
package controller
import (
"testing"
cozyv1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
corev1 "k8s.io/api/core/v1"
)
func TestUnprefixedMonitoredObjectReturnsNil(t *testing.T) {
w := &cozyv1alpha1.Workload{}
w.Name = "unprefixed-name"
obj := getMonitoredObject(w)
if obj != nil {
t.Errorf(`getMonitoredObject(&Workload{Name: "%s"}) == %v, want nil`, w.Name, obj)
}
}
func TestPodMonitoredObject(t *testing.T) {
w := &cozyv1alpha1.Workload{}
w.Name = "pod-mypod"
obj := getMonitoredObject(w)
if pod, ok := obj.(*corev1.Pod); !ok || pod.Name != "mypod" {
t.Errorf(`getMonitoredObject(&Workload{Name: "%s"}) == %v, want &Pod{Name: "mypod"}`, w.Name, obj)
}
}

View File

@@ -116,15 +116,24 @@ func (r *WorkloadMonitorReconciler) reconcileServiceForMonitor(
resources := make(map[string]resource.Quantity)
q := resource.MustParse("0")
quantity := resource.MustParse("0")
for _, ing := range svc.Status.LoadBalancer.Ingress {
if ing.IP != "" {
q.Add(resource.MustParse("1"))
quantity.Add(resource.MustParse("1"))
}
}
resources["public-ips"] = q
var resourceLabel string
if svc.Annotations != nil {
var ok bool
resourceLabel, ok = svc.Annotations["metallb.universe.tf/ip-allocated-from-pool"]
if !ok {
resourceLabel = "default"
}
}
resourceLabel = fmt.Sprintf("%s.ipaddresspool.metallb.io/requests.ipaddresses", resourceLabel)
resources[resourceLabel] = quantity
_, err := ctrl.CreateOrUpdate(ctx, r.Client, workload, func() error {
// Update owner references with the new monitor
@@ -165,7 +174,12 @@ func (r *WorkloadMonitorReconciler) reconcilePVCForMonitor(
resources := make(map[string]resource.Quantity)
for resourceName, resourceQuantity := range pvc.Status.Capacity {
resources[resourceName.String()] = resourceQuantity
storageClass := "default"
if pvc.Spec.StorageClassName != nil || *pvc.Spec.StorageClassName == "" {
storageClass = *pvc.Spec.StorageClassName
}
resourceLabel := fmt.Sprintf("%s.storageclass.storage.k8s.io/requests.%s", storageClass, resourceName.String())
resources[resourceLabel] = resourceQuantity
}
_, err := ctrl.CreateOrUpdate(ctx, r.Client, workload, func() error {
@@ -198,15 +212,12 @@ func (r *WorkloadMonitorReconciler) reconcilePodForMonitor(
) error {
logger := log.FromContext(ctx)
// Combine both init containers and normal containers to sum resources properly
combinedContainers := append(pod.Spec.InitContainers, pod.Spec.Containers...)
// totalResources will store the sum of all container resource limits
// totalResources will store the sum of all container resource requests
totalResources := make(map[string]resource.Quantity)
// Iterate over all containers to aggregate their Limits
for _, container := range combinedContainers {
for name, qty := range container.Resources.Limits {
// Iterate over all containers to aggregate their requests
for _, container := range pod.Spec.Containers {
for name, qty := range container.Resources.Requests {
if existing, exists := totalResources[name.String()]; exists {
existing.Add(qty)
totalResources[name.String()] = existing
@@ -235,7 +246,7 @@ func (r *WorkloadMonitorReconciler) reconcilePodForMonitor(
workload := &cozyv1alpha1.Workload{
ObjectMeta: metav1.ObjectMeta{
Name: pod.Name,
Name: fmt.Sprintf("pod-%s", pod.Name),
Namespace: pod.Namespace,
},
}

View File

@@ -1,14 +1,8 @@
OUT=../../_out/repos/apps
TMP=../../_out/repos/apps/historical
OUT=../_out/repos/apps
TMP := $(shell mktemp -d)
repo:
rm -rf "$(OUT)"
mkdir -p "$(OUT)"
awk '$$3 != "HEAD" {print "mkdir -p $(TMP)/" $$1 "-" $$2}' versions_map | sh -ex
awk '$$3 != "HEAD" {print "git archive " $$3 " " $$1 " | tar -xf- --strip-components=1 -C $(TMP)/" $$1 "-" $$2 }' versions_map | sh -ex
helm package -d "$(OUT)" $$(find . $(TMP) -mindepth 2 -maxdepth 2 -name Chart.yaml | awk 'sub("/Chart.yaml", "")' | sort -V)
cd "$(OUT)" && helm repo index . --url http://cozystack.cozy-system.svc/repos/apps
rm -rf "$(TMP)"
cd .. && ../hack/package_chart.sh apps $(OUT) $(TMP) library
fix-chartnames:
find . -maxdepth 2 -name Chart.yaml | awk -F/ '{print $$2}' | while read i; do sed -i "s/^name: .*/name: $$i/" "$$i/Chart.yaml"; done

View File

@@ -0,0 +1,3 @@
# S3 bucket
## Parameters

View File

@@ -11,7 +11,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '*'
version: '>= 0.0.0-0'
interval: 1m0s
timeout: 5m0s
values:

View File

@@ -0,0 +1,5 @@
{
"title": "Chart Values",
"type": "object",
"properties": {}
}

View File

@@ -0,0 +1 @@
{}

View File

@@ -16,10 +16,15 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.7.0
version: 0.9.1
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "24.9.2"
dependencies:
- name: cozy-lib
version: 0.1.0
repository: "http://cozystack.cozy-system.svc/repos/library"

View File

@@ -7,8 +7,10 @@ generate:
readme-generator -v values.yaml -s values.schema.json -r README.md
image:
docker buildx build --platform linux/amd64 --build-arg ARCH=amd64 images/clickhouse-backup \
docker buildx build images/clickhouse-backup \
--provenance false \
--builder=$(BUILDER) \
--platform=$(PLATFORM) \
--tag $(REGISTRY)/clickhouse-backup:$(call settag,$(CLICKHOUSE_BACKUP_TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/clickhouse-backup:latest \
--cache-to type=inline \

View File

@@ -0,0 +1 @@
../../../library/cozy-lib

View File

@@ -1 +1 @@
ghcr.io/cozystack/cozystack/clickhouse-backup:0.7.0@sha256:3faf7a4cebf390b9053763107482de175aa0fdb88c1e77424fd81100b1c3a205
ghcr.io/cozystack/cozystack/clickhouse-backup:0.9.0@sha256:3faf7a4cebf390b9053763107482de175aa0fdb88c1e77424fd81100b1c3a205

View File

@@ -11,35 +11,34 @@ These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "128Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "256Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "512Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "500m" "memory" "1Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "1Gi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "1" "memory" "2Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "2Gi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "2" "memory" "4Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "4Gi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "4" "memory" "8Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "8Gi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}

View File

@@ -92,6 +92,9 @@ spec:
templates:
volumeClaimTemplates:
- name: data-volume-template
metadata:
labels:
app.kubernetes.io/instance: {{ .Release.Name }}
spec:
accessModes:
- ReadWriteOnce
@@ -99,6 +102,9 @@ spec:
requests:
storage: {{ .Values.size }}
- name: log-volume-template
metadata:
labels:
app.kubernetes.io/instance: {{ .Release.Name }}
spec:
accessModes:
- ReadWriteOnce
@@ -107,6 +113,9 @@ spec:
storage: {{ .Values.logStorageSize }}
podTemplates:
- name: clickhouse-per-host
metadata:
labels:
app.kubernetes.io/instance: {{ .Release.Name }}
spec:
affinity:
podAntiAffinity:
@@ -122,9 +131,9 @@ spec:
- name: clickhouse
image: clickhouse/clickhouse-server:24.9.2.42
{{- if .Values.resources }}
resources: {{- toYaml .Values.resources | nindent 16 }}
resources: {{- include "cozy-lib.resources.sanitize" (list .Values.resources $) | nindent 16 }}
{{- else if ne .Values.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.resourcesPreset "Release" .Release) | nindent 16 }}
resources: {{- include "cozy-lib.resources.preset" (list .Values.resourcesPreset $) | nindent 16 }}
{{- end }}
volumeMounts:
- name: data-volume-template
@@ -133,6 +142,9 @@ spec:
mountPath: /var/log/clickhouse-server
serviceTemplates:
- name: svc-template
metadata:
labels:
app.kubernetes.io/instance: {{ .Release.Name }}
generateName: chendpoint-{chi}
spec:
ports:

View File

@@ -9,5 +9,5 @@ spec:
kind: clickhouse
type: clickhouse
selector:
clickhouse.altinity.com/chi: {{ $.Release.Name }}
app.kubernetes.io/instance: {{ $.Release.Name }}
version: {{ $.Chart.Version }}

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.5.0
version: 0.6.1
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -0,0 +1 @@
../../../library/cozy-lib

View File

@@ -1 +1 @@
ghcr.io/cozystack/cozystack/postgres-backup:0.10.0@sha256:10179ed56457460d95cd5708db2a00130901255fa30c4dd76c65d2ef5622b61f
ghcr.io/cozystack/cozystack/postgres-backup:0.12.0@sha256:10179ed56457460d95cd5708db2a00130901255fa30c4dd76c65d2ef5622b61f

View File

@@ -11,35 +11,34 @@ These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "128Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "256Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "512Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "500m" "memory" "1Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "1Gi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "1" "memory" "2Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "2Gi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "2" "memory" "4Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "4Gi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "4" "memory" "8Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "8Gi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}

View File

@@ -2,6 +2,8 @@ apiVersion: v1
kind: Service
metadata:
name: {{ .Release.Name }}
labels:
app.kubernetes.io/instance: {{ .Release.Name }}
spec:
type: {{ ternary "LoadBalancer" "ClusterIP" .Values.external }}
{{- if .Values.external }}

View File

@@ -12,6 +12,7 @@ spec:
metadata:
labels:
app: {{ .Release.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
spec:
containers:
- name: ferretdb

View File

@@ -11,14 +11,17 @@ spec:
{{- $rawConstraints := get $configMap.data "globalAppTopologySpreadConstraints" }}
{{- if $rawConstraints }}
{{- $rawConstraints | fromYaml | toYaml | nindent 2 }}
labelSelector:
matchLabels:
cnpg.io/cluster: {{ .Release.Name }}-postgres
{{- end }}
{{- end }}
minSyncReplicas: {{ .Values.quorum.minSyncReplicas }}
maxSyncReplicas: {{ .Values.quorum.maxSyncReplicas }}
{{- if .Values.resources }}
resources: {{- toYaml .Values.resources | nindent 4 }}
resources: {{- include "cozy-lib.resources.sanitize" (list .Values.resources $) | nindent 4 }}
{{- else if ne .Values.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.resourcesPreset "Release" .Release) | nindent 4 }}
resources: {{- include "cozy-lib.resources.preset" (list .Values.resourcesPreset $) | nindent 4 }}
{{- end }}
monitoring:
enablePodMonitor: true
@@ -32,6 +35,7 @@ spec:
inheritedMetadata:
labels:
policy.cozystack.io/allow-to-apiserver: "true"
app.kubernetes.io/instance: {{ .Release.Name }}
{{- if .Values.users }}
managed:

View File

@@ -9,5 +9,5 @@ spec:
kind: ferretdb
type: ferretdb
selector:
app: {{ $.Release.Name }}
app.kubernetes.io/instance: {{ $.Release.Name }}
version: {{ $.Chart.Version }}

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.4.0
version: 0.5.1
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -6,8 +6,10 @@ include ../../../scripts/package.mk
image: image-nginx
image-nginx:
docker buildx build --platform linux/amd64 --build-arg ARCH=amd64 images/nginx-cache \
docker buildx build images/nginx-cache \
--provenance false \
--builder=$(BUILDER) \
--platform=$(PLATFORM) \
--tag $(REGISTRY)/nginx-cache:$(call settag,$(NGINX_CACHE_TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/nginx-cache:latest \
--cache-to type=inline \

View File

@@ -0,0 +1 @@
../../../library/cozy-lib

View File

@@ -1 +1 @@
ghcr.io/cozystack/cozystack/nginx-cache:0.4.0@sha256:0f4d8e6863ed074e90f8a7a8390ccd98dae0220119346aba19e85054bb902e2f
ghcr.io/cozystack/cozystack/nginx-cache:0.5.0@sha256:99cd04f09f80eb0c60cc0b2f6bc8180ada7ada00cb594606447674953dfa1b67

View File

@@ -1,4 +1,4 @@
FROM ubuntu:22.04 as stage
FROM ubuntu:22.04 AS stage
ARG NGINX_VERSION=1.25.3
ARG IP2LOCATION_C_VERSION=8.6.1
@@ -9,11 +9,15 @@ ARG FIFTYONEDEGREES_NGINX_VERSION=3.2.21.1
ARG NGINX_CACHE_PURGE_VERSION=2.5.3
ARG NGINX_VTS_VERSION=0.2.2
ARG TARGETOS
ARG TARGETARCH
# Install required packages for development
RUN apt-get update -q \
&& apt-get install -yq \
RUN apt update -q \
&& apt install -yq --no-install-recommends \
ca-certificates \
unzip \
autoconf \
automake \
build-essential \
libtool \
libpcre3 \
@@ -68,7 +72,7 @@ RUN checkinstall \
--default \
--pkgname=ip2location-c \
--pkgversion=${IP2LOCATION_C_VERSION} \
--pkgarch=amd64 \
--pkgarch=${TARGETARCH} \
--pkggroup=lib \
--pkgsource="https://github.com/chrislim2888/IP2Location-C-Library" \
--maintainer="Eduard Generalov <eduard@generalov.net>" \
@@ -97,7 +101,7 @@ RUN checkinstall \
--default \
--pkgname=ip2proxy-c \
--pkgversion=${IP2PROXY_C_VERSION} \
--pkgarch=amd64 \
--pkgarch=${TARGETARCH} \
--pkggroup=lib \
--pkgsource="https://github.com/ip2location/ip2proxy-c" \
--maintainer="Eduard Generalov <eduard@generalov.net>" \
@@ -144,7 +148,7 @@ RUN checkinstall \
--default \
--pkgname=nginx \
--pkgversion=$VERS \
--pkgarch=amd64 \
--pkgarch=${TARGETARCH} \
--pkggroup=web \
--provides=nginx \
--requires=ip2location-c,ip2proxy-c,libssl3,libc-bin,libc6,libzstd1,libpcre++0v5,libpcre16-3,libpcre2-8-0,libpcre3,libpcre32-3,libpcrecpp0v5,libmaxminddb0 \
@@ -165,10 +169,9 @@ COPY nginx-reloader.sh /usr/bin/nginx-reloader.sh
RUN set -x \
&& groupadd --system --gid 101 nginx \
&& useradd --system --gid nginx --no-create-home --home /nonexistent --comment "nginx user" --shell /bin/false --uid 101 nginx \
&& apt update \
&& apt-get install --no-install-recommends --no-install-suggests -y gnupg1 ca-certificates inotify-tools \
&& apt -y install /packages/*.deb \
&& apt-get clean \
&& apt update -q \
&& apt install -yq --no-install-recommends --no-install-suggests gnupg1 ca-certificates inotify-tools \
&& apt install -y /packages/*.deb \
&& rm -rf /var/lib/apt/lists/* \
&& mkdir -p /var/lib/nginx /var/log/nginx \
&& ln -sf /dev/stdout /var/log/nginx/access.log \

View File

@@ -11,35 +11,34 @@ These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "128Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "256Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "512Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "500m" "memory" "1Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "1Gi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "1" "memory" "2Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "2Gi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "2" "memory" "4Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "4Gi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "4" "memory" "8Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "8Gi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}

View File

@@ -34,9 +34,9 @@ spec:
- image: haproxy:latest
name: haproxy
{{- if .Values.haproxy.resources }}
resources: {{- toYaml .Values.haproxy.resources | nindent 10 }}
resources: {{- include "cozy-lib.resources.sanitize" (list .Values.haproxy.resources $) | nindent 10 }}
{{- else if ne .Values.haproxy.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.haproxy.resourcesPreset "Release" .Release) | nindent 10 }}
resources: {{- include "cozy-lib.resources.preset" (list .Values.haproxy.resourcesPreset $) | nindent 10 }}
{{- end }}
ports:
- containerPort: 8080

View File

@@ -53,9 +53,9 @@ spec:
containers:
- name: nginx
{{- if $.Values.nginx.resources }}
resources: {{- toYaml $.Values.nginx.resources | nindent 10 }}
resources: {{- include "cozy-lib.resources.sanitize" (list $.Values.nginx.resources $) | nindent 10 }}
{{- else if ne $.Values.nginx.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" $.Values.nginx.resourcesPreset "Release" $.Release) | nindent 10 }}
resources: {{- include "cozy-lib.resources.preset" (list $.Values.nginx.resourcesPreset $) | nindent 10 }}
{{- end }}
image: "{{ $.Files.Get "images/nginx-cache.tag" | trim }}"
readinessProbe:

View File

@@ -0,0 +1,39 @@
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
metadata:
name: {{ $.Release.Name }}-haproxy
spec:
replicas: {{ .Values.haproxy.replicas }}
minReplicas: 1
kind: http-cache
type: http-cache
selector:
app: {{ $.Release.Name }}-haproxy
version: {{ $.Chart.Version }}
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
metadata:
name: {{ $.Release.Name }}-nginx
spec:
replicas: {{ .Values.nginx.replicas }}
minReplicas: 1
kind: http-cache
type: http-cache
selector:
app: {{ $.Release.Name }}-nginx-cache
version: {{ $.Chart.Version }}
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
metadata:
name: {{ $.Release.Name }}
spec:
replicas: {{ .Values.replicas }}
minReplicas: 1
kind: http-cache
type: http-cache
selector:
app.kubernetes.io/instance: {{ $.Release.Name }}
version: {{ $.Chart.Version }}

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.5.0
version: 0.6.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

View File

@@ -0,0 +1 @@
../../../library/cozy-lib

View File

@@ -11,35 +11,34 @@ These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "128Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "256Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "512Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "500m" "memory" "1Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "1Gi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "1" "memory" "2Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "2Gi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "2" "memory" "4Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "4Gi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "4" "memory" "8Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "8Gi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}

View File

@@ -9,9 +9,9 @@ spec:
kafka:
replicas: {{ .Values.kafka.replicas }}
{{- if .Values.kafka.resources }}
resources: {{- toYaml .Values.kafka.resources | nindent 6 }}
resources: {{- include "cozy-lib.resources.sanitize" (list .Values.kafka.resources $) | nindent 6 }}
{{- else if ne .Values.kafka.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.kafka.resourcesPreset "Release" .Release) | nindent 6 }}
resources: {{- include "cozy-lib.resources.preset" (list .Values.kafka.resourcesPreset $) | nindent 6 }}
{{- end }}
listeners:
- name: plain
@@ -71,9 +71,9 @@ spec:
zookeeper:
replicas: {{ .Values.zookeeper.replicas }}
{{- if .Values.zookeeper.resources }}
resources: {{- toYaml .Values.zookeeper.resources | nindent 6 }}
resources: {{- include "cozy-lib.resources.sanitize" (list .Values.zookeeper.resources $) | nindent 6 }}
{{- else if ne .Values.zookeeper.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.zookeeper.resourcesPreset "Release" .Release) | nindent 6 }}
resources: {{- include "cozy-lib.resources.preset" (list .Values.zookeeper.resourcesPreset $) | nindent 6 }}
{{- end }}
storage:
type: persistent-claim

View File

@@ -16,10 +16,10 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.17.1
version: 0.23.0
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.30.1"
appVersion: 1.32.4

View File

@@ -1,4 +1,4 @@
UBUNTU_CONTAINER_DISK_TAG = v1.30.1
KUBERNETES_VERSION = v1.32
KUBERNETES_PKG_TAG = $(shell awk '$$1 == "version:" {print $$2}' Chart.yaml)
include ../../../scripts/common-envs.mk
@@ -6,27 +6,36 @@ include ../../../scripts/package.mk
generate:
readme-generator -v values.yaml -s values.schema.json -r README.md
yq -o json -i '.properties.controlPlane.properties.apiServer.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
yq -o json -i '.properties.controlPlane.properties.controllerManager.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
yq -o json -i '.properties.controlPlane.properties.scheduler.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
yq -o json -i '.properties.controlPlane.properties.konnectivity.properties.server.properties.resourcesPreset.enum = ["none","nano","micro","small","medium","large","xlarge","2xlarge"]' values.schema.json
image: image-ubuntu-container-disk image-kubevirt-cloud-provider image-kubevirt-csi-driver image-cluster-autoscaler
image-ubuntu-container-disk:
docker buildx build --platform linux/amd64 --build-arg ARCH=amd64 images/ubuntu-container-disk \
docker buildx build images/ubuntu-container-disk \
--provenance false \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(UBUNTU_CONTAINER_DISK_TAG)) \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(UBUNTU_CONTAINER_DISK_TAG)-$(TAG)) \
--builder=$(BUILDER) \
--platform=$(PLATFORM) \
--build-arg KUBERNETES_VERSION=${KUBERNETES_VERSION} \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(KUBERNETES_VERSION)) \
--tag $(REGISTRY)/ubuntu-container-disk:$(call settag,$(KUBERNETES_VERSION)-$(TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/ubuntu-container-disk:latest \
--cache-to type=inline \
--metadata-file images/ubuntu-container-disk.json \
--push=$(PUSH) \
--label "org.opencontainers.image.source=https://github.com/cozystack/cozystack" \
--load=$(LOAD)
echo "$(REGISTRY)/ubuntu-container-disk:$(call settag,$(UBUNTU_CONTAINER_DISK_TAG))@$$(yq e '."containerimage.digest"' images/ubuntu-container-disk.json -o json -r)" \
echo "$(REGISTRY)/ubuntu-container-disk:$(call settag,$(KUBERNETES_VERSION))@$$(yq e '."containerimage.digest"' images/ubuntu-container-disk.json -o json -r)" \
> images/ubuntu-container-disk.tag
rm -f images/ubuntu-container-disk.json
image-kubevirt-cloud-provider:
docker buildx build --platform linux/amd64 --build-arg ARCH=amd64 images/kubevirt-cloud-provider \
docker buildx build images/kubevirt-cloud-provider \
--provenance false \
--builder=$(BUILDER) \
--platform=$(PLATFORM) \
--tag $(REGISTRY)/kubevirt-cloud-provider:$(call settag,$(KUBERNETES_PKG_TAG)) \
--tag $(REGISTRY)/kubevirt-cloud-provider:$(call settag,$(KUBERNETES_PKG_TAG)-$(TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/kubevirt-cloud-provider:latest \
@@ -40,8 +49,10 @@ image-kubevirt-cloud-provider:
rm -f images/kubevirt-cloud-provider.json
image-kubevirt-csi-driver:
docker buildx build --platform linux/amd64 --build-arg ARCH=amd64 images/kubevirt-csi-driver \
docker buildx build images/kubevirt-csi-driver \
--provenance false \
--builder=$(BUILDER) \
--platform=$(PLATFORM) \
--tag $(REGISTRY)/kubevirt-csi-driver:$(call settag,$(KUBERNETES_PKG_TAG)) \
--tag $(REGISTRY)/kubevirt-csi-driver:$(call settag,$(KUBERNETES_PKG_TAG)-$(TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/kubevirt-csi-driver:latest \
@@ -56,8 +67,10 @@ image-kubevirt-csi-driver:
image-cluster-autoscaler:
docker buildx build --platform linux/amd64 --build-arg ARCH=amd64 images/cluster-autoscaler \
docker buildx build images/cluster-autoscaler \
--provenance false \
--builder=$(BUILDER) \
--platform=$(PLATFORM) \
--tag $(REGISTRY)/cluster-autoscaler:$(call settag,$(KUBERNETES_PKG_TAG)) \
--tag $(REGISTRY)/cluster-autoscaler:$(call settag,$(KUBERNETES_PKG_TAG)-$(TAG)) \
--cache-from type=registry,ref=$(REGISTRY)/cluster-autoscaler:latest \

View File

@@ -1,49 +1,200 @@
# Managed Kubernetes Service
## Overview
## Managed Kubernetes in Cozystack
The Managed Kubernetes Service offers a streamlined solution for efficiently managing server workloads. Kubernetes has emerged as the industry standard, providing a unified and accessible API, primarily utilizing YAML for configuration. This means that teams can easily understand and work with Kubernetes, streamlining infrastructure management.
Whenever you want to deploy a custom containerized application in Cozystack, it's best to deploy it to a managed Kubernetes cluster.
The Kubernetes leverages robust software design patterns, enabling continuous recovery in any scenario through the reconciliation method. Additionally, it ensures seamless scaling across a multitude of servers, addressing the challenges posed by complex and outdated APIs found in traditional virtualization platforms. This managed service eliminates the need for developing custom solutions or modifying source code, saving valuable time and effort.
Cozystack deploys and manages Kubernetes-as-a-service as standalone applications within each tenants isolated environment.
In Cozystack, such clusters are named tenant Kubernetes clusters, while the base Cozystack cluster is called a management or root cluster.
Tenant clusters are fully separated from the management cluster and are intended for deploying tenant-specific or customer-developed applications.
## Deployment Details
Within a tenant cluster, users can take advantage of LoadBalancer services and easily provision physical volumes as needed.
The control-plane operates within containers, while the worker nodes are deployed as virtual machines, all seamlessly managed by the application.
The managed Kubernetes service deploys a standard Kubernetes cluster utilizing the Cluster API, Kamaji as control-plane provicer and the KubeVirt infrastructure provider. This ensures a consistent and reliable setup for workloads.
## Why Use a Managed Kubernetes Cluster?
Within this cluster, users can take advantage of LoadBalancer services and easily provision physical volumes as needed. The control-plane operates within containers, while the worker nodes are deployed as virtual machines, all seamlessly managed by the application.
Kubernetes has emerged as the industry standard, providing a unified and accessible API, primarily utilizing YAML for configuration.
This means that teams can easily understand and work with Kubernetes, streamlining infrastructure management.
- Docs: https://github.com/clastix/kamaji
- Docs: https://cluster-api.sigs.k8s.io/
- GitHub: https://github.com/clastix/kamaji
- GitHub: https://github.com/kubernetes-sigs/cluster-api-provider-kubevirt
- GitHub: https://github.com/kubevirt/csi-driver
Kubernetes leverages robust software design patterns, enabling continuous recovery in any scenario through the reconciliation method.
Additionally, it ensures seamless scaling across a multitude of servers,
addressing the challenges posed by complex and outdated APIs found in traditional virtualization platforms.
This managed service eliminates the need for developing custom solutions or modifying source code, saving valuable time and effort.
The Managed Kubernetes Service in Cozystack offers a streamlined solution for efficiently managing server workloads.
## How-Tos
## Starting Work
How to access to deployed cluster:
Once the tenant Kubernetes cluster is ready, you can get a kubeconfig file to work with it.
It can be done via UI or a `kubectl` request:
```
kubectl get secret -n <namespace> kubernetes-<clusterName>-admin-kubeconfig -o go-template='{{ printf "%s\n" (index .data "super-admin.conf" | base64decode) }}' > test
- Open the Cozystack dashboard, switch to your tenant, find and open the application page. Copy one of the config files from the **Secrets** section.
- Run the following command (using the management cluster kubeconfig):
```bash
kubectl get secret -n tenant-<name> kubernetes-<clusterName>-admin-kubeconfig -o go-template='{{ printf "%s\n" (index .data "admin.conf" | base64decode) }}' > admin.conf
```
There are several kubeconfig options available:
- `admin.conf` — The standard kubeconfig for accessing your new cluster.
You can create additional Kubernetes users using this configuration.
- `admin.svc` — Same token as `admin.conf`, but with the API server address set to the internal service name.
Use it for applications running inside the cluster that need API access.
- `super-admin.conf` — Similar to `admin.conf`, but with extended administrative permissions.
Intended for troubleshooting and cluster maintenance tasks.
- `super-admin.svc` — Same as `super-admin.conf`, but pointing to the internal API server address.
## Implementation Details
A tenant Kubernetes cluster in Cozystack is essentially Kubernetes-in-Kubernetes.
Deploying it involves the following components:
- **Kamaji Control Plane**: [Kamaji](https://kamaji.clastix.io/) is an open-source project that facilitates the deployment
of Kubernetes control planes as pods within a root cluster.
Each control plane pod includes essential components like `kube-apiserver`, `controller-manager`, and `scheduler`,
allowing for efficient multi-tenancy and resource utilization.
- **Etcd Cluster**: A dedicated etcd cluster is deployed using Ænix's [etcd-operator](https://github.com/aenix-io/etcd-operator).
It provides reliable and scalable key-value storage for the Kubernetes control plane.
- **Worker Nodes**: Virtual Machines are provisioned to serve as worker nodes using KubeVirt.
These nodes are configured to join the tenant Kubernetes cluster, enabling the deployment and management of workloads.
- **Cluster API**: Cozystack is using the [Kubernetes Cluster API](https://cluster-api.sigs.k8s.io/) to provision the components of a cluster.
This architecture ensures isolated, scalable, and efficient tenant Kubernetes environments.
See the reference for components utilized in this service:
- [Kamaji Control Plane](https://kamaji.clastix.io)
- [Kamaji — Cluster API](https://kamaji.clastix.io/cluster-api/)
- [github.com/clastix/kamaji](https://github.com/clastix/kamaji)
- [KubeVirt](https://kubevirt.io/)
- [github.com/kubevirt/kubevirt](https://github.com/kubevirt/kubevirt)
- [github.com/aenix-io/etcd-operator](https://github.com/aenix-io/etcd-operator)
- [Kubernetes Cluster API](https://cluster-api.sigs.k8s.io/)
- [github.com/kubernetes-sigs/cluster-api-provider-kubevirt](https://github.com/kubernetes-sigs/cluster-api-provider-kubevirt)
- [github.com/kubevirt/csi-driver](https://github.com/kubevirt/csi-driver)
## Parameters
### Common Parameters
| Name | Description | Value |
| ----------------------------------- | ----------------------------------------------------------------------------------------------------------------- | ------------ |
| `host` | Hostname used to access the Kubernetes cluster externally. Defaults to `<cluster-name>.<tenant-host>` when empty. | `""` |
| `controlPlane.replicas` | Number of replicas for Kubernetes control-plane components. | `2` |
| `storageClass` | StorageClass used to store user data. | `replicated` |
| `useCustomSecretForPatchContainerd` | if true, for patch containerd will be used secret: {{ .Release.Name }}-patch-containerd | `false` |
| `nodeGroups` | nodeGroups configuration | `{}` |
### Cluster Addons
| Name | Description | Value |
| --------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `addons.certManager.enabled` | Enable cert-manager, which automatically creates and manages SSL/TLS certificates. | `false` |
| `addons.certManager.valuesOverride` | Custom values to override | `{}` |
| `addons.cilium.valuesOverride` | Custom values to override | `{}` |
| `addons.gatewayAPI.enabled` | Enable the Gateway API | `false` |
| `addons.ingressNginx.enabled` | Enable the Ingress-NGINX controller (requires nodes labeled with the 'ingress-nginx' role). | `false` |
| `addons.ingressNginx.valuesOverride` | Custom values to override | `{}` |
| `addons.ingressNginx.hosts` | List of domain names that the parent cluster should route to this tenant cluster. | `[]` |
| `addons.gpuOperator.enabled` | Enable the GPU-operator | `false` |
| `addons.gpuOperator.valuesOverride` | Custom values to override | `{}` |
| `addons.fluxcd.enabled` | Enable FluxCD | `false` |
| `addons.fluxcd.valuesOverride` | Custom values to override | `{}` |
| `addons.monitoringAgents.enabled` | Enable monitoring agents (Fluent Bit and VMAgents) to send logs and metrics. If tenant monitoring is enabled, data is sent to tenant storage; otherwise, it goes to root storage. | `false` |
| `addons.monitoringAgents.valuesOverride` | Custom values to override | `{}` |
| `addons.verticalPodAutoscaler.valuesOverride` | Custom values to override | `{}` |
### Kubernetes Control Plane Configuration
| Name | Description | Value |
| -------------------------------------------------- | ---------------------------------------------------------------------------- | ------- |
| `controlPlane.apiServer.resources` | Explicit CPU/memory resource requests and limits for the API server. | `{}` |
| `controlPlane.apiServer.resourcesPreset` | Use a common resources preset when `resources` is not set explicitly. | `small` |
| `controlPlane.controllerManager.resources` | Explicit CPU/memory resource requests and limits for the controller manager. | `{}` |
| `controlPlane.controllerManager.resourcesPreset` | Use a common resources preset when `resources` is not set explicitly. | `micro` |
| `controlPlane.scheduler.resources` | Explicit CPU/memory resource requests and limits for the scheduler. | `{}` |
| `controlPlane.scheduler.resourcesPreset` | Use a common resources preset when `resources` is not set explicitly. | `micro` |
| `controlPlane.konnectivity.server.resources` | Explicit CPU/memory resource requests and limits for the Konnectivity. | `{}` |
| `controlPlane.konnectivity.server.resourcesPreset` | Use a common resources preset when `resources` is not set explicitly. | `micro` |
In production environments, it's recommended to set `resources` explicitly.
Example of `controlPlane.*.resources`:
```yaml
resources:
limits:
cpu: 4000m
memory: 4Gi
requests:
cpu: 100m
memory: 512Mi
```
# Series
Allowed values for `controlPlane.*.resourcesPreset` are `none`, `nano`, `micro`, `small`, `medium`, `large`, `xlarge`, `2xlarge`.
This value is ignored if the corresponding `resources` value is set.
<!-- source: https://github.com/kubevirt/common-instancetypes/blob/main/README.md -->
## Resources Reference
. | U | O | CX | M | RT
----------------------------|-----|-----|------|-----|------
*Has GPUs* | | | | |
*Hugepages* | | | ✓ | ✓ | ✓
*Overcommitted Memory* | | ✓ | | |
*Dedicated CPU* | | | ✓ | | ✓
*Burstable CPU performance* | ✓ | ✓ | | ✓ |
*Isolated emulator threads* | | | ✓ | | ✓
*vNUMA* | | | ✓ | | ✓
*vCPU-To-Memory Ratio* | 1:4 | 1:4 | 1:2 | 1:8 | 1:4
### instanceType Resources
The following instanceType resources are provided by Cozystack:
## U Series
| Name | vCPUs | Memory |
|---------------|-------|--------|
| `cx1.2xlarge` | 8 | 16Gi |
| `cx1.4xlarge` | 16 | 32Gi |
| `cx1.8xlarge` | 32 | 64Gi |
| `cx1.large` | 2 | 4Gi |
| `cx1.medium` | 1 | 2Gi |
| `cx1.xlarge` | 4 | 8Gi |
| `gn1.2xlarge` | 8 | 32Gi |
| `gn1.4xlarge` | 16 | 64Gi |
| `gn1.8xlarge` | 32 | 128Gi |
| `gn1.xlarge` | 4 | 16Gi |
| `m1.2xlarge` | 8 | 64Gi |
| `m1.4xlarge` | 16 | 128Gi |
| `m1.8xlarge` | 32 | 256Gi |
| `m1.large` | 2 | 16Gi |
| `m1.xlarge` | 4 | 32Gi |
| `n1.2xlarge` | 16 | 32Gi |
| `n1.4xlarge` | 32 | 64Gi |
| `n1.8xlarge` | 64 | 128Gi |
| `n1.large` | 4 | 8Gi |
| `n1.medium` | 4 | 4Gi |
| `n1.xlarge` | 8 | 16Gi |
| `o1.2xlarge` | 8 | 32Gi |
| `o1.4xlarge` | 16 | 64Gi |
| `o1.8xlarge` | 32 | 128Gi |
| `o1.large` | 2 | 8Gi |
| `o1.medium` | 1 | 4Gi |
| `o1.micro` | 1 | 1Gi |
| `o1.nano` | 1 | 512Mi |
| `o1.small` | 1 | 2Gi |
| `o1.xlarge` | 4 | 16Gi |
| `rt1.2xlarge` | 8 | 32Gi |
| `rt1.4xlarge` | 16 | 64Gi |
| `rt1.8xlarge` | 32 | 128Gi |
| `rt1.large` | 2 | 8Gi |
| `rt1.medium` | 1 | 4Gi |
| `rt1.micro` | 1 | 1Gi |
| `rt1.small` | 1 | 2Gi |
| `rt1.xlarge` | 4 | 16Gi |
| `u1.2xlarge` | 8 | 32Gi |
| `u1.2xmedium` | 2 | 4Gi |
| `u1.4xlarge` | 16 | 64Gi |
| `u1.8xlarge` | 32 | 128Gi |
| `u1.large` | 2 | 8Gi |
| `u1.medium` | 1 | 4Gi |
| `u1.micro` | 1 | 1Gi |
| `u1.nano` | 1 | 512Mi |
| `u1.small` | 1 | 2Gi |
| `u1.xlarge` | 4 | 16Gi |
### U Series: Universal
The U Series is quite neutral and provides resources for
general purpose applications.
@@ -54,7 +205,7 @@ attitude towards workloads.
VMs of instance types will share physical CPU cores on a
time-slice basis with other VMs.
### U Series Characteristics
#### U Series Characteristics
Specific characteristics of this series are:
- *Burstable CPU performance* - The workload has a baseline compute
@@ -63,14 +214,14 @@ Specific characteristics of this series are:
- *vCPU-To-Memory Ratio (1:4)* - A vCPU-to-Memory ratio of 1:4, for less
noise per node.
## O Series
### O Series: Overcommitted
The O Series is based on the U Series, with the only difference
being that memory is overcommitted.
*O* is the abbreviation for "Overcommitted".
### UO Series Characteristics
#### O Series Characteristics
Specific characteristics of this series are:
- *Burstable CPU performance* - The workload has a baseline compute
@@ -81,7 +232,7 @@ Specific characteristics of this series are:
- *vCPU-To-Memory Ratio (1:4)* - A vCPU-to-Memory ratio of 1:4, for less
noise per node.
## CX Series
### CX Series: Compute Exclusive
The CX Series provides exclusive compute resources for compute
intensive applications.
@@ -95,7 +246,7 @@ the IO threading from cores dedicated to the workload.
In addition, in this series, the NUMA topology of the used
cores is provided to the VM.
### CX Series Characteristics
#### CX Series Characteristics
Specific characteristics of this series are:
- *Hugepages* - Hugepages are used in order to improve memory
@@ -110,14 +261,14 @@ Specific characteristics of this series are:
optimize guest sided cache utilization.
- *vCPU-To-Memory Ratio (1:2)* - A vCPU-to-Memory ratio of 1:2.
## M Series
### M Series: Memory
The M Series provides resources for memory intensive
applications.
*M* is the abbreviation of "Memory".
### M Series Characteristics
#### M Series Characteristics
Specific characteristics of this series are:
- *Hugepages* - Hugepages are used in order to improve memory
@@ -128,7 +279,7 @@ Specific characteristics of this series are:
- *vCPU-To-Memory Ratio (1:8)* - A vCPU-to-Memory ratio of 1:8, for much
less noise per node.
## RT Series
### RT Series: RealTime
The RT Series provides resources for realtime applications, like Oslat.
@@ -137,7 +288,7 @@ The RT Series provides resources for realtime applications, like Oslat.
This series of instance types requires nodes capable of running
realtime applications.
### RT Series Characteristics
#### RT Series Characteristics
Specific characteristics of this series are:
- *Hugepages* - Hugepages are used in order to improve memory
@@ -151,57 +302,3 @@ Specific characteristics of this series are:
- *vCPU-To-Memory Ratio (1:4)* - A vCPU-to-Memory ratio of 1:4 starting from
the medium size.
## Resources
The following instancetype resources are provided by Cozystack:
Name | vCPUs | Memory
-----|-------|-------
cx1.2xlarge | 8 | 16Gi
cx1.4xlarge | 16 | 32Gi
cx1.8xlarge | 32 | 64Gi
cx1.large | 2 | 4Gi
cx1.medium | 1 | 2Gi
cx1.xlarge | 4 | 8Gi
gn1.2xlarge | 8 | 32Gi
gn1.4xlarge | 16 | 64Gi
gn1.8xlarge | 32 | 128Gi
gn1.xlarge | 4 | 16Gi
m1.2xlarge | 8 | 64Gi
m1.4xlarge | 16 | 128Gi
m1.8xlarge | 32 | 256Gi
m1.large | 2 | 16Gi
m1.xlarge | 4 | 32Gi
n1.2xlarge | 16 | 32Gi
n1.4xlarge | 32 | 64Gi
n1.8xlarge | 64 | 128Gi
n1.large | 4 | 8Gi
n1.medium | 4 | 4Gi
n1.xlarge | 8 | 16Gi
o1.2xlarge | 8 | 32Gi
o1.4xlarge | 16 | 64Gi
o1.8xlarge | 32 | 128Gi
o1.large | 2 | 8Gi
o1.medium | 1 | 4Gi
o1.micro | 1 | 1Gi
o1.nano | 1 | 512Mi
o1.small | 1 | 2Gi
o1.xlarge | 4 | 16Gi
rt1.2xlarge | 8 | 32Gi
rt1.4xlarge | 16 | 64Gi
rt1.8xlarge | 32 | 128Gi
rt1.large | 2 | 8Gi
rt1.medium | 1 | 4Gi
rt1.micro | 1 | 1Gi
rt1.small | 1 | 2Gi
rt1.xlarge | 4 | 16Gi
u1.2xlarge | 8 | 32Gi
u1.2xmedium | 2 | 4Gi
u1.4xlarge | 16 | 64Gi
u1.8xlarge | 32 | 128Gi
u1.large | 2 | 8Gi
u1.medium | 1 | 4Gi
u1.micro | 1 | 1Gi
u1.nano | 1 | 512Mi
u1.small | 1 | 2Gi
u1.xlarge | 4 | 16Gi

View File

@@ -1 +1 @@
ghcr.io/cozystack/cozystack/cluster-autoscaler:0.17.0@sha256:85371c6aabf5a7fea2214556deac930c600e362f92673464fe2443784e2869c3
ghcr.io/cozystack/cozystack/cluster-autoscaler:0.21.0@sha256:7315850634728a5864a3de3150c12f0e1454f3f1ce33cdf21a278f57611dd5e9

View File

@@ -1,7 +1,14 @@
# Source: https://raw.githubusercontent.com/kubernetes/autoscaler/refs/heads/master/cluster-autoscaler/Dockerfile.amd64
ARG builder_image=docker.io/library/golang:1.23.4
ARG BASEIMAGE=gcr.io/distroless/static:nonroot-amd64
ARG BASEIMAGE=gcr.io/distroless/static:nonroot-${TARGETARCH}
FROM ${builder_image} AS builder
ARG TARGETOS
ARG TARGETARCH
ENV GOOS=$TARGETOS
ENV GOARCH=$TARGETARCH
RUN git clone https://github.com/kubernetes/autoscaler /src/autoscaler \
&& cd /src/autoscaler/cluster-autoscaler \
&& git checkout cluster-autoscaler-1.32.0
@@ -14,6 +21,8 @@ RUN make build
FROM $BASEIMAGE
LABEL maintainer="Marcin Wielgus <mwielgus@google.com>"
COPY --from=builder /src/autoscaler/cluster-autoscaler/cluster-autoscaler-amd64 /cluster-autoscaler
ARG TARGETARCH
COPY --from=builder /src/autoscaler/cluster-autoscaler/cluster-autoscaler-${TARGETARCH} /cluster-autoscaler
WORKDIR /
CMD ["/cluster-autoscaler"]

View File

@@ -1 +1 @@
ghcr.io/cozystack/cozystack/kubevirt-cloud-provider:0.17.0@sha256:53f4734109799da8b27f35a3b1afdb4746b5992f1d7b9d1c132ea6242cdd8cf0
ghcr.io/cozystack/cozystack/kubevirt-cloud-provider:0.21.0@sha256:6962bdf51ab2ff40b420b9cff7c850aeea02187da2a65a67f10e0471744649d7

View File

@@ -1,5 +1,10 @@
# Source: https://github.com/kubevirt/cloud-provider-kubevirt/blob/main/build/images/kubevirt-cloud-controller-manager/Dockerfile
FROM --platform=linux/amd64 golang:1.20.6 AS builder
FROM golang:1.20.6 AS builder
ARG TARGETOS
ARG TARGETARCH
ENV GOOS=$TARGETOS
ENV GOARCH=$TARGETARCH
RUN git clone https://github.com/kubevirt/cloud-provider-kubevirt /go/src/kubevirt.io/cloud-provider-kubevirt \
&& cd /go/src/kubevirt.io/cloud-provider-kubevirt \
@@ -14,7 +19,7 @@ RUN go get 'k8s.io/endpointslice/util@v0.28' 'k8s.io/apiserver@v0.28'
RUN go mod tidy
RUN go mod vendor
RUN CGO_ENABLED=0 GOOS=linux go build -mod=vendor -ldflags="-s -w" -o bin/kubevirt-cloud-controller-manager ./cmd/kubevirt-cloud-controller-manager
RUN CGO_ENABLED=0 go build -mod=vendor -ldflags="-s -w" -o bin/kubevirt-cloud-controller-manager ./cmd/kubevirt-cloud-controller-manager
FROM registry.access.redhat.com/ubi9/ubi-micro
COPY --from=builder /go/src/kubevirt.io/cloud-provider-kubevirt/bin/kubevirt-cloud-controller-manager /bin/kubevirt-cloud-controller-manager

View File

@@ -1 +1 @@
ghcr.io/cozystack/cozystack/kubevirt-csi-driver:0.17.0@sha256:1a6605d3bff6342e12bcc257e852a4f89e97e8af6d3d259930ec07c7ad5f001d
ghcr.io/cozystack/cozystack/kubevirt-csi-driver:0.21.0@sha256:b1525163cd21938ac934bb1b860f2f3151464fa463b82880ab058167aeaf3e29

View File

@@ -5,6 +5,11 @@ RUN git clone https://github.com/kubevirt/csi-driver /src/kubevirt-csi-driver \
&& cd /src/kubevirt-csi-driver \
&& git checkout 35836e0c8b68d9916d29a838ea60cdd3fc6199cf
ARG TARGETOS
ARG TARGETARCH
ENV GOOS=$TARGETOS
ENV GOARCH=$TARGETARCH
WORKDIR /src/kubevirt-csi-driver
RUN make build

View File

@@ -1 +1 @@
ghcr.io/cozystack/cozystack/ubuntu-container-disk:v1.30.1@sha256:d842de4637ea6188999464f133c89f63a3bd13f1cb202c10f1f8c0c1c3c3dbd4
ghcr.io/cozystack/cozystack/ubuntu-container-disk:v1.32@sha256:bfe568db4b768a4b6c67a8d562892bbba766d0245e140d431754589b347f0b41

View File

@@ -1,19 +1,26 @@
FROM ubuntu:22.04 as guestfish
# TODO: Here we use ubuntu:22.04, as guestfish has some network issues running in ubuntu:24.04
FROM ubuntu:22.04 AS guestfish
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
&& apt-get -y install \
libguestfs-tools \
linux-image-generic \
wget \
make \
bash-completion \
&& apt-get clean
bash-completion
WORKDIR /build
FROM guestfish as builder
FROM guestfish AS builder
RUN wget -O image.img https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
ARG TARGETOS
ARG TARGETARCH
# noble is a code name for the Ubuntu 24.04 LTS release
RUN wget -O image.img https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-${TARGETARCH}.img --show-progress --output-file /dev/stdout --progress=dot:giga 2>/dev/null
ARG KUBERNETES_VERSION
RUN qemu-img resize image.img 5G \
&& eval "$(guestfish --listen --network)" \
@@ -24,19 +31,21 @@ RUN qemu-img resize image.img 5G \
&& guestfish --remote command "resize2fs /dev/sda1" \
# docker repo
&& guestfish --remote sh "curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg" \
&& guestfish --remote sh 'echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list' \
&& guestfish --remote sh 'echo "deb [signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list' \
# kubernetes repo
&& guestfish --remote sh "curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg" \
&& guestfish --remote sh "echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list" \
&& guestfish --remote sh "curl -fsSL https://pkgs.k8s.io/core:/stable:/${KUBERNETES_VERSION}/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg" \
&& guestfish --remote sh "echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/${KUBERNETES_VERSION}/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list" \
&& guestfish --remote command "apt-get check -q" \
# install containerd
&& guestfish --remote command "apt-get update -y" \
&& guestfish --remote command "apt-get install -y containerd.io" \
&& guestfish --remote command "apt-get update -q" \
&& guestfish --remote command "apt-get install -yq containerd.io" \
# configure containerd
&& guestfish --remote command "mkdir -p /etc/containerd" \
&& guestfish --remote sh "containerd config default | tee /etc/containerd/config.toml" \
&& guestfish --remote command "sed -i '/SystemdCgroup/ s/=.*/= true/' /etc/containerd/config.toml" \
&& guestfish --remote command "containerd config dump >/dev/null" \
# install kubernetes
&& guestfish --remote command "apt-get install -y kubelet kubeadm" \
&& guestfish --remote command "apt-get install -yq kubelet kubeadm" \
# clean apt cache
&& guestfish --remote sh 'apt-get clean && rm -rf /var/lib/apt/lists/*' \
# write system configuration

View File

@@ -11,35 +11,34 @@ These presets are for basic testing and not meant to be used in production
{{ include "resources.preset" (dict "type" "nano") -}}
*/}}
{{- define "resources.preset" -}}
{{/* The limits are the requests increased by 50% (except ephemeral-storage and xlarge/2xlarge sizes)*/}}
{{- $presets := dict
"nano" (dict
"requests" (dict "cpu" "100m" "memory" "128Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "150m" "memory" "192Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "128Mi" "ephemeral-storage" "2Gi")
)
"micro" (dict
"requests" (dict "cpu" "250m" "memory" "256Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "375m" "memory" "384Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "256Mi" "ephemeral-storage" "2Gi")
)
"small" (dict
"requests" (dict "cpu" "500m" "memory" "512Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "768Mi" "ephemeral-storage" "2Gi")
"limits" (dict "memory" "512Mi" "ephemeral-storage" "2Gi")
)
"medium" (dict
"requests" (dict "cpu" "500m" "memory" "1024Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "750m" "memory" "1536Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "500m" "memory" "1Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "1Gi" "ephemeral-storage" "2Gi")
)
"large" (dict
"requests" (dict "cpu" "1.0" "memory" "2048Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "1.5" "memory" "3072Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "1" "memory" "2Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "2Gi" "ephemeral-storage" "2Gi")
)
"xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "3.0" "memory" "6144Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "2" "memory" "4Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "4Gi" "ephemeral-storage" "2Gi")
)
"2xlarge" (dict
"requests" (dict "cpu" "1.0" "memory" "3072Mi" "ephemeral-storage" "50Mi")
"limits" (dict "cpu" "6.0" "memory" "12288Mi" "ephemeral-storage" "2Gi")
"requests" (dict "cpu" "4" "memory" "8Gi" "ephemeral-storage" "50Mi")
"limits" (dict "memory" "8Gi" "ephemeral-storage" "2Gi")
)
}}
{{- if hasKey $presets .type -}}

View File

@@ -31,6 +31,16 @@ spec:
{{- end }}
cluster.x-k8s.io/deployment-name: {{ $.Release.Name }}-{{ .groupName }}
spec:
{{- $configMap := lookup "v1" "ConfigMap" "cozy-system" "cozystack-scheduling" }}
{{- if $configMap }}
{{- $rawConstraints := get $configMap.data "globalAppTopologySpreadConstraints" }}
{{- if $rawConstraints }}
{{- $rawConstraints | fromYaml | toYaml | nindent 10 }}
labelSelector:
matchLabels:
cluster.x-k8s.io/cluster-name: {{ $.Release.Name }}
{{- end }}
{{- end }}
domain:
{{- if and .group.resources .group.resources.cpu }}
cpu:
@@ -39,6 +49,13 @@ spec:
sockets: 1
{{- end }}
devices:
{{- if .group.gpus }}
gpus:
{{- range $i, $gpu := .group.gpus }}
- name: gpu{{ add $i 1 }}
deviceName: {{ $gpu.name }}
{{- end }}
{{- end }}
disks:
- name: system
disk:
@@ -103,22 +120,22 @@ metadata:
kamaji.clastix.io/kubeconfig-secret-key: "super-admin.svc"
spec:
apiServer:
{{- if .Values.kamajiControlPlane.apiServer.resources }}
resources: {{- toYaml .Values.kamajiControlPlane.apiServer.resources | nindent 6 }}
{{- else if ne .Values.kamajiControlPlane.apiServer.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.kamajiControlPlane.apiServer.resourcesPreset "Release" .Release) | nindent 6 }}
{{- if .Values.controlPlane.apiServer.resources }}
resources: {{- toYaml .Values.controlPlane.apiServer.resources | nindent 6 }}
{{- else if ne .Values.controlPlane.apiServer.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.controlPlane.apiServer.resourcesPreset "Release" .Release) | nindent 6 }}
{{- end }}
controllerManager:
{{- if .Values.kamajiControlPlane.controllerManager.resources }}
resources: {{- toYaml .Values.kamajiControlPlane.controllerManager.resources | nindent 6 }}
{{- else if ne .Values.kamajiControlPlane.controllerManager.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.kamajiControlPlane.controllerManager.resourcesPreset "Release" .Release) | nindent 6 }}
{{- if .Values.controlPlane.controllerManager.resources }}
resources: {{- toYaml .Values.controlPlane.controllerManager.resources | nindent 6 }}
{{- else if ne .Values.controlPlane.controllerManager.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.controlPlane.controllerManager.resourcesPreset "Release" .Release) | nindent 6 }}
{{- end }}
scheduler:
{{- if .Values.kamajiControlPlane.scheduler.resources }}
resources: {{- toYaml .Values.kamajiControlPlane.scheduler.resources | nindent 6 }}
{{- else if ne .Values.kamajiControlPlane.scheduler.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.kamajiControlPlane.scheduler.resourcesPreset "Release" .Release) | nindent 6 }}
{{- if .Values.controlPlane.scheduler.resources }}
resources: {{- toYaml .Values.controlPlane.scheduler.resources | nindent 6 }}
{{- else if ne .Values.controlPlane.scheduler.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.controlPlane.scheduler.resourcesPreset "Release" .Release) | nindent 6 }}
{{- end }}
dataStoreName: "{{ $etcd }}"
addons:
@@ -128,10 +145,10 @@ spec:
konnectivity:
server:
port: 8132
{{- if .Values.kamajiControlPlane.addons.konnectivity.server.resources }}
resources: {{- toYaml .Values.kamajiControlPlane.addons.konnectivity.server.resources | nindent 10 }}
{{- else if ne .Values.kamajiControlPlane.addons.konnectivity.server.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.kamajiControlPlane.addons.konnectivity.server.resourcesPreset "Release" .Release) | nindent 10 }}
{{- if .Values.controlPlane.konnectivity.server.resources }}
resources: {{- toYaml .Values.controlPlane.konnectivity.server.resources | nindent 10 }}
{{- else if ne .Values.controlPlane.konnectivity.server.resourcesPreset "none" }}
resources: {{- include "resources.preset" (dict "type" .Values.controlPlane.konnectivity.server.resourcesPreset "Release" .Release) | nindent 10 }}
{{- end }}
kubelet:
cgroupfs: systemd
@@ -143,14 +160,14 @@ spec:
ingress:
extraAnnotations:
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
hostname: {{ .Values.host | default (printf "%s.%s" .Release.Name $host) }}
hostname: {{ .Values.host | default (printf "%s.%s" .Release.Name $host) }}:443
className: "{{ $ingress }}"
deployment:
podAdditionalMetadata:
labels:
policy.cozystack.io/allow-to-etcd: "true"
replicas: 2
version: 1.30.1
version: {{ $.Chart.AppVersion }}
---
apiVersion: cozystack.io/v1alpha1
kind: WorkloadMonitor
@@ -194,12 +211,25 @@ spec:
- ["LABEL=ephemeral", "/ephemeral"]
- ["/ephemeral/kubelet", "/var/lib/kubelet", "none", "bind,nofail"]
- ["/ephemeral/containerd", "/var/lib/containerd", "none", "bind,nofail"]
{{- $sec := lookup "v1" "Secret" .Release.Namespace (printf "%s-patch-containerd" .Release.Name) }}
{{- if $sec }}
files:
{{- range $key, $_ := $sec.data }}
- path: /etc/containerd/certs.d/{{ trimSuffix ".toml" $key }}/hosts.toml
contentFrom:
secret:
name: {{ .Release.Name }}-patch-containerd
key: {{ $key }}
permissions: "0400"
{{- end }}
{{- end }}
preKubeadmCommands:
- sed -i 's|root:x:|root::|' /etc/passwd
- systemctl stop containerd.service
- mkdir -p /ephemeral/kubelet /ephemeral/containerd
- mount -o bind /ephemeral/kubelet /var/lib/kubelet
- mount -o bind /ephemeral/containerd /var/lib/containerd
- sudo sed -i '/\[plugins."io.containerd.grpc.v1.cri".registry\]/,/^\[/ s|^\(\s*config_path\s*=\s*\).*|\1"/etc/containerd/certs.d"|' /etc/containerd/config.toml
- systemctl start containerd.service
joinConfiguration:
nodeRegistration:
@@ -276,7 +306,7 @@ spec:
kind: KubevirtMachineTemplate
name: {{ $.Release.Name }}-{{ $groupName }}-{{ $kubevirtmachinetemplateHash }}
namespace: {{ $.Release.Namespace }}
version: v1.30.1
version: v{{ $.Chart.AppVersion }}
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck

View File

@@ -0,0 +1,15 @@
{{- if not .Values.useCustomSecretForPatchContainerd }}
{{- $sourceSecret := lookup "v1" "Secret" "cozy-system" "patch-containerd" }}
{{- if $sourceSecret }}
apiVersion: v1
kind: Secret
metadata:
name: {{ .Release.Name }}-patch-containerd
namespace: {{ .Release.Namespace }}
type: {{ $sourceSecret.type }}
data:
{{- range $key, $value := $sourceSecret.data }}
{{ printf "%s: %s" $key ($value | quote) | indent 2 }}
{{- end }}
{{- end }}
{{- end }}

View File

@@ -4,7 +4,7 @@ metadata:
name: {{ .Release.Name }}-cert-manager-crds
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: cert-manager-crds
@@ -16,6 +16,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig

View File

@@ -5,7 +5,7 @@ metadata:
name: {{ .Release.Name }}-cert-manager
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: cert-manager
@@ -17,6 +17,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
@@ -30,11 +31,9 @@ spec:
upgrade:
remediation:
retries: -1
{{- if .Values.addons.certManager.valuesOverride }}
valuesFrom:
- kind: Secret
name: {{ .Release.Name }}-cert-manager-values-override
valuesKey: values
{{- with .Values.addons.certManager.valuesOverride }}
values:
{{- toYaml . | nindent 4 }}
{{- end }}
dependsOn:
@@ -47,13 +46,3 @@ spec:
- name: {{ .Release.Name }}-cert-manager-crds
namespace: {{ .Release.Namespace }}
{{- end }}
{{- if .Values.addons.certManager.valuesOverride }}
---
apiVersion: v1
kind: Secret
metadata:
name: {{ .Release.Name }}-cert-manager-values-override
stringData:
values: |
{{- toYaml .Values.addons.certManager.valuesOverride | nindent 4 }}
{{- end }}

View File

@@ -1,10 +1,25 @@
{{- define "cozystack.defaultCiliumValues" -}}
cilium:
k8sServiceHost: {{ .Release.Name }}.{{ .Release.Namespace }}.svc
k8sServicePort: 6443
routingMode: tunnel
enableIPv4Masquerade: true
ipv4NativeRoutingCIDR: ""
{{- if $.Values.addons.gatewayAPI.enabled }}
gatewayAPI:
enabled: true
envoy:
enabled: true
{{- end }}
{{- end }}
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: {{ .Release.Name }}-cilium
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: cilium
@@ -16,6 +31,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
@@ -30,14 +46,13 @@ spec:
remediation:
retries: -1
values:
cilium:
k8sServiceHost: {{ .Release.Name }}.{{ .Release.Namespace }}.svc
k8sServicePort: 6443
routingMode: tunnel
enableIPv4Masquerade: true
ipv4NativeRoutingCIDR: ""
{{- toYaml (deepCopy .Values.addons.cilium.valuesOverride | mergeOverwrite (fromYaml (include "cozystack.defaultCiliumValues" .))) | nindent 4 }}
dependsOn:
{{- if lookup "helm.toolkit.fluxcd.io/v2" "HelmRelease" .Release.Namespace .Release.Name }}
- name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
{{- end }}
{{- if $.Values.addons.gatewayAPI.enabled }}
- name: {{ .Release.Name }}-gateway-api-crds
namespace: {{ .Release.Namespace }}
{{- end }}

View File

@@ -4,7 +4,7 @@ metadata:
name: {{ .Release.Name }}-csi
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: csi
@@ -16,6 +16,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig

View File

@@ -20,7 +20,7 @@ spec:
effect: "NoSchedule"
containers:
- name: kubectl
image: docker.io/clastix/kubectl:v1.30.1
image: docker.io/clastix/kubectl:v1.32
command:
- /bin/sh
- -c
@@ -30,11 +30,16 @@ spec:
patch
helmrelease
{{ .Release.Name }}-cilium
{{ .Release.Name }}-gateway-api-crds
{{ .Release.Name }}-csi
{{ .Release.Name }}-cert-manager
{{ .Release.Name }}-cert-manager-crds
{{ .Release.Name }}-vertical-pod-autoscaler
{{ .Release.Name }}-vertical-pod-autoscaler-crds
{{ .Release.Name }}-ingress-nginx
{{ .Release.Name }}-fluxcd-operator
{{ .Release.Name }}-fluxcd
{{ .Release.Name }}-gpu-operator
-p '{"spec": {"suspend": true}}'
--type=merge --field-manager=flux-client-side-apply || true
---
@@ -67,9 +72,13 @@ rules:
- {{ .Release.Name }}-cilium
- {{ .Release.Name }}-csi
- {{ .Release.Name }}-cert-manager
- {{ .Release.Name }}-cert-manager-crds
- {{ .Release.Name }}-vertical-pod-autoscaler
- {{ .Release.Name }}-vertical-pod-autoscaler-crds
- {{ .Release.Name }}-ingress-nginx
- {{ .Release.Name }}-fluxcd-operator
- {{ .Release.Name }}-fluxcd
- {{ .Release.Name }}-gpu-operator
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding

View File

@@ -5,7 +5,7 @@ metadata:
name: {{ .Release.Name }}-fluxcd-operator
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: fluxcd-operator
@@ -17,6 +17,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
@@ -49,7 +50,7 @@ metadata:
name: {{ .Release.Name }}-fluxcd
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: fluxcd
@@ -61,6 +62,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-kubeconfig
@@ -73,11 +75,9 @@ spec:
upgrade:
remediation:
retries: -1
{{- if .Values.addons.fluxcd.valuesOverride }}
valuesFrom:
- kind: Secret
name: {{ .Release.Name }}-fluxcd-values-override
valuesKey: values
{{- with .Values.addons.fluxcd.valuesOverride }}
values:
{{- toYaml . | nindent 4 }}
{{- end }}
dependsOn:
{{- if lookup "helm.toolkit.fluxcd.io/v2" "HelmRelease" .Release.Namespace .Release.Name }}
@@ -89,14 +89,3 @@ spec:
- name: {{ .Release.Name }}-fluxcd-operator
namespace: {{ .Release.Namespace }}
{{- end }}
{{- if .Values.addons.fluxcd.valuesOverride }}
---
apiVersion: v1
kind: Secret
metadata:
name: {{ .Release.Name }}-fluxcd-values-override
stringData:
values: |
{{- toYaml .Values.addons.fluxcd.valuesOverride | nindent 4 }}
{{- end }}

View File

@@ -0,0 +1,39 @@
{{- if $.Values.addons.gatewayAPI.enabled }}
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: {{ .Release.Name }}-gateway-api-crds
labels:
cozystack.io/repository: system
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: gateway-api-crds
chart:
spec:
chart: cozy-gateway-api-crds
reconcileStrategy: Revision
sourceRef:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
key: super-admin.svc
targetNamespace: kube-system
storageNamespace: kube-system
install:
createNamespace: false
remediation:
retries: -1
upgrade:
remediation:
retries: -1
dependsOn:
{{- if lookup "helm.toolkit.fluxcd.io/v2" "HelmRelease" .Release.Namespace .Release.Name }}
- name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
{{- end }}
{{- end }}

View File

@@ -0,0 +1,46 @@
{{- if .Values.addons.gpuOperator.enabled }}
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: {{ .Release.Name }}-gpu-operator
labels:
cozystack.io/repository: system
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: gpu-operator
chart:
spec:
chart: cozy-gpu-operator
reconcileStrategy: Revision
sourceRef:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
key: super-admin.svc
targetNamespace: cozy-gpu-operator
storageNamespace: cozy-gpu-operator
install:
createNamespace: true
remediation:
retries: -1
upgrade:
remediation:
retries: -1
{{- with .Values.addons.gpuOperator.valuesOverride }}
values:
{{- toYaml . | nindent 4 }}
{{- end }}
dependsOn:
{{- if lookup "helm.toolkit.fluxcd.io/v2" "HelmRelease" .Release.Namespace .Release.Name }}
- name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
{{- end }}
- name: {{ .Release.Name }}-cilium
namespace: {{ .Release.Namespace }}
{{- end }}

View File

@@ -1,3 +1,20 @@
{{- define "cozystack.defaultIngressValues" -}}
ingress-nginx:
fullnameOverride: ingress-nginx
controller:
kind: DaemonSet
hostNetwork: true
service:
enabled: false
{{- if not .Values.addons.certManager.enabled }}
admissionWebhooks:
certManager:
enabled: false
{{- end }}
nodeSelector:
node-role.kubernetes.io/ingress-nginx: ""
{{- end }}
{{- if .Values.addons.ingressNginx.enabled }}
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
@@ -5,7 +22,7 @@ metadata:
name: {{ .Release.Name }}-ingress-nginx
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: ingress-nginx
@@ -17,6 +34,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
@@ -31,21 +49,7 @@ spec:
remediation:
retries: -1
values:
ingress-nginx:
fullnameOverride: ingress-nginx
controller:
kind: DaemonSet
hostNetwork: true
service:
enabled: false
nodeSelector:
node-role.kubernetes.io/ingress-nginx: ""
{{- if .Values.addons.ingressNginx.valuesOverride }}
valuesFrom:
- kind: Secret
name: {{ .Release.Name }}-ingress-nginx-values-override
valuesKey: values
{{- end }}
{{- toYaml (deepCopy .Values.addons.ingressNginx.valuesOverride | mergeOverwrite (fromYaml (include "cozystack.defaultIngressValues" .))) | nindent 4 }}
dependsOn:
{{- if lookup "helm.toolkit.fluxcd.io/v2" "HelmRelease" .Release.Namespace .Release.Name }}
- name: {{ .Release.Name }}
@@ -54,14 +58,3 @@ spec:
- name: {{ .Release.Name }}-cilium
namespace: {{ .Release.Namespace }}
{{- end }}
{{- if .Values.addons.ingressNginx.valuesOverride }}
---
apiVersion: v1
kind: Secret
metadata:
name: {{ .Release.Name }}-ingress-nginx-values-override
stringData:
values: |
{{- toYaml .Values.addons.ingressNginx.valuesOverride | nindent 4 }}
{{- end }}

View File

@@ -7,7 +7,7 @@ metadata:
name: {{ .Release.Name }}-monitoring-agents
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: cozy-monitoring-agents
@@ -19,6 +19,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
@@ -38,10 +39,10 @@ spec:
- name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
{{- end }}
- name: {{ .Release.Name }}-cilium
namespace: {{ .Release.Namespace }}
- name: {{ .Release.Name }}-cozy-victoria-metrics-operator
namespace: {{ .Release.Namespace }}
- name: {{ .Release.Name }}-vertical-pod-autoscaler-crds
namespace: {{ .Release.Namespace }}
values:
vmagent:
externalLabels:

View File

@@ -0,0 +1,42 @@
{{- if .Values.addons.monitoringAgents.enabled }}
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: {{ .Release.Name }}-vertical-pod-autoscaler-crds
labels:
cozystack.io/repository: system
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: vertical-pod-autoscaler-crds
chart:
spec:
chart: cozy-vertical-pod-autoscaler-crds
reconcileStrategy: Revision
sourceRef:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
key: super-admin.svc
targetNamespace: cozy-vertical-pod-autoscaler-crds
storageNamespace: cozy-vertical-pod-autoscaler-crds
install:
createNamespace: true
remediation:
retries: -1
upgrade:
remediation:
retries: -1
dependsOn:
{{- if lookup "helm.toolkit.fluxcd.io/v2" "HelmRelease" .Release.Namespace .Release.Name }}
- name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
{{- end }}
- name: {{ .Release.Name }}-cilium
namespace: {{ .Release.Namespace }}
{{- end }}

View File

@@ -0,0 +1,68 @@
{{- define "cozystack.defaultVPAValues" -}}
{{- $myNS := lookup "v1" "Namespace" "" .Release.Namespace }}
{{- $targetTenant := index $myNS.metadata.annotations "namespace.cozystack.io/monitoring" }}
vertical-pod-autoscaler:
recommender:
extraArgs:
container-name-label: container
container-namespace-label: namespace
container-pod-name-label: pod
storage: prometheus
memory-saver: true
pod-label-prefix: label_
metric-for-pod-labels: kube_pod_labels{job="kube-state-metrics", tenant="{{ .Release.Namespace }}", cluster="{{ .Release.Name }}"}[8d]
pod-name-label: pod
pod-namespace-label: namespace
prometheus-address: http://vmselect-shortterm.{{ $targetTenant }}.svc.cozy.local:8481/select/0/prometheus/
prometheus-cadvisor-job-name: cadvisor
resources:
limits:
memory: 1600Mi
requests:
cpu: 100m
memory: 1600Mi
{{- end }}
{{- if .Values.addons.monitoringAgents.enabled }}
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
name: {{ .Release.Name }}-vertical-pod-autoscaler
labels:
cozystack.io/repository: system
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: vertical-pod-autoscaler
chart:
spec:
chart: cozy-vertical-pod-autoscaler
reconcileStrategy: Revision
sourceRef:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig
key: super-admin.svc
targetNamespace: cozy-vertical-pod-autoscaler
storageNamespace: cozy-vertical-pod-autoscaler
install:
createNamespace: true
remediation:
retries: -1
upgrade:
remediation:
retries: -1
values:
{{- toYaml (deepCopy .Values.addons.verticalPodAutoscaler.valuesOverride | mergeOverwrite (fromYaml (include "cozystack.defaultVPAValues" .))) | nindent 4 }}
dependsOn:
{{- if lookup "helm.toolkit.fluxcd.io/v2" "HelmRelease" .Release.Namespace .Release.Name }}
- name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
{{- end }}
- name: {{ .Release.Name }}-monitoring-agents
namespace: {{ .Release.Namespace }}
{{- end }}

View File

@@ -5,7 +5,7 @@ metadata:
name: {{ .Release.Name }}-cozy-victoria-metrics-operator
labels:
cozystack.io/repository: system
coztstack.io/target-cluster-name: {{ .Release.Name }}
cozystack.io/target-cluster-name: {{ .Release.Name }}
spec:
interval: 5m
releaseName: cozy-victoria-metrics-operator
@@ -17,6 +17,7 @@ spec:
kind: HelmRepository
name: cozystack-system
namespace: cozy-system
version: '>= 0.0.0-0'
kubeConfig:
secretRef:
name: {{ .Release.Name }}-admin-kubeconfig

View File

@@ -1,97 +1,252 @@
{
"title": "Chart Values",
"type": "object",
"properties": {
"host": {
"type": "string",
"description": "The hostname used to access the Kubernetes cluster externally (defaults to using the cluster name as a subdomain for the tenant host).",
"default": ""
"title": "Chart Values",
"type": "object",
"properties": {
"host": {
"type": "string",
"description": "Hostname used to access the Kubernetes cluster externally. Defaults to `<cluster-name>.<tenant-host>` when empty.",
"default": ""
},
"controlPlane": {
"type": "object",
"properties": {
"replicas": {
"type": "number",
"description": "Number of replicas for Kubernetes control-plane components.",
"default": 2
},
"controlPlane": {
"type": "object",
"properties": {
"replicas": {
"type": "number",
"description": "Number of replicas for Kubernetes contorl-plane components",
"default": 2
}
"apiServer": {
"type": "object",
"properties": {
"resources": {
"type": "object",
"description": "Explicit CPU/memory resource requests and limits for the API server.",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Use a common resources preset when `resources` is not set explicitly.",
"default": "small",
"enum": [
"none",
"nano",
"micro",
"small",
"medium",
"large",
"xlarge",
"2xlarge"
]
}
}
},
"storageClass": {
"type": "string",
"description": "StorageClass used to store user data",
"default": "replicated"
},
"addons": {
"type": "object",
"properties": {
"certManager": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enables the cert-manager",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
},
"ingressNginx": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enable Ingress-NGINX controller (expect nodes with 'ingress-nginx' role)",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
},
"hosts": {
"type": "array",
"description": "List of domain names that should be passed through to the cluster by upper cluster",
"default": [],
"items": {}
}
}
},
"fluxcd": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enables Flux CD",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
},
"monitoringAgents": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enables MonitoringAgents (fluentbit, vmagents for sending logs and metrics to storage) if tenant monitoring enabled, send to tenant storage, else to root storage",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
}
"controllerManager": {
"type": "object",
"properties": {
"resources": {
"type": "object",
"description": "Explicit CPU/memory resource requests and limits for the controller manager.",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Use a common resources preset when `resources` is not set explicitly.",
"default": "micro",
"enum": [
"none",
"nano",
"micro",
"small",
"medium",
"large",
"xlarge",
"2xlarge"
]
}
}
},
"scheduler": {
"type": "object",
"properties": {
"resources": {
"type": "object",
"description": "Explicit CPU/memory resource requests and limits for the scheduler.",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Use a common resources preset when `resources` is not set explicitly.",
"default": "micro",
"enum": [
"none",
"nano",
"micro",
"small",
"medium",
"large",
"xlarge",
"2xlarge"
]
}
}
},
"konnectivity": {
"type": "object",
"properties": {
"server": {
"type": "object",
"properties": {
"resources": {
"type": "object",
"description": "Explicit CPU/memory resource requests and limits for the Konnectivity.",
"default": {}
},
"resourcesPreset": {
"type": "string",
"description": "Use a common resources preset when `resources` is not set explicitly.",
"default": "micro",
"enum": [
"none",
"nano",
"micro",
"small",
"medium",
"large",
"xlarge",
"2xlarge"
]
}
}
}
}
}
}
},
"storageClass": {
"type": "string",
"description": "StorageClass used to store user data.",
"default": "replicated"
},
"useCustomSecretForPatchContainerd": {
"type": "boolean",
"description": "if true, for patch containerd will be used secret: {{ .Release.Name }}-patch-containerd",
"default": false
},
"addons": {
"type": "object",
"properties": {
"certManager": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enable cert-manager, which automatically creates and manages SSL/TLS certificates.",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
},
"cilium": {
"type": "object",
"properties": {
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
},
"gatewayAPI": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enable the Gateway API",
"default": false
}
}
},
"ingressNginx": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enable the Ingress-NGINX controller (requires nodes labeled with the 'ingress-nginx' role).",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
},
"hosts": {
"type": "array",
"description": "List of domain names that the parent cluster should route to this tenant cluster.",
"default": [],
"items": {}
}
}
},
"gpuOperator": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enable the GPU-operator",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
},
"fluxcd": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enable FluxCD",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
},
"monitoringAgents": {
"type": "object",
"properties": {
"enabled": {
"type": "boolean",
"description": "Enable monitoring agents (Fluent Bit and VMAgents) to send logs and metrics. If tenant monitoring is enabled, data is sent to tenant storage; otherwise, it goes to root storage.",
"default": false
},
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
},
"verticalPodAutoscaler": {
"type": "object",
"properties": {
"valuesOverride": {
"type": "object",
"description": "Custom values to override",
"default": {}
}
}
}
}
}
}
}

View File

@@ -1,13 +1,13 @@
## @section Common parameters
## @section Common Parameters
## @param host The hostname used to access the Kubernetes cluster externally (defaults to using the cluster name as a subdomain for the tenant host).
## @param controlPlane.replicas Number of replicas for Kubernetes contorl-plane components
## @param storageClass StorageClass used to store user data
## @param host Hostname used to access the Kubernetes cluster externally. Defaults to `<cluster-name>.<tenant-host>` when empty.
## @param controlPlane.replicas Number of replicas for Kubernetes control-plane components.
## @param storageClass StorageClass used to store user data.
## @param useCustomSecretForPatchContainerd if true, for patch containerd will be used secret: {{ .Release.Name }}-patch-containerd
##
host: ""
controlPlane:
replicas: 2
storageClass: replicated
useCustomSecretForPatchContainerd: false
## @param nodeGroups [object] nodeGroups configuration
##
@@ -24,6 +24,14 @@ nodeGroups:
cpu: ""
memory: ""
## List of GPUs to attach (WARN: NVIDIA driver requires at least 4 GiB of RAM)
## e.g:
## instanceType: "u1.xlarge"
## gpus:
## - name: nvidia.com/AD102GL_L40S
gpus: []
## @section Cluster Addons
##
addons:
@@ -31,19 +39,31 @@ addons:
## Cert-manager: automatically creates and manages SSL/TLS certificate
##
certManager:
## @param addons.certManager.enabled Enables the cert-manager
## @param addons.certManager.enabled Enable cert-manager, which automatically creates and manages SSL/TLS certificates.
## @param addons.certManager.valuesOverride Custom values to override
enabled: false
valuesOverride: {}
## Cilium CNI plugin
##
cilium:
## @param addons.cilium.valuesOverride Custom values to override
valuesOverride: {}
## Gateway API
##
gatewayAPI:
## @param addons.gatewayAPI.enabled Enable the Gateway API
enabled: false
## Ingress-NGINX Controller
##
ingressNginx:
## @param addons.ingressNginx.enabled Enable Ingress-NGINX controller (expect nodes with 'ingress-nginx' role)
## @param addons.ingressNginx.enabled Enable the Ingress-NGINX controller (requires nodes labeled with the 'ingress-nginx' role).
## @param addons.ingressNginx.valuesOverride Custom values to override
##
enabled: false
## @param addons.ingressNginx.hosts List of domain names that should be passed through to the cluster by upper cluster
## @param addons.ingressNginx.hosts List of domain names that the parent cluster should route to this tenant cluster.
## e.g:
## hosts:
## - example.org
@@ -52,10 +72,18 @@ addons:
hosts: []
valuesOverride: {}
## GPU-operator: NVIDIA GPU Operator
##
gpuOperator:
## @param addons.gpuOperator.enabled Enable the GPU-operator
## @param addons.gpuOperator.valuesOverride Custom values to override
enabled: false
valuesOverride: {}
## Flux CD
##
fluxcd:
## @param addons.fluxcd.enabled Enables Flux CD
## @param addons.fluxcd.enabled Enable FluxCD
## @param addons.fluxcd.valuesOverride Custom values to override
##
enabled: false
@@ -64,68 +92,55 @@ addons:
## MonitoringAgents
##
monitoringAgents:
## @param addons.monitoringAgents.enabled Enables MonitoringAgents (fluentbit, vmagents for sending logs and metrics to storage) if tenant monitoring enabled, send to tenant storage, else to root storage
## @param addons.monitoringAgents.enabled Enable monitoring agents (Fluent Bit and VMAgents) to send logs and metrics. If tenant monitoring is enabled, data is sent to tenant storage; otherwise, it goes to root storage.
## @param addons.monitoringAgents.valuesOverride Custom values to override
##
enabled: false
valuesOverride: {}
## @section Kamaji control plane
## VerticalPodAutoscaler
##
verticalPodAutoscaler:
## @param addons.verticalPodAutoscaler.valuesOverride Custom values to override
##
valuesOverride: {}
## @section Kubernetes Control Plane Configuration
##
kamajiControlPlane:
controlPlane:
replicas: 2
apiServer:
## @param kamajiControlPlane.apiServer.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param kamajiControlPlane.apiServer.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
## @param controlPlane.apiServer.resources Explicit CPU/memory resource requests and limits for the API server.
## @param controlPlane.apiServer.resourcesPreset Use a common resources preset when `resources` is not set explicitly.
## e.g:
## resources:
## limits:
## cpu: 4000m
## memory: 4Gi
## requests:
## cpu: 100m
## memory: 512Mi
##
resourcesPreset: "small"
resources: {}
controllerManager:
## @param kamajiControlPlane.controllerManager.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param kamajiControlPlane.controllerManager.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
## @param controlPlane.controllerManager.resources Explicit CPU/memory resource requests and limits for the controller manager.
## @param controlPlane.controllerManager.resourcesPreset Use a common resources preset when `resources` is not set explicitly.
resourcesPreset: "micro"
resources: {}
scheduler:
## @param kamajiControlPlane.scheduler.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param kamajiControlPlane.scheduler.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
## @param controlPlane.scheduler.resources Explicit CPU/memory resource requests and limits for the scheduler.
## @param controlPlane.scheduler.resourcesPreset Use a common resources preset when `resources` is not set explicitly.
resourcesPreset: "micro"
addons:
konnectivity:
server:
## @param kamajiControlPlane.addons.konnectivity.server.resources Resources
resources: {}
# resources:
# limits:
# cpu: 4000m
# memory: 4Gi
# requests:
# cpu: 100m
# memory: 512Mi
## @param kamajiControlPlane.addons.konnectivity.server.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
resourcesPreset: "micro"
resources: {}
konnectivity:
server:
## @param controlPlane.konnectivity.server.resources Explicit CPU/memory resource requests and limits for the Konnectivity.
## @param controlPlane.konnectivity.server.resourcesPreset Use a common resources preset when `resources` is not set explicitly.
resourcesPreset: "micro"
resources: {}

View File

@@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.6.0
version: 0.7.1
# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to

Some files were not shown because too many files have changed in this diff Show More