Commit Graph

881 Commits

Author SHA1 Message Date
Timofei Larkin
9adcd48c44 [keycloak, cozy-lib] Calculate Java heap params
This patch passes Java heap parameters to Keycloak to prevent OOM errors
when the JVM lacks compatibility with cgroups v2 and fails to recognize
container memory requests and limits. A new function is introduced in
cozy-lib to calculate the heap parameters from requests and limits,
setting Xmx to 75% of the memory limit and Xms to the lesser of the
memory request or 25% of the memory limits.

Change log:
[keycloak] Calculate and pass Java heap parameters explicitly to prevent
OOM errors.
[cozy-lib] Introduce helper function to calculate Java heap params based
on memory requests and limits.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-07-03 22:15:04 +03:00
Timofei Larkin
bd9e283d3b [platform] Always set resources for managed apps
This patch removes the loophole to leave resource requests and limits
unspecified in managed apps. Any of cpu, memory, and ephemeral storage
are now filled in from the resource preset (default or user-specified)
if not explicitly specified in .Values.resources. "none" is no longer an
accepted value in resourcePresets and the primary resources now always
have some explicit value for proper billing and isolation.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-07-03 17:45:32 +03:00
Andrei Kvapil
d2126b6703 Save a list of observed images after workflow (#1089)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added a process to list images used in the environment before deletion
during cleanup operations.
- **Chores**
- Enhanced environment cleanup workflow with improved visibility into
used images.
- Introduced a shared writable directory between host and container for
better file management during testing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-07-03 15:45:12 +03:00
Andrei Kvapil
0b7bbb1ba9 [system] Recuce resources for some system apps
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 15:00:41 +03:00
Andrei Kvapil
bb46aa4b7d [cozy-lib] Introduce memory-allocation-ratio and ephemeral-strorage-allocation-ratio options
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 15:00:41 +03:00
Andrei Kvapil
6256e40169 [cozy-lib] remove handler for nested resources/requests map
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 15:00:40 +03:00
Andrei Kvapil
22cda073b9 [cozy-lib, bug] divf by cpu ratio, not mulf (#1125)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

* **Refactor**
* Updated the structure of resource presets for improved clarity and
processing.
* Adjusted template logic to streamline resource handling and removed
previous resource limit calculations.
* Modified template parameters to enhance flexibility in resource
processing.
* **Chores**
* Improved internal template invocation for better compatibility with
resource data.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 15:00:24 +03:00
Andrei Kvapil
0d46393e8c [nfs-driver] Introduce new module (#1133)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


## What this PR does

This PR adds a new optional module to support nfs shares

## Way to test it:

#### driver and provisioner setup

```yaml
---
apiVersion: v1
kind: Namespace
metadata:
  labels:
    cozystack.io/system: "true"
    pod-security.kubernetes.io/enforce: privileged
  name: cozy-nfs-driver
spec:
  finalizers:
  - kubernetes
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  labels:
    cozystack.io/repository: system
    cozystack.io/system-app: "true"
  name: nfs-driver
  namespace: cozy-nfs-driver
spec:
  chart:
    spec:
      chart: cozy-nfs-driver
      reconcileStrategy: Revision
      sourceRef:
        kind: HelmRepository
        name: cozystack-system
        namespace: cozy-system
      version: '>= 0.0.0-0'
  dependsOn:
  - name: cilium
    namespace: cozy-cilium
  - name: kubeovn
    namespace: cozy-kubeovn
  install:
    crds: CreateReplace
    remediation:
      retries: -1
  interval: 5m
  releaseName: nfs-driver
  suspend: true
  upgrade:
    crds: CreateReplace
    remediation:
      retries: -1
```

Then `cd packages/system/csi-driver-nfs` and:

```
make apply
```

#### export share

```bash
apt install nfs-server
mkdir /data
chmod 777 /data
echo '/data *(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -a
```

#### configure connection

```yaml
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs
provisioner: nfs.csi.k8s.io
parameters:
  server: 10.244.57.210
  share: /data
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
mountOptions:
  - nfsvers=4.1
```

#### order volume

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: nfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 3Gi
```

### Release note

<!--  Write a release note:
- Explain what has changed internally and for users.
- Start with the same [label] as in the PR title
- Follow the guidelines at
https://github.com/kubernetes/community/blob/master/contributors/guide/release-notes.md.
-->

```release-note
[nfs-driver] Introduce new optional module to order volumes from NFS shares
```
2025-07-03 14:32:51 +03:00
Andrei Kvapil
193f43d7bb [kubernetes] Fix dead-lock while reattaching a KubeVirt-CSI volume (#1135)
## What this PR does


This pr imports upstream fix for volume reattaching procedure
- https://github.com/kubevirt/csi-driver/pull/143

### Release note

<!--  Write a release note:
- Explain what has changed internally and for users.
- Start with the same [label] as in the PR title
- Follow the guidelines at
https://github.com/kubernetes/community/blob/master/contributors/guide/release-notes.md.
-->

```release-note
[kubernetes] Fix dead-lock while reattaching a KubeVirt-CSI volume
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Improved volume management for virtual machines by adding checks to
skip unnecessary attach or detach operations when the volume is already
in the desired state.

* **Tests**
* Added new unit tests to verify optimized volume attach/detach
workflows and ensure fast-path logic is functioning correctly.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-07-03 14:27:10 +03:00
Andrei Kvapil
8ec882ca5f [dx] Refactor collect-images functionality
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 14:26:56 +03:00
Timofei Larkin
f891d0bee6 Add exec bit to script, sanitize image list
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-07-03 13:56:41 +03:00
Andrei Kvapil
1f748d563f Copy contents of directory instead of directory
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-07-03 13:56:35 +03:00
Timofei Larkin
433bfe7b6c Save image list outside of sandbox
Because the sandbox is torn down after successful tests

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-07-03 13:54:32 +03:00
Timofei Larkin
fa6442998a Save a list of observed images after workflow
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-07-03 13:54:32 +03:00
Andrei Kvapil
6d06d3b1fb [nfs-driver] Introduce new module
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 13:46:24 +03:00
Andrei Kvapil
4c347cc026 [kubernetes] Fix dead-lock while reattaching a KubeVirt-CSI volume
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 13:40:54 +03:00
Andrei Kvapil
986de717f1 [virtual-machine] Refactor golden images
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 13:33:44 +03:00
Andrei Kvapil
d38c8aa5ab [CDI] golden disks feature for reuse
Use Golden Images to speed up VM / VMI deploy

Signed-off-by: gwynbleidd <gwynbleidd2106@yandex.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 13:23:44 +03:00
Andrei Kvapil
7f9f850b47 [tests] Fix pre-commit check for kubernetes options
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-03 13:08:20 +03:00
klinch0
ca772fae2e platform add velero (#1132)
<!-- Thank you for making a contribution! Here are some tips for you:
- Start the PR title with the [label] of Cozystack component:
- For system components: [platform], [system], [linstor], [cilium],
[kube-ovn], [dashboard], [cluster-api], etc.
- For managed apps: [apps], [tenant], [kubernetes], [postgres],
[virtual-machine] etc.
- For development and maintenance: [tests], [ci], [docs], [maintenance].
- If it's a work in progress, consider creating this PR as a draft.
- Don't hesistate to ask for opinion and review in the community chats,
even if it's still a draft.
- Add the label `backport` if it's a bugfix that needs to be backported
to a previous version.
-->

## What this PR does


### Release note

<!--  Write a release note:
- Explain what has changed internally and for users.
- Start with the same [label] as in the PR title
- Follow the guidelines at
https://github.com/kubernetes/community/blob/master/contributors/guide/release-notes.md.
-->

```release-note
[]
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added Velero integration as an optional addon for Kubernetes cluster
backup and restore.
* Introduced configurable parameters to enable Velero and override its
settings.
* Included a comprehensive Helm chart, manifests, and configuration
files for deploying Velero.
* Added support for Velero-related Kubernetes resources, including
backup, restore, schedule, and data mover management.
* Enabled Prometheus monitoring and metrics for Velero components with
PodMonitor and ServiceMonitor support.
* Provided customizable backup storage and volume snapshot location
settings.
  * Added automated Helm hooks for CRD upgrades and cleanup jobs.
  * Included node-agent DaemonSet deployment for Velero.

* **Documentation**
* Updated documentation to describe new Velero addon parameters,
installation, upgrade, and usage instructions.

* **Chores**
  * Incremented Kubernetes app chart version to reflect new features.
  * Updated version mapping and bundle configurations to include Velero.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-07-03 09:59:34 +03:00
Andrei Kvapil
fb831c05c0 vms add sockets to resources (#1131)
<!-- Thank you for making a contribution! Here are some tips for you:
- Start the PR title with the [label] of Cozystack component:
- For system components: [platform], [system], [linstor], [cilium],
[kube-ovn], [dashboard], [cluster-api], etc.
- For managed apps: [apps], [tenant], [kubernetes], [postgres],
[virtual-machine] etc.
- For development and maintenance: [tests], [ci], [docs], [maintenance].
- If it's a work in progress, consider creating this PR as a draft.
- Don't hesistate to ask for opinion and review in the community chats,
even if it's still a draft.
- Add the label `backport` if it's a bugfix that needs to be backported
to a previous version.
-->

## What this PR does


### Release note

<!--  Write a release note:
- Explain what has changed internally and for users.
- Start with the same [label] as in the PR title
- Follow the guidelines at
https://github.com/kubernetes/community/blob/master/contributors/guide/release-notes.md.
-->

```release-note
- Allow to set socket count for VM and VMI
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added support for specifying the number of CPU sockets
(resources.sockets) in virtual machine configurations for both
virtual-machine and vm-instance applications.

* **Documentation**
* Updated documentation to describe the new resources.sockets parameter
and its role in defining vCPU topology.

* **Chores**
* Incremented chart versions for virtual-machine (to 0.12.0) and
vm-instance (to 0.9.0).
  * Updated version mappings to reflect the latest releases.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-07-02 17:33:29 +03:00
kklinch0
98194a7414 platform add velero
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-07-02 16:47:44 +03:00
kklinch0
70c7978306 vms add sockets to resources
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-07-02 15:17:30 +03:00
Andrei Kvapil
d5521df9bd [tenant] Respect cpu-allocation-ratio in resourceQuotas
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-02 15:14:56 +03:00
Andrei Kvapil
d1275ecd08 [kubernetes] fix ingress template
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-02 15:13:50 +03:00
Andrei Kvapil
1f240387f9 [dx] fix: exclude ps from self destructing enviroments check
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-02 13:37:15 +03:00
Andrei Kvapil
512277fa93 [kubernetes] Add option for exposing ingress-nginx via LoadBalancer (#1114)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Added a new configuration option to choose the method for exposing the
Ingress-NGINX controller: "Proxied" or "LoadBalancer".
- **Documentation**
- Updated documentation to describe the new `exposeMethod` option and
clarified the conditions under which domain names are used.
- **Bug Fixes**
- Improved conditional logic to ensure Ingress resources are only
created when the appropriate expose method is selected.
- **Chores**
	- Incremented the chart version to 0.25.0.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-07-02 10:52:44 +02:00
Andrei Kvapil
d12d07fd5c [etcd] Update etcd application (fix resources and headless services) (#1128)
ref to https://github.com/cozystack/cozystack/pull/1127,
https://github.com/clastix/kamaji/issues/856 and
https://github.com/aenix-io/etcd-operator/pull/291

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
  * Updated etcd chart to version 2.9.0.
* **Improvements**
* Simplified etcd endpoint configuration to use a single static
endpoint.
* Expanded TLS certificate DNS names to include additional service
addresses.
  * Streamlined resource configuration for etcd deployment.
* **Chores**
  * Updated version mapping for etcd package.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-07-02 10:45:37 +02:00
Andrei Kvapil
f953db50da [dx] Introduce cozyreport tool
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-02 10:37:40 +03:00
Andrei Kvapil
39daa3a38a [dx] better check for processes in self destructing enviroments
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-02 09:54:15 +03:00
Andrei Kvapil
a5ff9bf65b [etcd] Update etcd application (fix resources and headless services)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-02 06:15:38 +03:00
Andrei Kvapil
792f6b4af8 [tests] Introduce self destructing environments (#1138)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

<!-- Thank you for making a contribution! Here are some tips for you:
- Start the PR title with the [label] of Cozystack component:
- For system components: [platform], [system], [linstor], [cilium],
[kube-ovn], [dashboard], [cluster-api], etc.
- For managed apps: [apps], [tenant], [kubernetes], [postgres],
[virtual-machine] etc.
- For development and maintenance: [tests], [ci], [docs], [maintenance].
- If it's a work in progress, consider creating this PR as a draft.
- Don't hesistate to ask for opinion and review in the community chats,
even if it's still a draft.
- Add the label `backport` if it's a bugfix that needs to be backported
to a previous version.
-->

## What this PR does


### Release note

<!--  Write a release note:
- Explain what has changed internally and for users.
- Start with the same [label] as in the PR title
- Follow the guidelines at
https://github.com/kubernetes/community/blob/master/contributors/guide/release-notes.md.
-->

```release-note
[tests] Introduce self destructing environments
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Introduced a process-monitoring entrypoint script for end-to-end
testing containers, allowing for customizable timeout intervals.

* **Chores**
* Updated the Docker image used for end-to-end testing to the latest
available version.
* Modified Docker build context and container runtime options for
testing environments.
* Removed systemd timer and service management steps from workflow
automation.
* Added a new test to verify the presence of required installer assets
before running end-to-end tests.
* Removed redundant installer asset checks from cluster preparation
tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-07-02 04:25:42 +02:00
Andrei Kvapil
52714f5cce [tests] Introduce self destructing environments
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-07-02 03:42:14 +03:00
kklinch0
2d294f0546 monitoring disable alerta sign up
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-06-29 01:18:38 +03:00
Andrei Kvapil
78b4d06b25 [apps] Add enum of allowed values to resourcePreset in all applications (#1117)
It was present in some apps, such as managed kubernetes, but was missing
in others.

bitnami/readme-generator removes enums after re-generating README, so
now we patch them back using `yq` in Makefiles.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Resource preset options are now strictly limited to a predefined set
of values across multiple apps, ensuring only valid selections such as
"none", "nano", "micro", "small", "medium", "large", "xlarge", and
"2xlarge" can be used.
- **Bug Fixes**
- Improved validation for resource presets to prevent invalid entries
and enhance consistency in configuration.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-28 14:03:13 +02:00
Andrei Kvapil
ae90969b7e [platform] rm kk memory limit (#1122)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
* Removed the memory limit for Keycloak deployment, retaining only
resource requests for memory and CPU.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-06-28 13:56:12 +02:00
Andrei Kvapil
6732205b24 Create LoadBalancer service for single-node MySQL (#1113)
## Changelog
```
[mysql] Bugfix: external=true did not work for MySQL deployed with a single replica,
since the MariaDB operator does not create separate primary and secondary services for a single-node DB.
A special condition is added to make the "all-node" service a LoadBalancer if external=true and replicas=1.
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Improved handling of external service exposure for MySQL deployments,
with refined logic for LoadBalancer configuration based on the number of
replicas.
- **Chores**
  - Updated MySQL chart version to 0.8.2.
  - Adjusted version mapping to reflect the latest changes.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Resolves https://github.com/cozystack/cozystack/issues/1095
2025-06-28 13:36:47 +02:00
kklinch0
6a080fbf5d [platform] rm kk memory limit 2025-06-26 11:19:25 +03:00
Nick Volynkin
cfc8c269f3 [apps] Add enum of allowed values to resourcePreset in all applications
It was present in some apps, such as managed kubernetes, but missing in others.

bitnami/readme-generator removes enums after re-generating README,
so now we patch them back using `yq` in Makefiles.

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-06-25 16:48:20 +03:00
Andrei Kvapil
1da45ff039 [dx] Fix Makefile envs for capi-providers
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-25 14:50:12 +02:00
Andrei Kvapil
c6ee006d6b [kubernetes] Add option for exposing ingress-nginx via LoadBalancer
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-25 14:44:52 +02:00
Timofei Larkin
848abc4bd1 Create LoadBalancer service for single-node MySQL
[mysql] Bugfix: external=true did not work for MySQL deployed with a
single replica, since the MariaDB operator does not create separate
primary and secondary services for a single-node DB. A special condition
is added to make the "all-node" service a LoadBalancer if external=true
and replicas=1.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-06-25 14:24:45 +03:00
github-actions
baefc78bfe Prepare release v0.32.1
Signed-off-by: github-actions <github-actions@github.com>
2025-06-24 23:07:51 +00:00
Andrei Kvapil
5602e9753f [ci] Refactor Github workflows (#1107)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

- **New Features**
- Pull request workflows now support release pull requests by fetching
artifacts from draft releases and running all jobs without label-based
exclusions.
- Test matrices are now generated dynamically, improving flexibility in
end-to-end application testing.
- Added a new end-to-end test verifying tenant creation with isolated
mode enabled.

- **Refactor**
- Workflow steps and job dependencies have been streamlined for improved
efficiency and maintainability.
- Workflow names and concurrency group names have been updated for
clarity.
- Environment preparation and artifact handling have been unified into
consolidated jobs.
	- Release-related workflow simplified to a single finalize job.
- Makefile targets for asset copying and test execution have been
reorganized for better modularity.

- **Tests**
	- End-to-end application and cluster test scripts have been removed.
- Removed collective end-to-end test target; individual app test targets
remain.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-25 00:18:42 +02:00
Andrei Kvapil
e7681debe2 [ci] Refactor Github workflows
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-24 23:41:10 +02:00
Andrei Kvapil
654778a0c7 [apps] Refactor resources
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-24 17:35:26 +02:00
Andrei Kvapil
e8b83fbbda [clickhouse][kafka] fix openapispec generation
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-24 17:10:55 +02:00
Andrei Kvapil
a0526be17d [clickhouse][kafka] increase resources
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-24 16:56:00 +02:00
Andrei Kvapil
587904e8cc [kafka] downgrade operator to 0.45.1-rc1
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-06-24 13:37:42 +02:00
github-actions
2832058036 Prepare release v0.32.1
Signed-off-by: github-actions <github-actions@github.com>
2025-06-24 08:55:52 +00:00