Commit Graph

576 Commits

Author SHA1 Message Date
Andrei Kvapil
9e6478b9c9 [linstor] Add plunger check for disconnected DRBD peers. (#707)
Sometimes DRBD devices get stuck in "Connecting" state, probably due to
some
race conditions. This scriptlet provides a workaround for such
situations.
2025-04-10 10:38:29 +02:00
Andrei Kvapil
3a295c4474 Add guard against empty cloudInit in vm-instance app (#646)
Prevent the VM resource from referencing a non-existent secret when
`sshKeys` are set and `cloudInit` is set to empty.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Improved cloud-init configuration handling with conditional logic and
clearer error messaging when expected configuration values are missing.

- **Documentation**
- Refined virtual machine configuration guides by reformatting parameter
tables and correcting typographical errors in parameter descriptions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 10:37:50 +02:00
klinch0
2966922c0b feat(vpa): separate-crds (#781)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Improved autoscaling deployment by integrating an additional component
for managing custom resource definitions.
- Enhanced dependency management now ensures critical prerequisites are
deployed in the correct order.
- Introduced an automated update mechanism to keep resource definitions
current.
- Added a new configuration option, giving users the flexibility to
enable or disable custom resource definitions as needed.
- Introduced two new Custom Resource Definitions:
`VerticalPodAutoscalerCheckpoint` and `VerticalPodAutoscaler`.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-10 11:35:36 +03:00
Denis Seleznev
991c7e1943 Handle empty cloudInit.
Add a no-op user-data when sshKeys are specified.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-10 10:07:43 +02:00
kklinch0
c31a7710ad feat(vpa): separate-crds
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-10 10:57:50 +03:00
Andrei Kvapil
f4cace093c Add a setting to VMs that allows users to trigger cloud-init full reconfiguration. (#767)
This will trigger cloud-init reinitialization, including ssh keys update
and static network config refresh.
2025-04-10 09:20:55 +02:00
Denis Seleznev
01e417d436 Add Linstor plunger scriptlet to fix DRBD devices that are stuck disconnected.
Sometimes DRBD devices get stuck in "Connecting" state, probably due to some
race conditions. This scriptlet provides a workaround for such situations.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-10 03:49:23 +02:00
Denis Seleznev
261ce4278f Add a setting to VMs that allows users to trigger cloud-init full reconfiguration.
Changing `cloudInitSeed`  will trigger cloud-init reinitialization, including ssh keys update and static network config refresh.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-09 20:48:18 +02:00
Timofei Larkin
1f19793613 Merge branch 'main' into 176-track-ips 2025-04-09 20:26:29 +04:00
Timofei Larkin
a0df2989af Track public IP usage
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-09 19:24:36 +03:00
Timofei Larkin
bdb538ab42 Track PVCs with WorkloadMonitor (#768)
The WorkloadMonitor controller now also watches PVCs, just like it has been watching Pods and creates Workloads per PVC according to the `spec.selector` field to track the used storage space.
2025-04-09 20:01:13 +04:00
Andrei Kvapil
4b575299bc Log verbose state for DRBD devices that are not healthy. (#771)
This will help troubleshoot issues that occurred in the past but have
already been resolved.
2025-04-09 14:16:53 +02:00
Andrei Kvapil
4078b21ac6 Merge branch 'main' into main 2025-04-09 14:09:59 +02:00
kklinch0
d60b81c8a0 fix version map
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-09 14:44:42 +03:00
Timofei Larkin
cc14c1fbab Track PVCs with WorkloadMonitor
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-09 14:09:36 +03:00
Andrei Kvapil
332d69259b Merge pull request #766 from cozystack/vm-gpu
[virtual-machine] Add GPU support
2025-04-09 10:40:52 +02:00
Andrei Kvapil
9ad6b0d726 [virtual-machine] Add GPU support
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-09 10:39:49 +02:00
Andrei Kvapil
ea9df9e371 Merge pull request #765 from cozystack/gpu-operator
[gpu-operator] Introduce GPU-operator
2025-04-09 10:37:52 +02:00
Denis Seleznev
aed184f6ef Log verbose state for DRBD devices that are not healthy.
This will help troubleshoot issues that occurred in the past but have already been resolved.

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-04-09 03:46:37 +02:00
kklinch0
8e2e77da56 [monitoring] add vpa for vmagent
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-08 16:40:39 +03:00
Andrei Kvapil
1e27dedde5 [gpu-operator] Introduce GPU-operator
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-08 14:03:52 +02:00
Pavlo Gaponuk
7a1c3b6209 Need for CX or RX type of instances
Signed-off-by: Pavlo Gaponuk <pashagaponuk@gmail.com>
2025-04-07 19:03:04 +02:00
kklinch0
3cf850c2c4 [k8s] change CP default resourcesPreset
Signed-off-by: kklinch0 <kklinch0@gmail.com>
2025-04-05 21:31:17 +03:00
kvaps
da301373fa Prepare release v0.29.1
Signed-off-by: kvaps <kvaps@users.noreply.github.com>
2025-04-03 14:27:23 +00:00
Andrei Kvapil
6980dc59c5 Add workflow to run e2e tests using GitHub CI
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 15:02:52 +02:00
Andrei Kvapil
a9c8133fd4 fix dependencies for kafka-operator and clickhouse-operator (#748)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 00:58:16 +02:00
Andrei Kvapil
065abdd95a [linstor] fix reloader patch (#747)
this PR fixes problem

```
* admission webhook "vlinstorcluster.kb.io" denied the request: LinstorCluster.piraeus.io "linstorcluster" is invalid: spec.patches.0.patch: Invalid value: "apiVersion: apps/v1\nkind: Deployment\nmetadata:\n  annotations:\n    secret.reloader.stakater.com/auto: \"true\"\n": Failed to parse patch as either Strategic Merge Patch (missing metadata.name in object {{apps/v1 Deployment} {{ } map[] map[secret.reloader.stakater.com/auto:true]}}) or JSON Patch (json: cannot unmarshal object into Go value of type jsonpatch.Patch)
```

used solution from
-
https://github.com/piraeusdatastore/piraeus-operator/issues/701#issuecomment-2377702085

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 00:45:27 +02:00
Andrei Kvapil
cd8c6a8b9a Fix dependency for clickhouse-operator (#746)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-03 00:36:25 +02:00
Andrei Kvapil
459673f764 Fix CiliumNetworkPolicy depends on cilium (#745) 2025-04-03 00:21:13 +02:00
Nick Volynkin
c795e4fb68 Prepare release v0.29.0 (#740)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Streamlined the asset release process to automatically replace
existing files during uploads.
  
- **Container Image Updates**
- Upgraded versions across multiple components—including backup,
caching, autoscaling, API, dashboard, monitoring, and more—to align with
the latest release (e.g., updating from v0.28.0 to v0.29.0 and other
minor version increments).
- Updated specific images for Grafana, PostgreSQL, MariaDB, ClickHouse,
and others to their latest versions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 23:45:25 +02:00
Andrei Kvapil
7c98248e45 Update Talos Linux to v1.9.5 (#744)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 22:56:41 +02:00
Andrei Kvapil
16c771aa77 [vm-disk] disable immediate bind for non-upload disks (#742)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 22:42:51 +02:00
Andrei Kvapil
d971f2ff29 Enhance versions_map generator logic (#741)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Enhanced version mapping with improved error reporting and clearer
version resolution, ensuring more accurate and reliable version
displays.
- Updated version references for multiple packages to maintain
consistency and stability across the system.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 21:43:17 +02:00
Nick Volynkin
ead0cca5af Fix bootbox version commit (#739)
Fixup to 116196b4d4

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded the Bootbox package to the latest release, ensuring continued
stability and a solid foundation for future improvements.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-04-02 15:35:41 +02:00
Andrei Kvapil
dbc1fb8a09 Enable Cilium host firewall (#738)
This commit enables Cilium's host firewall feature and makes use of it
to deny external connections to two exporters running as daemonset pods
in the host network namespace.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
Co-authored-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-02 14:42:19 +02:00
Timofei Larkin
d9c6fb7625 Enable Cilium host firewall (#736)
This commit enables Cilium's host firewall feature and makes use of it
to deny external connections to two exporters running as daemonset pods
in the host network namespace.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Host firewall is now enabled by default, adding an extra layer of
security.
  - Enhanced network traffic management with new policies:
    - One policy tightens access to critical service ports.
- Another secures monitoring endpoints by restricting unauthorized
external access.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-02 13:16:15 +02:00
Andrei Kvapil
cd23a30e76 [kubernetes] Fix patches for kubevirt-cloud-provider (#735)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
	- Updated the cloud provider’s container image to a new stable version.
- **New Features**
- Enhanced multi-cluster support to ensure services are correctly scoped
to their respective clusters.
- **Refactor**
	- Removed legacy service management logic for streamlined processing.
- **Tests**
- Adjusted test coverage to validate the improved multi-cluster and
service filtering behaviors.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-02 11:02:29 +02:00
Timofei Larkin
01b3a82ee2 [linstor] Introduce Reloader to automatically reload certificates (#715)
* Add stakater/Reloader to the storage-enabled bundles.
* Add annotations to Linstor components to reload when secrets change.

Closes #456 

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new reloader component that triggers automatic rolling
updates when configuration or secret changes are detected.
- Delivered a fully customizable Helm chart and configuration schema,
including a reload strategy based on annotations for enhanced deployment
control.
  
- **Tests**
- Added test cases to validate container security settings and
environment variable propagation, ensuring robust high-availability
configurations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-04-01 18:47:18 +02:00
Andrei Kvapil
0403f30cd6 [kubernetes] Fix race-condition between two KubeVirt CCMs working with the services with identifical names (#722)
This PR fixes race condition when you have two clusters with two
services with simillar names, two eps-controllers might continiusly
conflict between each other.

Upstream issue
https://github.com/kubevirt/cloud-provider-kubevirt/pull/341

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced multi-cluster support with a new configuration option to
ensure that services and deployments are correctly scoped across
different environments.

- **Bug Fixes**
- Refined endpoint management by improving service port handling and
eliminating duplicate entries, resulting in more consistent and reliable
service routing.

- **Chores**
- Updated the image versioning strategy to a dynamic tag, streamlining
deployments and simplifying future upgrades.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-01 18:47:01 +02:00
Andrei Kvapil
116196b4d4 [bootbox] Automatically lowercase mac addresses (#732)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- MAC addresses are now consistently displayed in lowercase for uniform
formatting.
- **Chores**
- Updated the package version to 0.1.1 and refined version tracking for
improved package management.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-04-01 18:45:26 +02:00
xy2
bdc3525c9b [linstor] Make satellite plunger script continue on errors. (#728)
Do not restart the container if something goes wrong while plunger tries
to fix the linstor state.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved error notifications during system operations to ensure any
issues are promptly reported.
- **Documentation**
- Updated descriptive information on secondary volume management to help
clarify operational conditions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-03-31 19:47:26 +02:00
Andrei Kvapil
9a0f459655 [linstor] Plunger: detach also / devices (#731)
I just noticed that many devices are still attached to `/`, they are not
deleted but still blocking DRBD operations

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Improved detection of inactive loop devices by broadening the
filtering criteria, ensuring more accurate identification in various
scenarios.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-31 15:51:10 +02:00
Timofei Larkin
59aac50ebd Merge pull request #711 from cozystack/fix-regression
Fix dependency for piraeus-operator
2025-03-28 20:06:12 +04:00
Timofei Larkin
5c900a7467 Revert ingress nginx chart to v4.11.2
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-03-28 17:26:24 +03:00
Timofei Larkin
cc9abbc505 Use backported ingress controller
Due to upstream compat issues we backport the security patches to
v1.11.2 of the ingress controller and do not rebuild the existing
protobuf exporter.

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-03-28 16:49:31 +03:00
Timofei Larkin
c66eb9f94c Merge pull request #710 from cozystack/709-update-ingress-nginx
Update ingress-nginx to mitigate CVE-2025-1974
2025-03-26 14:11:44 +04:00
Timofei Larkin
0045ddc757 Update ingress-nginx to mitigate CVE-2025-1974
Closes #709

Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>
2025-03-26 12:12:06 +03:00
Andrei Kvapil
209a3ef181 Fix dependency for piraeus-operator
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-25 12:58:21 +01:00
Nick Volynkin
92e2173fa5 Fix typo in VirtualPodAutoscaler Makefile
Makefile was copied from VictoriaMetrics Operator, some lines were not
changed.

Follow-up to #676
Resolves #705

Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>
2025-03-25 12:51:01 +02:00
xy2
2d997a4f8d Remove workaround for ZFS tools. (#704)
The mounting of /run parts was fixed in a consistent way upstream.

fixes #699

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Streamlined satellite configurations by removing an unnecessary volume
setting from a container.
- Eliminated an unneeded container entry along with its related
configuration, reducing overall complexity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>
2025-03-23 21:22:26 +01:00