95 Commits

Author SHA1 Message Date
Timofei Larkin
8dd8a718a7 Prepare release v0.27.0 2025-03-06 18:54:54 +03:00
Timofei Larkin
63358e3e6c Recreate etcd pods and certificates on update (#675)
Since updating from 2.5.0 to 2.6.0 renews all certificates including the
trusted CAs for etcd, all etcd pods need a restart. As many people had
problems with their etcds post-update, we decided to script the
procedure, so the renewal of certs and pods is done automatically and in
a predictable order. The hook fires when upgrading from <2.6.0 to
>=2.6.1.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
	- Introduced pre- and post-upgrade tasks for the etcd package.
- Added supporting resources, including roles, role bindings, service
accounts, and a ConfigMap to facilitate smoother upgrade operations.
- **Chores**
	- Updated the application version from 2.6.0 to 2.6.1.
	- Revised version mapping entries to reflect the updated release.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-06 15:24:12 +01:00
xy2
063439ac94 Raise maxLabelsPerTimeseries for VictoriaMetrics vminsert. (#677) 2025-03-06 14:44:58 +01:00
Andrei Kvapil
06daf34102 bump monitoring chart version 2025-03-05 15:00:14 +01:00
xy2
c60b7c0730 Import Piraeus dashboard and alerts. (#658)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Expanded the monitored dashboards with a new storage dashboard entry.
- Introduced proactive alert configurations that cover key storage
components.
- Added templated alert management to streamline dynamic configuration.
- Enhanced metric collection by integrating monitoring endpoints for
storage components.
- Delivered a comprehensive dashboard offering real-time insights into
storage performance.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-05 14:51:23 +01:00
Andrei Kvapil
d4452ea708 Merge pull request #660 from xy2/169-victoria-limits
Increase VMSelect default cpu limit
2025-03-05 14:13:32 +01:00
Timofei Larkin
425ce77f60 Merge pull request #655 from klinch0/feature/add-multi-dc
feature/add-multi-dc-for-pg
2025-03-05 12:51:36 +04:00
kklinch0
88729e4124 rename globalAppTopologySpreadConstraints 2025-03-05 11:39:41 +03:00
kklinch0
4cce138d31 feature/add-topologyspreadconstraints-pg 2025-03-05 10:41:43 +03:00
Timofei Larkin
e7d6f2dfa3 Merge pull request #661 from klinch0/feature/add-ch-monitoring
feature/add-ch-dashboard
2025-03-04 20:22:15 +04:00
kklinch0
36b66a681d feature/add-ch-dashboard 2025-03-04 10:53:51 +03:00
Denis Seleznev
3e273c03b6 Increase the default cpu limit for vminsert. 2025-03-03 19:31:27 +01:00
Denis Seleznev
da0437a774 Make it possible to set cpu limit too. 2025-03-03 19:31:05 +01:00
Denis Seleznev
78cff8c223 Change defaults calculation logic. 2025-03-03 19:18:24 +01:00
Andrei Kvapil
8c4605284c Prepare release v0.26.1 (#659)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
  - Upgraded core platform components to version **v0.26.1**.
- Refreshed container images for key services including backups,
caching, autoscaling, dashboard integrations, and cloud providers.
- These updates improve overall stability, consistency, and performance
across the system.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-03-01 21:04:40 +01:00
Timofei Larkin
a5dc2d5382 Prepare release v0.26.0 2025-02-27 11:51:46 +03:00
Andrei Kvapil
86bb64000e Add new info logo in common style (#649)
New info icon for Cozystack

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Co-authored-by: Viktoriia Kvapil <159528100+kvapsova@users.noreply.github.com>
2025-02-25 15:12:06 +01:00
Timofei Larkin
6ff8b527ea Merge branch 'main' into chore/improve-etcd-tls 2025-02-25 13:38:58 +03:00
Timofei Larkin
0f87c73051 Improve TLS handling in etcd helm chart
1. Add a `commonName` to every certificate.
2. Move 127.0.0.1 from DNS names to IP Addresses in the certificate
   spec.
3. Add **client** auth usage to the etcd-**server** certificate (yes,
   that's necessary), because etcd queries itself using its
   [server cert as a client cert](https://github.com/etcd-io/etcd/issues/9785#issuecomment-432438748).
4. Default all CA certificates' durations to 10 years.
5. Set subject org to release namespace and OU to name so that subjects
   are unique
2025-02-25 13:36:46 +03:00
klinch0
d0d62e8847 feature/add-goldpinger (#648)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive Grafana dashboard for Goldpinger, offering
real-time insights into node health, error occurrences, and response
times with intuitive filtering.
- Expanded deployment configurations to include Goldpinger across
environments, streamlining release management and dependency handling.
- Launched a dedicated deployment package featuring customizable
templates for secure, efficient Kubernetes deployments—including
workloads, services, ingress, and monitoring integrations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-25 10:08:08 +01:00
Timofei Larkin
0211c57bed Prepare release v0.25.3 2025-02-22 10:33:32 +03:00
Floppy Disk
6c73e3f3ae feature/mv-kubeconfig 2025-02-20 15:23:54 +03:00
Timofei Larkin
0f68db6793 Merge pull request #635 from klinch0/feature/update-limits
feature/add-more-resources
2025-02-18 20:01:09 +03:00
Timofei Larkin
93c4616115 Merge pull request #630 from aenix-io/release-0.25.1
Prepare release v0.25.1
2025-02-13 17:32:52 +04:00
Andrei Kvapil
1f6ea333b6 Prepare release v0.25.1
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-13 16:00:02 +03:00
Floppy Disk
b75aaf177b add kafka monitoring 2025-02-10 13:29:17 +03:00
Andrei Kvapil
3fa4dd3af9 Prepare release v0.25.0 (#622)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded multiple system components to the latest version, ensuring
improved performance, stability, and enhanced security.
- Updated deployment and testing configurations across the platform for
a more reliable user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-02-09 11:41:28 +01:00
klinch0
842d3e55bc feature/add-flux-dashboards (#619)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new dashboard for Flux Control Plane monitoring that
visualizes key performance metrics like CPU, memory, API requests, and
more.
- Added a second dashboard for Flux Cluster Stats to display resource
reconciliation, operation durations, and readiness indicators.
- Seamlessly integrated these dashboards into the monitoring workflow
with dynamic querying and periodic refresh options.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-06 13:50:57 +01:00
klinch0
861e6c464b feature/add-client-etcd-monitoring (#613)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced etcd monitoring with new metrics exposure, pod scraping
configuration, and comprehensive alert rules for proactive
observability.
- Introduced a new `VMPodScrape` resource for improved pod metrics
collection.
- Added a new PrometheusRule configuration for monitoring etcd clusters
with various alert conditions.
- **Chores**
  - Upgraded the etcd release from version 2.4.0 to 2.5.0.
- Consolidated and renamed monitoring dashboard references for better
consistency.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-05 13:13:28 +01:00
Andrei Kvapil
835ee117f7 Rebiild matchobx (#612)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Upgraded the deployment Docker image to version 0.24.1, ensuring
improved stability and potential performance enhancements.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-03 16:42:04 +01:00
Andrei Kvapil
af48519d65 Prepare release v0.24.1 (#611)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-29 11:39:30 +01:00
Andrei Kvapil
0ab39f207c Prepare release v0.24.0 (#606)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes for CozyStack v0.24.0

- **Image Updates**
  - Upgraded CozyStack core components to version v0.24.0
- Updated multiple system images, including cluster-autoscaler, kubevirt
cloud provider, and CSI driver
  - Refreshed images for dashboard, API, and controller components
  - Updated Grafana image to version 1.8.0

- **Infrastructure Changes**
- Replaced `darkhttpd` container with new `assets` container in
deployment configuration
  - Updated image digests across various system components

- **Version Bump**
  - Incremented CozyStack version from v0.23.1 to v0.24.0
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-27 19:44:54 +01:00
Andrei Kvapil
ef2e065c77 Update Grafana, build plugins (#607)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Updated Grafana to version 11.4.0
- Added new Grafana plugins: VictoriaMetrics logs datasource, Natel
Discrete Panel, and Worldmap Panel

- **Improvements**
  - Enhanced Grafana image build process
  - Dynamically manage Grafana image versioning
  - Updated plugin installation method

- **Version Update**
  - Monitoring package version bumped from 1.7.0 to 1.8.0

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-27 18:45:48 +01:00
Andrei Kvapil
cc5eb4765c Introduce BootBox (#601)
- Introduce tinkerbell essentials
- Introduce bootbox


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

# Release Notes: BootBox Package (v0.1.0)

## New Features
- Added BootBox, a PXE hardware provisioning service.
- Introduced network boot configuration with Matchbox and Smee.
- Enabled hardware management through Kubernetes Custom Resource
Definitions.
- Added support for managing physical machine specifications and
configurations.
- New HelmRelease configuration for streamlined deployment.
- Added new application entry for BootBox in the configuration.

## Configuration
- Supports configuring physical machine instances.
- Provides flexible network boot and DHCP settings.
- Includes role-based access control (RBAC) configurations.
- New parameters for trusted proxies and syslog settings.
- Enhanced configuration options for deployment parameters and resource
allocations.
- Introduced new schema for validating configuration values.

## Deployment
- Deployed in `tenant-root` namespace.
- Optional and privileged installation.
- Depends on Cilium and KubeOVN networking components.
- Configurable deployment strategies and resource allocations.
- Introduced new Service and Ingress resources for improved traffic
management.
- Added support for host networking and public IP configurations.

## Compatibility
- Supports single-node and multi-node cluster configurations.
- Compatible with Kubernetes environments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-27 10:56:23 +01:00
klinch0
e037cb0e3e feature/add-tg-severity-setting (#585)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added option to disable Telegram alerts for specific severity levels
in the Monitoring Hub.

- **Documentation**
- Updated README with new parameter
`alerta.alerts.telegram.disabledSeverity`.

- **Chores**
  - Bumped monitoring package version from 1.6.1 to 1.7.0.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-17 14:49:33 +01:00
klinch0
59b4a0fb91 bugfix/monitoring add nil checker (#587)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Version Update**
  - Monitoring application version updated from 1.6.1 to 1.6.2
- **Configuration Improvements**
  - Enhanced resource configuration checks for VM cluster components
- Improved handling of resource definitions to prevent potential errors

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-17 14:47:29 +01:00
klinch0
9f9a774340 add kubevirt metrics and dashboards (#573)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added PrometheusRule configuration to monitor virtual machine (VM) and
virtual machine instance (VMI) states.
	- Introduced ServiceMonitor resource for Kubevirt metrics monitoring.
	- Added `monitorNamespace` configuration in KubeVirt custom resource.

- **Monitoring Enhancements**
- Implemented alerts for VMs and VMIs not running for more than 10
minutes.
	- Configured metrics endpoint with HTTPS support.

- **Version Updates**
- Updated version mappings for several packages, reflecting new commit
hashes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-16 11:00:59 +01:00
klinch0
3bb975965d enablePodMonitor (#572)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enabled pod monitoring for multiple database clusters (Alerta,
Keycloak, SeaweedFS, Grafana)

- **Chores**
	- Updated monitoring package version from 1.6.0 to 1.6.1
- Updated version mapping with specific commit hash for monitoring
package
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-15 10:53:16 +01:00
Andrei Kvapil
b08a5d3e2f Fix version-map for updated application (#569) 2025-01-09 15:28:33 +01:00
Andrei Kvapil
cb7b8158e0 Update version-map for updated application (#568)
fixes versions update after this pr
https://github.com/aenix-io/cozystack/pull/563

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-09 15:09:18 +01:00
Andrei Kvapil
107f390ae8 workloadmonitor (#563)
- upd redis
- update kubernetes app to use workloadmonitors
- upd kubernetes
- fix version


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added `WorkloadMonitor` resources for various components including
Kubernetes clusters, Redis, Sentinel, and SeaweedFS.
- Introduced monitoring capabilities for `alerta`, `alertmanager`,
`grafana`, and `vlogs` services.
- Enhanced RBAC configurations to support new monitoring resources
across multiple API groups.

- **Improvements**
	- Updated metadata and labeling for virtual machine templates.
	- Added dynamic resource naming based on release and group names.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-01-09 13:25:12 +01:00
klinch0
d4634797f3 feature/add resources to vmcluster (#556)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **Version Updates**
  - Tenant application version bumped from 1.6.5 to 1.6.6
  - Monitoring application version updated from 1.5.3 to 1.5.4

- **Monitoring Configuration**
- Adjusted metrics storage deduplication interval: shortterm from 5
minutes to 15 seconds, longterm from 15 seconds to 5 minutes
- Updated resource configurations for VM components, including new
resource specifications for vminsert, vmselect, and vmstorage
- Increased memory limits and requests for VMAgent from 500Mi to 1024Mi
and from 200Mi to 768Mi, respectively

- **Performance Improvements**
  - Enhanced resource allocation for monitoring services
  - More flexible configuration options for metrics storage
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-01-09 13:18:46 +01:00
Andrei Kvapil
c48aed0aa8 hardcode vlogs version (#545) 2024-12-27 14:33:32 +01:00
klinch0
17fbda6e12 fix-vm-logs-url (#538)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
	- Updated monitoring application version to 1.5.3.
- Changed the data source type in Grafana configuration to
`victoriametrics-logs-datasource`.
- **Bug Fixes**
	- Corrected plugin loading configuration in Grafana.
- **Chores**
- Updated version mapping for the monitoring package in the versions
map.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2024-12-23 23:40:52 +01:00
klinch0
c1ca19dc18 add grafana size configure (#536)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new parameter for Grafana's database size with a default
value of 10Gi.
  
- **Bug Fixes**
- Updated default values for `alerta.alerts.telegram.token` and
`alerta.alerts.telegram.chatID` to empty strings.

- **Documentation**
- Revised the README to reflect changes in default parameter values and
added new parameters for Grafana.

- **Chores**
  - Updated the monitoring application's version from 1.5.2 to 1.5.3.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2024-12-20 11:21:54 +01:00
klinch0
cfe86c0815 delete-cpu-limit (#535)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced resource management for the VMCluster resource, specifically
for the `vmstorage` component.
- Added resource specifications including memory limits and CPU
requests.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2024-12-19 21:48:11 +01:00
Andrei Kvapil
898374b533 bump monitoring version (#523) 2024-12-09 19:26:06 +01:00
Andrei Kvapil
66d9b17525 fix monitoring: show alerts only from first instance (#521)
We don't need to show alerts from longterm instance, because the alerts
have shorter timeout than metrics collection interval


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Updated the `VMAlert` YAML template to generate only the first
`VMAlert` resource based on metrics storage values.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2024-12-09 14:00:40 +01:00
klinch0
edbbb9be68 add kubeaps integration (#486)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
- Introduced a new variable `$host` for improved configuration
management.
- Added a `valuesFrom` section to the `dashboard` release, allowing
external value sourcing.
- Enhanced Keycloak integration with new client scopes, roles, and
configurations for Kubeapps.
- Added support for custom pod specifications and environment variables
in Redis configurations.
- Introduced a new Kubernetes configuration file for managing access to
resources via Role and Secret.
- Updated image versions across various components to ensure
compatibility and leverage new features.

- **Bug Fixes**
- Implemented error handling to ensure required configurations are
present.
- Improved handling of request headers for the `/logos` endpoint in
Nginx configuration.
- Adjusted security context configurations to enhance deployment
security.

- **Documentation**
- Updated configuration files to reflect new dependencies and structures
for better clarity.
- Enhanced README documentation with upgrade instructions and security
defaults.
- Expanded notes on handling persistent volumes and data migration
during upgrades.

These enhancements improve the overall functionality and reliability of
the platform.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2024-12-02 18:57:14 +01:00
klinch0
3c27a1e9bf add metrics agents (#461)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced new HelmRelease configurations for cert-manager, monitoring
agents, and Victoria Metrics Operator in Kubernetes.
- Added resource specifications for `vmselect` in the VMCluster
configuration.
- Enhanced resource management for `vmselect` with defined limits and
requests for memory and CPU.

- **Bug Fixes**
	- Adjusted resource limits for Redis failover memory allocation.

- **Documentation**
- Updated README and release notes for various components, enhancing
clarity and usability.

- **Chores**
- Updated image versions across multiple components for consistency and
performance improvements.
- Modified migration scripts to facilitate transitions and manage
resources effectively.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Andrei Kvapil <kvapss@gmail.com>
2024-11-04 19:01:33 +01:00