cozystack-cozystack

mirror of https://github.com/cozystack/cozystack.git synced 2026-03-02 22:59:06 +00:00

Author	SHA1	Message	Date
Andrei Kvapil	14a9017932	fix(migration): suspend cozy-proxy if it conflicts with installer release In v0.41.x, cozy-proxy HelmRelease was configured with releaseName: cozystack, which collides with the installer helm release. If not suspended before upgrade, the cozy-proxy HR reconciles and overwrites the installer release, deleting cozystack-operator. Add a check in the migration script that detects this conflict and suspends the cozy-proxy HelmRelease before proceeding. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-03-02 12:59:36 +01:00
Andrei Kvapil	c83e41ea14	fix(installer): add keep annotation to Namespace and update migration script Add helm.sh/resource-policy=keep annotation to the cozy-system Namespace in the installer helm chart. This prevents Helm from deleting the namespace when the HelmRelease is removed, which would otherwise destroy all other HelmReleases within it. Update the migration script to annotate the cozy-system namespace and cozystack-version ConfigMap with helm.sh/resource-policy=keep before generating the Package resource. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-28 11:46:09 +01:00
Andrei Kvapil	daa3905b67	[ci] Debug improvements (#2111) ## What this PR does ### Release note ```release-note [ci] Added more debug information to ci tests ``` ## Summary by CodeRabbit * Chores * Enhanced error handling and diagnostic output in development testing infrastructure.	2026-02-27 18:24:07 +01:00
Andrei Kvapil	022ddf73a8	[apps] Add OpenBAO as a managed secrets management service (#2059) ## What this PR does Adds OpenBAO (open-source Vault fork) as a new managed PaaS application in Cozystack. Structure follows existing app patterns (qdrant, nats): - System chart with vendored upstream `openbao/openbao` (chart v0.25.3, appVersion v2.5.0) - App chart with standalone/HA mode switching based on replicas count - TLS via cert-manager self-signed certificates per instance - ApplicationDefinition, PackageSource, PaaS bundle entry - E2E test with init/unseal workflow Key design decisions: - `replicas: 1` → standalone mode with file storage; `replicas > 1` → HA with Raft integrated storage and retry_join with TLS peer verification - TLS enabled by default: each instance gets a self-signed Certificate with DNS SANs covering services and pod addresses - `disable_mlock = true` in HCL config since default security context drops IPC_LOCK capability - Injector and CSI provider disabled (cluster-scoped components, not safe per-tenant) - No auto-init/unseal: OpenBAO requires manual initialization by design - E2E test performs full lifecycle: deploy, wait for certificate + API, init, unseal, verify readiness, cleanup ### Release note ```release-note [apps] Add OpenBAO as a managed secrets management service with standalone and HA Raft modes, TLS enabled by default ``` ## Summary by CodeRabbit ## Release Notes * New Features * Added OpenBAO managed secrets management service with high-availability and standalone deployment options * Integrated monitoring and dashboards for operational visibility * Enabled configurable external access and web UI * Added automated snapshot backup capability	2026-02-27 11:11:59 +01:00
Myasnikov Daniil	3bf43312aa	(ci) Removed cozytest output trimming in non-tty run Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>	2026-02-27 12:43:00 +05:00
Myasnikov Daniil	fd6d0c3603	(ci) Added extra debug commands for k8s startup Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>	2026-02-27 12:41:40 +05:00
Andrei Kvapil	948346ef6d	fix(platform): use original cozystack.io/ui label in migration 26 and simplify migration script Migration 26 was using apps.cozystack.io/application.kind=Monitoring label which is added by migration 22 and may not be present on v0.41.1 clusters. Switch to cozystack.io/ui=true (guaranteed on old HRs) with field-selector for exact name match. Also remove redundant bundle enabled flags from migrate-to-version-1.0.sh since the variant already determines them via its values file. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-24 23:49:43 +01:00
Andrei Kvapil	da597225d1	fix(platform): add missing field mappings in migrate-to-version-1.0.sh Add ConfigMap fields that were not converted to Package values: - bundle-disable → bundles.disabledPackages - bundle-enable → bundles.enabledPackages - expose-ingress → publishing.ingressName - expose-services → publishing.exposedServices Remove incorrect bundles.system.type field that is not part of the Package values schema. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-24 23:34:53 +01:00
Andrei Kvapil	02064888a4	feat(platform): make cluster issuer name and ACME solver configurable (#2077) ## What this PR does Previously `_cluster.clusterissuer` controlled the ACME solver type using values `http01` / `cloudflare`, and every ingress template hardcoded `cert-manager.io/cluster-issuer: letsencrypt-prod` with no way to override it. This PR adds new parameters to the platform chart: - `publishing.certificates.solver` (default: `http01`) - `publishing.certificates.issuerName` (default: `letsencrypt-prod`) These replace the previous single parameter, `publishing.certificates.issuerType`. The previous `certificates.issuerType` was renamed to `certificates.solver`, and its value `cloudflare` was renamed to `dns01` to use standard ACME terminology. The new `certificates.issuerName` (default: `letsencrypt-prod`) is propagated as `_cluster.issuer-name` to all packages via `cozystack-values`, and its value then appears in the `cert-manager.io/cluster-issuer` annotation across 8 ingress templates in system applications. `publishing.certificates.solver` can be left empty to support `selfsigned-cluster-issuer` explicitly, or set to any value, although that can be a bit confusing. Operators can now point ingresses at any ClusterIssuer (custom ACME, self-signed, internal CA) by setting `certificates.issuerName` without touching individual package templates. ## Breaking changes \| What changed \| Before \| After \| \|---\|---\|---\| \| Solver key \| `certificates.issuerType` \| `certificates.solver` \| \| Cloudflare solver value \| `issuerType: cloudflare` \| `solver: dns01` \| These changes are handled by the migration when upgrading cozystack from v1, or by the `migration-to-v1.0.sh` script (also checked by the migration later). No action from the user is needed. ### Release note ```release-note [platform] Added publishing.certificates.solver (http01/dns01) and publishing.certificates.issuerName fields to allow configuring ACME challenge type and ClusterIssuer per installation, replacing the old implicit issuerType field [platform] Migration script and upgrade hook (migration 32) convert old clusterissuer/issuerType fields to the new solver/issuerName fields ``` ## Summary by CodeRabbit * Chores * Migrated certificate issuer configuration from legacy `issuerType` field to new `solver` and `issuerName` fields system-wide. * Automated migration script converts existing configurations, mapping legacy values (cloudflare, http01) to new format. * Updated all certificate-related templates to use new configurable solver and issuer settings with sensible defaults.	2026-02-20 23:09:12 +01:00
Andrei Kvapil	d856775961	feat(kubernetes): update supported versions to v1.30-v1.35 (#2073) ## What this PR does Updates Kubernetes version support to match current release landscape and Talos 1.12 compatibility: - Update Kamaji from `edge-25.4.1` to `edge-26.2.4` (adds K8s 1.35 support) - Update Kubernetes version matrix: v1.30, v1.31, v1.32, v1.33, v1.34, v1.35 - Drop EOL versions v1.28 and v1.29 - Remove merged-upstream patch (992.diff, label preservation fix) - Regenerate disable-datastore-check.diff for new Kamaji version Changes: - Default Kubernetes version is now v1.35 - E2E tests will validate v1.35 (latest) and v1.34 (previous) - Patch versions updated to latest available (v1.35.0, v1.34.4, v1.33.8, v1.32.12, v1.31.14, v1.30.14) ### Release note ```release-note [kubernetes] Update supported Kubernetes versions to v1.30-v1.35 ``` ## Summary by CodeRabbit * New Features * Added a Kamaji CRDs Helm chart with DataStore and KubeconfigGenerator resources, plus deployment templates and configurable kubeconfigGenerator settings * DataStore now supports multiple backends (etcd, MySQL, PostgreSQL, NATS) with TLS/auth validations and status tracking (observedGeneration) * Chores * Bumped default Kubernetes version from v1.33 to v1.35 (added v1.34; removed v1.28–v1.29) * Updated charts, packaging metadata, README/docs and helm ignore/Makefile entries; updated builder base image and chart dependencies	2026-02-20 20:12:56 +01:00
Myasnikov Daniil	c98b6203a7	fix(platform): fix migrate script to account clusterissuer parameter Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>	2026-02-20 16:41:41 +05:00
Aleksei Sviridkin	8f1e52690d	test(e2e): fix kubernetes-previous retry failures - Kill stale port-forward processes before starting a new one; on retries, the previous attempt's port-forward still holds the port, causing all kubectl commands to get "connection refused" - Use -ge 2 instead of -eq 2 for node count check; MachineHealthCheck may create a 3rd VM, leading to 3 nodes joining the tenant cluster which would never satisfy the exact equality check - Increase node join timeout from 5m to 8m; QEMU VMs with v1.34 need more time to boot and join when running after kubernetes-latest Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-20 03:23:50 +03:00
Aleksei Sviridkin	00ab6e792c	test(e2e): increase worker node join timeout to 5 minutes When running kubernetes-latest and kubernetes-previous E2E tests simultaneously, worker VMs compete for resources in the sandbox environment. 3 minutes was insufficient for nodes to boot and join the tenant cluster under load. Increase to 5 minutes. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-20 01:30:10 +03:00
Aleksei Sviridkin	4e5455c72c	fix(e2e): poll for CRD existence before waiting for Established condition kubectl wait fails immediately with NotFound if the CRD does not exist yet. The operator creates CRDs asynchronously on startup, so wrap the wait in a retry loop that tolerates the initial absence. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-19 19:24:19 +03:00
Aleksei Sviridkin	dbfdbc8298	fix(installer): check parsePlatformSourceURL error, wait for PackageSource in E2E Explicitly check error from parsePlatformSourceURL instead of relying on the implicit guarantee that installPlatformSourceResource already checked it. This prevents latent bugs if startup order is ever restructured. Add wait for platform PackageSource existence in E2E test before creating Package resource, preventing flaky failures when operator startup is slow. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-19 17:57:45 +03:00
Aleksei Sviridkin	8450830f06	fix(installer): add CRD wait in E2E, unit tests for PackageSource creation Add explicit CRD wait (kubectl wait crd --for=condition=Established) in E2E test before creating Package resources, preventing race condition between operator CRD installation and resource creation. Add unit tests for installPlatformPackageSource covering create, update, and GitRepository sourceRef kind scenarios. Document that hardcoded variant list is an intentional design choice matching the previous Helm template behavior. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-19 17:50:09 +03:00
Aleksei Sviridkin	655133b81c	fix(installer): move PackageSource creation from Helm template to operator Replace the Helm hook approach with programmatic PackageSource creation in the operator startup sequence. Helm hooks are unsuitable for persistent resources like PackageSource because before-hook-creation policy causes cascade deletion of owned ArtifactGenerators during upgrades. The operator now creates the platform PackageSource after installing CRDs and the Flux source resource, using the same create-or-update pattern as installPlatformSourceResource(). The sourceRef.kind is derived from the platform source URL (OCIRepository for oci://, GitRepository for git). Also fix stale comment in e2e test referencing deleted crds/ directory. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-19 17:38:23 +03:00
Aleksei Sviridkin	668ddc552e	refactor(installer): remove CRDs from Helm chart, rely on operator --install-crds Remove the crds/ directory from the cozy-installer Helm chart. The operator already installs embedded CRDs via server-side apply on every startup with the --install-crds=true flag, making the Helm crds/ directory redundant. Convert templates/packagesource.yaml to a Helm post-install/post-upgrade hook so it is applied after the operator has started and installed CRDs. Update codegen to write CRDs only to internal/crdinstall/manifests/ (single source of truth) and update the Makefile to source build assets from there. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-19 17:23:37 +03:00
Andrei Kvapil	7ff5b2ba23	[harbor] Add managed Harbor container registry (#2058) ## What this PR does Adds Harbor v2.14.2 as a managed tenant-level container registry service in the PaaS bundle. Architecture: - Wrapper chart (`apps/harbor`): HelmRelease, Ingress, WorkloadMonitors, BucketClaim, dashboard RBAC - Vendored upstream chart (`system/harbor`) from helm.goharbor.io v1.18.2 - System chart (`system/harbor`) provisions PostgreSQL via CloudNativePG and Redis via redis-operator - ApplicationDefinition (`system/harbor-rd`) for dynamic `Harbor` CRD registration - PackageSource and paas.yaml bundle entry for platform integration Key design decisions: - Database and Redis provisioned via CNPG and redis-operator (not internal Helm-based instances) for reliable day-2 operations - Registry image storage uses S3 via COSI BucketClaim/BucketAccess from namespace SeaweedFS - Trivy vulnerability scanner cache uses PVC (S3 not supported by vendored chart) - Token CA key/cert persisted across upgrades via Secret lookup - Per-component resource configuration (core, registry, jobservice, trivy) - Ingress with TLS via cert-manager, cloudflare issuer type handling, proxy timeouts for large image pushes - Auto-generated admin credentials persisted across upgrades E2E test: Creates Harbor instance, verifies HelmRelease readiness, deployment availability, credentials secret, service port, then cleans up. ### Release note ```release-note [harbor] Add managed Harbor container registry as a tenant-level service ``` ## Summary by CodeRabbit * New Features * Added Harbor container registry deployment with integrated Kubernetes support, including database and cache layers. * Enabled metrics monitoring via Prometheus integration. * Configured dashboard management interface for Harbor administration. * Tests * Added end-to-end testing for Harbor deployment and verification. * Chores * Integrated Harbor into the platform's application package bundle.	2026-02-18 13:54:26 +01:00
Aleksei Sviridkin	87d0390256	fix(harbor): include tenant domain in default hostname and add E2E cleanup Use tenant base domain in default hostname construction (harbor.RELEASE.DOMAIN) to match the pattern used by other apps (kubernetes, vpn). Remove unused $ingress variable from harbor.yaml. Add cleanup of stale resources from previous failed E2E runs. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 02:07:38 +03:00
Andrei Kvapil	3b267d6882	refactor(e2e): use helm install instead of kubectl apply for cozystack installation (#2060) ## Summary - Replace pre-rendered static YAML application (`kubectl apply`) with direct `helm upgrade --install` of the `packages/core/installer` chart in E2E tests - Remove CRD/operator artifact upload/download from CI workflow; the chart with correct values is already present in the sandbox via workspace copy and `pr.patch` - Remove `copy-installer-manifest` Makefile target and its dependencies ## Test plan - [ ] CI build job completes without uploading CRD/operator artifacts - [ ] E2E `install-cozystack` step succeeds with `helm upgrade --install` - [ ] All existing E2E app tests pass ## Summary by CodeRabbit * Chores * PR workflows now only keep the primary disk asset; publishing/fetching of auxiliary operator and CRD artifacts removed. * CRD manifests are produced by concatenation and a verify-crds check was added to unit tests; file-write permissions for embedded manifests tightened. * New Features * Operator can install CRDs at startup to ensure resources exist before reconcile. * E2E install now uses the chart-based installer flow. * Tests * Added comprehensive tests for CRD-install handling and manifest writing.	2026-02-17 23:31:08 +01:00
Aleksei Sviridkin	e7ffc21743	feat(harbor): switch registry storage to S3 via COSI BucketClaim Replace PVC-based registry storage with S3 via COSI BucketClaim/BucketAccess. The system chart parses BucketInfo secret and creates a registry-s3 Secret with REGISTRY_STORAGE_S3_* env vars that override Harbor's ConfigMap values. - Add bucket-secret.yaml to system chart (BucketInfo parser) - Remove storageType/size from registry config (S3 is now the only option) - Use Harbor's existingSecret support for S3 credentials injection - Add objectstorage-controller to PackageSource dependencies - Update E2E test with COSI bucket provisioning waits and diagnostics Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 01:23:02 +03:00
kklinch0	7ac989923d	Add monitoring for NATS Co-authored-by: Andrei Kvapil <kvapss@gmail.com> Signed-off-by: kklinch0 <kklinch0@gmail.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-17 22:54:12 +01:00
Aleksei Sviridkin	0f2ba5aba2	fix(harbor): add diagnostic output on E2E system HelmRelease timeout Dump HelmRelease status, pods, events, and ExternalArtifact info when harbor-test-system fails to become ready, to diagnose the root cause of the persistent timeout. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:50:12 +03:00
Aleksei Sviridkin	490faaf292	fix(harbor): add operator dependencies, fix persistence rendering, increase E2E timeout Add postgres-operator and redis-operator to PackageSource dependsOn to ensure CRDs are available before Harbor system chart deploys. Make persistentVolumeClaim conditional to avoid empty YAML mapping when using S3 storage without Trivy. Increase E2E system HelmRelease timeout from 300s to 600s to account for CNPG + Redis + Harbor bootstrap time on QEMU. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:50:12 +03:00
Aleksei Sviridkin	cea57f62c8	[harbor] Make registry storage configurable: S3 or PVC Add registry.storageType parameter (pvc/s3) to let users choose between PVC storage and S3 via COSI BucketClaim. Default is pvc, which works without SeaweedFS in the tenant namespace. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:50:12 +03:00
Aleksei Sviridkin	c815725bcf	[harbor] Fix E2E test: use correct HelmRelease name with prefix ApplicationDefinition has prefix "harbor-", so CR name "harbor" produces HelmRelease "harbor-harbor". Use name="test" and release="harbor-test" to correctly reference all resources. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:50:12 +03:00
Aleksei Sviridkin	0c85639fed	[harbor] Move to apps/, use S3 via BucketClaim for registry storage Move Harbor from packages/extra/ to packages/apps/ as it is a self-sufficient end-user application, not a singleton tenant module. Update bundle entry from system to paas accordingly. Replace registry PVC storage with S3 via COSI BucketClaim/BucketAccess, provisioned from the namespace's SeaweedFS instance. S3 credentials are injected into the HelmRelease via valuesFrom with targetPath. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:50:12 +03:00
Aleksei Sviridkin	2dd3c03279	[harbor] Use CNPG and redis-operator instead of internal databases Replace Harbor's internal PostgreSQL with CloudNativePG operator and internal Redis with redis-operator (RedisFailover), following established Cozystack patterns from seaweedfs and redis apps. Additional fixes from code review: - Fix registry resources nesting level (registry.registry/controller) - Persist token CA across upgrades to prevent JWT invalidation - Update values schema and ApplicationDefinition Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:50:11 +03:00
Aleksei Sviridkin	543ce6e5fd	[harbor] Add managed Harbor container registry application Add Harbor v2.14.2 as a tenant-level managed service with per-component resource configuration, ingress with TLS termination, and internal PostgreSQL/Redis. Includes: - extra/harbor wrapper chart with HelmRelease, WorkloadMonitors, Ingress - system/harbor with vendored upstream chart (helm.goharbor.io v1.18.2) - harbor-rd ApplicationDefinition for dynamic CRD registration - PackageSource and system.yaml bundle entry - E2E test with Secret and Service verification Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:50:11 +03:00
Aleksei Sviridkin	1558fb428a	build(codegen): sync CRDs to operator embed directory After generating CRDs to packages/core/installer/crds/, copy them to internal/crdinstall/manifests/ so the operator binary embeds the latest CRD definitions. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:49:56 +03:00
Aleksei Sviridkin	55cd8fc0e1	refactor(installer): move CRDs to crds/ directory for proper Helm install ordering Helm installs crds/ contents before processing templates, resolving the chicken-and-egg problem where PackageSource CR validation fails because its CRD hasn't been registered yet. - Move definitions/ to crds/ in the installer chart - Remove templates/crds.yaml (Helm auto-installs from crds/) - Update codegen script to write CRDs to crds/ - Replace helm template with cat for static CRD manifest generation - Remove pre-apply CRD workaround from e2e test Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:49:55 +03:00
Aleksei Sviridkin	58dfc97201	fix(e2e): apply CRDs before helm install to resolve dependency ordering Helm cannot validate PackageSource CR during install because the CRD is part of the same chart. Pre-apply CRDs via helm template + kubectl apply --server-side before running helm upgrade --install. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:49:55 +03:00
Aleksei Sviridkin	153d2c48ae	refactor(e2e): use helm install instead of kubectl apply for cozystack installation Replace pre-rendered static YAML application with direct helm chart installation in e2e tests. The chart directory with correct values is already present in the sandbox after pr.patch application. - Remove CRD/operator artifact upload/download from CI workflow - Remove copy-installer-manifest target from testing Makefile - Use helm upgrade --install from local chart in e2e-install-cozystack.bats Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-18 00:49:55 +03:00
Aleksei Sviridkin	dd4723386f	test(openbao): add E2E test for standalone mode Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-17 21:23:27 +03:00
Andrei Kvapil	6c431d0857	fix(codegen): add gen_client to update-codegen.sh and regenerate applyconfiguration (#2061) ## What this PR does Fix build error in `pkg/generated/applyconfiguration/utils.go` caused by a reference to `testing.TypeConverter` which was removed in client-go v0.34.1. The root cause was that `hack/update-codegen.sh` called `gen_helpers` and `gen_openapi` but never called `gen_client`, so the applyconfiguration code was never regenerated after the client-go upgrade. Changes: - Fix `THIS_PKG` from `k8s.io/sample-apiserver` template leftover to correct module path - Add `kube::codegen::gen_client` call with `--with-applyconfig` flag - Regenerate applyconfiguration (now uses `managedfields.TypeConverter`) - Add tests for `ForKind` and `NewTypeConverter` functions ### Release note ```release-note [maintenance] Regenerate applyconfiguration code for client-go v0.34.1 compatibility ``` ## Summary by CodeRabbit * Documentation * Updated backup class definitions example to reference MariaDB instead of MySQL. * Chores * Updated code generation tooling and module dependencies to support enhanced functionality.	2026-02-17 18:21:39 +01:00
Aleksei Sviridkin	a52da8dd8d	style(e2e): consistently quote kubeconfig variable references Quote all tenantkubeconfig-${test_name} references in run-kubernetes.sh for consistent shell scripting style. The only exception is line 195 inside a sh -ec "..." double-quoted string where inner quotes would break the outer quoting. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-17 02:00:44 +03:00
Aleksei Sviridkin	315e5dc0bd	fix(e2e): make kubernetes test retries effective by cleaning up stale resources When the kubernetes E2E test fails at the deployment wait step, set -eu causes immediate exit before cleanup. On retry, kubectl apply outputs "unchanged" for the stuck deployment, making retries 2 and 3 guaranteed to fail against the same stuck pod. Add pre-creation cleanup of backend deployment/service and NFS test resources using --ignore-not-found, so retries start fresh. Also increase the deployment wait timeout from 90s to 300s to handle CI resource pressure, aligning with other timeouts in the same function. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-17 01:58:13 +03:00
Aleksei Sviridkin	75e25fa977	fix(codegen): add gen_client to update-codegen.sh and regenerate applyconfiguration The applyconfiguration code referenced testing.TypeConverter from k8s.io/client-go/testing, which was removed in client-go v0.34.1. Root cause: hack/update-codegen.sh called gen_helpers and gen_openapi but not gen_client, so applyconfiguration was never regenerated after the client-go upgrade. Changes: - Fix THIS_PKG from sample-apiserver template leftover to correct module path - Add kube::codegen::gen_client call with --with-applyconfig flag - Regenerate applyconfiguration (now uses managedfields.TypeConverter) - Add tests for ForKind and NewTypeConverter functions Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-16 23:01:38 +03:00
Andrei Kvapil	2bc5e01fda	fix kubernetes e2e test for rwx volume Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-14 11:02:12 +01:00
Andrei Kvapil	dbba5c325b	fix kubernetes e2e test Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-14 10:13:09 +01:00
Andrei Kvapil	13aa341a28	fix(platform): address review comments in vm migration script Replace `\|\| echo ""` with `\|\| true` to avoid newline bugs in variable assignments. Switch `for x in $(cmd)` loops to `while read` for safer iteration over kubectl output. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-14 03:02:54 +01:00
Andrei Kvapil	168f6f2445	fix(csi): address review feedback for kubevirt-csi-driver RWX support - Move nil check before req dereference in CreateVolume - Scope CiliumNetworkPolicy endpointSelector to specific VMI - Use vmNamespace from NodeId for VMI lookup instead of infraNamespace - Log PVC lookup errors in ControllerExpandVolume - Wrap CNP ownerReference updates in retry.RetryOnConflict - Fix infraClusterLabels validation to check runControllerService flag - Dereference nodeName pointer in error message - Replace panic with klog.Fatal for consistent error handling - Honor CSI readonly flag in NFS NodePublishVolume - Log mount list errors in isNFSMount - Reorder Dockerfile ENTRYPOINT after COPY for better layer caching - Add cleanup on e2e test failure and --wait on pod deletion Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-13 09:04:33 +01:00
Andrei Kvapil	46103400f2	test(e2e): adapt kubernetes NFS test for native RWX CSI support Remove separate NFS Application dependency from e2e test. The kubevirt CSI driver wrapper now handles RWX Filesystem volumes natively - PVCs with ReadWriteMany accessMode use the standard kubevirt StorageClass. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-13 08:57:14 +01:00
Andrei Kvapil	9a86551e40	fix(e2e): correct s3Bucket reference in mariadb test Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-12 15:18:06 +01:00
Andrei Kvapil	bce5300116	refactor: rename mysql application to mariadb The mysql chart actually deploys MariaDB via mariadb-operator, but was incorrectly named "mysql". Rename all references to use the correct "mariadb" name across the codebase. Changes: - Rename packages/apps/mysql -> packages/apps/mariadb - Rename packages/system/mysql-rd -> packages/system/mariadb-rd - Rename platform source and bundle references - Update CRD kind from MySQL to MariaDB - Update RBAC, e2e tests, backup controller tests - Keep real MySQL CLI/config tool names unchanged (mysqldump, [mysqld], etc.) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-12 15:15:20 +01:00
Andrei Kvapil	0260b15aaf	refactor(apps): remove FerretDB application (#2028) ## What this PR does Remove the FerretDB managed application from Cozystack. This includes the application Helm chart, resource definition, platform source, PaaS bundle entry, RBAC clusterrole entry, and e2e test. Historical migration scripts are left intact for upgrade compatibility. ### Release note ```release-note [ferretdb] Removed FerretDB managed application ``` ## Summary by CodeRabbit * Chores * Removed FerretDB managed database service and associated Helm chart, documentation, and test components from the platform.	2026-02-12 09:30:01 +01:00
Andrei Kvapil	3971e9cb39	[installer] Rename talos asset to cozystack-operator-talos.yaml Add -talos suffix to the default variant output file for consistency with -generic and -hosted variants. Update all references in CI workflows, e2e tests, upload scripts, and testing Makefile. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2026-02-11 22:05:37 +01:00
Timofei Larkin	5f27152d18	[ci] Cozyreport improvements (#2032) ## What this PR does Previously, the debug log collection script that fires when CI fails treated Packages and PackageSources as namespaced resources and, as a result of incorrect parsing, failed to correctly `kubectl describe` and `kubectl get -oyaml` them. Additionally, the script did not read the logs of init containers. These issues are fixed with this patch. ### Release note ```release-note [ci] Improvements to cozyreport.sh (ci log collection script): fix retrieval of Package and PackageSource details, consider initContainers as well as containers, when fetching logs of errored pods. ```	2026-02-11 20:40:07 +04:00
Aleksei Sviridkin	2673624261	fix(e2e): apply increased timeout only to ingress-nginx Keep the 1-minute timeout for other components (cilium, coredns, csi, vsnap-crd) to preserve fast failure detection, and apply the 5-minute timeout specifically to ingress-nginx which needs it after the hostNetwork to NodePort migration. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>	2026-02-11 17:39:02 +03:00
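
Commit 4e5455c72c above describes a recurring pattern: `kubectl wait` fails immediately with NotFound if the CRD does not exist yet, so the wait is wrapped in a retry loop that tolerates the initial absence. The following is a minimal shell sketch of that idea, not the repository's actual test code. The helper names `retry` and `wait_for_crd`, the attempt count, the delays, and the CRD name in the usage comment are all assumptions made for illustration.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS run out.
# (Hypothetical helper; sleeps DELAY seconds between failed attempts.)
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# wait_for_crd NAME: first poll until the CRD object exists, then wait for
# its Established condition. (Hypothetical helper; timings are assumptions.)
wait_for_crd() {
  crd=$1
  retry 30 2 kubectl get crd "$crd" >/dev/null 2>&1 || return 1
  kubectl wait "crd/$crd" --for=condition=Established --timeout=60s
}

# Example call (hypothetical CRD name):
#   wait_for_crd packages.example.io
```

Splitting the existence poll from the condition wait keeps the NotFound case, which is expected while the operator is still starting, separate from a CRD that exists but never becomes Established.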


251 Commits