fix updateStatus field

fix image tag for victorialogs
fix
2026-01-30 02:18:49 +00:00 · 2025-04-10 14:28:29 +03:00 · 2025-04-10 14:04:45 +03:00 · 2025-04-10 11:58:50 +03:00 · 2025-04-10 11:58:50 +03:00 · 2025-04-10 11:58:50 +03:00
187 changed files with 18362 additions and 9149 deletions
--- a/.github/workflows/pre-commit.yml
+++ b/.github/workflows/pre-commit.yml
@@ -1,7 +1,12 @@
 name: Pre-Commit Checks

-on: [push, pull_request]
-
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    paths-ignore:
+      - '**.md'
 jobs:
  pre-commit:
    runs-on: ubuntu-22.04
--- a/.github/workflows/pull-requests-release.yaml
+++ b/.github/workflows/pull-requests-release.yaml
@@ -1,4 +1,4 @@
-name: Verify and Finalize Release PR
+name: Releasing PR

 on:
  pull_request:
--- a/.github/workflows/pull-requests.yaml
+++ b/.github/workflows/pull-requests.yaml
@@ -1,4 +1,4 @@
-name: Build and Test
+name: Pull Request

 on:
  pull_request:
@@ -6,7 +6,7 @@ on:

 jobs:
  e2e:
-    name: Build and Test for Pull Requests
+    name: Build and Test
    runs-on: [self-hosted]
    permissions:
      contents: read
--- a/.github/workflows/tags.yaml
+++ b/.github/workflows/tags.yaml
@@ -1,4 +1,4 @@
-name: Prepare Release
+name: Versioned Tag

 on:
  push:
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -6,13 +6,13 @@ As you get started, you are in the best position to give us feedbacks on areas o

 * Problems found while setting up the development environment
 * Gaps in our documentation
-* Bugs in our Github actions
+* Bugs in our GitHub Actions

-First, though, it is important that you read the [code of conduct](CODE_OF_CONDUCT.md).
+First, though, it is important that you read the [CNCF Code of Conduct](https://github.com/cncf/foundation/blob/master/code-of-conduct.md).

 The guidelines below are a starting point. We don't want to limit your
 creativity, passion, and initiative. If you think there's a better way, please
-feel free to bring it up in a Github discussion, or open a pull request. We're
+feel free to bring it up in a GitHub discussion, or open a pull request. We're
 certain there are always better ways to do things, we just need to start some
 constructive dialogue!

@@ -23,9 +23,9 @@ We welcome many types of contributions including:
 * New features
 * Builds, CI/CD
 * Bug fixes
-* [Documentation](https://github.com/cozystack/cozystack-website/tree/main)
+* [Documentation](https://github.com/cozystack/cozystack-website/tree/main)
 * Issue Triage
-* Answering questions on Slack or Github Discussions
+* Answering questions on Slack or GitHub Discussions
 * Web design
 * Communications / Social Media / Blog Posts
 * Events participation
@@ -34,7 +34,7 @@ We welcome many types of contributions including:
 ## Ask for Help

 The best way to reach us with a question when contributing is to drop a line in
-our [Telegram channel](https://t.me/cozystack), or start a new Github discussion.
+our [Telegram channel](https://t.me/cozystack), or start a new GitHub discussion.

 ## Raising Issues

--- a/README.md
+++ b/README.md
@@ -12,20 +12,21 @@

 **Cozystack** is a free PaaS platform and framework for building clouds.

-With Cozystack, you can transform your bunch of servers into an intelligent system with a simple REST API for spawning Kubernetes clusters, Database-as-a-Service, virtual machines, load balancers, HTTP caching services, and other services with ease.
+With Cozystack, you can transform a bunch of servers into an intelligent system with a simple REST API for spawning Kubernetes clusters,
+Database-as-a-Service, virtual machines, load balancers, HTTP caching services, and other services with ease.

-You can use Cozystack to build your own cloud or to provide a cost-effective development environments.  
+Use Cozystack to build your own cloud or provide a cost-effective development environment.  

 ## Use-Cases

-* [**Using Cozystack to build public cloud**](https://cozystack.io/docs/use-cases/public-cloud/)  
-You can use Cozystack as backend for a public cloud
+* [**Using Cozystack to build a public cloud**](https://cozystack.io/docs/guides/use-cases/public-cloud/)  
+You can use Cozystack as a backend for a public cloud

-* [**Using Cozystack to build private cloud**](https://cozystack.io/docs/use-cases/private-cloud/)  
-You can use Cozystack as platform to build a private cloud powered by Infrastructure-as-Code approach
+* [**Using Cozystack to build a private cloud**](https://cozystack.io/docs/guides/use-cases/private-cloud/)  
+You can use Cozystack as a platform to build a private cloud powered by an Infrastructure-as-Code approach

-* [**Using Cozystack as Kubernetes distribution**](https://cozystack.io/docs/use-cases/kubernetes-distribution/)  
-You can use Cozystack as Kubernetes distribution for Bare Metal
+* [**Using Cozystack as a Kubernetes distribution**](https://cozystack.io/docs/guides/use-cases/kubernetes-distribution/)  
+You can use Cozystack as a Kubernetes distribution for Bare Metal

 ## Screenshot

@@ -33,11 +34,11 @@ You can use Cozystack as Kubernetes distribution for Bare Metal

 ## Documentation

-The documentation is located on official [cozystack.io](https://cozystack.io) website.
+The documentation is located on the [cozystack.io](https://cozystack.io) website.

-Read [Get Started](https://cozystack.io/docs/get-started/) section for a quick start.
+Read the [Getting Started](https://cozystack.io/docs/getting-started/) section for a quick start.

-If you encounter any difficulties, start with the [troubleshooting guide](https://cozystack.io/docs/troubleshooting/), and work your way through the process that we've outlined.
+If you encounter any difficulties, start with the [troubleshooting guide](https://cozystack.io/docs/operations/troubleshooting/) and work your way through the process that we've outlined.

 ## Versioning

@@ -50,15 +51,15 @@ A full list of the available releases is available in the GitHub repository's [R

 Contributions are highly appreciated and very welcome!

-In case of bugs, please, check if the issue has been already opened by checking the [GitHub Issues](https://github.com/cozystack/cozystack/issues) section.
-In case it isn't, you can open a new one: a detailed report will help us to replicate it, assess it, and work on a fix.
+If you find a bug, please check whether it has already been reported in the [GitHub Issues](https://github.com/cozystack/cozystack/issues) section.
+If it hasn't, you can open a new one. A detailed report will help us replicate it, assess it, and work on a fix.

-You can express your intention in working on the fix on your own.
+You can express your intention to work on the fix on your own.
 Commits are used to generate the changelog, and their author will be referenced in it.

-In case of **Feature Requests** please use the [Discussion's Feature Request section](https://github.com/cozystack/cozystack/discussions/categories/feature-requests).
+If you have **Feature Requests**, please use the [Discussion's Feature Request section](https://github.com/cozystack/cozystack/discussions/categories/feature-requests).

-You can join our weekly community meetings (just add this events to your [Google Calendar](https://calendar.google.com/calendar?cid=ZTQzZDIxZTVjOWI0NWE5NWYyOGM1ZDY0OWMyY2IxZTFmNDMzZTJlNjUzYjU2ZGJiZGE3NGNhMzA2ZjBkMGY2OEBncm91cC5jYWxlbmRhci5nb29nbGUuY29t) or [iCal](https://calendar.google.com/calendar/ical/e43d21e5c9b45a95f28c5d649c2cb1e1f433e2e653b56dbbda74ca306f0d0f68%40group.calendar.google.com/public/basic.ics)) or [Telegram group](https://t.me/cozystack).
+You are welcome to join our weekly community meetings (just add these events to your [Google Calendar](https://calendar.google.com/calendar?cid=ZTQzZDIxZTVjOWI0NWE5NWYyOGM1ZDY0OWMyY2IxZTFmNDMzZTJlNjUzYjU2ZGJiZGE3NGNhMzA2ZjBkMGY2OEBncm91cC5jYWxlbmRhci5nb29nbGUuY29t) or [iCal](https://calendar.google.com/calendar/ical/e43d21e5c9b45a95f28c5d649c2cb1e1f433e2e653b56dbbda74ca306f0d0f68%40group.calendar.google.com/public/basic.ics)) or our [Telegram group](https://t.me/cozystack).

 ## License

--- a/cmd/cozystack-controller/main.go
+++ b/cmd/cozystack-controller/main.go
@@ -178,6 +178,15 @@ func main() {
 		setupLog.Error(err, "unable to create controller", "controller", "WorkloadMonitor")
 		os.Exit(1)
 	}
+
+	if err = (&controller.WorkloadReconciler{
+		Client: mgr.GetClient(),
+		Scheme: mgr.GetScheme(),
+	}).SetupWithManager(mgr); err != nil {
+		setupLog.Error(err, "unable to create controller", "controller", "Workload")
+		os.Exit(1)
+	}
+
 	// +kubebuilder:scaffold:builder

 	if err := mgr.AddHealthzCheck("healthz", healthz.Ping); err != nil {
--- a/hack/e2e.sh
+++ b/hack/e2e.sh
@@ -113,6 +113,11 @@ machine:
        - usermode_helper=disabled
    - name: zfs
    - name: spl
+  registries:
+    mirrors:
+      docker.io:
+        endpoints:
+        - https://mirror.gcr.io
  files:
  - content: |
      [plugins]
@@ -313,7 +318,12 @@ kubectl patch -n tenant-root tenants.apps.cozystack.io root --type=merge -p '{"s
 timeout 60 sh -c 'until kubectl get hr -n tenant-root etcd ingress monitoring tenant-root; do sleep 1; done'

 # Wait for HelmReleases to be installed
-kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr etcd ingress monitoring tenant-root
+kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr etcd ingress tenant-root
+
+if ! kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr monitoring; then
+  flux reconcile hr monitoring -n tenant-root --force
+  kubectl wait --timeout=2m --for=condition=ready -n tenant-root hr monitoring
+fi

 kubectl patch -n tenant-root ingresses.apps.cozystack.io ingress --type=merge -p '{"spec":{
  "dashboard": true
@@ -328,7 +338,7 @@ kubectl wait --timeout=5m --for=jsonpath=.status.readyReplicas=3 -n tenant-root

 # Wait for Victoria metrics
 kubectl wait --timeout=5m --for=jsonpath=.status.updateStatus=operational -n tenant-root vmalert/vmalert-shortterm vmalertmanager/alertmanager
-kubectl wait --timeout=5m --for=jsonpath=.status.status=operational -n tenant-root vlogs/generic
+kubectl wait --timeout=5m --for=jsonpath=.status.updateStatus=operational -n tenant-root vlogs/generic
 kubectl wait --timeout=5m --for=jsonpath=.status.clusterStatus=operational -n tenant-root vmcluster/shortterm vmcluster/longterm

 # Wait for grafana
@@ -347,5 +357,5 @@ kubectl patch -n cozy-system cm/cozystack --type=merge -p '{"data":{
  "oidc-enabled": "true"
 }}'

-timeout 60 sh -c 'until kubectl get hr -n cozy-keycloak keycloak keycloak-configure keycloak-operator; do sleep 1; done'
+timeout 120 sh -c 'until kubectl get hr -n cozy-keycloak keycloak keycloak-configure keycloak-operator; do sleep 1; done'
 kubectl wait --timeout=10m --for=condition=ready -n cozy-keycloak hr keycloak keycloak-configure keycloak-operator
--- a/hack/gen_versions_map.sh
+++ b/hack/gen_versions_map.sh
@@ -19,21 +19,19 @@ fi
 miss_map=$(echo "$new_map" | awk 'NR==FNR { nm[$1 " " $2] = $3; next } { if (!($1 " " $2 in nm)) print $1, $2, $3}' - "$file")

 # search across all tags sorted by version
-search_commits=$(git ls-remote --tags origin | grep 'refs/tags/v' | sort -k2,2 -rV | awk '{print $1}')
-# add latest main commit to search
-search_commits="${search_commits} $(git rev-parse "origin/main")"
+search_commits=$(git ls-remote --tags origin | awk -F/ '$3 ~ /^v[0-9]+\.[0-9]+\.[0-9]+/ {print}' | sort -k2,2 -rV | awk '{print $1}')

 resolved_miss_map=$(
  echo "$miss_map" | while read -r chart version commit; do
    # if version is found in HEAD, it's HEAD
-    if grep -q "^version: $version$" ./${chart}/Chart.yaml; then
+    if [ "$(awk '$1 == "version:" {print $2}' "./${chart}/Chart.yaml")" = "${version}" ]; then
      echo "$chart $version HEAD"
      continue
    fi

    # if commit is not HEAD, check if it's valid
    if [ $commit != "HEAD" ]; then
-      if ! git show "${commit}:./${chart}/Chart.yaml" 2>/dev/null | grep -q "^version: $version$"; then
+      if [ "$(git show "${commit}:./${chart}/Chart.yaml" 2>/dev/null | awk '$1 == "version:" {print $2}')" != "${version}" ]; then
        echo "Commit $commit for $chart $version is not valid" >&2
        exit 1
      fi
@@ -46,15 +44,15 @@ resolved_miss_map=$(
    # if commit is HEAD, but version is not found in HEAD, check all tags
    found_tag=""
    for tag in $search_commits; do
-      if git show "${tag}:./${chart}/Chart.yaml" 2>/dev/null | grep -q "^version: $version$"; then
+      if [ "$(git show "${tag}:./${chart}/Chart.yaml" 2>/dev/null | awk '$1 == "version:" {print $2}')" = "${version}" ]; then
        found_tag=$(git rev-parse --short "${tag}")
        break
      fi
    done
    
    if [ -z "$found_tag" ]; then
-      echo "Can't find $chart $version in any version tag or in the latest main commit" >&2
-      exit 1
+      echo "Can't find $chart $version in any version tag, removing it" >&2
+      continue
    fi
    
    echo "$chart $version $found_tag"
--- a/hack/upload-assets.sh
+++ b/hack/upload-assets.sh
@@ -7,3 +7,5 @@ gh release upload --clobber $version _out/assets/cozystack-installer.yaml
 gh release upload --clobber $version _out/assets/metal-amd64.iso
 gh release upload --clobber $version _out/assets/metal-amd64.raw.xz
 gh release upload --clobber $version _out/assets/nocloud-amd64.raw.xz
+gh release upload --clobber $version _out/assets/kernel-amd64
+gh release upload --clobber $version _out/assets/initramfs-metal-amd64.xz
--- a/internal/controller/workload_controller.go
+++ b/internal/controller/workload_controller.go
@@ -0,0 +1,87 @@
+package controller
+
+import (
+	"context"
+	"strings"
+
+	corev1 "k8s.io/api/core/v1"
+	apierrors "k8s.io/apimachinery/pkg/api/errors"
+	"k8s.io/apimachinery/pkg/runtime"
+	"k8s.io/apimachinery/pkg/types"
+	ctrl "sigs.k8s.io/controller-runtime"
+	"sigs.k8s.io/controller-runtime/pkg/client"
+	"sigs.k8s.io/controller-runtime/pkg/log"
+
+	cozyv1alpha1 "github.com/cozystack/cozystack/api/v1alpha1"
+)
+
+// WorkloadReconciler reconciles a Workload object, deleting it once the object it tracks is gone
+type WorkloadReconciler struct {
+	client.Client
+	Scheme *runtime.Scheme
+}
+
+func (r *WorkloadReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
+	logger := log.FromContext(ctx)
+	w := &cozyv1alpha1.Workload{}
+	err := r.Get(ctx, req.NamespacedName, w)
+	if err != nil {
+		if apierrors.IsNotFound(err) {
+			return ctrl.Result{}, nil
+		}
+		logger.Error(err, "Unable to fetch Workload")
+		return ctrl.Result{}, err
+	}
+
+	// it's being deleted, nothing to handle
+	if w.DeletionTimestamp != nil {
+		return ctrl.Result{}, nil
+	}
+
+	t := getMonitoredObject(w)
+	err = r.Get(ctx, types.NamespacedName{Name: t.GetName(), Namespace: t.GetNamespace()}, t)
+
+	// found object, nothing to do
+	if err == nil {
+		return ctrl.Result{}, nil
+	}
+
+	// error getting object but not 404 -- requeue
+	if !apierrors.IsNotFound(err) {
+		logger.Error(err, "failed to get dependent object", "kind", t.GetObjectKind(), "dependent-object-name", t.GetName())
+		return ctrl.Result{}, err
+	}
+
+	err = r.Delete(ctx, w)
+	if err != nil {
+		logger.Error(err, "failed to delete workload")
+	}
+	return ctrl.Result{}, err
+}
+
+// SetupWithManager registers our controller with the Manager and sets up watches.
+func (r *WorkloadReconciler) SetupWithManager(mgr ctrl.Manager) error {
+	return ctrl.NewControllerManagedBy(mgr).
+		// Watch Workload objects
+		For(&cozyv1alpha1.Workload{}).
+		Complete(r)
+}
+
+func getMonitoredObject(w *cozyv1alpha1.Workload) client.Object {
+	if strings.HasPrefix(w.Name, "pvc-") {
+		obj := &corev1.PersistentVolumeClaim{}
+		obj.Name = strings.TrimPrefix(w.Name, "pvc-")
+		obj.Namespace = w.Namespace
+		return obj
+	}
+	if strings.HasPrefix(w.Name, "svc-") {
+		obj := &corev1.Service{}
+		obj.Name = strings.TrimPrefix(w.Name, "svc-")
+		obj.Namespace = w.Namespace
+		return obj
+	}
+	obj := &corev1.Pod{}
+	obj.Name = w.Name
+	obj.Namespace = w.Namespace
+	return obj
+}
--- a/internal/controller/workloadmonitor_controller.go
+++ b/internal/controller/workloadmonitor_controller.go
@@ -3,6 +3,7 @@ package controller
 import (
 	"context"
 	"encoding/json"
+	"fmt"
 	"sort"

 	apierrors "k8s.io/apimachinery/pkg/api/errors"
@@ -33,6 +34,17 @@ type WorkloadMonitorReconciler struct {
 // +kubebuilder:rbac:groups=cozystack.io,resources=workloads,verbs=get;list;watch;create;update;patch;delete
 // +kubebuilder:rbac:groups=cozystack.io,resources=workloads/status,verbs=get;update;patch
 // +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch
+// +kubebuilder:rbac:groups=core,resources=persistentvolumeclaims,verbs=get;list;watch
+
+// isServiceReady checks if the service has an external IP bound
+func (r *WorkloadMonitorReconciler) isServiceReady(svc *corev1.Service) bool {
+	return len(svc.Status.LoadBalancer.Ingress) > 0
+}
+
+// isPVCReady checks if the PVC is bound
+func (r *WorkloadMonitorReconciler) isPVCReady(pvc *corev1.PersistentVolumeClaim) bool {
+	return pvc.Status.Phase == corev1.ClaimBound
+}

 // isPodReady checks if the Pod is in the Ready condition.
 func (r *WorkloadMonitorReconciler) isPodReady(pod *corev1.Pod) bool {
@@ -88,6 +100,96 @@ func updateOwnerReferences(obj metav1.Object, monitor client.Object) {
 	obj.SetOwnerReferences(owners)
 }

+// reconcileServiceForMonitor creates or updates a Workload object for the given Service and WorkloadMonitor.
+func (r *WorkloadMonitorReconciler) reconcileServiceForMonitor(
+	ctx context.Context,
+	monitor *cozyv1alpha1.WorkloadMonitor,
+	svc corev1.Service,
+) error {
+	logger := log.FromContext(ctx)
+	workload := &cozyv1alpha1.Workload{
+		ObjectMeta: metav1.ObjectMeta{
+			Name:      fmt.Sprintf("svc-%s", svc.Name),
+			Namespace: svc.Namespace,
+		},
+	}
+
+	resources := make(map[string]resource.Quantity)
+
+	q := resource.MustParse("0")
+
+	for _, ing := range svc.Status.LoadBalancer.Ingress {
+		if ing.IP != "" {
+			q.Add(resource.MustParse("1"))
+		}
+	}
+
+	resources["public-ips"] = q
+
+	_, err := ctrl.CreateOrUpdate(ctx, r.Client, workload, func() error {
+		// Update owner references with the new monitor
+		updateOwnerReferences(workload.GetObjectMeta(), monitor)
+
+		workload.Labels = svc.Labels
+
+		// Fill Workload status fields:
+		workload.Status.Kind = monitor.Spec.Kind
+		workload.Status.Type = monitor.Spec.Type
+		workload.Status.Resources = resources
+		workload.Status.Operational = r.isServiceReady(&svc)
+
+		return nil
+	})
+	if err != nil {
+		logger.Error(err, "Failed to CreateOrUpdate Workload", "workload", workload.Name)
+		return err
+	}
+
+	return nil
+}
+
+// reconcilePVCForMonitor creates or updates a Workload object for the given PVC and WorkloadMonitor.
+func (r *WorkloadMonitorReconciler) reconcilePVCForMonitor(
+	ctx context.Context,
+	monitor *cozyv1alpha1.WorkloadMonitor,
+	pvc corev1.PersistentVolumeClaim,
+) error {
+	logger := log.FromContext(ctx)
+	workload := &cozyv1alpha1.Workload{
+		ObjectMeta: metav1.ObjectMeta{
+			Name:      fmt.Sprintf("pvc-%s", pvc.Name),
+			Namespace: pvc.Namespace,
+		},
+	}
+
+	resources := make(map[string]resource.Quantity)
+
+	for resourceName, resourceQuantity := range pvc.Status.Capacity {
+		resources[resourceName.String()] = resourceQuantity
+	}
+
+	_, err := ctrl.CreateOrUpdate(ctx, r.Client, workload, func() error {
+		// Update owner references with the new monitor
+		updateOwnerReferences(workload.GetObjectMeta(), monitor)
+
+		workload.Labels = pvc.Labels
+
+		// Fill Workload status fields:
+		workload.Status.Kind = monitor.Spec.Kind
+		workload.Status.Type = monitor.Spec.Type
+		workload.Status.Resources = resources
+		workload.Status.Operational = r.isPVCReady(&pvc)
+
+		return nil
+	})
+	if err != nil {
+		logger.Error(err, "Failed to CreateOrUpdate Workload", "workload", workload.Name)
+		return err
+	}
+
+	return nil
+}
+
 // reconcilePodForMonitor creates or updates a Workload object for the given Pod and WorkloadMonitor.
 func (r *WorkloadMonitorReconciler) reconcilePodForMonitor(
 	ctx context.Context,
@@ -205,6 +307,45 @@ func (r *WorkloadMonitorReconciler) Reconcile(ctx context.Context, req ctrl.Requ
 		}
 	}

+	pvcList := &corev1.PersistentVolumeClaimList{}
+	if err := r.List(
+		ctx,
+		pvcList,
+		client.InNamespace(monitor.Namespace),
+		client.MatchingLabels(monitor.Spec.Selector),
+	); err != nil {
+		logger.Error(err, "Unable to list PVCs for WorkloadMonitor", "monitor", monitor.Name)
+		return ctrl.Result{}, err
+	}
+
+	for _, pvc := range pvcList.Items {
+		if err := r.reconcilePVCForMonitor(ctx, monitor, pvc); err != nil {
+			logger.Error(err, "Failed to reconcile Workload for PVC", "PVC", pvc.Name)
+			continue
+		}
+	}
+
+	svcList := &corev1.ServiceList{}
+	if err := r.List(
+		ctx,
+		svcList,
+		client.InNamespace(monitor.Namespace),
+		client.MatchingLabels(monitor.Spec.Selector),
+	); err != nil {
+		logger.Error(err, "Unable to list Services for WorkloadMonitor", "monitor", monitor.Name)
+		return ctrl.Result{}, err
+	}
+
+	for _, svc := range svcList.Items {
+		if svc.Spec.Type != corev1.ServiceTypeLoadBalancer {
+			continue
+		}
+		if err := r.reconcileServiceForMonitor(ctx, monitor, svc); err != nil {
+			logger.Error(err, "Failed to reconcile Workload for Service", "Service", svc.Name)
+			continue
+		}
+	}
+
 	// Update WorkloadMonitor status based on observed pods
 	monitor.Status.ObservedReplicas = observedReplicas
 	monitor.Status.AvailableReplicas = availableReplicas
@@ -233,41 +374,51 @@ func (r *WorkloadMonitorReconciler) SetupWithManager(mgr ctrl.Manager) error {
 		// Also watch Pod objects and map them back to WorkloadMonitor if labels match
 		Watches(
 			&corev1.Pod{},
-			handler.EnqueueRequestsFromMapFunc(func(ctx context.Context, obj client.Object) []reconcile.Request {
-				pod, ok := obj.(*corev1.Pod)
-				if !ok {
-					return nil
-				}
-
-				var monitorList cozyv1alpha1.WorkloadMonitorList
-				// List all WorkloadMonitors in the same namespace
-				if err := r.List(ctx, &monitorList, client.InNamespace(pod.Namespace)); err != nil {
-					return nil
-				}
-
-				// Match each monitor's selector with the Pod's labels
-				var requests []reconcile.Request
-				for _, m := range monitorList.Items {
-					matches := true
-					for k, v := range m.Spec.Selector {
-						if podVal, exists := pod.Labels[k]; !exists || podVal != v {
-							matches = false
-							break
-						}
-					}
-					if matches {
-						requests = append(requests, reconcile.Request{
-							NamespacedName: types.NamespacedName{
-								Namespace: m.Namespace,
-								Name:      m.Name,
-							},
-						})
-					}
-				}
-				return requests
-			}),
+			handler.EnqueueRequestsFromMapFunc(mapObjectToMonitor(&corev1.Pod{}, r.Client)),
+		).
+		// Watch PVCs as well
+		Watches(
+			&corev1.PersistentVolumeClaim{},
+			handler.EnqueueRequestsFromMapFunc(mapObjectToMonitor(&corev1.PersistentVolumeClaim{}, r.Client)),
 		).
 		// Watch for changes to Workload objects we create (owned by WorkloadMonitor)
 		Owns(&cozyv1alpha1.Workload{}).
 		Complete(r)
 }
+
+func mapObjectToMonitor[T client.Object](_ T, c client.Client) func(ctx context.Context, obj client.Object) []reconcile.Request {
+	return func(ctx context.Context, obj client.Object) []reconcile.Request {
+		concrete, ok := obj.(T)
+		if !ok {
+			return nil
+		}
+
+		var monitorList cozyv1alpha1.WorkloadMonitorList
+		// List all WorkloadMonitors in the same namespace
+		if err := c.List(ctx, &monitorList, client.InNamespace(concrete.GetNamespace())); err != nil {
+			return nil
+		}
+
+		labels := concrete.GetLabels()
+		// Match each monitor's selector against the object's labels
+		var requests []reconcile.Request
+		for _, m := range monitorList.Items {
+			matches := true
+			for k, v := range m.Spec.Selector {
+				if labelVal, exists := labels[k]; !exists || labelVal != v {
+					matches = false
+					break
+				}
+			}
+			if matches {
+				requests = append(requests, reconcile.Request{
+					NamespacedName: types.NamespacedName{
+						Namespace: m.Namespace,
+						Name:      m.Name,
+					},
+				})
+			}
+		}
+		return requests
+	}
+}
--- a/packages/apps/kubernetes/Chart.yaml
+++ b/packages/apps/kubernetes/Chart.yaml
@@ -16,7 +16,7 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.17.0
+version: 0.17.1

 # This is the version number of the application being deployed. This version number should be
 # incremented each time you make changes to the application. Versions are not expected to
--- a/packages/apps/kubernetes/values.yaml
+++ b/packages/apps/kubernetes/values.yaml
@@ -85,7 +85,7 @@ kamajiControlPlane:
    #     memory: 512Mi
    
    ## @param kamajiControlPlane.apiServer.resourcesPreset Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if resources is set (resources is recommended for production).
-    resourcesPreset: "micro"
+    resourcesPreset: "small"

  controllerManager:
    ## @param kamajiControlPlane.controllerManager.resources Resources
--- a/packages/apps/tenant/Chart.yaml
+++ b/packages/apps/tenant/Chart.yaml
@@ -4,4 +4,4 @@ description: Separated tenant namespace
 icon: /logos/tenant.svg

 type: application
-version: 1.9.1
+version: 1.9.2
--- a/packages/apps/tenant/templates/monitoring.yaml
+++ b/packages/apps/tenant/templates/monitoring.yaml
@@ -46,4 +46,8 @@ spec:
        resources: {}
    oncall:
      enabled: false
+  {{- if .Values.ingress }}
+  dependsOn:
+    - name: ingress
+  {{- end }}
 {{- end }}
--- a/packages/apps/versions_map
+++ b/packages/apps/versions_map
@@ -56,7 +56,8 @@ kubernetes 0.15.0 4e68e65c
 kubernetes 0.15.1 160e4e2a
 kubernetes 0.15.2 8267072d
 kubernetes 0.16.0 077045b0
-kubernetes 0.17.0 HEAD
+kubernetes 0.17.0 1fbbfcd0
+kubernetes 0.17.1 HEAD
 mysql 0.1.0 263e47be
 mysql 0.2.0 c24a103f
 mysql 0.3.0 53f2365e
@@ -127,7 +128,8 @@ tenant 1.6.8 bc95159a
 tenant 1.7.0 24fa7222
 tenant 1.8.0 160e4e2a
 tenant 1.9.0 728743db
-tenant 1.9.1 HEAD
+tenant 1.9.1 de19450f
+tenant 1.9.2 HEAD
 virtual-machine 0.1.4 f2015d65
 virtual-machine 0.1.5 263e47be
 virtual-machine 0.2.0 c0685f43
@@ -139,7 +141,8 @@ virtual-machine 0.7.0 e23286a3
 virtual-machine 0.7.1 0ab39f20
 virtual-machine 0.8.0 3fa4dd3a
 virtual-machine 0.8.1 93c46161
-virtual-machine 0.8.2 HEAD
+virtual-machine 0.8.2 de19450f
+virtual-machine 0.9.0 HEAD
 vm-disk 0.1.0 d971f2ff
 vm-disk 0.1.1 HEAD
 vm-instance 0.1.0 1ec10165
@@ -148,7 +151,8 @@ vm-instance 0.3.0 4e68e65c
 vm-instance 0.4.0 e23286a3
 vm-instance 0.4.1 0ab39f20
 vm-instance 0.5.0 3fa4dd3a
-vm-instance 0.5.1 HEAD
+vm-instance 0.5.1 de19450f
+vm-instance 0.6.0 HEAD
 vpn 0.1.0 263e47be
 vpn 0.2.0 53f2365e
 vpn 0.3.0 6c5cf5bf
--- a/packages/apps/virtual-machine/Chart.yaml
+++ b/packages/apps/virtual-machine/Chart.yaml
@@ -17,10 +17,10 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.8.2
+version: 0.9.0

 # This is the version number of the application being deployed. This version number should be
 # incremented each time you make changes to the application. Versions are not expected to
 # follow Semantic Versioning. They should reflect the version the application is using.
 # It is recommended to use it with quotes.
-appVersion: "0.8.2"
+appVersion: "0.9.0"
--- a/packages/apps/virtual-machine/Makefile
+++ b/packages/apps/virtual-machine/Makefile
@@ -2,6 +2,7 @@ include ../../../scripts/package.mk

 generate:
 	readme-generator -v values.yaml -s values.schema.json -r README.md
+	yq -o json -i '.properties.gpus.items.type = "object" | .properties.gpus.default = []' values.schema.json
 	INSTANCE_TYPES=$$(yq e '.metadata.name' -o=json -r ../../system/kubevirt-instancetypes/templates/instancetypes.yaml | yq 'split(" ") | . + [""]' -o json) \
 	  && yq -i -o json ".properties.instanceType.optional=true | .properties.instanceType.enum = $${INSTANCE_TYPES}" values.schema.json
 	PREFERENCES=$$(yq e '.metadata.name' -o=json -r ../../system/kubevirt-instancetypes/templates/preferences.yaml | yq 'split(" ") | . + [""]' -o json) \
--- a/packages/apps/virtual-machine/README.md
+++ b/packages/apps/virtual-machine/README.md
@@ -36,22 +36,23 @@ virtctl ssh <user>@<vm>

 ### Common parameters

-| Name                      | Description                                                                                                | Value            |
-| ------------------------- | ---------------------------------------------------------------------------------------------------------- | ---------------- |
-| `external`                | Enable external access from outside the cluster                                                            | `false`          |
-| `externalMethod`          | specify method to passthrough the traffic to the virtual machine. Allowed values: `WholeIP` and `PortList` | `WholeIP`        |
-| `externalPorts`           | Specify ports to forward from outside the cluster                                                          | `[]`             |
-| `running`                 | Determines if the virtual machine should be running                                                        | `true`           |
-| `instanceType`            | Virtual Machine instance type                                                                              | `u1.medium`      |
-| `instanceProfile`         | Virtual Machine prefferences profile                                                                       | `ubuntu`         |
-| `systemDisk.image`        | The base image for the virtual machine. Allowed values: `ubuntu`, `cirros`, `alpine`, `fedora` and `talos` | `ubuntu`         |
-| `systemDisk.storage`      | The size of the disk allocated for the virtual machine                                                     | `5Gi`            |
-| `systemDisk.storageClass` | StorageClass used to store the data                                                                        | `replicated`     |
-| `resources.cpu`           | The number of CPU cores allocated to the virtual machine                                                   | `""`             |
-| `resources.memory`        | The amount of memory allocated to the virtual machine                                                      | `""`             |
-| `sshKeys`                 | List of SSH public keys for authentication. Can be a single key or a list of keys.                         | `[]`             |
-| `cloudInit`               | cloud-init user data config. See cloud-init documentation for more details.                                | `#cloud-config
-` |
+| Name                      | Description                                                                                                | Value        |
+| ------------------------- | ---------------------------------------------------------------------------------------------------------- | ------------ |
+| `external`                | Enable external access from outside the cluster                                                            | `false`      |
+| `externalMethod`          | specify method to passthrough the traffic to the virtual machine. Allowed values: `WholeIP` and `PortList` | `WholeIP`    |
+| `externalPorts`           | Specify ports to forward from outside the cluster                                                          | `[]`         |
+| `running`                 | Determines if the virtual machine should be running                                                        | `true`       |
+| `instanceType`            | Virtual Machine instance type                                                                              | `u1.medium`  |
+| `instanceProfile`         | Virtual Machine preferences profile                                                                        | `ubuntu`     |
+| `systemDisk.image`        | The base image for the virtual machine. Allowed values: `ubuntu`, `cirros`, `alpine`, `fedora` and `talos` | `ubuntu`     |
+| `systemDisk.storage`      | The size of the disk allocated for the virtual machine                                                     | `5Gi`        |
+| `systemDisk.storageClass` | StorageClass used to store the data                                                                        | `replicated` |
+| `gpus`                    | List of GPUs to attach                                                                                     | `[]`         |
+| `resources.cpu`           | The number of CPU cores allocated to the virtual machine                                                   | `""`         |
+| `resources.memory`        | The amount of memory allocated to the virtual machine                                                      | `""`         |
+| `sshKeys`                 | List of SSH public keys for authentication. Can be a single key or a list of keys.                         | `[]`         |
+| `cloudInit`               | cloud-init user data config. See cloud-init documentation for more details.                                | `""`         |
+| `cloudInitSeed`           | A seed string to generate an SMBIOS UUID for the VM.                                                       | `""`         |

 ## U Series

--- a/packages/apps/virtual-machine/templates/_helpers.tpl
+++ b/packages/apps/virtual-machine/templates/_helpers.tpl
@@ -49,3 +49,23 @@ Selector labels
 app.kubernetes.io/name: {{ include "virtual-machine.name" . }}
 app.kubernetes.io/instance: {{ .Release.Name }}
 {{- end }}
+
+{{/*
+Generate a stable UUID for cloud-init re-initialization upon upgrade.
+*/}}
+{{- define "virtual-machine.stableUuid" -}}
+{{- $source := printf "%s-%s-%s" .Release.Namespace (include "virtual-machine.fullname" .) .Values.cloudInitSeed }}
+{{- $hash := sha256sum $source }}
+{{- $uuid := printf "%s-%s-4%s-9%s-%s" (substr 0 8 $hash) (substr 8 12 $hash) (substr 13 16 $hash) (substr 17 20 $hash) (substr 20 32 $hash) }}
+{{- if eq .Values.cloudInitSeed "" }}
+  {{- /* Reuse the existing UUID so that removing the seed does not trigger a full cloud-init run again. */}}
+  {{- $vmResource := lookup "kubevirt.io/v1" "VirtualMachine" .Release.Namespace (include "virtual-machine.fullname" .) -}}
+  {{- if $vmResource }}
+    {{- $existingUuid := $vmResource | dig "spec" "template" "spec" "domain" "firmware" "uuid" "" }}
+    {{- if $existingUuid }}
+      {{- $uuid = $existingUuid }}
+    {{- end }}
+  {{- end }}
+{{- end }}
+{{- $uuid }}
+{{- end }}
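The `virtual-machine.stableUuid` helper above can be sketched outside Helm as follows. This is a minimal illustration, assuming Sprig's `substr start end` behaves like a Python slice and `sha256sum` returns the hex digest; the function name `stable_uuid` and the `existing_uuid` parameter (which stands in for the `lookup` of the current VirtualMachine) are hypothetical and not part of the chart.

```python
import hashlib
import uuid
from typing import Optional


def stable_uuid(namespace: str, fullname: str, seed: str = "",
                existing_uuid: Optional[str] = None) -> str:
    """Illustrative mirror of the template logic; names are assumptions."""
    # With an empty seed, keep the UUID already set on the VM (if any),
    # which is what the template's lookup branch does.
    if seed == "" and existing_uuid:
        return existing_uuid
    source = f"{namespace}-{fullname}-{seed}"
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    # Slice the hex digest into 8-4-4-4-12 groups, forcing the version
    # nibble to 4 and the variant nibble to 9, as the printf call does.
    return "-".join([
        digest[0:8],
        digest[8:12],
        "4" + digest[13:16],
        "9" + digest[17:20],
        digest[20:32],
    ])


if __name__ == "__main__":
    first = stable_uuid("tenant-demo", "virtual-machine-demo", "upd1")
    second = stable_uuid("tenant-demo", "virtual-machine-demo", "upd2")
    # The result parses as an RFC 4122 version 4 UUID.
    print(first, uuid.UUID(first).version)
    # A different seed yields a different UUID.
    print(first != second)
```

Because the value depends only on the namespace, the release full name, and the seed, repeated upgrades with the same seed should render the same `firmware.uuid`, while changing the seed produces a new one.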
--- a/packages/apps/virtual-machine/templates/vm.yaml
+++ b/packages/apps/virtual-machine/templates/vm.yaml
@@ -68,7 +68,15 @@ spec:
          requests:
            memory: {{ .Values.resources.memory | quote }}
        {{- end }}
+        firmware:
+          uuid: {{ include "virtual-machine.stableUuid" . }}
        devices:
+          {{- if .Values.gpus }}
+          gpus:
+          {{- range $i, $gpu := .Values.gpus }}
+          - deviceName: {{ $gpu.name }}
+          {{- end }}
+          {{- end }}
          disks:
          - disk:
              bus: scsi
@@ -90,6 +98,7 @@ spec:
            secret:
              secretName: {{ include "virtual-machine.fullname" $ }}-ssh-keys
          propagationMethod:
+            # keys are injected into the metadata part of the cloud-init disk
            noCloud: {}
      {{- end }}
      terminationGracePeriodSeconds: 30
@@ -100,8 +109,14 @@ spec:
      {{- if or .Values.sshKeys .Values.cloudInit }}
      - name: cloudinitdisk
        cloudInitNoCloud:
+        {{- if .Values.cloudInit }}
          secretRef:
            name: {{ include "virtual-machine.fullname" . }}-cloud-init
+        {{- else }}
+          userData: |
+            #cloud-config
+            final_message: Cloud-init user-data was left blank intentionally.
+        {{- end }}
      {{- end }}
      networks:
      - name: default
--- a/packages/apps/virtual-machine/values.schema.json
+++ b/packages/apps/virtual-machine/values.schema.json
@@ -88,7 +88,7 @@
    },
    "instanceProfile": {
      "type": "string",
-      "description": "Virtual Machine prefferences profile",
+      "description": "Virtual Machine preferences profile",
      "default": "ubuntu",
      "optional": true,
      "enum": [
@@ -164,6 +164,14 @@
        }
      }
    },
+    "gpus": {
+      "type": "array",
+      "description": "List of GPUs to attach",
+      "default": [],
+      "items": {
+        "type": "object"
+      }
+    },
    "resources": {
      "type": "object",
      "properties": {
@@ -190,7 +198,12 @@
    "cloudInit": {
      "type": "string",
      "description": "cloud-init user data config. See cloud-init documentation for more details.",
-      "default": "#cloud-config\n"
+      "default": ""
+    },
+    "cloudInitSeed": {
+      "type": "string",
+      "description": "A seed string to generate an SMBIOS UUID for the VM.",
+      "default": ""
    }
  }
 }
--- a/packages/apps/virtual-machine/values.yaml
+++ b/packages/apps/virtual-machine/values.yaml
@@ -12,7 +12,7 @@ externalPorts:
 running: true

 ## @param instanceType Virtual Machine instance type
-## @param instanceProfile Virtual Machine prefferences profile
+## @param instanceProfile Virtual Machine preferences profile
 ##
 instanceType: "u1.medium"
 instanceProfile: ubuntu
@@ -26,6 +26,12 @@ systemDisk:
  storage: 5Gi
  storageClass: replicated

+## @param gpus [array] List of GPUs to attach
+## Example:
+## gpus:
+## - name: nvidia.com/GA102GL_A10
+gpus: []
+
 ## @param resources.cpu The number of CPU cores allocated to the virtual machine
 ## @param resources.memory The amount of memory allocated to the virtual machine
 resources:
@@ -49,5 +55,13 @@ sshKeys: []
 ##   password: ubuntu
 ##   chpasswd: { expire: False }
 ##
-cloudInit: |
-  #cloud-config
+cloudInit: ""
+
+## @param cloudInitSeed A seed string to generate an SMBIOS UUID for the VM.
+cloudInitSeed: ""
+## Change it to any new value to force a full cloud-init reconfiguration. Use this to apply settings that are
+## normally written only once, such as new SSH keys or a new network configuration, to an existing VM.
+## An empty value does nothing (and the existing UUID is not reverted). Note that changing this value
+## does not trigger a VM restart; you must restart the VM separately.
+## Example:
+## cloudInitSeed: "upd1"
--- a/packages/apps/vm-instance/Chart.yaml
+++ b/packages/apps/vm-instance/Chart.yaml
@@ -17,10 +17,10 @@ type: application
 # This is the chart version. This version number should be incremented each time you make changes
 # to the chart and its templates, including the app version.
 # Versions are expected to follow Semantic Versioning (https://semver.org/)
-version: 0.5.1
+version: 0.6.0

 # This is the version number of the application being deployed. This version number should be
 # incremented each time you make changes to the application. Versions are not expected to
 # follow Semantic Versioning. They should reflect the version the application is using.
 # It is recommended to use it with quotes.
-appVersion: "0.5.1"
+appVersion: "0.6.0"
--- a/packages/apps/vm-instance/Makefile
+++ b/packages/apps/vm-instance/Makefile
@@ -3,6 +3,7 @@ include ../../../scripts/package.mk
 generate:
 	readme-generator -v values.yaml -s values.schema.json -r README.md
 	yq -o json -i '.properties.disks.items.type = "object" | .properties.disks.default = []' values.schema.json
+	yq -o json -i '.properties.gpus.items.type = "object" | .properties.gpus.default = []' values.schema.json
 	INSTANCE_TYPES=$$(yq e '.metadata.name' -o=json -r ../../system/kubevirt-instancetypes/templates/instancetypes.yaml | yq 'split(" ") | . + [""]' -o json) \
 	  && yq -i -o json ".properties.instanceType.optional=true | .properties.instanceType.enum = $${INSTANCE_TYPES}" values.schema.json
 	PREFERENCES=$$(yq e '.metadata.name' -o=json -r ../../system/kubevirt-instancetypes/templates/preferences.yaml | yq 'split(" ") | . + [""]' -o json) \
--- a/packages/apps/vm-instance/README.md
+++ b/packages/apps/vm-instance/README.md
@@ -36,20 +36,21 @@ virtctl ssh <user>@<vm>

 ### Common parameters

-| Name               | Description                                                                                                | Value            |
-| ------------------ | ---------------------------------------------------------------------------------------------------------- | ---------------- |
-| `external`         | Enable external access from outside the cluster                                                            | `false`          |
-| `externalMethod`   | specify method to passthrough the traffic to the virtual machine. Allowed values: `WholeIP` and `PortList` | `WholeIP`        |
-| `externalPorts`    | Specify ports to forward from outside the cluster                                                          | `[]`             |
-| `running`          | Determines if the virtual machine should be running                                                        | `true`           |
-| `instanceType`     | Virtual Machine instance type                                                                              | `u1.medium`      |
-| `instanceProfile`  | Virtual Machine prefferences profile                                                                       | `ubuntu`         |
-| `disks`            | List of disks to attach                                                                                    | `[]`             |
-| `resources.cpu`    | The number of CPU cores allocated to the virtual machine                                                   | `""`             |
-| `resources.memory` | The amount of memory allocated to the virtual machine                                                      | `""`             |
-| `sshKeys`          | List of SSH public keys for authentication. Can be a single key or a list of keys.                         | `[]`             |
-| `cloudInit`        | cloud-init user data config. See cloud-init documentation for more details.                                | `#cloud-config
-` |
+| Name               | Description                                                                                                | Value       |
+| ------------------ | ---------------------------------------------------------------------------------------------------------- | ----------- |
+| `external`         | Enable external access from outside the cluster                                                            | `false`     |
+| `externalMethod`   | specify method to passthrough the traffic to the virtual machine. Allowed values: `WholeIP` and `PortList` | `WholeIP`   |
+| `externalPorts`    | Specify ports to forward from outside the cluster                                                          | `[]`        |
+| `running`          | Determines if the virtual machine should be running                                                        | `true`      |
+| `instanceType`     | Virtual Machine instance type                                                                              | `u1.medium` |
+| `instanceProfile`  | Virtual Machine preferences profile                                                                        | `ubuntu`    |
+| `disks`            | List of disks to attach                                                                                    | `[]`        |
+| `gpus`             | List of GPUs to attach                                                                                     | `[]`        |
+| `resources.cpu`    | The number of CPU cores allocated to the virtual machine                                                   | `""`        |
+| `resources.memory` | The amount of memory allocated to the virtual machine                                                      | `""`        |
+| `sshKeys`          | List of SSH public keys for authentication. Can be a single key or a list of keys.                         | `[]`        |
+| `cloudInit`        | cloud-init user data config. See cloud-init documentation for more details.                                | `""`        |
+| `cloudInitSeed`    | A seed string to generate an SMBIOS UUID for the VM.                                                       | `""`        |

 ## U Series

--- a/packages/apps/vm-instance/templates/_helpers.tpl
+++ b/packages/apps/vm-instance/templates/_helpers.tpl
@@ -49,3 +49,23 @@ Selector labels
 app.kubernetes.io/name: {{ include "virtual-machine.name" . }}
 app.kubernetes.io/instance: {{ .Release.Name }}
 {{- end }}
+
+{{/*
+Generate a stable UUID for cloud-init re-initialization upon upgrade.
+*/}}
+{{- define "virtual-machine.stableUuid" -}}
+{{- $source := printf "%s-%s-%s" .Release.Namespace (include "virtual-machine.fullname" .) .Values.cloudInitSeed }}
+{{- $hash := sha256sum $source }}
+{{- $uuid := printf "%s-%s-4%s-9%s-%s" (substr 0 8 $hash) (substr 8 12 $hash) (substr 13 16 $hash) (substr 17 20 $hash) (substr 20 32 $hash) }}
+{{- if eq .Values.cloudInitSeed "" }}
+  {{- /* Reuse the existing UUID so that removing the seed does not trigger a full cloud-init run again. */}}
+  {{- $vmResource := lookup "kubevirt.io/v1" "VirtualMachine" .Release.Namespace (include "virtual-machine.fullname" .) -}}
+  {{- if $vmResource }}
+    {{- $existingUuid := $vmResource | dig "spec" "template" "spec" "domain" "firmware" "uuid" "" }}
+    {{- if $existingUuid }}
+      {{- $uuid = $existingUuid }}
+    {{- end }}
+  {{- end }}
+{{- end }}
+{{- $uuid }}
+{{- end }}
--- a/packages/apps/vm-instance/templates/dashboard-resourcemap.yaml
+++ b/packages/apps/vm-instance/templates/dashboard-resourcemap.yaml
@@ -22,5 +22,5 @@ spec:
  kind: virtual-machine
  type: virtual-machine
  selector:
-    vm.kubevirt.io/name: {{ $.Release.Name }}
+    {{- include "virtual-machine.selectorLabels" . | nindent 4 }}
  version: {{ $.Chart.Version }}
--- a/packages/apps/vm-instance/templates/vm.yaml
+++ b/packages/apps/vm-instance/templates/vm.yaml
@@ -1,8 +1,8 @@
 {{- if and .Values.instanceType (not (lookup "instancetype.kubevirt.io/v1beta1" "VirtualMachineClusterInstancetype" "" .Values.instanceType)) }}
-{{-   fail (printf "Specified instancetype not exists in cluster: %s" .Values.instanceType) }}
+{{-   fail (printf "Specified instanceType does not exist in the cluster: %s" .Values.instanceType) }}
 {{- end }}
 {{- if and .Values.instanceProfile (not (lookup "instancetype.kubevirt.io/v1beta1" "VirtualMachineClusterPreference" "" .Values.instanceProfile)) }}
-{{-   fail (printf "Specified profile not exists in cluster: %s" .Values.instanceProfile) }}
+{{-   fail (printf "Specified instanceProfile does not exist in the cluster: %s" .Values.instanceProfile) }}
 {{- end }}

 apiVersion: kubevirt.io/v1
@@ -40,11 +40,19 @@ spec:
          requests:
            memory: {{ .Values.resources.memory | quote }}
        {{- end }}
+        firmware:
+          uuid: {{ include "virtual-machine.stableUuid" . }}
        devices:
+          {{- if .Values.gpus }}
+          gpus:
+          {{- range $i, $gpu := .Values.gpus }}
+          - deviceName: {{ $gpu.name }}
+          {{- end }}
+          {{- end }}
          disks:
          {{- range $i, $disk := .Values.disks }}
-          - name: disk-{{ .name }}
-            {{- $disk := lookup "cdi.kubevirt.io/v1beta1" "DataVolume" $.Release.Namespace (printf "vm-disk-%s" .name) }}
+          - name: disk-{{ $disk.name }}
+            {{- $disk := lookup "cdi.kubevirt.io/v1beta1" "DataVolume" $.Release.Namespace (printf "vm-disk-%s" $disk.name) }}
            {{- if $disk }}
            {{- if and (hasKey $disk.metadata.annotations "vm-disk.cozystack.io/optical") (eq (index $disk.metadata.annotations "vm-disk.cozystack.io/optical") "true") }}
            cdrom: {}
@@ -75,6 +83,7 @@ spec:
            secret:
              secretName: {{ include "virtual-machine.fullname" $ }}-ssh-keys
          propagationMethod:
+            # keys are injected into the metadata part of the cloud-init disk
            noCloud: {}
      {{- end }}
      terminationGracePeriodSeconds: 30
@@ -87,8 +96,14 @@ spec:
      {{- if or .Values.sshKeys .Values.cloudInit }}
      - name: cloudinitdisk
        cloudInitNoCloud:
+        {{- if .Values.cloudInit }}
          secretRef:
            name: {{ include "virtual-machine.fullname" . }}-cloud-init
+        {{- else }}
+          userData: |
+            #cloud-config
+            final_message: Cloud-init user-data was left blank intentionally.
+        {{- end }}
      {{- end }}
      networks:
      - name: default
--- a/packages/apps/vm-instance/values.schema.json
+++ b/packages/apps/vm-instance/values.schema.json
@@ -88,7 +88,7 @@
    },
    "instanceProfile": {
      "type": "string",
-      "description": "Virtual Machine prefferences profile",
+      "description": "Virtual Machine preferences profile",
      "default": "ubuntu",
      "optional": true,
      "enum": [
@@ -145,6 +145,14 @@
        "type": "object"
      }
    },
+    "gpus": {
+      "type": "array",
+      "description": "List of GPUs to attach",
+      "default": [],
+      "items": {
+        "type": "object"
+      }
+    },
    "resources": {
      "type": "object",
      "properties": {
@@ -171,7 +179,12 @@
    "cloudInit": {
      "type": "string",
      "description": "cloud-init user data config. See cloud-init documentation for more details.",
-      "default": "#cloud-config\n"
+      "default": ""
+    },
+    "cloudInitSeed": {
+      "type": "string",
+      "description": "A seed string to generate an SMBIOS UUID for the VM.",
+      "default": ""
    }
  }
 }
--- a/packages/apps/vm-instance/values.yaml
+++ b/packages/apps/vm-instance/values.yaml
@@ -12,7 +12,7 @@ externalPorts:
 running: true

 ## @param instanceType Virtual Machine instance type
-## @param instanceProfile Virtual Machine prefferences profile
+## @param instanceProfile Virtual Machine preferences profile
 ##
 instanceType: "u1.medium"
 instanceProfile: ubuntu
@@ -24,6 +24,12 @@ instanceProfile: ubuntu
 ## - name: example-data
 disks: []

+## @param gpus [array] List of GPUs to attach
+## Example:
+## gpus:
+## - name: nvidia.com/GA102GL_A10
+gpus: []
+
 ## @param resources.cpu The number of CPU cores allocated to the virtual machine
 ## @param resources.memory The amount of memory allocated to the virtual machine
 resources:
@@ -47,5 +53,13 @@ sshKeys: []
 ##   password: ubuntu
 ##   chpasswd: { expire: False }
 ##
-cloudInit: |
-  #cloud-config
+cloudInit: ""
+
+## @param cloudInitSeed A seed string to generate an SMBIOS UUID for the VM.
+cloudInitSeed: ""
+## Change it to any new value to force a full cloud-init reconfiguration. Use this to apply settings that are
+## normally written only once, such as new SSH keys or a new network configuration, to an existing VM.
+## An empty value does nothing (and the existing UUID is not reverted). Note that changing this value
+## does not trigger a VM restart; you must restart the VM separately.
+## Example:
+## cloudInitSeed: "upd1"
--- a/packages/core/installer/Makefile
+++ b/packages/core/installer/Makefile
@@ -59,7 +59,7 @@ image-matchbox:
 		> ../../extra/bootbox/images/matchbox.tag
 	rm -f images/matchbox.json

-assets: talos-iso talos-nocloud talos-metal
+assets: talos-iso talos-nocloud talos-metal talos-kernel talos-initramfs

 talos-initramfs talos-kernel talos-installer talos-iso talos-nocloud talos-metal:
 	mkdir -p ../../../_out/assets
--- a/packages/core/platform/bundles/paas-full.yaml
+++ b/packages/core/platform/bundles/paas-full.yaml
@@ -116,7 +116,7 @@ releases:
  chart: cozy-monitoring-agents
  namespace: cozy-monitoring
  privileged: true
-  dependsOn: [cilium,kubeovn,victoria-metrics-operator]
+  dependsOn: [victoria-metrics-operator, vertical-pod-autoscaler-crds]
  values:
    scrapeRules:
      etcd:
@@ -153,6 +153,17 @@ releases:
  namespace: cozy-kubevirt-cdi
  dependsOn: [cilium,kubeovn,kubevirt-cdi-operator]

+- name: gpu-operator
+  releaseName: gpu-operator
+  chart: cozy-gpu-operator
+  namespace: cozy-gpu-operator
+  privileged: true
+  optional: true
+  dependsOn: [cilium,kubeovn]
+  valuesFiles:
+  - values.yaml
+  - values-talos.yaml
+
 - name: metallb
  releaseName: metallb
  chart: cozy-metallb
@@ -388,6 +399,13 @@ releases:
  privileged: true
  dependsOn: [monitoring-agents]

+- name: vertical-pod-autoscaler-crds
+  releaseName: vertical-pod-autoscaler-crds
+  chart: cozy-vertical-pod-autoscaler-crds
+  namespace: cozy-vertical-pod-autoscaler
+  privileged: true
+  dependsOn: [cilium, kubeovn]
+
 - name: reloader
  releaseName: reloader
  chart: cozy-reloader
--- a/packages/core/platform/bundles/paas-hosted.yaml
+++ b/packages/core/platform/bundles/paas-hosted.yaml
@@ -69,7 +69,7 @@ releases:
  chart: cozy-monitoring-agents
  namespace: cozy-monitoring
  privileged: true
-  dependsOn: [victoria-metrics-operator]
+  dependsOn: [victoria-metrics-operator, vertical-pod-autoscaler-crds]
  values:
    scrapeRules:
      etcd:
@@ -254,3 +254,10 @@ releases:
  namespace: cozy-vertical-pod-autoscaler
  privileged: true
  dependsOn: [monitoring-agents]
+
+- name: vertical-pod-autoscaler-crds
+  releaseName: vertical-pod-autoscaler-crds
+  chart: cozy-vertical-pod-autoscaler-crds
+  namespace: cozy-vertical-pod-autoscaler
+  privileged: true
+  dependsOn: [cilium, kubeovn]
--- a/packages/core/testing/images/e2e-sandbox/Dockerfile
+++ b/packages/core/testing/images/e2e-sandbox/Dockerfile
@@ -14,3 +14,4 @@ RUN curl -LO "https://dl.k8s.io/release/v${KUBECTL_VERSION}/bin/linux/amd64/kube
 && mv kubectl /usr/local/bin/kubectl
 RUN curl -sSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash -s - --version "v${HELM_VERSION}"
 RUN wget https://github.com/mikefarah/yq/releases/download/v4.44.3/yq_linux_amd64 -O /usr/local/bin/yq && chmod +x /usr/local/bin/yq
+RUN curl -s https://fluxcd.io/install.sh | bash
--- a/packages/extra/monitoring/templates/vlogs/vlogs.yaml
+++ b/packages/extra/monitoring/templates/vlogs/vlogs.yaml
@@ -4,6 +4,8 @@ kind: VLogs
 metadata:
  name: {{ .name }}
 spec:
+  image:
+    tag: v1.17.0-victorialogs
  storage:
    resources:
      requests:
--- a/packages/system/capi-operator/charts/cluster-api-operator/Chart.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/Chart.yaml
@@ -1,6 +1,6 @@
 apiVersion: v2
-appVersion: 0.17.0
+appVersion: 0.18.1
 description: Cluster API Operator
 name: cluster-api-operator
 type: application
-version: 0.17.0
+version: 0.18.1
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/addon.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/addon.yaml
@@ -26,8 +26,10 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "1"
  name: {{ $addonNamespace }}
 ---
@@ -37,8 +39,10 @@ metadata:
  name: {{ $addonName }}
  namespace: {{ $addonNamespace }}
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "2"
 {{- if or $addonVersion $.Values.secretName }}
 spec:
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/bootstrap.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/bootstrap.yaml
@@ -26,8 +26,11 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
+    "argocd.argoproj.io/sync-wave": "1"
  name: {{ $bootstrapNamespace }}
 ---
 apiVersion: operator.cluster.x-k8s.io/v1alpha2
@@ -36,8 +39,11 @@ metadata:
  name: {{ $bootstrapName }}
  namespace: {{ $bootstrapNamespace }}
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
+    "argocd.argoproj.io/sync-wave": "2"
 {{- if or $bootstrapVersion $.Values.configSecret.name }}
 spec:
 {{- end}}
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/control-plane.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/control-plane.yaml
@@ -26,8 +26,11 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
+    "argocd.argoproj.io/sync-wave": "1"
  name: {{ $controlPlaneNamespace }}
 ---
 apiVersion: operator.cluster.x-k8s.io/v1alpha2
@@ -36,8 +39,11 @@ metadata:
  name: {{ $controlPlaneName }}
  namespace: {{ $controlPlaneNamespace }}
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
+    "argocd.argoproj.io/sync-wave": "2"
 {{- if or $controlPlaneVersion $.Values.configSecret.name $.Values.manager }}
 spec:
 {{- end}}
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/core-conditions.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/core-conditions.yaml
@@ -1,4 +1,4 @@
-{{- if or .Values.addon .Values.bootstrap .Values.controlPlane .Values.infrastructure }}
+{{- if or .Values.addon .Values.bootstrap .Values.controlPlane .Values.infrastructure .Values.ipam }}
 # Deploy core components if not specified
 {{- if not .Values.core }}
 ---
@@ -6,8 +6,11 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
+    "argocd.argoproj.io/sync-wave": "1"
  name: capi-system
 ---
 apiVersion: operator.cluster.x-k8s.io/v1alpha2
@@ -16,8 +19,11 @@ metadata:
  name: cluster-api
  namespace: capi-system
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
+    "argocd.argoproj.io/sync-wave": "2"
 {{- with .Values.configSecret }}
 spec:
  configSecret:
@@ -28,4 +34,3 @@ spec:
 {{- end }}
 {{- end }}
 {{- end }}
-
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/core.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/core.yaml
@@ -25,8 +25,11 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
+    "argocd.argoproj.io/sync-wave": "1"
  name: {{ $coreNamespace }}
 ---
 apiVersion: operator.cluster.x-k8s.io/v1alpha2
@@ -35,8 +38,10 @@ metadata:
  name: {{ $coreName }}
  namespace: {{ $coreNamespace }}
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "2"
 {{- if or $coreVersion $.Values.configSecret.name $.Values.manager }}
 spec:
@@ -45,8 +50,8 @@ spec:
  version: {{ $coreVersion }}
 {{- end }}
 {{- if $.Values.manager }}
-  manager:
 {{- if and $.Values.manager.featureGates $.Values.manager.featureGates.core }}
+  manager:
    featureGates:
    {{- range $key, $value := $.Values.manager.featureGates.core }}
      {{ $key }}: {{ $value }}
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/infra-conditions.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/infra-conditions.yaml
@@ -7,8 +7,10 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "1"
  name: capi-kubeadm-bootstrap-system
 ---
@@ -18,8 +20,10 @@ metadata:
  name: kubeadm
  namespace: capi-kubeadm-bootstrap-system
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "2"
 {{- with .Values.configSecret }}
 spec:
@@ -37,8 +41,10 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "1"
  name: capi-kubeadm-control-plane-system
 ---
@@ -48,14 +54,16 @@ metadata:
  name: kubeadm
  namespace: capi-kubeadm-control-plane-system
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "2"
 {{- with .Values.configSecret }}
 spec:
 {{- if $.Values.manager }}
-  manager:
 {{- if and $.Values.manager.featureGates $.Values.manager.featureGates.kubeadm }}
+  manager:
    featureGates:
    {{- range $key, $value := $.Values.manager.featureGates.kubeadm }}
      {{ $key }}: {{ $value }}
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/infra.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/infra.yaml
@@ -26,8 +26,10 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "1"
  name: {{ $infrastructureNamespace }}
 ---
@@ -37,8 +39,10 @@ metadata:
  name: {{ $infrastructureName }}
  namespace: {{ $infrastructureNamespace }}
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "2"
 {{- if or $infrastructureVersion $.Values.configSecret.name $.Values.manager $.Values.additionalDeployments }}
 spec:
@@ -47,8 +51,8 @@ spec:
  version: {{ $infrastructureVersion }}
 {{- end }}
 {{- if $.Values.manager }}
-  manager:
 {{- if and (kindIs "map" $.Values.manager.featureGates) (hasKey $.Values.manager.featureGates $infrastructureName) }}
+  manager:
 {{- range $key, $value := $.Values.manager.featureGates }}
  {{- if eq $key $infrastructureName }}
    featureGates:
--- a/packages/system/capi-operator/charts/cluster-api-operator/templates/ipam.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/templates/ipam.yaml
@@ -26,8 +26,10 @@ apiVersion: v1
 kind: Namespace
 metadata:
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "1"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "1"
  name: {{ $ipamNamespace }}
 ---
@@ -37,8 +39,10 @@ metadata:
  name: {{ $ipamName }}
  namespace: {{ $ipamNamespace }}
  annotations:
+    {{- if $.Values.enableHelmHook }}
    "helm.sh/hook": "post-install,post-upgrade"
    "helm.sh/hook-weight": "2"
+    {{- end }}
    "argocd.argoproj.io/sync-wave": "2"
 {{- if or $ipamVersion $.Values.configSecret.name $.Values.manager $.Values.additionalDeployments }}
 spec:
@@ -47,8 +51,8 @@ spec:
  version: {{ $ipamVersion }}
 {{- end }}
 {{- if $.Values.manager }}
-  manager:
 {{- if and (kindIs "map" $.Values.manager.featureGates) (hasKey $.Values.manager.featureGates $ipamName) }}
+  manager:
 {{- range $key, $value := $.Values.manager.featureGates }}
  {{- if eq $key $ipamName }}
    featureGates:
--- a/packages/system/capi-operator/charts/cluster-api-operator/values.yaml
+++ b/packages/system/capi-operator/charts/cluster-api-operator/values.yaml
@@ -21,7 +21,7 @@ leaderElection:
 image:
  manager:
    repository: registry.k8s.io/capi-operator/cluster-api-operator
-    tag: v0.17.0
+    tag: v0.18.1
    pullPolicy: IfNotPresent
 env:
  manager: []
@@ -69,3 +69,4 @@ volumeMounts:
    - mountPath: /tmp/k8s-webhook-server/serving-certs
      name: cert
      readOnly: true
+enableHelmHook: true
--- a/packages/system/cilium/charts/cilium/Chart.yaml
+++ b/packages/system/cilium/charts/cilium/Chart.yaml
@@ -79,7 +79,7 @@ annotations:
    Pod IP Pool\n  description: |\n    CiliumPodIPPool defines an IP pool that can
    be used for pooled IPAM (i.e. the multi-pool IPAM mode).\n"
 apiVersion: v2
-appVersion: 1.17.1
+appVersion: 1.17.2
 description: eBPF-based Networking, Security, and Observability
 home: https://cilium.io/
 icon: https://cdn.jsdelivr.net/gh/cilium/cilium@main/Documentation/images/logo-solo.svg
@@ -95,4 +95,4 @@ kubeVersion: '>= 1.21.0-0'
 name: cilium
 sources:
 - https://github.com/cilium/cilium
-version: 1.17.1
+version: 1.17.2
--- a/packages/system/cilium/charts/cilium/README.md
+++ b/packages/system/cilium/charts/cilium/README.md
@@ -1,6 +1,6 @@
 # cilium

-![Version: 1.17.1](https://img.shields.io/badge/Version-1.17.1-informational?style=flat-square) ![AppVersion: 1.17.1](https://img.shields.io/badge/AppVersion-1.17.1-informational?style=flat-square)
+![Version: 1.17.2](https://img.shields.io/badge/Version-1.17.2-informational?style=flat-square) ![AppVersion: 1.17.2](https://img.shields.io/badge/AppVersion-1.17.2-informational?style=flat-square)

 Cilium is open source software for providing and transparently securing
 network connectivity and loadbalancing between application workloads such as
@@ -85,7 +85,7 @@ contributors across the globe, there is almost always someone available to help.
 | authentication.mutual.spire.install.agent.tolerations | list | `[{"effect":"NoSchedule","key":"node.kubernetes.io/not-ready"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane"},{"effect":"NoSchedule","key":"node.cloudprovider.kubernetes.io/uninitialized","value":"true"},{"key":"CriticalAddonsOnly","operator":"Exists"}]` | SPIRE agent tolerations configuration By default it follows the same tolerations as the agent itself to allow the Cilium agent on this node to connect to SPIRE. ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/ |
 | authentication.mutual.spire.install.enabled | bool | `true` | Enable SPIRE installation. This will only take effect only if authentication.mutual.spire.enabled is true |
 | authentication.mutual.spire.install.existingNamespace | bool | `false` | SPIRE namespace already exists. Set to true if Helm should not create, manage, and import the SPIRE namespace. |
-| authentication.mutual.spire.install.initImage | object | `{"digest":"sha256:a5d0ce49aa801d475da48f8cb163c354ab95cab073cd3c138bd458fc8257fbf1","override":null,"pullPolicy":"IfNotPresent","repository":"docker.io/library/busybox","tag":"1.37.0","useDigest":true}` | init container image of SPIRE agent and server |
+| authentication.mutual.spire.install.initImage | object | `{"digest":"sha256:498a000f370d8c37927118ed80afe8adc38d1edcbfc071627d17b25c88efcab0","override":null,"pullPolicy":"IfNotPresent","repository":"docker.io/library/busybox","tag":"1.37.0","useDigest":true}` | init container image of SPIRE agent and server |
 | authentication.mutual.spire.install.namespace | string | `"cilium-spire"` | SPIRE namespace to install into |
 | authentication.mutual.spire.install.server.affinity | object | `{}` | SPIRE server affinity configuration |
 | authentication.mutual.spire.install.server.annotations | object | `{}` | SPIRE server annotations |
@@ -131,6 +131,8 @@ contributors across the globe, there is almost always someone available to help.
 | bpf.ctTcpMax | int | `524288` | Configure the maximum number of entries in the TCP connection tracking table. |
 | bpf.datapathMode | string | `veth` | Mode for Pod devices for the core datapath (veth, netkit, netkit-l2, lb-only) |
 | bpf.disableExternalIPMitigation | bool | `false` | Disable ExternalIP mitigation (CVE-2020-8554) |
+| bpf.distributedLRU | object | `{"enabled":false}` | Control to use a distributed per-CPU backend memory for the core BPF LRU maps which Cilium uses. This improves performance significantly, but it is also recommended to increase BPF map sizing along with that. |
+| bpf.distributedLRU.enabled | bool | `false` | Enable distributed LRU backend memory. For compatibility with existing installations it is off by default. |
 | bpf.enableTCX | bool | `true` | Attach endpoint programs using tcx instead of legacy tc hooks on supported kernels. |
 | bpf.events | object | `{"default":{"burstLimit":null,"rateLimit":null},"drop":{"enabled":true},"policyVerdict":{"enabled":true},"trace":{"enabled":true}}` | Control events generated by the Cilium datapath exposed to Cilium monitor and Hubble. Helm configuration for BPF events map rate limiting is experimental and might change in upcoming releases. |
 | bpf.events.default | object | `{"burstLimit":null,"rateLimit":null}` | Default settings for all types of events except dbg and pcap. |
@@ -195,7 +197,7 @@ contributors across the globe, there is almost always someone available to help.
 | clustermesh.apiserver.extraVolumeMounts | list | `[]` | Additional clustermesh-apiserver volumeMounts. |
 | clustermesh.apiserver.extraVolumes | list | `[]` | Additional clustermesh-apiserver volumes. |
 | clustermesh.apiserver.healthPort | int | `9880` | TCP port for the clustermesh-apiserver health API. |
-| clustermesh.apiserver.image | object | `{"digest":"sha256:1de22f46bfdd638de72c2224d5223ddc3bbeacda1803cb75799beca3d4bf7a4c","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/clustermesh-apiserver","tag":"v1.17.1","useDigest":true}` | Clustermesh API server image. |
+| clustermesh.apiserver.image | object | `{"digest":"sha256:981250ebdc6e66e190992eaf75cfca169113a8f08d5c3793fe15822176980398","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/clustermesh-apiserver","tag":"v1.17.2","useDigest":true}` | Clustermesh API server image. |
 | clustermesh.apiserver.kvstoremesh.enabled | bool | `true` | Enable KVStoreMesh. KVStoreMesh caches the information retrieved from the remote clusters in the local etcd instance. |
 | clustermesh.apiserver.kvstoremesh.extraArgs | list | `[]` | Additional KVStoreMesh arguments. |
 | clustermesh.apiserver.kvstoremesh.extraEnv | list | `[]` | Additional KVStoreMesh environment variables. |
@@ -375,7 +377,7 @@ contributors across the globe, there is almost always someone available to help.
 | envoy.healthPort | int | `9878` | TCP port for the health API. |
 | envoy.httpRetryCount | int | `3` | Maximum number of retries for each HTTP request |
 | envoy.idleTimeoutDurationSeconds | int | `60` | Set Envoy upstream HTTP idle connection timeout seconds. Does not apply to connections with pending requests. Default 60s |
-| envoy.image | object | `{"digest":"sha256:fc708bd36973d306412b2e50c924cd8333de67e0167802c9b48506f9d772f521","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/cilium-envoy","tag":"v1.31.5-1739264036-958bef243c6c66fcfd73ca319f2eb49fff1eb2ae","useDigest":true}` | Envoy container image. |
+| envoy.image | object | `{"digest":"sha256:377c78c13d2731f3720f931721ee309159e782d882251709cb0fac3b42c03f4b","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/cilium-envoy","tag":"v1.31.5-1741765102-efed3defcc70ab5b263a0fc44c93d316b846a211","useDigest":true}` | Envoy container image. |
 | envoy.initialFetchTimeoutSeconds | int | `30` | Time in seconds after which the initial fetch on an xDS stream is considered timed out |
 | envoy.livenessProbe.failureThreshold | int | `10` | failure threshold of liveness probe |
 | envoy.livenessProbe.periodSeconds | int | `30` | interval between checks of the liveness probe |
@@ -392,6 +394,7 @@ contributors across the globe, there is almost always someone available to help.
 | envoy.podLabels | object | `{}` | Labels to be added to envoy pods |
 | envoy.podSecurityContext | object | `{"appArmorProfile":{"type":"Unconfined"}}` | Security Context for cilium-envoy pods. |
 | envoy.podSecurityContext.appArmorProfile | object | `{"type":"Unconfined"}` | AppArmorProfile options for the `cilium-agent` and init containers |
+| envoy.policyRestoreTimeoutDuration | string | `nil` | Max duration to wait for endpoint policies to be restored on restart. Default "3m". |
 | envoy.priorityClassName | string | `nil` | The priority class to use for cilium-envoy. |
 | envoy.prometheus | object | `{"enabled":true,"port":"9964","serviceMonitor":{"annotations":{},"enabled":false,"interval":"10s","labels":{},"metricRelabelings":null,"relabelings":[{"replacement":"${1}","sourceLabels":["__meta_kubernetes_pod_node_name"],"targetLabel":"node"}]}}` | Configure Cilium Envoy Prometheus options. Note that some of these apply to either cilium-agent or cilium-envoy. |
 | envoy.prometheus.enabled | bool | `true` | Enable prometheus metrics for cilium-envoy |
@@ -515,7 +518,7 @@ contributors across the globe, there is almost always someone available to help.
 | hubble.relay.extraVolumes | list | `[]` | Additional hubble-relay volumes. |
 | hubble.relay.gops.enabled | bool | `true` | Enable gops for hubble-relay |
 | hubble.relay.gops.port | int | `9893` | Configure gops listen port for hubble-relay |
-| hubble.relay.image | object | `{"digest":"sha256:397e8fbb188157f744390a7b272a1dec31234e605bcbe22d8919a166d202a3dc","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/hubble-relay","tag":"v1.17.1","useDigest":true}` | Hubble-relay container image. |
+| hubble.relay.image | object | `{"digest":"sha256:42a8db5c256c516cacb5b8937c321b2373ad7a6b0a1e5a5120d5028433d586cc","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/hubble-relay","tag":"v1.17.2","useDigest":true}` | Hubble-relay container image. |
 | hubble.relay.listenHost | string | `""` | Host to listen to. Specify an empty string to bind to all the interfaces. |
 | hubble.relay.listenPort | string | `"4245"` | Port to listen to. |
 | hubble.relay.nodeSelector | object | `{"kubernetes.io/os":"linux"}` | Node labels for pod assignment ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector |
@@ -582,7 +585,7 @@ contributors across the globe, there is almost always someone available to help.
 | hubble.ui.backend.extraEnv | list | `[]` | Additional hubble-ui backend environment variables. |
 | hubble.ui.backend.extraVolumeMounts | list | `[]` | Additional hubble-ui backend volumeMounts. |
 | hubble.ui.backend.extraVolumes | list | `[]` | Additional hubble-ui backend volumes. |
-| hubble.ui.backend.image | object | `{"digest":"sha256:0e0eed917653441fded4e7cdb096b7be6a3bddded5a2dd10812a27b1fc6ed95b","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/hubble-ui-backend","tag":"v0.13.1","useDigest":true}` | Hubble-ui backend image. |
+| hubble.ui.backend.image | object | `{"digest":"sha256:a034b7e98e6ea796ed26df8f4e71f83fc16465a19d166eff67a03b822c0bfa15","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/hubble-ui-backend","tag":"v0.13.2","useDigest":true}` | Hubble-ui backend image. |
 | hubble.ui.backend.livenessProbe.enabled | bool | `false` | Enable liveness probe for Hubble-ui backend (requires Hubble-ui 0.12+) |
 | hubble.ui.backend.readinessProbe.enabled | bool | `false` | Enable readiness probe for Hubble-ui backend (requires Hubble-ui 0.12+) |
 | hubble.ui.backend.resources | object | `{}` | Resource requests and limits for the 'backend' container of the 'hubble-ui' deployment. |
@@ -592,7 +595,7 @@ contributors across the globe, there is almost always someone available to help.
 | hubble.ui.frontend.extraEnv | list | `[]` | Additional hubble-ui frontend environment variables. |
 | hubble.ui.frontend.extraVolumeMounts | list | `[]` | Additional hubble-ui frontend volumeMounts. |
 | hubble.ui.frontend.extraVolumes | list | `[]` | Additional hubble-ui frontend volumes. |
-| hubble.ui.frontend.image | object | `{"digest":"sha256:e2e9313eb7caf64b0061d9da0efbdad59c6c461f6ca1752768942bfeda0796c6","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/hubble-ui","tag":"v0.13.1","useDigest":true}` | Hubble-ui frontend image. |
+| hubble.ui.frontend.image | object | `{"digest":"sha256:9e37c1296b802830834cc87342a9182ccbb71ffebb711971e849221bd9d59392","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/hubble-ui","tag":"v0.13.2","useDigest":true}` | Hubble-ui frontend image. |
 | hubble.ui.frontend.resources | object | `{}` | Resource requests and limits for the 'frontend' container of the 'hubble-ui' deployment. |
 | hubble.ui.frontend.securityContext | object | `{}` | Hubble-ui frontend security context. |
 | hubble.ui.frontend.server.ipv6 | object | `{"enabled":true}` | Controls server listener for ipv6 |
@@ -622,7 +625,7 @@ contributors across the globe, there is almost always someone available to help.
 | hubble.ui.updateStrategy | object | `{"rollingUpdate":{"maxUnavailable":1},"type":"RollingUpdate"}` | hubble-ui update strategy. |
 | identityAllocationMode | string | `"crd"` | Method to use for identity allocation (`crd`, `kvstore` or `doublewrite-readkvstore` / `doublewrite-readcrd` for migrating between identity backends). |
 | identityChangeGracePeriod | string | `"5s"` | Time to wait before using new identity on endpoint identity change. |
-| image | object | `{"digest":"sha256:8969bfd9c87cbea91e40665f8ebe327268c99d844ca26d7d12165de07f702866","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/cilium","tag":"v1.17.1","useDigest":true}` | Agent container image. |
+| image | object | `{"digest":"sha256:3c4c9932b5d8368619cb922a497ff2ebc8def5f41c18e410bcc84025fcd385b1","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/cilium","tag":"v1.17.2","useDigest":true}` | Agent container image. |
 | imagePullSecrets | list | `[]` | Configure image pull secrets for pulling container images |
 | ingressController.default | bool | `false` | Set cilium ingress controller to be the default ingress controller This will let cilium ingress controller route entries without ingress class set |
 | ingressController.defaultSecretName | string | `nil` | Default secret name for ingresses without .spec.tls[].secretName set. |
@@ -759,7 +762,7 @@ contributors across the globe, there is almost always someone available to help.
 | operator.hostNetwork | bool | `true` | HostNetwork setting |
 | operator.identityGCInterval | string | `"15m0s"` | Interval for identity garbage collection. |
 | operator.identityHeartbeatTimeout | string | `"30m0s"` | Timeout for identity heartbeats. |
-| operator.image | object | `{"alibabacloudDigest":"sha256:034b479fba340f9d98510e509c7ce1c36e8889a109d5f1c2240fcb0942bc772c","awsDigest":"sha256:da74748057c836471bfdc0e65bb29ba0edb82916ec4b99f6a4f002b2fcc849d6","azureDigest":"sha256:b9e3e3994f5fcf1832e1f344f3b3b544832851b1990f124b2c2c68e3ffe04a9b","genericDigest":"sha256:628becaeb3e4742a1c36c4897721092375891b58bae2bfcae48bbf4420aaee97","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/operator","suffix":"","tag":"v1.17.1","useDigest":true}` | cilium-operator image. |
+| operator.image | object | `{"alibabacloudDigest":"sha256:7cb8c23417f65348bb810fe92fb05b41d926f019d77442f3fa1058d17fea7ffe","awsDigest":"sha256:955096183e22a203bbb198ca66e3266ce4dbc2b63f1a2fbd03f9373dcd97893c","azureDigest":"sha256:455fb88b558b1b8ba09d63302ccce76b4930581be89def027184ab04335c20e0","genericDigest":"sha256:81f2d7198366e8dec2903a3a8361e4c68d47d19c68a0d42f0b7b6e3f0523f249","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/operator","suffix":"","tag":"v1.17.2","useDigest":true}` | cilium-operator image. |
 | operator.nodeGCInterval | string | `"5m0s"` | Interval for cilium node garbage collection. |
 | operator.nodeSelector | object | `{"kubernetes.io/os":"linux"}` | Node labels for cilium-operator pod assignment ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector |
 | operator.podAnnotations | object | `{}` | Annotations to be added to cilium-operator pods |
@@ -809,7 +812,7 @@ contributors across the globe, there is almost always someone available to help.
 | preflight.extraEnv | list | `[]` | Additional preflight environment variables. |
 | preflight.extraVolumeMounts | list | `[]` | Additional preflight volumeMounts. |
 | preflight.extraVolumes | list | `[]` | Additional preflight volumes. |
-| preflight.image | object | `{"digest":"sha256:8969bfd9c87cbea91e40665f8ebe327268c99d844ca26d7d12165de07f702866","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/cilium","tag":"v1.17.1","useDigest":true}` | Cilium pre-flight image. |
+| preflight.image | object | `{"digest":"sha256:3c4c9932b5d8368619cb922a497ff2ebc8def5f41c18e410bcc84025fcd385b1","override":null,"pullPolicy":"IfNotPresent","repository":"quay.io/cilium/cilium","tag":"v1.17.2","useDigest":true}` | Cilium pre-flight image. |
 | preflight.nodeSelector | object | `{"kubernetes.io/os":"linux"}` | Node labels for preflight pod assignment ref: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector |
 | preflight.podAnnotations | object | `{}` | Annotations to be added to preflight pods |
 | preflight.podDisruptionBudget.enabled | bool | `false` | enable PodDisruptionBudget ref: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/ |
@@ -883,7 +886,7 @@ contributors across the globe, there is almost always someone available to help.
 | tls.caBundle.useSecret | bool | `false` | Use a Secret instead of a ConfigMap. |
 | tls.readSecretsOnlyFromSecretsNamespace | string | `nil` | Configure if the Cilium Agent will only look in `tls.secretsNamespace` for    CiliumNetworkPolicy relevant Secrets.    If false, the Cilium Agent will be granted READ (GET/LIST/WATCH) access    to _all_ secrets in the entire cluster. This is not recommended and is    included for backwards compatibility.    This value obsoletes `tls.secretsBackend`, with `true` == `local` in the old    setting, and `false` == `k8s`. |
 | tls.secretSync | object | `{"enabled":null}` | Configures settings for synchronization of TLS Interception Secrets |
-| tls.secretSync.enabled | string | `nil` | Enable synchronization of Secrets for TLS Interception. If disabled and tls.secretsBackend is set to 'k8s', then secrets will be read directly by the agent. |
+| tls.secretSync.enabled | string | `nil` | Enable synchronization of Secrets for TLS Interception. If disabled and tls.readSecretsOnlyFromSecretsNamespace is set to 'false', then secrets will be read directly by the agent. |
 | tls.secretsBackend | string | `nil` | This configures how the Cilium agent loads the secrets used TLS-aware CiliumNetworkPolicies (namely the secrets referenced by terminatingTLS and originatingTLS). This value is DEPRECATED and will be removed in a future version. Use `tls.readSecretsOnlyFromSecretsNamespace` instead. Possible values:   - local   - k8s |
 | tls.secretsNamespace | object | `{"create":true,"name":"cilium-secrets"}` | Configures where secrets used in CiliumNetworkPolicies will be looked for |
 | tls.secretsNamespace.create | bool | `true` | Create secrets namespace for TLS Interception secrets. |
@@ -891,6 +894,7 @@ contributors across the globe, there is almost always someone available to help.
 | tolerations | list | `[{"operator":"Exists"}]` | Node tolerations for agent scheduling to nodes with taints ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/ |
 | tunnelPort | int | Port 8472 for VXLAN, Port 6081 for Geneve | Configure VXLAN and Geneve tunnel port. |
 | tunnelProtocol | string | `"vxlan"` | Tunneling protocol to use in tunneling mode and for ad-hoc tunnels. Possible values:   - ""   - vxlan   - geneve |
+| tunnelSourcePortRange | string | 0-0 to let the kernel driver decide the range | Configure VXLAN and Geneve tunnel source port range hint. |
 | updateStrategy | object | `{"rollingUpdate":{"maxUnavailable":2},"type":"RollingUpdate"}` | Cilium agent update strategy |
 | upgradeCompatibility | string | `nil` | upgradeCompatibility helps users upgrading to ensure that the configMap for Cilium will not change critical values to ensure continued operation This flag is not required for new installations. For example: '1.7', '1.8', '1.9' |
 | vtep.cidr | string | `""` | A space separated list of VTEP device CIDRs, for example "1.1.1.0/24 1.1.2.0/24" |
--- a/packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.yaml
+++ b/packages/system/cilium/charts/cilium/files/cilium-envoy/configmap/bootstrap-config.yaml
@@ -7,8 +7,15 @@ staticResources:
  - name: "envoy-prometheus-metrics-listener"
    address:
      socketAddress:
-        address: "0.0.0.0"
+        address: {{ .Values.ipv4.enabled | ternary "0.0.0.0" "::" | quote }}
        portValue: {{ .Values.envoy.prometheus.port }}
+    {{- if and .Values.ipv4.enabled .Values.ipv6.enabled }}
+    additionalAddresses:
+    - address:
+        socketAddress:
+          address: "::"
+          portValue: {{ .Values.envoy.prometheus.port }}
+    {{- end }}
    filterChains:
    - filters:
      - name: "envoy.filters.network.http_connection_manager"
@@ -289,7 +296,7 @@ overloadManager:
 applicationLogConfig:
  logFormat:
    {{- if .Values.envoy.log.format_json }}
-    jsonFormat: "{{ .Values.envoy.log.format_json | toJson }}"
+    jsonFormat: {{ .Values.envoy.log.format_json | toJson }}
    {{- else }}
    textFormat: "{{ .Values.envoy.log.format }}"
    {{- end }}
--- a/packages/system/cilium/charts/cilium/templates/cilium-agent/daemonset.yaml
+++ b/packages/system/cilium/charts/cilium/templates/cilium-agent/daemonset.yaml
@@ -232,7 +232,7 @@ spec:
        resources:
          {{- toYaml . | trim | nindent 10 }}
        {{- end }}
-        {{- if or .Values.prometheus.enabled .Values.hubble.metrics.enabled }}
+        {{- if or .Values.prometheus.enabled (or .Values.hubble.metrics.enabled .Values.hubble.metrics.dynamic.enabled) }}
        ports:
        - name: peer-service
          containerPort: {{ .Values.hubble.peerService.targetPort }}
@@ -364,7 +364,7 @@ spec:
          mountPath: {{ .Values.kubeConfigPath }}
          readOnly: true
        {{- end }}
-        {{- if and .Values.hubble.enabled .Values.hubble.metrics.enabled .Values.hubble.metrics.tls.enabled }}
+        {{- if and .Values.hubble.enabled (or .Values.hubble.metrics.enabled .Values.hubble.metrics.dynamic.enabled) .Values.hubble.metrics.tls.enabled }}
        - name: hubble-metrics-tls
          mountPath: /var/lib/cilium/tls/hubble-metrics
          readOnly: true
@@ -999,7 +999,7 @@ spec:
                path: client-ca.crt
          {{- end }}
      {{- end }}
-      {{- if and .Values.hubble.enabled .Values.hubble.metrics.enabled .Values.hubble.metrics.tls.enabled }}
+      {{- if and .Values.hubble.enabled (or .Values.hubble.metrics.enabled .Values.hubble.metrics.dynamic.enabled) .Values.hubble.metrics.tls.enabled }}
      - name: hubble-metrics-tls
        projected:
          # note: the leading zero means this number is in octal representation: do not remove it
--- a/packages/system/cilium/charts/cilium/templates/cilium-agent/rolebinding.yaml
+++ b/packages/system/cilium/charts/cilium/templates/cilium-agent/rolebinding.yaml
@@ -39,6 +39,9 @@ metadata:
  {{- end }}
  labels:
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
@@ -62,6 +65,9 @@ metadata:
  {{- end }}
  labels:
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
@@ -85,6 +91,9 @@ metadata:
  {{- end }}
  labels:
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
@@ -104,6 +113,9 @@ metadata:
  namespace: {{ .Values.bgpControlPlane.secretsNamespace.name | quote }}
  labels:
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
@@ -123,6 +135,9 @@ metadata:
  namespace: {{ .Values.tls.secretsNamespace.name | quote }}
  labels:
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
--- a/packages/system/cilium/charts/cilium/templates/cilium-agent/service.yaml
+++ b/packages/system/cilium/charts/cilium/templates/cilium-agent/service.yaml
@@ -46,6 +46,9 @@ metadata:
    k8s-app: cilium
    app.kubernetes.io/name: cilium-agent
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 spec:
  clusterIP: None
  type: ClusterIP
--- a/packages/system/cilium/charts/cilium/templates/cilium-configmap.yaml
+++ b/packages/system/cilium/charts/cilium/templates/cilium-configmap.yaml
@@ -403,7 +403,7 @@ data:

 {{- if .Values.bpf.authMapMax }}
  # bpf-auth-map-max specifies the maximum number of entries in the auth map
-  bpf-auth-map-max: {{ .Values.bpf.authMapMax | quote }}
+  bpf-auth-map-max: "{{ .Values.bpf.authMapMax | int }}"
 {{- end }}
 {{- if or $bpfCtTcpMax $bpfCtAnyMax }}
  # bpf-ct-global-*-max specifies the maximum number of connections
@@ -419,34 +419,34 @@ data:
  # For users upgrading from Cilium 1.2 or earlier, to minimize disruption
  # during the upgrade process, set bpf-ct-global-tcp-max to 1000000.
 {{- if $bpfCtTcpMax }}
-  bpf-ct-global-tcp-max: {{ $bpfCtTcpMax | quote }}
+  bpf-ct-global-tcp-max: "{{ $bpfCtTcpMax | int }}"
 {{- end }}
 {{- if $bpfCtAnyMax }}
-  bpf-ct-global-any-max: {{ $bpfCtAnyMax | quote }}
+  bpf-ct-global-any-max: "{{ $bpfCtAnyMax | int }}"
 {{- end }}
 {{- end }}
 {{- if .Values.bpf.ctAccounting }}
-  bpf-conntrack-accounting: "{{ .Values.bpf.ctAccounting }}"
+  bpf-conntrack-accounting: "{{ .Values.bpf.ctAccounting | int }}"
 {{- end }}
 {{- if .Values.bpf.natMax }}
  # bpf-nat-global-max specified the maximum number of entries in the
  # BPF NAT table.
-  bpf-nat-global-max: "{{ .Values.bpf.natMax }}"
+  bpf-nat-global-max: "{{ .Values.bpf.natMax | int }}"
 {{- end }}
 {{- if .Values.bpf.neighMax }}
  # bpf-neigh-global-max specified the maximum number of entries in the
  # BPF neighbor table.
-  bpf-neigh-global-max: "{{ .Values.bpf.neighMax }}"
+  bpf-neigh-global-max: "{{ .Values.bpf.neighMax | int }}"
 {{- end }}
 {{- if hasKey .Values.bpf "policyMapMax" }}
  # bpf-policy-map-max specifies the maximum number of entries in endpoint
  # policy map (per endpoint)
-  bpf-policy-map-max: "{{ .Values.bpf.policyMapMax }}"
+  bpf-policy-map-max: "{{ .Values.bpf.policyMapMax | int }}"
 {{- end }}
 {{- if hasKey .Values.bpf "lbMapMax" }}
  # bpf-lb-map-max specifies the maximum number of entries in bpf lb service,
  # backend and affinity maps.
-  bpf-lb-map-max: "{{ .Values.bpf.lbMapMax }}"
+  bpf-lb-map-max: "{{ .Values.bpf.lbMapMax | int }}"
 {{- end }}
 {{- if hasKey .Values.bpf "lbExternalClusterIP" }}
  bpf-lb-external-clusterip: {{ .Values.bpf.lbExternalClusterIP | quote }}
@@ -461,6 +461,7 @@ data:
  bpf-lb-mode-annotation: {{ .Values.bpf.lbModeAnnotation | quote }}
 {{- end }}

+  bpf-distributed-lru: {{ .Values.bpf.distributedLRU.enabled | quote }}
  bpf-events-drop-enabled: {{ .Values.bpf.events.drop.enabled | quote }}
  bpf-events-policy-verdict-enabled: {{ .Values.bpf.events.policyVerdict.enabled | quote }}
  bpf-events-trace-enabled: {{ .Values.bpf.events.trace.enabled | quote }}
@@ -513,6 +514,9 @@ data:
 {{- if .Values.tunnelPort }}
  tunnel-port: {{ .Values.tunnelPort | quote }}
 {{- end }}
+{{- if .Values.tunnelSourcePortRange }}
+  tunnel-source-port-range: {{ .Values.tunnelSourcePortRange | quote }}
+{{- end }}

 {{- if .Values.serviceNoBackendResponse }}
  service-no-backend-response: "{{ .Values.serviceNoBackendResponse }}"
@@ -927,9 +931,8 @@ data:
  operator-api-serve-addr: {{ $defaultOperatorApiServeAddr | quote }}
 {{- end }}

-{{- if .Values.hubble.enabled }}
-  # Enable Hubble gRPC service.
  enable-hubble: {{ .Values.hubble.enabled | quote }}
+{{- if .Values.hubble.enabled }}
  # UNIX domain socket for Hubble server to listen to.
  hubble-socket-path: {{ .Values.hubble.socketPath | quote }}
 {{- if hasKey .Values.hubble "eventQueueSize" }}
@@ -941,7 +944,7 @@ data:
  # Capacity of the buffer to store recent events.
  hubble-event-buffer-capacity: {{ .Values.hubble.eventBufferCapacity | quote }}
 {{- end }}
-{{- if .Values.hubble.metrics.enabled }}
+{{- if or .Values.hubble.metrics.enabled .Values.hubble.metrics.dynamic.enabled}}
  # Address to expose Hubble metrics (e.g. ":7070"). Metrics server will be disabled if this
  # field is not set.
  hubble-metrics-server: ":{{ .Values.hubble.metrics.port }}"
@@ -953,14 +956,20 @@ data:
  hubble-metrics-server-tls-client-ca-files: /var/lib/cilium/tls/hubble-metrics/client-ca.crt
  {{- end }}
  {{- end }}
+{{- end }}
+{{- if .Values.hubble.metrics.enabled }}
  # A space separated list of metrics to enable. See [0] for available metrics.
  #
  # https://github.com/cilium/hubble/blob/master/Documentation/metrics.md
  hubble-metrics: {{- range .Values.hubble.metrics.enabled }}
    {{.}}
+  {{- end}}
+{{- if .Values.hubble.metrics.dynamic.enabled }}
+  hubble-dynamic-metrics-config-path: /dynamic-metrics-config/dynamic-metrics.yaml
 {{- end }}
  enable-hubble-open-metrics: {{ .Values.hubble.metrics.enableOpenMetrics | quote }}
 {{- end }}
+
 {{- if .Values.hubble.redact }}
 {{- if eq .Values.hubble.redact.enabled true }}
  # Enables hubble redact capabilities
@@ -1004,10 +1013,6 @@ data:
  hubble-flowlogs-config-path: /flowlog-config/flowlogs.yaml
 {{- end }}
 {{- end }}
-{{- if .Values.hubble.metrics.dynamic.enabled }}
-  hubble-dynamic-metrics-config-path: /dynamic-metrics-config/dynamic-metrics.yaml
-  hubble-metrics-server: ":{{ .Values.hubble.metrics.port }}"
-{{- end }}
 {{- if hasKey .Values.hubble "listenAddress" }}
  # An additional address for Hubble server to listen to (e.g. ":4244").
  hubble-listen-address: {{ .Values.hubble.listenAddress | quote }}
@@ -1041,8 +1046,8 @@ data:
 {{- else }}
  ipam: {{ $ipam | quote }}
 {{- end }}
-{{- if hasKey .Values.ipam "multiPoolPreAllocation" }}
-  ipam-multi-pool-pre-allocation: {{ .Values.ipam.multiPoolPreAllocation }}
+{{- if .Values.ipam.multiPoolPreAllocation }}
+  ipam-multi-pool-pre-allocation: {{ .Values.ipam.multiPoolPreAllocation | quote }}
 {{- end }}

 {{- if .Values.ipam.ciliumNodeUpdateRate }}
@@ -1335,6 +1340,10 @@ data:
  external-envoy-proxy: {{ include "envoyDaemonSetEnabled" . | quote }}
  envoy-base-id: {{ .Values.envoy.baseID | quote }}

+{{- if .Values.envoy.policyRestoreTimeoutDuration }}
+  envoy-policy-restore-timeout: {{ .Values.envoy.policyRestoreTimeoutDuration | quote }}
+{{- end }}
+
 {{- if .Values.envoy.log.path }}
  envoy-log: {{ .Values.envoy.log.path | quote }}
 {{- end }}
--- a/packages/system/cilium/charts/cilium/templates/cilium-operator/role.yaml
+++ b/packages/system/cilium/charts/cilium/templates/cilium-operator/role.yaml
@@ -41,6 +41,9 @@ metadata:
  {{- end }}
  labels:
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 rules:
 - apiGroups:
  - ""
@@ -66,6 +69,9 @@ metadata:
  {{- end }}
  labels:
    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 rules:
 - apiGroups:
  - ""
--- a/packages/system/cilium/charts/cilium/templates/cilium-operator/rolebinding.yaml
+++ b/packages/system/cilium/charts/cilium/templates/cilium-operator/rolebinding.yaml
@@ -7,24 +7,23 @@ kind: RoleBinding
 metadata:
  name: cilium-operator-ingress-secrets
  namespace: {{ .Values.ingressController.secretsNamespace.name | quote }}
-  {{- with .Values.commonLabels }}
  labels:
+    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
    {{- toYaml . | nindent 4 }}
-  {{- end }}
+    {{- end }}
  {{- with .Values.operator.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
-  labels:
-    app.kubernetes.io/part-of: cilium
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cilium-operator-ingress-secrets
 subjects:
-  - kind: ServiceAccount
-    name: {{ .Values.serviceAccounts.operator.name | quote }}
-    namespace: {{ include "cilium.namespace" . }}
+- kind: ServiceAccount
+  name: {{ .Values.serviceAccounts.operator.name | quote }}
+  namespace: {{ include "cilium.namespace" . }}
 {{- end }}

 {{- if and .Values.operator.enabled .Values.serviceAccounts.operator.create .Values.gatewayAPI.enabled .Values.gatewayAPI.secretsNamespace.sync .Values.gatewayAPI.secretsNamespace.name }}
@@ -34,12 +33,15 @@ kind: RoleBinding
 metadata:
  name: cilium-operator-gateway-secrets
  namespace: {{ .Values.gatewayAPI.secretsNamespace.name | quote }}
+  labels:
+    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
  {{- with .Values.operator.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
-  labels:
-    app.kubernetes.io/part-of: cilium
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
@@ -57,12 +59,15 @@ kind: RoleBinding
 metadata:
  name: cilium-operator-tlsinterception-secrets
  namespace: {{ .Values.tls.secretsNamespace.name | quote }}
+  labels:
+    app.kubernetes.io/part-of: cilium
+    {{- with .Values.commonLabels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
  {{- with .Values.operator.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
-  labels:
-    app.kubernetes.io/part-of: cilium
 roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
--- a/packages/system/cilium/charts/cilium/templates/hubble/servicemonitor.yaml
+++ b/packages/system/cilium/charts/cilium/templates/hubble/servicemonitor.yaml
@@ -1,4 +1,4 @@
-{{- if and .Values.hubble.enabled .Values.hubble.metrics.enabled .Values.hubble.metrics.serviceMonitor.enabled }}
+{{- if and .Values.hubble.enabled (or .Values.hubble.metrics.enabled .Values.hubble.metrics.dynamic.enabled) .Values.hubble.metrics.serviceMonitor.enabled }}
 apiVersion: monitoring.coreos.com/v1
 kind: ServiceMonitor
 metadata:
--- a/packages/system/cilium/charts/cilium/templates/spire/server/service.yaml
+++ b/packages/system/cilium/charts/cilium/templates/spire/server/service.yaml
@@ -4,10 +4,13 @@ kind: Service
 metadata:
  name: spire-server
  namespace: {{ .Values.authentication.mutual.spire.install.namespace }}
-  {{- with .Values.commonLabels }}
  labels:
+    {{- with .Values.commonLabels }}
    {{- toYaml . | nindent 4 }}
-  {{- end }}
+    {{- end }}
+    {{- with .Values.authentication.mutual.spire.install.server.service.labels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
  {{- if or .Values.authentication.mutual.spire.install.server.service.annotations .Values.authentication.mutual.spire.annotations }}
  annotations:
    {{- with .Values.authentication.mutual.spire.annotations }}
@@ -17,10 +20,6 @@ metadata:
      {{- toYaml . | nindent 4 }}
    {{- end }}
  {{- end }}
-  {{- with .Values.authentication.mutual.spire.install.server.service.labels }}
-  labels:
-    {{- toYaml . | nindent 8 }}
-  {{- end }}
 spec:
  type: {{ .Values.authentication.mutual.spire.install.server.service.type }}
  ports:
--- a/packages/system/cilium/charts/cilium/templates/spire/server/statefulset.yaml
+++ b/packages/system/cilium/charts/cilium/templates/spire/server/statefulset.yaml
@@ -4,10 +4,6 @@ kind: StatefulSet
 metadata:
  name: spire-server
  namespace: {{ .Values.authentication.mutual.spire.install.namespace }}
-  {{- with .Values.commonLabels }}
-  labels:
-    {{- toYaml . | nindent 4 }}
-  {{- end }}
  {{- if or .Values.authentication.mutual.spire.install.server.annotations .Values.authentication.mutual.spire.annotations }}
  annotations:
    {{- with .Values.authentication.mutual.spire.annotations }}
@@ -19,9 +15,12 @@ metadata:
  {{- end }}
  labels:
    app: spire-server
-  {{- with .Values.authentication.mutual.spire.install.server.labels }}
+    {{- with .Values.commonLabels }}
    {{- toYaml . | nindent 4 }}
-  {{- end }}
+    {{- end }}
+    {{- with .Values.authentication.mutual.spire.install.server.labels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
 spec:
  replicas: 1
  selector:
--- a/packages/system/cilium/charts/cilium/values.schema.json
+++ b/packages/system/cilium/charts/cilium/values.schema.json
@@ -519,6 +519,14 @@
        "disableExternalIPMitigation": {
          "type": "boolean"
        },
+        "distributedLRU": {
+          "properties": {
+            "enabled": {
+              "type": "boolean"
+            }
+          },
+          "type": "object"
+        },
        "enableTCX": {
          "type": "boolean"
        },
@@ -2110,6 +2118,12 @@
          },
          "type": "object"
        },
+        "policyRestoreTimeoutDuration": {
+          "type": [
+            "null",
+            "string"
+          ]
+        },
        "priorityClassName": {
          "type": [
            "null",
@@ -5462,6 +5476,9 @@
    "tunnelProtocol": {
      "type": "string"
    },
+    "tunnelSourcePortRange": {
+      "type": "string"
+    },
    "updateStrategy": {
      "properties": {
        "rollingUpdate": {
--- a/packages/system/cilium/charts/cilium/values.yaml
+++ b/packages/system/cilium/charts/cilium/values.yaml
@@ -191,10 +191,10 @@ image:
  # @schema
  override: ~
  repository: "quay.io/cilium/cilium"
-  tag: "v1.17.1"
+  tag: "v1.17.2"
  pullPolicy: "IfNotPresent"
  # cilium-digest
-  digest: "sha256:8969bfd9c87cbea91e40665f8ebe327268c99d844ca26d7d12165de07f702866"
+  digest: "sha256:3c4c9932b5d8368619cb922a497ff2ebc8def5f41c18e410bcc84025fcd385b1"
  useDigest: true
 # -- Scheduling configurations for cilium pods
 scheduling:
@@ -495,6 +495,13 @@ bpf:
  # tracking table.
  # @default -- `262144`
  ctAnyMax: ~
+  # -- Use a distributed per-CPU backend memory for the core BPF LRU maps that
+  # Cilium uses. This improves performance significantly; it is also recommended
+  # to increase BPF map sizes when enabling it.
+  distributedLRU:
+    # -- Enable distributed LRU backend memory. For compatibility with existing
+    # installations it is off by default.
+    enabled: false
  # -- Control events generated by the Cilium datapath exposed to Cilium monitor and Hubble.
  # Helm configuration for BPF events map rate limiting is experimental and might change
  # in upcoming releases.
@@ -1433,9 +1440,9 @@ hubble:
      # @schema
      override: ~
      repository: "quay.io/cilium/hubble-relay"
-      tag: "v1.17.1"
+      tag: "v1.17.2"
      # hubble-relay-digest
-      digest: "sha256:397e8fbb188157f744390a7b272a1dec31234e605bcbe22d8919a166d202a3dc"
+      digest: "sha256:42a8db5c256c516cacb5b8937c321b2373ad7a6b0a1e5a5120d5028433d586cc"
      useDigest: true
      pullPolicy: "IfNotPresent"
    # -- Specifies the resources for the hubble-relay pods
@@ -1684,8 +1691,8 @@ hubble:
        # @schema
        override: ~
        repository: "quay.io/cilium/hubble-ui-backend"
-        tag: "v0.13.1"
-        digest: "sha256:0e0eed917653441fded4e7cdb096b7be6a3bddded5a2dd10812a27b1fc6ed95b"
+        tag: "v0.13.2"
+        digest: "sha256:a034b7e98e6ea796ed26df8f4e71f83fc16465a19d166eff67a03b822c0bfa15"
        useDigest: true
        pullPolicy: "IfNotPresent"
      # -- Hubble-ui backend security context.
@@ -1718,8 +1725,8 @@ hubble:
        # @schema
        override: ~
        repository: "quay.io/cilium/hubble-ui"
-        tag: "v0.13.1"
-        digest: "sha256:e2e9313eb7caf64b0061d9da0efbdad59c6c461f6ca1752768942bfeda0796c6"
+        tag: "v0.13.2"
+        digest: "sha256:9e37c1296b802830834cc87342a9182ccbb71ffebb711971e849221bd9d59392"
        useDigest: true
        pullPolicy: "IfNotPresent"
      # -- Hubble-ui frontend security context.
@@ -2332,6 +2339,11 @@ envoy:
  xffNumTrustedHopsL7PolicyIngress: 0
  # -- Number of trusted hops regarding the x-forwarded-for and related HTTP headers for the egress L7 policy enforcement Envoy listeners.
  xffNumTrustedHopsL7PolicyEgress: 0
+  # @schema
+  # type: [null, string]
+  # @schema
+  # -- Max duration to wait for endpoint policies to be restored on restart. Default "3m".
+  policyRestoreTimeoutDuration: null
  # -- Envoy container image.
  image:
    # @schema
@@ -2339,9 +2351,9 @@ envoy:
    # @schema
    override: ~
    repository: "quay.io/cilium/cilium-envoy"
-    tag: "v1.31.5-1739264036-958bef243c6c66fcfd73ca319f2eb49fff1eb2ae"
+    tag: "v1.31.5-1741765102-efed3defcc70ab5b263a0fc44c93d316b846a211"
    pullPolicy: "IfNotPresent"
-    digest: "sha256:fc708bd36973d306412b2e50c924cd8333de67e0167802c9b48506f9d772f521"
+    digest: "sha256:377c78c13d2731f3720f931721ee309159e782d882251709cb0fac3b42c03f4b"
    useDigest: true
  # -- Additional containers added to the cilium Envoy DaemonSet.
  extraContainers: []
@@ -2605,7 +2617,7 @@ tls:
    # type: [null, boolean]
    # @schema
    # -- Enable synchronization of Secrets for TLS Interception. If disabled and
-    # tls.secretsBackend is set to 'k8s', then secrets will be read directly by the agent.
+    # tls.readSecretsOnlyFromSecretsNamespace is set to 'false', then secrets will be read directly by the agent.
    enabled: ~
  # -- Base64 encoded PEM values for the CA certificate and private key.
  # This can be used as common CA to generate certificates used by hubble and clustermesh components.
@@ -2658,6 +2670,9 @@ routingMode: ""
 # -- Configure VXLAN and Geneve tunnel port.
 # @default -- Port 8472 for VXLAN, Port 6081 for Geneve
 tunnelPort: 0
+# -- Configure VXLAN and Geneve tunnel source port range hint.
+# @default -- 0-0 to let the kernel driver decide the range
+tunnelSourcePortRange: 0-0
 # -- Configure what the response should be to traffic for a service without backends.
 # Possible values:
 #  - reject (default)
@@ -2693,15 +2708,15 @@ operator:
    # @schema
    override: ~
    repository: "quay.io/cilium/operator"
-    tag: "v1.17.1"
+    tag: "v1.17.2"
    # operator-generic-digest
-    genericDigest: "sha256:628becaeb3e4742a1c36c4897721092375891b58bae2bfcae48bbf4420aaee97"
+    genericDigest: "sha256:81f2d7198366e8dec2903a3a8361e4c68d47d19c68a0d42f0b7b6e3f0523f249"
    # operator-azure-digest
-    azureDigest: "sha256:b9e3e3994f5fcf1832e1f344f3b3b544832851b1990f124b2c2c68e3ffe04a9b"
+    azureDigest: "sha256:455fb88b558b1b8ba09d63302ccce76b4930581be89def027184ab04335c20e0"
    # operator-aws-digest
-    awsDigest: "sha256:da74748057c836471bfdc0e65bb29ba0edb82916ec4b99f6a4f002b2fcc849d6"
+    awsDigest: "sha256:955096183e22a203bbb198ca66e3266ce4dbc2b63f1a2fbd03f9373dcd97893c"
    # operator-alibabacloud-digest
-    alibabacloudDigest: "sha256:034b479fba340f9d98510e509c7ce1c36e8889a109d5f1c2240fcb0942bc772c"
+    alibabacloudDigest: "sha256:7cb8c23417f65348bb810fe92fb05b41d926f019d77442f3fa1058d17fea7ffe"
    useDigest: true
    pullPolicy: "IfNotPresent"
    suffix: ""
@@ -2976,9 +2991,9 @@ preflight:
    # @schema
    override: ~
    repository: "quay.io/cilium/cilium"
-    tag: "v1.17.1"
+    tag: "v1.17.2"
    # cilium-digest
-    digest: "sha256:8969bfd9c87cbea91e40665f8ebe327268c99d844ca26d7d12165de07f702866"
+    digest: "sha256:3c4c9932b5d8368619cb922a497ff2ebc8def5f41c18e410bcc84025fcd385b1"
    useDigest: true
    pullPolicy: "IfNotPresent"
  # -- The priority class to use for the preflight pod.
@@ -3125,9 +3140,9 @@ clustermesh:
      # @schema
      override: ~
      repository: "quay.io/cilium/clustermesh-apiserver"
-      tag: "v1.17.1"
+      tag: "v1.17.2"
      # clustermesh-apiserver-digest
-      digest: "sha256:1de22f46bfdd638de72c2224d5223ddc3bbeacda1803cb75799beca3d4bf7a4c"
+      digest: "sha256:981250ebdc6e66e190992eaf75cfca169113a8f08d5c3793fe15822176980398"
      useDigest: true
      pullPolicy: "IfNotPresent"
    # -- TCP port for the clustermesh-apiserver health API.
@@ -3634,7 +3649,7 @@ authentication:
          override: ~
          repository: "docker.io/library/busybox"
          tag: "1.37.0"
-          digest: "sha256:a5d0ce49aa801d475da48f8cb163c354ab95cab073cd3c138bd458fc8257fbf1"
+          digest: "sha256:498a000f370d8c37927118ed80afe8adc38d1edcbfc071627d17b25c88efcab0"
          useDigest: true
          pullPolicy: "IfNotPresent"
        # SPIRE agent configuration
--- a/packages/system/cilium/charts/cilium/values.yaml.tmpl
+++ b/packages/system/cilium/charts/cilium/values.yaml.tmpl
@@ -500,6 +500,13 @@ bpf:
  # tracking table.
  # @default -- `262144`
  ctAnyMax: ~
+  # -- Use a distributed per-CPU backend memory for the core BPF LRU maps that
+  # Cilium uses. This improves performance significantly; it is also recommended
+  # to increase BPF map sizes when enabling it.
+  distributedLRU:
+    # -- Enable distributed LRU backend memory. For compatibility with existing
+    # installations it is off by default.
+    enabled: false
  # -- Control events generated by the Cilium datapath exposed to Cilium monitor and Hubble.
  # Helm configuration for BPF events map rate limiting is experimental and might change
  # in upcoming releases.
@@ -2351,6 +2358,11 @@ envoy:
  xffNumTrustedHopsL7PolicyIngress: 0
  # -- Number of trusted hops regarding the x-forwarded-for and related HTTP headers for the egress L7 policy enforcement Envoy listeners.
  xffNumTrustedHopsL7PolicyEgress: 0
+  # @schema
+  # type: [null, string]
+  # @schema
+  # -- Max duration to wait for endpoint policies to be restored on restart. Default "3m".
+  policyRestoreTimeoutDuration: null
  # -- Envoy container image.
  image:
    # @schema
@@ -2626,7 +2638,7 @@ tls:
    # type: [null, boolean]
    # @schema
    # -- Enable synchronization of Secrets for TLS Interception. If disabled and
-    # tls.secretsBackend is set to 'k8s', then secrets will be read directly by the agent.
+    # tls.readSecretsOnlyFromSecretsNamespace is set to 'false', then secrets will be read directly by the agent.
    enabled: ~
  # -- Base64 encoded PEM values for the CA certificate and private key.
  # This can be used as common CA to generate certificates used by hubble and clustermesh components.
@@ -2679,6 +2691,9 @@ routingMode: ""
 # -- Configure VXLAN and Geneve tunnel port.
 # @default -- Port 8472 for VXLAN, Port 6081 for Geneve
 tunnelPort: 0
+# -- Configure VXLAN and Geneve tunnel source port range hint.
+# @default -- 0-0 to let the kernel driver decide the range
+tunnelSourcePortRange: 0-0
 # -- Configure what the response should be to traffic for a service without backends.
 # Possible values:
 #  - reject (default)
--- a/packages/system/cilium/images/cilium/Dockerfile
+++ b/packages/system/cilium/images/cilium/Dockerfile
@@ -1,2 +1,2 @@
-ARG VERSION=v1.17.1
+ARG VERSION=v1.17.2
 FROM quay.io/cilium/cilium:${VERSION}
--- a/packages/system/cozystack-controller/templates/rbac.yaml
+++ b/packages/system/cozystack-controller/templates/rbac.yaml
@@ -4,7 +4,7 @@ metadata:
  name: cozystack-controller
 rules:
 - apiGroups: [""]
-  resources: ["configmaps", "pods", "namespaces", "nodes", "services", "persistentvolumes"]
+  resources: ["configmaps", "pods", "namespaces", "nodes", "services", "persistentvolumes", "persistentvolumeclaims"]
  verbs: ["get", "watch", "list"]
 - apiGroups: ['cozystack.io']
  resources: ['*']
--- a/packages/system/gpu-operator/Chart.yaml
+++ b/packages/system/gpu-operator/Chart.yaml
@@ -0,0 +1,3 @@
+apiVersion: v2
+name: cozy-gpu-operator
+version: 0.0.0 # Placeholder; the actual version is set automatically during the build process
--- a/packages/system/gpu-operator/Makefile
+++ b/packages/system/gpu-operator/Makefile
@@ -0,0 +1,11 @@
+export NAME=gpu-operator
+export NAMESPACE=cozy-$(NAME)
+
+include ../../../scripts/common-envs.mk
+include ../../../scripts/package.mk
+
+update:
+	rm -rf charts
+	helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
+	helm repo update nvidia
+	helm pull nvidia/gpu-operator --untar --untardir charts
--- a/packages/system/gpu-operator/charts/gpu-operator/.helmignore
+++ b/packages/system/gpu-operator/charts/gpu-operator/.helmignore
@@ -0,0 +1,22 @@
+# Patterns to ignore when building packages.
+# This supports shell glob matching, relative path matching, and
+# negation (prefixed with !). Only one pattern per line.
+.DS_Store
+# Common VCS dirs
+.git/
+.gitignore
+.bzr/
+.bzrignore
+.hg/
+.hgignore
+.svn/
+# Common backup files
+*.swp
+*.bak
+*.tmp
+*~
+# Various IDEs
+.project
+.idea/
+*.tmproj
+.vscode/
--- a/packages/system/gpu-operator/charts/gpu-operator/Chart.lock
+++ b/packages/system/gpu-operator/charts/gpu-operator/Chart.lock
@@ -0,0 +1,6 @@
+dependencies:
+- name: node-feature-discovery
+  repository: https://kubernetes-sigs.github.io/node-feature-discovery/charts
+  version: 0.17.2
+digest: sha256:4c55d30d958027ef8997a2976449326de3c90049025c3ebb9bee017cad32cc3f
+generated: "2025-02-25T09:08:49.128088-08:00"
--- a/packages/system/gpu-operator/charts/gpu-operator/Chart.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/Chart.yaml
@@ -0,0 +1,23 @@
+apiVersion: v2
+appVersion: v25.3.0
+dependencies:
+- condition: nfd.enabled
+  name: node-feature-discovery
+  repository: https://kubernetes-sigs.github.io/node-feature-discovery/charts
+  version: v0.17.2
+description: NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
+home: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/overview.html
+icon: https://assets.nvidiagrid.net/ngc/logos/GPUoperator.png
+keywords:
+- gpu
+- cuda
+- compute
+- operator
+- deep learning
+- monitoring
+- tesla
+kubeVersion: '>= 1.16.0-0'
+name: gpu-operator
+sources:
+- https://github.com/NVIDIA/gpu-operator
+version: v25.3.0
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/.helmignore
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/.helmignore
@@ -0,0 +1,23 @@
+# Patterns to ignore when building packages.
+# This supports shell glob matching, relative path matching, and
+# negation (prefixed with !). Only one pattern per line.
+.DS_Store
+# Common VCS dirs
+.git/
+.gitignore
+.bzr/
+.bzrignore
+.hg/
+.hgignore
+.svn/
+# Common backup files
+*.swp
+*.bak
+*.tmp
+*.orig
+*~
+# Various IDEs
+.project
+.idea/
+*.tmproj
+.vscode/
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/Chart.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/Chart.yaml
@@ -0,0 +1,14 @@
+apiVersion: v2
+appVersion: v0.17.2
+description: 'Detects hardware features available on each node in a Kubernetes cluster,
+  and advertises those features using node labels. '
+home: https://github.com/kubernetes-sigs/node-feature-discovery
+keywords:
+- feature-discovery
+- feature-detection
+- node-labels
+name: node-feature-discovery
+sources:
+- https://github.com/kubernetes-sigs/node-feature-discovery
+type: application
+version: 0.17.2
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/README.md
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/README.md
@@ -0,0 +1,10 @@
+# Node Feature Discovery
+
+Node Feature Discovery (NFD) is a Kubernetes add-on for detecting hardware
+features and system configuration. Detected features are advertised as node
+labels. NFD provides flexible configuration and extension points for a wide
+range of vendor and application specific node labeling needs.
+
+See
+[NFD documentation](https://kubernetes-sigs.github.io/node-feature-discovery/v0.17/deployment/helm.html)
+for deployment instructions.
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/crds/nfd-api-crds.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/crds/nfd-api-crds.yaml
@@ -0,0 +1,711 @@
+---
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  annotations:
+    controller-gen.kubebuilder.io/version: v0.16.3
+  name: nodefeatures.nfd.k8s-sigs.io
+spec:
+  group: nfd.k8s-sigs.io
+  names:
+    kind: NodeFeature
+    listKind: NodeFeatureList
+    plural: nodefeatures
+    singular: nodefeature
+  scope: Namespaced
+  versions:
+  - name: v1alpha1
+    schema:
+      openAPIV3Schema:
+        description: |-
+          NodeFeature resource holds the features discovered for one node in the
+          cluster.
+        properties:
+          apiVersion:
+            description: |-
+              APIVersion defines the versioned schema of this representation of an object.
+              Servers should convert recognized schemas to the latest internal value, and
+              may reject unrecognized values.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
+            type: string
+          kind:
+            description: |-
+              Kind is a string value representing the REST resource this object represents.
+              Servers may infer this from the endpoint the client submits requests to.
+              Cannot be updated.
+              In CamelCase.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
+            type: string
+          metadata:
+            type: object
+          spec:
+            description: Specification of the NodeFeature, containing features discovered
+              for a node.
+            properties:
+              features:
+                description: Features is the full "raw" features data that has been
+                  discovered.
+                properties:
+                  attributes:
+                    additionalProperties:
+                      description: AttributeFeatureSet is a set of features having
+                        string value.
+                      properties:
+                        elements:
+                          additionalProperties:
+                            type: string
+                          description: Individual features of the feature set.
+                          type: object
+                      required:
+                      - elements
+                      type: object
+                    description: Attributes contains all the attribute-type features
+                      of the node.
+                    type: object
+                  flags:
+                    additionalProperties:
+                      description: FlagFeatureSet is a set of simple features only
+                        containing names without values.
+                      properties:
+                        elements:
+                          additionalProperties:
+                            description: |-
+                              Nil is a dummy empty struct for protobuf compatibility.
+                              NOTE: protobuf definitions have been removed but this is kept for API compatibility.
+                            type: object
+                          description: Individual features of the feature set.
+                          type: object
+                      required:
+                      - elements
+                      type: object
+                    description: Flags contains all the flag-type features of the
+                      node.
+                    type: object
+                  instances:
+                    additionalProperties:
+                      description: InstanceFeatureSet is a set of features each of
+                        which is an instance having multiple attributes.
+                      properties:
+                        elements:
+                          description: Individual features of the feature set.
+                          items:
+                            description: InstanceFeature represents one instance of
+                              a complex feature, e.g. a device.
+                            properties:
+                              attributes:
+                                additionalProperties:
+                                  type: string
+                                description: Attributes of the instance feature.
+                                type: object
+                            required:
+                            - attributes
+                            type: object
+                          type: array
+                      required:
+                      - elements
+                      type: object
+                    description: Instances contains all the instance-type features
+                      of the node.
+                    type: object
+                type: object
+              labels:
+                additionalProperties:
+                  type: string
+                description: Labels is the set of node labels that are requested to
+                  be created.
+                type: object
+            type: object
+        required:
+        - spec
+        type: object
+    served: true
+    storage: true
+---
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  annotations:
+    controller-gen.kubebuilder.io/version: v0.16.3
+  name: nodefeaturegroups.nfd.k8s-sigs.io
+spec:
+  group: nfd.k8s-sigs.io
+  names:
+    kind: NodeFeatureGroup
+    listKind: NodeFeatureGroupList
+    plural: nodefeaturegroups
+    shortNames:
+    - nfg
+    singular: nodefeaturegroup
+  scope: Namespaced
+  versions:
+  - name: v1alpha1
+    schema:
+      openAPIV3Schema:
+        description: NodeFeatureGroup resource holds Node pools by featureGroup
+        properties:
+          apiVersion:
+            description: |-
+              APIVersion defines the versioned schema of this representation of an object.
+              Servers should convert recognized schemas to the latest internal value, and
+              may reject unrecognized values.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
+            type: string
+          kind:
+            description: |-
+              Kind is a string value representing the REST resource this object represents.
+              Servers may infer this from the endpoint the client submits requests to.
+              Cannot be updated.
+              In CamelCase.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
+            type: string
+          metadata:
+            type: object
+          spec:
+            description: Spec defines the rules to be evaluated.
+            properties:
+              featureGroupRules:
+                description: List of rules to evaluate to determine nodes that belong
+                  in this group.
+                items:
+                  description: GroupRule defines a rule for nodegroup filtering.
+                  properties:
+                    matchAny:
+                      description: MatchAny specifies a list of matchers one of which
+                        must match.
+                      items:
+                        description: MatchAnyElem specifies one sub-matcher of MatchAny.
+                        properties:
+                          matchFeatures:
+                            description: MatchFeatures specifies a set of matcher
+                              terms all of which must match.
+                            items:
+                              description: |-
+                                FeatureMatcherTerm defines requirements against one feature set. All
+                                requirements (specified as MatchExpressions) are evaluated against each
+                                element in the feature set.
+                              properties:
+                                feature:
+                                  description: Feature is the name of the feature
+                                    set to match against.
+                                  type: string
+                                matchExpressions:
+                                  additionalProperties:
+                                    description: |-
+                                      MatchExpression specifies an expression to evaluate against a set of input
+                                      values. It contains an operator that is applied when matching the input and
+                                      an array of values that the operator evaluates the input against.
+                                    properties:
+                                      op:
+                                        description: Op is the operator to be applied.
+                                        enum:
+                                        - In
+                                        - NotIn
+                                        - InRegexp
+                                        - Exists
+                                        - DoesNotExist
+                                        - Gt
+                                        - Lt
+                                        - GtLt
+                                        - IsTrue
+                                        - IsFalse
+                                        type: string
+                                      value:
+                                        description: |-
+                                          Value is the list of values that the operator evaluates the input
+                                          against. Value should be empty if the operator is Exists, DoesNotExist,
+                                          IsTrue or IsFalse. Value should contain exactly one element if the
+                                          operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                          In other cases Value should contain at least one element.
+                                        items:
+                                          type: string
+                                        type: array
+                                    required:
+                                    - op
+                                    type: object
+                                  description: |-
+                                    MatchExpressions is the set of per-element expressions evaluated. These
+                                    match against the value of the specified elements.
+                                  type: object
+                                matchName:
+                                  description: |-
+                                    MatchName is an expression that is matched against the name of each
+                                    element in the feature set.
+                                  properties:
+                                    op:
+                                      description: Op is the operator to be applied.
+                                      enum:
+                                      - In
+                                      - NotIn
+                                      - InRegexp
+                                      - Exists
+                                      - DoesNotExist
+                                      - Gt
+                                      - Lt
+                                      - GtLt
+                                      - IsTrue
+                                      - IsFalse
+                                      type: string
+                                    value:
+                                      description: |-
+                                        Value is the list of values that the operator evaluates the input
+                                        against. Value should be empty if the operator is Exists, DoesNotExist,
+                                        IsTrue or IsFalse. Value should contain exactly one element if the
+                                        operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                        In other cases Value should contain at least one element.
+                                      items:
+                                        type: string
+                                      type: array
+                                  required:
+                                  - op
+                                  type: object
+                              required:
+                              - feature
+                              type: object
+                            type: array
+                        required:
+                        - matchFeatures
+                        type: object
+                      type: array
+                    matchFeatures:
+                      description: MatchFeatures specifies a set of matcher terms
+                        all of which must match.
+                      items:
+                        description: |-
+                          FeatureMatcherTerm defines requirements against one feature set. All
+                          requirements (specified as MatchExpressions) are evaluated against each
+                          element in the feature set.
+                        properties:
+                          feature:
+                            description: Feature is the name of the feature set to
+                              match against.
+                            type: string
+                          matchExpressions:
+                            additionalProperties:
+                              description: |-
+                                MatchExpression specifies an expression to evaluate against a set of input
+                                values. It contains an operator that is applied when matching the input and
+                                an array of values that the operator evaluates the input against.
+                              properties:
+                                op:
+                                  description: Op is the operator to be applied.
+                                  enum:
+                                  - In
+                                  - NotIn
+                                  - InRegexp
+                                  - Exists
+                                  - DoesNotExist
+                                  - Gt
+                                  - Lt
+                                  - GtLt
+                                  - IsTrue
+                                  - IsFalse
+                                  type: string
+                                value:
+                                  description: |-
+                                    Value is the list of values that the operator evaluates the input
+                                    against. Value should be empty if the operator is Exists, DoesNotExist,
+                                    IsTrue or IsFalse. Value should contain exactly one element if the
+                                    operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                    In other cases Value should contain at least one element.
+                                  items:
+                                    type: string
+                                  type: array
+                              required:
+                              - op
+                              type: object
+                            description: |-
+                              MatchExpressions is the set of per-element expressions evaluated. These
+                              match against the value of the specified elements.
+                            type: object
+                          matchName:
+                            description: |-
+                              MatchName is an expression that is matched against the name of each
+                              element in the feature set.
+                            properties:
+                              op:
+                                description: Op is the operator to be applied.
+                                enum:
+                                - In
+                                - NotIn
+                                - InRegexp
+                                - Exists
+                                - DoesNotExist
+                                - Gt
+                                - Lt
+                                - GtLt
+                                - IsTrue
+                                - IsFalse
+                                type: string
+                              value:
+                                description: |-
+                                  Value is the list of values that the operator evaluates the input
+                                  against. Value should be empty if the operator is Exists, DoesNotExist,
+                                  IsTrue or IsFalse. Value should contain exactly one element if the
+                                  operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                  In other cases Value should contain at least one element.
+                                items:
+                                  type: string
+                                type: array
+                            required:
+                            - op
+                            type: object
+                        required:
+                        - feature
+                        type: object
+                      type: array
+                    name:
+                      description: Name of the rule.
+                      type: string
+                  required:
+                  - name
+                  type: object
+                type: array
+            required:
+            - featureGroupRules
+            type: object
+          status:
+            description: |-
+              Status of the NodeFeatureGroup after the most recent evaluation of the
+              specification.
+            properties:
+              nodes:
+                description: Nodes is a list of FeatureGroupNode in the cluster that
+                  match the featureGroupRules
+                items:
+                  properties:
+                    name:
+                      description: Name of the node.
+                      type: string
+                  required:
+                  - name
+                  type: object
+                type: array
+                x-kubernetes-list-map-keys:
+                - name
+                x-kubernetes-list-type: map
+            type: object
+        required:
+        - spec
+        type: object
+    served: true
+    storage: true
+    subresources:
+      status: {}
+---
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  annotations:
+    controller-gen.kubebuilder.io/version: v0.16.3
+  name: nodefeaturerules.nfd.k8s-sigs.io
+spec:
+  group: nfd.k8s-sigs.io
+  names:
+    kind: NodeFeatureRule
+    listKind: NodeFeatureRuleList
+    plural: nodefeaturerules
+    shortNames:
+    - nfr
+    singular: nodefeaturerule
+  scope: Cluster
+  versions:
+  - name: v1alpha1
+    schema:
+      openAPIV3Schema:
+        description: |-
+          NodeFeatureRule resource specifies a configuration for feature-based
+          customization of node objects, such as node labeling.
+        properties:
+          apiVersion:
+            description: |-
+              APIVersion defines the versioned schema of this representation of an object.
+              Servers should convert recognized schemas to the latest internal value, and
+              may reject unrecognized values.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
+            type: string
+          kind:
+            description: |-
+              Kind is a string value representing the REST resource this object represents.
+              Servers may infer this from the endpoint the client submits requests to.
+              Cannot be updated.
+              In CamelCase.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
+            type: string
+          metadata:
+            type: object
+          spec:
+            description: Spec defines the rules to be evaluated.
+            properties:
+              rules:
+                description: Rules is a list of node customization rules.
+                items:
+                  description: Rule defines a rule for node customization such as
+                    labeling.
+                  properties:
+                    annotations:
+                      additionalProperties:
+                        type: string
+                      description: Annotations to create if the rule matches.
+                      type: object
+                    extendedResources:
+                      additionalProperties:
+                        type: string
+                      description: ExtendedResources to create if the rule matches.
+                      type: object
+                    labels:
+                      additionalProperties:
+                        type: string
+                      description: Labels to create if the rule matches.
+                      type: object
+                    labelsTemplate:
+                      description: |-
+                        LabelsTemplate specifies a template to expand for dynamically generating
+                        multiple labels. Data (after template expansion) must be keys with an
+                        optional value (<key>[=<value>]) separated by newlines.
+                      type: string
+                    matchAny:
+                      description: MatchAny specifies a list of matchers one of which
+                        must match.
+                      items:
+                        description: MatchAnyElem specifies one sub-matcher of MatchAny.
+                        properties:
+                          matchFeatures:
+                            description: MatchFeatures specifies a set of matcher
+                              terms all of which must match.
+                            items:
+                              description: |-
+                                FeatureMatcherTerm defines requirements against one feature set. All
+                                requirements (specified as MatchExpressions) are evaluated against each
+                                element in the feature set.
+                              properties:
+                                feature:
+                                  description: Feature is the name of the feature
+                                    set to match against.
+                                  type: string
+                                matchExpressions:
+                                  additionalProperties:
+                                    description: |-
+                                      MatchExpression specifies an expression to evaluate against a set of input
+                                      values. It contains an operator that is applied when matching the input and
+                                      an array of values that the operator evaluates the input against.
+                                    properties:
+                                      op:
+                                        description: Op is the operator to be applied.
+                                        enum:
+                                        - In
+                                        - NotIn
+                                        - InRegexp
+                                        - Exists
+                                        - DoesNotExist
+                                        - Gt
+                                        - Lt
+                                        - GtLt
+                                        - IsTrue
+                                        - IsFalse
+                                        type: string
+                                      value:
+                                        description: |-
+                                          Value is the list of values that the operator evaluates the input
+                                          against. Value should be empty if the operator is Exists, DoesNotExist,
+                                          IsTrue or IsFalse. Value should contain exactly one element if the
+                                          operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                          In other cases Value should contain at least one element.
+                                        items:
+                                          type: string
+                                        type: array
+                                    required:
+                                    - op
+                                    type: object
+                                  description: |-
+                                    MatchExpressions is the set of per-element expressions evaluated. These
+                                    match against the value of the specified elements.
+                                  type: object
+                                matchName:
+                                  description: |-
+                                    MatchName is an expression that is matched against the name of each
+                                    element in the feature set.
+                                  properties:
+                                    op:
+                                      description: Op is the operator to be applied.
+                                      enum:
+                                      - In
+                                      - NotIn
+                                      - InRegexp
+                                      - Exists
+                                      - DoesNotExist
+                                      - Gt
+                                      - Lt
+                                      - GtLt
+                                      - IsTrue
+                                      - IsFalse
+                                      type: string
+                                    value:
+                                      description: |-
+                                        Value is the list of values that the operator evaluates the input
+                                        against. Value should be empty if the operator is Exists, DoesNotExist,
+                                        IsTrue or IsFalse. Value should contain exactly one element if the
+                                        operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                        In other cases Value should contain at least one element.
+                                      items:
+                                        type: string
+                                      type: array
+                                  required:
+                                  - op
+                                  type: object
+                              required:
+                              - feature
+                              type: object
+                            type: array
+                        required:
+                        - matchFeatures
+                        type: object
+                      type: array
+                    matchFeatures:
+                      description: MatchFeatures specifies a set of matcher terms
+                        all of which must match.
+                      items:
+                        description: |-
+                          FeatureMatcherTerm defines requirements against one feature set. All
+                          requirements (specified as MatchExpressions) are evaluated against each
+                          element in the feature set.
+                        properties:
+                          feature:
+                            description: Feature is the name of the feature set to
+                              match against.
+                            type: string
+                          matchExpressions:
+                            additionalProperties:
+                              description: |-
+                                MatchExpression specifies an expression to evaluate against a set of input
+                                values. It contains an operator that is applied when matching the input and
+                                an array of values that the operator evaluates the input against.
+                              properties:
+                                op:
+                                  description: Op is the operator to be applied.
+                                  enum:
+                                  - In
+                                  - NotIn
+                                  - InRegexp
+                                  - Exists
+                                  - DoesNotExist
+                                  - Gt
+                                  - Lt
+                                  - GtLt
+                                  - IsTrue
+                                  - IsFalse
+                                  type: string
+                                value:
+                                  description: |-
+                                    Value is the list of values that the operator evaluates the input
+                                    against. Value should be empty if the operator is Exists, DoesNotExist,
+                                    IsTrue or IsFalse. Value should contain exactly one element if the
+                                    operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                    In other cases Value should contain at least one element.
+                                  items:
+                                    type: string
+                                  type: array
+                              required:
+                              - op
+                              type: object
+                            description: |-
+                              MatchExpressions is the set of per-element expressions evaluated. These
+                              match against the value of the specified elements.
+                            type: object
+                          matchName:
+                            description: |-
+                              MatchName is an expression that is matched against the name of each
+                              element in the feature set.
+                            properties:
+                              op:
+                                description: Op is the operator to be applied.
+                                enum:
+                                - In
+                                - NotIn
+                                - InRegexp
+                                - Exists
+                                - DoesNotExist
+                                - Gt
+                                - Lt
+                                - GtLt
+                                - IsTrue
+                                - IsFalse
+                                type: string
+                              value:
+                                description: |-
+                                  Value is the list of values that the operator evaluates the input
+                                  against. Value should be empty if the operator is Exists, DoesNotExist,
+                                  IsTrue or IsFalse. Value should contain exactly one element if the
+                                  operator is Gt or Lt and exactly two elements if the operator is GtLt.
+                                  In other cases Value should contain at least one element.
+                                items:
+                                  type: string
+                                type: array
+                            required:
+                            - op
+                            type: object
+                        required:
+                        - feature
+                        type: object
+                      type: array
+                    name:
+                      description: Name of the rule.
+                      type: string
+                    taints:
+                      description: Taints to create if the rule matches.
+                      items:
+                        description: |-
+                          The node this Taint is attached to has the "effect" on
+                          any pod that does not tolerate the Taint.
+                        properties:
+                          effect:
+                            description: |-
+                              Required. The effect of the taint on pods
+                              that do not tolerate the taint.
+                              Valid effects are NoSchedule, PreferNoSchedule and NoExecute.
+                            type: string
+                          key:
+                            description: Required. The taint key to be applied to
+                              a node.
+                            type: string
+                          timeAdded:
+                            description: |-
+                              TimeAdded represents the time at which the taint was added.
+                              It is only written for NoExecute taints.
+                            format: date-time
+                            type: string
+                          value:
+                            description: The taint value corresponding to the taint
+                              key.
+                            type: string
+                        required:
+                        - effect
+                        - key
+                        type: object
+                      type: array
+                    vars:
+                      additionalProperties:
+                        type: string
+                      description: |-
+                        Vars are the variables to store if the rule matches. Variables do not
+                        directly cause any changes in the node object. However, they can be
+                        referenced from other rules enabling more complex rule hierarchies,
+                        without exposing intermediary output values as labels.
+                      type: object
+                    varsTemplate:
+                      description: |-
+                        VarsTemplate specifies a template to expand for dynamically generating
+                        multiple variables. Data (after template expansion) must be keys with an
+                        optional value (<key>[=<value>]) separated by newlines.
+                      type: string
+                  required:
+                  - name
+                  type: object
+                type: array
+            required:
+            - rules
+            type: object
+        required:
+        - spec
+        type: object
+    served: true
+    storage: true
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/_helpers.tpl
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/_helpers.tpl
@@ -0,0 +1,107 @@
+{{/* vim: set filetype=mustache: */}}
+{{/*
+Expand the name of the chart.
+*/}}
+{{- define "node-feature-discovery.name" -}}
+{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
+{{- end -}}
+
+{{/*
+Create a default fully qualified app name.
+We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
+If release name contains chart name it will be used as a full name.
+*/}}
+{{- define "node-feature-discovery.fullname" -}}
+{{- if .Values.fullnameOverride -}}
+{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
+{{- else -}}
+{{- $name := default .Chart.Name .Values.nameOverride -}}
+{{- if contains $name .Release.Name -}}
+{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
+{{- else -}}
+{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
+{{- end -}}
+{{- end -}}
+{{- end -}}
+
+{{/*
+Allow the release namespace to be overridden for multi-namespace deployments in combined charts
+*/}}
+{{- define "node-feature-discovery.namespace" -}}
+  {{- if .Values.namespaceOverride -}}
+    {{- .Values.namespaceOverride -}}
+  {{- else -}}
+    {{- .Release.Namespace -}}
+  {{- end -}}
+{{- end -}}
+
+{{/*
+Create chart name and version as used by the chart label.
+*/}}
+{{- define "node-feature-discovery.chart" -}}
+{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
+{{- end -}}
+
+{{/*
+Common labels
+*/}}
+{{- define "node-feature-discovery.labels" -}}
+helm.sh/chart: {{ include "node-feature-discovery.chart" . }}
+{{ include "node-feature-discovery.selectorLabels" . }}
+{{- if .Chart.AppVersion }}
+app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
+{{- end }}
+app.kubernetes.io/managed-by: {{ .Release.Service }}
+{{- end -}}
+
+{{/*
+Selector labels
+*/}}
+{{- define "node-feature-discovery.selectorLabels" -}}
+app.kubernetes.io/name: {{ include "node-feature-discovery.name" . }}
+app.kubernetes.io/instance: {{ .Release.Name }}
+{{- end -}}
+
+{{/*
+Create the name of the service account which the nfd master will use
+*/}}
+{{- define "node-feature-discovery.master.serviceAccountName" -}}
+{{- if .Values.master.serviceAccount.create -}}
+    {{ default (include "node-feature-discovery.fullname" .) .Values.master.serviceAccount.name }}
+{{- else -}}
+    {{ default "default" .Values.master.serviceAccount.name }}
+{{- end -}}
+{{- end -}}
+
+{{/*
+Create the name of the service account which the nfd worker will use
+*/}}
+{{- define "node-feature-discovery.worker.serviceAccountName" -}}
+{{- if .Values.worker.serviceAccount.create -}}
+    {{ default (printf "%s-worker" (include "node-feature-discovery.fullname" .)) .Values.worker.serviceAccount.name }}
+{{- else -}}
+    {{ default "default" .Values.worker.serviceAccount.name }}
+{{- end -}}
+{{- end -}}
+
+{{/*
+Create the name of the service account which topologyUpdater will use
+*/}}
+{{- define "node-feature-discovery.topologyUpdater.serviceAccountName" -}}
+{{- if .Values.topologyUpdater.serviceAccount.create -}}
+    {{ default (printf "%s-topology-updater" (include "node-feature-discovery.fullname" .)) .Values.topologyUpdater.serviceAccount.name }}
+{{- else -}}
+    {{ default "default" .Values.topologyUpdater.serviceAccount.name }}
+{{- end -}}
+{{- end -}}
+
+{{/*
+Create the name of the service account which nfd-gc will use
+*/}}
+{{- define "node-feature-discovery.gc.serviceAccountName" -}}
+{{- if .Values.gc.serviceAccount.create -}}
+    {{ default (printf "%s-gc" (include "node-feature-discovery.fullname" .)) .Values.gc.serviceAccount.name }}
+{{- else -}}
+    {{ default "default" .Values.gc.serviceAccount.name }}
+{{- end -}}
+{{- end -}}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/clusterrole.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/clusterrole.yaml
@@ -0,0 +1,140 @@
+{{- if and .Values.master.enable .Values.master.rbac.create }}
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+rules:
+- apiGroups:
+  - ""
+  resources:
+  - namespaces
+  verbs:
+  - watch
+  - list
+- apiGroups:
+  - ""
+  resources:
+  - nodes
+  - nodes/status
+  verbs:
+  - get
+  - patch
+  - update
+  - list
+- apiGroups:
+  - nfd.k8s-sigs.io
+  resources:
+  - nodefeatures
+  - nodefeaturerules
+  - nodefeaturegroups
+  verbs:
+  - get
+  - list
+  - watch
+- apiGroups:
+  - nfd.k8s-sigs.io
+  resources:
+  - nodefeaturegroups/status
+  verbs:
+  - patch
+  - update
+- apiGroups:
+  - coordination.k8s.io
+  resources:
+  - leases
+  verbs:
+  - create
+- apiGroups:
+  - coordination.k8s.io
+  resources:
+  - leases
+  resourceNames:
+  - "nfd-master.nfd.kubernetes.io"
+  verbs:
+  - get
+  - update
+{{- end }}
+
+{{- if and .Values.topologyUpdater.enable .Values.topologyUpdater.rbac.create }}
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-topology-updater
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+rules:
+- apiGroups:
+  - ""
+  resources:
+  - nodes
+  verbs:
+  - get
+  - list
+- apiGroups:
+  - ""
+  resources:
+  - namespaces
+  verbs:
+  - get
+- apiGroups:
+  - ""
+  resources:
+  - nodes/proxy
+  verbs:
+  - get
+- apiGroups:
+  - ""
+  resources:
+  - pods
+  verbs:
+  - get
+- apiGroups:
+  - topology.node.k8s.io
+  resources:
+  - noderesourcetopologies
+  verbs:
+  - create
+  - get
+  - update
+{{- end }}
+
+{{- if and .Values.gc.enable .Values.gc.rbac.create }}
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-gc
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+rules:
+- apiGroups:
+  - ""
+  resources:
+  - nodes
+  verbs:
+  - list
+  - watch
+- apiGroups:
+  - ""
+  resources:
+  - nodes/proxy
+  verbs:
+  - get
+- apiGroups:
+  - topology.node.k8s.io
+  resources:
+  - noderesourcetopologies
+  verbs:
+  - delete
+  - list
+- apiGroups:
+  - nfd.k8s-sigs.io
+  resources:
+  - nodefeatures
+  verbs:
+  - delete
+  - list
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/clusterrolebinding.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/clusterrolebinding.yaml
@@ -0,0 +1,52 @@
+{{- if and .Values.master.enable .Values.master.rbac.create }}
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: {{ include "node-feature-discovery.fullname" . }}
+subjects:
+- kind: ServiceAccount
+  name: {{ include "node-feature-discovery.master.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+{{- end }}
+
+{{- if and .Values.topologyUpdater.enable .Values.topologyUpdater.rbac.create }}
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-topology-updater
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: {{ include "node-feature-discovery.fullname" . }}-topology-updater
+subjects:
+- kind: ServiceAccount
+  name: {{ include "node-feature-discovery.topologyUpdater.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" .  }}
+{{- end }}
+
+{{- if and .Values.gc.enable .Values.gc.rbac.create }}
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-gc
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: {{ include "node-feature-discovery.fullname" . }}-gc
+subjects:
+- kind: ServiceAccount
+  name: {{ include "node-feature-discovery.gc.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" .  }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/master.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/master.yaml
@@ -0,0 +1,170 @@
+{{- if .Values.master.enable }}
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-master
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+    role: master
+  {{- with .Values.master.deploymentAnnotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+spec:
+  replicas: {{ .Values.master.replicaCount }}
+  revisionHistoryLimit: {{ .Values.master.revisionHistoryLimit }}
+  selector:
+    matchLabels:
+      {{- include "node-feature-discovery.selectorLabels" . | nindent 6 }}
+      role: master
+  template:
+    metadata:
+      labels:
+        {{- include "node-feature-discovery.selectorLabels" . | nindent 8 }}
+        role: master
+      annotations:
+        checksum/config: {{ include (print $.Template.BasePath "/nfd-master-conf.yaml") . | sha256sum }}
+        {{- with .Values.master.annotations }}
+        {{- toYaml . | nindent 8 }}
+        {{- end }}
+    spec:
+    {{- with .Values.priorityClassName }}
+      priorityClassName: {{ . }}
+    {{- end }}
+    {{- with .Values.imagePullSecrets }}
+      imagePullSecrets:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+      serviceAccountName: {{ include "node-feature-discovery.master.serviceAccountName" . }}
+      enableServiceLinks: false
+      securityContext:
+        {{- toYaml .Values.master.podSecurityContext | nindent 8 }}
+      hostNetwork: {{ .Values.master.hostNetwork }}
+      containers:
+        - name: master
+          securityContext:
+            {{- toYaml .Values.master.securityContext | nindent 12 }}
+          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+          imagePullPolicy: {{ .Values.image.pullPolicy }}
+          startupProbe:
+            grpc:
+              port: {{ .Values.master.healthPort | default "8082" }}
+          {{- with .Values.master.startupProbe.initialDelaySeconds }}
+            initialDelaySeconds: {{ . }}
+          {{- end }}
+          {{- with .Values.master.startupProbe.failureThreshold }}
+            failureThreshold: {{ . }}
+          {{- end }}
+          {{- with .Values.master.startupProbe.periodSeconds }}
+            periodSeconds: {{ . }}
+          {{- end }}
+          {{- with .Values.master.startupProbe.timeoutSeconds }}
+            timeoutSeconds: {{ . }}
+          {{- end }}
+          livenessProbe:
+            grpc:
+              port: {{ .Values.master.healthPort | default "8082" }}
+          {{- with .Values.master.livenessProbe.initialDelaySeconds }}
+            initialDelaySeconds: {{ . }}
+          {{- end }}
+          {{- with .Values.master.livenessProbe.failureThreshold }}
+            failureThreshold: {{ . }}
+          {{- end }}
+          {{- with .Values.master.livenessProbe.periodSeconds }}
+            periodSeconds: {{ . }}
+          {{- end }}
+          {{- with .Values.master.livenessProbe.timeoutSeconds }}
+            timeoutSeconds: {{ . }}
+          {{- end }}
+          readinessProbe:
+            grpc:
+              port: {{ .Values.master.healthPort | default "8082" }}
+          {{- with .Values.master.readinessProbe.initialDelaySeconds }}
+            initialDelaySeconds: {{ . }}
+          {{- end }}
+          {{- with .Values.master.readinessProbe.failureThreshold }}
+            failureThreshold: {{ . }}
+          {{- end }}
+          {{- with .Values.master.readinessProbe.periodSeconds }}
+            periodSeconds: {{ . }}
+          {{- end }}
+          {{- with .Values.master.readinessProbe.timeoutSeconds }}
+            timeoutSeconds: {{ . }}
+          {{- end }}
+          {{- with .Values.master.readinessProbe.successThreshold }}
+            successThreshold: {{ . }}
+          {{- end }}
+          ports:
+          - containerPort: {{ .Values.master.metricsPort | default "8081" }}
+            name: metrics
+          - containerPort: {{ .Values.master.healthPort | default "8082" }}
+            name: health
+          env:
+          - name: NODE_NAME
+            valueFrom:
+              fieldRef:
+                fieldPath: spec.nodeName
+        {{- with .Values.master.extraEnvs }}
+          {{- toYaml . | nindent 10 }}
+        {{- end }}
+          command:
+            - "nfd-master"
+          resources:
+            {{- toYaml .Values.master.resources | nindent 12 }}
+          args:
+            {{- if .Values.master.instance | empty | not }}
+            - "-instance={{ .Values.master.instance }}"
+            {{- end }}
+            - "-enable-leader-election"
+            {{- if .Values.master.extraLabelNs | empty | not }}
+            - "-extra-label-ns={{- join "," .Values.master.extraLabelNs }}"
+            {{- end }}
+            {{- if .Values.master.denyLabelNs | empty | not }}
+            - "-deny-label-ns={{- join "," .Values.master.denyLabelNs }}"
+            {{- end }}
+            {{- if .Values.master.enableTaints }}
+            - "-enable-taints"
+            {{- end }}
+            {{- if .Values.master.featureRulesController | kindIs "invalid" | not }}
+            - "-featurerules-controller={{ .Values.master.featureRulesController }}"
+            {{- end }}
+            {{- if .Values.master.resyncPeriod }}
+            - "-resync-period={{ .Values.master.resyncPeriod }}"
+            {{- end }}
+            {{- if .Values.master.nfdApiParallelism | empty | not }}
+            - "-nfd-api-parallelism={{ .Values.master.nfdApiParallelism }}"
+            {{- end }}
+            # Go over featureGates and add the feature-gate flag
+            {{- range $key, $value := .Values.featureGates }}
+            - "-feature-gates={{ $key }}={{ $value }}"
+            {{- end }}
+            - "-metrics={{ .Values.master.metricsPort  | default "8081" }}"
+            - "-grpc-health={{ .Values.master.healthPort | default "8082" }}"
+            {{- with .Values.master.extraArgs }}
+            {{- toYaml . | nindent 12 }}
+            {{- end }}
+          volumeMounts:
+            - name: nfd-master-conf
+              mountPath: "/etc/kubernetes/node-feature-discovery"
+              readOnly: true
+      volumes:
+        - name: nfd-master-conf
+          configMap:
+            name: {{ include "node-feature-discovery.fullname" . }}-master-conf
+            items:
+              - key: nfd-master.conf
+                path: nfd-master.conf
+    {{- with .Values.master.nodeSelector }}
+      nodeSelector:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+    {{- with .Values.master.affinity }}
+      affinity:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+    {{- with .Values.master.tolerations }}
+      tolerations:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-gc.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-gc.yaml
@@ -0,0 +1,88 @@
+{{- if .Values.gc.enable -}}
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-gc
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+    role: gc
+  {{- with .Values.gc.deploymentAnnotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+spec:
+  replicas: {{ .Values.gc.replicaCount | default 1 }}
+  revisionHistoryLimit: {{ .Values.gc.revisionHistoryLimit }}
+  selector:
+    matchLabels:
+      {{- include "node-feature-discovery.selectorLabels" . | nindent 6 }}
+      role: gc
+  template:
+    metadata:
+      labels:
+        {{- include "node-feature-discovery.selectorLabels" . | nindent 8 }}
+        role: gc
+      {{- with .Values.gc.annotations }}
+      annotations:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+    spec:
+      serviceAccountName: {{ include "node-feature-discovery.gc.serviceAccountName" . }}
+      dnsPolicy: ClusterFirstWithHostNet
+    {{- with .Values.priorityClassName }}
+      priorityClassName: {{ . }}
+    {{- end }}
+    {{- with .Values.imagePullSecrets }}
+      imagePullSecrets:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+      securityContext:
+        {{- toYaml .Values.gc.podSecurityContext | nindent 8 }}
+      hostNetwork: {{ .Values.gc.hostNetwork }}
+      containers:
+      - name: gc
+        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+        imagePullPolicy: "{{ .Values.image.pullPolicy }}"
+        env:
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+      {{- with .Values.gc.extraEnvs }}
+        {{- toYaml . | nindent 8 }}
+      {{- end}}
+        command:
+          - "nfd-gc"
+        args:
+          {{- if .Values.gc.interval | empty | not }}
+          - "-gc-interval={{ .Values.gc.interval }}"
+          {{- end }}
+          {{- with .Values.gc.extraArgs }}
+          {{- toYaml . | nindent 10 }}
+          {{- end }}
+        resources:
+      {{- toYaml .Values.gc.resources | nindent 12 }}
+        securityContext:
+          allowPrivilegeEscalation: false
+          capabilities:
+            drop: [ "ALL" ]
+          readOnlyRootFilesystem: true
+          runAsNonRoot: true
+        ports:
+          - name: metrics
+            containerPort: {{ .Values.gc.metricsPort | default "8081" }}
+
+    {{- with .Values.gc.nodeSelector }}
+      nodeSelector:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+    {{- with .Values.gc.affinity }}
+      affinity:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+    {{- with .Values.gc.tolerations }}
+      tolerations:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-master-conf.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-master-conf.yaml
@@ -0,0 +1,12 @@
+{{- if .Values.master.enable }}
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-master-conf
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+  {{- include "node-feature-discovery.labels" . | nindent 4 }}
+data:
+  nfd-master.conf: |-
+    {{- .Values.master.config | toYaml | nindent 4 }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-topologyupdater-conf.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-topologyupdater-conf.yaml
@@ -0,0 +1,12 @@
+{{- if .Values.topologyUpdater.enable -}}
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-topology-updater-conf
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+  {{- include "node-feature-discovery.labels" . | nindent 4 }}
+data:
+  nfd-topology-updater.conf: |-
+    {{- .Values.topologyUpdater.config | toYaml | nindent 4 }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-worker-conf.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/nfd-worker-conf.yaml
@@ -0,0 +1,12 @@
+{{- if .Values.worker.enable }}
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-worker-conf
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+  {{- include "node-feature-discovery.labels" . | nindent 4 }}
+data:
+  nfd-worker.conf: |-
+    {{- .Values.worker.config | toYaml | nindent 4 }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/post-delete-job.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/post-delete-job.yaml
@@ -0,0 +1,94 @@
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-prune
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  annotations:
+    "helm.sh/hook": post-delete
+    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-prune
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  annotations:
+    "helm.sh/hook": post-delete
+    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
+rules:
+- apiGroups:
+  - ""
+  resources:
+  - nodes
+  - nodes/status
+  verbs:
+  - get
+  - patch
+  - update
+  - list
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-prune
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  annotations:
+    "helm.sh/hook": post-delete
+    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: {{ include "node-feature-discovery.fullname" . }}-prune
+subjects:
+- kind: ServiceAccount
+  name: {{ include "node-feature-discovery.fullname" . }}-prune
+  namespace: {{ include "node-feature-discovery.namespace" .  }}
+---
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-prune
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  annotations:
+    "helm.sh/hook": post-delete
+    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
+spec:
+  template:
+    metadata:
+      labels:
+        {{- include "node-feature-discovery.labels" . | nindent 8 }}
+        role: prune
+    spec:
+      serviceAccountName: {{ include "node-feature-discovery.fullname" . }}-prune
+      containers:
+        - name: nfd-master
+          securityContext:
+            {{- toYaml .Values.master.securityContext | nindent 12 }}
+          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+          imagePullPolicy: {{ .Values.image.pullPolicy }}
+          command:
+            - "nfd-master"
+          args:
+            - "-prune"
+            {{- if .Values.master.instance | empty | not }}
+            - "-instance={{ .Values.master.instance }}"
+            {{- end }}
+      restartPolicy: Never
+      {{- with .Values.master.nodeSelector }}
+      nodeSelector:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+      {{- with .Values.master.affinity }}
+      affinity:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+      {{- with .Values.master.tolerations }}
+      tolerations:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/prometheus.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/prometheus.yaml
@@ -0,0 +1,26 @@
+{{- if .Values.prometheus.enable }}
+# Prometheus Monitor Service (Metrics)
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}
+  labels:
+    {{- include "node-feature-discovery.selectorLabels" . | nindent 4 }}
+    {{- with .Values.prometheus.labels }}
+    {{- toYaml . | nindent 4 }}
+    {{- end }}
+spec:
+  podMetricsEndpoints:
+    - honorLabels: true
+      interval: {{ .Values.prometheus.scrapeInterval }}
+      path: /metrics
+      port: metrics
+      scheme: http
+  namespaceSelector:
+    matchNames:
+    - {{ include "node-feature-discovery.namespace" . }}
+  selector:
+    matchExpressions:
+    - {key: app.kubernetes.io/instance, operator: In, values: ["{{ .Release.Name }}"]}
+    - {key: app.kubernetes.io/name, operator: In, values: ["{{ include "node-feature-discovery.name" . }}"]}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/role.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/role.yaml
@@ -0,0 +1,25 @@
+{{- if and .Values.worker.enable .Values.worker.rbac.create }}
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-worker
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+rules:
+- apiGroups:
+  - nfd.k8s-sigs.io
+  resources:
+  - nodefeatures
+  verbs:
+  - create
+  - get
+  - update
+  - delete
+- apiGroups:
+  - ""
+  resources:
+  - pods
+  verbs:
+  - get
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/rolebinding.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/rolebinding.yaml
@@ -0,0 +1,18 @@
+{{- if and .Values.worker.enable .Values.worker.rbac.create }}
+apiVersion: rbac.authorization.k8s.io/v1
+kind: RoleBinding
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-worker
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: Role
+  name: {{ include "node-feature-discovery.fullname" . }}-worker
+subjects:
+- kind: ServiceAccount
+  name: {{ include "node-feature-discovery.worker.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" .  }}
+{{- end }}
+
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/serviceaccount.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/serviceaccount.yaml
@@ -0,0 +1,58 @@
+{{- if and .Values.master.enable .Values.master.serviceAccount.create }}
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: {{ include "node-feature-discovery.master.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  {{- with .Values.master.serviceAccount.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{- end }}
+
+{{- if and .Values.topologyUpdater.enable .Values.topologyUpdater.serviceAccount.create }}
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: {{ include "node-feature-discovery.topologyUpdater.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  {{- with .Values.topologyUpdater.serviceAccount.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{- end }}
+
+{{- if and .Values.gc.enable .Values.gc.serviceAccount.create }}
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: {{ include "node-feature-discovery.gc.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  {{- with .Values.gc.serviceAccount.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{- end }}
+
+{{- if and .Values.worker.enable .Values.worker.serviceAccount.create }}
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  name: {{ include "node-feature-discovery.worker.serviceAccountName" . }}
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+  {{- with .Values.worker.serviceAccount.annotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/topologyupdater-crds.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/topologyupdater-crds.yaml
@@ -0,0 +1,278 @@
+{{- if and .Values.topologyUpdater.enable .Values.topologyUpdater.createCRDs -}}
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  annotations:
+    api-approved.kubernetes.io: https://github.com/kubernetes/enhancements/pull/1870
+    controller-gen.kubebuilder.io/version: v0.11.2
+  creationTimestamp: null
+  name: noderesourcetopologies.topology.node.k8s.io
+spec:
+  group: topology.node.k8s.io
+  names:
+    kind: NodeResourceTopology
+    listKind: NodeResourceTopologyList
+    plural: noderesourcetopologies
+    shortNames:
+    - node-res-topo
+    singular: noderesourcetopology
+  scope: Cluster
+  versions:
+  - name: v1alpha1
+    schema:
+      openAPIV3Schema:
+        description: NodeResourceTopology describes node resources and their topology.
+        properties:
+          apiVersion:
+            description: 'APIVersion defines the versioned schema of this representation
+              of an object. Servers should convert recognized schemas to the latest
+              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
+            type: string
+          kind:
+            description: 'Kind is a string value representing the REST resource this
+              object represents. Servers may infer this from the endpoint the client
+              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
+            type: string
+          metadata:
+            type: object
+          topologyPolicies:
+            items:
+              type: string
+            type: array
+          zones:
+            description: ZoneList contains an array of Zone objects.
+            items:
+              description: Zone represents a resource topology zone, e.g. socket,
+                node, die or core.
+              properties:
+                attributes:
+                  description: AttributeList contains an array of AttributeInfo objects.
+                  items:
+                    description: AttributeInfo contains one attribute of a Zone.
+                    properties:
+                      name:
+                        type: string
+                      value:
+                        type: string
+                    required:
+                    - name
+                    - value
+                    type: object
+                  type: array
+                costs:
+                  description: CostList contains an array of CostInfo objects.
+                  items:
+                    description: CostInfo describes the cost (or distance) between
+                      two Zones.
+                    properties:
+                      name:
+                        type: string
+                      value:
+                        format: int64
+                        type: integer
+                    required:
+                    - name
+                    - value
+                    type: object
+                  type: array
+                name:
+                  type: string
+                parent:
+                  type: string
+                resources:
+                  description: ResourceInfoList contains an array of ResourceInfo
+                    objects.
+                  items:
+                    description: ResourceInfo contains information about one resource
+                      type.
+                    properties:
+                      allocatable:
+                        anyOf:
+                        - type: integer
+                        - type: string
+                        description: Allocatable quantity of the resource, corresponding
+                          to allocatable in node status, i.e. total amount of this
+                          resource available to be used by pods.
+                        pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                        x-kubernetes-int-or-string: true
+                      available:
+                        anyOf:
+                        - type: integer
+                        - type: string
+                        description: Available is the amount of this resource currently
+                          available for new (to be scheduled) pods, i.e. Allocatable
+                          minus the resources reserved by currently running pods.
+                        pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                        x-kubernetes-int-or-string: true
+                      capacity:
+                        anyOf:
+                        - type: integer
+                        - type: string
+                        description: Capacity of the resource, corresponding to capacity
+                          in node status, i.e. total amount of this resource that
+                          the node has.
+                        pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                        x-kubernetes-int-or-string: true
+                      name:
+                        description: Name of the resource.
+                        type: string
+                    required:
+                    - allocatable
+                    - available
+                    - capacity
+                    - name
+                    type: object
+                  type: array
+                type:
+                  type: string
+              required:
+              - name
+              - type
+              type: object
+            type: array
+        required:
+        - topologyPolicies
+        - zones
+        type: object
+    served: true
+    storage: false
+  - name: v1alpha2
+    schema:
+      openAPIV3Schema:
+        description: NodeResourceTopology describes node resources and their topology.
+        properties:
+          apiVersion:
+            description: 'APIVersion defines the versioned schema of this representation
+              of an object. Servers should convert recognized schemas to the latest
+              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
+            type: string
+          attributes:
+            description: AttributeList contains an array of AttributeInfo objects.
+            items:
+              description: AttributeInfo contains one attribute of a Zone.
+              properties:
+                name:
+                  type: string
+                value:
+                  type: string
+              required:
+              - name
+              - value
+              type: object
+            type: array
+          kind:
+            description: 'Kind is a string value representing the REST resource this
+              object represents. Servers may infer this from the endpoint the client
+              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
+            type: string
+          metadata:
+            type: object
+          topologyPolicies:
+            description: 'DEPRECATED (to be removed in v1beta1): use top level attributes
+              if needed'
+            items:
+              type: string
+            type: array
+          zones:
+            description: ZoneList contains an array of Zone objects.
+            items:
+              description: Zone represents a resource topology zone, e.g. socket,
+                node, die or core.
+              properties:
+                attributes:
+                  description: AttributeList contains an array of AttributeInfo objects.
+                  items:
+                    description: AttributeInfo contains one attribute of a Zone.
+                    properties:
+                      name:
+                        type: string
+                      value:
+                        type: string
+                    required:
+                    - name
+                    - value
+                    type: object
+                  type: array
+                costs:
+                  description: CostList contains an array of CostInfo objects.
+                  items:
+                    description: CostInfo describes the cost (or distance) between
+                      two Zones.
+                    properties:
+                      name:
+                        type: string
+                      value:
+                        format: int64
+                        type: integer
+                    required:
+                    - name
+                    - value
+                    type: object
+                  type: array
+                name:
+                  type: string
+                parent:
+                  type: string
+                resources:
+                  description: ResourceInfoList contains an array of ResourceInfo
+                    objects.
+                  items:
+                    description: ResourceInfo contains information about one resource
+                      type.
+                    properties:
+                      allocatable:
+                        anyOf:
+                        - type: integer
+                        - type: string
+                        description: Allocatable quantity of the resource, corresponding
+                          to allocatable in node status, i.e. total amount of this
+                          resource available to be used by pods.
+                        pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                        x-kubernetes-int-or-string: true
+                      available:
+                        anyOf:
+                        - type: integer
+                        - type: string
+                        description: Available is the amount of this resource currently
+                          available for new (to be scheduled) pods, i.e. Allocatable
+                          minus the resources reserved by currently running pods.
+                        pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                        x-kubernetes-int-or-string: true
+                      capacity:
+                        anyOf:
+                        - type: integer
+                        - type: string
+                        description: Capacity of the resource, corresponding to capacity
+                          in node status, i.e. total amount of this resource that
+                          the node has.
+                        pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                        x-kubernetes-int-or-string: true
+                      name:
+                        description: Name of the resource.
+                        type: string
+                    required:
+                    - allocatable
+                    - available
+                    - capacity
+                    - name
+                    type: object
+                  type: array
+                type:
+                  type: string
+              required:
+              - name
+              - type
+              type: object
+            type: array
+        required:
+        - zones
+        type: object
+    served: true
+    storage: true
+status:
+  acceptedNames:
+    kind: ""
+    plural: ""
+  conditions: []
+  storedVersions: []
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/topologyupdater.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/topologyupdater.yaml
@@ -0,0 +1,188 @@
+{{- if .Values.topologyUpdater.enable -}}
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name: {{ include "node-feature-discovery.fullname" . }}-topology-updater
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+    role: topology-updater
+  {{- with .Values.topologyUpdater.daemonsetAnnotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+spec:
+  revisionHistoryLimit: {{ .Values.topologyUpdater.revisionHistoryLimit }}
+  selector:
+    matchLabels:
+      {{- include "node-feature-discovery.selectorLabels" . | nindent 6 }}
+      role: topology-updater
+  template:
+    metadata:
+      labels:
+        {{- include "node-feature-discovery.selectorLabels" . | nindent 8 }}
+        role: topology-updater
+      annotations:
+        checksum/config: {{ include (print $.Template.BasePath "/nfd-topologyupdater-conf.yaml") . | sha256sum }}
+        {{- with .Values.topologyUpdater.annotations }}
+        {{- toYaml . | nindent 8 }}
+        {{- end }}
+    spec:
+      serviceAccountName: {{ include "node-feature-discovery.topologyUpdater.serviceAccountName" . }}
+      dnsPolicy: ClusterFirstWithHostNet
+    {{- with .Values.priorityClassName }}
+      priorityClassName: {{ . }}
+    {{- end }}
+    {{- with .Values.imagePullSecrets }}
+      imagePullSecrets:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+      securityContext:
+        {{- toYaml .Values.topologyUpdater.podSecurityContext | nindent 8 }}
+      hostNetwork: {{ .Values.topologyUpdater.hostNetwork }}
+      containers:
+      - name: topology-updater
+        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+        imagePullPolicy: "{{ .Values.image.pullPolicy }}"
+        livenessProbe:
+          grpc:
+            port: {{ .Values.topologyUpdater.healthPort | default "8082" }}
+        {{- with .Values.topologyUpdater.livenessProbe.initialDelaySeconds }}
+          initialDelaySeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.topologyUpdater.livenessProbe.failureThreshold }}
+          failureThreshold: {{ . }}
+        {{- end }}
+        {{- with .Values.topologyUpdater.livenessProbe.periodSeconds }}
+          periodSeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.topologyUpdater.livenessProbe.timeoutSeconds }}
+          timeoutSeconds: {{ . }}
+        {{- end }}
+        readinessProbe:
+          grpc:
+            port: {{ .Values.topologyUpdater.healthPort | default "8082" }}
+        {{- with .Values.topologyUpdater.readinessProbe.initialDelaySeconds }}
+          initialDelaySeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.topologyUpdater.readinessProbe.failureThreshold }}
+          failureThreshold: {{ . }}
+        {{- end }}
+        {{- with .Values.topologyUpdater.readinessProbe.periodSeconds }}
+          periodSeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.topologyUpdater.readinessProbe.timeoutSeconds }}
+          timeoutSeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.topologyUpdater.readinessProbe.successThreshold }}
+          successThreshold: {{ . }}
+        {{- end }}
+        env:
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+        - name: NODE_ADDRESS
+          valueFrom:
+            fieldRef:
+              fieldPath: status.hostIP
+      {{- with .Values.topologyUpdater.extraEnvs }}
+        {{- toYaml . | nindent 8 }}
+      {{- end}}
+        command:
+          - "nfd-topology-updater"
+        args:
+          - "-podresources-socket=/host-var/lib/kubelet-podresources/kubelet.sock"
+          {{- if .Values.topologyUpdater.updateInterval | empty | not }}
+          - "-sleep-interval={{ .Values.topologyUpdater.updateInterval }}"
+          {{- else }}
+          - "-sleep-interval=3s"
+          {{- end }}
+          {{- if .Values.topologyUpdater.watchNamespace | empty | not }}
+          - "-watch-namespace={{ .Values.topologyUpdater.watchNamespace }}"
+          {{- else }}
+          - "-watch-namespace=*"
+          {{- end }}
+          {{- if not .Values.topologyUpdater.podSetFingerprint }}
+          - "-pods-fingerprint=false"
+          {{- end }}
+          {{- if .Values.topologyUpdater.kubeletConfigPath | empty | not }}
+          - "-kubelet-config-uri=file:///host-var/kubelet-config"
+          {{- end }}
+          {{- if .Values.topologyUpdater.kubeletStateDir | empty }}
+          # Disable kubelet state tracking by giving an empty path
+          - "-kubelet-state-dir="
+          {{- end }}
+          - "-metrics={{ .Values.topologyUpdater.metricsPort | default "8081"}}"
+          - "-grpc-health={{ .Values.topologyUpdater.healthPort | default "8082" }}"
+          {{- with .Values.topologyUpdater.extraArgs }}
+          {{- toYaml . | nindent 10 }}
+          {{- end }}
+        ports:
+          - containerPort: {{ .Values.topologyUpdater.metricsPort | default "8081"}}
+            name: metrics
+          - containerPort: {{ .Values.topologyUpdater.healthPort | default "8082" }}
+            name: health
+        volumeMounts:
+        {{- if .Values.topologyUpdater.kubeletConfigPath | empty | not }}
+        - name: kubelet-config
+          mountPath: /host-var/kubelet-config
+        {{- end }}
+        - name: kubelet-podresources-sock
+          mountPath: /host-var/lib/kubelet-podresources/kubelet.sock
+        - name: host-sys
+          mountPath: /host-sys
+        {{- if .Values.topologyUpdater.kubeletStateDir | empty | not }}
+        - name: kubelet-state-files
+          mountPath: /host-var/lib/kubelet
+          readOnly: true
+        {{- end }}
+        - name: nfd-topology-updater-conf
+          mountPath: "/etc/kubernetes/node-feature-discovery"
+          readOnly: true
+
+        resources:
+      {{- toYaml .Values.topologyUpdater.resources | nindent 12 }}
+        securityContext:
+      {{- toYaml .Values.topologyUpdater.securityContext | nindent 12 }}
+      volumes:
+      - name: host-sys
+        hostPath:
+          path: "/sys"
+      {{- if .Values.topologyUpdater.kubeletConfigPath | empty | not }}
+      - name: kubelet-config
+        hostPath:
+          path: {{ .Values.topologyUpdater.kubeletConfigPath }}
+      {{- end }}
+      - name: kubelet-podresources-sock
+        hostPath:
+          {{- if .Values.topologyUpdater.kubeletPodResourcesSockPath | empty | not }}
+          path: {{ .Values.topologyUpdater.kubeletPodResourcesSockPath }}
+          {{- else }}
+          path: /var/lib/kubelet/pod-resources/kubelet.sock
+          {{- end }}
+      {{- if .Values.topologyUpdater.kubeletStateDir | empty | not }}
+      - name: kubelet-state-files
+        hostPath:
+          path: {{ .Values.topologyUpdater.kubeletStateDir }}
+      {{- end }}
+      - name: nfd-topology-updater-conf
+        configMap:
+          name: {{ include "node-feature-discovery.fullname" . }}-topology-updater-conf
+          items:
+            - key: nfd-topology-updater.conf
+              path: nfd-topology-updater.conf
+
+    {{- with .Values.topologyUpdater.nodeSelector }}
+      nodeSelector:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+    {{- with .Values.topologyUpdater.affinity }}
+      affinity:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+    {{- with .Values.topologyUpdater.tolerations }}
+      tolerations:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+{{- end }}
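
The topology-updater template above derives most container arguments from `.Values.topologyUpdater`. As a rough sketch of how overrides map to rendered flags, assuming a hypothetical override file for the node-feature-discovery subchart (the namespace name and the chosen values are illustrative, not taken from the chart):

```yaml
# Hypothetical override file; every key exists in the chart's values.yaml,
# but the values shown here are illustrative assumptions.
topologyUpdater:
  enable: true               # outer `if` guard: the DaemonSet is rendered only when true
  updateInterval: 30s        # rendered as -sleep-interval=30s
  watchNamespace: gpu-jobs   # hypothetical namespace; rendered as -watch-namespace=gpu-jobs
  podSetFingerprint: false   # adds -pods-fingerprint=false
  kubeletStateDir: ""        # empty: adds -kubelet-state-dir= and drops the kubelet-state-files volume and mount
  healthPort: 8082           # used by both gRPC probes, -grpc-health, and the "health" containerPort
```

Note that the template falls back to `-sleep-interval=3s` only when `updateInterval` is empty, whereas values.yaml ships `60s`, so the fallback should rarely apply. When the chart is consumed through the parent gpu-operator chart, these keys would presumably be nested under the subchart's name; that nesting is an assumption and is not shown in this diff.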
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/worker.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/templates/worker.yaml
@@ -0,0 +1,195 @@
+{{- if .Values.worker.enable }}
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name:  {{ include "node-feature-discovery.fullname" . }}-worker
+  namespace: {{ include "node-feature-discovery.namespace" . }}
+  labels:
+    {{- include "node-feature-discovery.labels" . | nindent 4 }}
+    role: worker
+  {{- with .Values.worker.daemonsetAnnotations }}
+  annotations:
+    {{- toYaml . | nindent 4 }}
+  {{- end }}
+spec:
+  revisionHistoryLimit: {{ .Values.worker.revisionHistoryLimit }}
+  selector:
+    matchLabels:
+      {{- include "node-feature-discovery.selectorLabels" . | nindent 6 }}
+      role: worker
+  template:
+    metadata:
+      labels:
+        {{- include "node-feature-discovery.selectorLabels" . | nindent 8 }}
+        role: worker
+      annotations:
+        checksum/config: {{ include (print $.Template.BasePath "/nfd-worker-conf.yaml") . | sha256sum }}
+        {{- with .Values.worker.annotations }}
+        {{- toYaml . | nindent 8 }}
+        {{- end }}
+    spec:
+      dnsPolicy: ClusterFirstWithHostNet
+    {{- with .Values.priorityClassName }}
+      priorityClassName: {{ . }}
+    {{- end }}
+    {{- with .Values.imagePullSecrets }}
+      imagePullSecrets:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+      serviceAccountName: {{ include "node-feature-discovery.worker.serviceAccountName" . }}
+      securityContext:
+        {{- toYaml .Values.worker.podSecurityContext | nindent 8 }}
+      hostNetwork: {{ .Values.worker.hostNetwork }}
+      containers:
+      - name: worker
+        securityContext:
+          {{- toYaml .Values.worker.securityContext | nindent 12 }}
+        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+        imagePullPolicy: {{ .Values.image.pullPolicy }}
+        livenessProbe:
+          grpc:
+            port: {{ .Values.worker.healthPort | default "8082" }}
+        {{- with .Values.worker.livenessProbe.initialDelaySeconds }}
+          initialDelaySeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.worker.livenessProbe.failureThreshold }}
+          failureThreshold: {{ . }}
+        {{- end }}
+        {{- with .Values.worker.livenessProbe.periodSeconds }}
+          periodSeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.worker.livenessProbe.timeoutSeconds }}
+          timeoutSeconds: {{ . }}
+        {{- end }}
+        readinessProbe:
+          grpc:
+            port: {{ .Values.worker.healthPort | default "8082" }}
+        {{- with .Values.worker.readinessProbe.initialDelaySeconds }}
+          initialDelaySeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.worker.readinessProbe.failureThreshold }}
+          failureThreshold: {{ . }}
+        {{- end }}
+        {{- with .Values.worker.readinessProbe.periodSeconds }}
+          periodSeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.worker.readinessProbe.timeoutSeconds }}
+          timeoutSeconds: {{ . }}
+        {{- end }}
+        {{- with .Values.worker.readinessProbe.successThreshold }}
+          successThreshold: {{ . }}
+        {{- end }}
+        env:
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
+        - name: POD_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        - name: POD_UID
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.uid
+      {{- with .Values.worker.extraEnvs }}
+        {{- toYaml . | nindent 8 }}
+      {{- end}}
+        resources:
+        {{- toYaml .Values.worker.resources | nindent 12 }}
+        command:
+        - "nfd-worker"
+        args:
+        # Go over featureGate and add the feature-gate flag
+        {{- range $key, $value := .Values.featureGates }}
+        - "-feature-gates={{ $key }}={{ $value }}"
+        {{- end }}
+        - "-metrics={{ .Values.worker.metricsPort | default "8081"}}"
+        - "-grpc-health={{ .Values.worker.healthPort | default "8082" }}"
+        {{- with .Values.worker.extraArgs }}
+        {{- toYaml . | nindent 8 }}
+        {{- end }}
+        ports:
+          - containerPort: {{ .Values.worker.metricsPort | default "8081"}}
+            name: metrics
+          - containerPort: {{ .Values.worker.healthPort | default "8082" }}
+            name: health
+        volumeMounts:
+        - name: host-boot
+          mountPath: "/host-boot"
+          readOnly: true
+        - name: host-os-release
+          mountPath: "/host-etc/os-release"
+          readOnly: true
+        - name: host-sys
+          mountPath: "/host-sys"
+          readOnly: true
+        - name: host-usr-lib
+          mountPath: "/host-usr/lib"
+          readOnly: true
+        - name: host-lib
+          mountPath: "/host-lib"
+          readOnly: true
+        - name: host-proc-swaps
+          mountPath: "/host-proc/swaps"
+          readOnly: true
+        {{- if .Values.worker.mountUsrSrc }}
+        - name: host-usr-src
+          mountPath: "/host-usr/src"
+          readOnly: true
+        {{- end }}
+        - name: features-d
+          mountPath: "/etc/kubernetes/node-feature-discovery/features.d/"
+          readOnly: true
+        - name: nfd-worker-conf
+          mountPath: "/etc/kubernetes/node-feature-discovery"
+          readOnly: true
+      volumes:
+        - name: host-boot
+          hostPath:
+            path: "/boot"
+        - name: host-os-release
+          hostPath:
+            path: "/etc/os-release"
+        - name: host-sys
+          hostPath:
+            path: "/sys"
+        - name: host-usr-lib
+          hostPath:
+            path: "/usr/lib"
+        - name: host-lib
+          hostPath:
+            path: "/lib"
+        - name: host-proc-swaps
+          hostPath:
+            path: "/proc/swaps"
+        {{- if .Values.worker.mountUsrSrc }}
+        - name: host-usr-src
+          hostPath:
+            path: "/usr/src"
+        {{- end }}
+        - name: features-d
+          hostPath:
+            path: "/etc/kubernetes/node-feature-discovery/features.d/"
+        - name: nfd-worker-conf
+          configMap:
+            name: {{ include "node-feature-discovery.fullname" . }}-worker-conf
+            items:
+              - key: nfd-worker.conf
+                path: nfd-worker.conf
+      {{- with .Values.worker.nodeSelector }}
+      nodeSelector:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+    {{- with .Values.worker.affinity }}
+      affinity:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+    {{- with .Values.worker.tolerations }}
+      tolerations:
+        {{- toYaml . | nindent 8 }}
+    {{- end }}
+    {{- with .Values.worker.priorityClassName }}
+      priorityClassName: {{ . | quote }}
+    {{- end }}
+{{- end }}
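
The worker template above reads both the top-level `priorityClassName` and `worker.priorityClassName`, and emits a `priorityClassName` field for each one that is non-empty. A minimal sketch of overrides that stays clear of that overlap, assuming a hypothetical override file (the values are illustrative):

```yaml
# Hypothetical override file; keys come from the chart's values.yaml,
# values are illustrative assumptions.
featureGates:
  NodeFeatureGroupAPI: true   # rendered as -feature-gates=NodeFeatureGroupAPI=true
priorityClassName: ""         # left empty so only worker.priorityClassName is rendered
worker:
  enable: true
  mountUsrSrc: true           # adds the host-usr-src hostPath volume (/usr/src) and its read-only mount
  metricsPort: 9081           # rendered as -metrics=9081 and the "metrics" containerPort
  priorityClassName: system-node-critical   # built-in Kubernetes PriorityClass, used here only as an example
```

Setting both keys at once would likely render the field twice in the pod spec, so choosing one of them seems the safer option.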
--- a/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/values.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/charts/node-feature-discovery/values.yaml
@@ -0,0 +1,599 @@
+image:
+  repository: registry.k8s.io/nfd/node-feature-discovery
+  # This should be set to 'IfNotPresent' for released versions
+  pullPolicy: IfNotPresent
+  # tag: if defined, the given image tag is used; otherwise Chart.AppVersion is used
+  # tag
+imagePullSecrets: []
+
+nameOverride: ""
+fullnameOverride: ""
+namespaceOverride: ""
+
+featureGates:
+  NodeFeatureGroupAPI: false
+
+priorityClassName: ""
+
+master:
+  enable: true
+  extraArgs: []
+  extraEnvs: []
+  hostNetwork: false
+  config: ### <NFD-MASTER-CONF-START-DO-NOT-REMOVE>
+    # noPublish: false
+    # autoDefaultNs: true
+    # extraLabelNs: ["added.ns.io","added.kubernetes.io"]
+    # denyLabelNs: ["denied.ns.io","denied.kubernetes.io"]
+    # enableTaints: false
+    # labelWhiteList: "foo"
+    # resyncPeriod: "2h"
+    # restrictions:
+    #   disableLabels: true
+    #   disableTaints: true
+    #   disableExtendedResources: true
+    #   disableAnnotations: true
+    #   allowOverwrite: false
+    #   denyNodeFeatureLabels: true
+    #   nodeFeatureNamespaceSelector:
+    #    matchLabels:
+    #      kubernetes.io/metadata.name: "node-feature-discovery"
+    #    matchExpressions:
+    #      - key: "kubernetes.io/metadata.name"
+    #        operator: "In"
+    #        values:
+    #           - "node-feature-discovery"
+    # klog:
+    #    addDirHeader: false
+    #    alsologtostderr: false
+    #    logBacktraceAt:
+    #    logtostderr: true
+    #    skipHeaders: false
+    #    stderrthreshold: 2
+    #    v: 0
+    #    vmodule:
+    ##   NOTE: the following options are not dynamically run-time configurable
+    ##         and require a nfd-master restart to take effect after being changed
+    #    logDir:
+    #    logFile:
+    #    logFileMaxSize: 1800
+    #    skipLogHeaders: false
+    # leaderElection:
+    #   leaseDuration: 15s
+    #   # this value has to be lower than leaseDuration and greater than retryPeriod*1.2
+    #   renewDeadline: 10s
+    #   # this value has to be greater than 0
+    #   retryPeriod: 2s
+    # nfdApiParallelism: 10
+  ### <NFD-MASTER-CONF-END-DO-NOT-REMOVE>
+  metricsPort: 8081
+  healthPort: 8082
+  instance:
+  featureApi:
+  resyncPeriod:
+  denyLabelNs: []
+  extraLabelNs: []
+  enableTaints: false
+  featureRulesController: null
+  nfdApiParallelism: null
+  deploymentAnnotations: {}
+  replicaCount: 1
+
+  podSecurityContext: {}
+    # fsGroup: 2000
+
+  securityContext:
+    allowPrivilegeEscalation: false
+    capabilities:
+      drop: [ "ALL" ]
+    readOnlyRootFilesystem: true
+    runAsNonRoot: true
+    # runAsUser: 1000
+
+  serviceAccount:
+    # Specifies whether a service account should be created
+    create: true
+    # Annotations to add to the service account
+    annotations: {}
+    # The name of the service account to use.
+    # If not set and create is true, a name is generated using the fullname template
+    name:
+
+  # specify how many old ReplicaSets for the Deployment to retain.
+  revisionHistoryLimit:
+
+  rbac:
+    create: true
+
+  resources:
+    limits:
+      memory: 4Gi
+    requests:
+      cpu: 100m
+      # You may want to use the same value for `requests.memory` and `limits.memory`. The “requests” value affects scheduling to accommodate pods on nodes.
+      # If there is a large difference between “requests” and “limits” and nodes experience memory pressure, the kernel may invoke
+      # the OOM Killer, even if the memory does not exceed the “limits” threshold. This can cause unexpected pod evictions. Memory
+      # cannot be compressed and once allocated to a pod, it can only be reclaimed by killing the pod.
+      # Natan Yellin 22/09/2022 https://home.robusta.dev/blog/kubernetes-memory-limit
+      memory: 128Mi
+
+  nodeSelector: {}
+
+  tolerations:
+  - key: "node-role.kubernetes.io/master"
+    operator: "Equal"
+    value: ""
+    effect: "NoSchedule"
+  - key: "node-role.kubernetes.io/control-plane"
+    operator: "Equal"
+    value: ""
+    effect: "NoSchedule"
+
+  annotations: {}
+
+  affinity:
+    nodeAffinity:
+      preferredDuringSchedulingIgnoredDuringExecution:
+        - weight: 1
+          preference:
+            matchExpressions:
+              - key: "node-role.kubernetes.io/master"
+                operator: In
+                values: [""]
+        - weight: 1
+          preference:
+            matchExpressions:
+              - key: "node-role.kubernetes.io/control-plane"
+                operator: In
+                values: [""]
+
+  startupProbe:
+    grpc:
+      port: 8082
+    failureThreshold: 30
+    # periodSeconds: 10
+  livenessProbe:
+    grpc:
+      port: 8082
+    # failureThreshold: 3
+    # initialDelaySeconds: 0
+    # periodSeconds: 10
+    # timeoutSeconds: 1
+  readinessProbe:
+    grpc:
+      port: 8082
+    failureThreshold: 10
+    # initialDelaySeconds: 0
+    # periodSeconds: 10
+    # timeoutSeconds: 1
+    # successThreshold: 1
+
+worker:
+  enable: true
+  extraArgs: []
+  extraEnvs: []
+  hostNetwork: false
+  config: ### <NFD-WORKER-CONF-START-DO-NOT-REMOVE>
+    #core:
+    #  labelWhiteList:
+    #  noPublish: false
+    #  noOwnerRefs: false
+    #  sleepInterval: 60s
+    #  featureSources: [all]
+    #  labelSources: [all]
+    #  klog:
+    #    addDirHeader: false
+    #    alsologtostderr: false
+    #    logBacktraceAt:
+    #    logtostderr: true
+    #    skipHeaders: false
+    #    stderrthreshold: 2
+    #    v: 0
+    #    vmodule:
+    ##   NOTE: the following options are not dynamically run-time configurable
+    ##         and require a nfd-worker restart to take effect after being changed
+    #    logDir:
+    #    logFile:
+    #    logFileMaxSize: 1800
+    #    skipLogHeaders: false
+    #sources:
+    #  cpu:
+    #    cpuid:
+    ##     NOTE: whitelist has priority over blacklist
+    #      attributeBlacklist:
+    #        - "AVX10"
+    #        - "BMI1"
+    #        - "BMI2"
+    #        - "CLMUL"
+    #        - "CMOV"
+    #        - "CX16"
+    #        - "ERMS"
+    #        - "F16C"
+    #        - "HTT"
+    #        - "LZCNT"
+    #        - "MMX"
+    #        - "MMXEXT"
+    #        - "NX"
+    #        - "POPCNT"
+    #        - "RDRAND"
+    #        - "RDSEED"
+    #        - "RDTSCP"
+    #        - "SGX"
+    #        - "SSE"
+    #        - "SSE2"
+    #        - "SSE3"
+    #        - "SSE4"
+    #        - "SSE42"
+    #        - "SSSE3"
+    #        - "TDX_GUEST"
+    #      attributeWhitelist:
+    #  kernel:
+    #    kconfigFile: "/path/to/kconfig"
+    #    configOpts:
+    #      - "NO_HZ"
+    #      - "X86"
+    #      - "DMI"
+    #  pci:
+    #    deviceClassWhitelist:
+    #      - "0200"
+    #      - "03"
+    #      - "12"
+    #    deviceLabelFields:
+    #      - "class"
+    #      - "vendor"
+    #      - "device"
+    #      - "subsystem_vendor"
+    #      - "subsystem_device"
+    #  usb:
+    #    deviceClassWhitelist:
+    #      - "0e"
+    #      - "ef"
+    #      - "fe"
+    #      - "ff"
+    #    deviceLabelFields:
+    #      - "class"
+    #      - "vendor"
+    #      - "device"
+    #  custom:
+    #    # The following feature demonstrates the capabilities of the matchFeatures
+    #    - name: "my custom rule"
+    #      labels:
+    #        "vendor.io/my-ng-feature": "true"
+    #      # matchFeatures implements a logical AND over all matcher terms in the
+    #      # list (i.e. all of the terms, or per-feature matchers, must match)
+    #      matchFeatures:
+    #        - feature: cpu.cpuid
+    #          matchExpressions:
+    #            AVX512F: {op: Exists}
+    #        - feature: cpu.cstate
+    #          matchExpressions:
+    #            enabled: {op: IsTrue}
+    #        - feature: cpu.pstate
+    #          matchExpressions:
+    #            no_turbo: {op: IsFalse}
+    #            scaling_governor: {op: In, value: ["performance"]}
+    #        - feature: cpu.rdt
+    #          matchExpressions:
+    #            RDTL3CA: {op: Exists}
+    #        - feature: cpu.sst
+    #          matchExpressions:
+    #            bf.enabled: {op: IsTrue}
+    #        - feature: cpu.topology
+    #          matchExpressions:
+    #            hardware_multithreading: {op: IsFalse}
+    #
+    #        - feature: kernel.config
+    #          matchExpressions:
+    #            X86: {op: Exists}
+    #            LSM: {op: InRegexp, value: ["apparmor"]}
+    #        - feature: kernel.loadedmodule
+    #          matchExpressions:
+    #            e1000e: {op: Exists}
+    #        - feature: kernel.selinux
+    #          matchExpressions:
+    #            enabled: {op: IsFalse}
+    #        - feature: kernel.version
+    #          matchExpressions:
+    #            major: {op: In, value: ["5"]}
+    #            minor: {op: Gt, value: ["10"]}
+    #
+    #        - feature: storage.block
+    #          matchExpressions:
+    #            rotational: {op: In, value: ["0"]}
+    #            dax: {op: In, value: ["0"]}
+    #
+    #        - feature: network.device
+    #          matchExpressions:
+    #            operstate: {op: In, value: ["up"]}
+    #            speed: {op: Gt, value: ["100"]}
+    #
+    #        - feature: memory.numa
+    #          matchExpressions:
+    #            node_count: {op: Gt, value: ["2"]}
+    #        - feature: memory.nv
+    #          matchExpressions:
+    #            devtype: {op: In, value: ["nd_dax"]}
+    #            mode: {op: In, value: ["memory"]}
+    #
+    #        - feature: system.osrelease
+    #          matchExpressions:
+    #            ID: {op: In, value: ["fedora", "centos"]}
+    #        - feature: system.name
+    #          matchExpressions:
+    #            nodename: {op: InRegexp, value: ["^worker-X"]}
+    #
+    #        - feature: local.label
+    #          matchExpressions:
+    #            custom-feature-knob: {op: Gt, value: ["100"]}
+    #
+    #    # The following feature demonstrates the capabilities of the matchAny
+    #    - name: "my matchAny rule"
+    #      labels:
+    #        "vendor.io/my-ng-feature-2": "my-value"
+    #      # matchAny implements a logical OR over all elements (sub-matchers) in
+    #      # the list (i.e. at least one feature matcher must match)
+    #      matchAny:
+    #        - matchFeatures:
+    #            - feature: kernel.loadedmodule
+    #              matchExpressions:
+    #                driver-module-X: {op: Exists}
+    #            - feature: pci.device
+    #              matchExpressions:
+    #                vendor: {op: In, value: ["8086"]}
+    #                class: {op: In, value: ["0200"]}
+    #        - matchFeatures:
+    #            - feature: kernel.loadedmodule
+    #              matchExpressions:
+    #                driver-module-Y: {op: Exists}
+    #            - feature: usb.device
+    #              matchExpressions:
+    #                vendor: {op: In, value: ["8086"]}
+    #                class: {op: In, value: ["02"]}
+    #
+    #    - name: "avx wildcard rule"
+    #      labels:
+    #        "my-avx-feature": "true"
+    #      matchFeatures:
+    #        - feature: cpu.cpuid
+    #          matchName: {op: InRegexp, value: ["^AVX512"]}
+    #
+    #    # The following features demonstrate label templating capabilities
+    #    - name: "my template rule"
+    #      labelsTemplate: |
+    #        {{ range .system.osrelease }}vendor.io/my-system-feature.{{ .Name }}={{ .Value }}
+    #        {{ end }}
+    #      matchFeatures:
+    #        - feature: system.osrelease
+    #          matchExpressions:
+    #            ID: {op: InRegexp, value: ["^open.*"]}
+    #            VERSION_ID.major: {op: In, value: ["13", "15"]}
+    #
+    #    - name: "my template rule 2"
+    #      labelsTemplate: |
+    #        {{ range .pci.device }}vendor.io/my-pci-device.{{ .class }}-{{ .device }}=with-cpuid
+    #        {{ end }}
+    #      matchFeatures:
+    #        - feature: pci.device
+    #          matchExpressions:
+    #            class: {op: InRegexp, value: ["^06"]}
+    #            vendor: ["8086"]
+    #        - feature: cpu.cpuid
+    #          matchExpressions:
+    #            AVX: {op: Exists}
+    #
+    #    # The following examples demonstrate vars field and back-referencing
+    #    # previous labels and vars
+    #    - name: "my dummy kernel rule"
+    #      labels:
+    #        "vendor.io/my.kernel.feature": "true"
+    #      matchFeatures:
+    #        - feature: kernel.version
+    #          matchExpressions:
+    #            major: {op: Gt, value: ["2"]}
+    #
+    #    - name: "my dummy rule with no labels"
+    #      vars:
+    #        "my.dummy.var": "1"
+    #      matchFeatures:
+    #        - feature: cpu.cpuid
+    #          matchExpressions: {}
+    #
+    #    - name: "my rule using backrefs"
+    #      labels:
+    #        "vendor.io/my.backref.feature": "true"
+    #      matchFeatures:
+    #        - feature: rule.matched
+    #          matchExpressions:
+    #            vendor.io/my.kernel.feature: {op: IsTrue}
+    #            my.dummy.var: {op: Gt, value: ["0"]}
+    #
+    #    - name: "kconfig template rule"
+    #      labelsTemplate: |
+    #        {{ range .kernel.config }}kconfig-{{ .Name }}={{ .Value }}
+    #        {{ end }}
+    #      matchFeatures:
+    #        - feature: kernel.config
+    #          matchName: {op: In, value: ["SWAP", "X86", "ARM"]}
+### <NFD-WORKER-CONF-END-DO-NOT-REMOVE>
+
+  metricsPort: 8081
+  healthPort: 8082
+  daemonsetAnnotations: {}
+  podSecurityContext: {}
+    # fsGroup: 2000
+
+  securityContext:
+    allowPrivilegeEscalation: false
+    capabilities:
+      drop: [ "ALL" ]
+    readOnlyRootFilesystem: true
+    runAsNonRoot: true
+    # runAsUser: 1000
+
+  livenessProbe:
+    grpc:
+      port: 8082
+    initialDelaySeconds: 10
+    # failureThreshold: 3
+    # periodSeconds: 10
+    # timeoutSeconds: 1
+  readinessProbe:
+    grpc:
+      port: 8082
+    initialDelaySeconds: 5
+    failureThreshold: 10
+    # periodSeconds: 10
+    # timeoutSeconds: 1
+    # successThreshold: 1
+
+  serviceAccount:
+    # Specifies whether a service account should be created.
+    # We create this by default to make it easier for downstream users to apply PodSecurityPolicies.
+    create: true
+    # Annotations to add to the service account
+    annotations: {}
+    # The name of the service account to use.
+    # If not set and create is true, a name is generated using the fullname template
+    name:
+
+  # specify how many old ControllerRevisions for the DaemonSet to retain.
+  revisionHistoryLimit:
+
+  rbac:
+    create: true
+
+  # Allow users to mount the hostPath /usr/src, useful for RHCOS on s390x
+  # Does not work on systems without /usr/src AND a read-only /usr, such as Talos
+  mountUsrSrc: false
+
+  resources:
+    limits:
+      memory: 512Mi
+    requests:
+      cpu: 5m
+      memory: 64Mi
+
+  nodeSelector: {}
+
+  tolerations: []
+
+  annotations: {}
+
+  affinity: {}
+
+  priorityClassName: ""
+
+topologyUpdater:
+  config: ### <NFD-TOPOLOGY-UPDATER-CONF-START-DO-NOT-REMOVE>
+    ## key = node name, value = list of resources to be excluded.
+    ## use * to exclude from all nodes.
+    ## an example of how the exclude list should look
+    #excludeList:
+    #  node1: [cpu]
+    #  node2: [memory, example/deviceA]
+    #  *: [hugepages-2Mi]
+### <NFD-TOPOLOGY-UPDATER-CONF-END-DO-NOT-REMOVE>
+
+  enable: false
+  createCRDs: false
+  extraArgs: []
+  extraEnvs: []
+  hostNetwork: false
+
+  serviceAccount:
+    create: true
+    annotations: {}
+    name:
+
+  # specify how many old ControllerRevisions for the DaemonSet to retain.
+  revisionHistoryLimit:
+
+  rbac:
+    create: true
+
+  metricsPort: 8081
+  healthPort: 8082
+  kubeletConfigPath:
+  kubeletPodResourcesSockPath:
+  updateInterval: 60s
+  watchNamespace: "*"
+  kubeletStateDir: /var/lib/kubelet
+
+  podSecurityContext: {}
+  securityContext:
+    allowPrivilegeEscalation: false
+    capabilities:
+      drop: [ "ALL" ]
+    readOnlyRootFilesystem: true
+    runAsUser: 0
+
+  livenessProbe:
+    grpc:
+      port: 8082
+    initialDelaySeconds: 10
+    # failureThreshold: 3
+    # periodSeconds: 10
+    # timeoutSeconds: 1
+  readinessProbe:
+    grpc:
+      port: 8082
+    initialDelaySeconds: 5
+    failureThreshold: 10
+    # periodSeconds: 10
+    # timeoutSeconds: 1
+    # successThreshold: 1
+
+  resources:
+    limits:
+      memory: 60Mi
+    requests:
+      cpu: 50m
+      memory: 40Mi
+
+  nodeSelector: {}
+  tolerations: []
+  annotations: {}
+  daemonsetAnnotations: {}
+  affinity: {}
+  podSetFingerprint: true
+
+gc:
+  enable: true
+  extraArgs: []
+  extraEnvs: []
+  hostNetwork: false
+  replicaCount: 1
+
+  serviceAccount:
+    create: true
+    annotations: {}
+    name:
+  rbac:
+    create: true
+
+  interval: 1h
+
+  podSecurityContext: {}
+
+  resources:
+    limits:
+      memory: 1Gi
+    requests:
+      cpu: 10m
+      memory: 128Mi
+
+  metricsPort: 8081
+
+  nodeSelector: {}
+  tolerations: []
+  annotations: {}
+  deploymentAnnotations: {}
+  affinity: {}
+
+  # specify how many old ReplicaSets for the Deployment to retain.
+  revisionHistoryLimit:
+
+prometheus:
+  enable: false
+  scrapeInterval: 10s
+  labels: {}
--- a/packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_clusterpolicies.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_clusterpolicies.yaml
--- a/packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_nvidiadrivers.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/crds/nvidia.com_nvidiadrivers.yaml
@@ -0,0 +1,809 @@
+---
+apiVersion: apiextensions.k8s.io/v1
+kind: CustomResourceDefinition
+metadata:
+  annotations:
+    controller-gen.kubebuilder.io/version: v0.17.2
+  name: nvidiadrivers.nvidia.com
+spec:
+  group: nvidia.com
+  names:
+    kind: NVIDIADriver
+    listKind: NVIDIADriverList
+    plural: nvidiadrivers
+    shortNames:
+    - nvd
+    - nvdriver
+    - nvdrivers
+    singular: nvidiadriver
+  scope: Cluster
+  versions:
+  - additionalPrinterColumns:
+    - jsonPath: .status.state
+      name: Status
+      type: string
+    - jsonPath: .metadata.creationTimestamp
+      name: Age
+      type: string
+    name: v1alpha1
+    schema:
+      openAPIV3Schema:
+        description: NVIDIADriver is the Schema for the nvidiadrivers API
+        properties:
+          apiVersion:
+            description: |-
+              APIVersion defines the versioned schema of this representation of an object.
+              Servers should convert recognized schemas to the latest internal value, and
+              may reject unrecognized values.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
+            type: string
+          kind:
+            description: |-
+              Kind is a string value representing the REST resource this object represents.
+              Servers may infer this from the endpoint the client submits requests to.
+              Cannot be updated.
+              In CamelCase.
+              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
+            type: string
+          metadata:
+            type: object
+          spec:
+            description: NVIDIADriverSpec defines the desired state of NVIDIADriver
+            properties:
+              annotations:
+                additionalProperties:
+                  type: string
+                description: |-
+                  Optional: Annotations is an unstructured key value map stored with a resource that may be
+                  set by external tools to store and retrieve arbitrary metadata. They are not
+                  queryable and should be preserved when modifying objects.
+                type: object
+              args:
+                description: 'Optional: List of arguments'
+                items:
+                  type: string
+                type: array
+              certConfig:
+                description: 'Optional: Custom certificates configuration for NVIDIA
+                  Driver container'
+                properties:
+                  name:
+                    type: string
+                type: object
+              driverType:
+                default: gpu
+                description: DriverType defines NVIDIA driver type
+                enum:
+                - gpu
+                - vgpu
+                - vgpu-host-manager
+                type: string
+                x-kubernetes-validations:
+                - message: driverType is an immutable field. Please create a new NVIDIADriver
+                    resource instead when you want to change this setting.
+                  rule: self == oldSelf
+              env:
+                description: 'Optional: List of environment variables'
+                items:
+                  description: EnvVar represents an environment variable present in
+                    a Container.
+                  properties:
+                    name:
+                      description: Name of the environment variable.
+                      type: string
+                    value:
+                      description: Value of the environment variable.
+                      type: string
+                  required:
+                  - name
+                  type: object
+                type: array
+              gdrcopy:
+                description: GDRCopy defines the spec for GDRCopy driver
+                properties:
+                  args:
+                    description: 'Optional: List of arguments'
+                    items:
+                      type: string
+                    type: array
+                  enabled:
+                    description: Enabled indicates if GDRCopy is enabled through GPU
+                      operator
+                    type: boolean
+                  env:
+                    description: 'Optional: List of environment variables'
+                    items:
+                      description: EnvVar represents an environment variable present
+                        in a Container.
+                      properties:
+                        name:
+                          description: Name of the environment variable.
+                          type: string
+                        value:
+                          description: Value of the environment variable.
+                          type: string
+                      required:
+                      - name
+                      type: object
+                    type: array
+                  image:
+                    description: GDRCopy driver image name
+                    pattern: '[a-zA-Z0-9\-]+'
+                    type: string
+                  imagePullPolicy:
+                    description: Image pull policy
+                    type: string
+                  imagePullSecrets:
+                    description: Image pull secrets
+                    items:
+                      type: string
+                    type: array
+                  repository:
+                    description: GDRCopy driver image repository
+                    type: string
+                  version:
+                    description: GDRCopy driver image tag
+                    type: string
+                type: object
+              gds:
+                description: GPUDirectStorage defines the spec for GDS driver
+                properties:
+                  args:
+                    description: 'Optional: List of arguments'
+                    items:
+                      type: string
+                    type: array
+                  enabled:
+                    description: Enabled indicates if GPUDirect Storage is enabled
+                      through GPU operator
+                    type: boolean
+                  env:
+                    description: 'Optional: List of environment variables'
+                    items:
+                      description: EnvVar represents an environment variable present
+                        in a Container.
+                      properties:
+                        name:
+                          description: Name of the environment variable.
+                          type: string
+                        value:
+                          description: Value of the environment variable.
+                          type: string
+                      required:
+                      - name
+                      type: object
+                    type: array
+                  image:
+                    description: NVIDIA GPUDirect Storage Driver image name
+                    pattern: '[a-zA-Z0-9\-]+'
+                    type: string
+                  imagePullPolicy:
+                    description: Image pull policy
+                    type: string
+                  imagePullSecrets:
+                    description: Image pull secrets
+                    items:
+                      type: string
+                    type: array
+                  repository:
+                    description: NVIDIA GPUDirect Storage Driver image repository
+                    type: string
+                  version:
+                    description: NVIDIA GPUDirect Storage Driver image tag
+                    type: string
+                type: object
+              image:
+                default: nvcr.io/nvidia/driver
+                description: NVIDIA Driver container image name
+                type: string
+              imagePullPolicy:
+                description: Image pull policy
+                type: string
+              imagePullSecrets:
+                description: Image pull secrets
+                items:
+                  type: string
+                type: array
+              kernelModuleConfig:
+                description: 'Optional: Kernel module configuration parameters for
+                  the NVIDIA Driver'
+                properties:
+                  name:
+                    type: string
+                type: object
+              kernelModuleType:
+                default: auto
+                description: |-
+                  KernelModuleType represents the type of driver kernel modules to be used when installing the GPU driver.
+                  Accepted values are auto, proprietary and open. NOTE: If auto is chosen, it means that the recommended kernel module
+                  type is chosen based on the GPU devices on the host and the driver branch used
+                enum:
+                - auto
+                - open
+                - proprietary
+                type: string
+              labels:
+                additionalProperties:
+                  type: string
+                description: |-
+                  Optional: Map of string keys and values that can be used to organize and categorize
+                  (scope and select) objects. May match selectors of replication controllers
+                  and services.
+                type: object
+              licensingConfig:
+                description: 'Optional: Licensing configuration for NVIDIA vGPU licensing'
+                properties:
+                  name:
+                    type: string
+                  nlsEnabled:
+                    description: NLSEnabled indicates if NVIDIA Licensing System is
+                      used for licensing.
+                    type: boolean
+                type: object
+              livenessProbe:
+                description: NVIDIA Driver container liveness probe settings
+                properties:
+                  failureThreshold:
+                    description: |-
+                      Minimum consecutive failures for the probe to be considered failed after having succeeded.
+                      Defaults to 3. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  initialDelaySeconds:
+                    description: |-
+                      Number of seconds after the container has started before liveness probes are initiated.
+                      More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
+                    format: int32
+                    type: integer
+                  periodSeconds:
+                    description: |-
+                      How often (in seconds) to perform the probe.
+                      Defaults to 10 seconds. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  successThreshold:
+                    description: |-
+                      Minimum consecutive successes for the probe to be considered successful after having failed.
+                      Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  timeoutSeconds:
+                    description: |-
+                      Number of seconds after which the probe times out.
+                      Defaults to 1 second. Minimum value is 1.
+                      More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
+                    format: int32
+                    minimum: 1
+                    type: integer
+                type: object
+              manager:
+                description: Manager represents configuration for NVIDIA Driver Manager
+                  initContainer
+                properties:
+                  env:
+                    description: 'Optional: List of environment variables'
+                    items:
+                      description: EnvVar represents an environment variable present
+                        in a Container.
+                      properties:
+                        name:
+                          description: Name of the environment variable.
+                          type: string
+                        value:
+                          description: Value of the environment variable.
+                          type: string
+                      required:
+                      - name
+                      type: object
+                    type: array
+                  image:
+                    description: Image represents NVIDIA Driver Manager image name
+                    pattern: '[a-zA-Z0-9\-]+'
+                    type: string
+                  imagePullPolicy:
+                    description: Image pull policy
+                    type: string
+                  imagePullSecrets:
+                    description: Image pull secrets
+                    items:
+                      type: string
+                    type: array
+                  repository:
+                    description: Repository represents Driver Manager repository path
+                    type: string
+                  version:
+                    description: Version represents NVIDIA Driver Manager image tag (version)
+                    type: string
+                type: object
+              nodeAffinity:
+                description: Affinity specifies node affinity rules for driver pods
+                properties:
+                  preferredDuringSchedulingIgnoredDuringExecution:
+                    description: |-
+                      The scheduler will prefer to schedule pods to nodes that satisfy
+                      the affinity expressions specified by this field, but it may choose
+                      a node that violates one or more of the expressions. The node that is
+                      most preferred is the one with the greatest sum of weights, i.e.
+                      for each node that meets all of the scheduling requirements (resource
+                      request, requiredDuringScheduling affinity expressions, etc.),
+                      compute a sum by iterating through the elements of this field and adding
+                      "weight" to the sum if the node matches the corresponding matchExpressions; the
+                      node(s) with the highest sum are the most preferred.
+                    items:
+                      description: |-
+                        An empty preferred scheduling term matches all objects with implicit weight 0
+                        (i.e. it's a no-op). A null preferred scheduling term matches no objects (i.e. is also a no-op).
+                      properties:
+                        preference:
+                          description: A node selector term, associated with the corresponding
+                            weight.
+                          properties:
+                            matchExpressions:
+                              description: A list of node selector requirements by
+                                node's labels.
+                              items:
+                                description: |-
+                                  A node selector requirement is a selector that contains values, a key, and an operator
+                                  that relates the key and values.
+                                properties:
+                                  key:
+                                    description: The label key that the selector applies
+                                      to.
+                                    type: string
+                                  operator:
+                                    description: |-
+                                      Represents a key's relationship to a set of values.
+                                      Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
+                                    type: string
+                                  values:
+                                    description: |-
+                                      An array of string values. If the operator is In or NotIn,
+                                      the values array must be non-empty. If the operator is Exists or DoesNotExist,
+                                      the values array must be empty. If the operator is Gt or Lt, the values
+                                      array must have a single element, which will be interpreted as an integer.
+                                      This array is replaced during a strategic merge patch.
+                                    items:
+                                      type: string
+                                    type: array
+                                    x-kubernetes-list-type: atomic
+                                required:
+                                - key
+                                - operator
+                                type: object
+                              type: array
+                              x-kubernetes-list-type: atomic
+                            matchFields:
+                              description: A list of node selector requirements by
+                                node's fields.
+                              items:
+                                description: |-
+                                  A node selector requirement is a selector that contains values, a key, and an operator
+                                  that relates the key and values.
+                                properties:
+                                  key:
+                                    description: The label key that the selector applies
+                                      to.
+                                    type: string
+                                  operator:
+                                    description: |-
+                                      Represents a key's relationship to a set of values.
+                                      Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
+                                    type: string
+                                  values:
+                                    description: |-
+                                      An array of string values. If the operator is In or NotIn,
+                                      the values array must be non-empty. If the operator is Exists or DoesNotExist,
+                                      the values array must be empty. If the operator is Gt or Lt, the values
+                                      array must have a single element, which will be interpreted as an integer.
+                                      This array is replaced during a strategic merge patch.
+                                    items:
+                                      type: string
+                                    type: array
+                                    x-kubernetes-list-type: atomic
+                                required:
+                                - key
+                                - operator
+                                type: object
+                              type: array
+                              x-kubernetes-list-type: atomic
+                          type: object
+                          x-kubernetes-map-type: atomic
+                        weight:
+                          description: Weight associated with matching the corresponding
+                            nodeSelectorTerm, in the range 1-100.
+                          format: int32
+                          type: integer
+                      required:
+                      - preference
+                      - weight
+                      type: object
+                    type: array
+                    x-kubernetes-list-type: atomic
+                  requiredDuringSchedulingIgnoredDuringExecution:
+                    description: |-
+                      If the affinity requirements specified by this field are not met at
+                      scheduling time, the pod will not be scheduled onto the node.
+                      If the affinity requirements specified by this field cease to be met
+                      at some point during pod execution (e.g. due to an update), the system
+                      may or may not try to eventually evict the pod from its node.
+                    properties:
+                      nodeSelectorTerms:
+                        description: Required. A list of node selector terms. The
+                          terms are ORed.
+                        items:
+                          description: |-
+                            A null or empty node selector term matches no objects. The requirements of
+                            them are ANDed.
+                            The TopologySelectorTerm type implements a subset of the NodeSelectorTerm.
+                          properties:
+                            matchExpressions:
+                              description: A list of node selector requirements by
+                                node's labels.
+                              items:
+                                description: |-
+                                  A node selector requirement is a selector that contains values, a key, and an operator
+                                  that relates the key and values.
+                                properties:
+                                  key:
+                                    description: The label key that the selector applies
+                                      to.
+                                    type: string
+                                  operator:
+                                    description: |-
+                                      Represents a key's relationship to a set of values.
+                                      Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
+                                    type: string
+                                  values:
+                                    description: |-
+                                      An array of string values. If the operator is In or NotIn,
+                                      the values array must be non-empty. If the operator is Exists or DoesNotExist,
+                                      the values array must be empty. If the operator is Gt or Lt, the values
+                                      array must have a single element, which will be interpreted as an integer.
+                                      This array is replaced during a strategic merge patch.
+                                    items:
+                                      type: string
+                                    type: array
+                                    x-kubernetes-list-type: atomic
+                                required:
+                                - key
+                                - operator
+                                type: object
+                              type: array
+                              x-kubernetes-list-type: atomic
+                            matchFields:
+                              description: A list of node selector requirements by
+                                node's fields.
+                              items:
+                                description: |-
+                                  A node selector requirement is a selector that contains values, a key, and an operator
+                                  that relates the key and values.
+                                properties:
+                                  key:
+                                    description: The label key that the selector applies
+                                      to.
+                                    type: string
+                                  operator:
+                                    description: |-
+                                      Represents a key's relationship to a set of values.
+                                      Valid operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
+                                    type: string
+                                  values:
+                                    description: |-
+                                      An array of string values. If the operator is In or NotIn,
+                                      the values array must be non-empty. If the operator is Exists or DoesNotExist,
+                                      the values array must be empty. If the operator is Gt or Lt, the values
+                                      array must have a single element, which will be interpreted as an integer.
+                                      This array is replaced during a strategic merge patch.
+                                    items:
+                                      type: string
+                                    type: array
+                                    x-kubernetes-list-type: atomic
+                                required:
+                                - key
+                                - operator
+                                type: object
+                              type: array
+                              x-kubernetes-list-type: atomic
+                          type: object
+                          x-kubernetes-map-type: atomic
+                        type: array
+                        x-kubernetes-list-type: atomic
+                    required:
+                    - nodeSelectorTerms
+                    type: object
+                    x-kubernetes-map-type: atomic
+                type: object
+              nodeSelector:
+                additionalProperties:
+                  type: string
+                description: NodeSelector specifies a selector for installation of
+                  NVIDIA driver
+                type: object
+              priorityClassName:
+                description: 'Optional: Set priorityClassName'
+                type: string
+              rdma:
+                description: GPUDirectRDMA defines the spec for NVIDIA Peer Memory
+                  driver
+                properties:
+                  enabled:
+                    description: Enabled indicates if GPUDirect RDMA is enabled through
+                      GPU operator
+                    type: boolean
+                  useHostMofed:
+                    description: UseHostMOFED indicates to use MOFED drivers directly
+                      installed on the host to enable GPUDirect RDMA
+                    type: boolean
+                type: object
+              readinessProbe:
+                description: NVIDIA Driver container readiness probe settings
+                properties:
+                  failureThreshold:
+                    description: |-
+                      Minimum consecutive failures for the probe to be considered failed after having succeeded.
+                      Defaults to 3. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  initialDelaySeconds:
+                    description: |-
+                      Number of seconds after the container has started before readiness probes are initiated.
+                      More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
+                    format: int32
+                    type: integer
+                  periodSeconds:
+                    description: |-
+                      How often (in seconds) to perform the probe.
+                      Defaults to 10 seconds. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  successThreshold:
+                    description: |-
+                      Minimum consecutive successes for the probe to be considered successful after having failed.
+                      Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  timeoutSeconds:
+                    description: |-
+                      Number of seconds after which the probe times out.
+                      Defaults to 1 second. Minimum value is 1.
+                      More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
+                    format: int32
+                    minimum: 1
+                    type: integer
+                type: object
+              repoConfig:
+                description: 'Optional: Custom repo configuration for NVIDIA Driver
+                  container'
+                properties:
+                  name:
+                    type: string
+                type: object
+              repository:
+                description: NVIDIA Driver repository
+                type: string
+              resources:
+                description: 'Optional: Define resources requests and limits for each
+                  pod'
+                properties:
+                  limits:
+                    additionalProperties:
+                      anyOf:
+                      - type: integer
+                      - type: string
+                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                      x-kubernetes-int-or-string: true
+                    description: |-
+                      Limits describes the maximum amount of compute resources allowed.
+                      More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
+                    type: object
+                  requests:
+                    additionalProperties:
+                      anyOf:
+                      - type: integer
+                      - type: string
+                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
+                      x-kubernetes-int-or-string: true
+                    description: |-
+                      Requests describes the minimum amount of compute resources required.
+                      If Requests is omitted for a container, it defaults to Limits if that is explicitly specified,
+                      otherwise to an implementation-defined value. Requests cannot exceed Limits.
+                      More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
+                    type: object
+                type: object
+              startupProbe:
+                description: NVIDIA Driver container startup probe settings
+                properties:
+                  failureThreshold:
+                    description: |-
+                      Minimum consecutive failures for the probe to be considered failed after having succeeded.
+                      Defaults to 3. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  initialDelaySeconds:
+                    description: |-
+                      Number of seconds after the container has started before the probe is initiated.
+                      More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
+                    format: int32
+                    type: integer
+                  periodSeconds:
+                    description: |-
+                      How often (in seconds) to perform the probe.
+                      Defaults to 10 seconds. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  successThreshold:
+                    description: |-
+                      Minimum consecutive successes for the probe to be considered successful after having failed.
+                      Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
+                    format: int32
+                    minimum: 1
+                    type: integer
+                  timeoutSeconds:
+                    description: |-
+                      Number of seconds after which the probe times out.
+                      Defaults to 1 second. Minimum value is 1.
+                      More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
+                    format: int32
+                    minimum: 1
+                    type: integer
+                type: object
+              tolerations:
+                description: 'Optional: Set tolerations'
+                items:
+                  description: |-
+                    The pod this Toleration is attached to tolerates any taint that matches
+                    the triple <key,value,effect> using the matching operator <operator>.
+                  properties:
+                    effect:
+                      description: |-
+                        Effect indicates the taint effect to match. Empty means match all taint effects.
+                        When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
+                      type: string
+                    key:
+                      description: |-
+                        Key is the taint key that the toleration applies to. Empty means match all taint keys.
+                        If the key is empty, operator must be Exists; this combination means to match all values and all keys.
+                      type: string
+                    operator:
+                      description: |-
+                        Operator represents a key's relationship to the value.
+                        Valid operators are Exists and Equal. Defaults to Equal.
+                        Exists is equivalent to wildcard for value, so that a pod can
+                        tolerate all taints of a particular category.
+                      type: string
+                    tolerationSeconds:
+                      description: |-
+                        TolerationSeconds represents the period of time the toleration (which must be
+                        of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default,
+                        it is not set, which means tolerate the taint forever (do not evict). Zero and
+                        negative values will be treated as 0 (evict immediately) by the system.
+                      format: int64
+                      type: integer
+                    value:
+                      description: |-
+                        Value is the taint value the toleration matches to.
+                        If the operator is Exists, the value should be empty, otherwise just a regular string.
+                      type: string
+                  type: object
+                type: array
+              useOpenKernelModules:
+                description: |-
+                  Deprecated: This field is no longer honored by the gpu-operator. Please use KernelModuleType instead.
+                  UseOpenKernelModules indicates if the open GPU kernel modules should be used
+                type: boolean
+              usePrecompiled:
+                description: UsePrecompiled indicates if deployment of NVIDIA Driver
+                  using pre-compiled modules is enabled
+                type: boolean
+                x-kubernetes-validations:
+                - message: usePrecompiled is an immutable field. Please create a new
+                    NvidiaDriver resource instead when you want to change this setting.
+                  rule: self == oldSelf
+              version:
+                description: NVIDIA Driver version (or just branch for precompiled
+                  drivers)
+                type: string
+              virtualTopologyConfig:
+                description: 'Optional: Virtual Topology Daemon configuration for
+                  NVIDIA vGPU drivers'
+                properties:
+                  name:
+                    description: 'Optional: Config name representing virtual topology
+                      daemon configuration file nvidia-topologyd.conf'
+                    type: string
+                type: object
+            required:
+            - driverType
+            - image
+            type: object
+          status:
+            description: NVIDIADriverStatus defines the observed state of NVIDIADriver
+            properties:
+              conditions:
+                description: Conditions is a list of conditions representing the NVIDIADriver's
+                  current state.
+                items:
+                  description: Condition contains details for one aspect of the current
+                    state of this API Resource.
+                  properties:
+                    lastTransitionTime:
+                      description: |-
+                        lastTransitionTime is the last time the condition transitioned from one status to another.
+                        This should be when the underlying condition changed.  If that is not known, then using the time when the API field changed is acceptable.
+                      format: date-time
+                      type: string
+                    message:
+                      description: |-
+                        message is a human readable message indicating details about the transition.
+                        This may be an empty string.
+                      maxLength: 32768
+                      type: string
+                    observedGeneration:
+                      description: |-
+                        observedGeneration represents the .metadata.generation that the condition was set based upon.
+                        For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
+                        with respect to the current state of the instance.
+                      format: int64
+                      minimum: 0
+                      type: integer
+                    reason:
+                      description: |-
+                        reason contains a programmatic identifier indicating the reason for the condition's last transition.
+                        Producers of specific condition types may define expected values and meanings for this field,
+                        and whether the values are considered a guaranteed API.
+                        The value should be a CamelCase string.
+                        This field may not be empty.
+                      maxLength: 1024
+                      minLength: 1
+                      pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
+                      type: string
+                    status:
+                      description: status of the condition, one of True, False, Unknown.
+                      enum:
+                      - "True"
+                      - "False"
+                      - Unknown
+                      type: string
+                    type:
+                      description: type of condition in CamelCase or in foo.example.com/CamelCase.
+                      maxLength: 316
+                      pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
+                      type: string
+                  required:
+                  - lastTransitionTime
+                  - message
+                  - reason
+                  - status
+                  - type
+                  type: object
+                type: array
+              namespace:
+                description: Namespace indicates a namespace in which the operator
+                  and driver are installed
+                type: string
+              state:
+                description: |-
+                  INSERT ADDITIONAL STATUS FIELD - define observed state of cluster
+                  Important: Run "make" to regenerate code after modifying this file
+                  State indicates status of NVIDIADriver instance
+                enum:
+                - ignored
+                - ready
+                - notReady
+                type: string
+            required:
+            - state
+            type: object
+        type: object
+    served: true
+    storage: true
+    subresources:
+      status: {}
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/_helpers.tpl
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/_helpers.tpl
@@ -0,0 +1,80 @@
+{{/* vim: set filetype=mustache: */}}
+{{/*
+Expand the name of the chart.
+*/}}
+{{- define "gpu-operator.name" -}}
+{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
+{{- end -}}
+
+{{/*
+Create a default fully qualified app name.
+We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
+If release name contains chart name it will be used as a full name.
+*/}}
+{{- define "gpu-operator.fullname" -}}
+{{- if .Values.fullnameOverride -}}
+{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
+{{- else -}}
+{{- $name := default .Chart.Name .Values.nameOverride -}}
+{{- if contains $name .Release.Name -}}
+{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
+{{- else -}}
+{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
+{{- end -}}
+{{- end -}}
+{{- end -}}
+
+{{/*
+Create chart name and version as used by the chart label.
+*/}}
+{{- define "gpu-operator.chart" -}}
+{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
+{{- end -}}
+
+{{/*
+Common labels
+*/}}
+
+{{- define "gpu-operator.labels" -}}
+app.kubernetes.io/name: {{ include "gpu-operator.name" . }}
+helm.sh/chart: {{ include "gpu-operator.chart" . }}
+app.kubernetes.io/instance: {{ .Release.Name }}
+{{- if .Chart.AppVersion }}
+app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
+{{- end }}
+app.kubernetes.io/managed-by: {{ .Release.Service }}
+{{- if .Values.operator.labels }}
+{{ toYaml .Values.operator.labels }}
+{{- end }}
+{{- end -}}
+
+{{- define "gpu-operator.operand-labels" -}}
+helm.sh/chart: {{ include "gpu-operator.chart" . }}
+app.kubernetes.io/managed-by: {{ include "gpu-operator.name" . }}
+{{- if .Values.daemonsets.labels }}
+{{ toYaml .Values.daemonsets.labels }}
+{{- end }}
+{{- end -}}
+
+{{- define "gpu-operator.matchLabels" -}}
+app.kubernetes.io/name: {{ include "gpu-operator.name" . }}
+app.kubernetes.io/instance: {{ .Release.Name }}
+{{- if .Chart.AppVersion }}
+app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
+{{- end }}
+app.kubernetes.io/managed-by: {{ .Release.Service }}
+{{- end -}}
+
+{{/*
+Full image name with tag
+*/}}
+{{- define "gpu-operator.fullimage" -}}
+{{- .Values.operator.repository -}}/{{- .Values.operator.image -}}:{{- .Values.operator.version | default .Chart.AppVersion -}}
+{{- end }}
+
+{{/*
+Full driver-manager image name with tag
+*/}}
+{{- define "driver-manager.fullimage" -}}
+{{- .Values.driver.manager.repository -}}/{{- .Values.driver.manager.image -}}:{{- .Values.driver.manager.version -}}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/cleanup_crd.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/cleanup_crd.yaml
@@ -0,0 +1,50 @@
+{{- if .Values.operator.cleanupCRD }}
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: gpu-operator-cleanup-crd
+  namespace: {{ .Release.Namespace }}
+  annotations:
+    "helm.sh/hook": pre-delete
+    "helm.sh/hook-weight": "1"
+    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
+  labels:
+    {{- include "gpu-operator.labels" . | nindent 4 }}
+    app.kubernetes.io/component: "gpu-operator"
+spec:
+  template:
+    metadata:
+      name: gpu-operator-cleanup-crd
+      labels:
+        {{- include "gpu-operator.labels" . | nindent 8 }}
+        app.kubernetes.io/component: "gpu-operator"
+    spec:
+      serviceAccountName: gpu-operator
+      {{- if .Values.operator.imagePullSecrets }}
+      imagePullSecrets:
+      {{- range .Values.operator.imagePullSecrets }}
+        - name: {{ . }}
+      {{- end }}
+      {{- end }}
+      {{- with .Values.operator.tolerations }}
+      tolerations:
+        {{- toYaml . | nindent 8 }}
+      {{- end }}
+      containers:
+        - name: cleanup-crd
+          image: {{ include "gpu-operator.fullimage" . }}
+          imagePullPolicy: {{ .Values.operator.imagePullPolicy }}
+          command:
+          - /bin/sh
+          - -c
+          - >
+              kubectl delete clusterpolicy cluster-policy;
+              kubectl delete crd clusterpolicies.nvidia.com;
+              kubectl delete crd nvidiadrivers.nvidia.com --ignore-not-found=true;
+            {{- if .Values.nfd.enabled -}}
+              kubectl delete crd nodefeatures.nfd.k8s-sigs.io --ignore-not-found=true;
+              kubectl delete crd nodefeaturegroups.nfd.k8s-sigs.io --ignore-not-found=true;
+              kubectl delete crd nodefeaturerules.nfd.k8s-sigs.io --ignore-not-found=true;
+            {{- end }}
+      restartPolicy: OnFailure
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/clusterpolicy.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/clusterpolicy.yaml
@@ -0,0 +1,680 @@
+apiVersion: nvidia.com/v1
+kind: ClusterPolicy
+metadata:
+  name: cluster-policy
+  labels:
+    {{- include "gpu-operator.labels" . | nindent 4 }}
+    app.kubernetes.io/component: "gpu-operator"
+  {{- if .Values.operator.cleanupCRD }}
+  # CR cleanup is handled during pre-delete hook
+  # Add below annotation so that helm doesn't attempt to cleanup CR twice
+  annotations:
+    "helm.sh/resource-policy": keep
+  {{- end }}
+spec:
+  hostPaths:
+    rootFS: {{ .Values.hostPaths.rootFS }}
+    driverInstallDir: {{ .Values.hostPaths.driverInstallDir }}
+  operator:
+    {{- if .Values.operator.runtimeClass }}
+    runtimeClass: {{ .Values.operator.runtimeClass }}
+    {{- end }}
+    {{- if .Values.operator.defaultGPUMode }}
+    defaultGPUMode: {{ .Values.operator.defaultGPUMode }}
+    {{- end }}
+    {{- if .Values.operator.initContainer }}
+    initContainer:
+      {{- if .Values.operator.initContainer.repository }}
+      repository: {{ .Values.operator.initContainer.repository }}
+      {{- end }}
+      {{- if .Values.operator.initContainer.image }}
+      image: {{ .Values.operator.initContainer.image }}
+      {{- end }}
+      {{- if .Values.operator.initContainer.version }}
+      version: {{ .Values.operator.initContainer.version | quote }}
+      {{- end }}
+      {{- if .Values.operator.initContainer.imagePullPolicy }}
+      imagePullPolicy: {{ .Values.operator.initContainer.imagePullPolicy }}
+      {{- end }}
+      {{- if .Values.operator.initContainer.imagePullSecrets }}
+      imagePullSecrets: {{ toYaml .Values.operator.initContainer.imagePullSecrets | nindent 8 }}
+      {{- end }}
+    {{- end }}
+    {{- if .Values.operator.use_ocp_driver_toolkit }}
+    use_ocp_driver_toolkit: {{ .Values.operator.use_ocp_driver_toolkit }}
+    {{- end }}
+  daemonsets:
+    labels:
+      {{- include "gpu-operator.operand-labels" . | nindent 6 }}
+    {{- if .Values.daemonsets.annotations }}
+    annotations: {{ toYaml .Values.daemonsets.annotations | nindent 6 }}
+    {{- end }}
+    {{- if .Values.daemonsets.tolerations }}
+    tolerations: {{ toYaml .Values.daemonsets.tolerations | nindent 6 }}
+    {{- end }}
+    {{- if .Values.daemonsets.priorityClassName }}
+    priorityClassName: {{ .Values.daemonsets.priorityClassName }}
+    {{- end }}
+    {{- if .Values.daemonsets.updateStrategy }}
+    updateStrategy: {{ .Values.daemonsets.updateStrategy }}
+    {{- end }}
+    {{- if .Values.daemonsets.rollingUpdate }}
+    rollingUpdate:
+      maxUnavailable: {{ .Values.daemonsets.rollingUpdate.maxUnavailable | quote }}
+    {{- end }}
+  validator:
+    {{- if .Values.validator.repository }}
+    repository: {{ .Values.validator.repository }}
+    {{- end }}
+    {{- if .Values.validator.image }}
+    image: {{ .Values.validator.image }}
+    {{- end }}
+    version: {{ .Values.validator.version | default .Chart.AppVersion | quote }}
+    {{- if .Values.validator.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.validator.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.validator.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.validator.imagePullSecrets | nindent 8 }}
+    {{- end }}
+    {{- if .Values.validator.resources }}
+    resources: {{ toYaml .Values.validator.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.validator.env }}
+    env: {{ toYaml .Values.validator.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.validator.args }}
+    args: {{ toYaml .Values.validator.args | nindent 6 }}
+    {{- end }}
+    {{- if .Values.validator.plugin }}
+    plugin:
+      {{- if .Values.validator.plugin.env }}
+      env: {{ toYaml .Values.validator.plugin.env | nindent 8 }}
+      {{- end }}
+    {{- end }}
+    {{- if .Values.validator.cuda }}
+    cuda:
+      {{- if .Values.validator.cuda.env }}
+      env: {{ toYaml .Values.validator.cuda.env | nindent 8 }}
+      {{- end }}
+    {{- end }}
+    {{- if .Values.validator.driver }}
+    driver:
+      {{- if .Values.validator.driver.env }}
+      env: {{ toYaml .Values.validator.driver.env | nindent 8 }}
+      {{- end }}
+    {{- end }}
+    {{- if .Values.validator.toolkit }}
+    toolkit:
+      {{- if .Values.validator.toolkit.env }}
+      env: {{ toYaml .Values.validator.toolkit.env | nindent 8 }}
+      {{- end }}
+    {{- end }}
+    {{- if .Values.validator.vfioPCI }}
+    vfioPCI:
+      {{- if .Values.validator.vfioPCI.env }}
+      env: {{ toYaml .Values.validator.vfioPCI.env | nindent 8 }}
+      {{- end }}
+    {{- end }}
+    {{- if .Values.validator.vgpuManager }}
+    vgpuManager:
+      {{- if .Values.validator.vgpuManager.env }}
+      env: {{ toYaml .Values.validator.vgpuManager.env | nindent 8 }}
+      {{- end }}
+    {{- end }}
+    {{- if .Values.validator.vgpuDevices }}
+    vgpuDevices:
+      {{- if .Values.validator.vgpuDevices.env }}
+      env: {{ toYaml .Values.validator.vgpuDevices.env | nindent 8 }}
+      {{- end }}
+    {{- end }}
+
+  mig:
+    {{- if .Values.mig.strategy }}
+    strategy: {{ .Values.mig.strategy }}
+    {{- end }}
+  psa:
+    enabled: {{ .Values.psa.enabled }}
+  cdi:
+    enabled: {{ .Values.cdi.enabled }}
+    default: {{ .Values.cdi.default }}
+  driver:
+    enabled: {{ .Values.driver.enabled }}
+    useNvidiaDriverCRD: {{ .Values.driver.nvidiaDriverCRD.enabled }}
+    kernelModuleType: {{ .Values.driver.kernelModuleType }}
+    usePrecompiled: {{ .Values.driver.usePrecompiled }}
+    {{- if .Values.driver.repository }}
+    repository: {{ .Values.driver.repository }}
+    {{- end }}
+    {{- if .Values.driver.image }}
+    image: {{ .Values.driver.image }}
+    {{- end }}
+    {{- if .Values.driver.version }}
+    version: {{ .Values.driver.version | quote }}
+    {{- end }}
+    {{- if .Values.driver.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.driver.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.driver.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.driver.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.startupProbe }}
+    startupProbe: {{ toYaml .Values.driver.startupProbe | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.livenessProbe }}
+    livenessProbe: {{ toYaml .Values.driver.livenessProbe | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.readinessProbe }}
+    readinessProbe: {{ toYaml .Values.driver.readinessProbe | nindent 6 }}
+    {{- end }}
+    rdma:
+      enabled: {{ .Values.driver.rdma.enabled }}
+      useHostMofed: {{ .Values.driver.rdma.useHostMofed }}
+    manager:
+      {{- if .Values.driver.manager.repository }}
+      repository: {{ .Values.driver.manager.repository }}
+      {{- end }}
+      {{- if .Values.driver.manager.image }}
+      image: {{ .Values.driver.manager.image }}
+      {{- end }}
+      {{- if .Values.driver.manager.version }}
+      version: {{ .Values.driver.manager.version | quote }}
+      {{- end }}
+      {{- if .Values.driver.manager.imagePullPolicy }}
+      imagePullPolicy: {{ .Values.driver.manager.imagePullPolicy }}
+      {{- end }}
+      {{- if .Values.driver.manager.env }}
+      env: {{ toYaml .Values.driver.manager.env | nindent 8 }}
+      {{- end }}
+    {{- if .Values.driver.repoConfig }}
+    repoConfig: {{ toYaml .Values.driver.repoConfig | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.certConfig }}
+    certConfig: {{ toYaml .Values.driver.certConfig | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.licensingConfig }}
+    licensingConfig: {{ toYaml .Values.driver.licensingConfig | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.virtualTopology }}
+    virtualTopology: {{ toYaml .Values.driver.virtualTopology | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.kernelModuleConfig }}
+    kernelModuleConfig: {{ toYaml .Values.driver.kernelModuleConfig | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.resources }}
+    resources: {{ toYaml .Values.driver.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.env }}
+    env: {{ toYaml .Values.driver.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.args }}
+    args: {{ toYaml .Values.driver.args | nindent 6 }}
+    {{- end }}
+    {{- if .Values.driver.upgradePolicy }}
+    upgradePolicy:
+      autoUpgrade: {{ .Values.driver.upgradePolicy.autoUpgrade | default false }}
+      maxParallelUpgrades: {{ .Values.driver.upgradePolicy.maxParallelUpgrades | default 0 }}
+      maxUnavailable: {{ .Values.driver.upgradePolicy.maxUnavailable | default "25%" }}
+      waitForCompletion:
+        timeoutSeconds: {{ .Values.driver.upgradePolicy.waitForCompletion.timeoutSeconds }}
+        {{- if .Values.driver.upgradePolicy.waitForCompletion.podSelector }}
+        podSelector: {{ .Values.driver.upgradePolicy.waitForCompletion.podSelector }}
+        {{- end }}
+      podDeletion:
+        force: {{ .Values.driver.upgradePolicy.gpuPodDeletion.force | default false }}
+        timeoutSeconds: {{ .Values.driver.upgradePolicy.gpuPodDeletion.timeoutSeconds }}
+        deleteEmptyDir: {{ .Values.driver.upgradePolicy.gpuPodDeletion.deleteEmptyDir | default false }}
+      drain:
+        enable: {{ .Values.driver.upgradePolicy.drain.enable | default false }}
+        force: {{ .Values.driver.upgradePolicy.drain.force | default false }}
+        {{- if .Values.driver.upgradePolicy.drain.podSelector }}
+        podSelector: {{ .Values.driver.upgradePolicy.drain.podSelector }}
+        {{- end }}
+        timeoutSeconds: {{ .Values.driver.upgradePolicy.drain.timeoutSeconds }}
+        deleteEmptyDir: {{ .Values.driver.upgradePolicy.drain.deleteEmptyDir | default false }}
+    {{- end }}
+  vgpuManager:
+    enabled: {{ .Values.vgpuManager.enabled }}
+    {{- if .Values.vgpuManager.repository }}
+    repository: {{ .Values.vgpuManager.repository }}
+    {{- end }}
+    {{- if .Values.vgpuManager.image }}
+    image: {{ .Values.vgpuManager.image }}
+    {{- end }}
+    {{- if .Values.vgpuManager.version }}
+    version: {{ .Values.vgpuManager.version | quote }}
+    {{- end }}
+    {{- if .Values.vgpuManager.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.vgpuManager.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.vgpuManager.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.vgpuManager.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vgpuManager.resources }}
+    resources: {{ toYaml .Values.vgpuManager.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vgpuManager.env }}
+    env: {{ toYaml .Values.vgpuManager.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vgpuManager.args }}
+    args: {{ toYaml .Values.vgpuManager.args | nindent 6 }}
+    {{- end }}
+    driverManager:
+      {{- if .Values.vgpuManager.driverManager.repository }}
+      repository: {{ .Values.vgpuManager.driverManager.repository }}
+      {{- end }}
+      {{- if .Values.vgpuManager.driverManager.image }}
+      image: {{ .Values.vgpuManager.driverManager.image }}
+      {{- end }}
+      {{- if .Values.vgpuManager.driverManager.version }}
+      version: {{ .Values.vgpuManager.driverManager.version | quote }}
+      {{- end }}
+      {{- if .Values.vgpuManager.driverManager.imagePullPolicy }}
+      imagePullPolicy: {{ .Values.vgpuManager.driverManager.imagePullPolicy }}
+      {{- end }}
+      {{- if .Values.vgpuManager.driverManager.env }}
+      env: {{ toYaml .Values.vgpuManager.driverManager.env | nindent 8 }}
+      {{- end }}
+  kataManager:
+    enabled: {{ .Values.kataManager.enabled }}
+    config: {{ toYaml .Values.kataManager.config | nindent 6 }}
+    {{- if .Values.kataManager.repository }}
+    repository: {{ .Values.kataManager.repository }}
+    {{- end }}
+    {{- if .Values.kataManager.image }}
+    image: {{ .Values.kataManager.image }}
+    {{- end }}
+    {{- if .Values.kataManager.version }}
+    version: {{ .Values.kataManager.version | quote }}
+    {{- end }}
+    {{- if .Values.kataManager.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.kataManager.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.kataManager.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.kataManager.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.kataManager.resources }}
+    resources: {{ toYaml .Values.kataManager.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.kataManager.env }}
+    env: {{ toYaml .Values.kataManager.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.kataManager.args }}
+    args: {{ toYaml .Values.kataManager.args | nindent 6 }}
+    {{- end }}
+  vfioManager:
+    enabled: {{ .Values.vfioManager.enabled }}
+    {{- if .Values.vfioManager.repository }}
+    repository: {{ .Values.vfioManager.repository }}
+    {{- end }}
+    {{- if .Values.vfioManager.image }}
+    image: {{ .Values.vfioManager.image }}
+    {{- end }}
+    {{- if .Values.vfioManager.version }}
+    version: {{ .Values.vfioManager.version | quote }}
+    {{- end }}
+    {{- if .Values.vfioManager.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.vfioManager.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.vfioManager.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.vfioManager.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vfioManager.resources }}
+    resources: {{ toYaml .Values.vfioManager.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vfioManager.env }}
+    env: {{ toYaml .Values.vfioManager.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vfioManager.args }}
+    args: {{ toYaml .Values.vfioManager.args | nindent 6 }}
+    {{- end }}
+    driverManager:
+      {{- if .Values.vfioManager.driverManager.repository }}
+      repository: {{ .Values.vfioManager.driverManager.repository }}
+      {{- end }}
+      {{- if .Values.vfioManager.driverManager.image }}
+      image: {{ .Values.vfioManager.driverManager.image }}
+      {{- end }}
+      {{- if .Values.vfioManager.driverManager.version }}
+      version: {{ .Values.vfioManager.driverManager.version | quote }}
+      {{- end }}
+      {{- if .Values.vfioManager.driverManager.imagePullPolicy }}
+      imagePullPolicy: {{ .Values.vfioManager.driverManager.imagePullPolicy }}
+      {{- end }}
+      {{- if .Values.vfioManager.driverManager.env }}
+      env: {{ toYaml .Values.vfioManager.driverManager.env | nindent 8 }}
+      {{- end }}
+  vgpuDeviceManager:
+    enabled: {{ .Values.vgpuDeviceManager.enabled }}
+    {{- if .Values.vgpuDeviceManager.repository }}
+    repository: {{ .Values.vgpuDeviceManager.repository }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.image }}
+    image: {{ .Values.vgpuDeviceManager.image }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.version }}
+    version: {{ .Values.vgpuDeviceManager.version | quote }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.vgpuDeviceManager.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.vgpuDeviceManager.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.resources }}
+    resources: {{ toYaml .Values.vgpuDeviceManager.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.env }}
+    env: {{ toYaml .Values.vgpuDeviceManager.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.args }}
+    args: {{ toYaml .Values.vgpuDeviceManager.args | nindent 6 }}
+    {{- end }}
+    {{- if .Values.vgpuDeviceManager.config }}
+    config: {{ toYaml .Values.vgpuDeviceManager.config | nindent 6 }}
+    {{- end }}
+  ccManager:
+    enabled: {{ .Values.ccManager.enabled }}
+    defaultMode: {{ .Values.ccManager.defaultMode | quote }}
+    {{- if .Values.ccManager.repository }}
+    repository: {{ .Values.ccManager.repository }}
+    {{- end }}
+    {{- if .Values.ccManager.image }}
+    image: {{ .Values.ccManager.image }}
+    {{- end }}
+    {{- if .Values.ccManager.version }}
+    version: {{ .Values.ccManager.version | quote }}
+    {{- end }}
+    {{- if .Values.ccManager.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.ccManager.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.ccManager.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.ccManager.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.ccManager.resources }}
+    resources: {{ toYaml .Values.ccManager.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.ccManager.env }}
+    env: {{ toYaml .Values.ccManager.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.ccManager.args }}
+    args: {{ toYaml .Values.ccManager.args | nindent 6 }}
+    {{- end }}
+  toolkit:
+    enabled: {{ .Values.toolkit.enabled }}
+    {{- if .Values.toolkit.repository }}
+    repository: {{ .Values.toolkit.repository }}
+    {{- end }}
+    {{- if .Values.toolkit.image }}
+    image: {{ .Values.toolkit.image }}
+    {{- end }}
+    {{- if .Values.toolkit.version }}
+    version: {{ .Values.toolkit.version | quote }}
+    {{- end }}
+    {{- if .Values.toolkit.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.toolkit.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.toolkit.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.toolkit.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.toolkit.resources }}
+    resources: {{ toYaml .Values.toolkit.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.toolkit.env }}
+    env: {{ toYaml .Values.toolkit.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.toolkit.installDir }}
+    installDir: {{ .Values.toolkit.installDir }}
+    {{- end }}
+  devicePlugin:
+    enabled: {{ .Values.devicePlugin.enabled }}
+    {{- if .Values.devicePlugin.repository }}
+    repository: {{ .Values.devicePlugin.repository }}
+    {{- end }}
+    {{- if .Values.devicePlugin.image }}
+    image: {{ .Values.devicePlugin.image }}
+    {{- end }}
+    {{- if .Values.devicePlugin.version }}
+    version: {{ .Values.devicePlugin.version | quote }}
+    {{- end }}
+    {{- if .Values.devicePlugin.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.devicePlugin.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.devicePlugin.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.devicePlugin.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.devicePlugin.resources }}
+    resources: {{ toYaml .Values.devicePlugin.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.devicePlugin.env }}
+    env: {{ toYaml .Values.devicePlugin.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.devicePlugin.args }}
+    args: {{ toYaml .Values.devicePlugin.args | nindent 6 }}
+    {{- end }}
+    {{- if .Values.devicePlugin.config.name }}
+    config:
+      name: {{ .Values.devicePlugin.config.name }}
+      default: {{ .Values.devicePlugin.config.default }}
+    {{- end }}
+  dcgm:
+    enabled: {{ .Values.dcgm.enabled }}
+    {{- if .Values.dcgm.repository }}
+    repository: {{ .Values.dcgm.repository }}
+    {{- end }}
+    {{- if .Values.dcgm.image }}
+    image: {{ .Values.dcgm.image }}
+    {{- end }}
+    {{- if .Values.dcgm.version }}
+    version: {{ .Values.dcgm.version | quote }}
+    {{- end }}
+    {{- if .Values.dcgm.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.dcgm.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.dcgm.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.dcgm.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.dcgm.resources }}
+    resources: {{ toYaml .Values.dcgm.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.dcgm.env }}
+    env: {{ toYaml .Values.dcgm.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.dcgm.args }}
+    args: {{ toYaml .Values.dcgm.args | nindent 6 }}
+    {{- end }}
+  dcgmExporter:
+    enabled: {{ .Values.dcgmExporter.enabled }}
+    {{- if .Values.dcgmExporter.repository }}
+    repository: {{ .Values.dcgmExporter.repository }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.image }}
+    image: {{ .Values.dcgmExporter.image }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.version }}
+    version: {{ .Values.dcgmExporter.version | quote }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.dcgmExporter.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.dcgmExporter.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.resources }}
+    resources: {{ toYaml .Values.dcgmExporter.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.env }}
+    env: {{ toYaml .Values.dcgmExporter.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.args }}
+    args: {{ toYaml .Values.dcgmExporter.args | nindent 6 }}
+    {{- end }}
+    {{- if and (.Values.dcgmExporter.config) (.Values.dcgmExporter.config.name) }}
+    config:
+      name: {{ .Values.dcgmExporter.config.name }}
+    {{- end }}
+    {{- if .Values.dcgmExporter.serviceMonitor }}
+    serviceMonitor: {{ toYaml .Values.dcgmExporter.serviceMonitor | nindent 6 }}
+    {{- end }}
+  gfd:
+    enabled: {{ .Values.gfd.enabled }}
+    {{- if .Values.gfd.repository }}
+    repository: {{ .Values.gfd.repository }}
+    {{- end }}
+    {{- if .Values.gfd.image }}
+    image: {{ .Values.gfd.image }}
+    {{- end }}
+    {{- if .Values.gfd.version }}
+    version: {{ .Values.gfd.version | quote }}
+    {{- end }}
+    {{- if .Values.gfd.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.gfd.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.gfd.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.gfd.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.gfd.resources }}
+    resources: {{ toYaml .Values.gfd.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.gfd.env }}
+    env: {{ toYaml .Values.gfd.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.gfd.args }}
+    args: {{ toYaml .Values.gfd.args | nindent 6 }}
+    {{- end }}
+  migManager:
+    enabled: {{ .Values.migManager.enabled }}
+    {{- if .Values.migManager.repository }}
+    repository: {{ .Values.migManager.repository }}
+    {{- end }}
+    {{- if .Values.migManager.image }}
+    image: {{ .Values.migManager.image }}
+    {{- end }}
+    {{- if .Values.migManager.version }}
+    version: {{ .Values.migManager.version | quote }}
+    {{- end }}
+    {{- if .Values.migManager.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.migManager.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.migManager.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.migManager.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.migManager.resources }}
+    resources: {{ toYaml .Values.migManager.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.migManager.env }}
+    env: {{ toYaml .Values.migManager.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.migManager.args }}
+    args: {{ toYaml .Values.migManager.args | nindent 6 }}
+    {{- end }}
+    {{- if .Values.migManager.config }}
+    config:
+      name: {{ .Values.migManager.config.name }}
+      default: {{ .Values.migManager.config.default }}
+    {{- end }}
+    {{- if .Values.migManager.gpuClientsConfig }}
+    gpuClientsConfig: {{ toYaml .Values.migManager.gpuClientsConfig | nindent 6 }}
+    {{- end }}
+  nodeStatusExporter:
+    enabled: {{ .Values.nodeStatusExporter.enabled }}
+    {{- if .Values.nodeStatusExporter.repository }}
+    repository: {{ .Values.nodeStatusExporter.repository }}
+    {{- end }}
+    {{- if .Values.nodeStatusExporter.image }}
+    image: {{ .Values.nodeStatusExporter.image }}
+    {{- end }}
+    version: {{ .Values.nodeStatusExporter.version | default .Chart.AppVersion | quote }}
+    {{- if .Values.nodeStatusExporter.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.nodeStatusExporter.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.nodeStatusExporter.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.nodeStatusExporter.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.nodeStatusExporter.resources }}
+    resources: {{ toYaml .Values.nodeStatusExporter.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.nodeStatusExporter.env }}
+    env: {{ toYaml .Values.nodeStatusExporter.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.nodeStatusExporter.args }}
+    args: {{ toYaml .Values.nodeStatusExporter.args | nindent 6 }}
+    {{- end }}
+  {{- if .Values.gds.enabled }}
+  gds:
+    enabled: {{ .Values.gds.enabled }}
+    {{- if .Values.gds.repository }}
+    repository: {{ .Values.gds.repository }}
+    {{- end }}
+    {{- if .Values.gds.image }}
+    image: {{ .Values.gds.image }}
+    {{- end }}
+    version: {{ .Values.gds.version | quote }}
+    {{- if .Values.gds.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.gds.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.gds.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.gds.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.gds.env }}
+    env: {{ toYaml .Values.gds.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.gds.args }}
+    args: {{ toYaml .Values.gds.args | nindent 6 }}
+    {{- end }}
+  {{- end }}
+  {{- if .Values.gdrcopy }}
+  gdrcopy:
+    enabled: {{ .Values.gdrcopy.enabled | default false }}
+    {{- if .Values.gdrcopy.repository }}
+    repository: {{ .Values.gdrcopy.repository }}
+    {{- end }}
+    {{- if .Values.gdrcopy.image }}
+    image: {{ .Values.gdrcopy.image }}
+    {{- end }}
+    version: {{ .Values.gdrcopy.version | quote }}
+    {{- if .Values.gdrcopy.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.gdrcopy.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.gdrcopy.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.gdrcopy.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.gdrcopy.env }}
+    env: {{ toYaml .Values.gdrcopy.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.gdrcopy.args }}
+    args: {{ toYaml .Values.gdrcopy.args | nindent 6 }}
+    {{- end }}
+  {{- end }}
+  sandboxWorkloads:
+    enabled: {{ .Values.sandboxWorkloads.enabled }}
+    {{- if .Values.sandboxWorkloads.defaultWorkload }}
+    defaultWorkload: {{ .Values.sandboxWorkloads.defaultWorkload }}
+    {{- end }}
+  sandboxDevicePlugin:
+    {{- if .Values.sandboxDevicePlugin.enabled }}
+    enabled: {{ .Values.sandboxDevicePlugin.enabled }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.repository }}
+    repository: {{ .Values.sandboxDevicePlugin.repository }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.image }}
+    image: {{ .Values.sandboxDevicePlugin.image }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.version }}
+    version: {{ .Values.sandboxDevicePlugin.version | quote }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.imagePullPolicy }}
+    imagePullPolicy: {{ .Values.sandboxDevicePlugin.imagePullPolicy }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.imagePullSecrets }}
+    imagePullSecrets: {{ toYaml .Values.sandboxDevicePlugin.imagePullSecrets | nindent 6 }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.resources }}
+    resources: {{ toYaml .Values.sandboxDevicePlugin.resources | nindent 6 }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.env }}
+    env: {{ toYaml .Values.sandboxDevicePlugin.env | nindent 6 }}
+    {{- end }}
+    {{- if .Values.sandboxDevicePlugin.args }}
+    args: {{ toYaml .Values.sandboxDevicePlugin.args | nindent 6 }}
+    {{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/clusterrole.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/clusterrole.yaml
@@ -0,0 +1,155 @@
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: gpu-operator
+  labels:
+    {{- include "gpu-operator.labels" . | nindent 4 }}
+    app.kubernetes.io/component: "gpu-operator"
+rules:
+- apiGroups:
+  - config.openshift.io
+  resources:
+  - clusterversions
+  - proxies
+  verbs:
+  - get
+  - list
+  - watch
+- apiGroups:
+  - image.openshift.io
+  resources:
+  - imagestreams
+  verbs:
+  - get
+  - list
+  - watch
+- apiGroups:
+  - security.openshift.io
+  resources:
+  - securitycontextconstraints
+  verbs:
+  - create
+  - get
+  - list
+  - watch
+  - update
+  - patch
+  - delete
+  - use
+- apiGroups:
+  - rbac.authorization.k8s.io
+  resources:
+  - clusterroles
+  - clusterrolebindings
+  verbs:
+  - create
+  - get
+  - list
+  - watch
+  - update
+  - patch
+  - delete
+- apiGroups:
+  - ""
+  resources:
+  - nodes
+  verbs:
+  - get
+  - list
+  - watch
+  - update
+  - patch
+- apiGroups:
+  - ""
+  resources:
+  - namespaces
+  verbs:
+  - get
+  - list
+  - watch
+  - update
+  - patch
+- apiGroups:
+  - ""
+  resources:
+  - events
+  verbs:
+  - create
+  - get
+  - list
+  - watch
+  - delete
+- apiGroups:
+  - ""
+  resources:
+  - pods
+  verbs:
+  - get
+  - list
+  - watch
+- apiGroups:
+  - ""
+  resources:
+  - pods/eviction
+  verbs:
+  - create
+- apiGroups:
+  - apps
+  resources:
+  - daemonsets
+  verbs:
+  - get
+  - list
+  - watch
+- apiGroups:
+  - nvidia.com
+  resources:
+  - clusterpolicies
+  - clusterpolicies/finalizers
+  - clusterpolicies/status
+  - nvidiadrivers
+  - nvidiadrivers/finalizers
+  - nvidiadrivers/status
+  verbs:
+  - create
+  - get
+  - list
+  - watch
+  - update
+  - patch
+  - delete
+  - deletecollection
+- apiGroups:
+  - scheduling.k8s.io
+  resources:
+  - priorityclasses
+  verbs:
+  - get
+  - list
+  - watch
+  - create
+- apiGroups:
+  - node.k8s.io
+  resources:
+  - runtimeclasses
+  verbs:
+  - get
+  - list
+  - create
+  - update
+  - watch
+  - delete
+- apiGroups:
+  - apiextensions.k8s.io
+  resources:
+  - customresourcedefinitions
+  verbs:
+  - get
+  - list
+  - watch
+  - update
+  - patch
+  - create
+{{- if .Values.operator.cleanupCRD }}
+  - delete
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/clusterrolebinding.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/clusterrolebinding.yaml
@@ -0,0 +1,15 @@
+kind: ClusterRoleBinding
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  name: gpu-operator
+  labels:
+    {{- include "gpu-operator.labels" . | nindent 4 }}
+    app.kubernetes.io/component: "gpu-operator"
+subjects:
+- kind: ServiceAccount
+  name: gpu-operator
+  namespace: {{ $.Release.Namespace }}
+roleRef:
+  kind: ClusterRole
+  name: gpu-operator
+  apiGroup: rbac.authorization.k8s.io
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/dcgm_exporter_config.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/dcgm_exporter_config.yaml
@@ -0,0 +1,14 @@
+{{- if .Values.dcgmExporter.config }}
+{{- if and (.Values.dcgmExporter.config.create) (not (empty .Values.dcgmExporter.config.data)) }}
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ .Values.dcgmExporter.config.name }}
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "gpu-operator.labels" . | nindent 4 }}
+data:
+  dcgm-metrics.csv: |
+{{- .Values.dcgmExporter.config.data | nindent 4 }}
+{{- end }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/mig_config.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/mig_config.yaml
@@ -0,0 +1,10 @@
+{{- if and (.Values.migManager.config.create) (not (empty .Values.migManager.config.data)) }}
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: {{ .Values.migManager.config.name }}
+  namespace: {{ .Release.Namespace }}
+  labels:
+    {{- include "gpu-operator.labels" . | nindent 4 }}
+data: {{ toYaml .Values.migManager.config.data | nindent 2 }}
+{{- end }}
--- a/packages/system/gpu-operator/charts/gpu-operator/templates/nodefeaturerules.yaml
+++ b/packages/system/gpu-operator/charts/gpu-operator/templates/nodefeaturerules.yaml
@@ -0,0 +1,107 @@
+{{- if .Values.nfd.nodefeaturerules }}
+apiVersion: nfd.k8s-sigs.io/v1alpha1
+kind: NodeFeatureRule
+metadata:
+  name: nvidia-nfd-nodefeaturerules
+spec:
+  rules:
+    - name: "TDX rule"
+      labels:
+        tdx.enabled: "true"
+      matchFeatures:
+        - feature: cpu.security
+          matchExpressions:
+            tdx.enabled: {op: IsTrue}
+    - name: "TDX total keys rule"
+      extendedResources:
+        tdx.total_keys: "@cpu.security.tdx.total_keys"
+      matchFeatures:
+        - feature: cpu.security
+          matchExpressions:
+            tdx.enabled: {op: IsTrue}
+    - name: "SEV-SNP rule"
+      labels:
+        sev.snp.enabled: "true"
+      matchFeatures:
+      - feature: cpu.security
+        matchExpressions:
+          sev.snp.enabled:
+            op: IsTrue
+    - name: "SEV-ES rule"
+      labels:
+        sev.es.enabled: "true"
+      matchFeatures:
+      - feature: cpu.security
+        matchExpressions:
+          sev.es.enabled:
+            op: IsTrue
+    - name: SEV system capacities
+      extendedResources:
+        sev_asids: '@cpu.security.sev.asids'
+        sev_es: '@cpu.security.sev.encrypted_state_ids'
+      matchFeatures:
+      - feature: cpu.security
+        matchExpressions:
+          sev.enabled:
+            op: Exists
+    - name: "NVIDIA H100"
+      labels:
+        "nvidia.com/gpu.H100": "true"
+        "nvidia.com/gpu.family": "hopper"
+      matchFeatures:
+        - feature: pci.device
+          matchExpressions:
+            vendor: {op: In, value: ["10de"]}
+            device: {op: In, value: ["2339"]}
+    - name: "NVIDIA H100 PCIe"
+      labels:
+        "nvidia.com/gpu.H100.pcie": "true"
+        "nvidia.com/gpu.family": "hopper"
+      matchFeatures:
+        - feature: pci.device
+          matchExpressions:
+            vendor: {op: In, value: ["10de"]}
+            device: {op: In, value: ["2331"]}
+    - name: "NVIDIA H100 80GB HBM3"
+      labels:
+        "nvidia.com/gpu.H100.HBM3": "true"
+        "nvidia.com/gpu.family": "hopper"
+      matchFeatures:
+        - feature: pci.device
+          matchExpressions:
+            vendor: {op: In, value: ["10de"]}
+            device: {op: In, value: ["2330"]}
+    - name: "NVIDIA H800"
+      labels:
+        "nvidia.com/gpu.H800": "true"
+        "nvidia.com/gpu.family": "hopper"
+      matchFeatures:
+        - feature: pci.device
+          matchExpressions:
+            vendor: {op: In, value: ["10de"]}
+            device: {op: In, value: ["2324"]}
+    - name: "NVIDIA H800 PCIE"
+      labels:
+        "nvidia.com/gpu.H800.pcie": "true"
+        "nvidia.com/gpu.family": "hopper"
+      matchFeatures:
+        - feature: pci.device
+          matchExpressions:
+            vendor: {op: In, value: ["10de"]}
+            device: {op: In, value: ["2322"]}
+    - name: "NVIDIA CC Enabled"
+      labels:
+        "nvidia.com/cc.capable": "true"
+      matchAny: # TDX/SEV + Hopper GPU
+       - matchFeatures:
+          - feature: rule.matched
+            matchExpressions:
+              nvidia.com/gpu.family: {op: In, value: ["hopper"]}
+              sev.snp.enabled: {op: IsTrue}
+       - matchFeatures:
+          - feature: rule.matched
+            matchExpressions:
+              nvidia.com/gpu.family: {op: In, value: ["hopper"]}
+              tdx.enabled: {op: IsTrue}
+{{- end }}
+
Author	SHA1	Message	Date
kklinch0	39b31ca9e5	fix updateStatus field	2025-04-10 14:28:29 +03:00
kklinch0	db7c591957	fix image tag for victorialogs	2025-04-10 14:04:45 +03:00
kklinch0	5baa48022e	fix	2025-04-10 11:58:50 +03:00
Andrei Kvapil	1234872bda	Upd: Kube-OVN to v1.13.6	2025-04-10 11:58:50 +03:00
Andrei Kvapil	6afb1aad03	Upd: Cilium to v1.17.2 Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 11:58:50 +03:00
Andrei Kvapil	ad8e09bb35	Upd: Kamaji to v0.9.2 Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 11:58:50 +03:00
Andrei Kvapil	e8faf193eb	Upd: Keycloak-operator to v1.25.0 Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 11:58:50 +03:00
Andrei Kvapil	2393e3427c	Update Cluster-API operator to v0.18.1 Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 11:58:50 +03:00
Andrei Kvapil	ddb237718b	Upd: victoria-metrics operator to v0.55.0 Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 11:58:50 +03:00
Andrei Kvapil	ae619953fb	[tests] Fix e2e tests (dependencies and timeouts) Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 11:58:50 +03:00
Andrei Kvapil	434c5d1b9c	[ci] Add talos-kernel and talos-initramfs to assets (#784) Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 10:55:23 +02:00
Andrei Kvapil	cc9abfe03f	[ci] Add talos-kernel and talos-initramfs to assets Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 10:54:54 +02:00
Andrei Kvapil	e02fd14a3c	Fix: versions_map, use awk instead of grep (#780) Signed-off-by: Andrei Kvapil <kvapss@gmail.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Enhanced the internal version verification process to ensure improved precision and reliability in version validation. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-04-10 10:42:52 +02:00
Andrei Kvapil	559eb8dea9	Fix: versions_map, use awk instead of grep Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 10:42:31 +02:00
Andrei Kvapil	9e6478b9c9	[linstor] Add plunger check for disconnected DRBD peers. (#707) Sometimes DRBD devices get stuck in "Connecting" state, probably due to some race conditions. This scriptlet provides a workaround for such situations.	2025-04-10 10:38:29 +02:00
Andrei Kvapil	3a295c4474	Add guard against empty cloudInit in vm-instance app (#646 ) Prevent the VM resource from referencing a non-existent secret when `sshKeys` are set and `cloudInit` is set to empty. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Improved cloud-init configuration handling with conditional logic and clearer error messaging when expected configuration values are missing. - Documentation - Refined virtual machine configuration guides by reformatting parameter tables and correcting typographical errors in parameter descriptions. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-04-10 10:37:50 +02:00
Andrei Kvapil	f8dfc43cae	[e2e] Add mirror.gcr.io as default mirror for docker.io (#782 ) related issues: - https://github.com/cozystack/talm/pull/48 - https://github.com/cozystack/website/pull/154 - https://github.com/cozystack/cozystack/pull/782 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Introduced additional configuration options that enable using Docker image mirrors. This enhancement can improve image retrieval performance and provide redundancy while maintaining the existing functionality. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-04-10 10:36:23 +02:00
Andrei Kvapil	3e19bc74d4	[tests] Add mirror.gcr.io as default mirror for docker.io Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-10 10:36:02 +02:00
klinch0	2966922c0b	feat(vpa): separate-crds (#781 ) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Improved autoscaling deployment by integrating an additional component for managing custom resource definitions. - Enhanced dependency management now ensures critical prerequisites are deployed in the correct order. - Introduced an automated update mechanism to keep resource definitions current. - Added a new configuration option, giving users the flexibility to enable or disable custom resource definitions as needed. - Introduced two new Custom Resource Definitions: `VerticalPodAutoscalerCheckpoint` and `VerticalPodAutoscaler`. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-04-10 11:35:36 +03:00
Denis Seleznev	991c7e1943	Handle empty cloudInit. Add a no-op user-data when sshKeys are specified. Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>	2025-04-10 10:07:43 +02:00
kklinch0	c31a7710ad	feat(vpa): separate-crds Signed-off-by: kklinch0 <kklinch0@gmail.com>	2025-04-10 10:57:50 +03:00
Andrei Kvapil	f4cace093c	Add a setting to VMs that allows users to trigger cloud-init full reconfiguration. (#767) This will trigger cloud-init reinitialization, including ssh keys update and static network config refresh.	2025-04-10 09:20:55 +02:00
Denis Seleznev	01e417d436	Add Linstor plunger scriptlet to fix DRBD devices that are stuck disconnected. Sometimes DRBD devices get stuck in "Connecting" state, probably due to some race conditions. This scriptlet provides a workaround for such situations. Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>	2025-04-10 03:49:23 +02:00
Denis Seleznev	261ce4278f	Add a setting to VMs that allows users to trigger cloud-init full reconfiguration. Changing `cloudInitSeed` will trigger cloud-init reinitialization, including ssh keys update and static network config refresh. Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>	2025-04-09 20:48:18 +02:00
Timofei Larkin	785898b507	Delete a Workload if the related object is absent (#779) Workload object counts were previously getting out of control as the recreation of a related Pod would spawn a new workload, while the old one would never get deleted (except for StatefulSets, where the names of Pods are stable). Workloads without a matching object are now deleted.	2025-04-09 21:02:22 +04:00
Timofei Larkin	47a2cf7cd5	Track public IP usage (#769) Like the existing behavior for Pods and the recently merged behavior for PVCs, the WorkloadMonitor controller now creates Workload objects for Services with Type==LoadBalancer to keep track of public IP reservations.	2025-04-09 21:00:35 +04:00
Timofei Larkin	1f19793613	Merge branch 'main' into 176-track-ips	2025-04-09 20:26:29 +04:00
Timofei Larkin	a0df2989af	Track public IP usage Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>	2025-04-09 19:24:36 +03:00
Timofei Larkin	bdb538ab42	Track PVCs with WorkloadMonitor (#768) The WorkloadMonitor controller now also watches PVCs, just like it has been watching Pods and creates Workloads per PVC according to the `spec.selector` field to track the used storage space.	2025-04-09 20:01:13 +04:00
klinch0	c844a4fb2b	Fix: versions_map, include only versions from tags (#777 ) Signed-off-by: Andrei Kvapil <kvapss@gmail.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Bug Fixes - Improved error handling so that missing chart versions no longer halt processing, ensuring smoother operations. - Chores - Simplified the version tag lookup to rely solely on remote repository tags for increased consistency and reliability. - Updated the Kafka application version from `0.5.0` to `0.5.2`. - Adjusted versioning information for the Kafka package to reflect fixed commit references. - Streamlined the pre-commit workflow by removing unnecessary steps and logging. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-04-09 18:46:58 +03:00
Timofei Larkin	fea142774a	Delete a Workload if the related object is absent Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>	2025-04-09 18:36:11 +03:00
Andrei Kvapil	4b575299bc	Log verbose state for DRBD devices that are not healthy. (#771) This will help troubleshoot issues that occurred in the past but have already been resolved.	2025-04-09 14:16:53 +02:00
Andrei Kvapil	4eec016f7d	Merge pull request #757 from jokeOps/main kubevirt for able to run CX or RT type of instances.	2025-04-09 14:12:31 +02:00
Andrei Kvapil	4078b21ac6	Merge branch 'main' into main	2025-04-09 14:09:59 +02:00
Andrei Kvapil	1721d397a7	Fix: versions_map, include only versions from tags Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-09 14:07:07 +02:00
Andrei Kvapil	558a0572f5	Merge pull request #776 from cozystack/bugfix/fix_version_map fix version map	2025-04-09 14:06:31 +02:00
kklinch0	d60b81c8a0	fix version map Signed-off-by: kklinch0 <kklinch0@gmail.com>	2025-04-09 14:44:42 +03:00
Timofei Larkin	cc14c1fbab	Track PVCs with WorkloadMonitor Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>	2025-04-09 14:09:36 +03:00
Nick Volynkin	80aee1354b	Merge pull request #774 from cozystack/update-readme * [docs] Update links after restructuring docs Follow-up to cozystack/website#138 * [docs] Proofread the readme and contributing Fix a few errors here and there. * [ci] Run pre-commit checks once on PRs Pre-commit checks used to trigger twice on PRs: for `push` and `pull_request` triggers. Now they will only run on `push` to the main branch and on regular updates to pull requests, except for those that only change the documentation. Note that pushes to feature branches will not trigger this check until a PR was opened.	2025-04-09 11:47:43 +03:00
Andrei Kvapil	332d69259b	Merge pull request #766 from cozystack/vm-gpu [virtual-machine] Add GPU support	2025-04-09 10:40:52 +02:00
Andrei Kvapil	9ad6b0d726	[virtual-machine] Add GPU support Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-09 10:39:49 +02:00
Andrei Kvapil	ea9df9e371	Merge pull request #765 from cozystack/gpu-operator [gpu-operator] Introduce GPU-operator	2025-04-09 10:37:52 +02:00
Nick Volynkin	d69a9c4862	[docs] Proofread the readme and contributing Fix a few errors here and there. Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>	2025-04-09 11:09:19 +03:00
Nick Volynkin	6270a11bb1	[docs] Update links after restructuring docs Follow-up to cozystack/website#138 Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>	2025-04-09 11:09:18 +03:00
Nick Volynkin	18726483a6	[ci] Run pre-commit checks once on PRs Pre-commit checks used to trigger twice on PRs: for `push` and `pull_request` triggers. Now they will only run on `push` to the main branch and on regular updates to pull requests, except for those that only change the documentation. Note that pushes to feature branches will not trigger this check until a PR was opened. Signed-off-by: Nick Volynkin <nick.volynkin@gmail.com>	2025-04-09 11:07:38 +03:00
Denis Seleznev	aed184f6ef	Log verbose state for DRBD devices that are not healthy. This will help troubleshoot issues that occurred in the past but have already been resolved. Signed-off-by: Denis Seleznev <kto.3decb@gmail.com>	2025-04-09 03:46:37 +02:00
Andrei Kvapil	f688a57132	Merge pull request #773 from cozystack/upload-vmlinuz-and-initramfs Upload kernel and initramfs to release assets	2025-04-08 23:31:34 +02:00
Andrei Kvapil	e954ab7f8b	Upload kernel and initramfs to release assets Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-08 23:27:37 +02:00
Timofei Larkin	c9c8235c64	Merge pull request #772 from klinch0/monitoring-add-vpa-for-vmagent [monitoring] add vpa for vmagent	2025-04-08 17:48:38 +04:00
kklinch0	8e2e77da56	[monitoring] add vpa for vmagent Signed-off-by: kklinch0 <kklinch0@gmail.com>	2025-04-08 16:40:39 +03:00
Andrei Kvapil	1e27dedde5	[gpu-operator] Introduce GPU-operator Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-08 14:03:52 +02:00
Timofei Larkin	e947805c15	Track PVCs with WorkloadMonitor Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>	2025-04-08 11:44:36 +03:00
Pavlo Gaponuk	7a1c3b6209	Need for CX or RX type of instances Signed-off-by: Pavlo Gaponuk <pashagaponuk@gmail.com>	2025-04-07 19:03:04 +02:00
Andrei Kvapil	49b5b510ee	Merge pull request #758 from klinch0/k8s-change-CP-default-resourcesPreset [k8s] change CP default resourcesPreset	2025-04-05 21:35:11 +02:00
kklinch0	3cf850c2c4	[k8s] change CP default resourcesPreset Signed-off-by: kklinch0 <kklinch0@gmail.com>	2025-04-05 21:31:17 +03:00
Andrei Kvapil	1fbbfcd063	[ci] Rename workflows Signed-off-by: Andrei Kvapil <kvapss@gmail.com>	2025-04-03 17:05:19 +02:00