5 Commits

Author SHA1 Message Date
Ed Bartosh
4ee7374b24 DRA kubelet: add connection monitoring
This ensures that ResourceSlices also get removed when a plugin becomes
unresponsive without removing the registration socket.

Tests are from https://github.com/kubernetes/kubernetes/pull/131073 by Ed
with some modifications; the implementation is new.
2025-06-24 10:42:41 +02:00
Patrick Ohly
582b421393 DRA kubeletplugin: add RollingUpdate
When the new RollingUpdate option is used, the DRA driver gets deployed such
that it uses unique socket paths and uses file locking to serialize gRPC
calls. This enables the kubelet to pick arbitrarily between two concurrently
running instances. The handover is seamless (no downtime, no removal of ResourceSlices
by the kubelet).

For file locking, the fileutil package from etcd is used because that was
already a Kubernetes dependency. Unfortunately that package brings in some
additional indirect dependencies for DRA drivers (zap, multierr), but those
seem acceptable.
2025-03-18 12:32:35 +01:00
Patrick Ohly
b471c2c11f DRA kubelet: support rolling upgrades
The key difference is that the kubelet must remember all plugin instances
because it could always happen that the new instance dies and leaves only the
old one running.

The endpoints of each instance must be different. Registering a plugin with the
same endpoint as some other instance is not supported and triggers an error,
which should get reported as "not registered" to the plugin. This should only
happen when the kubelet missed some unregistration event and re-registers the
same instance again. The recovery in this case is for the plugin to shut down,
remove its socket, which should get observed by kubelet, and then try again
after a restart.
2025-03-18 12:32:35 +01:00
Patrick Ohly
0490b9f0b7 kubelet: document seamless upgrade support and guidance
This tries to capture the current state of affairs and a potential plan for
supporting seamless upgrades better.
2025-03-17 14:43:08 +01:00
Tara Gu
5e18554442 Implement plugin manager - a controller that manages plugin registration/unregistration
2019-05-30 19:00:59 -04:00