kubernetes

mirror of https://github.com/outbackdingo/kubernetes.git synced 2026-01-27 10:19:35 +00:00

Author	SHA1	Message	Date
Sunyanan Choochotkaew	7f052afaef	KEP 5075: implement scheduler Signed-off-by: Sunyanan Choochotkaew <sunyanan.choochotkaew1@ibm.com>	2025-07-30 09:52:49 +09:00
Patrick Ohly	5c4f81743c	DRA: use v1 API As before when adding v1beta2, DRA drivers built using the k8s.io/dynamic-resource-allocation helper packages remain compatible with all Kubernetes releases >= 1.32. The helper code picks whatever API version is enabled from v1beta1/v1beta2/v1. However, the control plane now depends on v1, so a cluster configuration where only v1beta1 or v1beta2 is enabled without v1 won't work.	2025-07-24 08:33:45 +02:00
Kubernetes Prow Robot	0617903e9d	Merge pull request #131344 from pohly/dra-taint-unit-test-flake-minimal DRA: work around fake.ClientSet informer deficiency in unit test	2025-07-03 02:51:25 -07:00
Davanum Srinivas	03afe6471b	Add a replacement for cmp.Diff using json+go-difflib Co-authored-by: Jordan Liggitt <jordan@liggitt.net> Signed-off-by: Davanum Srinivas <davanum@gmail.com>	2025-06-16 17:10:42 -04:00
Patrick Ohly	ff108e72a5	DRA device taints: fix rare unit test flake TestCancelEviction flaked with a 0.01% rate because it assumed that an event had already been created once the pod was updated, but that was only true under some timing conditions.	2025-04-17 17:16:23 +02:00
Patrick Ohly	ff2e6dddc8	DRA device taints: work around fake.Clientset informer race fake.Clientset suffers from a race condition related to informers: it does not implement resource version support in its Watch implementation and instead assumes that watches are set up before further changes are made. If a test waits for caches to be synced and then immediately adds an object, that new object will never be seen by event handlers if the race goes wrong and the Watch call hasn't completed yet (can be triggered by adding a sleep before `b53b9fb557/staging/src/k8s.io/client-go/tools/cache/reflector.go (L431)`). To work around this, we count all watches and only proceed when all of them are in place. This replaces the normal watch reactor (`b53b9fb557/staging/src/k8s.io/client-go/kubernetes/fake/clientset_generated.go (L161-L173)`).	2025-04-17 10:57:27 +02:00
Patrick Ohly	638abf0339	DRA device taints: more logging in test	2025-04-17 10:55:13 +02:00
Patrick Ohly	40f2085d68	DRA device taint: clean up test initialization The creation of the shared informer factory and starting it can be done all in the same function, which makes it a bit more obvious what happens in which order and avoids some code duplication.	2025-04-17 10:55:13 +02:00
Patrick Ohly	56adcd06f3	DRA device eviction: fix eviction triggered by pod scheduling Normally the scheduler shouldn't schedule when there is a taint, but perhaps it didn't know yet. The TestEviction/update test covered this, but only failed under the right timing conditions. The new event handler test case covers it reliably.	2025-03-20 19:49:54 +01:00
Patrick Ohly	5856d3ee6f	DRA taint eviction: fix waiting in unit test Events get recorded in the apiserver asynchronously, so even if the test knows that the pod has been evicted because the pod is deleted, it still has to also check for the event to be recorded. This caused a flake in the "Consistently" check of events.	2025-03-20 17:59:48 +01:00
Patrick Ohly	ac6e47cb14	DRA taint eviction: improve error handling There was one error path that led to a "controller has shut down" log message. Other errors caused different log entries or are so unlikely (event handler registration failure!) that they weren't checked at all. It's clearer to let Run return an error in all cases and then log the "controller has shut down" error at the call site. This also enables tests to mark themselves as failed, should that ever happen.	2025-03-20 17:59:06 +01:00
Patrick Ohly	9f161590be	metrics testing: add type aliases to avoid direct prometheus imports In tests it is sometimes unavoidable to use the Prometheus types directly, for example when writing a custom gatherer which needs to normalize data before testing it. device_taint_eviction_test.go does this to strip out unpredictable data in a histogram. With type aliases in a package that is explicitly meant for tests we can avoid adding exceptions for such tests to the global exception list.	2025-03-19 09:18:38 +01:00
Patrick Ohly	a027b439e5	DRA: add device taint eviction controller The controller is derived from the node taint eviction controller. In contrast to that controller it tracks the UID of pods to prevent deleting the wrong pod when it got replaced.	2025-03-19 09:18:38 +01:00
Patrick Ohly	13d04d4a92	DRA device taints: copy taintseviction controller This is a verbatim copy of the current pkg/controller/taintseviction code, revision `fc268ecd09` (v1.33.0 plus one commit), minus the TimedWorker helper. The intent is to modify the code such that it enforces eviction of pods which use tainted devices.	2025-03-18 20:52:54 +01:00

14 Commits