It hasn't been on by default before, so it does not get locked to the new
default of on yet. This has some impact on the scheduler configuration because
the plugin is now enabled by default.
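Clusters which do not want the plugin therefore have to disable it explicitly
in their scheduler configuration. A minimal sketch of what that can look like
with the v1 kube-scheduler configuration API; the plugin name is a placeholder,
not the real one:

```go
package main

import (
	"fmt"

	configv1 "k8s.io/kube-scheduler/config/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// Sketch: explicitly disable a plugin that is now enabled by default.
	// "ExamplePlugin" is a placeholder, not the actual plugin name.
	cfg := configv1.KubeSchedulerConfiguration{
		Profiles: []configv1.KubeSchedulerProfile{{
			Plugins: &configv1.Plugins{
				MultiPoint: configv1.PluginSet{
					Disabled: []configv1.Plugin{{Name: "ExamplePlugin"}},
				},
			},
		}},
	}
	cfg.APIVersion = "kubescheduler.config.k8s.io/v1"
	cfg.Kind = "KubeSchedulerConfiguration"
	out, err := yaml.Marshal(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```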
Because the feature is now GA, it no longer needs to be a label on E2E tests;
that wouldn't be possible anyway once the feature gate gets removed entirely.
As before when adding v1beta2, DRA drivers built using the
k8s.io/dynamic-resource-allocation helper packages remain compatible with all
Kubernetes releases >= 1.32. The helper code picks whatever API version is
enabled from v1beta1/v1beta2/v1.
However, the control plane now depends on v1, so a cluster configuration where
only v1beta1 or v1beta2 is enabled without v1 won't work.
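Whether v1 is actually served can be checked through normal API discovery; a
small sketch using client-go (kubeconfig handling and error handling kept
minimal):

```go
package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Sketch: list the served versions of the resource.k8s.io group.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		panic(err)
	}
	groups, err := client.ServerGroups()
	if err != nil {
		panic(err)
	}
	for _, group := range groups.Groups {
		if group.Name != "resource.k8s.io" {
			continue
		}
		for _, version := range group.Versions {
			// The control plane needs v1 in this list, the helper
			// packages accept v1beta1, v1beta2, or v1.
			fmt.Println(version.GroupVersion)
		}
	}
}
```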
Calling klog.FlushAndExit causes the `go test` binary to quit
without properly recording which test failed. Both callers of
StartScheduler already have a ktesting.TContext, so switching
to that is easy and also reduces the number of parameters.
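A minimal sketch of the resulting pattern; the startScheduler helper below is a
stand-in, only ktesting.Init and TContext come from
k8s.io/kubernetes/test/utils/ktesting:

```go
package integration

import (
	"testing"

	"k8s.io/kubernetes/test/utils/ktesting"
)

func TestSchedulerStartup(t *testing.T) {
	// ktesting.Init wraps *testing.T in a TContext which bundles a
	// context, a logger, and the usual Fatal/Error helpers.
	tCtx := ktesting.Init(t)
	startScheduler(tCtx)
}

// startScheduler stands in for the real StartScheduler helper: instead of
// klog.FlushAndExit on fatal errors, failures go through tCtx.Fatalf, which
// `go test` records against the right test.
func startScheduler(tCtx ktesting.TContext) {
	if err := doStart(tCtx); err != nil {
		tCtx.Fatalf("starting scheduler: %v", err)
	}
}

func doStart(tCtx ktesting.TContext) error { return nil }
```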
This covers disabling the feature via the configuration, failing to schedule
because of timeouts for all nodes, and retrying after ResourceSlice changes with
partial success (timeout for one node, success for the other).
While at it, some helper code gets improved.
It's unclear why k8s.io/kubernetes/pkg/apis/resource/install needs
to be imported explicitly. Having the apiserver and scheduler ready
to be started ensures that all APIs are available.
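For reference, the import in question is a plain blank import whose init
function registers the API group, roughly:

```go
package dra

import (
	// The install package's init function registers all versions of the
	// resource API group with the scheme used by the test apiserver.
	_ "k8s.io/kubernetes/pkg/apis/resource/install"
)
```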
If a ResourceSlice got published by the ResourceSlice controller in a DRA
driver and then that ResourceSlice got deleted quickly (within one minute, the
mutation cache TTL) by someone (for example, the kubelet because of a restart),
then the controller did not react properly to the deletion unless some other
event triggered the syncing of the pool.
Found while adding upgrade/downgrade tests with a driver which keeps running
across the upgrade/downgrade.
The exact sequence leading to this was:
- controller adds ResourceSlice, schedules a sync for one minute in the future (the TTL)
- someone else deletes the ResourceSlice
- add and delete events schedule another sync 30 seconds in the future (the delay),
*overwriting* the other scheduled sync
- sync runs once, finds deleted slices in the mutation cache,
does not re-create them, and also does not run again
One possible fix would be to set a resync period. But then work is done
periodically, whether it's necessary or not.
Another fix is to ensure that the TTL is shorter than the delay. Then when a
sync occurs, all locally stored additional slices are expired. But that renders
the whole storing of recently created slices in the cache pointless.
So the fix used here is to keep track of when another sync has to run because
of added slices. At the end of each sync, the next sync gets scheduled if (and
only if) needed, until eventually syncing can stop.
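A rough sketch of that pattern with made-up names, not the actual controller
code: remember when the locally cached slices expire and only re-queue the pool
while something is still pending.

```go
package resourceslice

import (
	"time"

	"k8s.io/client-go/util/workqueue"
)

// syncState is a made-up stand-in for the controller's bookkeeping.
type syncState struct {
	queue workqueue.TypedDelayingInterface[string]
	// expiries holds, per pool, when recently created slices fall out of
	// the mutation cache.
	expiries map[string][]time.Time
}

// finishSync runs at the end of a pool sync. Instead of relying on the
// one-shot schedule set when a slice was created (which add/delete events
// could overwrite), it re-schedules the next sync itself, if and only if
// there is still something to check.
func (s *syncState) finishSync(pool string, now time.Time) {
	var next time.Time
	for _, expiry := range s.expiries[pool] {
		if expiry.After(now) && (next.IsZero() || expiry.Before(next)) {
			next = expiry
		}
	}
	if next.IsZero() {
		// Nothing pending anymore, syncing can stop.
		return
	}
	s.queue.AddAfter(pool, next.Sub(now))
}
```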
The v1alpha3 version is still needed for DeviceTaintRule, but the rest of the
types and most structs became obsolete in v1.32 when we introduced v1beta1 and
bumped the storage version to v1beta1.
Removing them now simplifies adding new features because new fields don't need
to be added to these obsolete types. This could have been done already in 1.33,
but wasn't, to minimize disruption of ongoing work.
This moves the enabled/disabled test into the common test/integration/dra which
simplifies the code a bit and amortizes the cost of starting the apiserver
because several different tests can use the same instance, running in parallel.
While at it, setting the status via SSA also gets tested.
As soon as we have more than one test using the scheduler, we need some
coordination between tests. This is handled by a singleton which starts the
scheduler for the first user and stops it after the last one is gone.
To avoid having to pass around an additional parameter, the context is used to
access the singleton under the hood.
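A rough sketch of such a refcounted singleton with made-up names; the real
helper presumably differs in the details:

```go
package dra

import (
	"context"
	"sync"
)

// schedulerSingleton starts the scheduler for the first test that needs it
// and stops it again once the last test is done.
type schedulerSingleton struct {
	mu    sync.Mutex
	users int
	stop  func()
}

type singletonKey struct{}

// withScheduler returns a context through which tests share the singleton.
func withScheduler(ctx context.Context) context.Context {
	return context.WithValue(ctx, singletonKey{}, &schedulerSingleton{})
}

// acquire starts the scheduler on first use; the returned release function
// stops it after the last user is gone.
func acquire(ctx context.Context, start func() (stop func())) (release func()) {
	s := ctx.Value(singletonKey{}).(*schedulerSingleton)
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.users == 0 {
		s.stop = start()
	}
	s.users++
	return func() {
		s.mu.Lock()
		defer s.mu.Unlock()
		if s.users--; s.users == 0 {
			s.stop()
		}
	}
}
```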
This enables proper scheduling tests. Most of them are probably better done in
scheduler_perf, where the same test can then also be used for benchmarking and
where creating objects is a bit better supported (from YAML, for example), but some
special cases (in particular, anything involving error injection) are better
done here.
DRA drivers must provide ResourceSlices using the v1beta2 API types.
The controller then converts under the hood to v1beta1 if needed, i.e.
drivers are compatible with Kubernetes 1.32 and Kubernetes 1.33, as
long as at least one beta API group is enabled.
Testing pivots from using v1beta1 as the main API to v1beta2, with only one
test case exercising v1beta1.
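A minimal sketch of what providing slices with v1beta2 types can look like; the
helper structs (DriverResources, Pool, Slice) follow the
k8s.io/dynamic-resource-allocation/resourceslice package, but the exact field
layout shown here is an assumption:

```go
package main

import (
	resourceapi "k8s.io/api/resource/v1beta2"
	"k8s.io/dynamic-resource-allocation/resourceslice"
)

func main() {
	// Sketch: the driver describes its devices with v1beta2 types, the
	// resourceslice controller converts to v1beta1 when the cluster only
	// serves that version. Pool and device names are placeholders.
	resources := &resourceslice.DriverResources{
		Pools: map[string]resourceslice.Pool{
			"worker-1": {
				Slices: []resourceslice.Slice{{
					Devices: []resourceapi.Device{{
						Name: "gpu-0",
					}},
				}},
			},
		},
	}
	_ = resources
}
```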
A user of the controller can register an error handler via the controller
options. For a kubelet plugin, the error handler is a method in the interface
which must be implemented. This is a conscious choice to make DRA driver
developers aware that they should react intelligently to errors.
The controller will invoke that handler with all errors that it encounters
while syncing the desired set of slices. This includes validation errors from
the apiserver if the driver's slices are invalid. Dropped fields get reported
with a special DroppedFieldsError.
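A hedged sketch of such a handler; the signature and the exact shape of
DroppedFieldsError are assumptions, only the general behavior follows the
description above:

```go
package main

import (
	"context"
	"errors"

	"k8s.io/dynamic-resource-allocation/resourceslice"
	"k8s.io/klog/v2"
)

// handleSyncError is a hypothetical error handler as it could be registered
// via the controller options or implemented by a kubelet plugin.
func handleSyncError(ctx context.Context, err error) {
	logger := klog.FromContext(ctx)

	// Dropped fields get reported with a dedicated error type: the slice
	// was stored, but without some fields, typically because a feature
	// gate is disabled in the apiserver. Whether the error is wrapped as
	// a pointer is an assumption here.
	var dropped *resourceslice.DroppedFieldsError
	if errors.As(err, &dropped) {
		logger.Info("ResourceSlice stored with dropped fields", "cause", err)
		return
	}

	// Everything else, including validation errors for invalid slices,
	// deserves the driver developer's attention.
	logger.Error(err, "ResourceSlice sync failed")
}
```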
Both the new DeviceTaint.TimeAdded field and the fields which get dropped when
the DRADeviceTaints feature is disabled confused the ResourceSlice controller,
because what is stored and sent back can differ from what the controller wants
to store.
It's now more lenient regarding TimeAdded (doesn't need to be exact because of
rounding during serialization, only having a value on the server is okay)
and dropped fields (doesn't try to store them again). It also preserves
a server-side TimeAdded when updating slices.
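Illustrative sketch of the kind of leniency meant here, not the actual
comparison code:

```go
package resourceslice

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// sameTimeAdded treats two TimeAdded values as equivalent when they only
// differ by serialization rounding, and accepts a server-side value when the
// controller did not set one itself.
func sameTimeAdded(want, got *metav1.Time) bool {
	if want == nil {
		// Only the server has a value: that is okay, keep it.
		return true
	}
	if got == nil {
		return false
	}
	return want.Truncate(time.Second).Equal(got.Truncate(time.Second))
}
```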
This adds dedicated integration tests for the feature to the general
test/integration/dra for the API and some minimal testing with the scheduler.
It also adds non-performance test cases for scheduler_perf because that is a
better place for running through the complete flow (for example, can reuse
infrastructure for setting up nodes).
DRA had integration tests as part of test/integration/scheduler_perf (for the
scheduler plugin) and some others scattered in different
places (e.g. test/integration/resourceclaim for device status).
The new test/integration/dra is meant to become the common location for all
DRA-related integration tests. This makes it simpler to share common setup
code.