kubernetes

mirror of https://github.com/optim-enterprises-bv/kubernetes.git synced 2025-11-24 18:35:10 +00:00

Author	SHA1	Message	Date
Patrick Ohly	9f36c8d718	DRA: add DRAControlPlaneController feature gate for "classic DRA" In the API, the effect of the feature gate is that alpha fields get dropped on create. They get preserved during updates if already set. The PodSchedulingContext registration is not restricted by the feature gate. This enables deleting stale PodSchedulingContext objects after disabling the feature gate. The scheduler checks the new feature gate before setting up an informer for PodSchedulingContext objects and when deciding whether it can schedule a pod. If any claim depends on a control plane controller, the scheduler bails out, leading to: Status: Pending ... Warning FailedScheduling 73s default-scheduler 0/1 nodes are available: resourceclaim depends on disabled DRAControlPlaneController feature. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling. The rest of the changes prepare for testing the new feature separately from "structured parameters". The goal is to have base "dra" jobs which just enable and test those, then "classic-dra" jobs which add DRAControlPlaneController.	2024-07-22 18:09:34 +02:00
Patrick Ohly	599fe605f9	DRA scheduler: adapt to v1alpha3 API The structured parameter allocation logic was written from scratch in staging/src/k8s.io/dynamic-resource-allocation/structured where it might be useful for out-of-tree components. Besides the new features (amount, admin access) and API it now supports backtracking when the initial device selection doesn't lead to a complete allocation of all claims. Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com> Co-authored-by: John Belamaric <jbelamaric@google.com>	2024-07-22 18:09:34 +02:00
Patrick Ohly	91d7882e86	DRA: new API for 1.31 This is a complete revamp of the original API. Some of the key differences: - refocused on structured parameters and allocating devices - support for constraints across devices - support for allocating "all" or a fixed amount of similar devices in a single request - no class for ResourceClaims, instead individual device requests are associated with a mandatory DeviceClass For the sake of simplicity, optional basic types (ints, strings) where the null value is the default are represented as values in the API types. This makes Go code simpler because it doesn't have to check for nil (consumers) and values can be set directly (producers). The effect is that in protobuf, these fields always get encoded because `opt` only has an effect for pointers. The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new "request" field. This is considered acceptable because the entire `claims` field in the pod spec is still alpha. The implementation is complete enough to bring up the apiserver. Adapting other components follows.	2024-07-22 18:09:34 +02:00
Patrick Ohly	8a629b9f15	DRA: remove "sharable" from claim allocation result Now all claims are shareable up to the limit imposed by the size of the "reservedFor" array. This is one of the agreed simplifications for 1.31.	2024-07-21 17:28:14 +02:00
Patrick Ohly	de5742ae83	DRA: remove immediate allocation As agreed in https://github.com/kubernetes/enhancements/pull/4709, immediate allocation is one of those features which can be removed because it makes no sense for structured parameters and the justification for classic DRA is weak.	2024-07-21 17:28:14 +02:00
Patrick Ohly	b51d68bb87	DRA: bump API v1alpha2 -> v1alpha3 This is in preparation for revamping the resource.k8s.io completely. Because there will be no support for transitioning from v1alpha2 to v1alpha3, the roundtrip test data for that API in 1.29 and 1.30 gets removed. Repeating the version in the import name of the API packages is not really required. It was done for a while to support simpler grepping for usage of alpha APIs, but there are better ways for that now. So during this transition, "resourceapi" gets used instead of "resourcev1alpha3" and the version gets dropped from informer and lister imports. The advantage is that the next bump to v1beta1 will affect fewer source code lines. Only source code where the version really matters (like API registration) retains the versioned import.	2024-07-21 17:28:13 +02:00
Jordan Liggitt	03d48b7683	Move CEL env initialization out of package init() This ensures compatibility version and feature gates can be initialized before cached CEL environments are created.	2024-07-19 15:06:48 -04:00
googs1025	a3978e8315	scheduler: Add ctx param and error return to EnqueueExtensions.EventsToRegister()	2024-07-18 12:22:17 +08:00
Kubernetes Prow Robot	ac9aec9f9b	Merge pull request #125116 from pohly/dra-one-of-source DRA: remove "source" indirection from v1 Pod API	2024-06-28 12:46:45 -07:00
Patrick Ohly	bde9b64cdf	DRA: remove "source" indirection from v1 Pod API This makes the API nicer: resourceClaims: - name: with-template resourceClaimTemplateName: test-inline-claim-template - name: with-claim resourceClaimName: test-shared-claim Previously, this was: resourceClaims: - name: with-template source: resourceClaimTemplateName: test-inline-claim-template - name: with-claim source: resourceClaimName: test-shared-claim A more long-term benefit is that other, future alternatives might not make sense under the "source" umbrella. This is a breaking change. It's justified because DRA is still alpha and will have several other API breaks in 1.31.	2024-06-27 17:53:24 +02:00
Patrick Ohly	4bddebc48e	DRA: fix scheduler/resource claim controller race with retry The JSON patch approach works, but it is complex. A retry loop is easier to understand (detect conflict, get new claim, try again). There is one additional API call (the get), but in practice this scenario is unlikely.	2024-06-27 15:03:56 +02:00
Patrick Ohly	ecbafb8de5	DRA: fix scheduler/resource claim controller race There was a race caused by having to update claim finalizer and status in two different operations: - Resource claim controller removes allocation, does not yet get to remove the finalizer. - Scheduler prepares an allocation, without adding the finalizer because it's there. - Controller removes finalizer. - Scheduler adds allocation. This is an invalid state. Automatic checking found this during the execution of the "with translated parameters on single node.*supports sharing a claim sequentially" E2E test, but only when run stand-alone. When running in parallel (as in the CI), the bad outcome of the race did not occur. The fix is to check that the finalizer is still set when adding the allocation. The apiserver doesn't check that because it doesn't know which finalizer goes with the allocation result. It could check for "some finalizer", but that is not guaranteed to be correct (could be some unrelated one). Checking the finalizer can only be done with a JSON patch. Despite the complications, having the ability to add multiple pods concurrently to ReservedFor seems worth it (avoids expensive rescheduling or a local retry loop). The resource claim controller doesn't need this, it can do a normal update which implicitly checks ResourceVersion.	2024-06-27 15:03:06 +02:00
Kubernetes Prow Robot	8c478a06d8	Merge pull request #124595 from pohly/dra-scheduler-assume-cache-eventhandlers DRA: scheduler event handlers via assume cache	2024-06-25 11:56:28 -07:00
Patrick Ohly	1b63639d31	DRA scheduler: use assume cache to list claims This finishes the transition to the assume cache as source of truth for the current set of claims. The tests have to be adapted. It's not enough anymore to directly put objects into the informer store because that doesn't change the assume cache content. Instead, normal Create/Update calls and waiting for the cache update are needed.	2024-06-25 14:00:25 +02:00
Patrick Ohly	9a6f3b9388	scheduler: central ResourceClaim assume cache This enables connecting the event handler for ResourceClaim to the assume cache, which addresses a theoretic race condition. It may also be useful for implementing the autoscaler support, because now the autoscaler can modify the content of the cache.	2024-06-25 14:00:25 +02:00
Patrick Ohly	e0fce54d02	DRA: fix indexing of generated parameters The claim parameter key didn't include the namespace of the claim. In the case where two namespaces used the exact same parameter reference, the "too many generated parameters" case got triggered incorrectly and lookup could have returned an object from the wrong namespace. Found while running the E2E tests in parallel: message: 'running PreFilter plugin "DynamicResources": multiple generated claim parameters for ConfigMap. dra-8794/parameters-3 found: [dra-4729/parameters-4 dra-7328/parameters-4 dra-8794/parameters-4 dra-3402/parameters-4 dra-6156/parameters-4 dra-1839/parameters-4 dra-7434/parameters-4 dra-6504/parameters-4]'	2024-06-13 17:27:04 +02:00
carlory	2794baf4c0	fix dra flaky test on TestPlugin	2024-05-30 23:22:37 +08:00
carlory	3072987fcc	DRA: scheduler: index claim and class parameters to simplify lookup	2024-05-27 15:57:10 +08:00
carlory	06d3cd33b2	use slices library instead	2024-04-29 16:50:53 +08:00
Kubernetes Prow Robot	cffc2c0b40	Merge pull request #124102 from pohly/dra-scheduler-assume-cache scheduler: move assume cache to utils	2024-04-26 08:49:12 -07:00
Patrick Ohly	7f54c5dfec	scheduler: remove AssumeCache interface There's no reason for having the interface because there is only one implementation. Makes the implementation of the test functions a bit simpler (no casting). They are still stand-alone functions instead of methods because they should not be considered part of the "normal" API.	2024-04-25 11:46:58 +02:00
Patrick Ohly	26e0409c36	scheduler: move assume cache to utils, part 2 This is now used by both the volumebinding and dynamicresources plugin, so promoting it to a common helper package is better. In terms of functionality, nothing was changed. Documentation got updated (warns about storing locally modified objects, clarifies what the Get parameters are). Code coverage should be a bit better than before (tested with and without indexer, exercises event handlers, more error paths). Checking for specific errors can now be done via errors.Is.	2024-04-25 11:45:43 +02:00
Patrick Ohly	a66d2163f9	dra scheduler: fix data race in unit test Clearing some irrelevant fields in objects caused a flaky data race alert because in some cases, the objects were pointers into a shared cache. A better solution is to treat the objects as read-only and ignore the irrelevant fields.	2024-04-19 17:14:13 +02:00
Kubernetes Prow Robot	d2ce87eb94	Merge pull request #123938 from pohly/dra-structured-parameters-tests DRA: test for structured parameters	2024-04-18 02:10:08 -07:00
Patrick Ohly	6f5696b537	dra scheduler: simplify unit tests The guideline in https://github.com/kubernetes/community/blob/master/sig-scheduling/CONTRIBUTING.md#technical-and-style-guidelines is to not compare error strings. This makes the tests less precise. In return, unit tests don't need to be updated when error strings change.	2024-03-27 10:27:01 +01:00
Patrick Ohly	458e227de0	dra scheduler: unit tests Coverage was checked with a cover profile. The biggest remaining gap is for isSchedulableAfterClaimParametersChange and isSchedulableAfterClassParametersChange which will get handled when refactoring the foreachPodResourceClaim (https://github.com/kubernetes/kubernetes/issues/123697).	2024-03-22 10:03:22 +01:00
Patrick Ohly	607261e4c5	dra scheduler: spelling fix	2024-03-22 10:03:22 +01:00
Patrick Ohly	95136db063	dra scheduler: fix re-allocation of claim with structured parameters The code was incorrectly checking for a controller, but only the boolean is set for allocated claims. As a result, deallocation was requested from a non-existent control plane controller. While at it, let's also clear the driver name. It's not needed when the claim is deallocated.	2024-03-22 10:03:22 +01:00
Kubernetes Prow Robot	aa73f3163a	Merge pull request #122292 from sanposhiho/nodeupdate register Node/UpdateTaint event to plugins which has Node/Add only and doesn't have Node/UpdateTaint	2024-03-18 08:33:54 -07:00
Kensei Nakada	2b56de43e5	register Node/UpdateNodeTaint event to plugins which has Node/Add only, doesn't have Node/UpdateNodeTaint	2024-03-16 14:13:06 +00:00
Kevin Klues	21a0dd1d70	dra scheduler: create default claim/class parameters instead of nil Without this, the scheduler was crashing in newClaimController() in pkg/scheduler/framework/plugins/dynamicresources/structuredparameters.go The code in newClaimController() assumes that the parameters are not nil. Furthermore it assumes that there is at least one DriverRequest populated in order to allocate any resources to a claim. This PR adds logic to define default claim/class parameters that will allow allocation to proceed even if an end user doesn't provide any class or claim parameters themselves. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2024-03-11 13:57:16 +00:00
Patrick Ohly	251b3859b0	dra scheduler: consider in-flight allocation for resource calculation Storing a modified claim with allocation and the original resource version in the assume cache was not reliable: if an update was received, it replaced the modified claim and the resource that was reserved for the claim might have been used for some other claim. To fix this, the in-flight claims are now stored in the map instead of just a boolean and the status stored there overrides whatever is in the assume cache. Logging got extended to diagnose this problem better. It started to occur in E2E tests after splitting the claim update so that first the finalizer is set and then the status, because setting the finalizer triggered an update.	2024-03-07 22:26:16 +01:00
Patrick Ohly	0b6a0d686a	dra api: rename NodeResourceSlice -> ResourceSlice While currently those objects only get published by the kubelet for node-local resources, this could change once we also support network-attached resources. Dropping the "Node" prefix enables such a future extension. The NodeName in ResourceSlice and StructuredResourceHandle then becomes optional. The kubelet still needs to provide one and it must match its own node name, otherwise it doesn't have permission to access ResourceSlice objects.	2024-03-07 22:22:55 +01:00
Patrick Ohly	d4d5ade7f5	dra: add "named resources" structured parameter model Like the current device plugin interface, a DRA driver using this model announces a list of resource instances. In contrast to device plugins, this list is made available to the scheduler together with attributes that can be used to select suitable instances when they are not all alike. Because this is the first structured parameter model, some checks that previously were not possible, in particular "is one structured parameter field set", now get enabled. Adding another structured parameter model will be similar. The applyconfigs code generator assumes that all types in an API are defined in a single package. If it wasn't for that, it would be possible to place the "named resources" types in separate packages, which makes their names in the Go code more natural and provides an indication of their stability level because the package name could include a version.	2024-03-07 22:21:16 +01:00
Patrick Ohly	096e948905	dra scheduler: support structured parameters When a claim uses structured parameters, as indicated by the resource class flag, the scheduler is responsible for allocating it. To do this it needs to gather information about available node resources by watching NodeResourceSlices and then match the in-tree claim parameters against those resources.	2024-03-07 22:21:04 +01:00
Kubernetes Prow Robot	c606448922	Merge pull request #122996 from Huang-Wei/cleanup-dra-postfilter DRA: always returns Unschedulable in PostFilter	2024-01-27 08:19:44 -08:00
Kubernetes Prow Robot	02aaad0de9	Merge pull request #121876 from pohly/dra-reserve-during-pod-binding dra: reserve + publish during pod binding	2024-01-26 19:58:01 +01:00
Wei Huang	ceabc4aba8	DRA: always returns Unschedulable in PostFilter	2024-01-26 09:44:00 -08:00
Patrick Ohly	6cf4203751	dra scheduler: reformat code By continuing with the next item in the if clause, the else is no longer needed and indentation can be reduced.	2024-01-26 10:58:03 +01:00
Patrick Ohly	a809a6353b	scheduler: publish PodSchedulingContext during PreBind Blocking API calls during a scheduling cycle like the DRA plugin is doing slow down overall scheduling, i.e. also affecting pods which don't use DRA. It is easy to move the blocking calls into a goroutine while the scheduling cycle ends with "pod unschedulable". The hard part is handling an error when those API calls then fail in the background. There is a solution for that (see https://github.com/kubernetes/kubernetes/pull/120963), but it's complex. Instead, publishing the modified PodSchedulingContext can also be done later. In the more common case of a pod which is ready for binding except for its claims, that'll be in PreBind, which runs in a separate goroutine already. In the less common case that a pod cannot be scheduled, that'll be in Unreserve which is still blocking.	2024-01-26 10:58:03 +01:00
Patrick Ohly	5d1509126f	dra: patch ReservedFor during PreBind This moves adding a pod to ReservedFor out of the main scheduling cycle into PreBind. There it is done concurrently in different goroutines. For claims which were specifically allocated for a pod (the most common case), that usually makes no difference because the claim is already reserved. It starts to matter when that pod then cannot be scheduled for other reasons, because then the claim gets unreserved to allow deallocating it. It also matters for claims that are created separately and then get used multiple times by different pods. Because multiple pods might get added to the same claim rapidly independently from each other, it makes sense to do all claim status updates via patching: then it is no longer necessary to have an up-to-date copy of the claim because the patch operation will succeed if (and only if) the patched claim is valid. Server-side-apply cannot be used for this because a client always has to send the full list of all entries that it wants to be set, i.e. it cannot add one entry unless it knows the full list.	2024-01-26 10:58:03 +01:00
Kubernetes Prow Robot	6c493a1ef9	Merge pull request #122969 from kerthcet/fix/claim [DRA] Fix indexing the error value in unavailableClaim	2024-01-25 17:34:11 +01:00
kerthcet	7801173f6e	get the error claim in dra Signed-off-by: kerthcet <kerthcet@gmail.com>	2024-01-25 23:22:50 +08:00
kerthcet	8371e4cf93	quick break when met Signed-off-by: kerthcet <kerthcet@gmail.com>	2024-01-23 19:40:15 +08:00
Patrick Ohly	b0d4a8cd6d	dra scheduler: fix incorrect tracking of claim candidates for reallocation When dealing with unschedulable pods, the intent was to deallocate only claims which are allocated and use delayed allocation. That if check wasn't handled correctly, causing also claims with immediate allocation to be considered as candidates. Found during code reading, probably has never occurred in practice yet.	2023-12-20 09:04:01 +01:00
AxeZhan	be48c93689	Sched framework: expose NodeInfo in all functions of PluginsRunner interface	2023-12-15 11:30:06 +08:00
Kubernetes Prow Robot	74afd1a06f	Merge pull request #119539 from HirazawaUi/remove-not-register-event-code remove unregistered event code	2023-12-13 21:25:33 +01:00
Kubernetes Prow Robot	9aa04752e7	Merge pull request #118463 from testwill/replace_loop chore: slice replace loop	2023-10-24 15:04:39 +02:00
Kubernetes Prow Robot	5a4e792e06	Merge pull request #120534 from pohly/dra-scheduler-ssa-as-fallback dra scheduler: fall back to SSA for PodSchedulingContext updates	2023-10-23 21:06:58 +02:00
Kensei Nakada	cb5dc46edf	feature(scheduler): simplify QueueingHint by introducing new statuses	2023-10-19 11:02:11 +00:00


77 Commits