177 Commits

Author SHA1 Message Date
yliao
34a64db2c7 extended resource backed by DRA: implementation 2025-07-29 18:55:21 +00:00
Kubernetes Prow Robot
e2ab840708 Merge pull request #130160 from KobayashiD27/dra-device-binding-conditions
Implement DRA Device Binding Conditions (KEP-5007)
2025-07-29 07:34:26 -07:00
Kobayashi,Daisuke
e8c3af1f5c KEP-5007 DRA Device Binding Conditions: Implement scheduler logic 2025-07-29 11:34:30 +00:00
Kensei Nakada
ac9fad6030 feat: trigger PreFilterPreBind in the binding cycle 2025-07-29 19:01:02 +09:00
Kubernetes Prow Robot
a11bc701e8 Merge pull request #132457 from ania-borowiec/depends_on_cluster_move_podinfo
Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler
2025-07-24 09:38:27 -07:00
Ania Borowiec
aecd37e6fb Moving Scheduler interfaces to staging: Move PodInfo and NodeInfo interfaces (together with related types) to staging repo, leaving internal implementation in kubernetes/kubernetes/pkg/scheduler 2025-07-24 12:10:58 +00:00
Patrick Ohly
5c4f81743c DRA: use v1 API
As before when adding v1beta2, DRA drivers built using the
k8s.io/dynamic-resource-allocation helper packages remain compatible with all
Kubernetes releases >= 1.32. The helper code picks whatever API version is
enabled from v1beta1/v1beta2/v1.

However, the control plane now depends on v1, so a cluster configuration where
only v1beta1 or v1beta2 is enabled without v1 won't work.
2025-07-24 08:33:45 +02:00
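The version fallback described in the commit above can be illustrated with a minimal sketch. It is not the code from k8s.io/dynamic-resource-allocation: the function names and the newest-first preference order are assumptions made for illustration, since the commit message only says that the helper picks whichever version is enabled.

```python
# Hypothetical sketch; names and preference order are assumptions,
# not the API of the k8s.io/dynamic-resource-allocation helper packages.
PREFERENCE = ["v1", "v1beta2", "v1beta1"]  # assumed order: newest first


def pick_api_version(enabled):
    """Return the most preferred resource.k8s.io version that is enabled."""
    for version in PREFERENCE:
        if version in enabled:
            return version
    raise LookupError("no supported resource.k8s.io version is enabled")


def control_plane_ok(enabled):
    """Per the commit message, the control plane itself requires v1."""
    return "v1" in enabled
```

Under these assumptions a driver still finds a usable version on a cluster serving only v1beta1 and v1beta2, while `control_plane_ok` reports that such a cluster configuration does not work for the control plane.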
Patrick Ohly
5cea72d564 DRA integration: add test case for FilterTimeout
This covers disabling the feature via the configuration, failing to schedule
because of timeouts for all nodes, and retrying after ResourceSlice changes with
partial success (timeout for one node, success for the other).

While at it, some helper code gets improved.
2025-07-17 21:18:28 +02:00
Kensei Nakada
ebae419337 feat: add PreBindPreFlight and implement in in-tree plugins 2025-07-05 17:14:21 -07:00
Ania Borowiec
ee8c265d35 Move Code and Status from pkg/scheduler/framework to k8s.io/kube-scheduler/framework 2025-06-30 10:06:22 +00:00
Avritt Rohwer
087554448c Make nodeports scheduling plugin sidecar initContainer aware 2025-06-06 02:26:05 +00:00
Kubernetes Prow Robot
e0859f91b7 Merge pull request #131887 from ania-borowiec/extract_cyclestate_interface
Moving Scheduler interfaces to staging: split CycleState into interface and implementation, move interface to staging repo
2025-05-30 04:00:18 -07:00
Ania Borowiec
d75af825fb Extract interface CycleState and move it to staging repo. CycleState implementation remains in k/k/pkg/scheduler/framework 2025-05-29 16:18:36 +00:00
googs1025
01820ff7c2 chore(scheduler): add filter integration tests for missing part plugins: NodeAffinity plugin
Signed-off-by: googs1025 <googs1025@gmail.com>
2025-05-23 18:02:32 +08:00
Kubernetes Prow Robot
8a6b916765 Merge pull request #130720 from saintube/scheduler-expose-nodeinfo-in-prefilter
Expose NodeInfo to PreFilter plugins
2025-04-23 13:31:29 -07:00
saintube
8dc6806d26 Expose NodeInfo to PreFilter plugins and Framework
Co-authored-by: Zhan Sheng <49895476+AxeZhan@users.noreply.github.com>
Co-authored-by: shenxin <rougang.hrg@alibaba-inc.com>
Signed-off-by: saintube <saintube@foxmail.com>
2025-03-21 14:55:25 +08:00
Kubernetes Prow Robot
473533adaa Merge pull request #130638 from A-transformer/fix_typo_matchexpressions
fix typo
2025-03-20 03:52:38 -07:00
Patrick Ohly
a027b439e5 DRA: add device taint eviction controller
The controller is derived from the node taint eviction controller.
In contrast to that controller it tracks the UID of pods to prevent
deleting the wrong pod when it got replaced.
2025-03-19 09:18:38 +01:00
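The reason for tracking pod UIDs in the commit above can be shown with a small in-memory sketch. `FakeCluster` and its methods are hypothetical stand-ins, not the controller's code or the Kubernetes client API. The point is only that a delete keyed by namespace and name alone could hit a replacement pod, whereas a delete guarded by UID cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    namespace: str
    name: str
    uid: str


class FakeCluster:
    """Hypothetical in-memory pod store, used only for illustration."""

    def __init__(self):
        self._pods = {}

    def create(self, pod):
        # Creating a pod under an existing namespace/name models a replacement.
        self._pods[(pod.namespace, pod.name)] = pod

    def get(self, namespace, name):
        return self._pods.get((namespace, name))

    def delete_if_uid_matches(self, namespace, name, uid):
        """Delete the pod only if it is still the instance that was observed."""
        current = self._pods.get((namespace, name))
        if current is None or current.uid != uid:
            return False  # pod is gone or was replaced; leave it alone
        del self._pods[(namespace, name)]
        return True
```

If a pod is scheduled for eviction with UID `a` and is then replaced by a pod of the same name with UID `b`, `delete_if_uid_matches(..., "a")` returns `False` and the replacement survives.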
A-transformer
decd11414b fix typo
selecterTerms -> selectorTerms
2025-03-18 09:56:15 +04:00
A-transformer
fabd449d7f fix typo
MatchExpressions, fix typo
2025-03-07 17:50:03 +04:00
Kubernetes Prow Robot
43560c620a Merge pull request #130522 from googs1025/feature/integration_filter_TaintToleration
chore(scheduler): add filter integration tests for missing part plugins: TaintToleration plugin
2025-03-07 04:15:45 -08:00
Kubernetes Prow Robot
9d45ea8b9d Merge pull request #128586 from mortent/DRAPrioritizedList
Prioritized Alternatives in Device Requests
2025-03-06 21:01:44 -08:00
googs1025
032b05114c chore(scheduler): add filter integration tests for missing part plugins: TaintToleration plugin 2025-03-07 09:33:49 +08:00
saintube
afb4e96510 Expose NodeInfo to Score plugins
Co-authored-by: shenxin <rougang.hrg@alibaba-inc.com>
Signed-off-by: saintube <saintube@foxmail.com>
2025-03-04 17:57:14 +08:00
Morten Torkildsen
2229a78dfe DRA: Update allocator for Prioritized Alternatives in Device Requests 2025-02-28 19:30:10 +00:00
googs1025
86f504284c feature(scheduler): add queueinghint for volumeattachment deletion 2025-02-22 14:57:41 +08:00
ndixita
6db40446de Scheduler changes:
1. Use pod-level resource when feature is enabled and resources are set at pod-level
2. Edge case handling: When a pod defines only CPU or memory limits at pod-level (but not both), and container-level requests/limits are unset, the pod-level requests stay empty for the resource without a pod-limit. The container's request for that resource is then set to the default request value from schedutil.
2024-11-08 03:00:54 +00:00
Kubernetes Prow Robot
fb033826a8 Merge pull request #128170 from sanposhiho/async-preemption
feature(KEP-4832): asynchronous preemption
2024-11-07 19:44:54 +00:00
utam0k
e828a4b40a Add integration test for NodeVolumeLimits in requeueing scenarios
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-11-07 19:51:50 +09:00
Yusuke Sakurai
992f1d9a08 add integration test for volumebinding for queueinghint 2024-11-07 14:10:26 +09:00
Kensei Nakada
69a8d0ec0b feature(KEP-4832): asynchronous preemption 2024-11-07 14:09:34 +09:00
Patrick Ohly
33ea278c51 DRA: use v1beta1 API
No code is left which depends on the v1alpha3, except of course the code
implementing that version.
2024-11-06 13:03:19 +01:00
Kubernetes Prow Robot
988769933e Merge pull request #128307 from NoicFank/bugfix-scheduler-preemption
bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it
2024-10-29 19:05:02 +00:00
NoicFank
68f7a7c682 bugfix(scheduler): preemption picks wrong victim node with higher priority pod on it.
Introducing PDB handling to preemption disrupted the ordering of pods in the victims list,
which could lead to picking the wrong victim node, one with a higher-priority pod on it.
2024-10-29 19:50:55 +08:00
Kubernetes Prow Robot
352056f09d Merge pull request #127757 from torredil/scheduler-bugfix-5123
scheduler: Improve CSILimits plugin accuracy by using VolumeAttachments
2024-10-23 18:12:52 +01:00
torredil
56f2b192cc scheduler: Improve CSILimits plugin accuracy by using VolumeAttachments
Signed-off-by: torredil <torredil@amazon.com>
2024-10-18 19:02:14 +00:00
Patrick Ohly
f84eb5ecf8 DRA: remove "classic DRA"
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.

The feature gets removed because there is no path towards beta and GA and DRA
with "structured parameters" should be able to replace it.
2024-10-16 23:09:50 +02:00
AxeZhan
b1f07bb36c add tests for scheduler 2024-10-10 15:53:19 +08:00
Kubernetes Prow Robot
7dd03c1ee5 Merge pull request #127353 from Gekko0114/integration_test_volumezone
Add integration test for VolumeZone in requeueing scenarios
2024-10-03 05:48:26 +01:00
googs1025
24a28766d4 chore(scheduler dra): improve dra queue hint unit test 2024-10-01 17:22:15 +08:00
Patrick Ohly
aee77bfc84 DRA scheduler: add special ActionType for ResourceClaim changes
Having a dedicated ActionType which only gets used when the scheduler itself
already detects some change in the list of generated ResourceClaims of a pod
avoids calling the DRA plugin for unrelated Pod changes.
2024-09-27 16:53:58 +02:00
moriya
7b8985dc03 add_test_cases 2024-09-21 01:09:10 +09:00
moriya
75266db65b Add integration test for VolumeZone in requeueing scenarios 2024-09-20 23:05:35 +09:00
Kensei Nakada
24a14aa810 fix: run a test for requeueing with PreFilterResult correctly 2024-09-07 23:52:45 +09:00
Patrick Ohly
599fe605f9 DRA scheduler: adapt to v1alpha3 API
The structured parameter allocation logic was written from scratch in
staging/src/k8s.io/dynamic-resource-allocation/structured where it might be
useful for out-of-tree components.

Besides the new features (amount, admin access) and API it now supports
backtracking when the initial device selection doesn't lead to a complete
allocation of all claims.

Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
Co-authored-by: John Belamaric <jbelamaric@google.com>
2024-07-22 18:09:34 +02:00
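The backtracking mentioned in the commit above can be sketched generically. This is not the allocator in staging/src/k8s.io/dynamic-resource-allocation/structured: the `allocate` function, its input shape (claim names paired with sets of acceptable devices), and the one-device-per-claim rule are simplifying assumptions. It only shows how undoing an early device choice can make a complete allocation possible.

```python
def allocate(claims, devices):
    """Assign one distinct device to every claim, or return None.

    claims:  list of (claim_name, set_of_acceptable_device_names)
    devices: list of device names, tried in order
    Both shapes are assumptions made for this illustration.
    """
    assignment = {}
    used = set()

    def solve(index):
        if index == len(claims):
            return True  # every claim has a device
        name, acceptable = claims[index]
        for device in devices:
            if device in used or device not in acceptable:
                continue
            assignment[name] = device
            used.add(device)
            if solve(index + 1):
                return True
            # The choice blocked a later claim: undo it and try the next device.
            del assignment[name]
            used.discard(device)
        return False

    return dict(assignment) if solve(0) else None
```

With devices `["gpu-0", "gpu-1"]`, a first claim that accepts either device and a second claim that accepts only `gpu-0`, the initial choice of `gpu-0` for the first claim fails, and the search backtracks to give it `gpu-1` instead.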
Patrick Ohly
8a629b9f15 DRA: remove "sharable" from claim allocation result
Now all claims are shareable up to the limit imposed by the size of the
"reservedFor" array.

This is one of the agreed simplifications for 1.31.
2024-07-21 17:28:14 +02:00
Patrick Ohly
de5742ae83 DRA: remove immediate allocation
As agreed in https://github.com/kubernetes/enhancements/pull/4709, immediate
allocation is one of those features which can be removed because it makes no
sense for structured parameters and the justification for classic DRA is weak.
2024-07-21 17:28:14 +02:00
Patrick Ohly
b51d68bb87 DRA: bump API v1alpha2 -> v1alpha3
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.

Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.

Only source code where the version really matters (like API registration)
retains the versioned import.
2024-07-21 17:28:13 +02:00
carlory
3072987fcc DRA: scheduler: index claim and class parameters to simplify lookup 2024-05-27 15:57:10 +08:00
joey
a56cc6b100 add integration test for pod with pvc has node-affinity to non-existent/existent nodes
Signed-off-by: joey <zchengjoey@gmail.com>
2024-05-03 19:45:31 +08:00