Commit Graph

83 Commits

Author SHA1 Message Date
Kensei Nakada
91aad7c97f fix(eventhandler): trigger Node/Delete event 2024-09-22 17:29:00 +09:00
Kensei Nakada
4ee1394b71 feat: disable preCheck when QHint is enabled 2024-09-04 17:43:00 +09:00
Kensei Nakada
8519d3399f chore: move the scheduler internal components out of internal dir 2024-08-25 13:10:29 +09:00
Patrick Ohly
e85d3babf0 DRA scheduler: fix re-scheduling after ResourceSlice changes
Making unschedulable pods schedulable again after ResourceSlice cluster events
was accidentally left out when adding structured parameters to Kubernetes 1.30.

All E2E tests were defined so that a driver starts first. A new test with a
different order (create pod first, wait for unschedulable, start driver)
triggered the bug and now passes.
2024-08-22 10:09:32 +02:00
Patrick Ohly
89e2feaf46 DRA scheduler: fix feature gate check for PodSchedulingContext event
The event is only relevant when DRAControlPlaneController (= "classic DRA") is
enabled.

This change has no effect in practice because the only plugin using this event,
the dynamic resource plugin, also checks feature gates when asking for events
and correctly only asks for PodSchedulingContext events when
DRAControlPlaneController is enabled.
2024-08-20 10:49:08 +02:00
Maciej Skoczeń
6b33e2e632 Use generics in scheduling queue's heap 2024-07-24 06:55:47 +00:00
Kubernetes Prow Robot
39a80796b6 Merge pull request #122628 from sanposhiho/pod-smaller-events
add(scheduler/framework): implement smaller Pod update events
2024-07-23 18:01:46 -07:00
Patrick Ohly
599fe605f9 DRA scheduler: adapt to v1alpha3 API
The structured parameter allocation logic was written from scratch in
staging/src/k8s.io/dynamic-resource-allocation/structured where it might be
useful for out-of-tree components.

Besides the new features (amount, admin access) and API it now supports
backtracking when the initial device selection doesn't lead to a complete
allocation of all claims.

Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
Co-authored-by: John Belamaric <jbelamaric@google.com>
2024-07-22 18:09:34 +02:00
Patrick Ohly
b51d68bb87 DRA: bump API v1alpha2 -> v1alpha3
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.

Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.

Only source code where the version really matters (like API registration)
retains the versioned import.
2024-07-21 17:28:13 +02:00
Kensei Nakada
0dee497876 fix: make updatePodOther private 2024-07-20 17:49:46 +09:00
Kensei Nakada
0cd1ee4259 add(scheduler/framework): implement smaller Pod update events 2024-07-20 17:44:23 +09:00
Kensei Nakada
9ff3227b15 add: implement event_handling_duration_seconds metric 2024-07-18 18:16:57 +09:00
Kensei Nakada
9772ff2848 cleanup: move NodeSchedulingPropertiesChange 2024-07-12 19:21:48 +09:00
Kensei Nakada
533140f065 take PodTopologySpread into consideration when requeueing Pods based on Pod related events 2024-07-06 13:17:14 +00:00
Patrick Ohly
9a6f3b9388 scheduler: central ResourceClaim assume cache
This enables connecting the event handler for ResourceClaim to the assume
cache, which addresses a theoretic race condition.

It may also be useful for implementing the autoscaler support, because now
the autoscaler can modify the content of the cache.
2024-06-25 14:00:25 +02:00
Patrick Ohly
096e948905 dra scheduler: support structured parameters
When a claim uses structured parameters, as indicated by the resource class
flag, the scheduler is responsible for allocating it. To do this it needs to
gather information about available node resources by watching
NodeResourceSlices and then match the in-tree claim parameters against those
resources.
2024-03-07 22:21:04 +01:00
Kubernetes Prow Robot
f38ff3feea Merge pull request #121716 from kerthcet/cleanup/add-log
Add more logs to scheduler event handler
2024-01-15 16:23:19 +01:00
amewayne
71c3593f85 support nodeAnnotationsChanged event to trigger rescheduling 2024-01-10 22:38:54 +08:00
kerthcet
e5b86c1034 Fix node update event will miss some potential changes
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-11-27 15:33:47 +08:00
kerthcet
a96d21b4b0 Add logs for event handler
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-11-03 15:36:06 +08:00
Patrick Ohly
c682d2b8c5 scheduler: add ResourceClass events
When filtering fails because a ResourceClass is missing, we can treat the pod
as "unschedulable" as long as we then also register a cluster event that wakes
up the pod. This is more efficient than periodically retrying.
2023-09-06 11:14:08 +02:00
Patrick Ohly
5269e76990 scheduler: properly skip DRA events
Because of a misplaced `append` (should have been inside if clause, not after
it), some handler from a previous loop iteration was added again. This was
harmless because the resulting slice was only used for waiting for cache sync,
but should better get fixed anyway.
2023-08-28 17:55:44 +02:00
kidddddddddddddddddddddd
9c7166ff63 wait for eventhandlers to sync before run scheduler 2023-06-27 23:19:34 +08:00
Kensei Nakada
6f8d38406a feature(scheduler): implement ClusterEventWithHint to filter out useless events 2023-06-22 13:36:19 +00:00
Mengjiao Liu
074900e81b scheduler: update the scheduler interface and cache methods to use contextual logging 2023-05-29 13:26:32 +08:00
Patrick Ohly
fec5233668 api: resource.k8s.io PodScheduling -> PodSchedulingContext
The name "PodScheduling" was unusual because in contrast to most other names,
it was impossible to put an article in front of it. Now PodSchedulingContext is
used instead.
2023-03-14 10:18:08 +01:00
Patrick Ohly
29941b8d3e api: resource.k8s.io v1alpha1 -> v1alpha2
For Kubernetes 1.27, we intend to make some breaking API changes:
- rename PodScheduling -> PodSchedulingHints (https://github.com/kubernetes/kubernetes/issues/114283)
- extend ResourceClaimStatus (https://github.com/kubernetes/enhancements/pull/3802)

We need to switch from v1alpha1 to v1alpha2 for that.
2023-03-14 07:52:03 +01:00
Patrick Ohly
d2ff210c20 scheduler: add dynamic resource allocation plugin
The plugin handles the interaction with ResourceClaims that are referenced by a
Pod.
2022-11-11 21:58:03 +01:00
Kubernetes Prow Robot
24a71990e0 Merge pull request #108445 from pohly/storage-capacity-ga
storage capacity GA
2022-03-23 08:06:21 -07:00
Paco Xu
acd696266e mark PodOverhead to GA in v1.24; remove in v1.26 2022-03-17 09:30:14 +08:00
Patrick Ohly
f84f4fa291 storage capacity: use V1 API 2022-03-14 20:05:45 +01:00
kerthcet
eafbaad9f7 refactor: rename SchedulerCache to Cache in Scheduler
Signed-off-by: kerthcet <kerthcet@gmail.com>
2022-02-24 09:47:21 +08:00
Kubernetes Prow Robot
0dcd6eaa0d Merge pull request #103934 from boenn/tainttoleration
De-duplicate predicate (known as filter now) logic shared in kubelet and scheduler
2022-02-09 16:53:46 -08:00
BinacsLee
1027b8de40 scheduler: fix race condition during cache refresh 2021-12-10 20:46:12 +08:00
boenn
cec2aae1e5 rebase master 2021-11-25 11:21:12 +08:00
Aldo Culquicondor
ff741f6a96 Ensure deletion of pods in queues and cache
When the client misses a delete event from the watcher, it will use the last state of the pod in the informer cache to produce a delete event. At that point, it's not clear if the pod was in the queues or the cache, so we should issue a deletion in both.

The pod could be assumed, so deletion of assumed pods from the cache should work.

Change-Id: I11ce9785de603924fc121fe2fa6ed5cb1e16922f
2021-11-03 14:00:31 -04:00
kerthcet
fc9533e72f remove scheduler ServiceAffinity plugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2021-10-15 22:10:31 +08:00
Kensei Nakada
0bb4c14519 scheduler: delete docs related to equivalence cache from scheduler (#105417)
* Delete: delete docs related to equivalence cache from scheduler

* Fix: re-add comment about ServiceAffinity
2021-10-04 16:23:49 -07:00
00255991
06a9bfbb21 Add unit tests for scheduler's dynamic event handlers registration 2021-09-14 22:51:52 +08:00
Wei Huang
dc079acc2b sched: retry unschedule pods immediately after a waiting pod's deletion 2021-08-06 19:08:37 -07:00
Yecheng Fu
83ee392ed4 implement EnqueueExtensions interface in volumebinding 2021-07-03 08:25:06 +08:00
Wei Huang
1b3a124ba6 Scheduler now registers event handlers dynamically
- move clusterEventMap to Configurator
- dynamic event handlers registration for core API resources
- dynamic event handlers registration for custom resources
2021-05-21 13:47:06 -07:00
Mike Dame
5a77ebe28b Scheduler: remove pkg/features dependency from NodeResources plugins 2021-05-18 08:59:02 -04:00
Alexander Minbaev
8325c6b0da got rid of ClusterEventReg with generating ClusterEvent objects on the fly 2021-04-14 13:38:46 -05:00
Kubernetes Prow Robot
b678d1b51d Merge pull request #100444 from july2993/mode
Change go file mode from 755 to 644
2021-04-08 22:09:22 -07:00
Kubernetes Prow Robot
ae40c62c49 Merge pull request #100286 from tanjing2020/skip_updates_assumed_pods
Scheduler: skip updates of assumed pods
2021-04-08 20:29:29 -07:00
tanjing2020
d4465b995e Scheduler: skip updates of assumed pods 2021-03-24 10:01:22 +08:00
Jiahao Huang
4621722888 Change go file mode from 755 to 644
to check all file:
find . -perm 755 | grep "\.go$"
2021-03-23 10:50:17 +08:00
Wei Huang
6384f397b4 sched: support PreEnqueueChecks prior to moving Pods 2021-03-11 12:31:50 -08:00
yahaa
22a8a9ab45 fix gosimple lint check
Signed-off-by: yahaa <1477765176@qq.com>
2021-03-06 19:57:36 +08:00
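
Note on commit 5269e76990 (scheduler: properly skip DRA events): the message describes an `append` placed after an if clause instead of inside it, so a handler from an earlier loop iteration was added a second time. A minimal Go sketch of that pattern follows. It is not the scheduler's actual code: `handlerRegistration`, `register`, `collectBuggy`, and `collectFixed` are hypothetical names invented for illustration, and the resource list is made up.

```go
package main

import "fmt"

// handlerRegistration is a hypothetical stand-in for whatever value is
// returned when an event handler is registered for a resource type.
type handlerRegistration struct{ resource string }

// register pretends to register a handler for one resource type (assumption:
// the real registration call looks different).
func register(resource string) handlerRegistration {
	return handlerRegistration{resource: resource}
}

// collectBuggy appends after the if clause. When a resource is skipped,
// reg still holds the value from the previous iteration and is added again.
func collectBuggy(resources []string, enabled map[string]bool) []handlerRegistration {
	var regs []handlerRegistration
	var reg handlerRegistration
	for _, r := range resources {
		if enabled[r] {
			reg = register(r)
		}
		regs = append(regs, reg) // misplaced: runs even for skipped resources
	}
	return regs
}

// collectFixed appends inside the if clause, so skipped resources add nothing.
func collectFixed(resources []string, enabled map[string]bool) []handlerRegistration {
	var regs []handlerRegistration
	for _, r := range resources {
		if enabled[r] {
			regs = append(regs, register(r))
		}
	}
	return regs
}

func main() {
	resources := []string{"Pod", "ResourceClaim", "Node"}
	enabled := map[string]bool{"Pod": true, "Node": true} // ResourceClaim is skipped
	fmt.Println("buggy:", collectBuggy(resources, enabled))
	fmt.Println("fixed:", collectFixed(resources, enabled))
}
```

With this made-up input the buggy variant returns three entries, with the Pod registration duplicated in place of the skipped ResourceClaim, while the fixed variant returns two. As the commit message notes, a duplicate like this is harmless when the slice is only used to wait for cache sync, since waiting on the same handler twice changes nothing.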