144 Commits

Author SHA1 Message Date
saintube
afb4e96510 Expose NodeInfo to Score plugins
Co-authored-by: shenxin <rougang.hrg@alibaba-inc.com>
Signed-off-by: saintube <saintube@foxmail.com>
2025-03-04 17:57:14 +08:00
Kensei Nakada
c322294883 implement PodActivator to activate when preemption fails 2024-11-07 14:09:35 +09:00
Kuba Tużnik
87cd496a29 scheduler/framework: introduce pluggable SharedDRAManager
SharedDRAManager will be used by the DRA plugin to obtain DRA
objects, and to track modifications to them in-memory. The current
DRA plugin behavior will be the default implementation of
SharedDRAManager.

Plugging a different implementation will allow Cluster Autoscaler
to provide a simulated state of DRA objects to the DRA plugin when
making scheduling simulations, as well as obtain the modifications
to DRA objects from the plugin.
2024-11-05 13:52:57 +01:00
Kubernetes Prow Robot
ea1143efc7 Merge pull request #126022 from macsko/new_node_to_status_map_structure
Change structure of NodeToStatus map in scheduler
2024-08-13 21:02:55 -07:00
Maciej Skoczeń
98be7dfc5d Change structure of NodeToStatus map in scheduler 2024-07-25 07:48:35 +00:00
googs1025
a3978e8315 scheduler: Add ctx param and error return to EnqueueExtensions.EventsToRegister() 2024-07-18 12:22:17 +08:00
Kubernetes Prow Robot
b6899c5e08 Merge pull request #122251 from olderTaoist/unschedulable-plugin
register unschedulable plugin for those plugins that PreFilter's PreFilterResult filter out some nodes
2024-07-05 05:44:26 -07:00
olderTaoist
b478621596 register unschedulable plugin when prefilter with NodeNames 2024-07-02 13:02:45 +08:00
Kubernetes Prow Robot
8c478a06d8 Merge pull request #124595 from pohly/dra-scheduler-assume-cache-eventhandlers
DRA: scheduler event handlers via assume cache
2024-06-25 11:56:28 -07:00
Patrick Ohly
9a6f3b9388 scheduler: central ResourceClaim assume cache
This enables connecting the event handler for ResourceClaim to the assume
cache, which addresses a theoretic race condition.

It may also be useful for implementing the autoscaler support, because now
the autoscaler can modify the content of the cache.
2024-06-25 14:00:25 +02:00
NoicFank
31a4b13238 enhancement(scheduler): share waitingPods among profiles 2024-05-17 17:07:27 +08:00
kerthcet
84750fe52e Revert "enhancement(scheduler): share waitingPods among profiles"
This reverts commit 227c1915db.
2024-03-19 22:52:59 +01:00
NoicFank
227c1915db enhancement(scheduler): share waitingPods among profiles 2024-02-01 10:06:23 +08:00
Kubernetes Prow Robot
5b979a3a53 Merge pull request #122498 from Gekko0114/close
Allow framework plugins to be closed
2024-01-08 17:30:36 +01:00
moriya
288c00c0c7 Allow framework plugins to be closed 2024-01-06 10:11:19 +09:00
Kensei Nakada
09abd6be5a address reviews 2024-01-02 02:10:41 +00:00
Kensei Nakada
5ab2317947 run all PreFilter when the preemption will happen later in the same scheduling cycle 2024-01-01 09:44:06 +00:00
AxeZhan
be48c93689 Sched framework: expose NodeInfo in all functions of PluginsRunner interface 2023-12-15 11:30:06 +08:00
Kubernetes Prow Robot
84424a8c19 Merge pull request #122068 from caohe/fix-multi-point
fix(scheduler): fix incorrect loop logic in MultiPoint to avoid a plugin being loaded multiple times
2023-12-14 05:10:37 +01:00
Kubernetes Prow Robot
5322af7f9e Merge pull request #122022 from sanposhiho/extender-fix
fix: requeue pods rejected by Extenders properly
2023-12-14 05:10:01 +01:00
Kubernetes Prow Robot
badc4102ac Merge pull request #121572 from Prateek462003/myFeature
Added Logging for all the enabled plugins in each extension point
2023-12-13 22:34:06 +01:00
caohe
1f5738df84 fix(scheduler): fix incorrect loop logic in MultiPoint to avoid a plugin being loaded multiple times
Signed-off-by: caohe <caohe9603@gmail.com>
2023-11-29 20:14:18 +08:00
hub-Prateek
a601ebd6b6 Changed the log message 2023-11-26 11:41:42 +05:30
hub-Prateek
eb45a8f2f5 Added comments 2023-11-24 11:01:15 +05:30
hub-Prateek
76be319571 Optimized the code 2023-11-24 10:58:33 +05:30
hub-Prateek
5c99f3a24e Logged the return value of ListPlugins 2023-11-24 00:19:42 +05:30
Kensei Nakada
468e2dac81 fix: requeue pods rejected by Extenders properly 2023-11-23 13:20:02 +00:00
hub-Prateek
9cb2d1cf6d Removed Comments 2023-11-22 22:32:19 +05:30
hub-Prateek
1dca49157a Utilized ListPlugins method 2023-11-22 02:13:55 +05:30
Patrick Ohly
2a23061f6c scheduler: fix performance regression at -v3 + contextual logging
The logging instrumentation for contextual logging that was added for 1.29
slowed down the scheduler (i.e. logging verbosity <= 3) by a significant
percentage (-28.66% for SchedulingBasic/5000Nodes at -v3) if (and only if!)
contextual logging was enabled.

Retrieving the logger from the context causes no measurable slowdown, it's only
the various WithName/WithValues calls which cause this.

By being more careful about when to use those, the performance impact can be
avoided:
- At -v3 or lower, only `WithValues("pod")` is used once per scheduling cycle.
  This has the intended effect that all log messages for the cycle include the
  pod information. Once contextual logging is GA, "pod" key/value pairs can
  be removed from all log calls.
- At -v4 or higher, richer log entries get produced where `WithValues` is also
  used for the node (when applicable) and `WithName` is used for the current
  operation and plugin.

With these changes, enabling contextual logging causes no measurable slowdown
at -v3 or lower. At -v4, the slowdown depends on the test case (-30.51%
throughput for SchedulingBasic/5000Nodes, no change for
SchedulingCSIPVs/5000Nodes). For some unknown reason (measuring bias?),
SchedulingCSIPVs/500Nodes has a ~3% *higher* throughput with contextual
logging.
2023-11-03 17:28:55 +01:00
hub-Prateek
7b60e7e2a3 Added plugins enabled at each extension point 2023-11-01 23:03:13 +05:30
Kubernetes Prow Robot
d84ee0ba69 Merge pull request #121632 from kerthcet/fix/runscoreplugins
Fix panic when process RunScorePlugins for cap out of range
2023-10-31 13:14:32 +01:00
kerthcet
b02aad42fa Fix panic when process RunScorePlugins for cap out of range
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-10-31 16:02:16 +08:00
Kensei Nakada
c7842d9c63 narrow down the scope of EnqueueExtensions to subscribe less cluster events 2023-10-27 14:14:37 +00:00
Kensei Nakada
27bb66fd7b cleanup: rename failedPlugin to plugin in framework.Status 2023-10-25 12:03:56 +00:00
olderTaoist
5d5958e338 fix ImageLocality plugin score is inconsistent 2023-10-17 09:38:03 +08:00
Mengjiao Liu
a7466f44e0 Change the scheduler plugins PluginFactory function to use context parameter to pass logger
- Migrated pkg/scheduler/framework/plugins/nodevolumelimits to use contextual logging
- Fix golangci-lint validation failed
- Check for plugins creation err
2023-09-20 17:49:54 +08:00
kerthcet
ab01848134 Make sure skip score plugins always returned
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-08-24 13:39:47 +08:00
Wei Huang
765f3916c2 Fix a bug that PostFilter plugin may not function if previous PreFilter plugins return Skip 2023-08-10 13:43:00 -07:00
AxeZhan
2863b3d1ab Revert "refactor: simplify RunScorePlugins for readability + performance"
This reverts commit a7eb7ed5c6.
2023-07-20 10:50:32 +08:00
kerthcet
c0eb0caf4a Support fine-grained rescheduling in ReservePlugin
Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 13:30:29 +08:00
Kubernetes Prow Robot
b07a843cb5 Merge pull request #119046 from kerthcet/fix/handle-unschedule-plugins
Fix fitError in Permit plugin not handled perfectly
2023-07-06 21:01:03 -07:00
kerthcet
278a8376e1 Fix: fiterror in permit plugin not handled perfectly
We only added failed plugins, but actually this will not work unless
we make the status with a fitError because we only copy the failed plugins
to podInfo if it is a fitError

Signed-off-by: kerthcet <kerthcet@gmail.com>
2023-07-07 10:35:59 +08:00
Kubernetes Prow Robot
aeed7da616 Merge pull request #119077 from sanposhiho/follow-up-hint
clean up the implementation around QueueingHintFn
2023-07-06 13:39:15 -07:00
Kensei Nakada
be0db3f93d clean up the implementation around QueueingHintFn 2023-07-06 16:07:39 +00:00
Kubernetes Prow Robot
293c1b8378 Merge pull request #118025 from AxeZhan/score-metrics
feature(scheduler): plugin_evaluation_total metric support preScore/score
2023-07-05 05:14:56 -07:00
Kubernetes Prow Robot
d9714078f8 Merge pull request #118551 from sanposhiho/event-to-register
feature(scheduler): implement ClusterEventWithHint to filter out useless events
2023-06-26 06:41:45 -07:00
Kensei Nakada
6f8d38406a feature(scheduler): implement ClusterEventWithHint to filter out useless events 2023-06-22 13:36:19 +00:00
Kensei Nakada
a7eb7ed5c6 refactor: simplify RunScorePlugins for readability + performance 2023-06-11 03:29:05 +00:00
SataQiu
410b6023d6 scheduler: fix code style issues for pkg/scheduler 2023-06-05 17:29:49 +08:00