3398 Commits

Author SHA1 Message Date
Patrick Ohly
bde9b64cdf DRA: remove "source" indirection from v1 Pod API
This makes the API nicer:

    resourceClaims:
    - name: with-template
      resourceClaimTemplateName: test-inline-claim-template
    - name: with-claim
      resourceClaimName: test-shared-claim

Previously, this was:

    resourceClaims:
    - name: with-template
      source:
        resourceClaimTemplateName: test-inline-claim-template
    - name: with-claim
      source:
        resourceClaimName: test-shared-claim

A more long-term benefit is that other, future alternatives
might not make sense under the "source" umbrella.

This is a breaking change. It's justified because DRA is still
alpha and will have several other API breaks in 1.31.
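
The shape change can also be sketched on the Go side. The types below are simplified, hypothetical stand-ins for the Pod API types (they are not the real k8s.io/api definitions), and `flatten` is an illustrative helper that only shows how the two fields move up one level:

```go
package main

import "fmt"

// claimSource is a hypothetical stand-in for the old "source" wrapper.
type claimSource struct {
	ResourceClaimName         *string
	ResourceClaimTemplateName *string
}

// oldPodResourceClaim mirrors the previous layout: name plus nested source.
type oldPodResourceClaim struct {
	Name   string
	Source claimSource
}

// podResourceClaim mirrors the new layout: the two references sit
// directly next to the name, without the "source" indirection.
type podResourceClaim struct {
	Name                      string
	ResourceClaimName         *string
	ResourceClaimTemplateName *string
}

// flatten converts the old layout into the new one (illustration only).
func flatten(old oldPodResourceClaim) podResourceClaim {
	return podResourceClaim{
		Name:                      old.Name,
		ResourceClaimName:         old.Source.ResourceClaimName,
		ResourceClaimTemplateName: old.Source.ResourceClaimTemplateName,
	}
}

func main() {
	tmpl := "test-inline-claim-template"
	old := oldPodResourceClaim{
		Name:   "with-template",
		Source: claimSource{ResourceClaimTemplateName: &tmpl},
	}
	flat := flatten(old)
	fmt.Println(flat.Name, *flat.ResourceClaimTemplateName)
}
```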
2024-06-27 17:53:24 +02:00
Kubernetes Prow Robot
a008776ec9 Merge pull request #125279 from HirazawaUi/add-poddeleted-queueinghintfn
Add QueueingHintFn for pod events in VolumeRestriction plugin
2024-06-19 12:22:41 -07:00
Kubernetes Prow Robot
64355780d9 Merge pull request #125495 from pohly/dra-scheduler-fix-parameter-indexing
DRA: fix indexing of generated parameters
2024-06-18 04:10:38 -07:00
Kubernetes Prow Robot
ab8ad49b47 Merge pull request #125533 from kaisoz/sched-test-disruption-target-cond
scheduler: Test that the DisruptionTarget condition is added at preemption time
2024-06-18 01:14:28 -07:00
Tomas Tormo
8d7c113434 Test that the DisruptionTarget condition is added at preemption 2024-06-17 16:59:52 +00:00
HirazawaUi
f9693e0c0a Implement QueueingHintFn for pod deleted event 2024-06-17 22:42:04 +08:00
Patrick Ohly
e0fce54d02 DRA: fix indexing of generated parameters
The claim parameter key didn't include the namespace of the claim. In the case
where two namespaces used the exact same parameter reference, the "too many
generated parameters" case got triggered incorrectly and lookup could have
returned an object from the wrong namespace.

Found while running the E2E tests in parallel:

              message: 'running PreFilter plugin "DynamicResources": multiple generated claim
                parameters for ConfigMap. dra-8794/parameters-3 found: [dra-4729/parameters-4
                dra-7328/parameters-4 dra-8794/parameters-4 dra-3402/parameters-4 dra-6156/parameters-4
                dra-1839/parameters-4 dra-7434/parameters-4 dra-6504/parameters-4]'
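
A minimal sketch of the failure mode, not the scheduler's actual indexer code: `paramRef`, `indexKey`, and `buildIndex` are hypothetical names, and the key format is an assumption. It only shows why a key without the namespace makes entries from different namespaces collide:

```go
package main

import "fmt"

// paramRef is a hypothetical reference to the object that generated
// parameters were created from.
type paramRef struct {
	Kind string
	Name string
}

// indexKey builds the lookup key. Without the namespace, identical
// references from different namespaces map to the same key.
func indexKey(namespace string, ref paramRef, includeNamespace bool) string {
	if includeNamespace {
		return ref.Kind + "/" + namespace + "/" + ref.Name
	}
	return ref.Kind + "/" + ref.Name
}

// buildIndex registers one generated object per namespace under its key.
func buildIndex(namespaces []string, ref paramRef, includeNamespace bool) map[string][]string {
	index := make(map[string][]string)
	for _, ns := range namespaces {
		key := indexKey(ns, ref, includeNamespace)
		index[key] = append(index[key], ns+"/generated")
	}
	return index
}

func main() {
	ref := paramRef{Kind: "ConfigMap", Name: "parameters-3"}
	namespaces := []string{"dra-8794", "dra-4729"}

	buggy := buildIndex(namespaces, ref, false)
	fixed := buildIndex(namespaces, ref, true)

	// Without the namespace both objects share one key (ambiguous lookup).
	fmt.Println(len(buggy[indexKey("dra-8794", ref, false)])) // 2
	// With the namespace each lookup finds exactly one object.
	fmt.Println(len(fixed[indexKey("dra-8794", ref, true)])) // 1
}
```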
2024-06-13 17:27:04 +02:00
Kubernetes Prow Robot
9c8c61aee4 Merge pull request #122234 from AxeZhan/podUpdateEvent
[Scheduler]Put pod into the correct queue during podUpdate
2024-06-12 12:28:17 -07:00
AxeZhan
d66f8f9413 schedulingQueue update pod by queueHint 2024-06-12 21:26:09 +08:00
Patrick Ohly
c339eafb76 scheduler: allow PreBind to return "Pending" and "Unschedulable"
Any error result from PreBind was treated as a pod scheduling failure. This was
overlooked when moving blocking API calls in the DRA plugin into a PreBind
implementation, leading to:

    E0604 15:45:50.980929  306340 schedule_one.go:1048] "Error scheduling pod; retrying" err="waiting for resource driver" pod="test/test-draqld28"

That's because DRA's PreBind does some updates in the apiserver, then returns
Pending to wait for the outcome.

The fix is to allow PreBind to return the same special status codes as other
extension points.
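
The idea can be illustrated with a small sketch. The `code` and `status` types and the `classifyPreBind` helper are hypothetical simplifications, not the scheduler framework's real types; the point is only that "Pending" and "Unschedulable" are routed to a retry path instead of being reported as errors:

```go
package main

import "fmt"

// code is a hypothetical, reduced set of plugin status codes.
type code int

const (
	success code = iota
	failure
	unschedulable
	pending
)

// status is a hypothetical plugin result: a code plus a reason.
type status struct {
	code   code
	reason string
}

// classifyPreBind decides what happens after a PreBind result.
// Before the fix, every non-success result would have taken the
// "error" branch; here Pending and Unschedulable are handled as
// regular "try again later" outcomes.
func classifyPreBind(s status) string {
	switch s.code {
	case success:
		return "continue binding"
	case unschedulable, pending:
		return "requeue pod: " + s.reason
	default:
		return "error: " + s.reason
	}
}

func main() {
	fmt.Println(classifyPreBind(status{code: pending, reason: "waiting for resource driver"}))
	fmt.Println(classifyPreBind(status{code: failure, reason: "apiserver update failed"}))
}
```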
2024-06-06 15:28:08 +02:00
AxeZhan
cf73c9d93c remove EvaluatedNodes field in Diagnosis struct 2024-06-04 14:20:55 +08:00
Kubernetes Prow Robot
cfe5a7d03a Merge pull request #125213 from carlory/fix-dra-flaky
fix dra flaky test on TestPlugin
2024-06-03 13:32:10 -07:00
Kubernetes Prow Robot
8bd36c60bd Merge pull request #125197 from gabesaba/prefilter_perf
[scheduler] absent key in NodeToStatusMap implies UnschedulableAndUnresolvable
2024-06-03 07:35:41 -07:00
Gabe
c8f0ea1a54 Don't fill in NodeToStatusMap with UnschedulableAndUnresolvable 2024-05-31 15:52:16 +00:00
carlory
2794baf4c0 fix dra flaky test on TestPlugin 2024-05-30 23:22:37 +08:00
Kubernetes Prow Robot
ee2c1ffa80 Merge pull request #124630 from carlory/fix-123731
DRA: scheduler: index claim and class parameters to simplify lookup
2024-05-29 14:38:14 -07:00
Gabe
7ea3bf4db4 Revert "scheduler: preallocation for NodeToStatusMap"
This reverts commit 9fcd791c01.
2024-05-29 14:09:58 +00:00
carlory
3072987fcc DRA: scheduler: index claim and class parameters to simplify lookup 2024-05-27 15:57:10 +08:00
Kubernetes Prow Robot
0f584a9b86 Merge pull request #124933 from AxeZhan/fix_panic
[Scheduler] Use allNodes when calculating nextStartNodeIndex
2024-05-21 10:29:35 -07:00
AxeZhan
d6d1e6ad8a base on allNodes when calculating nextStartNodeIndex 2024-05-18 00:30:38 +08:00
NoicFank
31a4b13238 enhancement(scheduler): share waitingPods among profiles 2024-05-17 17:07:27 +08:00
Toru Komatsu
5722db7aa3 QueueingHint for CSILimit when deleting pods (#121508)
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-05-14 11:07:11 -07:00
Kensei Nakada
9cd62186e8 cleanup: eliminate unnecessary NodeToStatusMap creation 2024-05-11 12:14:22 +00:00
Kubernetes Prow Robot
9d87fa215d Merge pull request #124735 from AxeZhan/evaluatedNodes
Change EvaluatedNodes to count Nodes that reach Filter phase only
2024-05-09 22:43:22 -07:00
AxeZhan
bcf1c55837 evaluated nodes only consider filter stage 2024-05-10 12:40:12 +08:00
Kubernetes Prow Robot
df074ed002 Merge pull request #124546 from carlory/remove-rbd
CephRBD volume plugin and its csi migration support are removed
2024-05-09 20:50:12 -07:00
Kubernetes Prow Robot
db82fd1604 Merge pull request #124618 from gabesaba/gated_performance
Filter gated pods before calling isPodWorthRequeueing
2024-05-09 11:33:23 -07:00
carlory
c8e91b9bc2 CephRBD volume plugin and its csi migration support were removed in this release 2024-05-09 22:55:34 +08:00
Kubernetes Prow Robot
e798b9c269 Merge pull request #124714 from sanposhiho/prealloc
scheduler: preallocation for NodeToStatusMap
2024-05-07 07:07:58 -07:00
Kensei Nakada
9fcd791c01 scheduler: preallocation for NodeToStatusMap 2024-05-07 00:01:24 +00:00
Kubernetes Prow Robot
8240d882ab Merge pull request #124500 from carlory/scheduler-deprecate-non-csi-plugins
scheduler deprecates non-csi volumelimit plugins
2024-05-06 08:03:04 -07:00
Kubernetes Prow Robot
ade0d2140a Merge pull request #124578 from sanposhiho/scheduler_perf_scheduler_plugin_execution_duration_seconds
support `scheduler_plugin_execution_duration_seconds` in scheduler_perf
2024-05-05 06:40:44 -07:00
Kubernetes Prow Robot
f97ac220fd Merge pull request #124666 from chengjoey/ut-for-123465
add integration test for pod with pvc has node-affinity to non-existent/illegal nodes
2024-05-03 05:50:00 -07:00
joey
a56cc6b100 add integration test for pod with pvc has node-affinity to non-existent/existent nodes
Signed-off-by: joey <zchengjoey@gmail.com>
2024-05-03 19:45:31 +08:00
Kubernetes Prow Robot
de662bb8e0 Merge pull request #124669 from gabesaba/test_gated
Add unit test which checks QueuedPodInfo.Gated matches its source
2024-05-02 05:54:25 -07:00
Gabe
9a8197d0c3 Add unit test which checks Gated is set/unset properly 2024-05-02 10:41:19 +00:00
Gabe
6c6be931ee revert unit test 2024-05-02 10:29:15 +00:00
Kubernetes Prow Robot
29a4812f03 Merge pull request #124080 from claudiubelu/skip-windows-tests
Skip failing Windows tests
2024-05-01 07:48:12 -07:00
Kubernetes Prow Robot
b27608875c Merge pull request #124287 from sanposhiho/tainttoleration
implement QueueingHint in TaintToleration
2024-05-01 00:06:16 -07:00
Gabe
9a8ec13505 make linter happy 2024-04-30 12:06:26 +00:00
Gabe
4e99ada05f Filter gated pods before calling isPodWorthRequeueing 2024-04-29 16:54:40 +00:00
carlory
06d3cd33b2 use slices library instead 2024-04-29 16:50:53 +08:00
wackxu
a4bfaae8a4 implement QueueingHint in TaintToleration 2024-04-29 07:18:35 +00:00
Kensei Nakada
c72b688e12 support scheduler_plugin_execution_duration_seconds in scheduler_perf 2024-04-27 08:22:53 +00:00
Kubernetes Prow Robot
cffc2c0b40 Merge pull request #124102 from pohly/dra-scheduler-assume-cache
scheduler: move assume cache to utils
2024-04-26 08:49:12 -07:00
Claudiu Belu
2be8baeaef unittests: Skip failing Windows tests
Some of the unit tests are currently failing on Windows.

Skip them for now, and remove the skips later, once the underlying issues
have been resolved.
2024-04-25 14:24:16 +00:00
Patrick Ohly
7f54c5dfec scheduler: remove AssumeCache interface
There's no reason for having the interface because there is only one
implementation. Makes the implementation of the test functions a bit
simpler (no casting). They are still stand-alone functions instead of methods
because they should not be considered part of the "normal" API.
2024-04-25 11:46:58 +02:00
Patrick Ohly
26e0409c36 scheduler: move assume cache to utils, part 2
This is now used by both the volumebinding and dynamicresources plugin, so
promoting it to a common helper package is better.

In terms of functionality, nothing was changed. Documentation got
updated (warns about storing locally modified objects, clarifies what the Get
parameters are). Code coverage should be a bit better than before (tested with
and without indexer, exercises event handlers, more error paths).

Checking for specific errors can now be done via errors.Is.
2024-04-25 11:45:43 +02:00
Patrick Ohly
910b90fca3 scheduler: move assume cache to utils, part 1
This is a verbatim move or copy of the files. They don't build in their new
location yet.
2024-04-25 10:49:41 +02:00
carlory
a9f6374ba0 scheduler deprecates non-csi plugins 2024-04-25 14:27:01 +08:00