A dedicated ActionType that is only used when the scheduler itself detects a
change in the list of generated ResourceClaims of a pod avoids calling the
DRA plugin for unrelated Pod changes.
Commit d66f8f9 added the requirement that "plugins have to implement a
QueueingHint for Pod/Update event if the rejection from them could be resolved
by updating unscheduled Pods itself".
This applies to DRA because the name of a generated ResourceClaim must be
recorded in the pod status before the pod can be scheduled.
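Below is a minimal sketch of that Pod/Update hint logic, not the actual plugin
code: the QueueingHint type and function names are simplified stand-ins for
the scheduler framework; only pod.Status.ResourceClaimStatuses is the real API
field where the generated claim names are recorded.

```go
// Package dra: illustrative sketch only, not the real plugin source.
package dra

import (
	v1 "k8s.io/api/core/v1"
)

// QueueingHint is a simplified stand-in for the scheduler framework type.
type QueueingHint int

const (
	QueueSkip QueueingHint = iota
	Queue
)

// generatedClaimNames collects the generated ResourceClaim names that are
// recorded in the pod status.
func generatedClaimNames(pod *v1.Pod) map[string]string {
	names := map[string]string{}
	for _, s := range pod.Status.ResourceClaimStatuses {
		if s.ResourceClaimName != nil {
			names[s.Name] = *s.ResourceClaimName
		}
	}
	return names
}

// isSchedulableAfterPodChange requeues the pod only when the update changed
// the recorded claim names; all other Pod updates are skipped because they
// cannot resolve a rejection by the DRA plugin.
func isSchedulableAfterPodChange(oldPod, newPod *v1.Pod) QueueingHint {
	oldNames, newNames := generatedClaimNames(oldPod), generatedClaimNames(newPod)
	if len(oldNames) != len(newNames) {
		return Queue
	}
	for ref, claim := range newNames {
		if oldNames[ref] != claim {
			return Queue
		}
	}
	return QueueSkip
}
```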
Skip queueing for an unrelated change that keeps the pod schedulable when QueueHints are enabled.
Split add from QHints disabled case
Remove case when QHints are disabled
Remove two QHint alternatives in unit tests
more fine-grained Node QHint for NodeResourceFit plugin
Return early when the updated Node causes a mismatch
Revert "more fine-grained Node QHint for NodeResourceFit plugin"
This reverts commit dfbceb60e0c1c4e47748c12722d9ed6dba1a8366.
Add integration test for requeue of a pod previously rejected by NodeAffinity plugin when a suitable Node is added
Add integration test for a Node update operation that does not trigger requeue in NodeAffinity plugin
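A rough sketch of the requeue decision those tests cover; the QueueingHint
type and the matchesNodeAffinity helper are illustrative assumptions, not the
NodeAffinity plugin's actual code.

```go
// Package nodeaffinityhint: illustrative sketch only.
package nodeaffinityhint

import v1 "k8s.io/api/core/v1"

// QueueingHint is a simplified stand-in for the scheduler framework type.
type QueueingHint int

const (
	QueueSkip QueueingHint = iota
	Queue
)

// isSchedulableAfterNodeChange requeues the pod only when the added or
// updated Node satisfies the pod's node affinity (checked via the supplied
// matcher); other Node changes cannot resolve a NodeAffinity rejection and
// are skipped, which is what the second integration test verifies.
func isSchedulableAfterNodeChange(pod *v1.Pod, node *v1.Node, matchesNodeAffinity func(*v1.Pod, *v1.Node) bool) QueueingHint {
	if matchesNodeAffinity(pod, node) {
		return Queue
	}
	return QueueSkip
}
```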
Remove inaccurate comment
Apply review comments
fix if condition
add test
add log
eliminate unnecessary args from log
fix Queue condition
check original pod status
fix return value when the pod is schedulable
fix tweak
fix testcase
Making unschedulable pods schedulable again after ResourceSlice cluster events
was accidentally left out when adding structured parameters to Kubernetes 1.30.
All E2E tests were defined so that a driver starts first. A new test with a
different order (create pod first, wait for unschedulable, start driver)
triggered the bug and now passes.
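A minimal sketch of the missing piece, using simplified stand-ins rather than
the actual scheduler framework types: the DRA plugin has to register
ResourceSlice events so that pods it rejected are moved back to the active
queue once a driver publishes its slices.

```go
// Package draevents: illustrative sketch only; ClusterEvent and ActionType
// are simplified stand-ins for the scheduler framework types.
package draevents

type ActionType int

const (
	Add ActionType = 1 << iota
	Update
)

type ClusterEvent struct {
	Resource   string
	ActionType ActionType
}

// eventsToRegister lists the cluster events that can make a pod previously
// rejected by the DRA plugin schedulable again. Before the fix, the
// ResourceSlice entry was missing, so a pod created before its driver started
// was never requeued when the driver's ResourceSlices appeared.
func eventsToRegister() []ClusterEvent {
	return []ClusterEvent{
		{Resource: "ResourceClaim", ActionType: Add | Update},
		{Resource: "ResourceSlice", ActionType: Add | Update},
	}
}
```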
In the API, the effect of the feature gate is that alpha fields get dropped on
create. They get preserved during updates if already set. The
PodSchedulingContext registration is *not* restricted by the feature gate.
This enables deleting stale PodSchedulingContext objects after disabling
the feature gate.
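The create/update behavior can be illustrated with a short sketch of the usual
"drop disabled fields" pattern; the claim type and Controller field here are
placeholders for the gated alpha fields, not the real resource.k8s.io API.

```go
// Package drafields: illustrative sketch of the "drop disabled alpha fields"
// pattern described above; types and field names are placeholders.
package drafields

type claimSpec struct {
	// Controller stands in for an alpha field guarded by the
	// DRAControlPlaneController feature gate.
	Controller string
}

type claim struct {
	Spec claimSpec
}

// dropDisabledFields clears the gated field on create when the gate is off,
// but preserves it on update if the existing object already set it.
func dropDisabledFields(newClaim, oldClaim *claim, gateEnabled bool) {
	if gateEnabled {
		return // gate on: keep everything
	}
	if oldClaim != nil && oldClaim.Spec.Controller != "" {
		return // update of an object that already uses the field: preserve it
	}
	// Create (oldClaim == nil) or update of an object that never set it: drop.
	newClaim.Spec.Controller = ""
}
```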
The scheduler checks the new feature gate before setting up an informer for
PodSchedulingContext objects and when deciding whether it can schedule a
pod. If any claim depends on a control plane controller, the scheduler bails
out, leading to:
Status: Pending
...
Warning FailedScheduling 73s default-scheduler 0/1 nodes are available: resourceclaim depends on disabled DRAControlPlaneController feature. no new claims to deallocate, preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
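A hedged sketch of the pod-level check that produces the event above; the
types and field names are placeholders, and only the error text mirrors the
message shown.

```go
// Package dracheck: illustrative sketch of the scheduling-time check, not the
// actual scheduler code. A claim "depends on a control plane controller" when
// its spec names one; with the gate disabled such pods cannot be scheduled.
package dracheck

import "errors"

// claim is a placeholder for the relevant part of a ResourceClaim.
type claim struct {
	controllerName string // non-empty means a control plane controller is required
}

var errControlPlaneControllerDisabled = errors.New(
	"resourceclaim depends on disabled DRAControlPlaneController feature")

// checkClaims rejects the pod (returning the error that ends up in the
// FailedScheduling event) if any of its claims needs a control plane
// controller while the feature gate is off.
func checkClaims(claims []claim, gateEnabled bool) error {
	if gateEnabled {
		return nil
	}
	for _, c := range claims {
		if c.controllerName != "" {
			return errControlPlaneControllerDisabled
		}
	}
	return nil
}
```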
The rest of the changes prepare for testing the new feature separately from
"structured parameters". The goal is to have base "dra" jobs that enable and
test just structured parameters, plus "classic-dra" jobs that additionally
enable DRAControlPlaneController.