To allow obtaining and comparing the original error.
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: Ibcef89f7b876a273ecc24548f8d204966e0e6059
This is a performance optimization that reduces the overhead of inter-pod affinity PreFilter calculaitons. Basically
eliminates that overhead when no pods in the cluster use required pod anti-affinity. This offered 20% improvement on 5k clusters for preferred anti-affinity benchmarks.
The apiserver is expected to send pod deletion events that might arrive at a different time. However, sometimes a node could be recreated without its pods being deleted.
Partial revert of https://github.com/kubernetes/kubernetes/pull/86964
Signed-off-by: Aldo Culquicondor <acondor@google.com>
Change-Id: I51f683e5f05689b711c81ebff34e7118b5337571
The lack of this validation on incoming pods causes unpredictable cluster outcomes
when later calculating affinity results against existing pods (see #92714). This fix
quickly addresses the main source where these problems should be caught.
It is unfortunately difficult to add this validation directly to the API server due
to the fact that it may break migrations with existing pods that fail this check. This
is a compromise to address the current issue.
If a reserve plugin's Reserve method returns an error, there could be
previously allocated resources from successfully completed reserve
plugins that must be unallocated by the corresponding Unreserve
operation. Since Unreserve operations are idempotent, this patch runs
the Unreserve operation of ALL reserve plugins when a Reserve operation
fails.
Previously, separate interfaces were defined for Reserve and Unreserve
plugins. However, in nearly all cases, a plugin that allocates a
resource using Reserve will likely want to register itself for Unreserve
as well in order to free the allocated resource at the end of a failed
scheduling/binding cycle. Having separate plugins for Reserve and
Unreserve also adds unnecessary config toil. To that end, this patch
aims to merge the two plugins into a single interface called a
ReservePlugin that requires implementing both the Reserve and Unreserve
methods.
- Make some private functions stateless
- make addNominatedPods() not dependent on genericScheduler
- make addNominatedPods() not dependent on genericScheduler
- make selectVictimsOnNode() not dependent on genericScheduler
- make selectNodesForPreemption() not dependent on genericScheduler
- rename `UpdateNominatedPodForNode` to `AddNominatedPod`
- promote `update` to `UpdateNominatedPod`
- anonymous lock in nominatedMap
- pass PodNominator as an option to NewFramework