Automatic merge from submit-queue (batch tested with PRs 47043, 48448, 47515, 48446)
Refactor slice intersection
**What this PR does / why we need it**:
In the worst case, the original method is O(N^2), while the current method is roughly 3 * O(N) (three linear passes).
I think that is better.
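For illustration, a minimal sketch of the linear approach on string slices (not the actual code in this PR); a map gives O(1) membership checks:
```go
package main

import "fmt"

// intersectionStrings returns the unique elements present in both slices.
// One pass to index a, one pass over b, plus map construction: roughly three
// O(N) steps instead of an O(N^2) nested loop.
func intersectionStrings(a, b []string) []string {
	inA := make(map[string]bool, len(a))
	for _, s := range a {
		inA[s] = true
	}
	seen := make(map[string]bool, len(b))
	var result []string
	for _, s := range b {
		if inA[s] && !seen[s] {
			seen[s] = true
			result = append(result, s)
		}
	}
	return result
}

func main() {
	fmt.Println(intersectionStrings([]string{"a", "b", "c"}, []string{"b", "c", "d"})) // [b c]
}
```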
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue (batch tested with PRs 46926, 48468)
Added helper funcs to schedulercache.Resource.
**What this PR does / why we need it**:
Avoid duplicated code by introducing helper funcs.
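Purely as an illustration, with hypothetical field and method names rather than the actual schedulercache code, helpers like these keep the per-resource arithmetic in one place instead of repeating it at every call site:
```go
// Illustrative only: a simplified stand-in for schedulercache.Resource;
// the real struct has more fields.
package schedulercache

type Resource struct {
	MilliCPU int64
	Memory   int64
}

// Add accumulates another Resource into r.
func (r *Resource) Add(other *Resource) {
	r.MilliCPU += other.MilliCPU
	r.Memory += other.Memory
}

// SetMaxResource keeps the per-field maximum, e.g. when accounting for init
// containers that run sequentially.
func (r *Resource) SetMaxResource(other *Resource) {
	if other.MilliCPU > r.MilliCPU {
		r.MilliCPU = other.MilliCPU
	}
	if other.Memory > r.Memory {
		r.Memory = other.Memory
	}
}
```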
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #46924
**Release note**:
```release-note-none
```
Automatic merge from submit-queue
Cleanup predicates.go
**What this PR does / why we need it**:
Clean up some comments and uses of errors.New().
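As an illustration of the idiom (not the exact change in this PR), repeated inline errors.New calls can be replaced with predeclared package-level errors; the names below are made up:
```go
// Hypothetical names, for illustration only: predeclared errors avoid
// allocating a new error at every return site and give callers a stable
// value to compare against.
package predicates

import "errors"

var (
	ErrExampleNodeNotFound = errors.New("node not found")
	ErrExamplePodNotFit    = errors.New("pod does not fit on node")
)
```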
**Special notes for your reviewer**:
/cc @jayunit100
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Removes alpha feature gate for affinity annotations.
**What this PR does / why we need it**:
In 1.5 we added a backstop to support alpha affinity annotations. This PR removes that support in favor of the Beta fields per discussions.
It also serves as a precursor to some of the component config work that @ncdc has done around @mikedanese's design proposal.
xref: https://github.com/kubernetes/kubernetes/pull/41617
**Special notes for your reviewer**:
**Release note**:
```
Removes alpha feature gate for pod affinity annotations.
```
/cc @kubernetes/sig-scheduling-pr-reviews @kubernetes/sig-cluster-lifecycle-misc
Automatic merge from submit-queue (batch tested with PRs 47883, 47179, 46966, 47982, 47945)
Fix local isolation for pod requesting only overlay or scratch
**What this PR does / why we need it**:
Fix the overlay resource predicates for pods that request only overlay or scratch storage.
E.g. the following pod can pass the predicate even if the node only has 512Gi of overlay storage available.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        storage.kubernetes.io/overlay: 1024Gi
```
Similarly, the following pod will also pass the predicate:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    emptyDir:
      sizeLimit: 1024Gi
```
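A rough sketch of the intended accounting, with made-up types and field names rather than the real predicate code: per-container overlay requests and emptyDir sizeLimits should both count against the node's scratch capacity, even when a pod sets only one of them.
```go
package main

import "fmt"

type containerRequest struct {
	overlayBytes int64 // from a storage.kubernetes.io/overlay request
}

type emptyDirVolume struct {
	sizeLimitBytes int64 // from emptyDir.sizeLimit
}

type podStorageRequest struct {
	containers []containerRequest
	emptyDirs  []emptyDirVolume
}

// totalStorageRequest adds overlay requests from every container and the
// sizeLimit of every emptyDir volume, so a pod that sets only one of the
// two is still counted against the node's scratch capacity.
func totalStorageRequest(p podStorageRequest) int64 {
	var total int64
	for _, c := range p.containers {
		total += c.overlayBytes
	}
	for _, v := range p.emptyDirs {
		total += v.sizeLimitBytes
	}
	return total
}

func main() {
	pod := podStorageRequest{
		containers: []containerRequest{{overlayBytes: 1024 << 30}}, // 1024Gi overlay request
	}
	allocatable := int64(512) << 30 // node only has 512Gi of scratch space
	fmt.Println("fits:", totalStorageRequest(pod) <= allocatable)   // fits: false
}
```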
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes https://github.com/kubernetes/kubernetes/issues/47798
**Special notes for your reviewer**:
**Release note**:
```release-note
```
@jingxu97 @vishh @dashpole
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
Remove duplicate errors from an aggregate error input.
This PR, in general, removes duplicate errors from an aggregate error input, and returns unique errors with their occurrence count. Specifically, this PR helps with some scheduler errors that fill the log enormously. For example, see the following `truncated` output from a 300-plus nodes cluster, as there was a same error from almost all nodes.
[SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found.........
After this PR, the output looks like (on a 2-node cluster):
SchedulerPredicates failed due to persistentvolumeclaims "mongodb" not found, which is unexpected.(Count=2)
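A minimal sketch of the deduplication idea (not the actual aggregate-error change): collapse identical messages and annotate them with an occurrence count.
```go
package main

import (
	"errors"
	"fmt"
)

// dedupErrors returns the unique errors from errs in first-seen order,
// appending "(Count=N)" to any message that occurred more than once.
func dedupErrors(errs []error) []error {
	counts := make(map[string]int)
	var order []string
	for _, err := range errs {
		msg := err.Error()
		if counts[msg] == 0 {
			order = append(order, msg)
		}
		counts[msg]++
	}
	out := make([]error, 0, len(order))
	for _, msg := range order {
		if n := counts[msg]; n > 1 {
			out = append(out, fmt.Errorf("%s(Count=%d)", msg, n))
		} else {
			out = append(out, errors.New(msg))
		}
	}
	return out
}

func main() {
	errs := []error{
		errors.New(`persistentvolumeclaims "mongodb" not found`),
		errors.New(`persistentvolumeclaims "mongodb" not found`),
	}
	for _, e := range dedupErrors(errs) {
		fmt.Println(e) // persistentvolumeclaims "mongodb" not found(Count=2)
	}
}
```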
@derekwaynecarr @smarterclayton @kubernetes/sig-scheduling-pr-reviews
Fixes https://github.com/kubernetes/kubernetes/issues/47145
Automatic merge from submit-queue (batch tested with PRs 47083, 44115, 46881, 47082, 46577)
Scheduler should not log an error when there is no fit
**What this PR does / why we need it**:
The scheduler should not log an error when it is unable to find a fit for a pod, since this is an expected situation when the cluster lacks resources that satisfy the pod's requirements.
Automatic merge from submit-queue (batch tested with PRs 46787, 46876, 46621, 46907, 46819)
Highlight nodeSelector when checking nodeSelector for Pod.
**What this PR does / why we need it**:
Currently, the function that checks whether a Pod's `nodeSelector` matches a node is named `PodSelectorMatches`; it is better to update the name to reflect that it is checking the `nodeSelector` of a Pod.
The proposal is to rename `PodSelectorMatches` to `PodMatchNodeSelector`.
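For reference, a hedged sketch of the check behind the name (the real predicate also handles node affinity and other details):
```go
package main

import "fmt"

// podMatchesNodeSelector reports whether every key/value pair in the pod's
// nodeSelector is present in the node's labels.
func podMatchesNodeSelector(nodeSelector, nodeLabels map[string]string) bool {
	for k, v := range nodeSelector {
		if nodeLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(podMatchesNodeSelector(
		map[string]string{"disktype": "ssd"},
		map[string]string{"disktype": "ssd", "zone": "a"},
	)) // true
}
```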
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #
**Special notes for your reviewer**:
**Release note**:
```release-note
NONE
```
Automatic merge from submit-queue
Add local storage (scratch space) allocatable support
This PR adds support for allocatable local storage (scratch space).
This feature only covers the root file system, which is shared by Kubernetes
components, users' containers, and/or images. Users can use the
--kube-reserved flag to reserve storage for kube system components.
If the allocatable storage for users' pods is used up, some pods will be
evicted to free the storage resource.
This feature is part of local storage capacity isolation and described in the proposal https://github.com/kubernetes/community/pull/306
**Release note**:
```release-note
This feature exposes local storage capacity for the primary partitions, and supports & enforces storage reservation in Node Allocatable
```
Automatic merge from submit-queue (batch tested with PRs 46239, 46627, 46346, 46388, 46524)
move labels to components which own the APIs
During the apimachinery split in 1.6, we accidentally moved several label APIs into apimachinery. They don't belong there, since the individual APIs are not general machinery concerns, but instead are the concern of particular components: most commonly the kubelet. This pull moves the labels into their owning components and out of API machinery.
@kubernetes/sig-api-machinery-misc @kubernetes/api-reviewers @kubernetes/api-approvers
@derekwaynecarr since most of these are related to the kubelet
This PR adds a check for local storage requests when admitting pods. If
the local storage request exceeds the available resources, the pod will be
rejected.
Automatic merge from submit-queue (batch tested with PRs 46076, 43879, 44897, 46556, 46654)
Local storage plugin
**What this PR does / why we need it**:
Volume plugin implementation for local persistent volumes. The scheduler predicate will direct already-bound PVCs to the node that the local PV is on. PVC binding still happens independently.
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*:
Part of #43640
**Release note**:
```
Alpha feature: Local volume plugin allows local directories to be created and consumed as a Persistent Volume. These volumes have node affinity and pods will only be scheduled to the node that the volume is at.
```
Automatic merge from submit-queue
Move hardPodAffinitySymmetricWeight to scheduler policy config
**What this PR does / why we need it**:
Move hardPodAffinitySymmetricWeight to scheduler policy config
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #43845
**Special notes for your reviewer**:
If you like this, I will add tests later.
**Release note**:
```
Move hardPodAffinitySymmetricWeight from KubeSchedulerConfiguration to scheduler Policy config
```
Automatic merge from submit-queue
removing generic_scheduler todo after discussion (#46027)
**What this PR does / why we need it**:
**Which issue this PR fixes** #46027
**Special notes for your reviewer**: just a quick clean cc @wojtek-t
**Release note**:
```release-note
```
Automatic merge from submit-queue (batch tested with PRs 45766, 46223)
Scheduler should use a shared informer, and fix broken watch behavior for cached watches
It can be used either with a true shared informer or with a local shared informer created just for the scheduler.
Fixes a bug in the cache watcher where we were returning the "current" object from a watch event, not the historic event. This means that we broke behavior when introducing the watch cache. This may have API implications for filtering watch consumers - but on the other hand, it prevents clients filtering from seeing objects outside of their watch correctly, which can lead to other subtle bugs.
```release-note
The behavior of some watch calls to the server when filtering on fields was incorrect. If watching objects with a filter, when an update was made that no longer matched the filter a DELETE event was correctly sent. However, the object that was returned by that delete was not the (correct) version before the update, but instead, the newer version. That meant the new object was not matched by the filter. This was a regression from behavior between cached watches on the server side and uncached watches, and thus broke downstream API clients.
```
Automatic merge from submit-queue
Moved qos to api.helpers.
**What this PR does / why we need it**:
`GetPodQoS` is also used by other components, e.g. the kube-scheduler, and is not bound to the kubelet; this moves it to the API helpers so it can also be consumed via client-go.
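A simplified sketch of the classification idea, ignoring request defaulting and other details the real helper handles:
```go
package main

import "fmt"

// container holds the requests and limits for one container, keyed by
// resource name (e.g. "cpu", "memory").
type container struct {
	requests map[string]string
	limits   map[string]string
}

// podQoS is a simplified classification: BestEffort if nothing is requested
// or limited, Guaranteed if every container sets limits equal to requests
// for every resource it mentions, Burstable otherwise.
func podQoS(containers []container) string {
	hasAny := false
	guaranteed := true
	for _, c := range containers {
		if len(c.requests)+len(c.limits) > 0 {
			hasAny = true
		}
		if len(c.requests) == 0 || len(c.requests) != len(c.limits) {
			guaranteed = false
		}
		for name, req := range c.requests {
			if c.limits[name] != req {
				guaranteed = false
			}
		}
	}
	if !hasAny {
		return "BestEffort"
	}
	if guaranteed {
		return "Guaranteed"
	}
	return "Burstable"
}

func main() {
	fmt.Println(podQoS([]container{{
		requests: map[string]string{"cpu": "100m", "memory": "128Mi"},
		limits:   map[string]string{"cpu": "100m", "memory": "128Mi"},
	}})) // Guaranteed
}
```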
**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #N/A
**Release note**:
```release-note-none
```