
305 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
ba791275ce Merge pull request #59671 from bsalamat/sched_queue_perf
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Improve performance of scheduling queue by adding a hash map to track all pods with a nominatedNodeName

**What this PR does / why we need it**:
Our investigations show a performance regression in the new scheduling queue. The queue is not enabled by default; it is used only when the alpha feature "priority and preemption" is enabled. This PR is an important performance improvement for those who want to use priority and preemption in larger clusters.
The PR adds a hash table to track nominated Pods so that finding such Pods is faster.
Other than improving performance, we don't expect this PR to change the behavior of the scheduler.
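
A minimal sketch of the idea, not the PR's actual code: an index keyed by node name lets the queue find the Pods nominated for a node with one map lookup instead of a scan over every pending Pod. The `Pod` struct, `nominatedPodMap`, and its methods are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for v1.Pod.
type Pod struct {
	UID               string
	Name              string
	NominatedNodeName string
}

// nominatedPodMap is a hypothetical index from node name to the Pods
// that have been nominated to run on that node.
type nominatedPodMap struct {
	byNode map[string][]*Pod
}

func newNominatedPodMap() *nominatedPodMap {
	return &nominatedPodMap{byNode: make(map[string][]*Pod)}
}

// add records the Pod under its nominated node. Pods without a
// nomination are ignored, and a Pod is never stored twice.
func (m *nominatedPodMap) add(p *Pod) {
	if p.NominatedNodeName == "" {
		return
	}
	for _, q := range m.byNode[p.NominatedNodeName] {
		if q.UID == p.UID {
			return
		}
	}
	m.byNode[p.NominatedNodeName] = append(m.byNode[p.NominatedNodeName], p)
}

// remove drops the Pod from the index and deletes empty node entries.
func (m *nominatedPodMap) remove(p *Pod) {
	pods := m.byNode[p.NominatedNodeName]
	for i, q := range pods {
		if q.UID == p.UID {
			pods = append(pods[:i], pods[i+1:]...)
			break
		}
	}
	if len(pods) == 0 {
		delete(m.byNode, p.NominatedNodeName)
		return
	}
	m.byNode[p.NominatedNodeName] = pods
}

// podsForNode returns the nominated Pods for a node with a single lookup.
func (m *nominatedPodMap) podsForNode(node string) []*Pod {
	return m.byNode[node]
}

func main() {
	m := newNominatedPodMap()
	a := &Pod{UID: "u1", Name: "a", NominatedNodeName: "node-1"}
	b := &Pod{UID: "u2", Name: "b"} // no nomination, so it is not indexed
	m.add(a)
	m.add(b)
	fmt.Println(len(m.podsForNode("node-1")), len(m.podsForNode("node-2")))
	m.remove(a)
	fmt.Println(len(m.podsForNode("node-1")))
}
```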

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

ref/ #56032
ref/ #57471 

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

/sig scheduling
2018-02-13 00:07:58 -08:00
Kubernetes Submit Queue
821cf9234d Merge pull request #59246 from huangjiuyuan/scheduler/add-tests-for-schedulercache
Automatic merge from submit-queue (batch tested with PRs 59479, 59246). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Add tests for schedulercache

**What this PR does / why we need it**:
Add tests for `node_info.go` under `schedulercache` package.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:
```
NONE
```
2018-02-12 17:14:31 -08:00
Kubernetes Submit Queue
ab2e1cb02a Merge pull request #59479 from tossmilestone/avoid-ecahe-update-race
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Avoid race condition when updating equivalence cache

**What this PR does / why we need it**:
Hold the ecache lock while updating the ecache each time a predicate runs, to avoid a race condition.
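
A minimal sketch of the locking pattern, assuming a cache shaped as a plain map guarded by a `sync.Mutex`. The `equivalenceCache` type, its fields, and the key format are hypothetical and do not reproduce the scheduler's real ecache.

```go
package main

import (
	"fmt"
	"sync"
)

// equivalenceCache is a hypothetical, simplified cache of predicate results.
type equivalenceCache struct {
	mu      sync.Mutex
	results map[string]bool
}

func newEquivalenceCache() *equivalenceCache {
	return &equivalenceCache{results: make(map[string]bool)}
}

// update stores a predicate result while holding the lock, so that
// concurrent predicate runs cannot write to the map at the same time.
func (c *equivalenceCache) update(key string, fit bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.results[key] = fit
}

// lookup reads a cached result under the same lock.
func (c *equivalenceCache) lookup(key string) (fit, ok bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	fit, ok = c.results[key]
	return fit, ok
}

func main() {
	c := newEquivalenceCache()
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.update(fmt.Sprintf("node-%d/predicate", i), i%2 == 0)
		}(i)
	}
	wg.Wait()
	fit, ok := c.lookup("node-4/predicate")
	fmt.Println(fit, ok)
}
```

Without the mutex, the concurrent writes in `main` would be a data race on the map, which the Go race detector reports.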

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fix #58507 

**Special notes for your reviewer**:
None

**Release note**:

```release-note
None
```
2018-02-12 16:38:07 -08:00
Bobby (Babak) Salamat
df5fc09411 compare Pods by UID, not by name and namespace 2018-02-12 10:13:13 -08:00
mlmhl
b3fff71161 format some import statements in scheduler pkg 2018-02-12 09:04:00 +08:00
Kubernetes Submit Queue
74089bc4bb Merge pull request #58737 from NickrenREN/fix-scheduler-ephemeral-storage
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Subtract local ephemeral storage resource from NodeInfo when removing pod

**What this PR does / why we need it**:
When a pod is removed, its local ephemeral storage resource must also be subtracted from NodeInfo.
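
A minimal sketch of the accounting involved, using simplified, hypothetical `Resource` and `NodeInfo` types rather than the real `schedulercache` ones. The point is that removal has to mirror addition for every tracked resource.

```go
package main

import "fmt"

// Resource is a simplified, hypothetical version of the scheduler's
// per-node resource accounting.
type Resource struct {
	MilliCPU         int64
	Memory           int64
	EphemeralStorage int64
}

// NodeInfo is a hypothetical stand-in that tracks the total requested
// resources of the pods assigned to a node.
type NodeInfo struct {
	requested Resource
}

// addPod adds a pod's requests to the node totals.
func (n *NodeInfo) addPod(req Resource) {
	n.requested.MilliCPU += req.MilliCPU
	n.requested.Memory += req.Memory
	n.requested.EphemeralStorage += req.EphemeralStorage
}

// removePod subtracts the same fields that addPod added.
func (n *NodeInfo) removePod(req Resource) {
	n.requested.MilliCPU -= req.MilliCPU
	n.requested.Memory -= req.Memory
	// Skipping this subtraction would leave the node looking fuller than it is.
	n.requested.EphemeralStorage -= req.EphemeralStorage
}

func main() {
	node := &NodeInfo{}
	pod := Resource{MilliCPU: 500, Memory: 1 << 30, EphemeralStorage: 2 << 30}
	node.addPod(pod)
	node.removePod(pod)
	fmt.Println(node.requested == Resource{})
}
```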

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```

/kind bug
/sig storage
/sig scheduling

/assign @jingxu97  @bsalamat
2018-02-11 13:43:01 -08:00
Wang Guoliang
31aad75316 more concise to merge the array 2018-02-11 21:27:11 +08:00
Di Xu
48388fec7e fix all the typos across the project 2018-02-11 11:04:14 +08:00
Jesse Haka
6665fa7144 taint also node controller
fix function

fix gofmt

fix function return value

fix tests

skip notimplemented error

remove factory unused

in openstack we should try to find instanceid from all states instead of ACTIVE, all other cloudproviders do this already

fix tests and lint

fix gofmt

fix nodelifecycletest

fix lint errors
2018-02-10 15:41:24 +02:00
huangjiuyuan
7d12796297 Add tests for schedulercache 2018-02-10 16:24:16 +08:00
NickrenREN
6e44d9522c release local ephemeral storage resource when removing pod 2018-02-10 11:36:11 +08:00
Bobby (Babak) Salamat
69d62a9288 Improve performance of scheduling queue by adding a hash map to track all pods with a nominatedNodeName. 2018-02-09 14:07:29 -08:00
tossmilestone
e155582662 Avoid race condition when updating equivalence cache. 2018-02-09 16:26:41 +08:00
Jesse Haka
3cf5b172fa add node shutdown taint
shutdowned -> stopped

use shutdown everywhere

use patch in taints api call

use notimplemented in clouds use AddOrUpdateTaintOnNode

correct log text

add fake cloud

try to fix bazel

add shutdown tests

add context
2018-02-08 12:56:06 +02:00
tossmilestone
3fdacfead5 Fix golint errors in pkg/scheduler based on golint check 2018-02-08 15:22:47 +08:00
Kubernetes Submit Queue
5a4b160cf0 Merge pull request #59281 from bsalamat/nominated_node
Automatic merge from submit-queue (batch tested with PRs 59010, 59212, 59281, 59014, 59297). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Replace nominateNodeName annotation with PodStatus.NominatedNodeName

**What this PR does / why we need it**:
Replaces nominateNodeName annotation with PodStatus.NominatedNodeName in the scheduler's logic. We don't expect any logic/behavior changes.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```

ref #57471

/sig scheduling
cc: @k82cn @aveshagarwal @resouer
2018-02-07 15:27:43 -08:00
Kubernetes Submit Queue
f72f90f624 Merge pull request #59449 from aveshagarwal/master-rhbz-1540822
Automatic merge from submit-queue (batch tested with PRs 58444, 59283, 59437, 59325, 59449). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Fix to register priority function ResourceLimitsPriority correctly.

**What this PR does / why we need it**:
This PR fixes registration of the priority function ResourceLimitsPriority. Previously this function was registered inside `init()`. Because ResourceLimitsPriority is behind the feature gate `ResourceLimitsPriorityFunction`, the enabled state of the feature was not visible in `init()`, even when the feature was enabled. The registration now happens inside `ApplyFeatureGates()` in the scheduler, where the function can be correctly registered after the feature has been enabled.
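
A minimal sketch of why the timing matters, assuming a simplified registry. In Go, `init()` runs before `main`, so a gate that is switched on during startup cannot be observed there. The maps `featureEnabled` and `priorityRegistry`, the helper `registerPriority`, the weight values, and the lower-case `applyFeatureGates` are hypothetical stand-ins, not the scheduler's real API.

```go
package main

import "fmt"

// featureEnabled is a hypothetical stand-in for feature-gate state that
// is only populated once the program has started (for example from flags).
var featureEnabled = map[string]bool{}

// priorityRegistry is a hypothetical registry of priority functions,
// keyed by name and storing only a weight for brevity.
var priorityRegistry = map[string]int{}

func registerPriority(name string, weight int) {
	priorityRegistry[name] = weight
}

func init() {
	// Ungated defaults can be registered here. A check of featureEnabled
	// at this point would always see the gate as off, because init()
	// runs before main has had a chance to set it.
	registerPriority("LeastRequestedPriority", 1)
}

// applyFeatureGates runs after the gates are known, so gated
// registrations belong here.
func applyFeatureGates() {
	if featureEnabled["ResourceLimitsPriorityFunction"] {
		registerPriority("ResourceLimitsPriority", 1)
	}
}

func main() {
	featureEnabled["ResourceLimitsPriorityFunction"] = true
	applyFeatureGates()
	_, ok := priorityRegistry["ResourceLimitsPriority"]
	fmt.Println("registered:", ok)
}
```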


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note

None
```

@kubernetes/sig-scheduling-pr-reviews @bsalamat @ravisantoshgudimetla
2018-02-06 22:42:45 -08:00
Kubernetes Submit Queue
7223729d51 Merge pull request #59245 from resouer/equiv-node
Automatic merge from submit-queue (batch tested with PRs 59394, 58769, 59423, 59363, 59245). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Ensure equiv hash calculation is per schedule

**What this PR does / why we need it**:

Currently, the equiv hash is calculated once per schedule but also once per node. This is a potential cause of slow integration tests; see #58881.

We should ensure this happens only once while scheduling a given pod, no matter how many nodes there are.
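
A minimal sketch of the change in shape, assuming the hash depends only on the pod: compute it once before the node loop and pass it in, instead of recomputing it for each node. The names `equivalenceHash`, `schedule`, and the `hashCalls` counter are hypothetical, and FNV is used only as a convenient standard-library hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashCalls counts how often the hash is computed, for illustration only.
var hashCalls int

// equivalenceHash is a hypothetical hash over a pod's scheduling-relevant
// fields, represented here as a single string.
func equivalenceHash(podSpec string) uint64 {
	hashCalls++
	h := fnv.New64a()
	h.Write([]byte(podSpec))
	return h.Sum64()
}

// schedule computes the hash once per scheduling attempt and reuses it
// for every node, rather than calling equivalenceHash inside the loop.
func schedule(podSpec string, nodes []string, fits func(node string, hash uint64) bool) []string {
	hash := equivalenceHash(podSpec)
	var feasible []string
	for _, n := range nodes {
		if fits(n, hash) {
			feasible = append(feasible, n)
		}
	}
	return feasible
}

func main() {
	nodes := []string{"node-1", "node-2", "node-3"}
	feasible := schedule("cpu=500m,mem=1Gi", nodes, func(node string, hash uint64) bool {
		return node != "node-2"
	})
	fmt.Println(feasible, "hash computed", hashCalls, "time(s)")
}
```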

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58989

**Special notes for your reviewer**:

**Release note**:

```release-note
Ensure equiv hash calculation is per schedule
```
2018-02-06 21:34:48 -08:00
Avesh Agarwal
a450116cc1 Fix to register priority function ResourceLimitsPriority correctly. 2018-02-06 20:08:11 -05:00
Harry Zhang
482dc31937 Ensure equiv hash calculation per schedule 2018-02-06 14:42:39 -08:00
Yang Guo
f69eaa3b18 kube-scheduler: Use default predicates/prioritizers if policy config does not specify them 2018-02-06 13:01:33 -08:00
Harry Zhang
bff62d2c86 Equiv class volume fixes
Generated bazel
2018-02-05 16:58:43 -08:00
Kubernetes Submit Queue
bdde196191 Merge pull request #58999 from tanshanshan/scheduler-msg
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Make predicate errors more human readable

**What this PR does / why we need it**:
Make predicate errors more human readable

Thanks.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
ref #58546

**Special notes for your reviewer**:

**Release note**:

```release-note

```
2018-02-02 13:36:23 -08:00
Bobby (Babak) Salamat
ec69dd139b autogenerated files 2018-02-02 13:06:33 -08:00
Bobby (Babak) Salamat
bfd950e471 Replace nominateNodeName annotation with PodStatus.NominatedNodeName in scheduler logic 2018-02-02 13:06:33 -08:00
tanshanshan
c389e3cec7 Make predicate errors more human readable 2018-02-01 10:22:53 +08:00
Kubernetes Submit Queue
b3115df40b Merge pull request #58799 from lichuqiang/cleanup
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

remove unused func in FakeConfigurator of scheduler

**What this PR does / why we need it**:
Current scheduler `Configurator` interface looks like this:
```
type Configurator interface {
	GetPriorityFunctionConfigs(priorityKeys sets.String) ([]algorithm.PriorityConfig, error)
	GetPriorityMetadataProducer() (algorithm.PriorityMetadataProducer, error)
	GetPredicateMetadataProducer() (algorithm.PredicateMetadataProducer, error)
	GetPredicates(predicateKeys sets.String) (map[string]algorithm.FitPredicate, error)
	GetHardPodAffinitySymmetricWeight() int32
	GetSchedulerName() string
	MakeDefaultErrorFunc(backoff *util.PodBackoff, podQueue core.SchedulingQueue) func(pod *v1.Pod, err error)

	// Needs to be exposed for things like integration tests where we want to make fake nodes.
	GetNodeLister() corelisters.NodeLister
	GetClient() clientset.Interface
	GetScheduledPodLister() corelisters.PodLister

	Create() (*Config, error)
	CreateFromProvider(providerName string) (*Config, error)
	CreateFromConfig(policy schedulerapi.Policy) (*Config, error)
	CreateFromKeys(predicateKeys, priorityKeys sets.String, extenders []algorithm.SchedulerExtender) (*Config, error)
}
```
The funcs `ResponsibleForPod` and `Run`, which once existed in this interface, have been removed, so the corresponding funcs in `FakeConfigurator` should be removed as well.

**Special notes for your reviewer**:
/kind cleanup
/sig scheduling

**Release note**:

```release-note
NONE
```
2018-01-30 22:08:45 -08:00
Bobby (Babak) Salamat
2274e93b64 Revert "Change equivalence class hashing function" 2018-01-26 18:13:15 -08:00
lichuqiang
5da8d55e45 remove unused func in FakeConfigurator of scheduler 2018-01-25 16:08:13 +08:00
Jonathan Basseri
e9a3815a6c Fix equivalence cache hash tests. 2018-01-24 17:15:42 -08:00
Jonathan Basseri
466a499fcb Move equivalence class hash code.
This moves the equivalence hashing code from
algorithm/predicates/utils.go to core/equivalence_cache.go.

In the process, the hashing function and the hashing function factory
are no longer injectable dependencies.
2018-01-24 17:15:42 -08:00
Jonathan Basseri
5ab4714520 Change equivalence hash function.
This changes the equivalence class hashing function to use as inputs all
the Pod fields which are read by FitPredicates. Before we used a
combination of OwnerReference and PersistentVolumeClaim info, which was
a close approximation. The new method ensures that hashing remains
correct regardless of controller behavior.

The PVCSet field can be removed from equivalencePod because it is
implicitly included in the Volume list.

Tests are now broken.
2018-01-24 17:15:42 -08:00
Jonathan Basseri
4ae7075e27 Add benchmark for equivalence hashing. 2018-01-24 17:15:42 -08:00
Jonathan Basseri
59f0a99909 Fix equiv. cache invalidation of Node condition.
Equivalence cache for CheckNodeConditionPred becomes invalid when
Node.Spec.Unschedulable changes. This can happen even if
Node.Status.Conditions does not change, so move the logic around.

This logic is covered by integration test
"test/integration/scheduler".TestUnschedulableNodes but equivalence
cache is currently skipped when test pods have no OwnerReference.
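
A minimal sketch of the invalidation check described in this commit, assuming simplified types: the cached result must be dropped when the unschedulable flag changes, even if the conditions are identical. The `Node` struct, `conditionsEqual`, and `shouldInvalidateNodeCondition` are hypothetical names, not the scheduler's real code.

```go
package main

import "fmt"

// Node is a simplified, hypothetical stand-in for v1.Node that keeps only
// the two inputs discussed here.
type Node struct {
	Unschedulable bool              // mirrors Node.Spec.Unschedulable
	Conditions    map[string]string // mirrors Node.Status.Conditions as type -> status
}

// conditionsEqual reports whether two condition sets are identical.
func conditionsEqual(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if bv, ok := b[k]; !ok || bv != v {
			return false
		}
	}
	return true
}

// shouldInvalidateNodeCondition checks the unschedulable flag on its own,
// so a change there invalidates the cache even when conditions are equal.
func shouldInvalidateNodeCondition(oldNode, newNode Node) bool {
	if oldNode.Unschedulable != newNode.Unschedulable {
		return true
	}
	return !conditionsEqual(oldNode.Conditions, newNode.Conditions)
}

func main() {
	ready := map[string]string{"Ready": "True"}
	before := Node{Unschedulable: false, Conditions: ready}
	after := Node{Unschedulable: true, Conditions: ready}
	fmt.Println(shouldInvalidateNodeCondition(before, after))
}
```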
2018-01-24 17:07:52 -08:00
Bobby (Babak) Salamat
79601acb2c Add better event handling for deleted Pods 2018-01-23 12:03:35 -08:00
Kubernetes Submit Queue
cf5655d293 Merge pull request #58689 from k82cn/k8s_58648
Automatic merge from submit-queue (batch tested with PRs 58595, 58689). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Checked node.Unschedulable in Toleration predicate.

Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #58648 

**Release note**:

```release-note
None
```
2018-01-23 09:18:33 -08:00
Da K. Ma
430ebffe2b Checked node.Unschedulable in Toleration predicate.
Signed-off-by: Da K. Ma <madaxa@cn.ibm.com>
2018-01-23 20:54:11 +08:00
Kubernetes Submit Queue
603e7c5377 Merge pull request #58590 from zhangxiaoyu-zidif/fix-assuemePod
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

fix the wrong err print of assumepod

**What this PR does / why we need it**:
I think the printed error is wrong: it says the opposite of what was intended.
/cc @timothysc 

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-01-23 00:47:26 -08:00
zhangxiaoyu-zidif
a478db6ada fix the wrong err print of assumepod 2018-01-22 10:50:59 +08:00
Reficul
e3c5747750 fix a little typo in BalancedResourceAllocation
Signed-off-by: Reficul <xuzhenglun@gmail.com>
2018-01-18 12:50:20 +08:00
Cao Shufeng
4e7398b67b remove duplicated import 2018-01-17 09:34:59 +08:00
junxu
5deb5f4913 Rename func name according to TODO 2018-01-15 00:08:59 -05:00
Kubernetes Submit Queue
5911f87dad Merge pull request #56926 from wgliang/master
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

-Add scheduler optimization options, short circuit all predicates if one predicate fails

Signed-off-by: Wang Guoliang <iamwgliang@gmail.com>

**What this PR does / why we need it**:
Short circuit all predicates if one predicate fails. 

I think we can add a switch to control this. Some scenarios do not need to know every cause of failure and can gain a large performance improvement from short-circuiting; users who need to understand all the reasons for a failure, and who accept the current performance, can keep the current logic. This switch should be exposed to the user.
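
A minimal sketch of the switch being proposed, assuming predicates are evaluated in order. With `alwaysCheckAll` set to false, evaluation stops at the first failure and reports only that reason; with it set to true, every predicate runs and all failure reasons are collected. The `predicate` type and `podFitsOnNode` signature are hypothetical, and only the option name `AlwaysCheckAllPredicates` comes from this PR's release note.

```go
package main

import "fmt"

// predicate is a hypothetical, simplified fit predicate.
type predicate struct {
	name string
	fits func(node string) bool
}

// podFitsOnNode evaluates predicates in order. When alwaysCheckAll is
// false it short-circuits on the first failure; when true it keeps going
// and collects every failing predicate name.
func podFitsOnNode(node string, preds []predicate, alwaysCheckAll bool) (bool, []string) {
	var failed []string
	for _, p := range preds {
		if !p.fits(node) {
			failed = append(failed, p.name)
			if !alwaysCheckAll {
				break
			}
		}
	}
	return len(failed) == 0, failed
}

func main() {
	preds := []predicate{
		{name: "PodFitsResources", fits: func(string) bool { return false }},
		{name: "PodFitsHostPorts", fits: func(string) bool { return true }},
		{name: "PodToleratesNodeTaints", fits: func(string) bool { return false }},
	}
	_, short := podFitsOnNode("node-1", preds, false)
	_, all := podFitsOnNode("node-1", preds, true)
	fmt.Println(short)
	fmt.Println(all)
}
```

The trade-off is the one named in the description: short-circuiting does less work per node, at the cost of reporting only the first failure reason.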

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:

Fixes #56889 and #48186

**Special notes for your reviewer**:
@davidopp

**Release note**:

```
Allow the scheduler to set AlwaysCheckAllPredicates; short-circuiting all predicates when one predicate fails can greatly improve scheduling performance.
```
2018-01-14 04:53:05 -08:00
Kubernetes Submit Queue
f6ee0f7331 Merge pull request #58192 from ravisantoshgudimetla/premeptions-metrics-additions
Automatic merge from submit-queue (batch tested with PRs 58192, 58231). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Added metrics for preemption

**What this PR does / why we need it**:
Metrics for preemption duration in scheduler.

**Special notes for your reviewer**:
xref:  https://github.com/kubernetes/kubernetes/issues/57471 
**Release note**:

```release-note
NONE
```
cc @bsalamat
2018-01-13 05:36:46 -08:00
Wang Guoliang
b8526cd077 -Add scheduler optimization options, short circuit all predicates if one predicate fails 2018-01-13 18:18:55 +08:00
Kubernetes Submit Queue
50e04f59e7 Merge pull request #58061 from ravisantoshgudimetla/fix-57152
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Improved readability for messages being logged

**What this PR does / why we need it**:
This improves the readability of messages seen by the end user. /cc @jwforres @bsalamat - For UX

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #57152

**Release note**:

```release-note
NONE
```
2018-01-12 23:50:37 -08:00
ravisantoshgudimetla
8aebf3554c Added metrics for preemption victims, pods preempted and duration of preemption 2018-01-13 12:27:11 +05:30
Kubernetes Submit Queue
98f81e3661 Merge pull request #46245 from ravisantoshgudimetla/metrics_additions
Automatic merge from submit-queue (batch tested with PRs 57266, 58187, 58186, 46245, 56509). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Added metrics for predicate and priority evaluation 

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #45972

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
2018-01-12 20:34:53 -08:00
ravisantoshgudimetla
16ff0c2dda Improved readability for messages being logged 2018-01-13 09:43:11 +05:30
ravisantoshgudimetla
b3c57a880c Build files generated 2018-01-12 09:55:11 +05:30