11031 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
5fa5b7d6fc Merge pull request #65599 from chrisohaver/splitsvcs
Automatic merge from submit-queue (batch tested with PRs 65348, 65599, 65635, 65688, 65691). If you want to cherry-pick this change to another branch, please follow the instructions at https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md.

distribute services between 2 namespaces in e2e DNS scale services test

**What this PR does / why we need it**:

What: Alters the dns scale test, to distribute the scale load of 10K services between 2 namespaces, so that the test does not fail to create the services.
Why: To allow the dns test to proceed.
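
As a rough illustration of the idea described above (not the code from this PR), the services can be spread by assigning each service index to a namespace round-robin. The helper name `namespaceFor` and the namespace names are hypothetical.

```go
package main

import "fmt"

// namespaceFor picks a namespace for the i-th service by round-robin.
// Hypothetical helper for illustration; not taken from the e2e test itself.
func namespaceFor(i int, namespaces []string) string {
	return namespaces[i%len(namespaces)]
}

func main() {
	// Namespace names are made up for this sketch.
	namespaces := []string{"dns-scale-a", "dns-scale-b"}
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		counts[namespaceFor(i, namespaces)]++
	}
	// Each namespace ends up with half of the 10K services.
	fmt.Println(counts["dns-scale-a"], counts["dns-scale-b"])
}
```

With two namespaces, no single namespace has to hold all 10K services, which is the failure the PR description is trying to avoid.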

Expect this to fix #64774, but won't know until it's actually run in e2e tests, so not marking that issue to auto-close on merge. FWIW, it does pass in local tests using hack/local-up-cluster.sh.

**Release note**:
```release-note
NONE
```
2018-07-02 16:52:09 -07:00
Jeff Grafton
0333b8aadc Update to go1.10.3 2018-07-02 15:46:40 -07:00
Kubernetes Submit Queue
92b81114f4 Merge pull request #65536 from gnufied/fix-flex-crashing-controller-manager
Automatic merge from submit-queue (batch tested with PRs 65299, 65524, 65154, 65329, 65536).

Make various fixes to flex tests and fix some crashes

* Fixes two controller-manager crashes when a flex plugin gets removed from flex directory.
* Also enables e2e tests to run in local clusters and other environments.
* Removes disruptive from flex e2e tests because flex can be installed in a running cluster and does not require kubelet or controller-manager restart anymore.

/sig storage

cc @verult @jsafrane 

```release-note
Fix controller-manager crashes when flex plugin is removed from flex plugin directory
```
2018-07-02 11:06:24 -07:00
David Eads
58136ee568 fail on rbac resources of non-v1 versions in reconcile 2018-07-02 13:07:16 -04:00
Chris O'Haver
e94304d814 split services between multiple namespaces 2018-07-02 10:31:05 -04:00
Kubernetes Submit Queue
7786bd8c9a Merge pull request #64654 from atlassian/missing-error-handling
Automatic merge from submit-queue.

Add missing error handling in schema-related code

**What this PR does / why we need it**:
Adds missing error handling to a few places.

**Which issue(s) this PR fixes**
Updates #51457. Still more work to do to fix the issue - client generation code needs to be updated (addressed in https://github.com/kubernetes/kubernetes/pull/64664).

**Release note**:
```release-note
NONE
```

/kind bug
/sig api-machinery
2018-07-02 07:14:34 -07:00
Davanum Srinivas
5feab86329 Remove --cadvisor-port - has been deprecated since v1.10
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2018-07-02 08:54:14 -04:00
Kubernetes Submit Queue
7496c64b46 Merge pull request #65593 from bsalamat/priority_admission
Automatic merge from submit-queue.

Limit usage of system critical priority classes to the system namespace

**What this PR does / why we need it**:
Changes Priority admission controller to limit usage of system critical priority classes to the system namespace. This change is needed to mitigate the risk of creating many pods at system critical priority levels that could cause preemption of system critical components.
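
A minimal sketch of the rule described above, assuming a simple namespace check. The class names and the `kube-system` namespace come from the release note of this PR; the function `priorityClassAllowed` and the `high-priority` class are hypothetical and are not the admission controller's actual code.

```go
package main

import "fmt"

// systemNamespace and the class names are taken from the release note;
// everything else in this file is an illustrative assumption.
const systemNamespace = "kube-system"

var systemCriticalClasses = map[string]bool{
	"system-node-critical":    true,
	"system-cluster-critical": true,
}

// priorityClassAllowed reports whether a pod in the given namespace may use
// the given priority class under the restriction described in the PR.
func priorityClassAllowed(namespace, className string) bool {
	if systemCriticalClasses[className] {
		return namespace == systemNamespace
	}
	// Other classes are not restricted by this rule.
	return true
}

func main() {
	fmt.Println(priorityClassAllowed("kube-system", "system-node-critical"))
	fmt.Println(priorityClassAllowed("default", "system-cluster-critical"))
	fmt.Println(priorityClassAllowed("default", "high-priority")) // made-up class name
}
```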

ref/ #65557

**Release note**:

```release-note
Limit the usage of system-node-critical and system-cluster-critical priority classes to kube-system namespace.
```

/sig scheduling
2018-07-02 01:09:06 -07:00
Jose A. Rivera
db69638911 Add StatefulSet rollback and rolling update test with PVCs
Signed-off-by: Jose A. Rivera <jarrpa@redhat.com>
2018-07-01 21:17:04 -05:00
Kubernetes Submit Queue
e49e3baa83 Merge pull request #64939 from hzxuzhonghu/rm-etcd-quoram-read-flag
Automatic merge from submit-queue.

stop using deprecated --etcd-quorum-read

--etcd-quorum-read was deprecated, but it is still used.
This PR stops using it.

**Release note**:

```release-note
NONE
```
2018-06-30 19:32:34 -07:00
Kubernetes Submit Queue
f119fa14de Merge pull request #65541 from jiayingz/upgrade-test
Automatic merge from submit-queue (batch tested with PRs 65188, 65541, 65534).

Increase certain waiting time window in gpu_device_plugin e2e_node test.

The kubelet restart process seems to have become a bit slower recently.
From running the gpu_device_plugin e2e_node test on GCE, I saw it took
~37 seconds for kubelet to start CM DeviceManager after it restarts, and
then ~12 seconds for the gpu device plugin to re-register. As a result,
this e2e_node test fails because the current 10 sec waiting time is too
small. Restarting a container also seems to have become slower, so that
it sometimes exceeds the current 2 min waiting time in
ensurePodContainerRestart(). This change increases both waiting times to
5 min to leave enough headroom on slower machines.



**Release note**:

```release-note
none
```
2018-06-29 21:42:10 -07:00
Hemant Kumar
4e7c2f638d Make various fixes to flex tests and fix some crashes
Remove disruptive from flex
2018-06-29 11:10:26 -04:00
Lubomir I. Ivanov
945e3b3ee1 test/e2e_node/system/types_unix: support ZFS
Docker validation tests in the case of ZFS used as the graph driver
fail due to "zfs" not being present in the default Docker specification.

Add "zfs" in the GraphDriver slice.
2018-06-29 16:53:15 +03:00
Wojciech Tyczynski
9340bca14e Revert "Make no. of services in load test configurable" 2018-06-29 11:33:14 +02:00
Bobby (Babak) Salamat
0daedee0de Change our tests to ensure that critical system pods are created in the system namespace 2018-06-28 22:25:27 -07:00
Kubernetes Submit Queue
a883243c9c Merge pull request #65462 from liggitt/debug-cli-scale-error
Automatic merge from submit-queue (batch tested with PRs 65600, 65203, 65462).

Add debugging for scale e2e test errors

xref https://github.com/kubernetes/kubernetes/issues/64450#issuecomment-400124285
2018-06-28 22:20:08 -07:00
Kubernetes Submit Queue
0b5d3af049 Merge pull request #65203 from mgdevstack/master-conformance-namespace-pod
Automatic merge from submit-queue (batch tested with PRs 65600, 65203, 65462).

Promote [sig-api-machinery] Namespaces [Serial] e2e test for Conformance

**What this PR does / why we need it**:
This PR promotes two e2e test cases for Conformance.
1. [sig-api-machinery] Namespaces [Serial] should ensure that all pods are removed when a namespace is deleted.
2. [sig-api-machinery] Namespaces [Serial] should ensure that all services are removed when a namespace is deleted.


**Special notes for your reviewer**:
- No flakes found.
- https://github.com/cncf/k8s-conformance/issues/221#issuecomment-397375358

**Release note**:

```release-note
NONE
```
cc @fedebongio, @AishSundar
2018-06-28 22:20:05 -07:00
Kubernetes Submit Queue
75c8b56dcb Merge pull request #64575 from immutableT/in-memory-domain-socket
Automatic merge from submit-queue (batch tested with PRs 64575, 65120, 65463, 65434, 65522).

Add support for Linux Abstract Socket Namespace for KMS provider plugin.

**What this PR does / why we need it**:
Currently, kube-apiserver and kms-plugin interact via a Unix Domain Socket. The current implementation assumes that such a Domain Socket is backed by a socket file, which in turn is backed by a volume shared between the kube-apiserver and kms-plugin containers.
However, Linux supports the Abstract Socket Namespace, where a socket does not need to be backed by a file. In Go, such sockets are created by prefixing the socket's name with @.

Benefits of using Linux Abstract Socket Namespace:
1. Don't need to worry about possible collisions with existing files.
2. Simpler configuration of master's manifest - no need to set up a shared volume between kube-apiserver and kms-plugin containers.
3. Don't need to remember to unlink the socket when KMS Plugin shuts down.
4. Creates a possibility to run KMS Plugin without access to file system.

This PR adds the ability to define a KMS endpoint as: unix:///@kms-provider.sock
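
The endpoint format above could be handled roughly as follows. This is a sketch under assumptions, not the code from this PR: the helper name `socketAddress` and the file-backed example path are hypothetical. It relies on the fact that Go's `net` package treats a Unix address starting with `@` as an abstract socket on Linux.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// socketAddress converts a KMS endpoint URL into an address suitable for
// net.Dial("unix", addr) or net.Listen("unix", addr).
// Hypothetical helper for illustration only.
func socketAddress(endpoint string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	if u.Scheme != "unix" {
		return "", fmt.Errorf("unsupported scheme %q, want unix", u.Scheme)
	}
	// A path of the form "/@name" denotes an abstract socket: drop the
	// leading slash so that the address starts with "@".
	if strings.HasPrefix(u.Path, "/@") {
		return strings.TrimPrefix(u.Path, "/"), nil
	}
	// Otherwise the path names a regular socket file.
	return u.Path, nil
}

func main() {
	endpoints := []string{
		"unix:///@kms-provider.sock",    // abstract socket, as in the PR text
		"unix:///tmp/kms-provider.sock", // file-backed socket, made-up path
	}
	for _, ep := range endpoints {
		addr, err := socketAddress(ep)
		fmt.Println(addr, err)
	}
}
```

Keeping the `unix://` scheme for both cases means existing file-backed configurations keep working, while the `@` prefix opts in to the abstract namespace.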

**Release note**:

```release-note
NONE
```
2018-06-28 02:20:09 -07:00
Jiaying Zhang
265f3a48d3 Increase certain waiting time window in gpu_device_plugin e2e_node test.
The kubelet restart process seems to have become a bit slower recently.
From running the gpu_device_plugin e2e_node test on GCE, I saw it took
~37 seconds for kubelet to start CM DeviceManager after it restarts, and
then ~12 seconds for the gpu device plugin to re-register. As a result,
this e2e_node test fails because the current 10 sec waiting time is too
small. Restarting a container also seems to have become slower, so that
it sometimes exceeds the current 2 min waiting time in
ensurePodContainerRestart(). This change increases both waiting times to
5 min to leave enough headroom on slower machines.
2018-06-27 11:00:36 -07:00
Masaki Kimura
1b06ba5072 Add e2e tests for volumeMode of persistent volume
This set of e2e tests is to confirm that persistent volume works well for all volumeModes.
Coverage of the tests is shown in the [Test cases] table below.

Once implementation policy is confirmed to be good, we can add plugins and test cases to this.

[Test cases]
 #   plugin      volumeMode    Test case                                              Expectation
--- ---------- -------------- ------------------------------------------------------ ------------
 1    iSCSI      Block         (a) Create Pod with PV and confirm Read/Write to PV    Success
 2    iSCSI      FileSystem    (a) Create Pod with PV and confirm Read/Write to PV    Success
 3    RBD        Block         (a) Create Pod with PV and confirm Read/Write to PV    Success
 4    RBD        FileSystem    (a) Create Pod with PV and confirm Read/Write to PV    Success
 5    CephFS     Block         (a) Create Pod with PV and confirm Read/Write to PV    Fail
 6    CephFS     FileSystem    (a) Create Pod with PV and confirm Read/Write to PV    Success
 7    NFS        Block         (a) Create Pod with PV and confirm Read/Write to PV    Fail
 8    NFS        FileSystem    (a) Create Pod with PV and confirm Read/Write to PV    Success

fixes: #56803
2018-06-27 17:25:55 +00:00
Kubernetes Submit Queue
6d3bba7391 Merge pull request #64246 from wojtek-t/lease_object_type
Automatic merge from submit-queue (batch tested with PRs 64246, 65489, 65443).

Create "Lease" API in the new "coordination.k8s.io" api group

Part of "Efficient Node heartbeats" KEP:
https://github.com/kubernetes/community/blob/master/keps/0009-node-heartbeat.md

Part of: https://github.com/kubernetes/kubernetes/issues/14733

```release-note
NONE
```
2018-06-27 08:17:10 -07:00
wojtekt
c79b54db9f Enable coordination api group 2018-06-27 13:30:13 +02:00
Kubernetes Submit Queue
9090832793 Merge pull request #65492 from agau4779/add_neg_annotation
Automatic merge from submit-queue (batch tested with PRs 65492, 65516, 65447).

[GCE] update NEGAnnotation

**What this PR does / why we need it**:
Updates the NEG annotation in a few more places in the e2e test for Ingress.

```release-note
NONE
```
2018-06-27 02:15:04 -07:00
Kubernetes Submit Queue
2da49321e6 Merge pull request #63653 from WanLinghao/token_expiry_limit
Automatic merge from submit-queue.

Add limit to the TokenRequest expiration time

**What this PR does / why we need it**:
A new TokenRequest API has been implemented. It improves the current service account model in several ways.
This patch adds a limit to the TokenRequest expiration time.


**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #63575

**Release note**:

```release-note
NONE
```
2018-06-27 00:31:08 -07:00
Kubernetes Submit Queue
05f073dc28 Merge pull request #65468 from mindprince/remove-cos-requirement
Automatic merge from submit-queue (batch tested with PRs 65404, 65323, 65468).

Remove COS requirement while running e2e nvidia gpu tests.

```release-note
NONE
```
2018-06-26 17:33:08 -07:00
Rohit Agarwal
af3bc705b5 Remove COS requirement while running e2e nvidia gpu tests. 2018-06-26 12:12:06 -07:00
Kubernetes Submit Queue
ba7f798a1a Merge pull request #65460 from cofyc/issue64853
Automatic merge from submit-queue (batch tested with PRs 65342, 65460).

Prepare local volumes via hostexec pod instead of SSH

**What this PR does / why we need it**:

Prepare local volumes via hostexec pod. SSH access may be removed in future.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64853

**Special notes for your reviewer**:

For each test, launch a pod for each node to set up volumes when needed.
It uses `nsenter` to enter the host mount namespace to run commands.

Why use the `nsenter` command:
- migrating to the busybox `losetup` in the hostexec pod (base image: alpine:3.6) is hard
- alpine does not contain the mkfs.ext4 command
- it is easier to set up local volumes (no need to mount the /tmp, /mnt, /dev/, /sys directories)
- it only requires that the hostexec pod contain the `nsenter` command

**Release note**:

```release-note
NONE
```
2018-06-26 11:55:08 -07:00
Ashley Gau
72335f6607 update NEGAnnotation 2018-06-26 10:59:48 -07:00
Jordan Liggitt
6b2278fe21 Add debugging for scale e2e test errors 2018-06-26 13:34:48 -04:00
Kubernetes Submit Queue
0d9c432542 Merge pull request #65437 from losipiuk/lo/gpu-tests-from-env
Automatic merge from submit-queue.

Read gpu type from TESTED_GPU_TYPE env variable

```release-note
NONE
```
2018-06-26 08:55:44 -07:00
Kubernetes Submit Queue
76b4699c69 Merge pull request #49410 from jasonbrooks/patch-1
Automatic merge from submit-queue (batch tested with PRs 65449, 65373, 49410).

add kernel config locations for fedora and atomic

**What this PR does / why we need it**:

* Fedora stores its kernel configs in /usr/lib/modules/$(uname -r)/config
* Fedora/CentOS/RHEL atomic hosts use /usr/lib/ostree-boot/$(uname -r), though this location is deprecated
* The lack of these locations in the validator is causing kubeadm to hang on "failed to parse kernel config" in its preflight checking on fedora and atomic host

**Release note**:

```release-note
```
2018-06-26 02:52:11 -07:00
Yecheng Fu
1fbc5babb5 Prepare local volumes via hostexec pod. 2018-06-26 13:18:55 +08:00
Łukasz Osipiuk
63f5f3106b Read gpu type from TESTED_GPU_TYPE env variable 2018-06-25 18:47:49 +02:00
Ashley Gau
7beefd0c9c move NEG out of featuregate 2018-06-25 09:47:39 -07:00
immutablet
0100891168 Add support for linux abstract socket namespace. 2018-06-25 09:41:14 -07:00
Mikhail Mazurskiy
bfe313d5f3 Add missing error handling in schema-related code 2018-06-23 21:06:32 +10:00
Kubernetes Submit Queue
53cc12b9bd Merge pull request #64535 from agau4779/expose-neg-e2e
Automatic merge from submit-queue (batch tested with PRs 65338, 64535).

[GCE] e2e test for expose neg on gce ingress

**What this PR does / why we need it**:
- Adds e2e test for the expose NEG annotation (which allows for standalone NEGs)

**Special notes for your reviewer**:
Note, https://github.com/kubernetes/ingress-gce/pull/350 must be merged first before this is merged.

`[Unreleased]` tag is on this PR because it depends on code from https://github.com/kubernetes/ingress-gce/pull/350 and https://github.com/kubernetes/ingress-gce/pull/284 being in an Ingress release. Will update this test and test-infra once this is released in the next Ingress.

**Release note**:
```release-note
NONE
```
2018-06-22 21:28:05 -07:00
Kubernetes Submit Queue
75339d33cf Merge pull request #64936 from wgliang/master.scheduler_perf_test
Automatic merge from submit-queue (batch tested with PRs 64122, 64936, 65288, 65383).

fix integer divide by zero panic

**What this PR does / why we need it**:
/kind bug

fix integer divide by zero panic when time.Since(start) < 1s
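
To illustrate the failure mode, here is a sketch that assumes the panic came from dividing by an elapsed time truncated to whole seconds. The PR text implies this but does not show the code, so the function `podsPerSecond` and its guard are hypothetical rather than the actual fix.

```go
package main

import (
	"fmt"
	"time"
)

// podsPerSecond returns a rate using floating-point seconds, so a run that
// takes less than one second does not divide by zero.
// Hypothetical helper; not the code changed by this PR.
func podsPerSecond(scheduled int, elapsed time.Duration) float64 {
	if elapsed <= 0 {
		return 0
	}
	return float64(scheduled) / elapsed.Seconds()
}

func main() {
	elapsed := 500 * time.Millisecond

	// Truncating to whole seconds yields 0 here; using that value as an
	// integer divisor is what would trigger the panic.
	fmt.Println(int(elapsed / time.Second))

	// The floating-point version stays well defined for sub-second runs.
	fmt.Println(podsPerSecond(100, elapsed))
}
```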

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #64935

**Release note**:

```release-note
NONE
```
2018-06-22 19:03:16 -07:00
Ashley Gau
c981a3349f simplify negs checking 2018-06-22 17:21:28 -07:00
Ashley Gau
90c905b4f1 address comments 2018-06-22 16:38:43 -07:00
Jeff Grafton
23ceebac22 Run hack/update-bazel.sh 2018-06-22 16:22:57 -07:00
Jeff Grafton
a725660640 Update to gazelle 0.12.0 and run hack/update-bazel.sh 2018-06-22 16:22:18 -07:00
Kubernetes Submit Queue
5e9a5659b7 Merge pull request #65376 from mindprince/to-done
Automatic merge from submit-queue (batch tested with PRs 65377, 63837, 65370, 65294, 65376).

Remove unneeded sleep from test.

The race condition that required this sleep was fixed in google/cadvisor#1969.
That was vendored in #65334.

```release-note
NONE
```

/assign @jiayingz @vishh
2018-06-22 16:16:18 -07:00
Ashley Gau
34928d219c add e2e test for standalone (exposed) NEG annotation 2018-06-22 16:15:32 -07:00
Kubernetes Submit Queue
5880db4a65 Merge pull request #65335 from shyamjvs/add-scheduler-profiling-to-testing
Automatic merge from submit-queue (batch tested with PRs 65339, 65343, 65324, 65335, 65367).

Introduce scheduler CPU/Memory profile-gathering in density test

This should help us get more reliable/realistic data for scheduler (from our real-cluster scalability tests).

/cc @wojtek-t 
fyi - @davidopp @bsalamat @misterikkit 

```release-note
NONE
```
2018-06-22 10:31:20 -07:00
Rohit Agarwal
9a9c2aedd3 Remove unneeded sleep from test.
The race condition that required this sleep was fixed in google/cadvisor#1969.
That was vendored in #65334.
2018-06-22 08:53:11 -07:00
Kubernetes Submit Queue
449908488f Merge pull request #65289 from jiayingz/upgrade-test
Automatic merge from submit-queue (batch tested with PRs 65290, 65326, 65289, 65334, 64860).

Add a GPUClusterDowngrade test.

**What this PR does / why we need it**:
We actually need a separate GPUClusterDowngrade test to run gpu downgrade tests defined in e.g.,
https://k8s-testgrid.appspot.com/wg-resource-management#gce-1.11-1.10-gpu-master-downgrade

**Release note**:

```release-note

```
2018-06-22 04:43:09 -07:00
Shyam Jeedigunta
0c787703f5 Introduce scheduler CPU/Memory profile-gathering in density test 2018-06-22 12:12:05 +02:00
Shyam Jeedigunta
457548ef7d Refactor profile-gatherer to work across all master components 2018-06-22 12:11:56 +02:00
Kubernetes Submit Queue
6c847f3e7a Merge pull request #65307 from shyamjvs/fix-scheduler-reset-metrics-bug
Automatic merge from submit-queue (batch tested with PRs 65301, 65291, 65307, 63845, 65313).

Fix scheduler reset metrics bug in testinfra

/cc @krzysied 

```release-note
NONE
```
2018-06-22 03:08:13 -07:00