8386 Commits

Author SHA1 Message Date
Kubernetes Submit Queue
81363abc20 Merge pull request #51230 from enisoc/sts-deflake-exec
Automatic merge from submit-queue (batch tested with PRs 50213, 50707, 49502, 51230, 50848)

StatefulSet: Deflake e2e `kubectl exec` commands.

This may help with another source of flakiness found while investigating #48031.

We seem to get a lot of flakes due to "connection refused" while running `kubectl exec`. I can't find any reason this would be caused by the test flow, so I'm adding retries to see if that helps.
2017-08-25 01:10:35 -07:00
Kubernetes Submit Queue
4a94363c7e Merge pull request #51158 from yguo0905/overlay2
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)

Enable overlay2 on cos-m60 in node e2e tests

Ref: https://github.com/kubernetes/kubernetes/issues/42926

- Restart docker with `-s overlay2` in cloud-init before running all node e2e tests. I have to copy the systemd unit file to `/etc/systemd/system` because `/usr/lib/systemd/system/` is read-only.
- Updated node e2e tests to use the new cos-m60 image.
- The name of the cloud init file (`cos-init-live-restore.yaml`) does not indicate overlay2 will be enabled, but I can't just change the name in this PR, since it's referenced in test-infra.

**Release note**:

```
None
```

/assign @Random-Liu
2017-08-24 22:59:33 -07:00
Kubernetes Submit Queue
ce3e2d9b10 Merge pull request #51224 from enisoc/sts-deflake-restart
Automatic merge from submit-queue (batch tested with PRs 51224, 51191, 51158, 50669, 51222)

StatefulSet: Deflake e2e "restart" phase.

This addresses another source of flakiness found while investigating #48031.

The test used to scale the StatefulSet down to 0, wait for ListPods to return 0 matching Pods, and then scale the StatefulSet back up.

This was prone to a race in which StatefulSet was told to scale back up before it had observed its own deletion of the last Pod, as evidenced by logs showing the creation of Pod ss-1 prior to the creation of the replacement Pod ss-0.

Instead, we now wait for the controller to observe all deletions before scaling it back up. This should fix flakes of the form:

```
Too many pods scheduled, expected 1 got 2
```
2017-08-24 22:59:28 -07:00
Anthony Yeh
05d6c8a6c2 StatefulSet: Deflake e2e kubectl exec commands.
We seem to get a lot of flakes due to "connection refused" while running
`kubectl exec`. I can't find any reason this would be caused by the test
flow, so I'm adding retries to see if that helps.
2017-08-24 11:42:05 -07:00
Huamin Chen
4525446af2 azure file volume: add secret namespace api
Signed-off-by: Huamin Chen <hchen@redhat.com>
2017-08-24 14:49:58 +00:00
Kubernetes Submit Queue
55a20bb901 Merge pull request #51206 from yguo0905/update-cos
Automatic merge from submit-queue (batch tested with PRs 47115, 51196, 51204, 51208, 51206)

Update cos-m61 image in benchmark tests

Ref: https://github.com/kubernetes/kubernetes/issues/51205

**Release note**:
```
None
```
2017-08-24 07:20:16 -07:00
Kubernetes Submit Queue
ce3b118959 Merge pull request #42689 from intelsdi-x/enable-oir-e2e
Automatic merge from submit-queue (batch tested with PRs 51193, 51154, 42689, 51189, 51200)

Re-enable OIR e2e tests.

Re-enabling test skeleton for opaque integer resources originally submitted as part of #41870. The e2e was disabled since it was flaky. This is the first step toward re-enabling them. Currently all cases are skipped, so this exercises only the BeforeEach behavior and the deferred removal of OIRs from a node.

cc @timothysc
2017-08-24 04:38:07 -07:00
Kubernetes Submit Queue
db928095a0 Merge pull request #50947 from shyamjvs/clusterIpRange-ginkgo
Automatic merge from submit-queue (batch tested with PRs 51108, 51035, 50539, 51160, 50947)

Auto-calculate CLUSTER_IP_RANGE based on cluster size

This prepares for eliminating the CLUSTER_IP_RANGE env var from job configs, which should make it less error-prone for folks starting their own large cluster tests (https://github.com/kubernetes/kubernetes/issues/50907).
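As a rough illustration of sizing an IP range from the node count, here is a sketch that assumes each node receives one /24 pod subnet. That assumption, and the function name `clusterPrefixLen`, are hypothetical; the PR's own calculation may differ.

```go
package main

import "fmt"

// clusterPrefixLen returns the longest CIDR prefix whose range still holds
// one /24 subnet per node. The /24-per-node figure is an assumption made
// for this sketch only.
func clusterPrefixLen(nodes int) int {
	bits := 0
	for (1 << bits) < nodes {
		bits++ // bits needed to number every node's subnet
	}
	return 24 - bits
}

func main() {
	for _, n := range []int{100, 2000, 5000} {
		fmt.Printf("%d nodes -> /%d\n", n, clusterPrefixLen(n))
	}
}
```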

/cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
2017-08-24 02:32:14 -07:00
Kubernetes Submit Queue
14cc8cdfa4 Merge pull request #50397 from bdbauer/statefulTesting
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)

Add statefulset upgrade tests to cluster_upgrade

**What this PR does / why we need it**:
Adds the existing StatefulSet upgrade tests to cluster_upgrade.go. With further test-infra changes, this will allow them to run continuously, giving better signal.

Detect and prevent issues like https://github.com/kubernetes/kubernetes/issues/48327

**Release note**:

```release-note
NONE
```
2017-08-23 23:16:30 -07:00
Kubernetes Submit Queue
c041567b5a Merge pull request #46597 from dixudx/implement_proposal_34058
Automatic merge from submit-queue (batch tested with PRs 51113, 46597, 50397, 51052, 51166)

implement proposal 34058: hostPath volume type

**What this PR does / why we need it**:
implement proposal #34058

**Which issue this PR fixes** : fixes #46549

**Special notes for your reviewer**:
cc @thockin @luxas @euank PTAL
2017-08-23 23:16:27 -07:00
Kubernetes Submit Queue
3b2e403a37 Merge pull request #51011 from xilabao/rbac-v1-in-yaml
Automatic merge from submit-queue (batch tested with PRs 50489, 51070, 51011, 51022, 51141)

update to rbac v1 in yaml file

**What this PR does / why we need it**:
ref to https://github.com/kubernetes/kubernetes/pull/49642
ref https://github.com/kubernetes/features/issues/2

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
cc @liggitt 

**Release note**:

```release-note
NONE
```
2017-08-23 19:54:28 -07:00
Kubernetes Submit Queue
ea3a8a7570 Merge pull request #51047 from apelisse/remove-gke-test
Automatic merge from submit-queue

Skip "Simple pod should support exec through kubectl proxy" test

As reported in https://github.com/kubernetes/kubernetes/issues/50466,
this test doesn't work in GKE because it uses a bearer token and the feature only works with client certs.

As the feature that is broken in GKE is new and didn't work before, it
is safe to just ignore the test and consider the feature as "still not
working" in GKE.

**What this PR does / why we need it**: Fixes the broken test in https://k8s-testgrid.appspot.com/release-master-blocking#gke

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: works-around #50466

**Special notes for your reviewer**:

**Release note**:
```release-note
NONE
```
2017-08-23 17:41:58 -07:00
Anthony Yeh
ce3fad326f StatefulSet: Deflake e2e "restart" phase.
The test used to scale the StatefulSet down to 0, wait for ListPods to
return 0 matching Pods, and then scale the StatefulSet back up.

This was prone to a race in which StatefulSet was told to scale back up
before it had observed its own deletion of the last Pod, as evidenced by
logs showing the creation of Pod ss-1 prior to the creation of the
replacement Pod ss-0.

We now wait for the controller to observe all deletions before
scaling it back up. This should fix flakes of the form:

```
Too many pods scheduled, expected 1 got 2
```
2017-08-23 15:08:58 -07:00
Yang Guo
a1c5c14eff Update cos-m61 image in benchmark tests 2017-08-23 09:30:20 -07:00
Kubernetes Submit Queue
178a5ff314 Merge pull request #50665 from xiangpengzhao/hardcode-to-const
Automatic merge from submit-queue (batch tested with PRs 50257, 50247, 50665, 50554, 51077)

Replace hard-code "cpu" and "memory" to consts

**What this PR does / why we need it**:
Many places use hard-coded "cpu" and "memory" strings as resource names. This PR replaces them with consts.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #

**Special notes for your reviewer**:
/kind cleanup

**Release note**:

```release-note
NONE
```
2017-08-23 02:35:09 -07:00
Kubernetes Submit Queue
172f05bc53 Merge pull request #46902 from thockin/remove-obsolete-bins
Automatic merge from submit-queue (batch tested with PRs 50980, 46902, 51051, 51062, 51020)

Remove seemingly obsolete binaries

It's hard to tell if these are safe to remove.  Let CI tell me.
2017-08-22 23:13:59 -07:00
Di Xu
6f74af94ef update e2e tests and yaml files 2017-08-23 14:05:21 +08:00
Kubernetes Submit Queue
49c36f4b33 Merge pull request #50546 from apelisse/plumb-openapi-validation
Automatic merge from submit-queue (batch tested with PRs 51039, 50512, 50546, 50965, 50467)

Kubectl: Plumb openapi validation (disabled by default)

**What this PR does / why we need it**: Creates a new flag '--openapi' and plumbs in the validation code so that it can be used by default to validate objects against the OpenAPI schema.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: partially https://github.com/kubernetes/kubectl/issues/49

**Special notes for your reviewer**:

This is not complete; for example, the name of the variable must change.

**Release note**:
```release-note
Kubectl uses openapi for validation. If OpenAPI is not available on the server, it defaults back to the old Swagger.
```
2017-08-22 21:16:11 -07:00
Kubernetes Submit Queue
a44e538dbc Merge pull request #51039 from enisoc/deflake-sts-saturate
Automatic merge from submit-queue

StatefulSet: Deflake e2e "Saturate" phase.

This should reduce one source of flakiness found while investigating #48031.

The "Saturate" phase of StatefulSet e2e tests verifies orderly startup by controlling when each Pod is allowed to report Ready. If a Pod unexpectedly goes down during the test, the replacement Pod created by the controller will forget if it was already allowed to report Ready.

After this change, the signal that allows each Pod to report Ready is persisted in the Pod's PVC. Thus, the replacement Pod will remember that it was already told to proceed to a Ready state.
2017-08-22 21:13:13 -07:00
Yang Guo
755ce10e9b Enable overlay2 on cos-m60 in node e2e tests 2017-08-22 17:08:52 -07:00
Kubernetes Submit Queue
36b5e0eca6 Merge pull request #51037 from MrHohn/sig-network-e2e-fix-describe
Automatic merge from submit-queue (batch tested with PRs 51102, 50712, 51037, 51044, 51059)

[sig-network-e2e] Remove redundant sig prefix from tests

**What this PR does / why we need it**:
Remove redundant sig prefix from:
```
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for endpoint-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for endpoint-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for node-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for node-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for pod-Service: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should function for pod-Service: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update endpoints: http
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update endpoints: udp
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update nodePort: http [Slow]
[sig-network] Networking [sig-network] Granular Checks: Services [Slow] should update nodePort: udp [Slow]
[sig-network] Loadbalancing: L7 [sig-network] GCE [Slow] [Feature:Ingress] should conform to Ingress spec
[sig-network] Loadbalancing: L7 [sig-network] GCE [Slow] [Feature:Ingress] should create ingress with given static-ip
```

Umbrella issue #49161

**Special notes for your reviewer**:
cc @xiangpengzhao 

**Release note**:

```release-note
NONE
```
2017-08-22 12:28:02 -07:00
Kubernetes Submit Queue
622bc55598 Merge pull request #51028 from ironcladlou/gc-int-flake
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)

Fix GC integration test race

During TestCreateWithNonExistentOwner, when creating a pod with a
non-existent owner, assume it's possible the pod will be deleted before
we start checking for the pod's existence. Assuming that the pod still
exists immediately after Create returns is flaky if the GC reacts very
quickly.

```release-note
NONE
```

Might fix https://github.com/kubernetes/kubernetes/issues/50943; without the additional test context provided by this PR, it's not entirely possible to assess the root cause of the reported failure (as we don't know whether the original assertion failure was due to there being 0 or >1 pods).

/cc @caesarxuchao
2017-08-22 10:48:26 -07:00
Kubernetes Submit Queue
c6980e7247 Merge pull request #51033 from mtaufen/revert-51008-revert-50789-fix-scheme
Automatic merge from submit-queue (batch tested with PRs 50967, 50505, 50706, 51033, 51028)

Revert "Merge pull request #51008 from kubernetes/revert-50789-fix-scheme"

I'm spinning up a cluster right now to test this fix, but I'm pretty sure this was the problem.
There doesn't seem to be a way to confirm from logs, because AFAICT the logs from the hollow kubelet containers are not collected as part of the kubemark test.

**What this PR does / why we need it**:

This reverts commit f4afdecef8, reversing
changes made to e633a1604f.

This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.

**Which issue this PR fixes**: fixes #51007

**Release note**:

```release-note
NONE
```

/cc @shyamjvs @wojtek-t
2017-08-22 10:48:21 -07:00
Antoine Pelisse
3bc6ceac38 Skip "Simple pod should support exec through kubectl proxy" test
As reported in https://github.com/kubernetes/kubernetes/issues/50466,
this test doesn't work in GKE because the transport layer doesn't work
with dialing.

As the feature that is broken in GKE is new and didn't work before, it
is safe to just ignore the test and consider the feature as "still not
working" in GKE.
2017-08-22 10:30:16 -07:00
Kubernetes Submit Queue
c61468f29b Merge pull request #51091 from resouer/fix-perf
Automatic merge from submit-queue

Should generate files before scheduler perf

**What this PR does / why we need it**:
In a newly cloned project, generated files are not included, so scheduler_perf fails:
```
undefined: openapi.GetOpenAPIDefinitions
```

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: 

fixes: #51090

**Special notes for your reviewer**:
2017-08-22 08:28:26 -07:00
Kubernetes Submit Queue
fdf14b8218 Merge pull request #50913 from shyamjvs/list-call-slo
Automatic merge from submit-queue (batch tested with PRs 50893, 50913, 50963, 50629, 50640)

Increase latency threshold for list api calls

This is only a short-term solution to make our density test green. In the long-term, we should measure as per our new SLIs.
From @wojtek-t's [doc](https://docs.google.com/document/d/1Q5qxdeBPgTTIXZxdsFILg7kgqWhvOwY8uROEf0j5YBw) on the new SLIs/SLOs, we have the following SLO for list calls:

```
SLO1: In default Kubernetes installation, 99th percentile of SLI2 per cluster-day:
<= 1s if total number of objects of the same type as resource in the system <= X
<= 5s if total number of objects of the same type as resource in the system <= Y
<= 30s if total number of objects of the same type as resource in the system <= Z
```

I would guess that 170,000 pods would fall into the 2nd bracket (at least) and hence the new value of 5s. WDYT?

cc @kubernetes/sig-scalability-misc @wojtek-t @gmarek
2017-08-22 05:31:07 -07:00
Harry Zhang
388e0b39bf generate files before scheduler perf 2017-08-22 16:40:16 +08:00
Kubernetes Submit Queue
cb8ade18c6 Merge pull request #50950 from k82cn/revert_50360
Automatic merge from submit-queue

Revert #50362.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: part of #50884

**Release note**:

```release-note
None
```
2017-08-21 16:50:53 -07:00
Kubernetes Submit Queue
0867802bbc Merge pull request #50831 from Random-Liu/instance-metadata-from-flag
Automatic merge from submit-queue (batch tested with PRs 50693, 50831, 47506, 49119, 50871)

Add instance metadata from flag even when using image config.

Also add instance metadata from flag even when we are using image config.

* Sometimes we need to generate instance metadata dynamically, and it is troublesome to put it into the image config.
* Sometimes we want to apply instance metadata to all images, and adding it to each image in the image config is redundant.

/assign @yguo0905 Could you help me review this?
2017-08-21 14:29:57 -07:00
Anthony Yeh
3bc7676024 StatefulSet: Deflake e2e "Saturate" phase.
The "Saturate" phase of StatefulSet e2e tests verifies orderly startup
by controlling when each Pod is allowed to report Ready.
If a Pod unexpectedly goes down during the test, the replacement Pod
created by the controller will forget if it was already allowed to
report Ready.

After this change, the signal that allows each Pod to report Ready is
persisted in the Pod's PVC. Thus, the replacement Pod will remember that
it was already told to proceed to a Ready state.
2017-08-21 13:52:15 -07:00
Michael Taufen
a90d81620b Revert "Merge pull request #51008 from kubernetes/revert-50789-fix-scheme"
This reverts commit f4afdecef8, reversing
changes made to e633a1604f.

This also fixes a bug where Kubemark was still using the core api scheme
to manipulate the Kubelet's types, which was the cause of the initial
revert.
2017-08-21 11:28:05 -07:00
Zihong Zheng
e5349e8a90 [sig-network-e2e] Remove redundant sig prefix from tests 2017-08-21 11:17:02 -07:00
Antoine Pelisse
b7b5457050 Validate against OpenAPI schema (if available) 2017-08-21 08:58:42 -07:00
Kubernetes Submit Queue
b2b079b95a Merge pull request #51005 from wasylkowski/preparation-timeout
Automatic merge from submit-queue (batch tested with PRs 47896, 50678, 50620, 50631, 51005)

Made the difference between scale-up timeout and cluster set-up timeout explicit.

**Release note**:

```release-note
NONE
```
2017-08-21 08:26:29 -07:00
Dan Mace
5334e164c7 Fix GC integration test race
During TestCreateWithNonExistentOwner, when creating a pod with a
non-existent owner, assume it's possible the pod will be deleted before
we start checking for the pod's existence. Assuming that the pod still
exists immediately after Create returns is flaky if the GC reacts very
quickly.
2017-08-21 09:32:15 -04:00
Shyam Jeedigunta
bacc01f729 Auto-calculate CLUSTER_IP_RANGE based on no. of nodes 2017-08-21 14:21:43 +02:00
Andrzej Wasylkowski
6a41614342 Made the difference between scale-up timeout and cluster set-up timeout explicit. 2017-08-21 13:21:50 +02:00
Chen Rong
d23df051e1 update to rbac v1 in yaml file 2017-08-21 17:29:37 +08:00
Shyam JVS
5591914d62 Revert "Don't register the kubeletconfig group with the default Scheme" 2017-08-21 11:15:27 +02:00
Kubernetes Submit Queue
e633a1604f Merge pull request #50789 from mtaufen/fix-scheme
Automatic merge from submit-queue

Don't register the kubeletconfig group with the default Scheme

See https://github.com/kubernetes/kubernetes/pull/49051#discussion_r132527078
2017-08-19 12:15:48 -07:00
Kubernetes Submit Queue
b59ad9cbff Merge pull request #50146 from gmarek/deepcopyinto
Automatic merge from submit-queue (batch tested with PRs 46512, 50146)

Make metav1.(Micro)?Time functions take pointers

Is there any reason for those functions not to be on pointers?
2017-08-19 11:28:15 -07:00
Tim Hockin
831fd242e8 Remove seemingly obsolete binaries 2017-08-18 21:01:19 -07:00
Klaus Ma
df3a699069 Revert #50362. 2017-08-19 10:24:50 +08:00
Shyam Jeedigunta
70123e71bb Increase latency threshold for list api calls 2017-08-19 00:55:35 +02:00
Michael Taufen
0af9f756cd Don't register the kubeletconfig group with the default Scheme 2017-08-18 13:51:39 -07:00
Benjamin Bauer
014b765988 Refactor cluster_upgrade to include statefulset upgrade tests. 2017-08-18 10:27:47 -07:00
Kubernetes Submit Queue
26eb7c94ea Merge pull request #50904 from crassirostris/sd-logging-e2e-system-logs
Automatic merge from submit-queue (batch tested with PRs 50904, 50691)

Stackdriver Logging e2e: Explicitly check for docker and kubelet logs presence

Check for kubelet and docker logs explicitly in the Stackdriver Logging e2e tests
2017-08-18 07:29:36 -07:00
Mik Vyatskov
cd23c5d1b0 Stackdriver Logging e2e: Explicitly check for docker and kubelet logs presence 2017-08-18 14:43:04 +02:00
Kubernetes Submit Queue
e472a758c7 Merge pull request #50873 from enj/patch-1
Automatic merge from submit-queue

Add enj to OWNERS for test/integration/etcd/etcd_storage_path_test.go

@deads2k is the bot smart enough to not spam me with every test change?  Perhaps I should create an `OWNERS` file in `test/integration/etcd`?

**Release note**:

```release-note
NONE
```

@kubernetes/sig-api-machinery-pr-reviews
2017-08-18 01:06:57 -07:00
Kubernetes Submit Queue
e553d6eb5b Merge pull request #50575 from dixudx/CollisionCount_int64_to_int32
Automatic merge from submit-queue

CollisionCount should have type int32 across controllers that use it for collision avoidance

**What this PR does / why we need it**:

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50530

**Special notes for your reviewer**:
/cc @liyinan926
/assign @kow3ns @thockin @janetkuo 

**Release note**:

```release-note
Change CollisionCount from int64 to int32 across controllers
```
2017-08-17 23:21:11 -07:00