generic pod-per-node functionality for testing - 2 node test only
- update framework to decompose pod vs svc creation for composition.
- remove hard coded 2 and pointer to --scale
Automatic merge from submit-queue
Move internal types of job from pkg/apis/extensions to pkg/apis/batch
This addressed the job part of #23216, this is still WIP. Will notify once finished. I'd like to have it in before starting working on ScheduledJob.
@lavalamp @erictune fyi
Automatic merge from submit-queue
Use mCPU as CPU usage unit, add version in PerfData, and fix memory usage bug.
Partially addressed #24436.
This PR:
1) Change the CPU usage unit to "mCPU"
2) Add version in PerfData, and perfdash will only support the newest version now.
3) Fix stupid mistake when calculating the memory usage average.
/cc @vishh
Automatic merge from submit-queue
Add timeout to e2e network connectivity checks
Some e2e tests use wget to check connectivity, and the default e2e
timeout is 900s. This change allows the timeout to be specified on a
check-by-check basis. This will also make the check useful for negative
checks (like those used by openshift to validate isolation) since a
short timeout is suggested where connectivity is not expected.
Automatic merge from submit-queue
Increase provisioning test timeouts.
We've encountered flakes in our e2e infrastructure when kubelet took more than one minute to detach a volume used by a deleted pod.
Let's increase the wait period from 1 to 3 minutes. This slows down the test by 2 minutes, but it makes the test more stable.
In addition, when kubelet cannot detach a volume for 3 minutes, let the test wait for additional recycle controller retry interval (10 minutes) and hope the volume is deleted by then. This should not increase usual test time, it makes the test stable when kubelet is _extremely_ slow when releasing the volume.
Fixes: #24161
Automatic merge from submit-queue
Fix unintended change of Service.spec.ports[].nodePort during kubectl apply
Please refer #23551 for more detail. @bgrant0607 I think this simple fix should be ok to reuse nodePort. @thockin ptal.
Release note: Fix unintended change of `Service.spec.ports[].nodePort` during `kubectl apply`.
Automatic merge from submit-queue
Cluster Verification Framework
I've spent the last few days looking at the general patterns of verification we have that we tend to reuse in the e2es. Basically, we need
- label filters
- forEach and WaitFor (where forEach doesn't necessarily waitFor anything).
- timeouts
- multiple phases (reusable definition of state)
- an extensible way to define cluster state that can evolve over time in a data object rather than as a set of parameters that have magic semantics
This PR
- implements the abstract above functionality declaratively, and w/o hidden semantics.
- addresses the sprawling duplicate methods in #23540, so that we can phase out the wrapper methods and replace them with well defined, extensible semantics for cluster state.
- fixes the recently discovered #23730 issue (where kubectl.go is relying on examples.go, which is obviously wacky) by using the new framework to implement forEachPod in just a couple of lines and migrating the wrapper function into framework.go.
There is some cleanup to do here, but this is seemingly working for a couple of use cases that are important (spark,cassandra,...,kubectl) tests. - i played with a few different ideas and this wound up seeming to be the most natural implementation from a usability standpoint...
in any case, just thought id push this up as a first iteration, open to feedback.
@kubernetes/sig-testing @timothysc
Automatic merge from submit-queue
Fix DNS test for larger clusters
On GKE, we scale the number of DNS pods based on the cluster size. For
testing on larger clusters, relax the DNS pod check.
- rebase: ForEach only on Running pods
- add waitFor step in guestbook describe and wrapper
- simplify logs in polling, make panic immediate, give rolluped stats in
the logs.
Improve logging for failure on ForEach
We've encountered flakes in our e2e infrastructure when kubelet took more than
one minute to detach a volume used by a deleted pod.
Let's increase the wait period from 1 to 3 minutes. This slows down the test
by 2 minutes, but it makes the test more stable.
In addition, when kubelet cannot detach a volume for 3 minutes, let the test
wait for additional recycle controller retry interval (10 minutes) and hope the
volume is deleted by then. This should not increase usual test time, it makes
the test stable when kubelet is _extremely_ slow when releasing the volume.
Automatic merge from submit-queue
Move /resetMetrics to DELETE /metrics
Reduces the surface area of the API server slightly and allows
downstream components to have deleteable metrics. After this change
genericapiserver will *not* have metrics unless the caller defines it
(allows different apiserver implementations to make that choice on their
own).
@wojtek-t
Automatic merge from submit-queue
Add watch.Until, a conditional watch mechanism
A more powerful tool than wait.Poll, allows a watch interface to drive conditionals to react to changes on a resource or resources. Provide a set of standard conditions that are in common use in the code, and updates e2e to use a few of these.
Extracted from #23567
Automatic merge from submit-queue
phase 2 of cassandra example overhaul
Here's the next iteration in overhauling this example, towards https://github.com/kubernetes/kubernetes/issues/20961. This removes the pod adoption part, but doesn't (yet) otherwise change any of the resources used.
It also includes some README cleanup, and removes some explicit specification of labels in the rc yaml.
This PR doesn't yet add any commentary on how we're using the seed provider (re: https://github.com/kubernetes/kubernetes/issues/20961#issuecomment-190405959 etc.). Maybe we should add that.
Also: LMK if this PR should include any changes to the links out to the docs.
cc @bgrant0607 @johndmulhausen
in e2e/volumes.go: give time to allow pod cleanup and volume unmount happen before volume server exit;
skip cinder volume test if not running with openstack provider
comment on why pause before containerized server is stopped in volume e2e tests, fix#24100
updates NFS server image to 0.6, per #22529
fix persistent_volume e2e test: test cleanup doesn't expect client pod; delete PV after test
Signed-off-by: Huamin Chen <hchen@redhat.com>
Reduces the surface area of the API server slightly and allows
downstream components to have deleteable metrics. After this change
genericapiserver will *not* have metrics unless the caller defines it
(allows different apiserver implementations to make that choice on their
own).
Some e2e tests use wget to check connectivity, and the default e2e
timeout is 900s. This change allows the timeout to be specified on a
check-by-check basis. This will also make the check useful for negative
checks (like those used by openshift to validate isolation) since a
short timeout is suggested where connectivity is not expected.