The dual-stack integration tests already validate that we get the
expected Endpoints for single- and dual-stack Services. There is no
further "end to end" testing needed for Endpoints, given that
everything in a normal cluster would look at EndpointSlices, not
Endpoints.
Add a slightly-more-generic replacement for
validateEndpointsPortsOrFail() (one that only validates
EndpointSlices, not Endpoints).
Also, add two new unit tests to the Endpoints controller, to assert
the correct Endpoints-generating behavior in the cases formerly
covered by the "should serve endpoints on same port and different
protocols" and "should be updated after adding or deleting ports" e2e
tests (since they are now EndpointSlice-only). (There's not much point
in testing the Endpoints controller in "end to end" tests, since
nothing in a normal cluster ever looks at its output, so there's
really only one "end" anyway.)
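
For reference, roughly the output shape being asserted in the
same-port-different-protocols case (a minimal sketch, not the actual
unit test; the service name and IP are illustrative): a Service that
exposes port 80 over both TCP and UDP should yield an Endpoints object
with one EndpointPort per protocol in a single subset.

    package example

    import (
        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // expectedMultiProtocolEndpoints builds the Endpoints object the
    // controller is expected to generate for a pod at 10.0.0.1 backing
    // a Service that exposes port 80 over both TCP and UDP.
    func expectedMultiProtocolEndpoints() *v1.Endpoints {
        return &v1.Endpoints{
            ObjectMeta: metav1.ObjectMeta{Name: "multiprotocol-svc"},
            Subsets: []v1.EndpointSubset{{
                Addresses: []v1.EndpointAddress{{IP: "10.0.0.1"}},
                Ports: []v1.EndpointPort{
                    {Name: "tcp-port", Port: 80, Protocol: v1.ProtocolTCP},
                    {Name: "udp-port", Port: 80, Protocol: v1.ProtocolUDP},
                },
            }},
        }
    }
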
These tests were using validateEndpointsPortsOrFail() not because they
cared about ports, but just because it was there, or in some cases
because they needed to wait for one pod to exit and a different pod to
start, which can't be done with framework.WaitForServiceEndpointsNum()
(or e2eendpointslice.WaitForEndpointCount) without racing. Update
these tests to use the new e2eendpointslice.WaitForEndpointPods, which
can wait for a specific set of expected pods.
(This also means these tests now only watch EndpointSlices, rather
than watching both Endpoints and EndpointSlices, which is fine,
because none of them are doing tricky things that actually require
making assertions about the exact contents of the
Endpoints/EndpointSlices. They just want to know when the controller
has updated things to point to the expected pods.)
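
For illustration, a rough sketch of the idea behind waiting for a
specific set of endpoint pods (the actual WaitForEndpointPods helper
may be implemented and parameterized differently): poll the Service's
EndpointSlices until the set of pods they reference exactly matches
the expected set, which cannot race the way a count-based wait can
while one pod is replacing another.

    package example

    import (
        "context"
        "time"

        discoveryv1 "k8s.io/api/discovery/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/sets"
        "k8s.io/apimachinery/pkg/util/wait"
        clientset "k8s.io/client-go/kubernetes"
    )

    // waitForEndpointPods polls until the EndpointSlices for the named
    // Service reference exactly the expected pods (no more, no fewer).
    func waitForEndpointPods(ctx context.Context, cs clientset.Interface, ns, svcName string, expected ...string) error {
        want := sets.New(expected...)
        return wait.PollUntilContextTimeout(ctx, 2*time.Second, 2*time.Minute, true,
            func(ctx context.Context) (bool, error) {
                opts := metav1.ListOptions{
                    LabelSelector: discoveryv1.LabelServiceName + "=" + svcName,
                }
                slices, err := cs.DiscoveryV1().EndpointSlices(ns).List(ctx, opts)
                if err != nil {
                    return false, err
                }
                got := sets.New[string]()
                for _, slice := range slices.Items {
                    for _, ep := range slice.Endpoints {
                        if ep.TargetRef != nil {
                            got.Insert(ep.TargetRef.Name)
                        }
                    }
                }
                return got.Equal(want), nil
            })
    }
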
A bunch of tests in test/e2e/network were using
validateEndpointsPortsOrFail but didn't actually care about ports at
all; they were just using it because that helper function was there.
Make them use WaitForEndpointCount instead.
Likewise, fix one test that was manually counting Endpoints and
EndpointSlices to use WaitForEndpointCount.
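
For comparison with the pod-name-based wait above, a sketch of the
count-based condition these tests now rely on (the real
WaitForEndpointCount helper may differ; the polling intervals here are
arbitrary): it only cares how many endpoints the Service's
EndpointSlices contain in total, not which pods they point to.

    package example

    import (
        "context"
        "time"

        discoveryv1 "k8s.io/api/discovery/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/util/wait"
        clientset "k8s.io/client-go/kubernetes"
    )

    // waitForEndpointCount polls until the Service's EndpointSlices
    // contain exactly `want` endpoints in total, regardless of which
    // pods they point to.
    func waitForEndpointCount(ctx context.Context, cs clientset.Interface, ns, svcName string, want int) error {
        return wait.PollUntilContextTimeout(ctx, 2*time.Second, 2*time.Minute, true,
            func(ctx context.Context) (bool, error) {
                opts := metav1.ListOptions{
                    LabelSelector: discoveryv1.LabelServiceName + "=" + svcName,
                }
                slices, err := cs.DiscoveryV1().EndpointSlices(ns).List(ctx, opts)
                if err != nil {
                    return false, err
                }
                count := 0
                for _, slice := range slices.Items {
                    count += len(slice.Endpoints)
                }
                return count == want, nil
            })
    }
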
The test was reconnecting to the LoadBalancer as fast as possible to
try to catch any transient problems, but "as fast as possible" ended
up meaning about 12,500 connections per second, which is clearly
excessive. Limit it to 100 connections per second.
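
One way to cap the loop at roughly 100 attempts per second (a sketch;
the test may implement the limit differently):

    package example

    import (
        "fmt"
        "net"
        "time"
    )

    // hammerLoadBalancer repeatedly dials the load balancer, but no
    // more than ~100 times per second, logging transient failures.
    func hammerLoadBalancer(addr string, stop <-chan struct{}) {
        limiter := time.NewTicker(10 * time.Millisecond) // ~100 connections/second
        defer limiter.Stop()
        for {
            select {
            case <-stop:
                return
            case <-limiter.C:
                conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
                if err != nil {
                    fmt.Printf("transient connection failure: %v\n", err)
                    continue
                }
                conn.Close()
            }
        }
    }
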
This changes the conformance test to measure the latency of the
EndpointSlice controller rather than the Endpoints controller. This
should not have any real effect since the two controllers run in the
same process and do mostly the same thing (especially in tests like
this one which don't test multi-slice services).
Use Pod Readiness Gates and Services to implement blue/green
deployment strategies, by setting the corresponding readiness gate to
influence which pods receive Service traffic.
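
For illustration, the building blocks involved (a sketch; the
condition type, labels, and image are illustrative, not what the test
uses): a pod that declares a custom readiness gate is not considered
Ready, and therefore not served by the Service, until some controller
sets that condition to True on the pod's status.

    package example

    import (
        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // blueGreenPod returns a pod whose Ready condition (and therefore
    // its membership in the Service's endpoints) also depends on the
    // custom "example.com/traffic-ready" readiness gate being set to
    // True on the pod's status by an external controller.
    func blueGreenPod(color string) *v1.Pod {
        return &v1.Pod{
            ObjectMeta: metav1.ObjectMeta{
                Name:   "web-" + color,
                Labels: map[string]string{"app": "web", "color": color},
            },
            Spec: v1.PodSpec{
                Containers: []v1.Container{{
                    Name:  "web",
                    Image: "registry.k8s.io/e2e-test-images/agnhost:2.45", // illustrative image
                    Args:  []string{"netexec"},
                }},
                ReadinessGates: []v1.PodReadinessGate{{
                    ConditionType: "example.com/traffic-ready",
                }},
            },
        }
    }

Flipping that condition to True on the "green" pods (and to False on
the "blue" ones) then shifts Service traffic without editing the
Service itself.
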
Change-Id: I23e3b9a0440014f48a4612685055565fd8dff5ec
Conditionally skip the nfacct test instead of failing when the
required metric is not present. The only case in which kube-proxy will
not register the metric is when the nfacct subsystem is not supported
on the node, so we skip the test in that case.
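
A hedged sketch of the new skip logic (the metric name, helper name,
and skip message in the real test may differ):

    package example

    import (
        e2eskipper "k8s.io/kubernetes/test/e2e/framework/skipper"
    )

    // skipIfMetricUnregistered skips the test when the nfacct-backed
    // metric is entirely absent from the metrics scraped from
    // kube-proxy, since the only way it goes unregistered is a node
    // without nfacct support.
    func skipIfMetricUnregistered(scraped map[string]float64, requiredMetric string) {
        if _, ok := scraped[requiredMetric]; !ok {
            e2eskipper.Skipf("metric %q not registered; nfacct appears to be unsupported on this node", requiredMetric)
        }
    }
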
Signed-off-by: Daman Arora <aroradaman@gmail.com>
The existing TrafficDistribution test didn't really distinguish "same
zone" from "same node". Add another test that makes sure there are at
least 2 nodes in each zone so it can do that.
(Keep the original test as well to avoid losing coverage in CI systems
with single-schedulable-node-per-zone clusters.)
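
A sketch of the zone-sizing precondition (not the test's actual
helper; the label key is the standard well-known topology label):

    package example

    import (
        v1 "k8s.io/api/core/v1"
    )

    // zonesWithAtLeastTwoNodes groups nodes by the well-known
    // topology.kubernetes.io/zone label and returns the zones with two
    // or more nodes, i.e. the zones where "same zone" and "same node"
    // can actually be distinguished.
    func zonesWithAtLeastTwoNodes(nodes []v1.Node) []string {
        byZone := map[string]int{}
        for _, node := range nodes {
            if zone := node.Labels["topology.kubernetes.io/zone"]; zone != "" {
                byZone[zone]++
            }
        }
        var zones []string
        for zone, count := range byZone {
            if count >= 2 {
                zones = append(zones, zone)
            }
        }
        return zones
    }
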
Split the logic of creating the clients and the servers apart from the
logic of checking which clients connect to which servers. Add some
extra complexity to support additional use cases (like multiple
endpoints on the same node).
Remove the endpointSlicesHaveSameZoneHints check. We are testing that
connections end up at the right endpoints; we don't need to validate
_why_ they go to the right endpoints, which other tests already cover
anyway. (Also, validating the hints becomes more complicated in
the same-node case, where there may or may not also be same-zone hints
depending on cluster configuration.)
Remove DeferCleanup calls; we don't need to delete anything manually
because namespaced resources will automatically be deleted when the
test case's namespace is deleted.
Remove a setting of pod.NodeName that was redundant with
e2epod.SetNodeSelection().