'docker pull' is a time consuming operation. It makes sense to check
if image exists locally before pulling it from a registry.
Checked if image exists by running 'docker inspect'. Only pull if
image doesn't exist.
The service allocator is used to allocate ip addresses for the
Service IP allocator and NodePorts for the Service NodePort
allocator. It uses a bitmap backed by etcd to store the allocation
and tries to allocate the resources directly from the local memory
instead from etcd, that can cause issues in environment with
high concurrency.
It may happen, in deployments with multiple apiservers, that the
resource allocation information is out of sync, this is more
sensible with NodePorts, per example:
1. apiserver A create a service with NodePort X
2. apiserver B deletes the service
3. apiserver A creates the service again
If the allocation data of apiserver A wasn't refreshed with the
deletion of apiserver B, apiserver A fails the allocation because
the data is out of sync. The Repair loops solve the problem later,
but there are some use cases that require to improve the concurrency
in the allocation logic.
We can try to not do the Allocation and Release operations locally,
and try instead to check if the local data is up to date with etcd,
and operate over the most recent version of the data.
- add ./hack/tools/go.mod, this makes ./hack/tools a distinct module
- hack/tools/tools.go undescore imports bazel related tools, over time we
can add others.
- hack/*.sh scripts will cd to hack/tools and go install tools from there
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
During a parallel test run these tests were observed to cause a
preemption of another test:
```
Apr 19 16:59:06.749: INFO: At 2020-04-19 16:58:52 +0000 UTC - event for pod-init-b6fbd440-dbc2-454a-b31a-ce44266298d1: {default-scheduler } Scheduled: Successfully assigned e2e-init-container-7691/pod-init-b6fbd440-dbc2-454a-b31a-ce44266298d1 to ip-10-0-148-234.us-west-2.compute.internal
Apr 19 16:59:06.750: INFO: At 2020-04-19 16:58:54 +0000 UTC - event for pod-init-b6fbd440-dbc2-454a-b31a-ce44266298d1: {default-scheduler } Preempted: Preempted by e2e-resourcequota-priorityclass-8850/testpod-pclass9 on node ip-10-0-148-234.us-west-2.compute.internal
```
These tests have no need to actually land on a node to validate resource quota and so we can set an impossible scheduling condition. Hopefully we don't have tests that too broadly check impossible scheduling conditions.
The admission cache may take longer to see both ingress classes
than it takes to create the ingress. We must loop until we see
the appropriate error, cleaning up after ourselves as we go.
Don't set a connection deadline for reading, because the read operation will
fail if no data is reaceived after the deadline, and will not keep the
connection in the close_wait status.
Copy csi-hostpath driver manifests from
kubernetes-csi/csi-driver-host-path. It bumps version of all images to the
release shipped along Kubernetes 1.18.
As seen in one case (https://github.com/intel/pmem-csi/issues/587), a
pod can reach the "not running" state although its ephemeral volumes
are still being torn down by kubelet and the CSI driver. What happens
then is that the test returns too early and even deleting the
namespace and thus the pod succeeds before the NodeVolumeUnpublish
really finishes.
To avoid this, StopPod now waits for the pod to really disappear.
The agnhost image used for testing has a `netexec` path which supports
two new flags, `--tls-cert-file` and `--tls-private-key-file`. If the
former is provided, the HTTP server will be upgraded to HTTPS, using the
certificate (and private key) provided.
By default, there are keys already mounted into the container at
`/localhost.crt` and `/localhost.key`, which contain PEM-encoded TLS
certs with IP SANs for `127.0.0.1` and `[::1]`.