In reality, the kubelet plugin of a DRA driver is meant to be deployed as a
DaemonSet with a service account that limits its permissions. Pod-bound
service account tokens
(https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#additional-metadata-in-pod-bound-tokens)
ensure that the node name is bound to the pod's token, which can then be used
in a validating admission policy (VAP) to ensure that operations are limited
to that node.
In E2E testing, we emulate that via impersonation. This ensures that the plugin
does not accidentally depend on additional permissions.
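For illustration, a minimal sketch of what impersonating the plugin's service
account could look like with client-go; the namespace and service account
name are placeholders, not the actual test code:

```go
package e2edra

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// newPluginClient returns a clientset whose requests impersonate the
// (hypothetical) service account that the DRA kubelet plugin would use in a
// real deployment, so the test runs with only that account's permissions.
func newPluginClient(kubeconfig, namespace, serviceAccount string) (kubernetes.Interface, error) {
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return nil, err
	}
	config.Impersonate = rest.ImpersonationConfig{
		UserName: "system:serviceaccount:" + namespace + ":" + serviceAccount,
	}
	return kubernetes.NewForConfig(config)
}
```

The resulting clientset can only do what the impersonated service account is
allowed to do, so a missing permission shows up as a test failure instead of
going unnoticed.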
Validating that an endpoint is reachable from one part of the cluster is not
a sufficient condition to conclude that it will be reachable from any node,
because the service proxies on different nodes may have different propagation
delays for the EndpointSlice and Service information.
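A hedged sketch of the kind of per-node wait this implies; `reachableFromNode`
stands in for whatever probe the test uses and is hypothetical, not a
framework helper:

```go
package e2edra

import (
	"context"
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// waitForServiceOnAllNodes polls until the service endpoint is reachable
// from every node, because each node's service proxy may lag behind in
// processing EndpointSlice and Service updates.
func waitForServiceOnAllNodes(ctx context.Context, nodes []string,
	reachableFromNode func(ctx context.Context, node string) (bool, error)) error {
	for _, node := range nodes {
		err := wait.PollUntilContextTimeout(ctx, 2*time.Second, 2*time.Minute, true,
			func(ctx context.Context) (bool, error) {
				return reachableFromNode(ctx, node)
			})
		if err != nil {
			return fmt.Errorf("service not reachable from node %s: %w", node, err)
		}
	}
	return nil
}
```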
This is the second and final step towards making the kubelet independent of
the resource.k8s.io API versioning: it no longer needs to copy structs
defined by that API from the driver to the API server.
This is a first step towards making the kubelet independent of the
resource.k8s.io API versioning: it no longer needs to copy structs defined by
that API from the driver to the API server. The next step is removing the
other direction (reading the ResourceClaim status and passing the resource
handle to drivers).
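Purely as an illustration of the direction (not the actual kubelet plugin
gRPC API): the kubelet ends up handing the driver only a reference to the
claim, and the driver reads the claim itself over its own API server
connection.

```go
package sketch

// ClaimReference identifies a ResourceClaim without embedding any type from
// the resource.k8s.io API group. Hypothetical illustration only.
type ClaimReference struct {
	Namespace string
	Name      string
	UID       string
}

// PrepareRequest shows the idea: only references are passed, so the kubelet
// does not need to track the resource.k8s.io API versioning.
type PrepareRequest struct {
	Claims []ClaimReference
}
```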
The drivers must be deployed so that they have their own connection to the
API server. Securing at least the writes via a validating admission policy
should be possible.
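A sketch of a driver building its own in-cluster connection; the CEL
expression in the comment is only an example of the kind of node check a VAP
could enforce, not a policy that ships with Kubernetes:

```go
package sketch

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// newDriverClient builds the driver's own API server connection from the
// in-cluster service account credentials of its DaemonSet pod.
//
// A validating admission policy could then limit writes to the pod's own
// node with a CEL expression along these lines (illustrative only):
//
//	object.spec.nodeName == request.userInfo.extra['authentication.kubernetes.io/node-name'][0]
func newDriverClient() (kubernetes.Interface, error) {
	config, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	return kubernetes.NewForConfig(config)
}
```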
As before, the kubelet removes all ResourceSlices for its node at startup;
DRA drivers then recreate them if (and only if) they start up again. This
ensures that there are no orphaned ResourceSlices when a driver was removed
while the kubelet was down.
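Roughly what such a cleanup could look like with the dynamic client; the
resource.k8s.io version and the `spec.nodeName` field selector are
assumptions for this sketch, not the actual kubelet code:

```go
package sketch

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

// wipeNodeResourceSlices deletes all ResourceSlices that belong to this
// node. Drivers that are still running recreate theirs when they register
// again; slices of removed drivers stay gone.
func wipeNodeResourceSlices(ctx context.Context, config *rest.Config, nodeName string) error {
	client, err := dynamic.NewForConfig(config)
	if err != nil {
		return err
	}
	// Group/version and field selector are assumptions for this sketch.
	gvr := schema.GroupVersionResource{Group: "resource.k8s.io", Version: "v1beta1", Resource: "resourceslices"}
	return client.Resource(gvr).DeleteCollection(ctx,
		metav1.DeleteOptions{},
		metav1.ListOptions{FieldSelector: "spec.nodeName=" + nodeName},
	)
}
```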
While at it, logging gets cleaned up and updated to use structured,
contextual logging as much as possible. gRPC requests and streams now use a
shared, per-process request ID, and streams also get logged.
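A sketch of a client-side gRPC interceptor along those lines (not the actual
kubelet code); it uses klog's contextual logging and an atomic counter for
the per-process request ID:

```go
package sketch

import (
	"context"
	"sync/atomic"

	"google.golang.org/grpc"
	"k8s.io/klog/v2"
)

// requestID is shared by all gRPC calls in the process so that log entries
// from different calls can be told apart and correlated.
var requestID int64

// logUnaryCalls attaches the next request ID to the contextual logger and
// logs the call and its result.
func logUnaryCalls(ctx context.Context, method string, req, reply interface{},
	cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
	logger := klog.FromContext(ctx).WithValues("requestID", atomic.AddInt64(&requestID, 1), "method", method)
	ctx = klog.NewContext(ctx, logger)
	logger.V(4).Info("gRPC call", "request", req)
	err := invoker(ctx, method, req, reply, cc, opts...)
	logger.V(4).Info("gRPC response", "response", reply, "err", err)
	return err
}
```

Such an interceptor would be wired in with
`grpc.WithUnaryInterceptor(logUnaryCalls)` when dialing the plugin socket; a
stream interceptor can do the same for streams.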
including one alpha-only test, as the feature is in alpha
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Co-authored-by: Sohan Kunkerkar <sohank2602@gmail.com>
With cloud providers removed from k/k, e2e tests have no way to create a
static AWS EBS, GCE PD, Azure Disk, or other cloud volume. The test
"[sig-storage] Multi-AZ Cluster Volumes should schedule pods in the same
zones as statically provisioned PVs" consistently fails with "provider does
not support volume creation".
There is no upstream e2e job that would run the test and show the error.
We noticed it downstream in OpenShift.
The new pull-kubernetes-kind-dra job uses
-label-filter='Feature: containsAny DynamicResourceAllocation && !Flaky && !Serial'
to run DRA tests. That didn't work because the E2E framework silently added
the default skip expression on top of it.
Fix `[sig-network] Services [It] should complete a service status
lifecycle [Conformance]` not iterating through all `Conditions` in the
status, which causes a failure when a `Condition` already exists.
Log the `Conditions` instead of the `LoadBalancer` status.
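A hedged sketch of the intended pattern, using the generic `metav1.Condition`
helpers rather than the test's actual code:

```go
package sketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	apimeta "k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setAndLogServiceCondition updates an existing condition of the same Type
// or appends a new one, instead of assuming the Conditions slice is empty,
// and then logs the conditions rather than the LoadBalancer status.
func setAndLogServiceCondition(svc *corev1.Service, cond metav1.Condition) {
	apimeta.SetStatusCondition(&svc.Status.Conditions, cond)
	fmt.Printf("Service %s/%s conditions: %+v\n", svc.Namespace, svc.Name, svc.Status.Conditions)
}
```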