Some tests do version emulation and need the DRA feature. In that combination,
the --runtime-config-emulation-forward-compatible option is needed to allow
enabling the v1 API even though it only becomes available in 1.34.
As when v1beta2 was added, DRA drivers built using the
k8s.io/dynamic-resource-allocation helper packages remain compatible with all
Kubernetes releases >= 1.32. The helper code picks whichever API version is
enabled from v1beta1/v1beta2/v1.
However, the control plane now depends on v1, so a cluster configuration where
only v1beta1 or v1beta2 is enabled, without v1, won't work.
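
The version selection done by the helper code can be pictured with the
following minimal sketch. It is not the helper's actual implementation:
pickVersion, the preferred slice and the newest-first ordering are assumptions
made for illustration only.

```go
package main

import "fmt"

// preferred lists the resource.k8s.io API versions in the order this sketch
// tries them. Newest-first is an assumption, not taken from the helper code.
var preferred = []string{"v1", "v1beta2", "v1beta1"}

// pickVersion (hypothetical) returns the first version from preferred that
// the cluster has enabled, or an error if none of them is enabled.
func pickVersion(enabled map[string]bool) (string, error) {
	for _, v := range preferred {
		if enabled[v] {
			return v, nil
		}
	}
	return "", fmt.Errorf("none of %v is enabled", preferred)
}

func main() {
	// An older cluster that serves only the beta versions.
	v, err := pickVersion(map[string]bool{"v1beta1": true, "v1beta2": true})
	fmt.Println(v, err) // v1beta2 <nil>
}
```

A driver built this way keeps working against older clusters, whereas the
control plane itself has no such fallback and requires v1.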
This change adds the StructuredAuthenticationConfigurationEgressSelector
beta feature (default on). When enabled, each JWT authenticator
specified via the AuthenticationConfiguration.jwt array can
optionally specify either the controlplane or cluster egress
selector by setting the issuer.egressSelectorType field. When
unset, the prior behavior of using no egress selector is retained.
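
The three accepted states of the field can be sketched as a small check. This
is only an illustration: the Issuer struct below is a cut-down, hypothetical
stand-in for the real API type, and validateEgressSelectorType is not the
validation code added by this change.

```go
package main

import "fmt"

// Issuer is a hypothetical, reduced stand-in that carries only the field
// discussed above.
type Issuer struct {
	URL                string
	EgressSelectorType string // "", "controlplane" or "cluster"
}

// validateEgressSelectorType (hypothetical) accepts the two named egress
// selectors and the unset value, which keeps the prior behavior of using no
// egress selector.
func validateEgressSelectorType(i Issuer) error {
	switch i.EgressSelectorType {
	case "", "controlplane", "cluster":
		return nil
	default:
		return fmt.Errorf("unsupported egressSelectorType %q", i.EgressSelectorType)
	}
}

func main() {
	for _, t := range []string{"", "controlplane", "cluster", "etcd"} {
		err := validateEgressSelectorType(Issuer{URL: "https://issuer.example.com", EgressSelectorType: t})
		fmt.Printf("%q: %v\n", t, err)
	}
}
```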
Egress selection is valuable when the persona configuring the JWT
authenticator and the persona managing the control plane are
different individuals. This change allows the latter to protect
control plane network services from unexpected connections.
Signed-off-by: Monis Khan <mok@microsoft.com>
Writes to policy resources don't instantaneously take effect in admission. ValidatingAdmissionPolicy
integration tests determine that the policies under test have taken effect by adding a sentinel
("marker") policy rule and polling until that rule is applied to a marker request.
If the marker resource names are the same for each test case in a series of test cases, then
observing a policy's effect on a marker request only indicates that _any_ test policy is in effect,
but it's not necessarily the policy the current test case is waiting for. For example:
1. Test 1 creates a policy and binding.
2. The policy and binding are observed by the admission plugin and take effect.
3. Test 1 observes that a policy is in effect via marker requests.
4. Test 1 exercises the behavior under test and successfully deletes the policy and binding it
created.
5. Test 2 creates a policy and binding.
6. Test 2 observes that a policy is in effect via marker requests, but the policy in effect is still
the one created by Test 1.
7. Test 2 exercises the behavior under test, which fails because it was evaluated against Test 1's
policy.
Generating a per-policy name for the marker resource in each test resolves the timing issue. In the
example, step (6) will not proceed until the admission plugin has observed the policy and binding
created in (5).
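
The difference between a shared and a per-policy marker name can be shown with
a toy model. Everything in it is hypothetical: the admission type stands in for
the admission plugin's view of observed policies, and markerName is an
illustrative naming scheme, not the one used by the tests.

```go
package main

import "fmt"

// admission is a toy stand-in for the admission plugin. It records, per
// observed policy, the marker resource name that the policy's marker rule
// matches.
type admission struct {
	markers map[string]string // policy name -> marker resource name
}

// applies reports whether any observed policy acts on a request for the
// given marker resource name.
func (a *admission) applies(marker string) bool {
	for _, m := range a.markers {
		if m == marker {
			return true
		}
	}
	return false
}

// markerName (hypothetical) derives a marker resource name unique to a policy.
func markerName(policy string) string { return "marker-" + policy }

func main() {
	// Shared name: the plugin still holds Test 1's policy, so Test 2's
	// marker request matches and Test 2 proceeds too early.
	shared := &admission{markers: map[string]string{"test1": "marker"}}
	fmt.Println(shared.applies("marker")) // true

	// Per-policy name: Test 1's policy does not match Test 2's marker.
	unique := &admission{markers: map[string]string{"test1": markerName("test1")}}
	fmt.Println(unique.applies(markerName("test2"))) // false

	// Only once Test 2's policy has been observed does the poll succeed.
	unique.markers["test2"] = markerName("test2")
	fmt.Println(unique.applies(markerName("test2"))) // true
}
```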
The v1alpha3 version is still needed for DeviceTaintRule, but the rest of the
types and most structs became obsolete in v1.32 when we introduced v1beta1 and
bumped the storage version to v1beta1.
Removing them now simplifies adding new features because new fields don't need
to be added to these obsolete types. This could have been done already in 1.33,
but was not, to minimize disruption of ongoing work.
Previously, etcd wrote to stderr in JSON format:
{"level":"warn","ts":"2025-04-11T03:32:06.676527Z","caller":"embed/config.go:689","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"warn","ts":"2025-04-11T03:32:06.676707Z","caller":"embed/config.go:689","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"warn","ts":"2025-04-11T03:32:06.677056Z","caller":"etcdmain/etcd.go:146","msg":"failed to start etcd","error":"listen tcp 127.0.0.1:37803: bind: address already in use"}
{"level":"fatal","ts":"2025-04-11T03:32:06.677104Z","caller":"etcdmain/etcd.go:204","msg":"discovery failed","error":"listen tcp 127.0.0.1:37803: bind: address already in use","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:204\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:272"}
This has several drawbacks:
- Not very readable.
- When used in tests which start etcd themselves (for example, scheduler_perf),
the output is not associated with the current test.
- Contains warnings that are confusing for developers who don't know that they
are harmless.
Intercepting the output, parsing it, and reformatting it makes it easier to
read. Instead of a mixture of JSON messages (see above) and normal test output,
we now get the etcd output embedded inside the test output. We can also filter
out some known harmless messages. Cleaning up more output or avoiding it in the
first place might be a good next step.
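
The parse-and-reformat step can be sketched roughly as follows. This is not the
code added to etcd.go: reformat and the harmless table are hypothetical names,
the filter matches on message text only, and the keys are sorted here just to
make the output deterministic.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// harmless lists message texts that this sketch drops. The entry is taken
// from the sample etcd output; the real filter may use different criteria.
var harmless = map[string]bool{
	"Running http and grpc server on single port. This is not recommended for production.": true,
}

// reformat (hypothetical) turns one JSON log line into `"msg" key="value" ...`.
// It returns ok=false for lines that should be dropped. Lines that are not
// valid JSON objects are passed through unchanged.
func reformat(line string) (out string, ok bool) {
	var fields map[string]any
	if err := json.Unmarshal([]byte(line), &fields); err != nil {
		return line, true
	}
	msg, _ := fields["msg"].(string)
	if harmless[msg] {
		return "", false
	}
	delete(fields, "msg")
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	fmt.Fprintf(&b, "%q", msg)
	for _, k := range keys {
		fmt.Fprintf(&b, " %s=%q", k, fmt.Sprint(fields[k]))
	}
	return b.String(), true
}

func main() {
	lines := []string{
		`{"level":"warn","msg":"Running http and grpc server on single port. This is not recommended for production."}`,
		`{"level":"warn","msg":"failed to start etcd","error":"bind: address already in use"}`,
	}
	for _, l := range lines {
		if out, ok := reformat(l); ok {
			fmt.Println(out)
		}
	}
}
```

In a test, the reformatted text would be handed to the test's logger instead of
being printed, which is what associates it with the current test.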
With `go test -v ./test/integration/scheduler_perf/dra`:
=== RUN TestSchedulerPerf
=== RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate
=== RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast
I0411 13:21:03.353458 65212 feature_gate.go:385] feature gates: {map[SchedulerQueueingHints:false]}
...
I0411 13:21:10.552975 65212 cidrallocator.go:210] stopping ServiceCIDR Allocator Controller
I0411 13:21:10.567327 65212 etcd.go:210] "etcd output" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast" error="accept tcp 127.0.0.1:42245: use of closed network connection" level="warn" ts="2025-04-11T13:21:10.567045+0200" caller="embed/serve.go:160" msg="stopping insecure grpc server due to error"
I0411 13:21:10.567398 65212 etcd.go:210] "etcd output" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast" ts="2025-04-11T13:21:10.567198+0200" caller="embed/serve.go:162" msg="stopped insecure grpc server due to error" error="accept tcp 127.0.0.1:42245: use of closed network connection" level="warn"
I0411 13:21:15.567917 65212 etcd.go:227] "etcd didn't exit in 5 seconds, killing it" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast"
I0411 13:21:15.567964 65212 etcd.go:234] "etcd exited" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast" err="signal: terminated"
With per-test output `go test -v ./test/integration/scheduler_perf/dra -args -use-testing-log`:
=== RUN TestSchedulerPerf
=== RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate
=== RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast
I0411 13:19:03.540497 28645 feature_gate.go:385] feature gates: {map[DynamicResourceAllocation:true]}
...
I0411 13:19:10.519994 28645 cidrallocator.go:210] stopping ServiceCIDR Allocator Controller
etcd.go:210: I0411 13:19:10.533131] etcd output msg="stopping insecure grpc server due to error" error="accept tcp 127.0.0.1:46637: use of closed network connection" level="warn" ts="2025-04-11T13:19:10.532900+0200" caller="embed/serve.go:160"
etcd.go:210: I0411 13:19:10.533274] etcd output msg="stopped insecure grpc server due to error" error="accept tcp 127.0.0.1:46637: use of closed network connection" level="warn" ts="2025-04-11T13:19:10.533022+0200" caller="embed/serve.go:162"
etcd.go:227: I0411 13:19:15.533715] etcd didn't exit in 5 seconds, killing it
etcd.go:234: I0411 13:19:15.533803] etcd exited err="signal: terminated"
This adds the "DeviceTaint" top-level type to v1alpha3 and related fields to
ResourceSlice and ResourceClaim. It's complete enough to bring up an API server
and generate files.