This should fix the following test when running it with CRI-O:
```
[It] [sig-node] [Feature:SidecarContainers] [Serial] Containers
Lifecycle when A node running restartable init containers reboots should
restart the containers in right order with the proper phase after the
node reboot
```
The issue is that we have prefixed "unable to retrieve container logs
for …" outputs in the message to be parsed. We now skip that part and
leave the current behavior untouched.
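
The skipping step could look roughly like the sketch below. It assumes the prefixed output occupies whole lines of the message; `stripLogErrors` and `logErrorPrefix` are hypothetical names for illustration, not the identifiers used in the test code.

```go
package main

import (
	"fmt"
	"strings"
)

// logErrorPrefix is the text quoted in the commit message. Whatever follows
// it varies and is not modeled here.
const logErrorPrefix = "unable to retrieve container logs for "

// stripLogErrors is a hypothetical helper: it drops every line that starts
// with logErrorPrefix and returns the remaining lines unchanged.
func stripLogErrors(output string) string {
	var kept []string
	for _, line := range strings.Split(output, "\n") {
		if strings.HasPrefix(line, logErrorPrefix) {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

func main() {
	raw := logErrorPrefix + "<container>\nline one\nline two"
	fmt.Println(stripLogErrors(raw))
}
```

Input without the prefix passes through unchanged, which matches the intent of leaving the current behavior untouched.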
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
The `ginkgo.ContinueOnFailure` decorator serves the use case
of the new cpumanager tests perfectly:
https://onsi.github.io/ginkgo/#failure-handling-in-ordered-containers
"""
You can override this behavior by decorating an Ordered container with
ContinueOnFailure. This is useful in cases where Ordered is being used
to provide shared expensive set up for a collection of specs.
When ContinueOnFailure is set, Ginkgo will continue running specs even
if an earlier spec in the Ordered container has failed.
"""
And this is exactly the case at hand. Previously, without this
decorator, subsequent failures were masked, which is dangerous and not
what we want.
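
The masking effect can be illustrated without Ginkgo. The sketch below is a simplified stand-in for the documented Ordered semantics, not Ginkgo's implementation; `spec` and `runOrdered` are hypothetical names.

```go
package main

import "fmt"

// spec is a hypothetical stand-in for one test case in an ordered container.
type spec struct {
	name string
	pass bool
}

// runOrdered models the documented behavior: after the first failure the
// remaining specs are skipped, unless continueOnFailure is set.
func runOrdered(specs []spec, continueOnFailure bool) (failed, skipped []string) {
	aborted := false
	for _, s := range specs {
		if aborted {
			skipped = append(skipped, s.name)
			continue
		}
		if !s.pass {
			failed = append(failed, s.name)
			if !continueOnFailure {
				aborted = true
			}
		}
	}
	return failed, skipped
}

func main() {
	specs := []spec{{"first", false}, {"second", true}, {"third", false}}

	failed, skipped := runOrdered(specs, false)
	fmt.Println("default:", failed, "skipped:", skipped)

	failed, skipped = runOrdered(specs, true)
	fmt.Println("continue on failure:", failed, "skipped:", skipped)
}
```

In the default mode the failure of `third` is never reported because the spec does not run; with the flag set, both failures surface.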
Signed-off-by: Francesco Romani <fromani@redhat.com>
Initially we added minimal quota disablement e2e tests,
but since the emergence of https://github.com/kubevirt/kubevirt/issues/14965
it became clear that it is better to have full coverage.
This PR restores coverage parity with the old test suite.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Added tests to verify DRA functionality with 2 different socket
configurations:
- the same socket is used for the registration and the DRA service
- 2 separate sockets are used for the registration and the DRA service
Used table-driven Ginkgo specs to avoid code duplication:
https://onsi.github.io/ginkgo/#table-driven-tests
This change enhances the robustness of the DRA e2e tests by
validating its behavior with different socket setups.
Added an ability to specify the socket path for the DRA gRPC
service in the e2e node tests.
The PluginSocket option is added to allow setting the name
of the socket inside the directory where the DRA driver
creates the socket for the DRA gRPC calls. This is used by
the kubelet to connect to the DRA plugin.
The newDRAService and newRegistrar functions are updated to
accept a socketPath parameter, which is used to configure
the PluginDataDirectoryPath and PluginSocket options for the
DRA plugin.
This change enables more flexible configuration of the DRA
plugin in e2e tests, allowing for testing with different
socket paths.
Fixed the following warnings:
dra_test.go:884:2: singleCaseSwitch: should rewrite switch statement to if statement (gocritic)
switch podName {
^
dra_test.go:686:4: SA4006: this value of kubeletPlugin is never used (staticcheck)
kubeletPlugin = newDRAService(ctx, f.ClientSet, nodeName, driverName)
^
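
For reference, the shape of the `singleCaseSwitch` fix is sketched below. The function and the string values are hypothetical; only the switch-to-if rewrite reflects the warning.

```go
package main

import "fmt"

// Before (the form flagged by gocritic's singleCaseSwitch check):
//
//	switch podName {
//	case "special":
//		return "handled"
//	}
//	return "default"
//
// After: the equivalent if statement. classify and its values are
// hypothetical and only illustrate the rewrite.
func classify(podName string) string {
	if podName == "special" {
		return "handled"
	}
	return "default"
}

func main() {
	fmt.Println(classify("special"))
	fmt.Println(classify("other"))
}
```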
This ensures that ResourceSlices get removed also when a plugin becomes
unresponsive without removing the registration socket.
Tests are from https://github.com/kubernetes/kubernetes/pull/131073 by Ed
with some modifications, the implementation is new.
The rest of the system logs information using "driverName" as key in structured
logging. The kubelet should do the same.
This also gets clarified in the code, together with using a
consistent name for a Plugin pointer: "plugin" instead of "client" or
"instance".
The New in NewDRAPluginClient made no sense because it's not constructing
anything, and it returns a plugin, not a client -> GetDRAPlugin.
When a test verifies that a container has restarted, we use a continually exiting
container. Verifying that the number of restarts is equal to an expected value, rather
than using a less-than comparison, introduces a race between the container restarting
and the status observation.
Signed-off-by: Peter Hunt <pehunt@redhat.com>
In general, the rewritten e2e cpumanager tests assume cgroup v2.
A limited set of these may be updated to also work with the
obsolete and declining cgroup v1, but these need to be reviewed
on a test-by-test basis.
To fix test failures, we add a top level require for cgroup v2,
skipping otherwise. This will fix the red lanes while we review
the testcases and the deprecation plan of the other tests.
Signed-off-by: Francesco Romani <fromani@redhat.com>
The package is unmaintained, and the tests don't rely on the
functionality it provides on top of Golang errors (stack traces).
Signed-off-by: Stephen Kitt <skitt@redhat.com>
The PR https://github.com/kubernetes/kubernetes/pull/130274 rewrote the
cpumanager tests assuming there are always at least 4 online CPUs,
adding checks for the tests which require more.
We still have, and likely we will have for the time being, lanes
which run on machines with 2 online CPUs.
Thus, every test which either reserves CPUs (--reserved-cpus) or runs
pods with exclusive CPU allocation must declare its requirements
and skip if the machine doesn't provide them.
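
The declare-and-skip pattern can be sketched as follows. `skipReason` is a hypothetical helper, and `runtime.NumCPU` is used here only as a stand-in for however the tests count online CPUs.

```go
package main

import (
	"fmt"
	"runtime"
)

// skipReason returns a non-empty explanation when fewer CPUs are available
// than the test declares it needs, and an empty string otherwise.
func skipReason(onlineCPUs, requiredCPUs int) string {
	if onlineCPUs < requiredCPUs {
		return fmt.Sprintf("requires %d online CPUs, found %d", requiredCPUs, onlineCPUs)
	}
	return ""
}

func main() {
	// runtime.NumCPU reports the logical CPUs usable by this process; it is
	// an assumption of this sketch, not the check used by the tests.
	if reason := skipReason(runtime.NumCPU(), 4); reason != "" {
		fmt.Println("SKIP:", reason)
		return
	}
	fmt.Println("running test")
}
```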
Signed-off-by: Francesco Romani <fromani@redhat.com>
Adds a DRA e2e_node test to verify that the kubelet plugin manager
retries plugin registration when the GetInfo call fails, and
successfully registers the plugin once GetInfo succeeds.
This ensures correct recovery and registration behavior for
DRA plugins in failure scenarios.
We only need one special "DynamicResourceAllocation" feature for the optional
node support of DRA (plugin registration, CDI support in the container
runtime). For individual features, the automatic labeling through
WithFeatureGate is sufficient.
To find DRA-related tests in a label filter, instead of plain-text "DRA" a
"DRA" label now gets added.
This change depends on an update of the DRA jobs.
rewrite the tests, porting them to the new layout and utilities.
We may add more cases and better integration in the future.
Signed-off-by: Francesco Romani <fromani@redhat.com>
now that we have a minimal BeforeEach (and let's keep it this way)
we can factor out the reservedCPUs setting, since it's the same
code for each testcase.
Signed-off-by: Francesco Romani <fromani@redhat.com>
reuse existing building blocks at the cost of
a tiny, non-nested BeforeEach (which is still OK)
and some targeted duplication.
Signed-off-by: Francesco Romani <fromani@redhat.com>
rewrite tests which exercise multiple containers within the
same pod. Preserve the existing testcases, add more.
Note that basic coverage for mixed pods (some containers requiring
exclusive CPUs, some not) was already added with the initial batch.
Signed-off-by: Francesco Romani <fromani@redhat.com>
We have tests which cover the case in which a pod
with a single container requires multiple CPUs;
rewrite them, preserving the testcases and actually
adding coverage.
Add and use stricter checks along the way.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Complete the rewrite of the policy option compatibility tests,
rewriting the tests which check compatibility
between the `full-pcpus-only` and `distribute-cpus-across-numa` options.
All testcases are preserved.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Rewrite the policy option compatibility tests.
We start with the tests which check the compatibility
between the `full-pcpus-only` and `strict-cpu-reservation`
options, because the former is the only GA option
at the time of writing.
All testcases are preserved.
Signed-off-by: Francesco Romani <fromani@redhat.com>
rewrite the cpumanager e2e tests for the
`strict-cpu-reservation` policy option to fit
into the new layout.
All testcases are preserved.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Rewrite the cpumanager tests to make use of the lessons
learned and more modern idioms, remove obsolete assumptions
and in general remove all the legacy which has accumulated
over the years.
The goal is to have a simpler, flatter and more maintainable
code layout and to disentangle the web of dependencies,
making the tests more robust and easier to extend.
In short, this is all about maintainability. All the testcases
will be preserved, and a few others can be added along the way.
Comments in the code will explain the code layout decisions
and tradeoffs, and provide a good guide for adding more tests
in the future.
Special care was taken to maximize the isolation between
tests, at the cost, in selected cases, of controlled and planned
code duplication.
Signed-off-by: Francesco Romani <fromani@redhat.com>
Passing a constant value to gomega.Consistently means that it will not re-check
while running.
Found by linter after removing the suppression rule for the check. It was
disabled earlier because of a bug in the linter.
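
The underlying problem is ordinary Go argument evaluation, which can be shown without gomega. The two pollers below are hypothetical models of a repeated check; they are not gomega's API.

```go
package main

import "fmt"

// pollValue models passing a plain value: the argument was evaluated once
// at the call site, so every iteration compares the same snapshot.
func pollValue(actual, want, polls int, between func()) bool {
	for i := 0; i < polls; i++ {
		if actual != want {
			return false
		}
		between() // state may change between polls
	}
	return true
}

// pollFunc models passing a function: the state is re-read on every poll.
func pollFunc(actual func() int, want, polls int, between func()) bool {
	for i := 0; i < polls; i++ {
		if actual() != want {
			return false
		}
		between()
	}
	return true
}

func main() {
	state := 0
	change := func() { state = 1 }

	// The snapshot never sees the change, so the check passes wrongly.
	fmt.Println(pollValue(state, 0, 3, change)) // true

	state = 0
	// Re-reading the state detects the change on the second poll.
	fmt.Println(pollFunc(func() int { return state }, 0, 3, change)) // false
}
```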