* Fix test flakiness caused by shadowed error in reboot test and stale pods affecting eviction test
- Fixed a bug in `ExecCommandInContainerWithFullOutput` usage in `hybrid_network.go` where the `err` variable was shadowed, causing test failures to be silently ignored.
- Added cleanup logic in `eviction.go` to explicitly delete pods created by the reboot node test (e.g., `img-puller`, `reboot-host-test-windows`) before starting the eviction test.
- This improves reliability of the eviction test when the reboot test runs beforehand and leaves behind memory-consuming pods.
* Add cleanup containers in reboot node test
* Workaround for Calico HNS issue by avoiding eviction of undeletable pods
Add a workaround for a known Calico issue (https://github.com/projectcalico/calico/issues/6974) where pods on Windows nodes may become undeletable after a reboot. This causes the eviction manager to attempt evicting these pods but fail due to HNS namespace deletion errors.
The workaround avoids scheduling critical test pods on rebooted nodes to prevent interference.
TODO: Remove this workaround once the Calico issue is resolved.
* Fix the lint issue.
* Address comments
- Add logic to wait until memory-pressure taint is cleared before running the test.
This reduces test flakiness due to lingering node conditions between retries.
github.com/client9/misspell was archived by the owner on Mar 26, 2025.
The golangci-lint team maintains a fork.
The newer code finds some misspellings which where missed before.
We have "-kube-test-repo-list" command line flag to override the image registry. If we store it in global variable, then that overriding cannot take effect.
And this can cause puzzling bugs, e.g.: containerIsUnused() function will compare incorrect image address.
This changes the text registration so that tags for which the framework has a
dedicated API (features, feature gates, slow, serial, etc.) those APIs are
used.
Arbitrary, custom tags are still left in place for now.
Add retry logic to the `assertConsistentConnectivity` function from
the `test/e2e/windows/hybrid_network.go` file.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
framework.SIGDescribe is better because:
- Ginkgo uses the source code location of the test, not of the wrapper,
when reporting progress.
- Additional annotations can be passed.
To make this a drop-in replacement, framework.SIGDescribe generates a function
that can be used instead of the former SIGDescribe functions.
windows.SIGDescribe contained some additional code to ensure that tests are
skipped when not running with a suitable node OS. This gets moved into a
separate wrapper generator, to allow using framework.SIGDescribe as intended.
To ensure that all callers were modified, the windows.sigDescribe isn't
exported anymore (wasn't necessary in the first place!).
The spaces are redundant because Ginkgo will add them itself when concatenating
the different test name components. Upcoming change in the framework will
enforce that there are no such redundant spaces.
The failure message becomes a bit nicer. Found by the new ginkgolinter, for
example:
test/e2e/windows/memory_limits.go:160:2: ginkgo-linter: wrong boolean assertion; consider using `gomega.Eventually(ctx, func() bool {
eventList, err := f.ClientSet.CoreV1().Events(f.Namespace.Name).List(ctx, metav1.ListOptions{})
...
}, 3*time.Minute, 10*time.Second).Should(gomega.BeTrue())` instead (ginkgolinter)
Ginkgo changed the noColor command line arg to be no-color and will
issue the following warning:
You're using deprecated Ginkgo functionality:
=============================================
--noColor is deprecated, use --no-color instead
Fix this by changing all occurrences accordingly.
Sometimes, the kube-proxy endpoints take longer than 10 seconds to
be ready, so the initial connection check fails (via `gomega.Eventually`).
This patch uses a separate longer timeout for `gomega.Eventually`.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>