Moving Scheduler interfaces to staging: Move the PodInfo and NodeInfo interfaces (together with related types) to the staging repo, leaving the internal implementation in kubernetes/kubernetes/pkg/scheduler
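A minimal sketch of the intended split, with an illustrative package name and method sets (the actual interfaces landing in the staging repo may differ):
```go
// Illustrative only: the interface side of the split, with a placeholder
// package name; the real staging location and method sets may differ.
package framework

import v1 "k8s.io/api/core/v1"

// NodeInfo is the read-only view that plugins and out-of-tree consumers
// would import from the staging repo.
type NodeInfo interface {
	Node() *v1.Node
	GetPods() []PodInfo
}

// PodInfo is the read-only view of a pod tracked by the scheduler.
type PodInfo interface {
	GetPod() *v1.Pod
}

// nodeInfo stands in for the concrete implementation, which stays in
// kubernetes/kubernetes/pkg/scheduler rather than moving to staging.
type nodeInfo struct {
	node *v1.Node
	pods []PodInfo
}

func (n *nodeInfo) Node() *v1.Node     { return n.node }
func (n *nodeInfo) GetPods() []PodInfo { return n.pods }
```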
* Add JSON & YAML output support for kubectl api-resources
Create a separate `PrintFlags` struct within the apiresources.go file
that handles printing only for `kubectl api-resources`, because the existing
output formats, i.e. wide and name, are already implemented
independently of HumanReadableFlags and NamePrintFlags (a sketch follows
this list of commits).
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* Use separate printer type for all options
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* Unit tests for JSON & YAML outputs
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* Separate file for print types
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* Move JSON-YAML tests to separate function
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* Fix broken unit test
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* Unifying JSON & YAML unit test functions
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* Fix linter errors
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
* PR feedback and linter again
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
---------
Signed-off-by: Dharmit Shah <shahdharmit@gmail.com>
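A hedged sketch of the dedicated `PrintFlags` type described above, built on the cli-runtime `JSONYamlPrintFlags` helper; field and method names are illustrative and may not match the merged code:
```go
// Illustrative sketch of a dedicated PrintFlags for `kubectl api-resources`;
// it only knows about json/yaml, leaving wide and name to the existing code.
package apiresources

import (
	"fmt"
	"strings"

	"k8s.io/cli-runtime/pkg/genericclioptions"
	"k8s.io/cli-runtime/pkg/printers"
)

// PrintFlags composes the shared JSONYamlPrintFlags helper from cli-runtime.
type PrintFlags struct {
	JSONYamlPrintFlags *genericclioptions.JSONYamlPrintFlags
	OutputFormat       *string
}

// NewPrintFlags returns flags with an empty (default, human-readable) format.
func NewPrintFlags() *PrintFlags {
	outputFormat := ""
	return &PrintFlags{
		JSONYamlPrintFlags: genericclioptions.NewJSONYamlPrintFlags(),
		OutputFormat:       &outputFormat,
	}
}

// ToPrinter returns a json/yaml printer for the requested format, or an error
// for formats this struct does not handle (wide, name, unknown values).
func (f *PrintFlags) ToPrinter() (printers.ResourcePrinter, error) {
	format := strings.ToLower(strings.TrimSpace(*f.OutputFormat))
	if p, err := f.JSONYamlPrintFlags.ToPrinter(format); !genericclioptions.IsNoCompatiblePrinterError(err) {
		return p, err
	}
	return nil, fmt.Errorf("unsupported output format %q", format)
}
```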
As part of PR 132028 we added more e2e test coverage to validate
the fix and to check, as much as possible, that there are no regressions.
The issue and the fix become evident largely when inspecting
memory allocation with the Memory Manager static policy enabled.
Quoting the commit message of bc56d0e45a
```
The podresources API List implementation uses the internal data of the
resource managers as source of truth.
Looking at the implementation here:
https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/pkg/kubelet/apis/podresources/server_v1.go#L60
we take care of syncing the device allocation data before querying the
device manager to return its pod->devices assignment.
This is needed because otherwise the device manager (and all the other
resource managers) would do the cleanup asynchronously, so the `List` call
will return incorrect data.
But we don't do this syncing for either CPUs or memory,
so when we report these we get stale data, as issue #132020 demonstrates.
For the CPU manager, however, we have the reconcile loop which cleans up the stale data periodically.
It turns out this timing interplay was actually the reason the existing issue #119423 seemed fixed
(see: #119423 (comment)).
But it's really just timing: if in the reproducer we set the `cpuManagerReconcilePeriod` to a
very high value (>= 5 minutes), then the issue still reproduces against the current master branch
(https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/test/e2e_node/podresources_test.go#L983).
```
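The synchronization pattern the quoted message refers to looks roughly like this (a hedged sketch; `resourceSyncer` and `SyncAllocatedResources` are made-up names, not the actual kubelet interfaces):
```go
// Illustrative sketch of the sync-before-List pattern; resourceSyncer and
// SyncAllocatedResources are made-up names, not the actual kubelet interfaces.
package podresources

import (
	"context"

	podresourcesv1 "k8s.io/kubelet/pkg/apis/podresources/v1"
)

// resourceSyncer stands in for the device, CPU and memory managers.
type resourceSyncer interface {
	// SyncAllocatedResources drops allocations belonging to pods that have
	// already terminated, so List does not report them.
	SyncAllocatedResources()
}

type listServer struct {
	syncers []resourceSyncer
	// assignments gathers the current pod -> resources mapping from the managers.
	assignments func() []*podresourcesv1.PodResources
}

// List syncs every manager first; without this, cleanup happens asynchronously
// and the response can still contain CPUs/memory of terminated pods (#132020).
func (s *listServer) List(ctx context.Context, req *podresourcesv1.ListPodResourcesRequest) (*podresourcesv1.ListPodResourcesResponse, error) {
	for _, r := range s.syncers {
		r.SyncAllocatedResources()
	}
	return &podresourcesv1.ListPodResourcesResponse{PodResources: s.assignments()}, nil
}
```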
The missing actor here is the memory manager. The memory manager has no
reconcile loop (which would implicitly fix the stale data problem) and no explicit
synchronization, so it is the unlucky one which reported stale data,
leading to the eventual understanding of the problem.
For this reason it was (and still is) important to exercise it during
the test.
It turns out, however, that the test is wrong, likely because of a hidden dependency
between the test expectations and the lane configuration (notably the
machine specs), so we disable the memory manager activation for the time
being, until we figure out a safe way to enable it.
Note this significantly weakens the signal for this specific test.
Signed-off-by: Francesco Romani <fromani@redhat.com>
This avoids the overhead of the more complex conversion to v1beta1 and might
make it a bit more realistic to get rid of v1beta1 eventually.
The expected GVK must be set explicitly because when emulating 1.33,
v1beta1 is the default although the fixed storage version is v1beta2.
It hasn't been on by default before, therefore it does not get locked to the
new default (on) yet. This has some impact on the scheduler configuration
because the plugin is now enabled by default.
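A hedged sketch of what that means for a scheduler profile, assuming the plugin in question is the DynamicResources scheduler plugin: an admin who wants the previous behavior would have to disable it explicitly.
```go
// Illustrative sketch: explicitly disabling the (assumed) DynamicResources
// plugin in a v1 KubeSchedulerConfiguration, now that it defaults to enabled.
package main

import (
	"fmt"

	schedconfigv1 "k8s.io/kube-scheduler/config/v1"
)

func main() {
	schedulerName := "default-scheduler"
	cfg := schedconfigv1.KubeSchedulerConfiguration{
		Profiles: []schedconfigv1.KubeSchedulerProfile{{
			SchedulerName: &schedulerName,
			Plugins: &schedconfigv1.Plugins{
				MultiPoint: schedconfigv1.PluginSet{
					// Opting out of the plugin that is now on by default.
					Disabled: []schedconfigv1.Plugin{{Name: "DynamicResources"}},
				},
			},
		}},
	}
	fmt.Printf("%+v\n", cfg)
}
```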
Because the feature is now GA, it doesn't need to be a label on E2E tests,
which wouldn't be possible anyway once it gets removed entirely.
The pods/finalizer permission can be restricted to just updates because that is
all that matters.
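Expressed with the rbac/v1 types, the narrowed rule might look like this (a sketch; the subresource spelling and the role it belongs to are assumptions):
```go
// Illustrative sketch: the finalizer permission narrowed to the update verb,
// using the rbac/v1 types; the subresource spelling here is an assumption.
package rbacrules

import rbacv1 "k8s.io/api/rbac/v1"

// podFinalizersRule grants only what is needed to add or remove the pod
// finalizer; no get, list, or patch verbs are included.
func podFinalizersRule() rbacv1.PolicyRule {
	return rbacv1.PolicyRule{
		APIGroups: []string{""},
		Resources: []string{"pods/finalizers"},
		Verbs:     []string{"update"},
	}
}
```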
The DeviceTaints rules were under the wrong feature gate check (a copy-and-paste
mistake) and must remain disabled when DRA itself becomes enabled.
Some tests do version emulation and need the DRA feature. In that combination
the --runtime-config-emulation-forward-compatible option is needed to allow
enabling the v1 API although it's only available in 1.34.
As before when adding v1beta2, DRA drivers built using the
k8s.io/dynamic-resource-allocation helper packages remain compatible with all
Kubernetes releases >= 1.32. The helper code picks whatever API version is
enabled from v1beta1/v1beta2/v1.
However, the control plane now depends on v1, so a cluster configuration where
only v1beta1 or v1beta2 is enabled without v1 won't work.
The sig-node tests have scenarios where probe and
lifecycle handler tests with post-start and pre-stop hooks
set the host field to another pod.
At the baseline level such things won't be allowed because of
the PSA rules we are adding in this PR. Unsetting
the host field means the probe uses the pod's own podIP for
the checks, and using that in the pre-stop and post-start
hooks is tricky because of timing issues between when the
container is actually up vs. when the test runs.
So I have changed the tests to be privileged so that they can
use the .host fields if they desire to.
See https://github.com/kubernetes/kubernetes/issues/133091,
which was opened to properly refactor these tests.
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
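For illustration, the kind of spec these tests relied on, a probe whose `.host` points at another pod's IP, looks roughly like this (a hedged sketch; the helper name and values are made up):
```go
// Illustrative sketch of the pattern the affected tests used: an HTTP probe
// whose .host points at another pod's IP; names and values are made up.
package fixtures

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// probeOtherPod targets a different pod's IP via .host; under the new baseline
// PSA checks such a spec is rejected, hence the tests now run as privileged.
// Leaving Host empty would make the kubelet probe the pod's own podIP instead.
func probeOtherPod(otherPodIP string) *corev1.Probe {
	return &corev1.Probe{
		ProbeHandler: corev1.ProbeHandler{
			HTTPGet: &corev1.HTTPGetAction{
				Host: otherPodIP,
				Path: "/healthz",
				Port: intstr.FromInt32(8080),
			},
		},
	}
}
```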
This commit adds the fixture tests for the
new .host field restrictions on probes
and lifecycle handlers.
Ran: UPDATE_POD_SECURITY_FIXTURE_DATA=true go test -v ./test/... -run TestFixtures
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>