Commit Graph

3056 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
7b6c56e5fb Merge pull request #130135 from saschagrunert/image-volume-beta
[KEP-4639] Graduate image volume sources to beta
2025-03-12 18:03:58 -07:00
Kubernetes Prow Robot
05bfdbc6dd Merge pull request #129950 from ffromani/alignment-error-detail-metrics
node: metrics for alignment failures
2025-03-12 18:03:46 -07:00
Anish Ramasekar
2090a01e0a add e2e test with the gcp-credential-provider test plugin
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2025-03-11 20:36:36 -07:00
Sascha Grunert
f9e5dd84ad Graduate image volume sources to beta
Graduate the feature to beta, by:

- Allowing `subPath`/`subPathExpr` for image volumes
- Modifying the CRI to pass down the (resolved) sub path
- Adding metrics which are outlined in the KEP

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2025-03-11 13:41:45 +01:00
Kubernetes Prow Robot
b82260f003 Merge pull request #130391 from bart0sh/PR174-e2e_node-fix-eviction-kubetest2
e2e_node: fix ImageGCNoEviction test for kubetest2
2025-03-10 08:57:53 -07:00
Kubernetes Prow Robot
0eaee48ecb Merge pull request #130569 from dims/update-to-latest-cadvisor-v0.52.0
Update to latest cadvisor @ v0.52.1 and new opencontainers/cgroups and drops opencontainers/runc
2025-03-07 17:09:51 -08:00
Kubernetes Prow Robot
2effa5e3cf Merge pull request #130352 from natasha41575/kubelet-pod-observedgen
[FG:PodObservedGenerationTracking] Kubelet sets pod `status.observedGeneration` when updating the pod status
2025-03-07 13:33:45 -08:00
Kubernetes Prow Robot
74cb75c884 Merge pull request #130396 from bart0sh/PR173-e2e_node-fix-getting-pod-logs
e2e_node: remote: fix getting pod logs
2025-03-07 05:21:45 -08:00
Natasha Sarkar
eab9197d1a Add observedGeneration and validation to pod status and conditions
2025-03-06 20:08:06 +00:00
Davanum Srinivas
5ecddb6571 update to latest cadvisor @ v0.52.0
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2025-03-05 06:36:39 -05:00
Francesco Romani
04129d1dc8 node: metrics for alignment failures
Add metrics to report alignment allocation failures
See: https://github.com/kubernetes/enhancements/pull/5108

Signed-off-by: Francesco Romani <fromani@redhat.com>
2025-03-04 19:50:08 +01:00
Kubernetes Prow Robot
4f3fd12bc1 Merge pull request #130116 from AkihiroSuda/rro
KEP-3857: Recursive Read-only (RRO) mounts: promote to GA
2025-03-03 19:55:49 -08:00
Kubernetes Prow Robot
88d2355c41 Merge pull request #129951 from parkjeongryul/add-e2e-topology-manager-for-init-ctn
Add e2e test for topology manager with restartable init containers
2025-02-28 04:38:23 -08:00
parkjeongryul
dca3f56f64 Add e2e test for topology manager with restartable init containers
2025-02-28 00:48:27 +09:00
Davanum Srinivas
fb3b163ca0 Ensure we switch to k8s root directory for dockerized builds during e2e-node ci job
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2025-02-27 10:05:45 -05:00
Ed Bartosh
65c792ca9b e2e_node: remote: fix getting pod logs
2025-02-27 09:04:43 +02:00
Kubernetes Prow Robot
27cbe54b09 Merge pull request #130163 from ffromani/e2e-node-fix-cpu-quota-test
e2e: node: cpumgr: cleanup after each test case
2025-02-25 05:08:29 -08:00
Ed Bartosh
4c0b24b06d e2e_node: eviction: fix ImageGCNoEviction test
ImageGCNoEviction fails when tests are run by kubetest2, as the test depends
on the prepulled test images (framework.TestContext.PrepullImages), but
kubetest2 --prepull-images command line option is set to false by
default.

Prepulling images explicitly for the only test that uses them
should fix the issue.
2025-02-24 18:27:38 +02:00
Ed Bartosh
cf70b06e37 e2e_node: improve logging for eviction test
2025-02-24 10:39:04 +02:00
Francesco Romani
323410664c e2e: node: cpumgr: check CPU allocatable for CFS quota test
add (admittedly pretty crude) CPU allocatable check.
A more incisive refactoring is needed, but we need
to unbreak CI first, so this seems the minimal decently clean test.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2025-02-18 10:04:57 +01:00
carlory
c48499d360 fix ci
Signed-off-by: carlory <baofa.fan@daocloud.io>
2025-02-17 11:49:24 +08:00
Francesco Romani
844c2ef39d e2e: node: cpumgr: cleanup after each test case
Our CI machines happen to have 1 fully allocatable CPU for test workloads.
This is really the bare minimum, but it should still be sufficient for the tests to run.
The CFS quota test, however, creates a series of pods (at time of writing, 6)
and does the cleanup only at the very end. This means pods
requiring resources accumulate on the CI machine node.

The fix implemented here is to just clean up after each subcase.
With that, the CPU footprint of the test equals the highest single requirement (say, 1000 millicores)
rather than the sum of all the subcases' requirements.

This doesn't change the test behavior, and makes it possible
to run it on very barebones machines.
2025-02-14 15:45:36 +01:00
Akihiro Suda
d6a6dda2fa KEP-3857: Recursive Read-only (RRO) mounts: promote to GA
Discussed in kubernetes/enhancements PR 5157

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2025-02-13 20:43:35 +09:00
Kubernetes Prow Robot
5d57d0c110 Merge pull request #129845 from bitoku/fix-flake
Reduce the number of processes used in e2e to prevent unexpected OOM
2025-02-12 13:16:20 -08:00
Kubernetes Prow Robot
5e1c31b9db Merge pull request #130053 from iholder101/bugfix/swap-resource-metrics-e2e-bug
[KEP-2400] [failing-test] resource metrics e2e tests: expect swap node and container level stats
2025-02-12 12:02:28 -08:00
Kubernetes Prow Robot
cd2959b798 Merge pull request #127525 from scott-grimes/patch-1
fix: pods meeting qualifications for static placement when cpu-manager-policy=static should not have cfs quota enforcement
2025-02-12 12:02:21 -08:00
Ayato Tokubi
dbb34a04cc Reduce the number of processes used in e2e to prevent unexpected OOM
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-02-12 17:39:56 +00:00
Scott Grimes
1c5170ff52 disable cfs quota when exclusive cpus allocated per static cpu policy requirements
2025-02-11 13:42:30 -05:00
Itamar Holder
8a797e42e1 resource metrics e2e tests: expect swap node and container level stats
Signed-off-by: Itamar Holder <iholder@redhat.com>
2025-02-11 15:19:45 +02:00
Kubernetes Prow Robot
491a23f079 Merge pull request #129999 from pohly/test-e2e-node-timeout
E2E node: fix --timeout default
2025-02-06 03:59:55 -08:00
Patrick Ohly
46a17f60e4 E2E node: fix --timeout default
For unknown reasons, hack/make-rules/test-e2e-node.sh adds -timeout instead of
--timeout. Therefore the fallback code in test/e2e_node/remote/remote.go didn't
find it and added its own --timeout=60m after it. This effectively limits E2E
node test runs to 60 minutes, regardless of what is specified in the job:

    W0206 09:53:51.425532    7151 remote.go:158] ginkgo flags are missing explicit --timeout (ginkgo defaults to 60 minutes)
    I0206 09:53:51.425565    7151 remote.go:165] updated ginkgo flags: -timeout=24h --label-filter="Feature: containsAny DynamicResourceAllocation && Feature: isSubsetOf { Beta, DynamicResourceAllocation } && !Flaky && !Slow"  --no-color -v --timeout=60m
    ...
    I0206 09:53:57.767096    7151 ssh.go:146] Running the command ssh, with args: ... timeout -k 30s 3600.000000s ./ginkgo -timeout=24h --label-filter="Feature: containsAny DynamicResourceAllocation && Feature: isSubsetOf { Beta, DynamicResourceAllocation } && !Flaky && !Slow"  --no-color -v --timeout=60m ...

Note that the timeout for the test was 60m in this case (hence the "timeout -k
30s 3600.000000s") but it could also be something larger.
2025-02-06 11:45:12 +01:00
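The flag mismatch described in commit 46a17f60e4 above can be illustrated with a small sketch. This is not the code in test/e2e_node/remote/remote.go; `hasFlag` and `withDefaultTimeout` are hypothetical helpers that assume the fallback looks only for the double-dash spelling, which is what the logged output suggests.

```go
package main

import (
	"fmt"
	"strings"
)

// hasFlag reports whether flags contains name, either bare or as name=value.
// Hypothetical helper: it matches the exact spelling only.
func hasFlag(flags, name string) bool {
	for _, f := range strings.Fields(flags) {
		if f == name || strings.HasPrefix(f, name+"=") {
			return true
		}
	}
	return false
}

// withDefaultTimeout appends --timeout=60m when no explicit --timeout is found.
// A single-dash -timeout is not recognized, so the default is still appended.
func withDefaultTimeout(flags string) string {
	if hasFlag(flags, "--timeout") {
		return flags
	}
	return flags + " --timeout=60m"
}

func main() {
	// Single dash: the fallback misses it and adds its own 60m limit.
	fmt.Println(withDefaultTimeout("-timeout=24h --no-color -v"))
	// Double dash: the explicit value is kept untouched.
	fmt.Println(withDefaultTimeout("--timeout=24h --no-color -v"))
}
```

Under this assumption the first call yields flags ending in `--timeout=60m`, mirroring the "updated ginkgo flags" log line, while the second leaves the 24h value alone.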
Kubernetes Prow Robot
c4434c3161 Merge pull request #129910 from bitoku/fix-129836
Fix flaky test for container life cycle
2025-02-04 16:23:09 -08:00
Kubernetes Prow Robot
f82439f536 Merge pull request #129486 from iholder101/bugfix/swap-container-cri-stats
[KEP-2400] [Bugfix]: Ensure container-level swap metrics are collected
2025-02-04 08:14:59 -08:00
Kubernetes Prow Robot
a376ae5dad Merge pull request #128845 from SergeyKanzhelev/staticPodUpgrade
static pod upgrade test with hostNetwork
2025-02-03 23:30:58 -08:00
Vinayak Goyal
81f09811ca Fix kubelet_authz_test.go
2025-01-31 15:38:18 +00:00
Ayato Tokubi
da5a76bd39 Fix flaky test for container life cycle
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-01-30 16:23:51 +00:00
Vinayak Goyal
ce7d2130ad Fix kubelet_authz_test.go
2025-01-29 23:06:56 +00:00
Swati Sehgal
82f0303f89 node: e2e: Remove flaky label as device plugin reboot test is deflaked
With the device plugin node reboot test fixed, we can see in testgrid
[node-kubelet-containerd-flaky](https://testgrid.k8s.io/sig-node-containerd#node-kubelet-containerd-flaky)
that the test is passing consistently and we can remove the flaky label.

With the test not flaky anymore, we can validate new PRs against it
and ensure we don't cause regressions.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-01-29 11:12:40 +00:00
Kubernetes Prow Robot
48dce2e9b3 Merge pull request #129776 from saschagrunert/cni-plugins-1.6.2
Update CNI plugins to v1.6.2 and avoid using k8s-artifacts-cni bucket
2025-01-28 07:29:26 -08:00
Kubernetes Prow Robot
2bda5dd8c7 Merge pull request #129656 from vinayakankugoyal/kep2862beta
KEP-2862: Graduate to BETA.
2025-01-27 19:05:23 -08:00
Itamar Holder
617c094435 Add an e2e test
Signed-off-by: Itamar Holder <iholder@redhat.com>
2025-01-27 15:44:18 +02:00
Vinayak Goyal
3a780a1c1b KEP-2862: Graduate to BETA.
2025-01-24 21:36:00 +00:00
Kubernetes Prow Robot
29bf17b6cf Merge pull request #129168 from kannon92/drop-node-features
[KEP-3041] - remove nodefeatures from k/k repo
2025-01-23 12:07:29 -08:00
Kubernetes Prow Robot
4f979c9db8 Merge pull request #129010 from ffromani/e2e-fix-device-plugin-reboot-test
node: e2e: fix device plugin reboot test
2025-01-23 12:07:22 -08:00
Sascha Grunert
da999fbc1b Update CNI plugins to v1.6.2 and avoid using k8s-artifacts-cni bucket
Update the CNI plugins to the latest release and switch over to using
GitHub releases instead of the `k8s-artifacts-cni` bucket.

Follow-up on https://github.com/kubernetes/kubernetes/pull/129095

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2025-01-23 10:50:58 +01:00
Kubernetes Prow Robot
a271299643 Merge pull request #129717 from esotsal/fix-128837
testing: Fix pod delete timeout failures after InPlacePodVerticalScaling Graduate to Beta commit
2025-01-21 15:50:47 -08:00
Kubernetes Prow Robot
0d988d7209 Merge pull request #129619 from ffromani/sig-node-approvers-ffromani
Self-nominating ffromani as approver for sig-node container and resource managers
2025-01-21 15:50:36 -08:00
Kubernetes Prow Robot
3d2ee2fbb7 Merge pull request #129609 from carlory/cleanup-exec-utils
Move some exec helper functions from framework/volume to framework/pod
2025-01-21 09:00:37 -08:00
Sotiris Salloumis
c5fc4193bb Fix pod delete issues in podresize tests
2025-01-21 07:25:14 +01:00
Kevin Hannon
bae4122f56 deprecate nodefeature for feature labels
2025-01-20 17:02:59 -05:00