Commit Graph

2897 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
08aefc8a92 Merge pull request #119362 from pacoxu/add-new-eviction-pid-test
add new e2e test with PodAndContainerStatsFromCRI enabled for pid eviction order
2024-09-20 05:44:45 +01:00
rongfu.leng
0c753d1cb9 fix CPUManagerReconcilePeriod field is not allowed 0
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
2024-09-18 23:54:24 +08:00
Kevin Hannon
67402fe110 Revert "add e2e test for restart kubelet" 2024-09-15 15:47:02 -04:00
joey
40dd01fdbd add e2e test for restart kubelet
When a node removes the label that satisfies the pod affinity, the running pods are not affected, but restarting the kubelet will kill these pods.

Signed-off-by: joey <zchengjoey@gmail.com>
2024-09-14 15:17:58 +08:00
杨朱 · Kiki
c6904b76d3 Update test/e2e_node/image_volume.go
Co-authored-by: Sascha Grunert <sgrunert@redhat.com>
2024-09-09 16:02:01 +08:00
carlory
8121bc99a8 e2e-node: should succeed with multiple pods and same image
Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>
2024-09-09 10:32:27 +08:00
Kubernetes Prow Robot
49f486dafa Merge pull request #127018 from saschagrunert/imagevolume-selinux
Allow using SELinux on image volume e2e test
2024-09-06 19:01:34 +01:00
Kubernetes Prow Robot
a105f3683c Merge pull request #126810 from liangyuanpeng/testgrid_podhostips
Do not serial for tests of Pod Host IPs.
2024-09-05 07:03:44 +01:00
Paco Xu
3a21a033bd add new e2e test with PodAndContainerStatsFromCRI enabled for pid pressure e2e 2024-09-05 09:10:27 +08:00
Sascha Grunert
6bd3bb5881 Allow using SELinux on image volume e2e test
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-08-30 10:43:15 +02:00
Sascha Grunert
931b9b3a70 Update cni-plugins to v1.5.1
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-08-28 12:58:36 +02:00
Sascha Grunert
ff50da579e Fix device plugin node ready test assertion
Introduced in d770dd695a and most likely
the cause of the failing test:
https://github.com/kubernetes/kubernetes/issues/126915

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-08-26 14:56:59 +02:00
Kubernetes Prow Robot
3f306ae140 Merge pull request #126343 from SergeyKanzhelev/succeededPodReadmitted
Terminated pod should not be re-admitted
2024-08-22 16:32:09 +01:00
Sascha Grunert
14476d88f3 Fix hugepages e2e test assertion
This causes failures in:

- https://testgrid.k8s.io/sig-node-cri-o#ci-crio-cgroupv2-node-e2e-hugepages
- https://testgrid.k8s.io/sig-node-cri-o#ci-crio-cgroupv1-node-e2e-hugepages

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-08-21 13:37:22 +02:00
Lan Liang
85b52c2a86 do not serial for tests of Pod Host IPs.
Signed-off-by: Lan Liang <gcslyp@gmail.com>
2024-08-20 12:08:19 +00:00
Kubernetes Prow Robot
d770dd695a Merge pull request #121888 from SD-13/e2e-gomega-be-true-or-false
Enhance boolean assertions when fail
2024-08-20 04:24:42 -07:00
Kubernetes Prow Robot
a221d3a40c Merge pull request #126602 from haircommander/node-cm-test
Revert "Skip node container manager test on systemd" and fix test
2024-08-15 15:39:58 -07:00
Kubernetes Prow Robot
7576984eec Merge pull request #126444 from bart0sh/PR152-dra-e2e_node-cleanup
DRA: e2e_node: improve readability
2024-08-13 21:03:59 -07:00
Peter Hunt
c7b7ea0514 e2e_node: update node cgroup manager test to verify kubelet recreates kubepods cgroup
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-08-08 16:53:44 -04:00
Peter Hunt
dd2dcc0b0a e2e_node: enable and fix cgroups test for systemd
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-08-08 15:57:49 -04:00
Sujay
223aedcf6b enhance boolean assertions 2024-07-31 15:58:15 +00:00
Ed Bartosh
c5842ca4ad DRA: e2e_node: improve readability 2024-07-29 21:57:44 +03:00
Paco Xu
9ee99a9307 skip if ResourceHealthStatus is disabled 2024-07-29 17:40:44 +08:00
Kevin Hannon
a1bbae8168 fix resource health status test failures in unlabeled jobs 2024-07-26 09:43:48 -04:00
Kubernetes Prow Robot
e9d9a82839 Merge pull request #124101 from haircommander/process_stats-with-pid-fix
kubelet: fix PID based eviction
2024-07-25 11:59:57 -07:00
Sergey Kanzhelev
300128de65 succeeded pod is being re-admitted 2024-07-25 17:45:27 +00:00
Kubernetes Prow Robot
ab470aad01 Merge pull request #126220 from saschagrunert/image-volumesource-e2e
[KEP-4639] Add `ImageVolumeSource` node e2e tests
2024-07-24 06:40:50 -07:00
Sascha Grunert
bc452887fa Add ImageVolumeSource e2e tests
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-07-24 13:57:39 +02:00
Kubernetes Prow Robot
5af1710d90 Merge pull request #126243 from SergeyKanzhelev/devicePluginFailures
Implement resource health in pod status (KEP 4680)
2024-07-23 20:12:24 -07:00
Kubernetes Prow Robot
638128e74f Merge pull request #119019 from gjkim42/add-e2e-node-test-restarting-the-kubelet
Add node serial e2e tests that simulate the kubelet restart
2024-07-23 18:01:36 -07:00
Sergey Kanzhelev
62f96d2748 set AllocatedResourcesStatus in the Pod Status 2024-07-24 00:29:35 +00:00
Kubernetes Prow Robot
1353c08110 Merge pull request #126298 from vinayakankugoyal/apparmortest
Update AppArmor e2e tests to use both containers[*].securityContext.appArmorProfile field and annotations.
2024-07-23 15:45:29 -07:00
Kubernetes Prow Robot
fa4b8f32ac Merge pull request #125935 from gjkim42/fix-125880
Terminate restartable init containers ignoring not-started containers
2024-07-23 15:45:11 -07:00
Vinayak Goyal
b580eb1864 Update AppArmor e2e tests to use Pod field instead of annotations.
Signed-off-by: Vinayak Goyal <vinaygo@google.com>
2024-07-23 17:03:17 +00:00
Kubernetes Prow Robot
a4f9910c51 Merge pull request #126014 from PannagaRao/kep-ephemeral-storage-quota
pkg/volume/*: Enable quotas in user namespace
2024-07-23 09:21:02 -07:00
Kubernetes Prow Robot
7590cb7adf Merge pull request #125257 from vinayakankugoyal/armor
KEP-24: Update AppArmor feature gates to GA stage.
2024-07-23 09:20:52 -07:00
Kubernetes Prow Robot
3e9a73d558 Merge pull request #126058 from AnishShah/patch-2
Deflake kubernetes-node-swap-fedora-serial jobs
2024-07-22 15:48:42 -07:00
Kubernetes Prow Robot
d21b17264e Merge pull request #125488 from pohly/dra-1.31
DRA for 1.31
2024-07-22 11:45:55 -07:00
Patrick Ohly
d11b58efe6 DRA kubelet: refactor gRPC call timeouts
Some of the E2E node tests were flaky. Their timeout apparently was chosen
under the assumption that kubelet would retry immediately after a failed gRPC
call, with a factor of 2 as safety margin. But according to
0449cef8fd,
kubelet has a different, higher retry period of 90 seconds, which was exactly
the test timeout. The test timeout has to be higher than that.

As the tests don't use the gRPC call timeout anymore, it can be made
private. While at it, the name and documentation get updated.
2024-07-22 18:09:34 +02:00
Patrick Ohly
0b62bfb690 DRA e2e: adapt to v1alpha3 API 2024-07-22 18:09:34 +02:00
Itamar Holder
a6df16af85 node e2e test: exclude critical pods from swapping
Signed-off-by: Itamar Holder <iholder@redhat.com>
2024-07-22 17:56:52 +03:00
Peter Hunt
0979ba9cb8 kubelet/stats: verify there is at least one process in each container
0 processes is too low a bar to be meaningfully testing that the process
stats are being reported.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-07-22 10:54:42 -04:00
Kevin Hannon
7d8ba7849b priority pid tests should match on processes
pids 0
process should not be nonzero
2024-07-22 10:54:42 -04:00
David Porter
6e6b2b76a3 test: Update summary test to check for process count
The process count is expected to always be >= 1 for pods in the test.

Let's check it's >= 1, so we can catch issues if the process count is
not reported.

Signed-off-by: David Porter <david@porter.me>
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2024-07-22 10:54:42 -04:00
PannagaRamamanohara
d16fd6a915 pkg/volume: Use QuotaMonitoring in UserNamespace
Enable LocalStorageCapacityIsolationFSQuotaMonitoring
only when hostUsers in PodSpec is set to false.
Modify unit tests and e2e tests to verify

Signed-off-by: PannagaRamamanohara <pbhojara@redhat.com>
2024-07-22 09:43:57 -04:00
Anish Shah
665df5794e wait for pod to be ready before continuing with the test
This test is flaky. I have noticed that this happens because the pod is not READY when it is being deleted at the end of the test. This fix ensures that the pod is READY before continuing with the rest of the test.
2024-07-22 05:26:59 +00:00
Patrick Ohly
b51d68bb87 DRA: bump API v1alpha2 -> v1alpha3
This is in preparation for revamping the resource.k8s.io completely. Because
there will be no support for transitioning from v1alpha2 to v1alpha3, the
roundtrip test data for that API in 1.29 and 1.30 gets removed.

Repeating the version in the import name of the API packages is not really
required. It was done for a while to support simpler grepping for usage of
alpha APIs, but there are better ways for that now. So during this transition,
"resourceapi" gets used instead of "resourcev1alpha3" and the version gets
dropped from informer and lister imports. The advantage is that the next bump
to v1beta1 will affect fewer source code lines.

Only source code where the version really matters (like API registration)
retains the versioned import.
2024-07-21 17:28:13 +02:00
Gunju Kim
45a243e102 Add node serial e2e tests that simulate the kubelet restart
This adds node e2e tests to make sure a completed init container is not
restarted due to the kubelet restart.
2024-07-19 21:18:34 +09:00
Kubernetes Prow Robot
f2428d66cc Merge pull request #125163 from pohly/dra-kubelet-api-version-independent-no-rest-proxy
DRA: make kubelet independent of the resource.k8s.io API version
2024-07-18 17:47:48 -07:00
Patrick Ohly
616a014347 DRA: move ResourceSlice publishing into DRA drivers
This is a first step towards making kubelet independent of the resource.k8s.io
API versioning because it now doesn't need to copy structs defined by that API
from the driver to the API server. The next step is removing the other
direction (reading ResourceClaim status and passing the resource handle to
drivers).

The drivers must get deployed so that they have their own connection to the API
server. Securing at least the writes via a validating admission policy should
be possible.

As before, the kubelet removes all ResourceSlices for its node at startup, then
DRA drivers recreate them if (and only if) they start up again. This ensures
that there are no orphaned ResourceSlices when a driver gets removed while the
kubelet was down.

While at it, logging gets cleaned up and updated to use structured, contextual
logging as much as possible. gRPC requests and streams now use a shared,
per-process request ID and streams also get logged.
2024-07-18 09:09:19 +02:00