Commit Graph

26143 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
1cd8074b83 Merge pull request #128079 from pohly/e2e-daemonset-check-daemon-status-polling
e2e daemon set: better polling in CheckDaemonStatus
2024-10-15 12:24:21 +01:00
Kubernetes Prow Robot
7c53005b6c Merge pull request #128066 from bart0sh/PR160-e2e_nod-fix-mirror-pod-test
e2e_node: fix mirror pod test
2024-10-15 09:24:21 +01:00
Patrick Ohly
e43065d542 e2e daemon set: better polling in CheckDaemonStatus
As a quick fix for a flake, bceec5a3ff
introduced polling with wait.Poll in all callers of CheckDaemonStatus.

This commit reverts all callers to what they were before (CheckDaemonStatus +
ExpectNoError) and implements polling according to E2E best practices
(https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/writing-good-e2e-tests.md#polling-and-timeouts):

- no logging while polling
- support for progress reporting while polling
- last but not least, produce an informative failure message in case of a
  timeout, including a dump of the daemon set as YAML
2024-10-15 10:12:28 +02:00
Kubernetes Prow Robot
8b7b768ff7 Merge pull request #128011 from seans3/egress-selector-configuration-strict
EgressSelectorConfiguration now uses strict validation
2024-10-15 02:04:20 +01:00
Ed Bartosh
876819b8b6 e2e_node: fix mirror pod test
Modified the stopNfsServer function to wait until the NFS RPC service is
unregistered. This should fix the failing
pull-kubernetes-node-arm64-ubuntu-serial-gce job.
2024-10-15 02:05:26 +03:00
Antonio Ojea
bceec5a3ff e2e flake CheckDaemonStatus assert on async value
The utility for checking DaemonSet status checked only once whether the
Status of the DaemonSet reported that all the desired Pods were scheduled
and ready.

However, the pattern used in the e2e tests for this function did not
take into consideration that the controller needs to propagate the Pod
status to the DaemonSet status, and asserted on the condition only
once after waiting for all the Pods to be ready.

To avoid more code churn, change the CheckDaemonStatus signature to the
wait.Condition type and use it in an async poll loop in the tests.
2024-10-14 13:30:03 +00:00
Kubernetes Prow Robot
de8f6b0db7 Merge pull request #128037 from dshebib/e2eNode_containerLifecycleContext
[e2e_node] Use shared context in regular container tests
2024-10-14 13:10:28 +01:00
Kubernetes Prow Robot
a454563a8d Merge pull request #127812 from p0lyn0mial/upstream-decode-list-blueprint
client-go/rest/request: decodes initialEventsListBlueprint for watchlist requests
2024-10-14 13:10:21 +01:00
Kubernetes Prow Robot
d003e4cd9f Merge pull request #127923 from unvavo/add-test-tainttoleration-for-queueinghint
add integration test for tainttoleration in requeueing scenarios
2024-10-14 10:20:28 +01:00
Lukasz Szaszkiewicz
7be192ae0b client-go/rest/request: decodes initialEventsListBlueprint for watchlist requests
2024-10-14 08:48:32 +02:00
Sean Sullivan
32b2eea50d EgressSelectorConfiguration now uses strict validation
2024-10-13 16:09:35 -07:00
Daniel Shebib
1618dbe695 Add context to tests
2024-10-12 21:23:29 -05:00
unvavo
2c254d3b25 add integration test for tainttoleration in requeueing scenarios
2024-10-12 15:23:55 +09:00
Tsubasa Nagasawa
04c6d9324e Check for restarts without being affected by container startup order
The test for checking container restarts in a Pod with restartable-init-1
and regular-1 is flaky. Right now, when we check if restartable-init-1 has
restarted, we see if it hasn’t written the "Started" log after regular-1 has
written its "Started" log.
But even though the startup sequence starts with restartable-init-1 and then
regular-1, there’s no guarantee they’ll finish starting up in that order.
Sometimes regular-1 finishes first and writes its "Started" log before restartable-init-1.

1. restartable-init-1 Starting
2. regular-1 Starting
3. regular-1 Started
4. restartable-init-1 Started

In this test, the startup order doesn’t really matter; all we need to check is
if restartable-init-1 restarted. So I changed the test to simply look for
more than one "Starting" log in restartable-init-1's logs.

Other places used the same helper function, DoesntStartAfter, so I replaced
those as well and deleted the helper function.
2024-10-12 15:17:47 +09:00
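The restart check described in this commit, counting "Starting" entries instead of comparing the order of "Started" entries across containers, might look roughly like the sketch below. The function name `hasRestarted` and the log format (one entry per line, ending in the state word) are assumptions for illustration and are not taken from the test code.

```go
package main

import "strings"

// hasRestarted reports whether a single container's log contains more than
// one "Starting" entry, which means the container started more than once.
// Assumption: each log line ends with the state word, for example
// "restartable-init-1 Starting".
func hasRestarted(logs string) bool {
	count := 0
	for _, line := range strings.Split(logs, "\n") {
		if strings.HasSuffix(strings.TrimSpace(line), "Starting") {
			count++
		}
	}
	return count > 1
}
```

Because the check reads only one container's log, it does not depend on how the startup of restartable-init-1 and regular-1 interleaves.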
Kubernetes Prow Robot
762a85e25d Merge pull request #125923 from haircommander/cpuset-fix-restart
kubelet/cm: fix bug where kubelet restarts from missing cpuset cgroup
2024-10-12 00:12:20 +01:00
Peter Hunt
b94c5387b8 e2e_node: use restart instead of start stop
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-10-11 16:53:33 -04:00
Kubernetes Prow Robot
2f7df335ad Merge pull request #127615 from macsko/add_node_add_event_benchmark_to_scheduler_perf
Add scheduler_perf test case for NodeAdd event handling
2024-10-11 18:10:19 +01:00
Francesco Romani
cc87438f2f e2e_node: add a test to verify the kubelet starts
with systemd cgroup driver and cpumanager none policy.

This was originally planned to be a correctness check for
https://issues.k8s.io/125923, but it was difficult to reproduce the bug,
so it's now a regression test against it.

Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-10-11 11:29:16 -04:00
Kubernetes Prow Robot
1b6c993cee Merge pull request #127952 from macsko/allow_to_specify_feature_gates_on_workload_level_scheduler_perf
Allow to set feature gates on workload level in scheduler_perf
2024-10-11 15:28:19 +01:00
Kubernetes Prow Robot
a0e146a4b0 Merge pull request #127988 from pohly/e2e-daemonset-health-check
e2e daemonset: stronger health check of DaemonSet status
2024-10-11 14:16:21 +01:00
Maciej Skoczeń
e676d0e76a Allow to specify feature gates on workload level in scheduler_perf
2024-10-11 08:41:08 +00:00
Patrick Ohly
3ec84373c1 e2e daemonset: stronger health check of DaemonSet status
The error was only generated if both checks (generated pods and ready pods)
failed. This looks like a logic error; failing if either of them doesn't
match expectations seems better.
2024-10-11 10:36:36 +02:00
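The logic change described here amounts to combining the two mismatch checks with OR rather than AND. A minimal sketch follows, with the hypothetical function `checkDaemonSetStatus` and plain integer parameters standing in for the real status fields:

```go
package main

import "fmt"

// checkDaemonSetStatus returns an error if either the number of scheduled
// pods or the number of ready pods differs from the desired count.
// Combining the two mismatch checks with && instead of || would report a
// problem only when both were wrong, which hides partial failures.
func checkDaemonSetStatus(desired, scheduled, ready int) error {
	if scheduled != desired || ready != desired {
		return fmt.Errorf("expected %d pods scheduled and ready, got %d scheduled and %d ready",
			desired, scheduled, ready)
	}
	return nil
}
```

With this version, a daemon set that has all pods scheduled but one not yet ready is reported as unhealthy.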
Maciej Skoczeń
25850caf8a Add scheduler_perf test case for NodeAdd event handling
2024-10-11 07:40:06 +00:00
Kubernetes Prow Robot
9ffc095f88 Merge pull request #127892 from utam0k/test-qhint-volume-restriction
Add integration test for VolumeRestriction in requeueing scenarios
2024-10-11 07:32:20 +01:00
Kubernetes Prow Robot
c15581b277 Merge pull request #127695 from kaisoz/wait-for-job-failfast
Fail fast when waiting for job conditions in e2e tests
2024-10-10 22:28:19 +01:00
Tomas Tormo
3b1a5bfc9c Fail fast when waiting for job conditions in e2e tests
2024-10-10 20:18:21 +00:00
Kubernetes Prow Robot
95612e7b3b Merge pull request #127878 from AxeZhan/sidecar
[scheduler] calculate pod requests resources with sidecar containers
2024-10-10 17:54:19 +01:00
AxeZhan
8b15843d00 remove unused GetNonzeroRequests function
2024-10-10 23:52:25 +08:00
Kubernetes Prow Robot
61d9bae274 Merge pull request #127348 from RyanAoh/kep-1860-ga
Promote LoadBalancerIPMode to GA
2024-10-10 16:36:19 +01:00
Aohan Yang
da5738d9aa Set feature gate emulation version during test
2024-10-10 19:26:31 +08:00
AxeZhan
b1f07bb36c add tests for scheduler
2024-10-10 15:53:19 +08:00
Kubernetes Prow Robot
fe218437e0 Merge pull request #127974 from jpbetz/mvp-test-cleanup
peerproxy flake: Use t.Cleanup instead of defer to shut down servers
2024-10-10 03:54:22 +01:00
Kubernetes Prow Robot
582dcc2aca Merge pull request #127221 from toVersus/test/restartable-init-termination
[Sidecar Containers] Expand test coverage for Node E2E tests on pod termination behavior
2024-10-10 02:48:23 +01:00
Tsubasa Nagasawa
82b690ddf6 Add more Node E2E tests to cover pod termination for Sidecar Containers
* A pod with a restartable init container that exits with
  a non-zero code is marked with the Succeeded pod phase
* A pod with restartable init containers that exit with
  a non-zero code by PreStop hook is marked with the Succeeded pod phase
* A pod with a regular container that exceeds its termination grace period
  seconds is marked with the Failed pod phase
* A pod with restartable init containers that exceed their termination
  grace period seconds is marked with the Succeeded pod phase
* A pod with a regular container that exceeds its termination grace
  period seconds by PreStop hook is marked with the Failed pod phase
* A pod with restartable init containers that exceed their termination
  grace period seconds by PreStop hook is marked with the Succeeded pod phase

Signed-off-by: Tsubasa Nagasawa <toversus2357@gmail.com>
2024-10-10 09:43:41 +09:00
Kubernetes Prow Robot
74cfa2fd04 Merge pull request #127825 from macsko/add_pod_update_event_handling_scheduler_perf_test_case
Add scheduler_perf test case for pod update events handling
2024-10-10 01:38:23 +01:00
Tsubasa Nagasawa
bd00f83578 Add step to existing pod termination Node E2E tests to check the container’s exit code
Signed-off-by: Tsubasa Nagasawa <toversus2357@gmail.com>
2024-10-10 09:17:43 +09:00
Joe Betz
875d163ce6 Use t.Cleanup instead of defer to shut down servers
2024-10-09 20:16:01 -04:00
utam0k
60c29c380d Add integration test for VolumeRestriction in requeueing scenarios
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-10-10 08:31:29 +09:00
Kubernetes Prow Robot
78d6490412 Merge pull request #127302 from cici37/costFG
Promote cost related feature gate to default true
2024-10-09 23:02:23 +01:00
Kubernetes Prow Robot
dd87bc0646 Merge pull request #127901 from skitt/k8s-sigs-yaml
Use sigs.k8s.io/yaml instead of gopkg.in/yaml
2024-10-09 19:38:29 +01:00
Kubernetes Prow Robot
364c73d5a9 Merge pull request #127637 from dshebib/e2eNode_containersLifecycleFormatting
[e2e_node] containers_lifecycle organize tests
2024-10-09 19:38:22 +01:00
Kubernetes Prow Robot
d9c46d8ecb Merge pull request #127909 from richabanker/mvp-cleanup
Reduce IdentityLeaseRenewIntervalPeriod in peer_proxy test
2024-10-09 13:28:23 +01:00
Kubernetes Prow Robot
c9ff60dc68 Merge pull request #127607 from sanposhiho/metric-queuetest
chore: ensure the scheduler handles events before checking the pod position
2024-10-09 12:24:24 +01:00
Maciej Skoczeń
98e4892b84 Add scheduler_perf test case for pod update events handling
2024-10-09 08:35:25 +00:00
Joe Betz
3570feb2fc Cancel informers for shutdown server in peerproxy test
2024-10-08 21:49:09 -04:00
Richa Banker
fe97e41f29 add more logging for peer_proxy_test, also tweak IdentityLeaseGCPeriod and IdentityLeaseRenewIntervalPeriod
2024-10-08 17:18:27 -07:00
Kubernetes Prow Robot
f37ab27d9a Merge pull request #127936 from riaankleinhans/remove-tested-pending-eligible-endpoints
Remove tested pending eligible endpoints
2024-10-08 22:58:29 +01:00
Kubernetes Prow Robot
d598a3ec0f Merge pull request #126326 from manishym/group_snapshot_e2e
Add end-to-end tests for Volume Group Snapshot
2024-10-08 22:58:22 +01:00
Kubernetes Prow Robot
41440c8117 Merge pull request #127389 from macsko/pod_delete_event_handling_scheduler_perf_test_case
Add scheduler_perf test case for AssignedPodDelete event handling
2024-10-08 21:52:28 +01:00
Kubernetes Prow Robot
cd6a959cb4 Merge pull request #126927 from AnishShah/eviction-test
Deflake containerd DiskPressure eviction e2e tests
2024-10-08 21:52:22 +01:00