Commit Graph

26230 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
6f3af076d6 Merge pull request #127829 from dom4ha/scheduler-perf
Add PreemptionBlocking test in scheduler_perf to monitor how long the preemption process (which blocks scheduling of regular nodes) takes.
2024-10-17 16:03:04 +01:00
Anish Shah
d72c7319f8 test: refactor duplicate inplace pod resize test utilities 2024-10-17 05:35:40 -07:00
Anish Shah
b8897e688d test: refactor duplicate IPPR e2e tests.
This change refactors duplicate IPPR cluster and node e2e tests under
the test/e2e/common directory
2024-10-17 05:27:27 -07:00
Anish Shah
6203006348 test: parity between cluster and node IPPR e2e tests
Some IPPR cluster e2e tests are missing from node e2e tests. This change
brings parity between them.
2024-10-17 05:23:55 -07:00
Lukasz Szaszkiewicz
06a15c5cf9 Promote WatchList feature to Beta (#128053)
* e2e/apimachinery/watchlist: always run WatchList e2e tests

* kube-controller-manager: enable WatchListClient

* kube-apiserver: promote WatchList feature to beta
2024-10-17 11:07:04 +01:00
dom4ha
b2b41444f2 Add PreemptionBlocking test in scheduler_perf to monitor how long the preemption process (which blocks scheduling of regular nodes) takes. 2024-10-17 09:58:32 +00:00
Michal Wozniak
ec983bd3fb Use Consistently in the e2e for Job 2024-10-17 09:02:39 +02:00
Michal Wozniak
70a8ceb6f0 Graduate JobManagedBy to Beta in 1.32
# Conflicts:
#	pkg/features/kube_features.go
2024-10-17 09:01:54 +02:00
bzsuni
6d0b62a84d Update npd from v0.8.19 to v0.8.20
Signed-off-by: bzsuni <bingzhe.sun@daocloud.io>
2024-10-17 14:06:14 +08:00
Kubernetes Prow Robot
d67e6545b1 Merge pull request #124227 from iholder101/in-pod-vertical-scaling/extended-resources
[FG:InPlacePodVerticalScaling] Add extended resources to ContainerStatuses[i].Resources
2024-10-17 01:39:03 +01:00
Kubernetes Prow Robot
a8fd407d2f Merge pull request #128136 from enj/enj/t/non_global_kms_kdf_via_name
kmsv2: run KDF tests in parallel
2024-10-16 23:49:03 +01:00
Patrick Ohly
f84eb5ecf8 DRA: remove "classic DRA"
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.

The feature gets removed because there is no path towards beta and GA and DRA
with "structured parameters" should be able to replace it.
2024-10-16 23:09:50 +02:00
Monis Khan
43740c0def kmsv2: run KDF tests in parallel
This change updates the KDF "feature flag" to be per KMS provider
instead of global to the API server.  This allows integration tests
that use distinct provider names to run in parallel.

Locally this change reduced the runtime of
test/integration/controlplane/transformation by 3.5 minutes.

Signed-off-by: Monis Khan <mok@microsoft.com>
2024-10-16 16:58:19 -04:00
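The shape of this change can be sketched as follows: a per-provider setting keyed by provider name replaces a flag that was global to the API server process, so integration tests using distinct provider names no longer share mutable state. The type and method names here are illustrative, not the actual kube-apiserver code:

```go
package main

import "fmt"

// kmsProviderConfig sketches a KDF setting that is scoped per KMS provider
// (keyed by provider name) instead of being a single process-global flag.
// Tests that use distinct provider names are then independent of each other
// and can run in parallel.
type kmsProviderConfig struct {
	kdfByProvider map[string]bool // provider name -> KDF enabled
}

func (c *kmsProviderConfig) kdfEnabled(provider string) bool {
	return c.kdfByProvider[provider]
}

func main() {
	cfg := &kmsProviderConfig{kdfByProvider: map[string]bool{
		"provider-a": true,  // one test's provider
		"provider-b": false, // another test's provider, independent of the first
	}}
	fmt.Println(cfg.kdfEnabled("provider-a"), cfg.kdfEnabled("provider-b"))
}
```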
Kubernetes Prow Robot
d6e7aa0f18 Merge pull request #127495 from zhifei92/criproxy-for-e2enode
Add cri proxy for e2e_node
2024-10-16 20:35:04 +01:00
Kubernetes Prow Robot
e287784a8d Merge pull request #128050 from macsko/add_pod_add_event_handling_scheduler_perf_test_case
Add scheduler_perf test case for AssignedPodAdd event handling
2024-10-16 15:37:02 +01:00
zhifei92
2e182e736b feat: Add cri proxy for e2e_node
add example of using CRI proxy

fix: Invalid function call

fix: Optimize getPodImagePullDuration

fix: Return error if the CRI Proxy is undefined

chore: add a document
2024-10-16 20:53:28 +08:00
googs1025
573f0d4538 flake(kubectl): fix run_kubectl_request_timeout_tests in integration test 2024-10-16 20:44:25 +08:00
Patrick Ohly
4526b28606 ktesting: improve context message
This is not necessarily a problem: some code might use a timeout and expect it
to trigger. Therefore this should only be an info message, not a
warning. Long-term it might be useful to have an API where the caller decides
whether this gets logged.

The caller should use short messages and leave it to the user of those to
provide more context (no pun intended...). When logging, "canceling context" is
that context.

Before:

    scheduler_perf.go:1431: FATAL ERROR: op 7: delete scheduled pods: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
    contexthelper.go:69:
        WARNING: the operation ran for the configured 2s

After:

    scheduler_perf.go:1431: FATAL ERROR: op 7: delete scheduled pods: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
    contexthelper.go:69:
        INFO: canceling context: the operation ran for the configured 2s
2024-10-16 10:46:24 +02:00
Maciej Skoczeń
0db96a0ac3 Add scheduler_perf test case for AssignedPodAdd event handling 2024-10-16 07:45:50 +00:00
Haitao Chen
e9cbbc7886 bump golang to 1.23.2
from 1.23.0
2024-10-15 18:43:50 -07:00
Kubernetes Prow Robot
e5ba5cd2b0 Merge pull request #128097 from jsturtevant/fix-gmsa-test
[sig-windows] Update kubectl exec to use correct format
2024-10-15 23:05:10 +01:00
Kubernetes Prow Robot
558c0b6eaa Merge pull request #128084 from macsko/fix_panic_when_defining_featuregates_only_on_workload_level_scheduler_perf
Fix panic when setting feature gates only on workload level in scheduler_perf
2024-10-15 23:05:03 +01:00
Kubernetes Prow Robot
9872b17ccc Merge pull request #127828 from macsko/add_template_parameters_to_createnodesop_in_scheduler_perf
Add template parameters to createNodesOp in scheduler_perf
2024-10-15 20:43:04 +01:00
James Sturtevant
f1af725850 Update kubectl exec to use correct format
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
2024-10-15 09:34:38 -07:00
Kubernetes Prow Robot
1cd8074b83 Merge pull request #128079 from pohly/e2e-daemonset-check-daemon-status-polling
e2e daemon set: better polling in CheckDaemonStatus
2024-10-15 12:24:21 +01:00
Maciej Skoczeń
cca6f8c800 Fix panic when defining feature gates only on workload level in scheduler_perf 2024-10-15 09:50:55 +00:00
Kubernetes Prow Robot
7c53005b6c Merge pull request #128066 from bart0sh/PR160-e2e_nod-fix-mirror-pod-test
e2e_node: fix mirror pod test
2024-10-15 09:24:21 +01:00
Patrick Ohly
e43065d542 e2e daemon set: better polling in CheckDaemonStatus
As a quick fix for a flake, bceec5a3ff
introduced polling with wait.Poll in all callers of CheckDaemonStatus.

This commit reverts all callers to what they were before (CheckDaemonStatus +
ExpectNoError) and implements polling according to E2E best practices
(https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/writing-good-e2e-tests.md#polling-and-timeouts):

- no logging while polling
- support for progress reporting while polling
- last but not least, produce an informative failure message in case of a
  timeout, including a dump of the daemon set as YAML
2024-10-15 10:12:28 +02:00
Paco Xu
386f495d41 storage fsquota monitoring pod should be user namespaced 2024-10-15 14:43:16 +08:00
torredil
6da4f57e8a Implement e2e test for Retroactive StorageClass Assignment feature
Signed-off-by: torredil <torredil@amazon.com>
2024-10-15 03:33:32 +00:00
Kubernetes Prow Robot
8b7b768ff7 Merge pull request #128011 from seans3/egress-selector-configuration-strict
EgressSelectorConfiguration now uses strict validation
2024-10-15 02:04:20 +01:00
Ed Bartosh
876819b8b6 e2e_node: fix mirror pod test
Modified the stopNfsServer function to wait until the NFS RPC service is
unregistered. This should fix the failing
pull-kubernetes-node-arm64-ubuntu-serial-gce job.
2024-10-15 02:05:26 +03:00
Antonio Ojea
bceec5a3ff e2e flake CheckDaemonStatus assert on async value
The util for checking on daemonstatus was checking once if the Status of
the daemonset was reporting that all the desired Pods are scheduled and
ready.

However, the pattern used in the e2e test for this function was not
taking into consideration that the controller needs to propagate the Pod
status to the DaemonSet status, and was asserting on the condition only
once after waiting for all the Pods to be ready.

To avoid further code churn, change the CheckDaemonStatus
signature to the wait.Condition type and use it in an async poll loop in
the tests.
2024-10-14 13:30:03 +00:00
Kubernetes Prow Robot
de8f6b0db7 Merge pull request #128037 from dshebib/e2eNode_containerLifecycleContext
[e2e_node] Use shared context in regular container tests
2024-10-14 13:10:28 +01:00
Kubernetes Prow Robot
a454563a8d Merge pull request #127812 from p0lyn0mial/upstream-decode-list-blueprint
client-go/rest/request: decodes initialEventsListBlueprint for watchlist requests
2024-10-14 13:10:21 +01:00
Kubernetes Prow Robot
d003e4cd9f Merge pull request #127923 from unvavo/add-test-tainttoleration-for-queueinghint
add integration test for tainttoleration in requeueing scenarios
2024-10-14 10:20:28 +01:00
Lukasz Szaszkiewicz
7be192ae0b client-go/rest/request: decodes initialEventsListBlueprint for watchlist requests 2024-10-14 08:48:32 +02:00
Sean Sullivan
32b2eea50d EgressSelectorConfiguration now uses strict validation 2024-10-13 16:09:35 -07:00
Daniel Shebib
1618dbe695 Add context to tests 2024-10-12 21:23:29 -05:00
unvavo
2c254d3b25 add integration test for tainttoleration in requeueing scenarios 2024-10-12 15:23:55 +09:00
Tsubasa Nagasawa
04c6d9324e Check for restarts without being affected by container startup order
The test for checking container restarts in a Pod with restartable-init-1
and regular-1 is flaky. Right now, when we check if restartable-init-1 has
restarted, we see if it hasn’t written the "Started" log after regular-1 has
written its "Started" log.
But even though the startup sequence starts with restartable-init-1 and then
regular-1, there’s no guarantee they’ll finish starting up in that order.
Sometimes regular-1 finishes first and writes its "Started" log before restartable-init-1.

1. restartable-init-1 Starting
2. regular-1 Starting
3. regular-1 Started
4. restartable-init-1 Started

In this test, the startup order doesn’t really matter; all we need to check is
if restartable-init-1 restarted. So I changed the test to simply look for
more than one "Starting" log in restartable-init-1's logs.

Other places used the same helper function DoesntStartAfter,
so I replaced those as well and deleted the helper function.
2024-10-12 15:17:47 +09:00
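The order-independent check described above can be sketched in a few lines; the real test inspects the container's logs through the e2e framework, so the helper name here is illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// hasRestarted implements the approach the commit describes: instead of
// comparing startup ordering across containers, count how many times the
// container logged "Starting". More than one occurrence means it restarted.
func hasRestarted(containerLog string) bool {
	return strings.Count(containerLog, "Starting") > 1
}

func main() {
	restartedLog := "Starting\nStarted\nStarting\nStarted\n"
	cleanLog := "Starting\nStarted\n"
	fmt.Println(hasRestarted(restartedLog), hasRestarted(cleanLog))
}
```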
Kubernetes Prow Robot
762a85e25d Merge pull request #125923 from haircommander/cpuset-fix-restart
kubelet/cm: fix bug where kubelet restarts from missing cpuset cgroup
2024-10-12 00:12:20 +01:00
Peter Hunt
b94c5387b8 e2e_node: use restart instead of start stop
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-10-11 16:53:33 -04:00
Kubernetes Prow Robot
2f7df335ad Merge pull request #127615 from macsko/add_node_add_event_benchmark_to_scheduler_perf
Add scheduler_perf test case for NodeAdd event handling
2024-10-11 18:10:19 +01:00
Francesco Romani
cc87438f2f e2e_node: add a test to verify the kubelet starts
with systemd cgroup driver and cpumanager none policy.

This was originally planned to be a correctness check for
https://issues.k8s.io/125923, but it was difficult to reproduce the bug,
so it's now a regression test against it.

Signed-off-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-10-11 11:29:16 -04:00
Kubernetes Prow Robot
1b6c993cee Merge pull request #127952 from macsko/allow_to_specify_feature_gates_on_workload_level_scheduler_perf
Allow to set feature gates on workload level in scheduler_perf
2024-10-11 15:28:19 +01:00
Kubernetes Prow Robot
a0e146a4b0 Merge pull request #127988 from pohly/e2e-daemonset-health-check
e2e daemonset: stronger health check of DaemonSet status
2024-10-11 14:16:21 +01:00
Maciej Skoczeń
e676d0e76a Allow to specify feature gates on workload level in scheduler_perf 2024-10-11 08:41:08 +00:00
Patrick Ohly
3ec84373c1 e2e daemonset: stronger health check of DaemonSet status
The error was only generated if both checks (generated pods and ready pods)
failed. This looks like a logic error, failing if either of those isn't
matching expectations seems better.
2024-10-11 10:36:36 +02:00
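The logic fix described above boils down to the boolean connective: the old code effectively reported an error only when both checks failed, whereas the stronger health check fails when either the scheduled count or the ready count misses the desired count. A sketch with illustrative names, not the actual e2e helper:

```go
package main

import "fmt"

// checkDaemonSetStatus fails when either check misses expectations (||),
// rather than only when both checks fail (&&) as the old logic did.
func checkDaemonSetStatus(desired, scheduled, ready int) error {
	if scheduled != desired || ready != desired {
		return fmt.Errorf("daemon set unhealthy: desired=%d scheduled=%d ready=%d",
			desired, scheduled, ready)
	}
	return nil
}

func main() {
	fmt.Println(checkDaemonSetStatus(3, 3, 3)) // healthy: no error
	fmt.Println(checkDaemonSetStatus(3, 3, 2)) // ready count short: now an error
}
```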
Maciej Skoczeń
6dbb5d84b3 Move integration tests perf utils to scheduler_perf package 2024-10-11 08:27:08 +00:00