453 Commits

Author SHA1 Message Date
Yusuke Sakurai
5d278c138c fix labelvalues for scheduler-perf 2025-02-10 10:00:52 +09:00
Patrick Ohly
e2ff03486d scheduler_perf: add thresholds to DRA test cases
They were enabled yesterday and executed seven times, with results that (so
far) seem to be fairly stable with just one run that was slower across the
board.

The links in the YAML can be used to navigate to each test case quickly. The
thresholds were chosen with a 20% safety margin below what seems to be a
common result.
2025-02-03 13:10:10 +01:00
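The margin arithmetic described in this commit can be sketched as follows. This is only an illustration: `thresholdWithMargin` is a hypothetical helper, not part of scheduler_perf, and the throughput value of 100 is invented rather than taken from any test run.

```go
package main

import "fmt"

// thresholdWithMargin returns the lowest acceptable throughput for a test
// case, given a commonly observed result and a safety margin expressed as a
// fraction (0.20 for 20%). Hypothetical helper, for illustration only.
func thresholdWithMargin(typical, margin float64) float64 {
	return typical * (1 - margin)
}

func main() {
	// 100 is an invented "common result", not a measured value.
	fmt.Printf("threshold: %.1f\n", thresholdWithMargin(100, 0.20))
}
```

A run that falls below the computed value would count as a regression; a run that is merely a little slower than usual would not.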
Kubernetes Prow Robot
209538059e Merge pull request #129885 from macsko/default_topology_spreading_scheduler_perf_test_case
Add scheduler_perf test case for default PodTopologySpreading constraints
2025-01-30 05:05:32 -08:00
Kubernetes Prow Robot
07cc2308c6 Merge pull request #128836 from pohly/dra-scheduler-perf-enablement
DRA: enable performance tracking with scheduler_perf
2025-01-30 03:07:23 -08:00
Maciej Skoczeń
274ad0391f Add scheduler_perf test case for default PodTopologySpreading constraints 2025-01-30 08:55:24 +00:00
dom4ha
f150016fbe feature: Make Unschedulable scheduler performance test parametrized with the number of initial nodes. 2025-01-23 00:48:02 +00:00
Kubernetes Prow Robot
bcd65ce240 Merge pull request #128667 from macsko/add_integration_tests_for_event_handling_scheduler_perf
Add integration tests for event handling cases in scheduler_perf
2024-12-12 13:10:26 +01:00
Kubernetes Prow Robot
ab9171b0cf Merge pull request #129040 from sanposhiho/patch-14
chore: ignore dat files generated by scheduler-perf
2024-12-12 05:29:13 +00:00
Kubernetes Prow Robot
078664b424 Merge pull request #129023 from zhifei92/cleanup-actiontype
scheduler: Rename UpdatePodTolerations for code style consistency
2024-12-12 05:28:52 +00:00
Kensei Nakada
8f4e425daf chore: ignore dat files generated by scheduler-perf 2024-11-30 22:23:15 +09:00
zhifei92
27608fa25d refactor(scheduler): Rename UpdatePodTolerations for code style consistency. 2024-11-29 13:13:09 +08:00
dom4ha
67b74696f8 Adjust performance test threshold limits 2024-11-25 15:07:15 +00:00
Patrick Ohly
0ba8af9006 DRA: enable performance tracking with scheduler_perf
The performance of the basic "fill up the cluster"
scenario (SchedulingWithResourceClaimTemplate) and the steady-state
scenario (SteadyStateClusterResourceClaimTemplate) is relevant. The large
configurations should run long enough to provide meaningful results.

Performance may be different with queueing hints enabled, so variants with that
get added for those large configurations.
2024-11-18 14:34:31 +01:00
Patrick Ohly
ac3d43a8a6 scheduler_perf: work around incorrect gotestsum failure reports
Because Go does not emit a "pass" action for
benchmarks (https://github.com/golang/go/issues/66825#issuecomment-2343229005),
gotestsum reports a successful benchmark run as failed
(https://github.com/gotestyourself/gotestsum/issues/413#issuecomment-2343206787).

We can work around that in each benchmark and sub-benchmark by emitting the
output line that `go test` expects on stdout from the test binary for success.
2024-11-18 12:35:05 +01:00
Patrick Ohly
369a18a3a1 scheduler_perf: simplify flags, fix output
The "disabled by label filter" message for benchmarks printed the pointer to
the filter string, not the filter string itself. This mistake gets avoided and
the code becomes simpler when not using pointers.
2024-11-18 12:32:59 +01:00
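The class of mistake described in this commit can be reproduced with the standard `flag` package. The sketch below is not the scheduler_perf code: the flag names and the message wording are assumptions chosen for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// describePtr mimics the bug: formatting the *string itself with %v prints
// the pointer's address, not the text it points to.
func describePtr(filter *string) string {
	return fmt.Sprintf("disabled by label filter %v", filter)
}

// describeValue formats the string value, which is what a reader expects.
func describeValue(filter string) string {
	return fmt.Sprintf("disabled by label filter %q", filter)
}

func main() {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)

	// fs.String returns a *string that is easy to pass to Printf by accident.
	ptr := fs.String("ptr-filter", "", "label filter stored behind a pointer")

	// fs.StringVar fills a plain string, so there is no pointer to misuse.
	var val string
	fs.StringVar(&val, "val-filter", "", "label filter stored by value")

	args := []string{"-ptr-filter=performance", "-val-filter=performance"}
	if err := fs.Parse(args); err != nil {
		panic(err)
	}

	fmt.Println(describePtr(ptr))   // prints an address
	fmt.Println(describeValue(val)) // prints the filter text
}
```

Binding flags to plain variables with `StringVar` removes the dereference step entirely, which is one way the code can become both simpler and harder to get wrong.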
Maciej Skoczeń
de8e8c5404 Add integration tests for event handling cases in scheduler_perf 2024-11-13 13:17:48 +00:00
Kubernetes Prow Robot
8115baca00 Merge pull request #128666 from macsko/fix_scale_down_in_eventhandlingpodupdate_scheduler_perf_test_case
Fix pod scale down failure in EventHandlingPodUpdate scheduler_perf test
2024-11-12 16:28:47 +00:00
Kubernetes Prow Robot
fb033826a8 Merge pull request #128170 from sanposhiho/async-preemption
feature(KEP-4832): asynchronous preemption
2024-11-07 19:44:54 +00:00
Maciej Skoczeń
379bff8dc9 Fix pod scale down failure in EventHandlingPodUpdate scheduler_perf test case 2024-11-07 13:48:50 +00:00
Patrick Ohly
0301b6b504 scheduler_perf: fix steady-state pod creation/deletion
This fixes an issue in
TestSchedulerPerf/SteadyStateClusterResourceClaimTemplate:

    scheduler_perf.go:1542: FATAL ERROR: op 7: delete scheduled pods: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline

That occurs when the test is almost done, but hasn't observed all scheduled
pods yet. The previous attempt to address this error wasn't actually 100%
correct. It covered the case when the context has already been canceled, but
not this particular "will reach deadline soon".
2024-11-07 09:36:36 +01:00
Kensei Nakada
4a084d54d2 feat: set the threshold on the scheduler-perf test case 2024-11-07 14:09:35 +09:00
Kensei Nakada
4b92f6d398 fix the broken part due to the merge 2024-11-07 14:09:35 +09:00
Kensei Nakada
69a8d0ec0b feature(KEP-4832): asynchronous preemption 2024-11-07 14:09:34 +09:00
Patrick Ohly
30f5282656 DRA API: rename DeviceCapacity.Quantity to DeviceCapacity.Value
Based on review
feedback (https://github.com/kubernetes/kubernetes/pull/127511#discussion_r1823521172).
2024-11-06 13:03:20 +01:00
Patrick Ohly
33ea278c51 DRA: use v1beta1 API
No code is left which depends on the v1alpha3, except of course the code
implementing that version.
2024-11-06 13:03:19 +01:00
Kubernetes Prow Robot
0fad78930f Merge pull request #127904 from towca/jtuznik/dra-autoscaling
DRA: allow Cluster Autoscaler to integrate with DRA scheduler plugin
2024-11-06 10:01:29 +00:00
Kubernetes Prow Robot
9bbb46d05f Merge pull request #128566 from macsko/run_scheduler_perf_with_queueinghints_enabled_disabled
Run scheduler_perf with QueueingHints both enabled and disabled
2024-11-05 14:53:29 +00:00
Kuba Tużnik
8d489425aa scheduler/dynamicresources: extract obtaining and tracking in-memory modifications of DRA objects
All logic related to obtaining DRA objects and tracking modifications
to ResourceClaims in-memory is extracted to DefaultDRAManager, which
implements framework.SharedDRAManager.

This is intended to be a no-op in terms of the DRA plugin behavior.
2024-11-05 14:11:04 +01:00
Kubernetes Prow Robot
2bb886ce2a Merge pull request #128482 from sanposhiho/scheduler-perf-ff
fix: register QHint metrics only when available
2024-11-05 12:15:30 +00:00
Kubernetes Prow Robot
c69f150008 Merge pull request #127277 from pohly/dra-structured-performance
kube-scheduler: enhance performance for DRA structured parameters
2024-11-05 10:05:29 +00:00
Kensei Nakada
0bf95100f1 fix: register QHint metrics only when available 2024-11-05 18:52:27 +09:00
Maciej Skoczeń
e44041ee47 Run scheduler_perf with QueueingHints both enabled and disabled 2024-11-05 09:13:03 +00:00
Patrick Ohly
7863d9a381 DRA scheduler: refactor CEL compilation cache
A better place is the cel package because a) the name can become shorter
and b) it is tightly coupled with the compiler there.

Moving the compilation into the cache simplifies the callers.
2024-11-05 08:34:42 +01:00
Maciej Skoczeń
8371a35824 Split scheduler_perf config into subdirectories 2024-11-04 08:45:34 +00:00
Patrick Ohly
bc55e82621 DRA scheduler: maintain a set of allocated device IDs
Reacting to events from the informer cache (indirectly, through the assume
cache) is more efficient than repeatedly listing its content and then
converting to IDs with unique strings.

    goos: linux
    goarch: amd64
    pkg: k8s.io/kubernetes/test/integration/scheduler_perf
    cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
                                                                                       │            before            │                        after                        │
                                                                                       │ SchedulingThroughput/Average │ SchedulingThroughput/Average  vs base               │
    PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36                      54.70 ± 6%                     76.81 ± 6%  +40.42% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36                     106.4 ± 4%                     105.6 ± 2%        ~ (p=0.413 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36                     120.0 ± 4%                     118.9 ± 7%        ~ (p=0.117 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36                      112.5 ± 4%                     105.9 ± 4%   -5.87% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36                      87.13 ± 4%                    123.55 ± 4%  +41.80% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36                      113.4 ± 2%                     103.3 ± 2%   -8.95% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36                      65.55 ± 3%                    121.30 ± 3%  +85.05% (p=0.002 n=6)
    geomean                                                                                                90.81                          106.8       +17.57%
2024-11-01 13:23:06 +01:00
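The idea of keeping a set up to date from events, instead of rebuilding it from a full listing each time, can be sketched like this. The type and method names are hypothetical and a device ID is simplified to a plain string; the real scheduler code is structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// DeviceID is a simplified stand-in for a device identifier (assumption).
type DeviceID string

// allocatedDevices tracks which devices are in use. Event handlers update
// it incrementally, so readers never need to list all claims.
type allocatedDevices struct {
	mu  sync.RWMutex
	ids map[DeviceID]struct{}
}

func newAllocatedDevices() *allocatedDevices {
	return &allocatedDevices{ids: make(map[DeviceID]struct{})}
}

// onAllocated is what an "add" or "update" event handler would call.
func (a *allocatedDevices) onAllocated(ids ...DeviceID) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for _, id := range ids {
		a.ids[id] = struct{}{}
	}
}

// onDeallocated is what a "delete" event handler would call.
func (a *allocatedDevices) onDeallocated(ids ...DeviceID) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for _, id := range ids {
		delete(a.ids, id)
	}
}

// isAllocated is a constant-time lookup that needs no listing.
func (a *allocatedDevices) isAllocated(id DeviceID) bool {
	a.mu.RLock()
	defer a.mu.RUnlock()
	_, ok := a.ids[id]
	return ok
}

// count returns the number of devices currently tracked.
func (a *allocatedDevices) count() int {
	a.mu.RLock()
	defer a.mu.RUnlock()
	return len(a.ids)
}

func main() {
	set := newAllocatedDevices()
	set.onAllocated("gpu-0", "gpu-1")
	set.onDeallocated("gpu-0")
	fmt.Println(set.isAllocated("gpu-0"), set.isAllocated("gpu-1"), set.count())
}
```

The cost of each event is proportional to the devices it touches, while a lookup stays cheap no matter how many claims exist.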
Patrick Ohly
814c9428fd DRA scheduler: cache compiled CEL expressions
DeviceClasses and different requests are very likely to contain the same
expression string. We don't need to compile that over and over again.

To avoid hanging onto that cache longer than necessary, it's currently tied to
each PreFilter/Filter combination. It might make sense to move this up into the
scheduler plugin and thus reuse compiled expressions for different pods.

    goos: linux
    goarch: amd64
    pkg: k8s.io/kubernetes/test/integration/scheduler_perf
    cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
                                                                                       │            before            │                        after                        │
                                                                                       │ SchedulingThroughput/Average │ SchedulingThroughput/Average  vs base               │
    PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36                      33.95 ± 4%                     36.65 ± 2%   +7.95% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36                     105.8 ± 2%                     106.7 ± 3%        ~ (p=0.177 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36                     100.7 ± 1%                     119.7 ± 3%  +18.82% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36                      90.78 ± 1%                    121.10 ± 4%  +33.40% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36                      50.51 ± 7%                     63.72 ± 3%  +26.17% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36                      103.7 ± 5%                     110.2 ± 2%   +6.32% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36                      28.50 ± 2%                     28.16 ± 5%        ~ (p=0.102 n=6)
    geomean                                                                                                64.99                          73.15       +12.56%
2024-11-01 13:20:06 +01:00
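A compilation cache keyed by the expression string might look roughly like the sketch below. To stay within the standard library it uses `regexp.Compile` as a stand-in for the CEL compiler, which is an assumption made purely for illustration; the type and method names are hypothetical as well.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// compileCache remembers compiled expressions by their source string so that
// identical strings are compiled only once. regexp stands in for CEL here.
type compileCache struct {
	mu       sync.Mutex
	compiled map[string]*regexp.Regexp
	compiles int // number of real compilations, kept for demonstration
}

func newCompileCache() *compileCache {
	return &compileCache{compiled: make(map[string]*regexp.Regexp)}
}

// getOrCompile returns the cached result for expr or compiles and stores it.
// Failed compilations are not cached in this sketch.
func (c *compileCache) getOrCompile(expr string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.compiled[expr]; ok {
		return re, nil
	}
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	c.compiles++
	c.compiled[expr] = re
	return re, nil
}

func main() {
	cache := newCompileCache()
	// Several requests are likely to carry the same expression string.
	for i := 0; i < 3; i++ {
		if _, err := cache.getOrCompile(`^gpu-[0-9]+$`); err != nil {
			panic(err)
		}
	}
	fmt.Println("compilations:", cache.compiles)
}
```

Creating one such cache per scheduling attempt bounds its lifetime, at the price of recompiling for every pod; hoisting it to a longer-lived owner trades memory for fewer compilations.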
Patrick Ohly
941d17b3b8 DRA scheduler: code cleanups
Looking up the slice can be avoided by storing it when allocating a device.
The AllocationResult struct is small enough that it can be copied by value.

    goos: linux
    goarch: amd64
    pkg: k8s.io/kubernetes/test/integration/scheduler_perf
    cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
                                                                                       │            before            │                       after                        │
                                                                                       │ SchedulingThroughput/Average │ SchedulingThroughput/Average  vs base              │
    PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36                      33.30 ± 2%                     33.95 ± 4%       ~ (p=0.288 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36                     105.3 ± 2%                     105.8 ± 2%       ~ (p=0.524 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36                     100.8 ± 1%                     100.7 ± 1%       ~ (p=0.738 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36                      90.96 ± 2%                     90.78 ± 1%       ~ (p=0.952 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36                      49.84 ± 4%                     50.51 ± 7%       ~ (p=0.485 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36                      103.8 ± 1%                     103.7 ± 5%       ~ (p=0.582 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36                      27.21 ± 7%                     28.50 ± 2%       ~ (p=0.065 n=6)
    geomean                                                                                                64.26                          64.99       +1.14%
2024-11-01 13:19:51 +01:00
Patrick Ohly
1246898315 DRA scheduler: ResourceSlice with unique strings
Using unique strings instead of normal strings speeds up allocation with
structured parameters because maps that use those strings as key no longer need
to build hashes of the string content. However, care must be taken to call
unique.Make as little as possible because it is costly.

Pre-allocating the map of allocated devices reduces the need to grow the map
when adding devices.

    goos: linux
    goarch: amd64
    pkg: k8s.io/kubernetes/test/integration/scheduler_perf
    cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
                                                                                       │            before            │                        after                         │
                                                                                       │ SchedulingThroughput/Average │ SchedulingThroughput/Average  vs base                │
    PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36                     18.06 ±  2%                     33.30 ± 2%   +84.31% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36                    104.7 ±  2%                     105.3 ± 2%         ~ (p=0.818 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36                    96.62 ±  1%                    100.75 ± 1%    +4.28% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36                     83.00 ±  2%                     90.96 ± 2%    +9.59% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36                     32.45 ±  7%                     49.84 ± 4%   +53.60% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36                     95.22 ±  7%                    103.80 ± 1%    +9.00% (p=0.002 n=6)
    PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36                     9.111 ± 10%                    27.215 ± 7%  +198.69% (p=0.002 n=6)
    geomean                                                                                               45.86                           64.26        +40.12%
2024-11-01 13:19:48 +01:00
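Both techniques named in this commit, unique strings as map keys and a pre-sized map, can be sketched with the `unique` package from the Go standard library (Go 1.23 or newer). The `deviceKey` type and the helper names are hypothetical and not taken from the scheduler code.

```go
package main

import (
	"fmt"
	"unique"
)

// deviceKey identifies a device by three interned strings. Handles with the
// same underlying value compare equal, so the struct works as a map key
// without comparing the string contents themselves.
type deviceKey struct {
	driver, pool, device unique.Handle[string]
}

// makeKey interns the three strings. unique.Make has a cost of its own, so
// the intent is to call this once when data is ingested, not on each lookup.
func makeKey(driver, pool, device string) deviceKey {
	return deviceKey{
		driver: unique.Make(driver),
		pool:   unique.Make(pool),
		device: unique.Make(device),
	}
}

// newAllocatedSet pre-allocates room for the expected number of devices so
// that the map does not have to grow while devices are added.
func newAllocatedSet(expected int) map[deviceKey]struct{} {
	return make(map[deviceKey]struct{}, expected)
}

func main() {
	// Intern once, then reuse the key for inserts and lookups.
	key := makeKey("driver.example.com", "pool-a", "gpu-0")

	allocated := newAllocatedSet(128)
	allocated[key] = struct{}{}

	_, found := allocated[key]
	fmt.Println(found, key.device.Value())
}
```

Keeping the interned key around and reusing it is the point of the design: every additional `unique.Make` call pays the interning cost again.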
dom4ha
ff584a76e0 Fix Unschedulable test by scheduling high priority churn pods to get processed right after they were injected (before the queued test pods) 2024-10-30 13:04:38 +00:00
Patrick Ohly
9a7e4ccab2 DRA admin access: add feature gate
The new DRAAdminAccess feature gate has the following effects:
- If disabled in the apiserver, the spec.devices.requests[*].adminAccess
  field gets cleared. Same in the status. In both cases the scenario
  that it was already set and a claim or claim template gets updated
  is special: in those cases, the field is not cleared.

  Also, allocating a claim with admin access is allowed regardless of the
  feature gate and the field is not cleared. In practice, the scheduler
  will not do that.
- If disabled in the resource claim controller, creating ResourceClaims
  with the field set gets rejected. This prevents running workloads
  which depend on admin access.
- If disabled in the scheduler, claims with admin access don't get
  allocated. The effect is the same.

The alternative would have been to ignore the fields in claim controller and
scheduler. This is bad because a monitoring workload then runs, blocking
resources that probably were meant for production workloads.
2024-10-29 09:50:11 +01:00
Kensei Nakada
b5d0745db3 Fix: use pod-high-priority.yaml to trigger preemption in PreemptionAsync test case 2024-10-26 14:16:24 +09:00
dom4ha
b3c4fe48e9 Tune PreemptionAsync and Unschedulable tests threshold and params. 2024-10-23 12:24:10 +00:00
Maciej Skoczeń
84e23fcc88 Add scheduler_perf test case for NodeUpdate event handling 2024-10-22 09:03:53 +00:00
Kensei Nakada
83f9e4b6df cleanup: remove event list 2024-10-18 11:10:10 +10:00
Kubernetes Prow Robot
b1b4e5d397 Merge pull request #128003 from pohly/dra-classic-dra-removal
DRA: remove "classic DRA"
2024-10-18 00:55:17 +01:00
dom4ha
b7f55a37a0 Bring back the smallest integration test 2024-10-17 15:41:36 +00:00
dom4ha
59458573ff Remove unschedulable test and replace it with the new one. 2024-10-17 15:41:21 +00:00
dom4ha
f2c947e36d Add UnschedulableAsync test in scheduler_perf to monitor impact of unschedulable pods on scheduler throughput 2024-10-17 15:35:21 +00:00
dom4ha
b2b41444f2 Add PreemptionBlocking test in scheduler_perf to monitor how long the preemption process (which blocks scheduling of regular nodes) takes. 2024-10-17 09:58:32 +00:00
Patrick Ohly
f84eb5ecf8 DRA: remove "classic DRA"
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.

The feature gets removed because there is no path towards beta and GA and DRA
with "structured parameters" should be able to replace it.
2024-10-16 23:09:50 +02:00