The "disabled by label filter" message for benchmarks printed the pointer to
the filter string, not the filter string itself. This mistake gets avoided and
the code becomes simpler when not using pointers.
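A minimal sketch of the underlying Go pitfall, with made-up message text and
variable names rather than the actual scheduler_perf code: formatting a
*string prints the pointer address instead of the string.

    package main

    import "fmt"

    func main() {
        labelFilter := "perf-scheduling=true"
        filterPtr := &labelFilter

        // With a pointer, %v prints an address like 0xc000010250.
        fmt.Printf("disabled by label filter %v\n", filterPtr)
        // Dereferencing works, but not using a pointer at all is simpler.
        fmt.Printf("disabled by label filter %v\n", *filterPtr)
        fmt.Printf("disabled by label filter %v\n", labelFilter)
    }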
This fixes an issue in
TestSchedulerPerf/SteadyStateClusterResourceClaimTemplate:
scheduler_perf.go:1542: FATAL ERROR: op 7: delete scheduled pods: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
The error occurs when the test is almost done but has not yet observed all
scheduled pods. The previous attempt to address it was not entirely correct:
it covered the case where the context had already been canceled, but not this
particular "will reach the deadline soon" case.
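A hedged sketch of the failure mode (not of the fix): the client-side rate
limiter from golang.org/x/time/rate fails immediately once the wait it would
impose extends past the context deadline, so this error shows up while
ctx.Err() is still nil, i.e. before the context is actually canceled.

    package main

    import (
        "context"
        "fmt"
        "time"

        "golang.org/x/time/rate"
    )

    func main() {
        // 1 request per second, burst of 1.
        limiter := rate.NewLimiter(rate.Limit(1), 1)
        ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
        defer cancel()

        _ = limiter.Wait(ctx) // consumes the single burst token immediately

        // The next token becomes available after ~1s, which is past the 500ms
        // deadline, so Wait fails right away even though ctx.Err() is still nil.
        if err := limiter.Wait(ctx); err != nil {
            fmt.Println("ctx.Err():", ctx.Err()) // <nil>
            fmt.Println("Wait:", err)            // rate: Wait(n=1) would exceed context deadline
        }
    }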
This is useful to see whether pod scheduling happens in bursts and how it
behaves over time, which is relevant in particular for dynamic resource
allocation, where it may become harder toward the end to find a node which
still has resources available.
Besides "pods scheduled" it is also useful to know how many attempts were
needed, so schedule_attempts_total also gets sampled and stored.
To visualize the result of one or more test runs, use:
gnuplot.sh *.dat
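A hedged sketch of the sampling idea; the collector function, the sample
callback, and the output format are assumptions, not the actual
scheduler_perf code:

    package main

    import (
        "fmt"
        "io"
        "os"
        "time"
    )

    // collectTimeSeries writes one "<seconds> <scheduled pods> <schedule attempts>"
    // line per interval until stop is closed, suitable as gnuplot input.
    func collectTimeSeries(out io.Writer, interval time.Duration, stop <-chan struct{}, sample func() (pods, attempts int64)) {
        start := time.Now()
        ticker := time.NewTicker(interval)
        defer ticker.Stop()
        for {
            select {
            case <-stop:
                return
            case now := <-ticker.C:
                pods, attempts := sample()
                fmt.Fprintf(out, "%.1f %d %d\n", now.Sub(start).Seconds(), pods, attempts)
            }
        }
    }

    func main() {
        // Fake sampler standing in for the real pod count and schedule_attempts_total.
        var pods, attempts int64
        sample := func() (int64, int64) {
            pods += 10
            attempts += 12
            return pods, attempts
        }

        stop := make(chan struct{})
        go func() { time.Sleep(3 * time.Second); close(stop) }()
        collectTimeSeries(os.Stdout, time.Second, stop, sample)
    }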
Having to schedule 4999 pods to simulate a "full" cluster is slow. Creating
claims and then allocating them more or less like the scheduler would when
scheduling pods is much faster and in practice has the same effect on the
dynamicresources plugin, because that plugin looks at claims, not pods.
This allows defining the "steady state" workloads with a higher number of
devices ("claimsPerNode") again, which was prohibitively slow before.
The previous tests were based on scheduling pods until the cluster was
full. This is a valid scenario, but not necessarily a realistic one.
More realistic is how quickly the scheduler can schedule new pods when some
old pods have finished running, in particular in a cluster that is properly
utilized (= almost full). To test this, pods must get created, scheduled, and
then immediately deleted. This can run for a certain period of time.
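A hedged sketch of that steady-state loop; the callbacks stand in for the
real createPods, wait-for-scheduled, and deletePods steps and are not the
scheduler_perf API:

    package example

    import (
        "context"
        "time"
    )

    // runSteadyState keeps churning a fixed number of pods until the
    // configured duration has elapsed or the context ends.
    func runSteadyState(ctx context.Context, duration time.Duration, count int,
        createPods func(context.Context, int) error,
        waitScheduled func(context.Context, int) error,
        deletePods func(context.Context) error,
    ) error {
        deadline := time.Now().Add(duration)
        for time.Now().Before(deadline) && ctx.Err() == nil {
            if err := createPods(ctx, count); err != nil {
                return err
            }
            if err := waitScheduled(ctx, count); err != nil {
                return err
            }
            if err := deletePods(ctx); err != nil {
                return err
            }
        }
        return ctx.Err()
    }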
Scenarios with an empty cluster and a full cluster have different scheduling
rates. This was previously visible for DRA because the 50th percentile of the
scheduling throughput was lower than the average, but one had to guess in
which scenario the throughput was lower. Now this can be measured for DRA with
the new SteadyStateClusterResourceClaimTemplateStructured test.
The metrics collector must watch pod events to figure out how many pods got
scheduled; polling would miss pods that have already been deleted again.
There seems to be no relevant difference in the collected
metrics (SchedulingWithResourceClaimTemplateStructured/2000pods_200nodes, 6 repetitions):
                              │   before    │    after    │    vs base      │
SchedulingThroughput/Average  │ 157.1 ±  0% │ 157.1 ±  0% │ ~ (p=0.329 n=6) │
SchedulingThroughput/Perc50   │ 48.99 ±  8% │ 47.52 ±  9% │ ~ (p=0.937 n=6) │
SchedulingThroughput/Perc90   │ 463.9 ± 16% │ 460.1 ± 13% │ ~ (p=0.818 n=6) │
SchedulingThroughput/Perc95   │ 463.9 ± 16% │ 460.1 ± 13% │ ~ (p=0.818 n=6) │
SchedulingThroughput/Perc99   │ 463.9 ± 16% │ 460.1 ± 13% │ ~ (p=0.818 n=6) │
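A hedged sketch of the event-based counting; the function and the field
handling below are illustrative, not the actual metrics collector code:

    package example

    import (
        "sync/atomic"

        v1 "k8s.io/api/core/v1"
        "k8s.io/client-go/informers"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/cache"
    )

    // watchScheduledPods counts scheduling events via an informer so that pods
    // which get deleted right after being scheduled are still counted, which
    // polling the pod list would miss. Read the result with atomic.LoadInt64.
    func watchScheduledPods(client kubernetes.Interface, namespace string, stop <-chan struct{}) *int64 {
        var scheduled int64
        factory := informers.NewSharedInformerFactoryWithOptions(client, 0, informers.WithNamespace(namespace))
        podInformer := factory.Core().V1().Pods().Informer()
        podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
            AddFunc: func(obj interface{}) {
                if pod := obj.(*v1.Pod); pod.Spec.NodeName != "" {
                    atomic.AddInt64(&scheduled, 1)
                }
            },
            UpdateFunc: func(oldObj, newObj interface{}) {
                oldPod, newPod := oldObj.(*v1.Pod), newObj.(*v1.Pod)
                // Count the transition from "not scheduled" to "scheduled".
                if oldPod.Spec.NodeName == "" && newPod.Spec.NodeName != "" {
                    atomic.AddInt64(&scheduled, 1)
                }
            },
        })
        factory.Start(stop)
        factory.WaitForCacheSync(stop)
        return &scheduled
    }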
Before, only the first decoding error was reported, which typically was the
"invalid opcode" error from the createAny operation:
scheduler_perf.go:900: parsing test cases error: error unmarshaling JSON: while decoding JSON: cannot unmarshal {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into any known op type: invalid opcode "createPods"; expected "createAny"
Now the opcode is determined first; then decoding into exactly the matching
operation type is attempted and validated. Unknown fields are an error.
In the case above, decoding a string into time.Duration failed:
scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into *benchmark.createPodsOp: json: cannot unmarshal string into Go struct field createPodsOp.Duration of type time.Duration
Typos now also produce clear errors:
scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: unknown opcode "sleeep" in {"duration":"5s","opcode":"sleeep"}
scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"countParram":"$deletingPods","deletePodsPerSecond":50,"opcode":"createPods"} into *benchmark.createPodsOp: json: unknown field "countParram"
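A hedged sketch of the two-pass decoding; the op types and their fields are
trimmed stand-ins for the real scheduler_perf ops:

    package example

    import (
        "bytes"
        "encoding/json"
        "fmt"
    )

    type createPodsOp struct {
        Opcode    string `json:"opcode"`
        Count     int    `json:"count"`
        Namespace string `json:"namespace"`
    }

    type sleepOp struct {
        Opcode   string `json:"opcode"`
        Duration string `json:"duration"`
    }

    func decodeOp(data []byte) (interface{}, error) {
        // First pass: determine the opcode only.
        var header struct {
            Opcode string `json:"opcode"`
        }
        if err := json.Unmarshal(data, &header); err != nil {
            return nil, err
        }

        // Second pass: decode into exactly the matching op type and reject
        // unknown fields so that typos in field names become errors.
        var op interface{}
        switch header.Opcode {
        case "createPods":
            op = &createPodsOp{}
        case "sleep":
            op = &sleepOp{}
        default:
            return nil, fmt.Errorf("unknown opcode %q in %s", header.Opcode, string(data))
        }
        decoder := json.NewDecoder(bytes.NewReader(data))
        decoder.DisallowUnknownFields()
        if err := decoder.Decode(op); err != nil {
            return nil, fmt.Errorf("decoding %s into %T: %w", string(data), op, err)
        }
        return op, nil
    }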
Reconfiguring the logging infrastructure with a per-test output file mimics
the behavior of per-test output (log output captured and shown only on
failures) while still using the normal logging code, which is important for
benchmarking.
To enable this behavior, the ARTIFACT env variable must be set.
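A hedged sketch of the per-test redirection; the file naming, cleanup, and
exact klog calls are assumptions, not the actual implementation:

    package example

    import (
        "os"
        "path/filepath"
        "strings"
        "testing"

        "k8s.io/klog/v2"
    )

    // redirectKlogOutput sends klog output to a per-test file and keeps the
    // file only if the test failed, roughly mimicking per-test output capture.
    func redirectKlogOutput(tb testing.TB) {
        artifacts := os.Getenv("ARTIFACT")
        if artifacts == "" {
            return
        }
        path := filepath.Join(artifacts, strings.ReplaceAll(tb.Name(), "/", "_")+".log")
        file, err := os.Create(path)
        if err != nil {
            tb.Fatalf("create log file: %v", err)
        }
        klog.LogToStderr(false)
        klog.SetOutput(file)
        tb.Cleanup(func() {
            klog.Flush()
            klog.LogToStderr(true)
            file.Close()
            if !tb.Failed() {
                os.Remove(path) // discard output of passing tests
            }
        })
    }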
Several pods sharing the same claim is not a common configuration, but it can
be useful and thus should get tested.
Before, the createPods and createAny operations were not able to do this
because each generated object was the same. What is needed are different,
predictable names for the claims (from createAny) and different references to
those claims in the pods (from createPods). Now text/template processing with
the index number of the pod or claim as input is used to inject these varying
fields. A "div" function is needed to make several different pods use the
same claim.
While at it, some existing test cases get cleaned up a bit (removal of
incorrect comments, adding comments for testing with queuing hints).
This is not relevant for namespaced objects, but matters for the cluster-scoped
ResourceClass during unit testing. It works right now because there is only
one such unit test, but will fail once a second one gets added.
Instead of passing a boolean flag down into all functions where it might be
needed, it is now a context value.
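A hedged sketch of the context-value pattern; the key and accessor names are
made up:

    package example

    import "context"

    // flagKey is unexported so that other packages cannot collide with it.
    type flagKey struct{}

    // withFlag stores the flag in the context instead of threading a bool
    // parameter through every function signature.
    func withFlag(ctx context.Context, enabled bool) context.Context {
        return context.WithValue(ctx, flagKey{}, enabled)
    }

    // flagFrom reads the flag; a context without the value means "false".
    func flagFrom(ctx context.Context) bool {
        enabled, _ := ctx.Value(flagKey{}).(bool)
        return enabled
    }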
With a dynamic client and a REST mapper it is possible to load arbitrary YAML
files and create the objects defined in them. This is simpler than adding
specific Go code for each supported type.
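A hedged sketch of the approach; the helper is made up, but it uses the
standard dynamic client, RESTMapper, and unstructured APIs from client-go and
apimachinery:

    package example

    import (
        "context"
        "fmt"

        "k8s.io/apimachinery/pkg/api/meta"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
        "k8s.io/client-go/dynamic"
        "sigs.k8s.io/yaml"
    )

    // createFromYAML decodes arbitrary YAML into an unstructured object, maps
    // its GroupVersionKind to a resource, and creates it with the dynamic client.
    func createFromYAML(ctx context.Context, client dynamic.Interface, mapper meta.RESTMapper, data []byte, namespace string) error {
        var obj unstructured.Unstructured
        if err := yaml.UnmarshalStrict(data, &obj.Object); err != nil {
            return fmt.Errorf("decoding YAML: %w", err)
        }

        // The apiVersion in the YAML must be correct, otherwise the RESTMapper
        // cannot find a matching resource.
        gvk := obj.GroupVersionKind()
        mapping, err := mapper.RESTMapping(gvk.GroupKind(), gvk.Version)
        if err != nil {
            return fmt.Errorf("mapping %s: %w", gvk, err)
        }

        resource := client.Resource(mapping.Resource)
        if mapping.Scope.Name() == meta.RESTScopeNameNamespace {
            // Namespaced object: create it in the given namespace.
            _, err = resource.Namespace(namespace).Create(ctx, &obj, metav1.CreateOptions{})
        } else {
            // Cluster-scoped object (like ResourceClass): no namespace.
            _, err = resource.Create(ctx, &obj, metav1.CreateOptions{})
        }
        return err
    }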
Because the version now matters, the incorrect versions in the DRA YAMLs were
found and fixed.