kubernetes

mirror of https://github.com/optim-enterprises-bv/kubernetes.git synced 2025-11-29 04:43:54 +00:00

Author	SHA1	Message	Date
Joe Betz	3570feb2fc	Cancel informers for shutdown server in peerproxy test	2024-10-08 21:49:09 -04:00
Richa Banker	fe97e41f29	add more logging for peer_proxy_test, also tweak IdentityLeaseGCPeriod and IdentityLeaseRenewIntervalPeriod	2024-10-08 17:18:27 -07:00
Kubernetes Prow Robot	41440c8117	Merge pull request #127389 from macsko/pod_delete_event_handling_scheduler_perf_test_case Add scheduler_perf test case for AssignedPodDelete event handling	2024-10-08 21:52:28 +01:00
Cici Huang	baeeb66613	Update tests	2024-10-08 17:02:07 +00:00
Kensei Nakada	a2b3a4f4dc	chore: ensure the scheduler handles events before checking the pod position	2024-10-06 21:06:45 +09:00
Kubernetes Prow Robot	7478a30fdc	Merge pull request #127260 from carlory/fix-124136 Fix TestPersistentVolumeProvisionMultiPVCs	2024-10-04 15:02:50 +01:00
Kubernetes Prow Robot	7dd03c1ee5	Merge pull request #127353 from Gekko0114/integration_test_volumezone Add integration test for VolumeZone in requeueing scenarios	2024-10-03 05:48:26 +01:00
Maciej Skoczeń	2a08ce5c68	Add scheduler_perf test case for AssignedPodDelete event handling	2024-10-02 09:16:28 +00:00
moriya	3e57d5cf67	fix	2024-10-02 06:54:32 +09:00
Kubernetes Prow Robot	ae617c3d20	Merge pull request #127781 from macsko/use_barrier_not_sleep_where_possible_in_scheduler_perf_test_cases Use barrier instead of sleep when possible in scheduler_perf test cases	2024-10-01 22:06:10 +01:00
Maciej Skoczeń	bae0eb91d4	Use barrier instead of sleep when possible in scheduler_perf test cases	2024-10-01 13:53:04 +00:00
Maciej Skoczeń	5e2552c2b0	Allow to filter pods using labels on barrier in scheduler_perf	2024-10-01 08:48:37 +00:00
Kubernetes Prow Robot	22a30e7cbb	Merge pull request #127700 from macsko/add_option_waitforpodsprocessed Add option to wait for pods to be attempted in barrierOp in scheduler_perf	2024-10-01 05:17:49 +01:00
Kubernetes Prow Robot	5e65529ca9	Merge pull request #127759 from macsko/allow_to_filter_pods_using_labels_while_collecting_metrics_scheduler_perf Allow to filter pods using labels while collecting metrics in scheduler_perf	2024-09-30 20:37:35 +01:00
Maciej Skoczeń	fdbf21e03a	Allow to filter pods using labels while collecting metrics in scheduler_perf	2024-09-30 13:32:12 +00:00
Lionel Jouin	0bb0e8feaf	Fix TestEnableDisableServiceCIDR The wrong clientset was used to create services and an incorrect amount of services was created. Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>	2024-09-30 13:15:00 +02:00
Lionel Jouin	8dafdb2cdd	Fix ServiceCIDR integration test enable/disable feature_enable_disable.go was missing the suffix _test.go to be considered as a test. Without it, TestEnableDisableServiceCIDR was not executed. Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>	2024-09-30 12:00:25 +02:00
Maciej Skoczeń	928670061d	Allow to wait for pods to be attempted in barrierOp in scheduler_perf	2024-09-30 08:07:15 +00:00
Kubernetes Prow Robot	80941e3e87	Merge pull request #127643 from Jefftree/set-emulation-integration-test Allow emulation version to be set in integration test	2024-09-27 21:56:01 +01:00
dom4ha	9bf6ee976b	Assert whether there are no pod in active queue while waiting for all pods to get scheduled instead of asserting it afterwards.	2024-09-27 15:06:04 +00:00
dom4ha	54b0ed45b7	Add one more check to the test case precondition assessment.	2024-09-27 15:06:04 +00:00
dom4ha	151ac846a2	Increase the readability of the test preconditions and double check that all test pods are really unschedulable.	2024-09-27 15:06:04 +00:00
Kubernetes Prow Robot	960e3984b0	Merge pull request #127444 from dom4ha/fine-grained-qhints Fine grain QueueHints for NodeAffinity plugin	2024-09-27 01:42:00 +01:00
Kubernetes Prow Robot	5ebd0da6cc	Merge pull request #127662 from macsko/make_scheduler_perf_sleepop_duration_parametrizable Make sleepOp duration parametrizable in scheduler_perf	2024-09-26 20:10:01 +01:00
Kubernetes Prow Robot	421436a94c	Merge pull request #127473 from dom4ha/fine-grain-qhints-fit feature(scheduler): more fine-grained Node QHint for NodeResourceFit plugin	2024-09-26 18:34:02 +01:00
Maciej Skoczeń	837d917d91	Make sleepOp duration parametrizable in scheduler_perf	2024-09-26 13:07:22 +00:00
dom4ha	c7db4bb450	Fine grain QueueHints for nodeaffinity plugin. Skip queue on unrelated change that keeps pod schedulable when QueueHints are enabled. Split add from QHints disabled case Remove case when QHints are disabled Remove two GHint alternatives in unit tests more fine-grained Node QHint for NodeResourceFit plugin Return early when updated Node causes unmatch Revert "more fine-grained Node QHint for NodeResourceFit plugin" This reverts commit dfbceb60e0c1c4e47748c12722d9ed6dba1a8366. Add integration test for requeue of a pod previously rejected by NodeAffinity plugin when a suitable Node is added Add integration test for a Node update operation that does not trigger requeue in NodeAffinity plugin Remove inaccurate comment Apply review comments	2024-09-26 10:21:08 +00:00
dom4ha	903b1f7e28	more fine-grained Node QHint for NodeResourceFit plugin	2024-09-26 09:51:36 +00:00
Jefftree	dacc2e1f5d	Allow emulation version to be set in integration test	2024-09-25 22:01:15 -04:00
Maciej Skoczeń	40154baab0	Add updateAnyOp to scheduler_perf	2024-09-25 12:42:25 +00:00
Kubernetes Prow Robot	5fc4e71a30	Merge pull request #127499 from pohly/scheduler-perf-updates scheduler_perf: updates to enhance performance testing of DRA	2024-09-25 13:32:00 +01:00
Kubernetes Prow Robot	75214d11d5	Merge pull request #127428 from googs1025/scheduler/plugin chore(scheduler): refactor import package ordering in scheduler	2024-09-25 11:40:07 +01:00
Patrick Ohly	d100768d94	scheduler_perf: track and visualize progress over time This is useful to see whether pod scheduling happens in bursts and how it behaves over time, which is relevant in particular for dynamic resource allocation where it may become harder at the end to find the node which still has resources available. Besides "pods scheduled" it's also useful to know how many attempts were needed, so schedule_attempts_total also gets sampled and stored. To visualize the result of one or more test runs, use: gnuplot.sh *.dat	2024-09-25 11:09:15 +02:00
Patrick Ohly	ded96042f7	scheduler_perf + DRA: load up cluster by allocating claims Having to schedule 4999 pods to simulate a "full" cluster is slow. Creating claims and then allocating them more or less like the scheduler would when scheduling pods is much faster and in practice has the same effect on the dynamicresources plugin because it looks at claims, not pods. This allows defining the "steady state" workloads with higher number of devices ("claimsPerNode") again. This was prohibitively slow before.	2024-09-25 09:45:39 +02:00
Patrick Ohly	385599f0a8	scheduler_perf + DRA: measure pod scheduling at a steady state The previous tests were based on scheduling pods until the cluster was full. This is a valid scenario, but not necessarily realistic. More realistic is how quickly the scheduler can schedule new pods when some old pods finished running, in particular in a cluster that is properly utilized (= almost full). To test this, pods must get created, scheduled, and then immediately deleted. This can run for a certain period of time. Scenarios with empty and full cluster have different scheduling rates. This was previously visible for DRA because the 50% percentile of the scheduling throughput was lower than the average, but one had to guess in which scenario the throughput was lower. Now this can be measured for DRA with the new SteadyStateClusterResourceClaimTemplateStructured test. The metrics collector must watch pod events to figure out how many pods got scheduled. Polling misses pods that already got deleted again. There seems to be no relevant difference in the collected metrics (SchedulingWithResourceClaimTemplateStructured/2000pods_200nodes, 6 repetitions): │ before │ after │ │ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │ 157.1 ± 0% 157.1 ± 0% ~ (p=0.329 n=6) │ before │ after │ │ SchedulingThroughput/Perc50 │ SchedulingThroughput/Perc50 vs base │ 48.99 ± 8% 47.52 ± 9% ~ (p=0.937 n=6) │ before │ after │ │ SchedulingThroughput/Perc90 │ SchedulingThroughput/Perc90 vs base │ 463.9 ± 16% 460.1 ± 13% ~ (p=0.818 n=6) │ before │ after │ │ SchedulingThroughput/Perc95 │ SchedulingThroughput/Perc95 vs base │ 463.9 ± 16% 460.1 ± 13% ~ (p=0.818 n=6) │ before │ after │ │ SchedulingThroughput/Perc99 │ SchedulingThroughput/Perc99 vs base │ 463.9 ± 16% 460.1 ± 13% ~ (p=0.818 n=6)	2024-09-25 09:45:39 +02:00
Patrick Ohly	51cafb0053	scheduler_perf: more useful errors for configuration mistakes Before, the first error was reported, which typically was the "invalid op code" error from the createAny operation: scheduler_perf.go:900: parsing test cases error: error unmarshaling JSON: while decoding JSON: cannot unmarshal {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into any known op type: invalid opcode "createPods"; expected "createAny" Now the opcode is determined first, then decoding into exactly the matching operation is tried and validated. Unknown fields are an error. In the case above, decoding a string into time.Duration failed: scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"collectMetrics":true,"count":10,"duration":"30s","namespace":"test","opcode":"createPods","podTemplatePath":"config/dra/pod-with-claim-template.yaml","steadyState":true} into benchmark.createPodsOp: json: cannot unmarshal string into Go struct field createPodsOp.Duration of type time.Duration Some typos: scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: unknown opcode "sleeep" in {"duration":"5s","opcode":"sleeep"} scheduler_test.go:29: parsing test cases error: error unmarshaling JSON: while decoding JSON: decoding {"countParram":"$deletingPods","deletePodsPerSecond":50,"opcode":"createPods"} into benchmark.createPodsOp: json: unknown field "countParram"	2024-09-25 09:45:39 +02:00
Kubernetes Prow Robot	5dd244ff00	Merge pull request #125796 from haorenfsa/fix-gc-sync-blocked garbagecollector: controller should not be blocking on failed cache sync	2024-09-25 04:02:00 +01:00
Kubernetes Prow Robot	e9cde03b91	Merge pull request #127598 from aojea/servicecidr_seconday_dualwrite bugfix: initialize secondary range registry with the right value	2024-09-24 21:08:08 +01:00
Antonio Ojea	7a9bca3888	bugfix: initialize secondary range registry with the right value When MultiCIDRServiceAllocator feature is enabled, we added an additional feature gate DisableAllocatorDualWrite that allows to enable a mirror behavior on the old allocator to deal with problems during cluster upgrades. During the implementation the secondary range of the legacy allocator was initialized with the value of the primary range, hence, when a Service tried to allocate a new IP on the secondary range, it succeeded in the new ip allocator but failed when it tried to allocate the same IP on the legacy allocator, since it has a different range. Expand the integration test that run over all the combinations of Service ClusterIP possibilities to run with all the possible combinations of the feature gates. The integration test need to change the way of starting the apiserver otherwise it will timeout.	2024-09-24 17:48:13 +00:00
Patrick Ohly	7bbb3465e5	scheduler_perf: more realistic structured parameters tests Real devices are likely to have a handful of attributes and (for GPUs) the memory as capacity. Most keys will be driver specific, a few may eventually have a domain (none standardized right now).	2024-09-24 18:52:45 +02:00
Kubernetes Prow Robot	6ded721910	Merge pull request #127496 from macsko/add_metricscollectionop_to_scheduler_perf Add separate ops for collecting metrics from multiple namespaces in scheduler_perf	2024-09-24 14:34:00 +01:00
Maciej Skoczeń	a273e5381a	Add separate ops for collecting metrics from multiple namespaces in scheduler_perf	2024-09-24 12:28:53 +00:00
Kubernetes Prow Robot	94df29b8f2	Merge pull request #127464 from sanposhiho/trigger-nodedelete fix(eventhandler): trigger Node/Delete event	2024-09-24 02:24:00 +01:00
moriya	cd0e0fc881	add_test	2024-09-23 21:49:09 +09:00
moriya	090145aadf	add_non_queued_pod	2024-09-23 21:24:09 +09:00
Kubernetes Prow Robot	15d08bf7c8	Merge pull request #127323 from vrutkovs/tracing-cacher-get tracing: add span for get cacher	2024-09-23 10:27:59 +01:00
Kensei Nakada	421f87a4e3	feat: add a requeueing integration test for PodTopologySpread with Node/delete event (QHint: disabled)	2024-09-23 00:29:56 +09:00
Kensei Nakada	bf8f7a3ad7	feat: add a requeueing integration test for PodTopologySpread with Node/delete event	2024-09-22 17:34:37 +09:00
Kubernetes Prow Robot	61dbc03563	Merge pull request #127471 from macsko/add_deletepodsop_to_scheduler_perf Add deletePodsOp to scheduler_perf	2024-09-22 07:00:04 +01:00
Vadim Rutkovsky	dff0075e7c	tracing: add span for cacher.Get Also updates tracing integration tests for cacher.GetList	2024-09-21 09:53:43 +02:00

5145 Commits