All logic related to obtaining DRA objects and tracking modifications
to ResourceClaims in-memory is extracted to DefaultDRAManager, which
implements framework.SharedDRAManager.
This is intended to be a no-op in terms of the DRA plugin behavior.
A better place is the cel package because a) the name can become shorter
and b) it is tightly coupled with the compiler there.
Moving the compilation into the cache simplifies the callers.
"Allocated devices" are the ones which can be observed from the informer. "All
allocated devices" also includes those which are in flight and haven't been
written back to the apiserver.
The logic for skipping "admin access" was repeated in three different places. A
single foreachAllocatedDevices helper with a callback consolidates it in one function.
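A minimal sketch of what such a helper can look like; the allocationResult and
deviceID types are simplified stand-ins for illustration, not the real API structs:

    // foreachAllocatedDevice calls cb for every device allocated in the
    // claim, skipping results which were granted admin access because
    // those must not count as "allocated" for normal requests.
    type deviceID struct {
        driver, pool, device string
    }

    type allocationResult struct {
        driver, pool, device string
        adminAccess          *bool
    }

    func foreachAllocatedDevice(results []allocationResult, cb func(id deviceID)) {
        for _, result := range results {
            if result.adminAccess != nil && *result.adminAccess {
                // Such results may grant additional permissions and
                // must not block allocation of the same device.
                continue
            }
            cb(deviceID{driver: result.driver, pool: result.pool, device: result.device})
        }
    }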
DeviceClasses and different requests are very likely to contain the same
expression string. We don't need to compile that over and over again.
To avoid hanging onto that cache longer than necessary, it's currently tied to
each PreFilter/Filter combination. It might make sense to move this up into the
scheduler plugin and thus reuse compiled expressions for different pods.
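A sketch of what such a per-cycle cache can look like; the compilationResult
type and the injected compile function are placeholders for whatever the cel
package actually returns:

    import "sync"

    // compilationResult is a placeholder for the compiler's output.
    type compilationResult struct {
        program any
        err     error
    }

    // compilationCache memoizes results keyed by the CEL expression
    // string, so identical expressions from different DeviceClasses and
    // requests are compiled only once.
    type compilationCache struct {
        mutex   sync.Mutex
        compile func(expression string) compilationResult
        results map[string]compilationResult
    }

    func newCompilationCache(compile func(string) compilationResult) *compilationCache {
        return &compilationCache{
            compile: compile,
            results: make(map[string]compilationResult),
        }
    }

    // getOrCompile compiles each distinct expression string at most once.
    func (c *compilationCache) getOrCompile(expression string) compilationResult {
        c.mutex.Lock()
        defer c.mutex.Unlock()
        if result, ok := c.results[expression]; ok {
            return result
        }
        result := c.compile(expression)
        c.results[expression] = result
        return result
    }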
goos: linux
goarch: amd64
pkg: k8s.io/kubernetes/test/integration/scheduler_perf
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
│ before │ after │
│ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │
PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 33.95 ± 4% 36.65 ± 2% +7.95% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 105.8 ± 2% 106.7 ± 3% ~ (p=0.177 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 100.7 ± 1% 119.7 ± 3% +18.82% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 90.78 ± 1% 121.10 ± 4% +33.40% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 50.51 ± 7% 63.72 ± 3% +26.17% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 103.7 ± 5% 110.2 ± 2% +6.32% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 28.50 ± 2% 28.16 ± 5% ~ (p=0.102 n=6)
geomean 64.99 73.15 +12.56%
Using unique strings instead of normal strings speeds up allocation with
structured parameters because maps that use those strings as keys no longer need
to hash the string content. However, care must be taken to call
unique.Make as rarely as possible because it is costly.
Pre-allocating the map of allocated devices reduces the need to grow the map
when adding devices.
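A sketch of both patterns using Go's unique package; the deviceID layout and the
expected device count are illustrative only:

    import (
        "fmt"
        "unique"
    )

    // deviceID uses unique.Handle instead of plain strings so that map
    // operations hash and compare a small handle instead of the full
    // string content.
    type deviceID struct {
        driver, pool, device unique.Handle[string]
    }

    func example() {
        // unique.Make is comparatively expensive, so call it once per
        // distinct string and reuse the handle.
        driver := unique.Make("dra.example.com")
        pool := unique.Make("pool-0")

        // Pre-allocate for the expected number of devices to avoid
        // growing the map repeatedly while adding entries.
        expectedDevices := 128
        allocated := make(map[deviceID]bool, expectedDevices)

        for i := 0; i < expectedDevices; i++ {
            id := deviceID{
                driver: driver,
                pool:   pool,
                device: unique.Make(fmt.Sprintf("gpu-%d", i)),
            }
            allocated[id] = true
        }
        fmt.Println(len(allocated))
    }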
goos: linux
goarch: amd64
pkg: k8s.io/kubernetes/test/integration/scheduler_perf
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
│ before │ after │
│ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │
PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 18.06 ± 2% 33.30 ± 2% +84.31% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 104.7 ± 2% 105.3 ± 2% ~ (p=0.818 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 96.62 ± 1% 100.75 ± 1% +4.28% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 83.00 ± 2% 90.96 ± 2% +9.59% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 32.45 ± 7% 49.84 ± 4% +53.60% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 95.22 ± 7% 103.80 ± 1% +9.00% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 9.111 ± 10% 27.215 ± 7% +198.69% (p=0.002 n=6)
geomean 45.86 64.26 +40.12%
The Allocate call used to call back into the claim lister for each node. This
was significant work which showed up at the top of the CPU profile. It's
okay to list only once during PreFilter because the Filter call does not change
the claim status between Allocate calls.
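A simplified sketch of the change; the names are invented for illustration and
do not match the plugin's real code:

    // claimLister stands in for the informer-backed lister that used to
    // be queried once per node.
    type claimLister interface {
        listAllAllocatedDevices() (map[string]bool, error)
    }

    // preFilterState carries the snapshot from PreFilter to the per-node
    // Filter calls via the scheduler's cycle state.
    type preFilterState struct {
        allocatedDevices map[string]bool
    }

    // preFilter lists exactly once per scheduling cycle.
    func preFilter(lister claimLister) (*preFilterState, error) {
        allocated, err := lister.listAllAllocatedDevices()
        if err != nil {
            return nil, err
        }
        return &preFilterState{allocatedDevices: allocated}, nil
    }

    // filterDevice runs per node and reuses the shared snapshot. This is
    // safe because Filter does not modify claim statuses between the
    // Allocate calls of a single cycle.
    func filterDevice(state *preFilterState, device string) bool {
        return !state.allocatedDevices[device]
    }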
goos: linux
goarch: amd64
pkg: k8s.io/kubernetes/test/integration/scheduler_perf
cpu: Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
│ before │ after │
│ SchedulingThroughput/Average │ SchedulingThroughput/Average vs base │
PerfScheduling/SchedulingWithResourceClaimTemplateStructured/5000pods_500nodes-36 15.04 ± 0% 18.06 ± 2% +20.07% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_100nodes-36 105.5 ± 1% 104.7 ± 2% ~ (p=0.485 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/empty_500nodes-36 95.83 ± 1% 96.62 ± 1% ~ (p=0.063 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_100nodes-36 79.67 ± 3% 83.00 ± 2% +4.18% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/half_500nodes-36 27.11 ± 5% 32.45 ± 7% +19.68% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_100nodes-36 84.00 ± 3% 95.22 ± 7% +13.36% (p=0.002 n=6)
PerfScheduling/SteadyStateClusterResourceClaimTemplateStructured/full_500nodes-36 7.110 ± 6% 9.111 ± 10% +28.15% (p=0.002 n=6)
geomean 41.05 45.86 +11.73%
Introducing PDBs to preemption disrupted the ordering of pods in the victims
list, which could lead to picking the wrong victim node, namely one with a
higher-priority pod on it.
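A minimal sketch of the ordering that has to be restored after victims are
split into PDB-violating and non-violating groups; the types and names are
illustrative only:

    import "sort"

    // victim is a stand-in for a pod selected for preemption.
    type victim struct {
        name     string
        priority int32
    }

    // sortVictims restores descending-priority order. Node comparison
    // looks at the highest-priority victim first, so an interleaved
    // order can make a node with a higher-priority victim look cheaper
    // than it really is.
    func sortVictims(victims []victim) {
        sort.Slice(victims, func(i, j int) bool {
            return victims[i].priority > victims[j].priority
        })
    }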
Using the "normal" logic for a feature-gated field simplifies the
implementation of the feature gate.
There is one (entirely theoretical!) problem with updating from 1.31: if a claim
was allocated in 1.31 with admin access, the status field was not set because
it didn't exist yet. If a driver now follows the current definition of "unset =
off", then it will not grant admin access even though it should. This is
theoretical because drivers are only starting to support admin access with 1.32,
so there shouldn't be any claim where this problem could occur.
The new DRAAdminAccess feature gate has the following effects:
- If disabled in the apiserver, the spec.devices.requests[*].adminAccess
field gets cleared. The same applies to the status. In both cases there is
one exception: if the field was already set and a claim or claim template
gets updated, the field is not cleared (a sketch of this clearing follows
this list).
Also, allocating a claim with admin access is allowed regardless of the
feature gate and the field is not cleared. In practice, the scheduler
will not do that.
- If disabled in the resource claim controller, creating ResourceClaims
with the field set gets rejected. This prevents running workloads
which depend on admin access.
- If disabled in the scheduler, claims with admin access don't get
allocated. The effect is the same as with the resource claim controller.
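A sketch of the apiserver-side clearing mentioned in the first item, following
the usual dropDisabledFields pattern; the claim spec here is a simplified
stand-in for the real API types:

    // deviceRequest is a simplified stand-in for the real request type.
    type deviceRequest struct {
        name        string
        adminAccess *bool
    }

    type claimSpec struct {
        requests []deviceRequest
    }

    func adminAccessInUse(spec *claimSpec) bool {
        if spec == nil {
            return false
        }
        for _, r := range spec.requests {
            if r.adminAccess != nil {
                return true
            }
        }
        return false
    }

    // dropDisabledFields clears adminAccess when the feature gate is off,
    // except on updates of objects which already had the field set.
    func dropDisabledFields(newSpec, oldSpec *claimSpec, gateEnabled bool) {
        if gateEnabled || adminAccessInUse(oldSpec) {
            return
        }
        for i := range newSpec.requests {
            newSpec.requests[i].adminAccess = nil
        }
    }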
The alternative would have been to ignore the fields in the claim controller and
the scheduler. That would be worse: a monitoring workload would then run and
block resources that probably were meant for production workloads.
Drivers need to know that because admin access may also grant additional
permissions. The allocator needs to ignore such results when determining which
devices are considered allocated.
In both cases it is conceptually cleaner to not rely on the content of the
ClaimSpec.
This removes the DRAControlPlaneController feature gate, the fields controlled
by it (claim.spec.controller, claim.status.deallocationRequested,
claim.status.allocation.controller, class.spec.suitableNodes), the
PodSchedulingContext type, and all code related to the feature.
The feature gets removed because there is no path towards beta and GA, and DRA
with "structured parameters" should be able to replace it.
Having a dedicated ActionType that is only used when the scheduler itself
detects a change in a pod's list of generated ResourceClaims avoids calling
the DRA plugin for unrelated Pod changes.
d66f8f9 added the requirement that "plugins have to implement a QueueingHint for Pod/Update
event if the rejection from them could be resolved by updating unscheduled
Pods itself".
This applies to DRA because the name of a generated ResourceClaim must be
recorded in the pod status before the pod can be scheduled.
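A sketch of the relevant check for such a Pod/Update hint: only a change of the
generated ResourceClaim names recorded in the pod status matters for DRA, so
other pod updates can be skipped. The helper name is made up; the real hint also
has to plug into the scheduler framework's QueueingHint machinery:

    import (
        v1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/api/equality"
    )

    // podResourceClaimsChanged reports whether the generated
    // ResourceClaim names recorded in the pod status changed between the
    // old and the new pod object. Only then does the DRA plugin need to
    // requeue the pod.
    func podResourceClaimsChanged(oldPod, newPod *v1.Pod) bool {
        return !equality.Semantic.DeepEqual(
            oldPod.Status.ResourceClaimStatuses,
            newPod.Status.ResourceClaimStatuses,
        )
    }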
Skip queue on unrelated change that keeps pod schedulable when QueueHints are enabled.
Split add from QHints disabled case
Remove case when QHints are disabled
Remove two QHint alternatives in unit tests
more fine-grained Node QHint for NodeResourceFit plugin
Return early when updated Node causes a mismatch
Revert "more fine-grained Node QHint for NodeResourceFit plugin"
This reverts commit dfbceb60e0c1c4e47748c12722d9ed6dba1a8366.
Add integration test for requeue of a pod previously rejected by NodeAffinity plugin when a suitable Node is added
Add integration test for a Node update operation that does not trigger requeue in NodeAffinity plugin
Remove inaccurate comment
Apply review comments