Replace manual error logging with cmp.Diff for more precise error comparisons, using cmpopts to ignore Origin field and support UniqueString comparison.
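A minimal sketch of what such a comparison can look like in a validation test; the helper name is made up for illustration, the Origin field is the one mentioned above:

    package validation_test

    import (
        "testing"

        "github.com/google/go-cmp/cmp"
        "github.com/google/go-cmp/cmp/cmpopts"
        "k8s.io/apimachinery/pkg/util/validation/field"
    )

    // assertFailures compares expected and actual validation errors and prints
    // a readable diff instead of logging both lists manually. The Origin field
    // is ignored so tests don't have to spell out where an error came from.
    func assertFailures(t *testing.T, want, got field.ErrorList) {
        t.Helper()
        if diff := cmp.Diff(want, got, cmpopts.IgnoreFields(field.Error{}, "Origin")); diff != "" {
            t.Errorf("unexpected validation errors (-want, +got):\n%s", diff)
        }
    }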
Not implementing a size estimator had the effect that strings retrieved from
the attributes were treated as "unknown size", which wildly overestimated the
cost and caused validation errors even for simple expressions like this:
device.attributes["qat.intel.com"].services.matches("[^a]?sym")
The maximum number of elements in maps and the maximum length of the driver
name string were also ignored or missing. Pre-defined types like
apiservercel.StringType must be avoided because they are defined as having
a zero maximum size.
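A rough sketch of what such a size estimator can look like, assuming cel-go's checker.CostEstimator interface; the type name and the concrete length limit below are illustrative assumptions, not the actual implementation:

    package dracel

    import (
        "github.com/google/cel-go/checker"
    )

    // attributeSizeEstimator gives the CEL cost estimator an upper bound for
    // strings coming from device attributes instead of "unknown size".
    // The limit is an assumption chosen for illustration only.
    type attributeSizeEstimator struct{}

    const assumedMaxAttributeStringLength = 64

    func (attributeSizeEstimator) EstimateSize(element checker.AstNode) *checker.SizeEstimate {
        if t := element.Type(); t != nil && t.TypeName() == "string" {
            return &checker.SizeEstimate{Min: 0, Max: assumedMaxAttributeStringLength}
        }
        return nil // no opinion, fall back to the default estimate
    }

    func (attributeSizeEstimator) EstimateCallCost(function, overloadID string, target *checker.AstNode, args []checker.AstNode) *checker.CallEstimate {
        return nil // calls keep their default cost
    }

Such an estimator would typically be handed to the environment's cost estimation (for example via env.EstimateCost), so that matches() on an attribute string gets charged based on a realistic maximum length instead of "unknown".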
The original limit of 32 seemed sufficient for a single GPU on a node. But for
shared non-local resources it is too low. For example, a ResourceClaim might be
used to allocate an interconnect channel that connects all pods of a workload
running on several different nodes, in which case the number of pods can be
considerably larger.
256 is high enough for currently planned systems. If we need something even
higher in the future, an alternative approach might be needed to avoid
scalability problems.
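The change roughly boils down to raising a constant that the ReservedFor validation checks against; a simplified sketch, with names chosen for illustration:

    package validation

    import (
        resourceapi "k8s.io/api/resource/v1beta1"
        "k8s.io/apimachinery/pkg/util/validation/field"
    )

    // resourceClaimReservedForMaxSize is the new upper limit for the number of
    // consumers in status.reservedFor (previously 32).
    const resourceClaimReservedForMaxSize = 256

    func validateReservedFor(reservedFor []resourceapi.ResourceClaimConsumerReference, fldPath *field.Path) field.ErrorList {
        var allErrs field.ErrorList
        if len(reservedFor) > resourceClaimReservedForMaxSize {
            allErrs = append(allErrs, field.TooMany(fldPath, len(reservedFor), resourceClaimReservedForMaxSize))
        }
        return allErrs
    }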
Normally, increasing such a limit would have to be done incrementally over two
releases. In this case we decided on
Slack (https://kubernetes.slack.com/archives/CJUQN3E4T/p1734593174791519) to
make an exception and apply this change to current master for 1.33 and backport
it to the next 1.32.x patch release for production usage.
This breaks downgrades to a 1.32 release without this change if there are
ResourceClaims with a number of consumers > 32 in ReservedFor. In practice,
this breakage is very unlikely because there are no workloads yet which need so
many consumers and such downgrades to a previous patch release are also
unlikely. Downgrades to 1.31 already weren't supported when using DRA v1beta1.
The "// import <path>" comment has been superseded by Go modules.
We don't have to remove them, but doing so has some advantages:
- They are used inconsistently, which is confusing.
- We can then also remove the (currently broken) hack/update-vanity-imports.sh.
- Last but not least, it would be a first step towards avoiding the k8s.io domain.
This commit was generated with

    sed -i -e 's;^package \(.*\) // import.*;package \1;' $(git grep -l '^package.*// import' | grep -v 'vendor/')

Everything was included, except for

    package labels // import k8s.io/kubernetes/pkg/util/labels

because that package is marked as "read-only".
Previously, ValidateNodeSelector did not check that labels are valid. Now it
does for resource.k8s.io, regardless of whether an object was already created
with invalid labels in an earlier Kubernetes release. Theoretically this is a
breaking change and could cause problems during an upgrade, but that is highly
unlikely in practice.
In contrast to node affinity, DRA does not ignore parse errors (it uses
NewNodeSelector, not NewLazyErrorNodeSelector), so invalid labels would have
been found instead of being silently ignored.
Even if some object has invalid labels, this only affects an alpha -> beta
upgrade which isn't guaranteed to work seamlessly.
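A sketch of the kind of check that is now applied to match expressions; the function name is illustrative, the validation helpers are the ones available in apimachinery:

    package validation

    import (
        metav1validation "k8s.io/apimachinery/pkg/apis/meta/v1/validation"
        utilvalidation "k8s.io/apimachinery/pkg/util/validation"
        "k8s.io/apimachinery/pkg/util/validation/field"
    )

    // validateSelectorRequirementLabels checks that the label key and values
    // of a node selector requirement are syntactically valid.
    func validateSelectorRequirementLabels(key string, values []string, fldPath *field.Path) field.ErrorList {
        allErrs := metav1validation.ValidateLabelName(key, fldPath.Child("key"))
        for i, value := range values {
            for _, msg := range utilvalidation.IsValidLabelValue(value) {
                allErrs = append(allErrs, field.Invalid(fldPath.Child("values").Index(i), value, msg))
            }
        }
        return allErrs
    }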
* Test feature-gate enabled/disabled for validation
* Test pkg/registry/resource/resourceclaim
* Add Data and NetworkData to integration test
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
* Add status
* Add validation to check that fields are correct (Network field, device
  has been allocated)
* Add feature-gate
* Drop field if feature-gate not set
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
This had been left out unintentionally earlier. Because theoretically there
might already be existing objects with parameters larger than the limit that
gets enforced now, the limit only gets checked when parameters get created or
modified.
This is similar to the validation of CEL expressions; for consistency, the
same 10 Ki limit is chosen.
Because the limit is not enforced for stored parameters, it can be increased in
the future, with the caveat that users who need larger parameters then depend
on the newer Kubernetes release with the higher limit. Lowering the limit is
harder because deployments that worked with older Kubernetes would no longer
work with newer Kubernetes.
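A simplified sketch of that ratcheting check, written against the published v1beta1 types; the function and constant names are illustrative assumptions:

    package validation

    import (
        "bytes"

        resourceapi "k8s.io/api/resource/v1beta1"
        "k8s.io/apimachinery/pkg/util/validation/field"
    )

    // opaqueParametersMaxLength mirrors the 10 Ki limit used for CEL expressions.
    const opaqueParametersMaxLength = 10 * 1024

    // validateOpaqueParameters enforces the size limit only when the parameters
    // are new or were modified, so stored objects above the limit keep working.
    func validateOpaqueParameters(params, oldParams *resourceapi.OpaqueDeviceConfiguration, fldPath *field.Path) field.ErrorList {
        var allErrs field.ErrorList
        if params == nil {
            return allErrs
        }
        unchanged := oldParams != nil && bytes.Equal(oldParams.Parameters.Raw, params.Parameters.Raw)
        if !unchanged && len(params.Parameters.Raw) > opaqueParametersMaxLength {
            // The oversized value itself is not repeated in the error message.
            allErrs = append(allErrs, field.TooLong(fldPath.Child("parameters"), "", opaqueParametersMaxLength))
        }
        return allErrs
    }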
This enables a future extension where capacity of a single device gets consumed
by different claims. The semantic without any additional fields is the same as
before: a capacity cannot be split up and is only an attribute of a device.
Because it is semantically the same as before, two-way conversion to v1alpha3
is possible.
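Roughly, the shape of the change in the API types looks like this (a sketch, not the authoritative definition):

    package v1beta1

    import (
        "k8s.io/apimachinery/pkg/api/resource"
    )

    // DeviceCapacity wraps the quantity in a struct so that additional fields,
    // for example a policy describing how capacity may be consumed by multiple
    // claims, can be added later without another API change.
    type DeviceCapacity struct {
        // Value defines how much of a certain capacity the device offers.
        Value resource.Quantity
    }

    // In the device definition, the map value type changes from a bare
    // resource.Quantity to DeviceCapacity:
    //
    //     Capacity map[QualifiedName]DeviceCapacity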
This is meant to make it easier to remove the v1alpha3 API because it won't be
used in clusters that started with DRA as beta in Kubernetes 1.32, once all
clients support v1beta1.
The line coverage is now at 98.5% and several more corner cases are
covered. The remaining lines are hard or impossible to reach.
The actual validation is the same as before, with some small tweaks to the
generated errors.
When failures are not as expected, it is useful to show what the expected and
actual failures look like to a user. Perhaps even better would be to put the
expected texts into the test files instead of the error structs. That would
be easier to review and shorter.
Using the "normal" logic for a feature gated field simplifies the
implementation of the feature gate.
There is one (entirely theoretical!) problem with updating from 1.31: if a
claim was allocated in 1.31 with admin access, the status field was not set
because it didn't exist yet. If a driver now follows the current definition of
"unset = off", then it will not grant admin access even though it should. This
is theoretical because drivers are only starting to support admin access with
1.32, so there shouldn't be any claim where this problem could occur.
The new DRAAdminAccess feature gate has the following effects:
- If disabled in the apiserver, the spec.devices.requests[*].adminAccess
  field gets cleared, and the same applies in the status. In both cases
  there is one special scenario: if the field was already set and a claim
  or claim template gets updated, it is not cleared (sketched below).
  Also, allocating a claim with admin access is allowed regardless of the
  feature gate and the field is not cleared; in practice, the scheduler
  will not do that.
- If disabled in the resource claim controller, creating ResourceClaims
with the field set gets rejected. This prevents running workloads
which depend on admin access.
- If disabled in the scheduler, claims with admin access don't get
allocated. The effect is the same.
The alternative would have been to ignore the fields in the claim controller
and scheduler. That would be bad because a monitoring workload would then run,
blocking resources that probably were meant for production workloads.
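A rough sketch of the apiserver-side clearing mentioned in the first bullet, written against the v1beta1 types; the helper names here are made up for illustration:

    package resourceclaim

    import (
        resourceapi "k8s.io/api/resource/v1beta1"
        utilfeature "k8s.io/apiserver/pkg/util/feature"
        "k8s.io/kubernetes/pkg/features"
    )

    // dropDisabledDRAAdminAccessFields clears adminAccess on create or update
    // unless the feature gate is enabled or the old object already used it.
    // Status handling would follow the same pattern.
    func dropDisabledDRAAdminAccessFields(newClaim, oldClaim *resourceapi.ResourceClaim) {
        if utilfeature.DefaultFeatureGate.Enabled(features.DRAAdminAccess) ||
            draAdminAccessInUse(oldClaim) {
            return
        }
        for i := range newClaim.Spec.Devices.Requests {
            newClaim.Spec.Devices.Requests[i].AdminAccess = nil
        }
    }

    func draAdminAccessInUse(claim *resourceapi.ResourceClaim) bool {
        if claim == nil {
            return false
        }
        for _, request := range claim.Spec.Devices.Requests {
            if request.AdminAccess != nil {
                return true
            }
        }
        return false
    }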