kubernetes

mirror of https://github.com/optim-enterprises-bv/kubernetes.git synced 2025-11-25 02:45:12 +00:00

Author	SHA1	Message	Date
rongfu.leng	d04a54c50b	optimize code, filter podUID is empty string Signed-off-by: rongfu.leng <lenronfu@gmail.com>	2024-09-13 01:48:14 +00:00
Kubernetes Prow Robot	11e8169a16	Merge pull request #120569 from ffromani/cpumanager-extra-logs enhance the cpumanager logs	2024-09-12 00:25:18 +01:00
Kubernetes Prow Robot	8e5d7cbef7	Merge pull request #127250 from bart0sh/PR157-Kubelet-DRA-fix-testify-errors Kubelet: DRA: fix testify errors	2024-09-09 23:24:48 +01:00
Ed Bartosh	e70a2ad828	Kubelet: DRA: fix testify errors	2024-09-09 22:18:07 +03:00
Kubernetes Prow Robot	1c15f718b6	Merge pull request #126717 from bart0sh/PR154-DRA-test-restoring-checkpoint-for-upgraded-structure DRA: test checkpoint structure upgrade	2024-09-09 11:32:27 +01:00
Kubernetes Prow Robot	14ff551c96	Merge pull request #124246 from SataQiu/fix-20240409 Remove unused code in container manager	2024-09-06 23:47:02 +01:00
Kubernetes Prow Robot	3ac8fc04e1	Merge pull request #126834 from carlory/fix-125924-1 DRA: rename pkg/cm/dra/plugin files	2024-09-05 12:15:58 +01:00
Kubernetes Prow Robot	14f2cab4de	Merge pull request #126976 from jsturtevant/socket-file-revert Revert "fix: handle socket file detection on Windows"	2024-09-03 18:31:16 +01:00
Kubernetes Prow Robot	a4ec0c039a	Merge pull request #126435 from bart0sh/PR151-Kubelet-devicemanager-stop-using-CDI-annotations Kubelet: stop using CDI annotations	2024-08-29 16:49:30 +01:00
Ed Bartosh	d3b5cb6f41	DRA: test checkpoint structure and version upgrades Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>	2024-08-29 15:25:58 +03:00
James Sturtevant	3ca610757e	Revert "fix: handle socket file detection on Windows" This reverts commit `4060ee60c1`.	2024-08-28 10:31:58 -07:00
carlory	3372c056cd	fix linter hints	2024-08-27 01:30:58 +08:00
carlory	7b33495d9d	DRA: rename pkg/cm/dra/plugin files	2024-08-27 00:54:37 +08:00
Ed Bartosh	e1bc8defac	kubelet: Migrate DRA Manager to contextual logging Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>	2024-08-22 11:12:41 +03:00
Ed Bartosh	9d893c83f0	DRA: fix failing test Added error assertion for NodePrepareResources call unveiled "rpc error: code = DeadlineExceeded desc = context deadline exceeded" failure in the TestGRPCConnIsReused test. Setting clientCallTimeout field when creating plugin should fix it.	2024-08-20 11:11:43 +03:00
Ed Bartosh	ea3c6628b7	Kubelet: stop using CDI annotations Removing setting CDI annotations by the device manager as CRI field CDIDevices is mature enough to be used instead.	2024-07-29 18:26:27 +03:00
Paco Xu	78d3830d97	ignore order of containers status allocated resources	2024-07-29 16:48:00 +08:00
Kubernetes Prow Robot	5af1710d90	Merge pull request #126243 from SergeyKanzhelev/devicePluginFailures Implement resource health in pod status (KEP 4680)	2024-07-23 20:12:24 -07:00
Sergey Kanzhelev	62f96d2748	set AllocatedResourcesStatus in the Pod Status	2024-07-24 00:29:35 +00:00
Ed Bartosh	c0d922e786	DRA: Kubelet code cleanup	2024-07-24 00:27:52 +03:00
Ed Bartosh	59555c6a62	DRA: move dra/checkpoint/* to dra/state/*	2024-07-24 00:12:10 +03:00
Ed Bartosh	35fbbc5cfd	DRA: use crc32.ChecksumIEEE to calculate checkpoint checksum	2024-07-24 00:10:39 +03:00
Ed Bartosh	59daed75d6	DRA: refactor checkpointing Co-authored-by: Kevin Klues <klueska@gmail.com>	2024-07-24 00:10:30 +03:00
Patrick Ohly	d11b58efe6	DRA kubelet: refactor gRPC call timeouts Some of the E2E node tests were flaky. Their timeout apparently was chosen under the assumption that kubelet would retry immediately after a failed gRPC call, with a factor of 2 as safety margin. But according to `0449cef8fd`, kubelet has a different, higher retry period of 90 seconds, which was exactly the test timeout. The test timeout has to be higher than that. As the tests don't use the gRPC call timeout anymore, it can be made private. While at it, the name and documentation gets updated.	2024-07-22 18:09:34 +02:00
Patrick Ohly	877829aeaa	DRA kubelet: adapt to v1alpha3 API This adds the ability to select specific requests inside a claim for a container. NodePrepareResources is always called, even if the claim is not used by any container. This could be useful for drivers where that call has some effect other than injecting CDI device IDs into containers. It also ensures that drivers can validate configs. The pod resource API can no longer report a class for each claim because there is no such 1:1 relationship anymore. Instead, that API reports claim, API devices (with driver/pool/device as ID) and CDI device IDs. The kubelet itself doesn't extract that information from the claim. Instead, it relies on drivers to report this information when the claim gets prepared. This isolates the kubelet from API changes. Because of a faulty E2E test, kubelet was told to contact the wrong driver for a claim. This was not visible in the kubelet log output. Now changes to the claim info cache are getting logged. While at it, naming of variables and some existing log output gets harmonized. Co-authored-by: Oksana Baranova <oksana.baranova@intel.com> Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>	2024-07-22 18:09:34 +02:00
Patrick Ohly	91d7882e86	DRA: new API for 1.31 This is a complete revamp of the original API. Some of the key differences: - refocused on structured parameters and allocating devices - support for constraints across devices - support for allocating "all" or a fixed amount of similar devices in a single request - no class for ResourceClaims, instead individual device requests are associated with a mandatory DeviceClass For the sake of simplicity, optional basic types (ints, strings) where the null value is the default are represented as values in the API types. This makes Go code simpler because it doesn't have to check for nil (consumers) and values can be set directly (producers). The effect is that in protobuf, these fields always get encoded because `opt` only has an effect for pointers. The roundtrip test data for v1.29.0 and v1.30.0 changes because of the new "request" field. This is considered acceptable because the entire `claims` field in the pod spec is still alpha. The implementation is complete enough to bring up the apiserver. Adapting other components follows.	2024-07-22 18:09:34 +02:00
Francesco Romani	0a9b17771d	node: cpumgr: log: make errors louder We have a special case which is not supposed to happen. Make it louder with default log settings to make sure this is visible. Signed-off-by: Francesco Romani <fromani@redhat.com>	2024-07-22 14:04:05 +02:00
Francesco Romani	2dc5ddd08a	node: cpumgr: logs: bump log verbosiness for expected skips In the reconciliation flow, there are expected skipping conditions (e.g. for active logs). To reduce noise in the logs, bump up the verbosiness of these messages, using odd levels. Signed-off-by: Francesco Romani <fromani@redhat.com>	2024-07-22 14:04:05 +02:00
Francesco Romani	5a0bc1020b	node: cpumgr: move flow to left and add logs Refactor the code to align to the left bailing out earlier if the code must do nothing. Add log to trace this occurrence. Besides extra log, no intended change in behavior. Signed-off-by: Francesco Romani <fromani@redhat.com>	2024-07-22 14:04:04 +02:00
Francesco Romani	a89c843edd	node: cpumgr: ErrorS -> InfoS Convert uncommon use of ErrorS(nil, ...) into more regular use of InfoS. Set the verbosiness level to make sure the message is still emitted in regular expected configuration. Signed-off-by: Francesco Romani <fromani@redhat.com>	2024-07-22 14:04:04 +02:00
Patrick Ohly	b51d68bb87	DRA: bump API v1alpha2 -> v1alpha3 This is in preparation for revamping the resource.k8s.io completely. Because there will be no support for transitioning from v1alpha2 to v1alpha3, the roundtrip test data for that API in 1.29 and 1.30 gets removed. Repeating the version in the import name of the API packages is not really required. It was done for a while to support simpler grepping for usage of alpha APIs, but there are better ways for that now. So during this transition, "resourceapi" gets used instead of "resourcev1alpha3" and the version gets dropped from informer and lister imports. The advantage is that the next bump to v1beta1 will affect fewer source code lines. Only source code where the version really matters (like API registration) retains the versioned import.	2024-07-21 17:28:13 +02:00
Kubernetes Prow Robot	f2428d66cc	Merge pull request #125163 from pohly/dra-kubelet-api-version-independent-no-rest-proxy DRA: make kubelet independent of the resource.k8s.io API version	2024-07-18 17:47:48 -07:00
Kubernetes Prow Robot	5fc7032a0e	Merge pull request #126156 from pohly/kubelet-test-enhancements kubelet test enhancements	2024-07-18 14:50:54 -07:00
Patrick Ohly	7701a48bd6	dra kubelet: bump gRPC API to v1alpha4 The previous changes are an API break, therefore we need a new version.	2024-07-18 23:30:09 +02:00
Kubernetes Prow Robot	9196650533	Merge pull request #123819 from fakecore/fc/master fix: handle socket file detection on Windows	2024-07-18 00:53:16 -07:00
Patrick Ohly	348f94ab55	DRA: read ResourceClaim in DRA drivers This is the second and final step towards making kubelet independent of the resource.k8s.io API versioning because it now doesn't need to copy structs defined by that API from the driver to the API server.	2024-07-18 09:09:20 +02:00
Patrick Ohly	616a014347	DRA: move ResourceSlice publishing into DRA drivers This is a first step towards making kubelet independent of the resource.k8s.io API versioning because it now doesn't need to copy structs defined by that API from the driver to the API server. The next step is removing the other direction (reading ResourceClaim status and passing the resource handle to drivers). The drivers must get deployed so that they have their own connection to the API server. Securing at least the writes via a validating admission policy should be possible. As before, the kubelet removes all ResourceSlices for its node at startup, then DRA drivers recreate them if (and only if) they start up again. This ensures that there are no orphaned ResourceSlices when a driver gets removed while the kubelet was down. While at it, logging gets cleaned up and updated to use structured, contextual logging as much as possible. gRPC requests and streams now use a shared, per-process request ID and streams also get logged.	2024-07-18 09:09:19 +02:00
Patrick Ohly	b9d00841a6	kubelet: improve checkpoint errors Recording the expected and actual checksum in the error makes it possible to provide that information, for example in a failed test like the ones for DRA. Otherwise developers have to manually step through the test with a debugger to figure out what the new checksum is.	2024-07-17 16:07:31 +02:00
Kubernetes Prow Robot	2263f2d719	Merge pull request #124148 from cyclinder/add_flag_kubelet kubelet: Add a TopologyManager policy option: max-allowable-numa-nodes	2024-07-15 19:27:16 -07:00
Kubernetes Prow Robot	3361895612	Merge pull request #123733 from Jeffwan/jiaxin/kep-4176-240305 KEP-4176: Add a new static policy SpreadPhysicalCPUsPreferredOption	2024-07-15 01:41:10 -07:00
Jiaxin Shan	6c85fd4ddd	KEP-4176: Add static policy option to distribute cpus across cores	2024-07-12 11:52:51 -07:00
Kubernetes Prow Robot	2d4514e169	Merge pull request #125802 from mmorel-35/testifylint/len+empty fix: enable empty and len rules from testifylint on pkg and staging package	2024-07-11 23:12:06 -07:00
Harshal Patil	68d317a8d1	Add a warning log, event and metric for cgroup version 1 Signed-off-by: Harshal Patil <harpatil@redhat.com>	2024-07-09 11:34:46 -04:00
cyclinder	87129c350a	kubelet: Add a TopologyManager policy options: "max-allowable-numa-nodes" Signed-off-by: cyclidner <kuocyclinder@gmail.com>	2024-07-09 22:26:24 +08:00
Matthieu MOREL	f014b754fb	fix: enable empty and len rules from testifylint on pkg package Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>	2024-07-06 23:15:43 +00:00
Kubernetes Prow Robot	7e1a5a0ea8	Merge pull request #125687 from bart0sh/PR146-DevicePluginCDIDevices-LockToDefault kube_features: DevicePluginCDIDevices: LockToDefault	2024-07-01 17:07:41 -07:00
Kubernetes Prow Robot	34b8832edb	Merge pull request #125631 from SergeyKanzhelev/logFailedAdmission improve logging of pod admission denied	2024-06-28 19:36:20 -07:00
Kubernetes Prow Robot	16b7d5310a	Merge pull request #125047 from zhanluxianshen/clean-typos-in-kubelet clean typos logs in kubelet.	2024-06-28 16:48:24 -07:00
Kubernetes Prow Robot	ac9aec9f9b	Merge pull request #125116 from pohly/dra-one-of-source DRA: remove "source" indirection from v1 Pod API	2024-06-28 12:46:45 -07:00
Matthieu MOREL	0cde5f1e28	fix: enable bool-compare rule from testifylint linter (#125135) * fix: enable bool-compare rule from testifylint linter Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> * Update hack/golangci.yaml.in Co-authored-by: Patrick Ohly <patrick.ohly@intel.com> * Update golangci.yaml.in * Update golangci-strict.yaml * Update golangci.yaml.in * Update golangci.yaml.in * Update golangci.yaml.in * Update golangci.yaml.in * Update golangci.yaml * Update golangci-hints.yaml * Update golangci-strict.yaml * Update golangci.yaml.in * Update golangci.yaml * Update mux_test.go --------- Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com> Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>	2024-06-28 10:58:05 -07:00

1440 Commits