* Promote TopologyManagerPolicyOptions feature to GA
* Promote PreferClosestNUMANodes TopologyManagerPolicyOption to stable
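A minimal sketch, assuming the k8s.io/kubelet/config/v1beta1 types, of how the now-stable option can be set; with the feature GA, no feature gate entry is needed:

```go
package main

import (
	"fmt"

	kubeletconfig "k8s.io/kubelet/config/v1beta1"
)

func main() {
	cfg := kubeletconfig.KubeletConfiguration{
		TopologyManagerPolicy: "best-effort",
		// Policy options no longer sit behind the TopologyManagerPolicyOptions
		// feature gate once the feature is GA.
		TopologyManagerPolicyOptions: map[string]string{
			"prefer-closest-numa-nodes": "true",
		},
	}
	fmt.Println(cfg.TopologyManagerPolicyOptions)
}
```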
Signed-off-by: PiotrProkop <pprokop@nvidia.com>
With the None cpumanager policy, cgroup v2, and the systemd cgroup manager, the
kubelet could get into a situation where it believes the cpuset cgroup was created
(by libcontainer in the cgroupfs) while systemd has deleted it, since systemd was
never asked to create it. This causes one unnecessary restart, as the kubelet fails with
`failed to initialize top level QOS containers: root container [kubepods] doesn't exist.`
It only causes one restart because the kubelet skips recreating the cgroup
if it already exists, but it is still an annoyance, and this change fixes it.
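A hypothetical sketch of the check-then-create behaviour mentioned above; the cgroupManager interface and the Exists/Create names stand in for the real kubelet container-manager types and are not taken from this change:

```go
package cmsketch

import "fmt"

type cgroupManager interface {
	Exists(name string) bool
	Create(name string) error
}

func ensureTopLevelQOSContainer(cm cgroupManager, name string) error {
	// Skipping recreation when the cgroup already exists is what limits the
	// failure described above to a single kubelet restart.
	if cm.Exists(name) {
		return nil
	}
	if err := cm.Create(name); err != nil {
		return fmt.Errorf("failed to initialize top level QOS containers: %w", err)
	}
	return nil
}
```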
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Fix the error message emitted when availablePhysicalCPUs = 0.
Without this change, the logic mistakenly emitted the old error message,
which is confusing for troubleshooting.
Plus, a tiny quality-of-life improvement:
the cpumanager static policy needs `cpuGroupSize` in several places.
The value represents how many vCPUs per pCPU the machine has.
So, let's cache (and log!) the value in the policy data;
we don't support dynamic updates of the HW topology anyway.
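An illustrative sketch of caching the derived value at construction time; the struct layout and names here are hypothetical, not the actual static policy code:

```go
package cpumanagersketch

import "log"

type cpuTopology struct {
	NumCPUs  int // logical CPUs (vCPUs)
	NumCores int // physical cores (pCPUs)
}

type staticPolicy struct {
	topology *cpuTopology
	// cpuGroupSize is how many vCPUs each pCPU exposes (e.g. 2 with SMT),
	// cached once because the HW topology never changes at runtime.
	cpuGroupSize int
}

func newStaticPolicy(topo *cpuTopology) *staticPolicy {
	groupSize := topo.NumCPUs / topo.NumCores
	log.Printf("static policy created with cpuGroupSize=%d", groupSize)
	return &staticPolicy{topology: topo, cpuGroupSize: groupSize}
}
```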
Signed-off-by: Francesco Romani <fromani@redhat.com>
Adding an error assertion for the NodePrepareResources call unveiled an
"rpc error: code = DeadlineExceeded desc = context deadline exceeded"
failure in the TestGRPCConnIsReused test.
Setting the clientCallTimeout field when creating the plugin should fix it.
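A hypothetical sketch based on the wording above; the plugin struct and the clientCallTimeout field are illustrative, not copied from the test code:

```go
package dratest

import (
	"context"
	"time"
)

type plugin struct {
	// clientCallTimeout bounds each gRPC call made through the plugin; a
	// zero duration would make context.WithTimeout expire immediately and
	// surface as DeadlineExceeded.
	clientCallTimeout time.Duration
}

func newTestPlugin() *plugin {
	return &plugin{clientCallTimeout: 45 * time.Second}
}

func (p *plugin) callContext(ctx context.Context) (context.Context, context.CancelFunc) {
	return context.WithTimeout(ctx, p.clientCallTimeout)
}
```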
Some of the E2E node tests were flaky. Their timeout apparently was chosen
under the assumption that kubelet would retry immediately after a failed gRPC
call, with a factor of 2 as a safety margin. But according to
0449cef8fd,
kubelet has a different, higher retry period of 90 seconds, which was exactly
the test timeout. The test timeout has to be higher than that.
As the tests no longer use the gRPC call timeout, it can be made
private. While at it, the name and documentation get updated.
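An illustrative sketch of the timeout reasoning, with hypothetical constant names; only the 90-second figure comes from the referenced commit:

```go
package dratest

import "time"

const (
	// The kubelet retries a failed gRPC call to the driver on a fixed
	// period, so the test must wait longer than one full retry period.
	kubeletRetryPeriod = 90 * time.Second
	podStartTimeout    = kubeletRetryPeriod + 30*time.Second
)
```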
This adds the ability to select specific requests inside a claim for a
container.
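A sketch of the Pod API side of request selection, assuming the core/v1 types of the matching Kubernetes release; the claim and request names used here are made up:

```go
package example

import corev1 "k8s.io/api/core/v1"

// exampleContainer consumes only the request named "single-gpu" from the
// pod-level resource claim entry named "gpus".
var exampleContainer = corev1.Container{
	Name:  "training",
	Image: "registry.example/cuda-app",
	Resources: corev1.ResourceRequirements{
		Claims: []corev1.ResourceClaim{{
			Name:    "gpus",       // matches an entry in pod.spec.resourceClaims
			Request: "single-gpu", // selects one request inside that claim
		}},
	},
}
```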
NodePrepareResources is always called, even if the claim is not used by any
container. This could be useful for drivers where that call has some effect
other than injecting CDI device IDs into containers. It also ensures that
drivers can validate configs.
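A hedged sketch of the driver-side behaviour, using illustrative types rather than the real DRA gRPC API; it only shows that prepare runs, and can validate config and report devices, even for claims no container consumes:

```go
package driversketch

import (
	"context"
	"fmt"
)

type claimConfig struct {
	Sharing string
}

// preparedDevice is what the driver reports back to the kubelet for a claim.
type preparedDevice struct {
	PoolName     string
	DeviceName   string
	CDIDeviceIDs []string
}

func prepareClaim(ctx context.Context, cfg claimConfig) ([]preparedDevice, error) {
	// Validation happens even for claims no container uses, because the
	// kubelet calls prepare unconditionally.
	if cfg.Sharing != "" && cfg.Sharing != "exclusive" && cfg.Sharing != "shared" {
		return nil, fmt.Errorf("unsupported sharing mode %q", cfg.Sharing)
	}
	return []preparedDevice{{
		PoolName:     "gpu-pool",
		DeviceName:   "gpu-0",
		CDIDeviceIDs: []string{"example.com/gpu=gpu-0"},
	}}, nil
}
```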
The pod resources API can no longer report a class for each claim because there
is no such 1:1 relationship anymore. Instead, that API reports the claim, the
API devices (with driver/pool/device as ID), and the CDI device IDs. The kubelet
itself doesn't extract that information from the claim. Instead, it relies on
drivers to report this information when the claim gets prepared. This isolates
the kubelet from API changes.
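An illustrative sketch, not the actual podresources proto, of the per-claim information now reported:

```go
package podresourcessketch

type ClaimDevices struct {
	ClaimName      string
	ClaimNamespace string
	Devices        []Device
}

type Device struct {
	DriverName   string
	PoolName     string
	DeviceName   string
	CDIDeviceIDs []string // reported by the driver, not derived from the claim
}
```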
Because of a faulty E2E test, the kubelet was told to contact the wrong driver for
a claim. This was not visible in the kubelet log output. Now changes to the
claim info cache get logged. While at it, variable naming and some
existing log output get harmonized.
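A minimal sketch, assuming klog, of the kind of logging added around claim info cache mutations; the cache and entry types are illustrative:

```go
package dracachesketch

import "k8s.io/klog/v2"

type claimInfo struct {
	ClaimName  string
	Namespace  string
	DriverName string
}

type claimInfoCache struct {
	entries map[string]claimInfo
}

func (c *claimInfoCache) add(info claimInfo) {
	c.entries[info.Namespace+"/"+info.ClaimName] = info
	// Logging the mutation makes it visible in the kubelet log when a claim
	// ends up associated with the wrong driver.
	klog.V(4).InfoS("Added claim to claim info cache",
		"claim", klog.KRef(info.Namespace, info.ClaimName),
		"driver", info.DriverName)
}
```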
Co-authored-by: Oksana Baranova <oksana.baranova@intel.com>
Co-authored-by: Ed Bartosh <eduard.bartosh@intel.com>