kubernetes

mirror of https://github.com/optim-enterprises-bv/kubernetes.git synced 2025-12-08 17:15:36 +00:00

Author	SHA1	Message	Date
Kubernetes Prow Robot	b56dc43458	Merge pull request #106282 from bobbypage/cadvisor-v043 vendor: Bump cAdvisor to v0.43.0	2021-11-10 08:17:38 -08:00
Kubernetes Prow Robot	5d60c8d857	Merge pull request #102393 from mengjiao-liu/fix-sysctl-regex Upgrade preparation to verify sysctl values containing forward slashes by regex	2021-11-09 18:23:26 -08:00
David Porter	b6269ce5de	kubelet: update cAdvisor usage for v0.43 * Change cAdvisor manager constructor * Change call to adding AcceleratorUsageMetrics Signed-off-by: David Porter <david@porter.me>	2021-11-09 17:09:12 -08:00
Kubernetes Prow Robot	6ac2d8edc8	Merge pull request #105967 from shivanshu1333/feature2/master/105841 Migrated scheduler files `preemption.go`, `stateful.go`, `resource_allocation.go` to structured logging	2021-11-09 10:28:01 -08:00
ravisantoshgudimetla	889d45d3fb	[kubelet] Reject pods with OS field mismatch Once kubernetes#104613 and kubernetes#104693 merge, we'll have an OS field in the pod spec. Kubelet should start rejecting pods where pod.Spec.OS and the node's OS (using runtime.GOOS) don't match	2021-11-08 19:18:15 -05:00
Kubernetes Prow Robot	cda360c59f	Merge pull request #104613 from ravisantoshgudimetla/reconcile-labels [kubelet]: Reconcile OS and arch labels periodically	2021-11-08 14:15:19 -08:00
Kubernetes Prow Robot	8b463cd141	Merge pull request #105406 from marosset/kubelet-metrics-for-host-process-containers Adding kubelet metrics for started and failed to start HostProcess containers	2021-11-08 13:11:20 -08:00
Shivanshu Raj Shrivastava	f4aad52885	migrated preemption.go, stateful.go, resource_allocation.go to structured logging	2021-11-08 22:52:47 +05:30
Kubernetes Prow Robot	33de444861	Merge pull request #103095 from haircommander/podAndContainerStatsFromCRI-feature-gate Kubelet: implement support for podAndContainerStatsFromCRI	2021-11-07 18:26:53 -08:00
ravisantoshgudimetla	21c5c2ec5c	[kubelet][podadmission]: Validate and reject pods with mismatching labels	2021-11-05 18:47:43 -04:00
ravisantoshgudimetla	02c1bac0b6	[kubelet]: Sync label periodically	2021-11-05 18:47:43 -04:00
Mark Rossetti	ef324d6bbd	Adding kubelet metrics for started and failed to start HostProcess containers Signed-off-by: Mark Rossetti <marosset@microsoft.com>	2021-11-04 14:39:57 -07:00
Mengjiao Liu	275d832ce2	Upgrade preparation to verify sysctl values containing forward slashes by regex	2021-11-04 11:49:56 +08:00
Patrick Ohly	3948cb8d1b	component-base: move v/vmodule/log-flush-frequency into LoggingConfiguration These three options are the ones from logs.AddFlags which are not deprecated. Therefore it makes sense to make them available also via the configuration file support in the one command which currently supports that (kubelet). Long-term, all commands should use LoggingConfiguration, either with a configuration file (as in kubelet) or via flags (kube-scheduler, kube-apiserver, kube-controller-manager). Short-term, both approaches have to be supported. As the majority of the commands only use logs.AddFlags, that function by default continues to register the flags and only leaves that to Options.AddFlags when explicitly requested. A drive-by bug fix is done for log flushing: the periodic flushing called klog.Flush and therefore missed explicit flushing of the newer logr backend. This bug was never present in any Kubernetes release and therefore the fix is not submitted in a separate PR.	2021-11-03 07:41:46 +01:00
Kubernetes Prow Robot	aa0ea62489	Merge pull request #104903 from ikeeip/storageobjectinuseprotection_feature_ga_cleanup Remove StorageObjectInUseProtection feature gate logic	2021-11-02 20:22:57 -07:00
Kubernetes Prow Robot	359b722c19	Merge pull request #102882 from fromanirh/device-manager-checkpoints devicemanager: checkpoint: support pre-1.20 data	2021-11-02 16:56:57 -07:00
Konstantin Misyutin	808c8f42d5	Remove StorageObjectInUseProtection feature gate logic This feature graduated to GA in v1.11 and will always be enabled, so there is no longer any need to check whether it is enabled. Signed-off-by: Konstantin Misyutin <konstantin.misyutin@huawei.com>	2021-11-03 00:13:50 +03:00
Kubernetes Prow Robot	08bf54678e	Merge pull request #101909 from nolancon/cpu-mgr-testing Additional cases for reconcileState testing	2021-10-30 00:01:17 -07:00
Tim Hockin	11a25bfeb6	De-share the Handler struct in core API (#105979) * De-share the Handler struct in core API An upcoming PR adds a handler that only applies on one of these paths. Having fields that don't work seems bad. This never should have been shared. Lifecycle hooks are like a "write" while probes are more like a "read". HTTPGet and TCPSocket don't really make sense as lifecycle hooks (but I can't take that back). When we add gRPC, it is EXPLICITLY a health check (defined by gRPC) not an arbitrary RPC - so a probe makes sense but a hook does not. In the future I can also see adding lifecycle hooks that don't make sense as probes. E.g. 'sleep' is a common lifecycle request. The only option is `exec`, which requires having a sleep binary in your image. * Run update scripts	2021-10-29 13:15:11 -07:00
Peter Hunt	6b3f8e5662	kubelet: fallback to partial CRI stats if full fails This is partially to allow the kube alpha tests to pass until CRI implementations have support, but also to handle this error situation a bit more elegantly Signed-off-by: Peter Hunt <pehunt@redhat.com>	2021-10-29 09:40:20 -04:00
Peter Hunt	feb5f5e0ed	kubelet: use helper function to check for nil fields in sandbox stats Signed-off-by: Peter Hunt <pehunt@redhat.com>	2021-10-29 09:40:20 -04:00
Peter Hunt	85e8a4bf73	kubelet stats: use UsageNanoCores if available Signed-off-by: Peter Hunt <pehunt@redhat.com>	2021-10-29 09:40:20 -04:00
Peter Hunt	ffdb4b9c4a	kubelet: slightly move around some cri stats functions to reduce duplication and add clarity Signed-off-by: Peter Hunt <pehunt@redhat.com>	2021-10-29 09:40:20 -04:00
Peter Hunt	d2c436700e	kubelet stats: add support for podAndContainerStatsFromCRI This commit adds an initial implementation of translating from the new CRI fields to the /stats/summary PodStats object Signed-off-by: Peter Hunt <pehunt@redhat.com>	2021-10-29 09:40:20 -04:00
Peter Hunt	7866287ba1	kubelet stats: wire up podAndContainerStatsFromCRI feature gate though it is currently unused Signed-off-by: Peter Hunt <pehunt@redhat.com>	2021-10-29 09:40:20 -04:00
Kubernetes Prow Robot	c592bd40f2	Merge pull request #105609 from pohly/generic-ephemeral-volume-ga generic ephemeral volume GA	2021-10-28 17:36:50 -07:00
Francesco Romani	2f426fdba6	devicemanager: checkpoint: support pre-1.20 data The commit `a8b8995ef2` changed the content of the data kubelet writes in the checkpoint. Unfortunately, the checkpoint restore code was not updated, so if we upgrade kubelet from pre-1.20 to 1.20+, the device manager can no longer restore its state correctly. The only trace of this misbehaviour is this line in the kubelet logs: ``` W0615 07:31:49.744770 4852 manager.go:244] Continue after failing to read checkpoint file. Device allocation info may NOT be up-to-date. Err: json: cannot unmarshal array into Go struct field PodDevicesEntry.Data.PodDeviceEntries.DeviceIDs of type checkpoint.DevicesPerNUMA ``` If we hit this bug, the device allocation info is indeed NOT up-to-date until the device plugins register themselves again. This can take up to a few minutes, depending on the specific device plugin. While the device manager state is inconsistent: 1. the kubelet will NOT update the device availability to zero, so the scheduler will send pods towards the inconsistent kubelet. 2. at pod admission time, the device manager allocation will not trigger, so pods will be admitted without devices actually being allocated to them. To fix these issues, we add support to the device manager to read pre-1.20 checkpoint data. We retroactively call this format "v1". Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-26 09:54:11 +02:00
Kubernetes Prow Robot	17da6a2345	Merge pull request #105699 from yuzhiquan/remove-format-pods Remove format.pods func, use klog.Kobjs instead	2021-10-25 15:53:30 -07:00
Eric Ernst	2c0fad1f52	kuberuntime: populate sandbox resources, overhead Populate Resources and Overhead fields, which are now part of LinuxPodSandboxConfig. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-20 11:30:23 -07:00
Eric Ernst	ddcf815d12	kuberuntime: refactor linux resources for better reuse Separate the CPU/Memory req/limit -> linux resource conversion into its own function for better reuse. Elsewhere in kuberuntime pkg, we will want to leverage this requests/limits to Linux Resource type conversion. Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-20 11:30:23 -07:00
Eric Ernst	b1361aed93	kuberuntime: augment linux container config unit test Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-20 11:30:23 -07:00
Eric Ernst	a73502a0be	kuberuntime: augment linux container config unit test Signed-off-by: Eric Ernst <eric_ernst@apple.com>	2021-10-20 11:29:22 -07:00
Kubernetes Prow Robot	b2c4269992	Merge pull request #105631 from klueska/upstream-distribute-cpus-across-numa Add CPUManager policy option to distribute CPUs across NUMA nodes instead of packing them	2021-10-19 11:40:24 -07:00
Kubernetes Prow Robot	1af8a8c026	Merge pull request #105465 from marosset/remove-host-process-contianer-kubelet-annotations Stop passing WindowsHostProcessContainer annotations for CRI calls in kubelet	2021-10-18 15:50:02 -07:00
Kubernetes Prow Robot	e595d79dfc	Merge pull request #104574 from 249043822/br-repeat-package fix duplicate package import in pod_worker	2021-10-18 15:49:46 -07:00
Kubernetes Prow Robot	5889fb4fbc	Merge pull request #105652 from wzshiming/feat/structure-shutdown-config Refactor to use structure to pass parameters for GracefulNodeShutdown	2021-10-18 14:45:20 -07:00
Kevin Klues	86f9c266bc	Add optimizations to reduce iterations in distributed NUMA algorithm Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-18 08:53:25 +00:00
Kevin Klues	70e0f47191	Support full-pcpus-only with the new NUMA distribution policy option Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	d54445a84d	Generalize the NUMA distribution algorithm to take cpuGroupSize This parameter ensures that CPUs are always allocated in groups of size 'cpuGroupSize'. This is important, for example, to ensure that all CPUs (i.e. hyperthreads) from the same core are handed out together. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	1436e33642	Add more extensive testing for NUMA distribution algorithm in CPUManager Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	cf3afb8602	Add 2 distinguishing test cases between the 2 takeByTopology algorithms Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	eb78e2406b	Add a new TestTakeByTopologyNUMADistributed() test to the CPUManager As part of this, pull out all of the existing "TakeByTopology" tests and have them be called by the original TestTakeByTopologyNUMAPacked() as well as the new TestTakeByTopologyNUMADistributed() test. In a subsequent commit, we will add some tests that should differ between these two algorithms. Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	876dd9b078	Added algorithm to CPUManager to distribute CPUs across NUMA nodes Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 19:31:02 +00:00
Kevin Klues	462544d079	Split CPUManager takeByTopology() into two different algorithms The first implements the original algorithm which packs CPUs onto NUMA nodes if more than one NUMA node is required to satisfy the allocation. The second distributes CPUs across NUMA nodes if they can't all fit into one. The "distributing" algorithm is currently a noop and just returns an error of "unimplemented". A subsequent commit will add the logic to implement this algorithm according to KEP 2902: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2902-cpumanager-distribute-cpus-policy-option Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 14:46:19 +00:00
Kevin Klues	0e7928edce	Add new CPUManager policy option for "distribute-cpus-across-numa" This commit only adds the option to the policy options framework. A subsequent commit will add the logic to utilize it. The KEP describing this new option can be found here: https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2902-cpumanager-distribute-cpus-policy-option Signed-off-by: Kevin Klues <kklues@nvidia.com>	2021-10-16 14:46:19 +00:00
yuzhiquanlong	27fe56e916	remove unused import	2021-10-15 18:40:31 +08:00
Francesco Romani	4bae656835	cpumanager: test NUMA node support for CPU assign (2) This batch of tests adds a fake topology on which each NUMA node has multiple sockets. We have not yet found a real HW topology like this in the wild, but we need one to fully exercise the code. So, until we find a HW topology, we add a fake one flipping the NUMA/socket config of the existing xeon dual gold 6320. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
Francesco Romani	547996f3f6	cpumanager: test NUMA node support for CPU assign (1) This batch of tests adds a real topology on which each physical socket has multiple NUMA zones. Taken from a real dual xeon 6320 gold. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
Francesco Romani	f6ccc4426a	cpumanager: test: use proper subtests The existing unit tests were performing subtests without actually using the full features of the testing package (https://pkg.go.dev/testing#hdr-Subtests_and_Sub_benchmarks). Update them with fairly minimal changes. The patch is deceptively large because we need to move the code inside a new block. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
Francesco Romani	15caa134b2	cpumanager: topology: use rich cmp package Use `cmp.Diff` in the unit tests, moving away from `reflect.DeepEqual`. This gives us a clearer picture of the differences when the tests fail. Signed-off-by: Francesco Romani <fromani@redhat.com>	2021-10-15 10:29:21 +00:00
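
Several commits in this listing (462544d079, 876dd9b078, d54445a84d) describe two ways of taking CPUs when an allocation does not fit on one NUMA node: packing them onto nodes, or distributing them across nodes in groups of 'cpuGroupSize'. The Go sketch below is only a toy illustration of that difference and is not the kubelet code. The function names takePacked and takeDistributed are hypothetical, free CPUs are modelled as a plain count per NUMA node, and the distributed variant only considers the first k nodes in order, which is a simplifying assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// takePacked (hypothetical name) fills NUMA nodes in order, taking as many
// CPUs as possible from each node before moving on to the next one.
func takePacked(free []int, want int) ([]int, error) {
	taken := make([]int, len(free))
	remaining := want
	for i, f := range free {
		n := f
		if n > remaining {
			n = remaining
		}
		taken[i] = n
		remaining -= n
	}
	if remaining > 0 {
		return nil, errors.New("not enough free CPUs")
	}
	return taken, nil
}

// takeDistributed (hypothetical name) looks for the smallest number of nodes k
// over which the request splits evenly, with every per-node share being a
// multiple of cpuGroupSize. Simplification: only the first k nodes are tried.
func takeDistributed(free []int, want, cpuGroupSize int) ([]int, error) {
	if cpuGroupSize <= 0 {
		return nil, errors.New("cpuGroupSize must be positive")
	}
	for k := 1; k <= len(free); k++ {
		if want%k != 0 {
			continue // the request does not split evenly over k nodes
		}
		share := want / k
		if share%cpuGroupSize != 0 {
			continue // a share would break up a CPU group
		}
		fits := true
		for i := 0; i < k; i++ {
			if free[i] < share {
				fits = false
				break
			}
		}
		if !fits {
			continue
		}
		taken := make([]int, len(free))
		for i := 0; i < k; i++ {
			taken[i] = share
		}
		return taken, nil
	}
	return nil, errors.New("no even distribution found")
}

func main() {
	free := []int{8, 8} // two NUMA nodes with 8 free CPUs each
	packed, _ := takePacked(free, 12)
	distributed, _ := takeDistributed(free, 12, 2)
	fmt.Println("packed:     ", packed)
	fmt.Println("distributed:", distributed)
}
```

With 12 CPUs requested from two nodes of 8, the packed variant takes 8 and 4 while the distributed variant takes 6 and 6. KEP 2902, linked from the commit messages, describes the behaviour the real policy option is meant to have.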


9669 Commits