Mengqi (David) Yu
d6e17ad808
Add random interval to nodeStatusReport interval every time after an actual node status change
2024-11-06 06:11:05 +00:00
Kubernetes Prow Robot
5e0b818ff9
Merge pull request #128551 from tallclair/allocated-checkpoint
...
[FG:InPlacePodVerticalScaling] Don't checkpoint ResizeStatus
2024-11-06 04:19:36 +00:00
Kubernetes Prow Robot
bf75546494
Merge pull request #128432 from zhifei92/integrating-health-check
...
Integrate device plugin registration gRPC server health checks.
2024-11-06 04:19:29 +00:00
Tim Allclair
ea53083c14
Don't checkpoint ResizeStatus
2024-11-05 15:48:35 -08:00
Tim Allclair
4a4748d23c
Determine resize status from state in handlePodResourcesResize
2024-11-05 15:41:49 -08:00
Kubernetes Prow Robot
f81a68f488
Merge pull request #128377 from tallclair/allocated-status-2
...
[FG:InPlacePodVerticalScaling] Implement AllocatedResources status changes for Beta
2024-11-05 23:21:49 +00:00
Kubernetes Prow Robot
e57618970e
Merge pull request #126870 from AnishShah/outofcpu-fix
...
Ensure mirror pods are created as soon as node is registered
2024-11-05 19:15:29 +00:00
zhangzhifei16
1381e41f28
feat: Integrate device plugin registration gRPC server health checks.
2024-11-05 19:59:56 +08:00
Anish Shah
dcafd93b68
kubelet: try registering mirror pods as soon as node is registered.
...
Mirror pods for static pods may not be created immediately during node startup
because either the node is not registered or node informer is not synced.
They will be created eventually when static pods are resynced (every 1-1.5 minutes).
However, during this delay of 1-1.5 mins, kube-scheduler might overcommit resources
to the node and eventually cause kubelet to reject pods with
OutOfCPU/OutOfMemory/OutOfPods error.
To ensure kube-scheduler is aware of static pod resource usage faster,
mirror pods are created as soon as the node registers.
2024-11-05 00:56:21 -08:00
lauralorenz
4965a7a8a0
KEP-4603: Refactor various hardcoded backoffs into separate constants (#128369)
...
* Refactor various hardcoded backoffs into separate constants
Signed-off-by: Laura Lorenz <lauralorenz@google.com>
* Fix comment formatting
Signed-off-by: Laura Lorenz <lauralorenz@google.com>
---------
Signed-off-by: Laura Lorenz <lauralorenz@google.com>
2024-11-05 06:07:28 +00:00
Kubernetes Prow Robot
c6ea102f5f
Merge pull request #128298 from SergeyKanzhelev/convergePluginRegistrationForDRAAndDP
...
converge DRA and Device Plugin plugins registration
2024-11-04 11:41:28 +00:00
Tim Allclair
84201658c3
Fix ResizeStatus state transitions
2024-11-01 14:02:59 -07:00
Sergey Kanzhelev
1297d0cdd1
converge DRA and Device Plugin plugins registration
2024-10-30 16:58:13 +00:00
Oksana Baranova
49b88f1d8a
kubelet: migrate clustertrustbundle, token to contextual logging
...
Signed-off-by: Oksana Baranova <oksana.baranova@intel.com>
2024-10-30 17:31:11 +02:00
Kubernetes Prow Robot
b3cf9c6e5c
Merge pull request #128269 from tallclair/allocated
...
[FG:InPlacePodVerticalScaling] Rework handling of allocated resources
2024-10-25 23:24:52 +01:00
Tim Allclair
b186c160ca
Clarify eviction based on allocated pods
2024-10-25 13:53:11 -07:00
Tim Allclair
c75a3e717e
More precise allocatedPod name usage
2024-10-25 13:32:36 -07:00
Tim Allclair
7166169c82
Tidy up handlePodResize
2024-10-24 16:35:28 -07:00
Tim Allclair
34cf754fe9
Pass allocatedPods to canAdmitPod
2024-10-24 16:31:49 -07:00
Tim Allclair
d1f1bf200c
Add more comments
2024-10-24 15:51:19 -07:00
Tim Allclair
321eff34f7
Rework allocated resources handling
2024-10-24 09:27:40 -07:00
Kubernetes Prow Robot
e526a27118
Merge pull request #116388 from mxpv/shutdown
...
Clean/refactor node shutdown manager
2024-10-24 08:34:53 +01:00
Maksym Pavlenko
449f86b0ba
Refactor node shutdown manager
...
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2024-10-23 17:36:22 -07:00
Kubernetes Prow Robot
d7e5ff87e0
Merge pull request #128083 from carlory/fix-126662-kubelet
...
kubelet: fix a bug where kubelet wrongly drops the QOSClass field of the Pod's status when it rejects a Pod
2024-10-23 21:00:53 +01:00
carlory
c7e384f9ff
kubelet: fix a bug where kubelet drops the QOSClass field of the Pod's status when it rejects a Pod
...
Co-authored-by: Sergey Kanzhelev <S.Kanzhelev@live.com>
2024-10-24 01:01:04 +08:00
Kubernetes Prow Robot
cdc0cf5c10
Merge pull request #128237 from kolyshkin/userns
...
pkg/kubelet/userns/inuserns: use moby/sys/userns
2024-10-23 05:28:51 +01:00
Kir Kolyshkin
54d43ecaed
pkg/kubelet/user/userns: remove, use moby/sys/userns
...
The code from github.com/opencontainers/runc/libcontainer/userns package
was moved into github.com/moby/sys/user and github.com/moby/sys/userns
(see [1]), and the runc package is now deprecated in favor of moby/sys
(see [2]).
In addition, moby/sys/userns now has a non-Linux implementation, so
pkg/kubelet/user/userns package (introduced in commit 2e999ff to make a
non-Linux implementation) is not really needed anymore.
Let's switch to moby/sys/userns, and remove the package.
[1]: https://github.com/moby/sys/releases/tag/userns%2Fv0.1.0
[2]: https://github.com/opencontainers/runc/pull/4350
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2024-10-22 14:36:14 -07:00
Tim Allclair
53aa727708
Checkpoint allocated requests and limits
2024-10-22 11:26:48 -07:00
zhifei92
dac7332ed2
integrate kubelet with the systemd watchdog
...
feat: add unit test
feat: add FeatureGate for SystemdWatchdog
fix: linter and failed tests
feat: add SystemdWatchdog to versioned feature list yaml
2024-10-21 10:46:14 +08:00
Kubernetes Prow Robot
e2c17c09a4
Merge pull request #125070 from torredil/kublet-vm-race
...
Ensure volumes are unmounted during graceful node shutdown
2024-10-02 00:33:48 +01:00
elmiko
38fe239ac4
factor out cloudprovider.DeprecationWarningForProvider
...
This change removes the deprecation warning function in favor of using
the `cloudprovider.DisableWarningForProvider`. It also fixes some of the
logic to ensure that non-external providers are properly detected and
warned about.
2024-09-30 12:20:25 -04:00
elmiko
d1d05d3eba
remove IsDeprecatedInternal from cloudprovider.plugins
...
The internal cloud controller loops are disabled at this point, so this
function should not be used, as it does not return accurate information.
In its place we check for the presence of the external cloud provider, as
that is the only acceptable value.
2024-09-26 14:55:25 -04:00
torredil
b4a21a3e43
Wait for volume teardown during graceful node shutdown
...
Signed-off-by: torredil <torredil@amazon.com>
2024-09-19 20:39:33 +00:00
Kubernetes Prow Robot
24a74f887a
Merge pull request #126595 from pacoxu/kubelet-cgroup-v2-kernel-version
...
[1.32] kubelet: add log and event for cgroup v2 running on kernel < 5.8
2024-09-18 18:34:44 +01:00
Paco Xu
259671bd43
check root cpu.stat instead of kernel version for cgroup v2
2024-09-18 11:39:36 +08:00
Kubernetes Prow Robot
bf6a9da1c3
Merge pull request #126931 from lengrongfu/feat/add-unready
...
Add default status "not ready" when readiness probe sync
2024-09-18 03:28:44 +01:00
rongfu.leng
de4010939d
add default status unready when readiness probe sync
...
Signed-off-by: rongfu.leng <lenronfu@gmail.com>
2024-09-14 08:56:32 +00:00
Oksana Baranova
2474369227
kubelet: migrate pleg to contextual logging
...
Signed-off-by: Oksana Baranova <oksana.baranova@intel.com>
2024-09-13 12:13:26 +03:00
Kubernetes Prow Robot
5ac315faf4
Merge pull request #126494 from bart0sh/PR153-migrate-DRA-Manager-to-contextual-logging
...
Migrate pkg/kubelet/cm/dra to contextual logging
2024-08-26 13:14:26 +01:00
Kubernetes Prow Robot
3f306ae140
Merge pull request #126343 from SergeyKanzhelev/succeededPodReadmitted
...
Terminated pod should not be re-admitted
2024-08-22 16:32:09 +01:00
Ed Bartosh
e1bc8defac
kubelet: Migrate DRA Manager to contextual logging
...
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
2024-08-22 11:12:41 +03:00
Kubernetes Prow Robot
113b12c6fb
Merge pull request #124439 from bells17/csi-translation-lib-structured-and-contextual-logging
...
Migrate k8s.io/csi-translation-lib/.* to structured logging
2024-08-19 18:13:54 -07:00
Paco Xu
69a67556c7
kubelet: add warning log and events for cgroup v2 running on kernel < 5.8
2024-08-12 14:06:56 +08:00
Sergey Kanzhelev
300128de65
succeeded pod is being re-admitted
2024-07-25 17:45:27 +00:00
Kubernetes Prow Robot
57d197fb89
Merge pull request #124430 from AllenXu93/fix-kubelet-restart-notReady
...
fix node notReady in first sync period after kubelet restart
2024-07-23 21:20:40 -07:00
Sergey Kanzhelev
62f96d2748
set AllocatedResourcesStatus in the Pod Status
2024-07-24 00:29:35 +00:00
Kubernetes Prow Robot
558c9536a1
Merge pull request #123678 from kinvolk/userns-use-kubelet-user-mappings
...
kubelet: Add logs for userns custom mappings parsing
2024-07-20 19:59:57 -07:00
bells17
1298c8a5fe
csi-translation-lib: Support structured and contextual logging
2024-07-18 14:01:27 +09:00
Kubernetes Prow Robot
52c0ed4673
Merge pull request #124342 from zhifei92/fix-error-check
...
fix error checking in kl.killPod within SyncPod
2024-07-16 16:05:07 -07:00
Kubernetes Prow Robot
fc3abdaf2d
Merge pull request #125470 from everpeace/kep-3619-SupplementalGroupsPolicy-e2e
...
KEP-3619: Add NodeStatus.Features.SupplementalGroupsPolicy API and e2e
2024-07-16 13:57:06 -07:00