11982 Commits

Author SHA1 Message Date
Tim Allclair
424c7ca7e5 Remove unused ClearState function 2025-01-29 12:04:40 -08:00
Kubernetes Prow Robot
8294abc599 Merge pull request #128998 from bart0sh/PR165-migrate-oom-to-contextual-logging
kubelet: Migrate pkg/kubelet/oom to contextual logging
2025-01-28 13:33:22 -08:00
Kubernetes Prow Robot
2bda5dd8c7 Merge pull request #129656 from vinayakankugoyal/kep2862beta
KEP-2862: Graduate to BETA.
2025-01-27 19:05:23 -08:00
vivzbansal
c31b1b3332 Resolved some review comments 2025-01-27 19:46:55 +00:00
vivzbansal
5889da1bbc Resolved latest review comments 2025-01-27 19:46:54 +00:00
vivzbansal
242dec3e34 Updated some unit tests and resolved some review comments 2025-01-27 19:46:54 +00:00
vivzbansal
5ed5732fa2 Refactored status manager code of updatePodFromAllocation 2025-01-27 19:46:54 +00:00
vivzbansal
8fa8277908 Added some unit tests 2025-01-27 19:46:54 +00:00
vivzbansal
2ba61325f6 Fix e2e test error due to ContainersToUpdate map not created 2025-01-27 19:46:54 +00:00
vivzbansal
6c5cf68722 Resolved latest review comments 2025-01-27 19:46:33 +00:00
vivzbansal
6cf5b80c64 Fix some unit test error 2025-01-27 19:42:14 +00:00
vivzbansal
1cf4587277 Fix build error 2025-01-27 19:42:14 +00:00
vivzbansal
d1fac494f4 resolve merge conflicts 2025-01-27 19:42:13 +00:00
Itamar Holder
54500bfe69 cadvisor_provider, unit tests: ensure container-level metrics are collected
Signed-off-by: Itamar Holder <iholder@redhat.com>
2025-01-27 13:13:17 +02:00
Itamar Holder
ceeba21d3d cadvisor_provider, unit test: Add swap stats to cadvisor CPU and Memory stats
Signed-off-by: Itamar Holder <iholder@redhat.com>
2025-01-27 13:13:17 +02:00
Itamar Holder
c111266609 cadvisor_provider, bugfix: Add swap stats to CPU and Memory stats
Signed-off-by: Itamar Holder <iholder@redhat.com>
2025-01-27 13:13:17 +02:00
Itamar Holder
e6c19f315f cri_provider, unit tests: ensure container-level metrics are collected
Signed-off-by: Itamar Holder <iholder@redhat.com>
2025-01-27 13:13:17 +02:00
Itamar Holder
748b52a130 cri_provider, bugfix: Add cadvisor container stats
Without this fix, when the CRI stats provider collects cadvisor
stats, pod swap stats are collected but the corresponding
container swap stats are not. This commit fixes this.

Signed-off-by: Itamar Holder <iholder@redhat.com>
2025-01-27 13:13:17 +02:00
Davanum Srinivas
4e05bc20db Linter to ensure go-cmp/cmp is used ONLY in tests
Signed-off-by: Davanum Srinivas <davanum@gmail.com>
2025-01-24 20:49:14 -05:00
Vinayak Goyal
3a780a1c1b KEP-2862: Graduate to BETA. 2025-01-24 21:36:00 +00:00
Kubernetes Prow Robot
28ad751946 Merge pull request #128727 from Tal-or/memorymanager_cleanup
memmanager:cleanup: drop `Experimental` prefix
2025-01-23 14:15:20 -08:00
Tim Allclair
bda81f1b68 Kubelet server handler cleanup 2025-01-21 16:31:52 -08:00
Kubernetes Prow Robot
0d988d7209 Merge pull request #129619 from ffromani/sig-node-approvers-ffromani
Self-nominating ffromani as approver for sig-node container and resource managers
2025-01-21 15:50:36 -08:00
Swati Sehgal
c56426bd9f node: device-mgr: Update klog.Infof(..., err) to klog.ErrorS(err,...)
Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-01-21 16:21:16 +00:00
Swati Sehgal
f8596d6d28 node: device-mgr: Change ErrorS(nil, ...) to InfoS
Ensure consistency across resource managers and update
ErrorS(nil, ...) to InfoS. Similar changes have been
proposed in CPU Manager and Memory Manager.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-01-21 16:21:09 +00:00
Swati Sehgal
110868691b node: cpu-mgr: Update klog.Infof(..., err) to klog.ErrorS(err,...)
We are trying to ensure consistency across resource managers when
it comes to logging. While working on logging improvements for
memory manager, it was identified that some parts of the code base
are still using klog.InfoS(..., err) instead of klog.ErrorS(err,...).
This change addresses this in CPU Manager.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-01-17 10:06:19 +00:00
Anish Ramasekar
92e35e7618 update credential provider godoc with unique provider name req
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2025-01-16 15:50:39 -08:00
Swati Sehgal
1714fbfa75 node: memory-mgr: Change ErrorS(nil, ...) to InfoS
Ensure consistency across resource managers and update
ErrorS(nil, ...) to InfoS.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>
2025-01-16 08:32:42 +00:00
Francesco Romani
8221e28e4d Add ffromani as approver for kubelet resource managers and their tests
Signed-off-by: Francesco Romani <fromani@redhat.com>
2025-01-14 13:18:40 +01:00
Aravindh Puthiyaparambil
12345a14c3 kubelet: use env vars in node log query PS command
- Use environment variables to pass string arguments in the node log
  query PS command
- Split getLoggingCmd into getLoggingCmdEnv and getLoggingCmdArgs
  for better modularization
2025-01-13 11:43:04 -08:00
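
The idea described in the commit above (passing user-supplied strings through environment variables rather than splicing them into the PowerShell command text) can be sketched roughly as follows. This is a minimal illustration, not the kubelet's actual code: `buildLogQueryCmd`, `queryScript`, and the `QUERY_PROVIDER` variable name are hypothetical, and the PowerShell script is only an example of a command that reads its argument from `$Env:`.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// queryScript is a constant PowerShell script (illustrative only). It never
// contains user input; it reads the provider name from the environment.
const queryScript = "Get-WinEvent -FilterHashtable @{LogName='Application'; ProviderName=$Env:QUERY_PROVIDER}"

// buildLogQueryCmd is a hypothetical helper. The user-controlled string is
// placed in the process environment, so it is never parsed as PowerShell code.
func buildLogQueryCmd(provider string) *exec.Cmd {
	cmd := exec.Command("powershell", "-NonInteractive", "-Command", queryScript)
	cmd.Env = append(os.Environ(), "QUERY_PROVIDER="+provider)
	return cmd
}

func main() {
	// The command is only constructed here, not executed.
	cmd := buildLogQueryCmd("kubelet'; Remove-Item X #")
	fmt.Println("script argument:", cmd.Args[len(cmd.Args)-1])
	fmt.Println("env entry:", cmd.Env[len(cmd.Env)-1])
}
```

Because the script text is fixed, a hostile value can change only the data the script sees, not the commands it runs.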
Filipe Xavier
fd35f652d4 fix state mem constructor and adjust restoreState 2025-01-11 15:37:57 -03:00
Kubernetes Prow Robot
f34d791b13 Merge pull request #125901 from jralmaraz/kubelet_prober
Report event for the cases when probe returned Unknown result
2025-01-09 05:20:33 -08:00
Filipe Xavier
efdd6bea2e kubelet checkpoint: refactor state memory
refactor state mem constructor to accept the state as parameter
and SetPodAllocation to update a single pod.
2025-01-07 16:31:06 -03:00
Filipe Xavier
8e872978e8 kubelet: improve allocated resources checkpointing
changed calls to set allocation from container level to pod level on status manager.
2025-01-07 09:20:36 -03:00
Kubernetes Prow Robot
3ec9c7f4d2 Merge pull request #128811 from zhifei92/statusz
add statusz endpoint for kubelet
2024-12-22 11:24:10 +01:00
zhifei92
75e5bd6a4f Fix unit test. 2024-12-17 14:25:37 +08:00
zhifei92
b9fc5678d9 Not using fine-grained auth. 2024-12-17 13:27:01 +08:00
Kubernetes Prow Robot
21525f39e0 Merge pull request #129217 from tallclair/revert-129214-kubelet-state-perms
Revert "Change default filestore permissions to 0700"
2024-12-14 09:28:42 +01:00
Kubernetes Prow Robot
46f0b3fc13 Merge pull request #128920 from tallclair/ippr-defaults
[FG:InPlacePodVerticalScaling] Remove ResizePolicy defaulting
2024-12-14 03:04:25 +01:00
Tim Allclair
532607ecbb Revert "Change default filestore permissions to 0700" 2024-12-13 16:59:12 -08:00
Tim Allclair
b2c84061c9 Change default filestore permissions to 0700 2024-12-13 13:11:52 -08:00
Kubernetes Prow Robot
c3d0002303 Merge pull request #129072 from kannon92/add-validation-container-log-max
add kubelet validation for containerLogMaxFiles
2024-12-12 16:44:35 +01:00
Kubernetes Prow Robot
e8615e2712 Merge pull request #129054 from pohly/remove-import-name
remove import doc comments
2024-12-12 09:58:35 +01:00
zhangzhifei16
7caff55fd9 Add statusz to kubelet auth. 2024-12-11 14:34:13 +08:00
Ryan Phillips
5d3c07e89d kubelet: only emit one reboot event
There are cases when the kubelet is starting where networking, or other
components can cause the kubelet to not post the status with the bootId.
The failed status update will cause the Kubelet to queue the
NodeRebooted warning and sometimes cause many events to be created.

This fix wraps the recordEventFunc to only emit one message per kubelet
instantiation.
2024-12-10 16:44:17 -06:00
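
A wrapper of the kind this commit message describes, one that forwards only the first call to an event-recording function, might look like the sketch below. It assumes a hypothetical `recordEventFunc` signature and uses `sync.Once` from the standard library; the real kubelet code may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// recordEventFunc is a hypothetical signature for an event-recording callback.
type recordEventFunc func(eventType, event string)

// onceOnly wraps fn so that only the first call is forwarded; later calls
// are silently dropped for the lifetime of the returned function.
func onceOnly(fn recordEventFunc) recordEventFunc {
	var once sync.Once
	return func(eventType, event string) {
		once.Do(func() { fn(eventType, event) })
	}
}

func main() {
	emitted := 0
	record := onceOnly(func(eventType, event string) {
		emitted++
		fmt.Printf("%s: %s\n", eventType, event)
	})

	// Simulate repeated status-update retries that each try to record the event.
	for i := 0; i < 3; i++ {
		record("Warning", "NodeRebooted")
	}
	fmt.Println("events emitted:", emitted)
}
```

`sync.Once` also makes the wrapper safe to call from several goroutines, which matters if status updates are retried concurrently.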
zhifei92
902dedbb52 fix: Move statusz to debugging handlers. 2024-12-10 10:40:14 +08:00
zhifei92
816cd40280 Unify ComponentKubelet and add unit tests. 2024-12-10 10:32:14 +08:00
zhifei92
a04df83f86 add statusz for kubelet 2024-12-10 10:32:14 +08:00
Ed Bartosh
804f8c7584 kubelet: fix DRA registration test
Set expected slice fields in the reactor function instead of
doing it in the test cleanup.

This should fix the test failure caused by kubelet calling reactor function
before the test cleanup sets the deleteCollectionForDriver variable.
2024-12-09 20:58:47 +02:00
Kevin Hannon
a0b74011b2 add kubelet validation for containerLogMaxFiles 2024-12-03 11:03:05 -05:00