2379 Commits

Author SHA1 Message Date
Anish Ramasekar
ad8666ce88 Update credential provider plugin to support using service account token
Signed-off-by: Anish Ramasekar <anish.ramasekar@gmail.com>
2025-03-11 20:36:32 -07:00
Stanislav Láznička
d3f44a5bc0 kubelet: lazy enabling the ClusterTrustBundleProjection feature
Determine whether the ClusterTrustBundleProjection should be enabled
based on ClusterTrustBundle API discovery.
Some distributions may rely on a running kubelet in order to start
their kube-apiserver. Therefore we must delay the API discovery.

This patch delays it until the first time a clustertrustbundle is
requested from the InformerManager.
2025-03-11 18:07:28 +01:00
Stanislav Láznička
e0f536bf1f use the ClusterTrustBundles beta API 2025-03-11 18:07:24 +01:00
Kubernetes Prow Robot
82667879bb Merge pull request #130599 from tallclair/acknowledged-resources
[FG:InPlacePodVerticalScaling] Track actuated resources to trigger resizes
2025-03-10 19:01:46 -07:00
Kubernetes Prow Robot
f510123183 Merge pull request #130559 from esotsal/fix-use-CamelCase-for-memory-manager-policy-name-check-for-InPlacePodVerticalScalingExclusiveCPUs-feature-gate
[FG:InPlacePodVerticalScaling] Fix use CamelCase for memory manager policy in InPlacePodVerticalScalingExclusiveCPUs
2025-03-10 14:41:47 -07:00
Tim Allclair
6d0b6278cd Rename some allocation.Manager methods 2025-03-10 10:03:35 -07:00
Tim Allclair
d4444dd598 Use actuated resources to determine resize status 2025-03-10 10:03:35 -07:00
Tim Allclair
660bd6b42d Track actuated resources in the allocation manager 2025-03-10 09:58:29 -07:00
Tim Allclair
ed326fea13 Always report pod status resources consistent with the current pod sync 2025-03-05 16:01:03 -08:00
Sotiris Salloumis
33bf509eb0 Use CamelCase for memory manager policy name check in InPlacePodVerticalScalingExclusiveCPUs 2025-03-04 14:23:08 +01:00
Tim Allclair
cb5c8d159c Don't automatically clear in-progress status when resize is not allowed 2025-03-03 15:26:40 -08:00
Tim Allclair
523a19aa44 Extract isInPlacePodVerticalScalingAllowed to shared function 2025-03-03 14:08:49 -08:00
Tim Allclair
460db5c137 Always use allocated resources for pods that don't support resize 2025-03-03 14:07:30 -08:00
Kubernetes Prow Robot
3560950041 Merge pull request #130254 from tallclair/allocation-manager-2
[FG:InPlacePodVerticalScaling] Move pod resource allocation management out of the status manager
2025-02-28 11:30:56 -08:00
Tim Allclair
fe4671356c Call allocationManager directly 2025-02-21 09:28:37 -08:00
Antonio Ojea
2418b54ee2 Revert "Add random interval to nodeStatusReport interval every time after an actual node status change" 2025-02-21 17:29:08 +01:00
Kubernetes Prow Robot
0634e21fb5 Merge pull request #128367 from vivzbansal/sidecar-2
[FG:InPlacePodVerticalScaling] Implement resize for sidecar containers
2025-02-05 14:38:15 -08:00
Ed Bartosh
71b9114840 kubelet: Migrate pkg/kubelet/sysctl to contextual logging 2025-01-30 10:31:58 +02:00
Kubernetes Prow Robot
8294abc599 Merge pull request #128998 from bart0sh/PR165-migrate-oom-to-contextual-logging
kubelet: Migrate pkg/kubelet/oom to contextual logging
2025-01-28 13:33:22 -08:00
vivzbansal
6c5cf68722 Resolved latest review comments 2025-01-27 19:46:33 +00:00
vivzbansal
d1fac494f4 resolve merge conflicts 2025-01-27 19:42:13 +00:00
Ed Bartosh
f622be0333 kubelet: Migrate pkg/kubelet/oom to contextual logging 2024-11-28 17:47:02 +02:00
Talor Itzhak
dc258e65ac memmanager:cleanup: drop Experimental prefix
Since MemoryManager goes GA, we should drop the
`Experimental` prefix from its fields.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
2024-11-12 09:45:17 +02:00
Kubernetes Prow Robot
6b031e50b2 Merge pull request #128713 from tallclair/ippr-debug-events
[FG:InPlacePodVerticalScaling] Emit events for Deferred and Infeasible statuses
2024-11-11 23:22:45 +00:00
lauralorenz
7fe41da522 KEP-4603: Node specific kubelet config for maximum backoff down to 1 second (#128374)
* Add feature gate, API, and conflict validation tests for enablecrashloopbackoffmax

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Handle when current base is longer than node max

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Update pkg/features/kube_features.go

Co-authored-by: Tsubasa Nagasawa <toversus2357@gmail.com>

* Fix indentation

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Follow convention for success test

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Normalize casing, and change field to Duration

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Fix json name and some other casing errors

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Another one I missed before

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Don't clobber global max function

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Change to flat value in defaults.go

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Streamline validation and defaults

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Fix typecheck

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Lint

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Tighten up validation for subsecond values

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Rename field from MaxBackOffPeriod to MaxContainerRestartPeriod

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* A few missed references to renames

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Only compare flags in flags test

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Don't mess with SetDefault signature

Nobody messes with SetDefault signature

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Fix stale signature change, and update test data

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Inspect current feature gates at defaulting time

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Don't use the global feature gate for temp usage

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Expose default error, and some comments

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

* Hint fuzzer for less arbitrary values to FeatureGates

Signed-off-by: Laura Lorenz <lauralorenz@google.com>

---------

Signed-off-by: Laura Lorenz <lauralorenz@google.com>
Co-authored-by: Tsubasa Nagasawa <toversus2357@gmail.com>
2024-11-09 01:44:43 +00:00
Tim Allclair
3a2555ee93 Emit events for resize error states 2024-11-08 16:43:55 -08:00
Tim Allclair
61e6242967 Move windows infeasible resize check into canResizePod 2024-11-08 16:42:10 -08:00
Kubernetes Prow Robot
0fff5bbe7d Merge pull request #128680 from tallclair/min-cpu
[FG:InPlacePodVerticalScaling] Handle edge cases around CPU MinShares
2024-11-08 05:24:51 +00:00
Kubernetes Prow Robot
81dc4538db Merge pull request #128287 from Nordix/esotsal/128068
[FG:InPlacePodVerticalScaling] Gate Disallow in-place resize for guaranteed pods on nodes with a static topology policy
2024-11-08 05:24:44 +00:00
Tim Allclair
5a3a40cd19 Handle resize edge cases around min CPU shares 2024-11-07 17:02:25 -08:00
Kubernetes Prow Robot
8504758a2e Merge pull request #125757 from Nordix/esotsal/125205
[FG:InPlacePodVerticalScaling] Fix backoff problem when quickly reverting resize patch
2024-11-07 23:32:42 +00:00
Kubernetes Prow Robot
ab30adcbae Merge pull request #128356 from lauralorenz/crashloopbackoff-maintain10minuterecoverythreshold
KEP-4603: Maintain current 10 minute recovery threshold for container backoff regardless of changes to the maximum duration
2024-11-07 22:20:50 +00:00
Kubernetes Prow Robot
1ce20b2b6f Merge pull request #126336 from HirazawaUi/remove-runonce-mode
Kubelet: Remove runonce mode
2024-11-07 21:06:46 +00:00
Kubernetes Prow Robot
25101d33bc Merge pull request #128518 from tallclair/pleg-watch-conditions
[FG:InPlacePodVerticalScaling] PLEG watch conditions: rapid polling for expected changes
2024-11-07 19:45:01 +00:00
Sotiris Salloumis
68fcc9cf8a Fix slow reconcile when quickly reverting resize patch 2024-11-07 19:51:47 +01:00
Laura Lorenz
a0b83a7741 Maintain 10 minute recovery threshold for container backoff
Signed-off-by: Laura Lorenz <lauralorenz@google.com>
2024-11-07 18:46:11 +00:00
Sotiris Salloumis
2d8939c4ae Gate: disallow in-place resize for guaranteed pods on nodes with a static topology policy
New gate "InPlacePodVerticalScalingExclusiveCPUs" is off by default,
but can be enabled to unblock development of Static CPU management alongside
InPlacePodVerticalScaling.
2024-11-07 16:59:23 +00:00
Kubernetes Prow Robot
c9024e7ae6 Merge pull request #128640 from mengqiy/spreadkubeletlaod
Add random interval to nodeStatusReport interval every time after an actual node status change
2024-11-07 13:48:03 +00:00
HirazawaUi
ecf2b402be remove runonce mode 2024-11-07 19:54:11 +08:00
Kubernetes Prow Robot
c462d4c8e5 Merge pull request #126096 from utam0k/support-disabling-oom-group-kill
kubelet: new kubelet config option for disabling group oom kill
2024-11-07 06:29:36 +00:00
Mengqi (David) Yu
1003d36870 Add random interval to nodeStatusReport interval every time after an actual node status change
update TestUpdateNodeStatusWithLease this time to avoid flakiness
2024-11-07 04:33:59 +00:00
Kubernetes Prow Robot
3184eb3d1b Merge pull request #128629 from liggitt/revert-spreadkubeletload
Revert "Add random interval to nodeStatusReport interval every time after an actual node status change"
2024-11-07 03:53:42 +00:00
utam0k
4f909c14a0 kubelet: new kubelet config option for disabling group oom kill
Signed-off-by: utam0k <k0ma@utam0k.jp>
2024-11-07 12:03:04 +09:00
Jordan Liggitt
4850b31bda Revert "Add random interval to nodeStatusReport interval every time after an actual node status change"
This reverts commit d6e17ad808.
2024-11-06 17:12:13 -05:00
Anish Shah
207842d3e0 drop InPlacePodVerticalScaling support in windows 2024-11-06 12:57:55 -08:00
Kubernetes Prow Robot
099449954e Merge pull request #128556 from AnishShah/kubelet-reject-metric
Introduce a metric to track kubelet admission failure.
2024-11-06 20:10:33 +00:00
Tim Allclair
da9c2c553b Set pod watch conditions for resize 2024-11-06 11:05:24 -08:00
Anish Shah
d4f05fdda5 Introduce a metric to track kubelet admission failure. 2024-11-06 00:07:17 -08:00
Mengqi (David) Yu
d6e17ad808 Add random interval to nodeStatusReport interval every time after an actual node status change 2024-11-06 06:11:05 +00:00
Kubernetes Prow Robot
5e0b818ff9 Merge pull request #128551 from tallclair/allocated-checkpoint
[FG:InPlacePodVerticalScaling] Don't checkpoint ResizeStatus
2024-11-06 04:19:36 +00:00