Commit Graph

10914 Commits

Author SHA1 Message Date
Kubernetes Prow Robot
378866edba Merge pull request #120518 from saschagrunert/metrics-container-start
kubelet: fix metric `container_start_time_seconds` timestamp
2023-10-15 07:05:37 +02:00
Kubernetes Prow Robot
95bd8b95a7 Merge pull request #100448 from saschagrunert/cri-stats-log
Do not error log CRI stats for not cached partitions
2023-10-14 23:49:12 +02:00
Kubernetes Prow Robot
4911aad463 Merge pull request #115702 from xyz-li/master
Fix: kubelet will not output logs after log file is rotated
2023-10-14 22:42:04 +02:00
Kubernetes Prow Robot
a7f8c2f787 Merge pull request #118846 from cyclinder/net.ipv4.tcp_keepalive_time
Mark net.ipv4.tcp_keepalive_time as a safe sysctl
2023-10-13 05:02:51 +02:00
Kubernetes Prow Robot
8923c3c871 Merge pull request #119659 from kannon92/beta-pod-ready-to-start
[KEP-3085] Promote PodReadyToStartContainers to beta in 1.29
2023-10-12 22:49:16 +02:00
Kevin Hannon
c94240e2e2 move kubelet constant for podreadytostart to staging 2023-10-12 11:18:11 -04:00
Kubernetes Prow Robot
38a1ec75f0 Merge pull request #119882 from ffromani/podres-client-wait
podresources: e2e: force eager connection
2023-10-12 15:59:55 +02:00
cyclinder
0167a9f833 mark net.ipv4.tcp_keepalive_time as a safe sysctl 2023-10-11 10:24:19 +08:00
Antonio Ojea
3ee2f27e5b kubelet: cloud-provider external addresses
Kubelet, if using cloud provider external, temporarily initializes
the node addresses using the non-cloud provider logic, until the
cloud provider overrides them.

This behavior has undesired consequences if the cloud-provider addresses
are different from the original ones, especially for hostNetwork pods,
which inherit these addresses from the Node.

Since some cloud-providers depend on this behavior, in order to keep
backward compatibility, assume that specifying addresses via
the node-ip flags means that the intent is to keep the existing
behavior of temporarily initializing the addresses.

If the node-ips are the unspecified addresses or are not set, then
wait for the external cloud provider to set the node addresses.

Change-Id: I3a3895f9b830769f9658e6a03f058c914c438a09
Signed-off-by: Antonio Ojea <aojea@google.com>
2023-10-06 14:01:28 +00:00
matte21
a213edae2a Add package-level godoc to pkg/kubelet/cm
Add file doc.go with some rudimentary information to package
kubelet/cm. This will make it easier for people approaching the
kubelet codebase for the first time to quickly understand what's
in the package, since its name is abbreviated and hostile to
newcomers.
2023-10-05 14:20:51 -04:00
Kubernetes Prow Robot
a321897e77 Merge pull request #120262 from harche/list_timeout
Add timeout to listContainerStats context
2023-10-02 07:46:46 -07:00
Kubernetes Prow Robot
622509830c Merge pull request #120716 from xrstf/fix-typos
Fix typos
2023-09-30 00:25:56 -07:00
Evan Lezar
394bcaf182 Only configure swap if available on node
This change bypasses all logic to set swap in the linux container
resources if a swap controller is not available on node. Failing
to do so may cause errors in runc when starting a container with
a swap configuration -- even if this is set to 0.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-26 21:32:58 +02:00
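The distinction the commit above draws, between leaving swap unconfigured and configuring it as 0, can be sketched roughly as follows. This is a minimal illustration, not the kubelet code: `linuxResources`, `configureSwap`, and the pointer-typed `MemorySwapLimit` field are hypothetical names chosen so that "unset" and "0" are visibly different.

```go
package main

import "fmt"

// linuxResources is a hypothetical, trimmed-down stand-in for the container
// resources struct; a nil MemorySwapLimit means "swap is not configured".
type linuxResources struct {
	MemoryLimit     int64
	MemorySwapLimit *int64
}

// configureSwap sets the swap limit only when the node has a swap controller.
// When it does not, the field stays nil, which is different from an explicit 0.
func configureSwap(res *linuxResources, swapControllerAvailable bool, limit int64) {
	if !swapControllerAvailable {
		return // bypass all swap logic
	}
	res.MemorySwapLimit = &limit
}

func main() {
	res := linuxResources{MemoryLimit: 1 << 20}

	configureSwap(&res, false, 0)
	fmt.Println("swap configured without controller:", res.MemorySwapLimit != nil)

	configureSwap(&res, true, 0)
	fmt.Println("swap configured with controller:", res.MemorySwapLimit != nil)
}
```

The point of the sketch is only that the availability check happens before any swap value is written, so a runtime never receives a swap setting on a node that cannot honor it.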
Evan Lezar
d3d1827c05 Use local isCgroup2UnifiedMode consistently
This change switches to using isCgroup2UnifiedMode locally to ensure
that any mocked function is also used when checking the swap controller
availability.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
2023-09-21 16:09:04 +02:00
Kubernetes Prow Robot
f9f00da6bc Merge pull request #118761 from TommyStarK/gh_113831
move common logic of highestSupportedVersion to util package
2023-09-18 13:59:25 -07:00
TommyStarK
42356bfbb3 move common logic of highestSupportedVersion to util package
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-09-18 21:25:29 +02:00
Kubernetes Prow Robot
82bca6304b Merge pull request #119464 from TommyStarK/dra/cleanup-manager-unit-tests
dra: cleanup manager unit tests
2023-09-18 07:08:43 -07:00
Christoph Mewes
79a7833ade fix typo Mininum => Minimum 2023-09-17 11:24:29 +02:00
Kubernetes Prow Robot
4fd8bd9975 Merge pull request #118568 from qiutongs/node-startup-latency
Create a node startup latency tracker
2023-09-15 13:00:12 -07:00
Kubernetes Prow Robot
d393d4e151 Merge pull request #120574 from logicalhan/cslis
promote component SLIs to GA; remove feature gates for component slis
2023-09-14 22:52:12 -07:00
Kubernetes Prow Robot
a08ee80807 Merge pull request #119829 from cvvz/fix-volumemanager-logs
fix: implement MarshalLog for structures in volumemanager for structured-logging.
2023-09-13 07:46:12 -07:00
Kubernetes Prow Robot
74f6c263d8 Merge pull request #118544 from sohankunkerkar/remove-sandbox-image-ref
pkg/kubelet: allow sandbox image pinning from CRI
2023-09-11 11:52:12 -07:00
Han Kang
e6435e98ed promote component SLIs to GA; remove feature gates for component slis 2023-09-11 09:15:32 -07:00
Qiutong Song
d3eb082568 Create a node startup latency tracker
Signed-off-by: Qiutong Song <songqt01@gmail.com>
2023-09-11 05:54:25 +00:00
Kubernetes Prow Robot
49768134e5 Merge pull request #119754 from pbxqdown/kubelet-fix-typo
Fix some typos in kubelet component source code
2023-09-09 19:36:11 -07:00
Sascha Grunert
5e0931336b kubelet: fix metric container_start_time_seconds's timestamp
Adapting the tests and reverting https://github.com/kubernetes/kubernetes/pull/103429

Carry-over from https://github.com/kubernetes/kubernetes/pull/117881

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2023-09-08 09:13:37 +02:00
Kubernetes Prow Robot
58ce734223 Merge pull request #120255 from likakuli/feat-addreferenceonlyfirsttime
feat: minimize unnecessary API requests to the API server for the configmap/secret get API
2023-09-07 06:42:57 -07:00
Kubernetes Prow Robot
b27670dfbd Merge pull request #118740 from saschagrunert/kubelet-label-types
Make kubelet label types public
2023-09-06 23:46:57 -07:00
Francesco Romani
2ea47038b9 podresources: e2e: force eager connection
Add and use more facilities to the *internal* podresources client.
Checking e2e test runs, we see quite a few errors like
```
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/lib/kubelet/pod-resources/kubelet.sock: connect: connection refused": rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial unix /var/lib/kubelet/pod-resources/kubelet.sock: connect: connection refused"
```

This is likely caused by kubelet restarts, which happen often in e2e tests,
combined with the fact that gRPC connects lazily AND we don't really
check the errors in client code - we just bubble them up.

While it's arguably bad that we don't check error codes properly, it's also
true that in the main case, e2e tests, the functions should never
fail except in a few well-known cases; we're connecting over a
super-reliable unix domain socket after all.

So, we centralize the fix by adding a function (along with minor
cleanups) that triggers the connection and ensures it happens,
localizing the changes just here. The main advantage is that this approach
is opt-in, composable, and doesn't leak gRPC details into the client
code.

Signed-off-by: Francesco Romani <fromani@redhat.com>
2023-09-07 08:24:49 +02:00
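The idea of forcing the connection up front, instead of discovering a refused connection on the first call, can be sketched with the standard library alone. This is an assumption-laden illustration and not the podresources client: it uses a plain `net.Dialer` rather than gRPC, and `waitForSocket` and `demo` are hypothetical helpers.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// waitForSocket is a hypothetical helper: it dials addr repeatedly until a
// connection succeeds or ctx expires, so failures surface at setup time.
func waitForSocket(ctx context.Context, network, addr string, interval time.Duration) error {
	var d net.Dialer
	for {
		conn, err := d.DialContext(ctx, network, addr)
		if err == nil {
			return conn.Close()
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("endpoint %s not ready: %w (last dial error: %v)", addr, ctx.Err(), err)
		case <-time.After(interval):
			// retry
		}
	}
}

// demo simulates a restarting server: the unix socket only starts listening
// after delay, and the client waits up to timeout for it.
func demo(delay, timeout time.Duration) error {
	dir, err := os.MkdirTemp("", "eager-conn")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "demo.sock")

	stop := make(chan struct{})
	defer close(stop)
	go func() {
		select {
		case <-time.After(delay):
		case <-stop:
			return
		}
		ln, err := net.Listen("unix", sock)
		if err != nil {
			return
		}
		<-stop
		ln.Close()
	}()

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return waitForSocket(ctx, "unix", sock, 10*time.Millisecond)
}

func main() {
	if err := demo(50*time.Millisecond, 2*time.Second); err != nil {
		fmt.Println("connect failed:", err)
		return
	}
	fmt.Println("connected")
}
```

Keeping the wait in one opt-in helper mirrors the commit's stated goal: callers that need a live connection ask for it explicitly, and the rest of the client code stays unaware of the transport.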
Gunju Kim
696f84aeb0 Feature-gate SidecarContainers code in pkg/kubelet/kuberuntime 2023-09-01 00:13:47 +09:00
likakuli
e167ecbb9e fix: only invoke AddReference the first time the same pod is synced, to minimize unnecessary requests to the API server for the configmap/secret get API
Signed-off-by: likakuli <1154584512@qq.com>
2023-08-31 22:56:49 +08:00
Harshal Patil
2e174a029f Add timeout to listContainerStats context
Signed-off-by: Harshal Patil <harpatil@redhat.com>
2023-08-30 10:49:44 -04:00
Sohan Kunkerkar
d5690f12b6 pkg/kubelet: allow sandbox image pinning from CRI
As part of this change, the code responsible for managing the sandbox
image within the kubelet has been removed. Previously, the kubelet
protected the sandbox image from the garbage collection process. However,
with this update, the responsibility of managing the sandbox containers
has been shifted to the CRI implementation itself. By allowing sandbox
image pinning from CRI, we improve efficiency and simplify the kubelet's
interaction with the container runtime. As a result, the kubelet can now
rely on the container runtime's built-in mechanisms for sandbox container
lifecycle management.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2023-08-29 15:34:51 -04:00
cvvz
03126c5465 add comment 2023-08-29 10:46:31 +08:00
cvvz
94d03ccc83 Squashed commit of the following:
commit d623614de31fe411f1dcb1e784472135f3ca0c5e
Merge: 8054af3b303 91344b4008
Author: cvvz <ftdchenwz@gmail.com>
Date:   Mon Aug 28 18:43:49 2023 +0800

    Merge branch 'master' of https://github.com/kubernetes/kubernetes into fix-volumemanager-logs

commit 8054af3b303e10e7b74b1ba4d3c4035f488cbdad
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 22:03:08 2023 +0800

    fix

commit b414972831c4e4030162ee385d8f600e1e0257ac
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 21:41:36 2023 +0800

    fix

commit ebea00a8dd50eb3d8859a912b464bbda5548b1d4
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 20:54:40 2023 +0800

    123

commit 9f6f1dbbe717fa34e1c13fec645f4c474cbf99a0
Author: cvvz <ftdchenwz@gmail.com>
Date:   Fri Aug 25 20:53:16 2023 +0800

    add MarshalLog

commit d7d2878409343df937c770d6796f8c125e18ce7a
Author: cvvz <ftdchenwz@gmail.com>
Date:   Tue Aug 8 23:57:47 2023 +0800

    fix volumemanager logs
2023-08-28 18:44:40 +08:00
Patrick Ohly
2472291790 api: introduce separate VolumeResourceRequirements struct
PVC and containers shared the same ResourceRequirements struct to define their
API. When resource claims were added, that struct got extended, which
accidentally also changed the PVC API. To avoid such a mistake from happening
again, PVC now uses its own VolumeResourceRequirements struct.

The `Claims` field gets removed because the risk of breaking someone is low:
theoretically, YAML files which have a claims field for volumes now
get rejected when validating against the OpenAPI schema. Such files
have never made sense and should be fixed.

Code that uses the struct definitions needs to be updated.
2023-08-21 15:31:28 +02:00
Kubernetes Prow Robot
addc0391e7 Merge pull request #116897 from Richabanker/kubelete-resource-metrics-ga
Graduate kubelet resource metrics to GA
2023-08-18 16:03:37 -07:00
Richa Banker
4712025ea8 Graduate kubelet resource metrics to GA 2023-08-17 09:22:48 -07:00
ruiwen-zhao
5bbc4f7605 Pass Pinned field to kubecontainer.Image
Signed-off-by: ruiwen-zhao <ruiwen@google.com>
2023-08-17 00:32:59 +00:00
git-jxj
a5b3a4b738 cleanup: Update deprecated FromInt to FromInt32 (#119858)
* redo commit

* apply suggestions from liggitt

* update Parse function based on suggestions
2023-08-16 09:33:01 -07:00
Kubernetes Prow Robot
19deb04a90 Merge pull request #118619 from TommyStarK/gh_113832
dynamic resource allocation: reuse gRPC connection
2023-08-16 09:32:27 -07:00
Kubernetes Prow Robot
419df231bc Merge pull request #119709 from charles-chenzz/fix_flaky
fix flaky test on dra TestPrepareResources/should_timeout
2023-08-16 06:16:26 -07:00
Antonio Ojea
f355b22f5f implement Stringer for podActions
klog prints an internal error when trying to log the podActions struct.

> I0505 14:12:12.827065  190662 kuberuntime_manager.go:1014] "computePodActions got for pod" podActions="<internal error: json: unsupported type: map[container.ContainerID]kuberuntime.containerToKillInfo>" pod="kube-system/coredns-8f5847b64-mzw46"

Implement the stringer interface on the struct to avoid the json error.

Change-Id: I22444524a78a0ecec9490b9240def371a4129434
2023-08-07 22:48:28 +00:00
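The failure quoted in the commit above comes from `encoding/json`, which rejects maps whose key type is a plain struct. A minimal sketch of the problem and of the Stringer workaround follows; `containerID`, `killInfo`, and this trimmed `podActions` are hypothetical stand-ins, not the kubelet definitions, and the output format is invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// containerID and killInfo are hypothetical stand-ins for the kubelet types
// named in the log line; the real definitions differ.
type containerID struct {
	Type string
	ID   string
}

type killInfo struct {
	Name   string
	Reason string
}

// podActions is a simplified, hypothetical version of the struct being logged.
type podActions struct {
	KillPod          bool
	ContainersToKill map[containerID]killInfo
}

// String implements fmt.Stringer, so a logger that honors Stringer never has
// to JSON-encode the struct-keyed map.
func (p podActions) String() string {
	entries := make([]string, 0, len(p.ContainersToKill))
	for id, info := range p.ContainersToKill {
		entries = append(entries, fmt.Sprintf("%s://%s=%s(%s)", id.Type, id.ID, info.Name, info.Reason))
	}
	sort.Strings(entries) // map iteration order is random; sort for stable output
	return fmt.Sprintf("KillPod: %t, ContainersToKill: [%s]", p.KillPod, strings.Join(entries, " "))
}

// jsonFails reports whether encoding/json rejects v.
func jsonFails(v any) bool {
	_, err := json.Marshal(v)
	return err != nil
}

func main() {
	a := podActions{ContainersToKill: map[containerID]killInfo{
		{Type: "containerd", ID: "abc"}: {Name: "coredns", Reason: "Unknown"},
	}}
	fmt.Println("json.Marshal fails:", jsonFails(a))
	fmt.Println(a) // uses String()
}
```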
Qian Xiao
0944c00778 Fix some typos in kubelet component source code 2023-08-03 23:56:50 -07:00
charles-chenzz
ba9ce3ab08 fix flaky test on dra TestPrepareResources/should_timeout
Co-authored-by: TommyStarK <thomasmilox@gmail.com>
2023-08-03 22:37:54 +08:00
TommyStarK
391c1a3ecc dra: cleanup manager unit tests
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-08-02 23:35:45 +02:00
Kubernetes Prow Robot
d4fde1e92a Merge pull request #118549 from a7i/kubelet-prober-metric-pod
fix 'pod' in kubelet prober metrics
2023-07-26 18:28:06 -07:00
Paco Xu
c4bf42199a do not touch swap for cgroup v1 if swap not enabled 2023-07-21 13:27:50 +08:00
TommyStarK
60a8bca507 dynamic resource allocation: add unit test to check the reuse of the gRPC connection
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-07-20 19:22:25 +02:00
TommyStarK
7ffd3063ce dynamic resource allocation: reuse gRPC connection
Signed-off-by: TommyStarK <thomasmilox@gmail.com>
2023-07-19 10:12:52 +02:00