11632 Commits

Author SHA1 Message Date
Tim Allclair
c75a3e717e More precise allocatedPod name usage 2024-10-25 13:32:36 -07:00
Tim Allclair
40595bd94b Fix FakeStatusManager SetPodAllocation 2024-10-25 09:51:42 -07:00
Tim Allclair
7166169c82 Tidy up handlePodResize 2024-10-24 16:35:28 -07:00
Tim Allclair
34cf754fe9 Pass allocatedPods to canAdmitPod 2024-10-24 16:31:49 -07:00
Tim Allclair
d1f1bf200c Add more comments 2024-10-24 15:51:19 -07:00
Tim Allclair
321eff34f7 Rework allocated resources handling 2024-10-24 09:27:40 -07:00
Tim Allclair
53aa727708 Checkpoint allocated requests and limits 2024-10-22 11:26:48 -07:00
Kubernetes Prow Robot
8013bc1c25 Merge pull request #126249 from xigang/node_status
kubelet: remove useless comment code for node status
2024-10-22 18:44:53 +01:00
Kubernetes Prow Robot
7429566b07 Merge pull request #127918 from saschagrunert/backoff-status
Use image pull error in `message` during back-off
2024-10-18 19:09:03 +01:00
Sascha Grunert
0fc4b740f8 Use image pull error in message during back-off
The container status waiting reason toggles between `ImagePullBackOff`
and the actual pull error, resulting in a bad user experience for
consumers like kubectl. For example, the output of
`kubectl get pods` returns either:

```
NAME   READY   STATUS                      RESTARTS   AGE
pod    0/1     SignatureValidationFailed   0          10s
```

or

```
NAME   READY   STATUS             RESTARTS   AGE
pod    0/1     ImagePullBackOff   0          18s
```

depending on which state the image pull is in. We now improve that behavior
by preserving the actual error in the `message` of the `waiting` state
from the pull during back-off:

```json
{
  "waiting": {
    "message": "Back-off pulling image \"quay.io/crio/unsigned:latest\": SignatureValidationFailed: image pull failed for quay.io/crio/unsigned:latest because the signature validation failed: Source image rejected: A signature was required, but no signature exists",
    "reason": "ImagePullBackOff"
  }
}
```

The `SignatureValidationFailed` value is inherited from the previous
known state:

```json
{
  "waiting": {
    "message": "image pull failed for quay.io/crio/unsigned:latest because the signature validation failed: Source image rejected: A signature was required, but no signature exists",
    "reason": "SignatureValidationFailed"
  }
}
```
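
The way the back-off message above is composed can be sketched in Go. This is only an illustration inferred from the JSON output shown in this commit message: `backOffMessage` and its parameters are hypothetical names, not the kubelet's actual function, and the format string is an assumption derived from the example message.

```go
package main

import "fmt"

// backOffMessage is a hypothetical helper, not the kubelet's real code.
// It prefixes the last known pull failure (its reason and message) with
// the back-off notice, so the original error stays visible while the
// waiting reason is reported as ImagePullBackOff.
func backOffMessage(image, lastReason, lastMessage string) string {
	// %q wraps the image reference in double quotes, matching the
	// escaped quotes seen in the JSON example.
	return fmt.Sprintf("Back-off pulling image %q: %s: %s", image, lastReason, lastMessage)
}

func main() {
	fmt.Println(backOffMessage(
		"quay.io/crio/unsigned:latest",
		"SignatureValidationFailed",
		"image pull failed",
	))
}
```

With this shape, a consumer such as kubectl can keep showing `ImagePullBackOff` as the status while the `message` field still carries the underlying pull error.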

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
2024-10-18 08:47:37 +02:00
Kubernetes Prow Robot
f5ae0413ca Merge pull request #126347 from vinayakankugoyal/kep2862impl
KEP-2862: Fine-grained Kubelet API Authorization
2024-10-18 03:53:04 +01:00
Kubernetes Prow Robot
ded7ad554e Merge pull request #125513 from mauri870/hotfix/grpc-handle-err
kubelet/cm/devicemanager: log grpc Serve error
2024-10-18 02:49:03 +01:00
Kubernetes Prow Robot
48f36acc7a Merge pull request #125337 from aojea/document_node_addresses
kubelet --node-ip flag using unspecified IPs and external cloud provider node addresses behavior
2024-10-18 00:55:03 +01:00
Vinayak Goyal
b1f290d444 KEP-2862: Fine-grained Kubelet API Authorization
Signed-off-by: Vinayak Goyal <vinaygo@google.com>
2024-10-17 20:53:27 +00:00
Kubernetes Prow Robot
141951cd6b Merge pull request #126420 from hoskeri/fix-container-succeeded-check-status
kuberuntime_manager: fix container success check.
2024-10-17 20:31:04 +01:00
Kubernetes Prow Robot
e6099268e3 Merge pull request #125080 from TommyStarK/unit-tests/kubelet-apis-config-validation
kubelet/apis/config/validation: improve unit test coverage
2024-10-17 17:17:10 +01:00
Kubernetes Prow Robot
f5b92902a3 Merge pull request #124434 from tu1h/fix-compute-resources-link
API docs: point outdated link to current link
2024-10-17 17:17:03 +01:00
Kubernetes Prow Robot
1f9038a468 Merge pull request #127919 from carlory/fix-127852
Fix data race in kubelet/volumemanager
2024-10-17 14:57:03 +01:00
Kubernetes Prow Robot
a4c262bc8c Merge pull request #127293 from hshiina/typecheck
kubelet/cm: Unite return value types of helper functions
2024-10-17 07:45:04 +01:00
Kubernetes Prow Robot
d67e6545b1 Merge pull request #124227 from iholder101/in-pod-vertical-scaling/extended-resources
[FG:InPlacePodVerticalScaling] Add extended resources to ContainerStatuses[i].Resources
2024-10-17 01:39:03 +01:00
Kubernetes Prow Robot
d1e03f3a77 Merge pull request #127195 from yaojunyu/fix-pod-alway-restart-open-envetedpleg
EventedPLEG: Set Timestamp in PodStatus for Generic PLEG more accurately
2024-10-14 23:36:20 +01:00
Prince Pereira
3448455083 Replacing hcsshim library with new hnslib library. 2024-10-14 10:44:30 -07:00
Kubernetes Prow Robot
426aa3d6ce Merge pull request #127489 from pacoxu/feature/125234
feat: Added net.ipv4.tcp_rmem and net.ipv4.tcp_wmem into safe sysctl list
2024-10-12 08:46:20 +01:00
Peter Hunt
77d03e42cd kubelet/cm: move CPU reading from cm to cm/cpumanager
Authored-by: Francesco Romani <fromani@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-10-11 11:29:16 -04:00
Peter Hunt
c51195dbd0 kubelet/cm: fix bug where kubelet restarts from missing cpuset cgroup
With the None cpumanager policy, cgroupv2, and the systemd cgroup manager, the kubelet
could get into a situation where it believes the cpuset cgroup was created
(by libcontainer in the cgroupfs) but systemd has deleted it, as it wasn't requested
to create it. This causes one unnecessary restart, as the kubelet fails with

`failed to initialize top level QOS containers: root container [kubepods] doesn't exist.`

This only causes one restart because the kubelet skips recreating the cgroup
if it already exists, but it's still a bother and is fixed by this change.

Signed-off-by: Peter Hunt <pehunt@redhat.com>
2024-10-11 10:49:16 -04:00
Kubernetes Prow Robot
3bf17e2340 Merge pull request #127959 from ffromani/fix-smtalign-error-message
node: cpumanager: fix smtalign error message and minor cleanup
2024-10-11 00:32:20 +01:00
Francesco Romani
838f911dea cpumanager: smtalign: fix error message
Fix error message if availablePhysicalCPUs = 0.
Without this change, the logic was mistakenly emitting
the old error message, which is confusing for troubleshooting.

Plus, a tiny quality of life improvement:
cpumanager static policy wants to use `cpuGroupSize` multiple times.
The value represents how many VCPUs per PCPU the machine has.
So, let's cache (and log!) the value in the policy data.
We don't support dynamic update of the HW topology anyway.
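
The caching idea described above can be sketched in Go. This is a minimal illustration under stated assumptions: `staticPolicy`, `newStaticPolicy`, and the `numCPUs`/`numCores` parameters are hypothetical names chosen for this example, not the cpumanager's actual types or constructor.

```go
package main

import "fmt"

// staticPolicy is a hypothetical stand-in for the policy data.
type staticPolicy struct {
	// cpuGroupSize is the number of virtual CPUs per physical CPU,
	// computed once because the topology is assumed not to change.
	cpuGroupSize int
}

// newStaticPolicy computes cpuGroupSize once, logs it, and stores it
// so that later code can reuse the cached value.
func newStaticPolicy(numCPUs, numCores int) (*staticPolicy, error) {
	if numCPUs <= 0 || numCores <= 0 {
		return nil, fmt.Errorf("invalid topology: %d CPUs, %d cores", numCPUs, numCores)
	}
	p := &staticPolicy{cpuGroupSize: numCPUs / numCores}
	fmt.Printf("static policy created: cpuGroupSize=%d\n", p.cpuGroupSize)
	return p, nil
}

func main() {
	// Example: 16 virtual CPUs on 8 physical cores.
	if _, err := newStaticPolicy(16, 8); err != nil {
		fmt.Println("error:", err)
	}
}
```

Computing the value at construction time is reasonable here only because, as the commit message notes, dynamic updates of the hardware topology are not supported.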

Signed-off-by: Francesco Romani <fromani@redhat.com>
2024-10-10 10:18:44 +02:00
Kubernetes Prow Robot
36122d5a9b Merge pull request #125103 from hjet/kuberuntime-testcov
[FG:InPlacePodVerticalScaling] Expand coverage for TestGenerateLinuxContainerResources
2024-10-09 01:58:22 +01:00
carlory
4c10212d7b Fix data race in kubelet/volumemanager 2024-10-08 16:39:02 +08:00
Kubernetes Prow Robot
a7fcc89ac0 Merge pull request #125936 from sivchari/use-ptr
use utils/ptr package instead of utils/pointer
2024-10-07 01:02:04 +01:00
Kubernetes Prow Robot
3d6c99e1a7 Merge pull request #125139 from huww98/kubelet-vm-cleaup
kubelet/volumemanager: cleanup set and sort
2024-10-04 17:12:27 +01:00
Kubernetes Prow Robot
83a1310228 Merge pull request #126575 from Lucaber/volume-attach-memory-allocations
Reduce memory usage/allocations during wait for volume attachment
2024-10-04 16:08:27 +01:00
Kubernetes Prow Robot
c95dd85823 Merge pull request #127396 from olyazavr/no-dupe-mount-unmount
check if volume already has mount op in progress before mount/unmount
2024-10-04 11:30:26 +01:00
Kubernetes Prow Robot
c923a61ddd Merge pull request #125982 from harche/compressible_reserved
Set only compressible resources on system and kube reserved cgroup slices
2024-10-04 04:08:27 +01:00
Harshal Patil
3bad47e8ed Set only compressible resources on system slice
Signed-off-by: Harshal Patil <harpatil@redhat.com>
2024-10-03 13:23:34 -04:00
sivchari
4eab3cca0a use utils/ptr package instead of utils/pointer
Signed-off-by: sivchari <shibuuuu5@gmail.com>
2024-10-03 11:33:12 +09:00
Kubernetes Prow Robot
e2c17c09a4 Merge pull request #125070 from torredil/kublet-vm-race
Ensure volumes are unmounted during graceful node shutdown
2024-10-02 00:33:48 +01:00
Kubernetes Prow Robot
1b71b94b73 Merge pull request #127711 from elmiko/correct-provider-deprecation-logic
Correct cloud provider detection logic to be more representative of deprecation and disablement status
2024-09-30 20:37:24 +01:00
elmiko
38fe239ac4 factor out cloudprovider.DeprecationWarningForProvider
this change removes the deprecation warning function in favor of using
the `cloudprovider.DisableWarningForProvider`. it also fixes some of the
logic to ensure that non-external providers are properly detected and
warned about.
2024-09-30 12:20:25 -04:00
Kubernetes Prow Robot
7ee17ce9b7 Merge pull request #126488 from haircommander/cri-stats-suffix-hack
kubelet: don't use cadvisor stats if PodAndContainerStatsFromCRI feature is enabled
2024-09-30 17:00:07 +01:00
carlory
4c913a86ea Fix close of closed channel in Test_Run_Positive_VolumeMountControllerAttachEnabledRace 2024-09-29 18:34:35 +08:00
Kubernetes Prow Robot
e34f7f4d80 Merge pull request #127671 from mmorel-35/testify/error-contains
fix: use `ErrorContains(t, err` instead of `Contains(t, err.Error()`
2024-09-28 19:18:01 +01:00
Kubernetes Prow Robot
909f9b912e Merge pull request #127692 from mmorel-35/testifylint/expected-actual@k8s.io/kubernetes
fix: enable expected-actual rule from testifylint in module `k8s.io/kubernetes`
2024-09-28 05:54:01 +01:00
Kubernetes Prow Robot
26399fa7be Merge pull request #127649 from mmorel-35/testifylint/formatter@k8s.io/kubernetes
fix: enable formatter rule from testifylint in module `k8s.io/kubernetes`
2024-09-28 02:52:09 +01:00
Matthieu MOREL
f736cca0e5 fix: enable expected-actual rule from testifylint in module k8s.io/kubernetes
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-27 07:56:31 +02:00
Matthieu MOREL
f777addb05 fix: use ErrorContains(t, err instead of Contains(t, err.Error()
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-26 22:22:20 +02:00
elmiko
d1d05d3eba remove IsDeprecatedInternal from cloudprovider.plugins
The internal cloud controller loops are disabled at this point, so this
function should not be used as it does not return accurate information.
In its place we check for the presence of the external cloud provider as
that is the only acceptable value.
2024-09-26 14:55:25 -04:00
Matthieu MOREL
b7248077a9 fix: enable formatter rule from testifylint in module k8s.io/kubernetes
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-26 08:19:54 +02:00
Kirtana Ashok
4a5513c19c Add local reference to hcs structs in windows cri stats test
Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
2024-09-25 18:56:03 -07:00
Matthieu MOREL
27b98be303 fix: enable nil-compare and error-nil rules from testifylint in module k8s.io/kubernetes
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2024-09-25 06:02:47 +02:00