* Add `Linux{Sandbox,Container}SecurityContext.SupplementalGroupsPolicy` and `ContainerStatus.user` in cri-api
* Add `PodSecurityContext.SupplementalGroupsPolicy`, `ContainerStatus.User` and its featuregate
* Implement DropDisabledPodFields for PodSecurityContext.SupplementalGroupsPolicy and ContainerStatus.User fields
* Implement kubelet so to wire between SecurityContext.SupplementalGroupsPolicy/ContainerStatus.User and cri-api in kubelet
* Clarify `SupplementalGroupsPolicy` is an OS depdendent field.
* Make `ContainerStatus.User` is initially attached user identity to the first process in the ContainerStatus
It is because, the process identity can be dynamic if the initially attached identity
has enough privilege calling setuid/setgid/setgroups syscalls in Linux.
* Rewording suggestion applied
* Add TODO comment for updating SupplementalGroupsPolicy default value in v1.34
* Added validations for SupplementalGroupsPolicy and ContainerUser
* No need featuregate check in validation when adding new field with no default value
* fix typo: identitiy -> identity
allow to specify what IDs must be used by the kubelet to create user
namespaces.
If no additional UIDs/GIDs are not allocated to the "kubelet" user,
then the kubelet assumes it can use any ID on the system.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
There is a conversion function `ConvertPodStatusToRunningPod`, which
can override the `Container.ImageID` into a digested reference from the
`ContainerStatus` CRI RPC, which gets mapped from the `image_ref`:
411c29c39f/pkg/kubelet/container/helpers.go (L259-L292)
To avoid that failure case, we now introduce the same `image_id` into
the container status and let runtimes separate the fields.
We also add a note that the mapping from the digested reference of the
CRI to the Kubernetes Pod API `ImageID` field is intentional and should
not change.
Follow-up on: https://github.com/kubernetes/kubernetes/pull/123508
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
- Implement `computeInitContainerActions` to sets the actions for the
init containers, including those with `RestartPolicyAlways`.
- Allow StartupProbe on the restartable init containers.
- Update PodPhase considering the restartable init containers.
- Update PodInitialized status and status manager considering the
restartable init containers.
Co-authored-by: Matthias Bertschy <matthias.bertschy@gmail.com>
hostIPs order may not be be consistent. If secondary IP is before
primary one, current logic adds primary IP twice into PodIPs, which
leads to error: "may specify no more than one IP for each IP family".
In this case, the second IP shouldn't be added.
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Container runtimes like CRI-O and containerd reuse the code by copying
it from Kubernetes. To have a single source of truth for the streaming
server we now move the already isolated implementation to the
k8s.io/kubelet staging repository. This way runtimes can re-use the code
without copying the parts.
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
A pod that cannot be started yet (due to static pod fullname
exclusion when UIDs are reused) must be accounted for in the
pod worker since it is considered to have been admitted and will
eventually start.
Due to a bug we accidentally cleared pendingUpdate for pods that
cannot start yet which means we can't report the right metric to
users in kubelet_working_pods and in theory we might fail to start
the pod in the future (although we currently have not observed
that in tests that should catch such an error). Describe, implement,
and test the invariant that when startPodSync returns in every path
that either activeUpdate OR pendingUpdate is set on the status, but
never both, and is only nil when the pod can never start.
This bug was detected by a "programmer error" assertion we added
on metrics that were not being reported, suggesting that we should
be more aggressive on using log assertions and automating detection
in tests.
HandlePodCleanups is responsible for restarting pods that are no
longer running (usually due to delete and recreation with the same
UID in quick succession). We have to filter the list of pods to
restart from podManager to get the list of admitted pods, which
uses filterOutInactivePods on the kubelet. That method excludes
pods the pod worker has already terminated. Since a restarted
pod will be in the terminated state before HandlePodCleanups
calls SyncKnownPods, we have to call filterOutInactivePods after
SyncKnownPods, otherwise the to-be-restarted pod is ignored and
we have to wait for the next houskeeping cycle to restart it.
Since static pods are often critical system components, this
extra 2s wait is undesirable and we should restart as soon as
we can. Add a failing test that passes after we move the filter
call after SyncKnownPods.