See https://github.com/golang/mock#gomock: golang/mock is no longer
maintained, and should be replaced by go.uber.org/mock.
This allows golang/mock to be dropped from the status and vendored
fields in unwanted-dependencies.json.
Signed-off-by: Stephen Kitt <skitt@redhat.com>
Generating the name avoids all potential name collisions. It's not clear how
much of a problem that was because users can avoid them and the deterministic
names for generic ephemeral volumes have not led to reports from users. But
using generated names is not too hard either.
What makes it relatively easy is that the new pod.status.resourceClaimStatus
map stores the generated name for kubelet and node authorizer, i.e. the
information in the pod is sufficient to determine the name of the
ResourceClaim.
The resource claim controller becomes a bit more complex and now needs
permission to modify the pod status. The new failure scenario of "ResourceClaim
created, updating pod status fails" is handled with the help of a new special
"resource.kubernetes.io/pod-claim-name" annotation that together with the owner
reference identifies exactly for what a ResourceClaim was generated, so
updating the pod status can be retried for existing ResourceClaims.
The transition from deterministic names is handled with a special case for that
recovery code path: a ResourceClaim with no annotation and a name that follows
the Kubernetes <= 1.27 naming pattern is assumed to be generated for that pod
claim and gets added to the pod status.
There's no immediate need for it, but just in case that it may become relevant,
the name of the generated ResourceClaim may also be left unset to record that
no claim was needed. Components processing such a pod can skip whatever they
normally would do for the claim. To ensure that they do and also cover other
cases properly ("no known field is set", "must check ownership"),
resourceclaim.Name gets extended.
- Implement `computeInitContainerActions` to sets the actions for the
init containers, including those with `RestartPolicyAlways`.
- Allow StartupProbe on the restartable init containers.
- Update PodPhase considering the restartable init containers.
- Update PodInitialized status and status manager considering the
restartable init containers.
Co-authored-by: Matthias Bertschy <matthias.bertschy@gmail.com>
Every component that uses a pod.Manager should use a stub interface
(like we do for podWorker) that explicitly describes what methods
they use. This will allow podWorker to implement the minimum set
of manager interfaces.
The two are not coupled except accidentally. Separate them and
update callsites. This will reduce the scope of PodManager interface
to make exposing the pod worker cleaner.
The status manager channel forces all container status to be
processed, even if multiple updates are generated in succession.
Instead of queueing the updates, just remember which ones changed
and process them in a batch. This should reduce QPS load from
the Kubelet for status, reduce latency of status propagation to
the API in general, and is easier to reason about.
This also prevents status from being lost when the channel is
full - all updates sent by SetPodStatus are guaranteed to be
recorded. Changing to remove the channel allows us to set a
marker flag when the pod worker state machine completes that
avoids the status manager having to call into the pod worker
directly.
1. Core Kubelet changes to implement In-place Pod Vertical Scaling.
2. E2E tests for In-place Pod Vertical Scaling.
3. Refactor kubelet code and add missing tests (Derek's kubelet review)
4. Add a new hash over container fields without Resources field to allow feature gate toggling without restarting containers not using the feature.
5. Fix corner-case where resize A->B->A gets ignored
6. Add cgroup v2 support to pod resize E2E test.
KEP: /enhancements/keps/sig-node/1287-in-place-update-pod-resources
Co-authored-by: Chen Wang <Chen.Wang1@ibm.com>
Currently, there are some unit tests that are failing on Windows due to
various reasons:
- config options not supported on Windows.
- files not closed, which means that they cannot be removed / renamed.
- paths not properly joined (filepath.Join should be used).
- time.Now() is not as precise on Windows, which means that 2
consecutive calls may return the same timestamp.
- different error messages on Windows.
- files have \r\n line endings on Windows.
- /tmp directory being used, which might not exist on Windows. Instead,
the OS-specific Temp directory should be used.
- the default value for Kubelet's EvictionHard field was containing
OS-specific fields. This is now moved, the field is now set during
Kubelet's initialization, after the config file is read.
Track how long it takes for pod updates to propagate from detection
to successful change on API server. Will guide future improvements
in pod start and shutdown latency.
Metric is `kubelet_pod_status_sync_duration_seconds` and is ALPHA
stability. Histogram buckets are chosen based on distribution of
observed status delays in practice.
- PreemptionByKubeScheduler (Pod preempted by kube-scheduler)
- DeletionByTaintManager (Pod deleted by taint manager due to NoExecute taint)
- EvictionByEvictionAPI (Pod evicted by Eviction API)
- DeletionByPodGC (an orphaned Pod deleted by PodGC)PreemptedByScheduler (Pod preempted by kube-scheduler)
Terminal pods may continue to report a ready condition of true because
there is a delay in reconciling the ready condition of the containers
from the runtime with the pod status. It should be invalid for kubelet
to report a terminal phase with a true ready condition. To fix the
issue, explicitly override the ready condition to false for terminal
pods during status updates.
Signed-off-by: David Porter <david@porter.me>
Create an E2E test that creates a job that spawns a pod that should
succeed. The job reserves a fixed amount of CPU and has a large number
of completions and parallelism. Use to repro github.com/kubernetes/kubernetes/issues/106884
Signed-off-by: David Porter <david@porter.me>
Other components must know when the Kubelet has released critical
resources for terminal pods. Do not set the phase in the apiserver
to terminal until all containers are stopped and cannot restart.
As a consequence of this change, the Kubelet must explicitly transition
a terminal pod to the terminating state in the pod worker which is
handled by returning a new isTerminal boolean from syncPod.
Finally, if a pod with init containers hasn't been initialized yet,
don't default container statuses or not yet attempted init containers
to the unknown failure state.
In the following code pattern, the log message will get logged with v=0 in JSON
output although conceptually it has a higher verbosity:
if klog.V(5).Enabled() {
klog.Info("hello world")
}
Having the actual verbosity in the JSON output is relevant, for example for
filtering out only the important info messages. The solution is to use
klog.V(5).Info or something similar.
Whether the outer if is necessary at all depends on how complex the parameters
are. The return value of klog.V can be captured in a variable and be used
multiple times to avoid the overhead for that function call and to avoid
repeating the verbosity level.