PVC and containers shared the same ResourceRequirements struct to define their
API. When resource claims were added, that struct got extended, which
accidentally also changed the PVC API. To avoid such a mistake from happening
again, PVC now uses its own VolumeResourceRequirements struct.
The `Claims` field gets removed because risk of breaking someone is low:
theoretically, YAML files which have a claims field for volumes now
get rejected when validating against the OpenAPI. Such files
have never made sense and should be fixed.
Code that uses the struct definitions needs to be updated.
node.status.volumesInUse should report only attachable volumes, therefore
it needs to wait for the reconciler to update uncertain attachability of
volumes from the API server.
During CSI volume reconstruction it's not possible to tell, if the volume
is attachable or not - CSIDriver instance may not be available, because
kubelet may not have connection to the API server at that time.
Adding uncertain state during reconstruction + adding a correct state when
the API server is available.
reconstructVolume() is called when kubelet may not have connection to the
API server yet, therefore it cannot get CSIDriver instances to figure out
if a CSI volume is attachable or not.
Refactor reconstructVolume(), so it does not need
FindAttachablePluginBySpec for CSI volumes, because all of them are
deviceMountable (i.e. FindDeviceMountablePluginBySpec always returns the
CSI volume plugin).
Every component that uses a pod.Manager should use a stub interface
(like we do for podWorker) that explicitly describes what methods
they use. This will allow podWorker to implement the minimum set
of manager interfaces.
The two are not coupled except accidentally. Separate them and
update callsites. This will reduce the scope of PodManager interface
to make exposing the pod worker cleaner.
This allows us to return with a timeout error as soon as the
context is canceled. Previously in cases where the mount will
never succeed pods can get stuck deleting for 2 minutes.
In the Sync*Pod methods that call VolumeManager.WaitFor*, we
must filter out wait.Interrupted errors from being logged as
they are part of control flow, not runtime problems. Any
early interruption should result in exiting the Sync*Pod method
as quickly as possible without logging intermediate errors.
`findAndRemoveDeletedPods()` processes only pods from volume_manager cache: `dswp.desiredStateOfWorld.GetVolumesToMount()`. `podWorker` calls volume_manager `WaitForUnmount()` asynchronously. If it happens after populator cleaned up resources, an entry is added to `processedPods` and will never be seen. Let's cleanup such entries if they don't have a pod and marked for deletion.
Kubelet in standalone mode won't have kubeclient, it cannot get node.status
and get devices from it. Such a kubelet cannot mount attachable volumes
anyway.
DesiredStateOfWorld must remember both
- the effective SELinux label to apply as a mount option (non-empty for
RWOP volumes, empty otherwise)
- and the label that _would_ be used if the mount option would be used by
all access modes.
Mismatch warning metrics must be generated from the second label.
The path module has a few different functions:
Clean, Split, Join, Ext, Dir, Base, IsAbs. These functions do not
take into account the OS-specific path separator, meaning that they
won't behave as intended on Windows.
For example, Dir is supposed to return all but the last element of the
path. For the path "C:\some\dir\somewhere", it is supposed to return
"C:\some\dir\", however, it returns ".".
Instead of these functions, the ones in filepath should be used instead.
SELinux context discovered from Pod is not final, it can be cleared when a
volume plugin does not support SELinux or the volume is not
ReadWriteOncePod. Update the existing log line + add a new one for easier
debugging.
Add volume reconstruction logs to V(2) to see initial kubelet
ActualStateOfWorld after kubelet start. Kubelet logs SetUp / TearDown
events at V(2) already, so we can track the whole volume mount state in
V(2) logs.
To preserve fix in https://github.com/kubernetes/kubernetes/pull/110670,
add an unit test that check a volume is *uncertain* even after final mount
error when it was reconstructed.
And actually fix a regression introduced in the previous patch.
Move reconciler logic from reconstruct{new}.go to:
- reconciler.go - only the functionality used by the current (old)
reconciler.
- reconciler_new.go - only the functionality used by the new reconciler.
- reconciler_common.go - common functions.