On systems with SELinux enabled, non-privileged containers can't
access the data of privileged containers. Since the CSI driver socket
is exposed by a privileged container, all sidecars that access that
socket must be privileged too.
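For illustration, a minimal sketch of what that means for a sidecar
container, expressed with corev1 types (container name and image are
placeholders, not the actual manifest):

package deploy

import corev1 "k8s.io/api/core/v1"

// privilegedSidecar sketches a sidecar that needs access to the
// driver socket and therefore runs privileged, like the driver itself.
func privilegedSidecar() corev1.Container {
	privileged := true
	return corev1.Container{
		Name:  "csi-provisioner", // example sidecar
		Image: "quay.io/k8scsi/csi-provisioner:v1.3.0",
		SecurityContext: &corev1.SecurityContext{
			Privileged: &privileged,
		},
	}
}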
This is required for promoting volume limits to GA. The new version of
the driver reports the maximum number of volumes that it supports.
That number must be specified as a CLI argument when starting the
driver.
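For context, the limit surfaces through the CSI NodeGetInfo call:
NodeGetInfoResponse has a MaxVolumesPerNode field. A rough sketch of
the wiring, with an assumed flag name (the real driver code may
differ):

package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

type nodeServer struct {
	nodeID            string
	maxVolumesPerNode int64 // value of the CLI argument, e.g. --maxvolumespernode (name assumed)
}

// NodeGetInfo reports the limit; Kubernetes uses it when enforcing
// volume limits during scheduling.
func (ns *nodeServer) NodeGetInfo(ctx context.Context, req *csi.NodeGetInfoRequest) (*csi.NodeGetInfoResponse, error) {
	return &csi.NodeGetInfoResponse{
		NodeId:            ns.nodeID,
		MaxVolumesPerNode: ns.maxVolumesPerNode,
	}, nil
}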
This is needed for raw block volumes. It mirrors a change made in the upstream
deployment in https://github.com/kubernetes-csi/csi-driver-host-path/pull/109
Raw block volumes use loop devices under the hood. "losetup --find
--show" uses LOOP_CTL_GET_FREE to get a free loop device and then
expects the corresponding /dev/loopX to be available already. When
/dev inside the container is a static tmpfs which doesn't already have
those /dev/loop* devices (*), the new device fails to show up,
resulting in:
I1028 13:25:19.937846 1 server.go:117] GRPC call: /csi.v1.Controller/CreateVolume
I1028 13:25:19.938083 1 server.go:118] GRPC request: {"accessibility_requirements":{"preferred":[{"segments":{"topology.hostpath.csi/node":"pmem-csi-pmem-govm-worker3"}}],"requisite":[{"segments":{"topology.hostpath.csi/node":"pmem-csi-pmem-govm-worker3"}}]},"capacity_range":{"required_bytes":5368709120},"name":"pvc-24985a49-5638-4bf6-b789-bb99a28d1073","volume_capabilities":[{"AccessType":{"Block":{}},"access_mode":{"mode":1}}]}
I1028 13:25:19.961124 1 volume_path_handler_linux.go:41] Creating device for path: /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004
I1028 13:25:20.391472 1 volume_path_handler_linux.go:75] Failed device create command for path: /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004 exit status 1 losetup: /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004: failed to set up loop device: No such file or directory
E1028 13:25:20.392916 1 server.go:121] GRPC error: rpc error: code = Internal desc = failed to create volume 635c6569-f986-11e9-baa6-0242ac110004: failed to attach device /csi-data-dir/635c6569-f986-11e9-baa6-0242ac110004: exit status 1
(*) It seems that the static tmpfs gets populated by Docker based on
what's currently on the host when the container starts. That would
explain why it worked in the Kubernetes Prow testing - the host must
have had enough loop devices already defined.
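To make the failure mode concrete, here is a minimal Go sketch of the
sequence that losetup goes through, using golang.org/x/sys/unix; this
is an illustration, not code from the driver:

package main

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// findFreeLoopDevice mimics "losetup --find": ask the kernel for a
// free loop index, then open the corresponding /dev/loopX node.
func findFreeLoopDevice() (string, error) {
	ctl, err := os.OpenFile("/dev/loop-control", os.O_RDWR, 0)
	if err != nil {
		return "", err
	}
	defer ctl.Close()

	// LOOP_CTL_GET_FREE returns a free loop device index and creates
	// the device node on the host's devtmpfs if necessary.
	index, err := unix.IoctlRetInt(int(ctl.Fd()), unix.LOOP_CTL_GET_FREE)
	if err != nil {
		return "", err
	}

	dev := fmt.Sprintf("/dev/loop%d", index)
	// This is the step that fails with "No such file or directory"
	// when /dev inside the container is a static tmpfs that was
	// populated once at container start.
	if _, err := os.Stat(dev); err != nil {
		return "", err
	}
	return dev, nil
}

func main() {
	dev, err := findFreeLoopDevice()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(dev)
}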
This updates the deployment to the releases meant to be used with
Kubernetes 1.16, except for external-snapshotter, which is kept at the
more recent 2.0.0-rc1 that targets 1.17.
The new external-attacher v2.0.0 needs updated RBAC rules, copied
verbatim from the v2.0.0 release.
We need the 1.2.0 driver for that because it supports detecting the
volume mode dynamically, and we need to deploy a CSIDriver object
which enables pod info (for the dynamic detection) and lists both
modes (to satisfy the new mode sanity check).
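A sketch of such a CSIDriver object, expressed with the
storage/v1beta1 types; the field values follow the description above,
the exact manifest may differ:

package deploy

import (
	storagev1beta1 "k8s.io/api/storage/v1beta1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func hostpathCSIDriver() *storagev1beta1.CSIDriver {
	podInfoOnMount := true
	return &storagev1beta1.CSIDriver{
		ObjectMeta: metav1.ObjectMeta{Name: "hostpath.csi.k8s.io"},
		Spec: storagev1beta1.CSIDriverSpec{
			// Pod info is what enables the dynamic volume mode detection.
			PodInfoOnMount: &podInfoOnMount,
			// Listing both modes satisfies the new mode sanity check.
			VolumeLifecycleModes: []storagev1beta1.VolumeLifecycleMode{
				storagev1beta1.VolumeLifecyclePersistent,
				storagev1beta1.VolumeLifecycleEphemeral,
			},
		},
	}
}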
This ensures that the files are in sync with:
hostpath: v1.2.0-rc3
external-attacher: v2.0.1
external-provisioner: v1.3.0
external-resizer: v0.2.0
external-snapshotter: v1.2.0
driver-registrar/rbac.yaml is obsolete because only
node-driver-registrar is in use now, and it does not need RBAC rules.
mock/e2e-test-rbac.yaml was not used anywhere.
The README.md files were updated to indicate that these really are
files copied from elsewhere. To avoid the need to edit these files on
each update, <version> is used as a placeholder in the URL.
The setup of the V0 hostpath driver was done by copy-and-paste,
changing just the driver name and the manifests. The same can be
achieved by making the base struct a bit more configurable, which
simplifies future changes (less code).
Renaming the provisioner container was unnecessary and was reverted to
make it possible to use the same patch configuration.
While at it, also fix the InitHostV0PathCSIDriver typo.
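The shape of the refactoring, with illustrative names and manifest
paths (the TestDriver interface methods are elided; the actual e2e
code may differ):

package drivers

import "k8s.io/kubernetes/test/e2e/storage/testsuites"

// hostpathCSIDriver is the shared base struct; the pieces that used
// to be copy-and-pasted become constructor parameters.
type hostpathCSIDriver struct {
	driverInfo testsuites.DriverInfo
	manifests  []string
}

func initHostPathCSIDriver(name string, manifests ...string) *hostpathCSIDriver {
	return &hostpathCSIDriver{
		driverInfo: testsuites.DriverInfo{Name: name},
		manifests:  manifests,
	}
}

// The V1 and V0 variants now differ only in their arguments.
func InitHostPathCSIDriver() *hostpathCSIDriver {
	return initHostPathCSIDriver("csi-hostpath",
		"storage-csi/hostpath/hostpath/csi-hostpathplugin.yaml")
}

func InitHostPathV0CSIDriver() *hostpathCSIDriver {
	return initHostPathCSIDriver("csi-hostpath-v0",
		"storage-csi/hostpath/hostpath-v0/csi-hostpathplugin.yaml")
}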
Commit 503f654d7a, "Update CSI tests to
point to 1.0.0 external bits", switched to images published on gcr.io,
perhaps because the images on quay.io weren't ready yet. We now have
up-to-date images on quay.io and should be using those.
The driver dynamically provisions its volumes in /tmp. To preserve the
data across driver restarts, that directory must be mapped to a more
persistent place; /tmp on the host seems to be the safest choice.
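Expressed with corev1 types, the mapping looks roughly like this (the
volume name is illustrative):

package deploy

import corev1 "k8s.io/api/core/v1"

var hostPathType = corev1.HostPathDirectoryOrCreate

// Bind-mount the host's /tmp over the container's /tmp so that
// provisioned volumes survive a driver restart.
var csiDataVolume = corev1.Volume{
	Name: "csi-data-dir",
	VolumeSource: corev1.VolumeSource{
		HostPath: &corev1.HostPathVolumeSource{
			Path: "/tmp", // on the host
			Type: &hostPathType,
		},
	},
}

var csiDataMount = corev1.VolumeMount{
	Name:      "csi-data-dir",
	MountPath: "/tmp", // where the driver provisions volumes
}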
Ensuring that CSI drivers get deployed for testing exactly as intended
was problematic because the original .yaml files had to be converted
into code. e2e/manifest helped a bit, but not enough:
- could not load all entities
- didn't handle loading .yaml files with multiple entities
- actually creating and deleting entities still had to be done in tests
The new framework utility code handles all of that, including the
tricky cleanup operation that tests got wrong (AfterEach does not get
called after test failures!).
In addition, it ensures that each test gets its own instance of the
entities.
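The core idea, as a sketch (names and details are illustrative, not
the actual framework API): load every entity from possibly
multi-document .yaml files, create them all, and hand back a single
cleanup callback that tests run via defer, so it also runs after
failures:

package utils

import (
	"bytes"
	"os"
)

// createFromManifests creates every entity found in the given files
// and returns a function that deletes them in reverse order. The
// create callback turns one YAML document into an object and returns
// its delete function.
func createFromManifests(create func(doc []byte) (func(), error), files ...string) (func(), error) {
	var deletes []func()
	cleanup := func() {
		for i := len(deletes) - 1; i >= 0; i-- {
			deletes[i]()
		}
	}
	for _, file := range files {
		data, err := os.ReadFile(file)
		if err != nil {
			cleanup()
			return nil, err
		}
		// Handle .yaml files which contain multiple entities.
		for _, doc := range bytes.Split(data, []byte("\n---")) {
			if len(bytes.TrimSpace(doc)) == 0 {
				continue
			}
			deleteFn, err := create(doc)
			if err != nil {
				cleanup()
				return nil, err
			}
			deletes = append(deletes, deleteFn)
		}
	}
	return cleanup, nil
}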
The PSP role binding for hostpath is now necessary because we switch
from creating the pod directly to creating it via the StatefulSet
controller, which runs with fewer privileges.
Without this, the hostpath test runs into these errors in the
kubernetes-e2e-gce job:
Oct 19 16:30:09.225: INFO: At 2018-10-19 16:25:07 +0000 UTC - event for csi-hostpath-attacher: {statefulset-controller } FailedCreate: create Pod csi-hostpath-attacher-0 in StatefulSet csi-hostpath-attacher failed error: pods "csi-hostpath-attacher-0" is forbidden: unable to validate against any pod security policy: []
Oct 19 16:30:09.225: INFO: At 2018-10-19 16:25:07 +0000 UTC - event for csi-hostpath-provisioner: {statefulset-controller } FailedCreate: create Pod csi-hostpath-provisioner-0 in StatefulSet csi-hostpath-provisioner failed error: pods "csi-hostpath-provisioner-0" is forbidden: unable to validate against any pod security policy: []
Oct 19 16:30:09.225: INFO: At 2018-10-19 16:25:07 +0000 UTC - event for csi-hostpathplugin: {daemonset-controller } FailedCreate: Error creating: pods "csi-hostpathplugin-" is forbidden: unable to validate against any pod security policy: []
The extra role binding is silently ignored on clusters which don't
have this particular role.
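For illustration, such a binding looks roughly like this with rbacv1
types; the binding and role names here are assumptions:

package deploy

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// pspRoleBinding grants the service accounts of the test namespace
// the cluster role that permits using the privileged PSP.
func pspRoleBinding(namespace string) *rbacv1.RoleBinding {
	return &rbacv1.RoleBinding{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "psp-csi-hostpath-role", // illustrative
			Namespace: namespace,
		},
		RoleRef: rbacv1.RoleRef{
			APIGroup: rbacv1.GroupName,
			Kind:     "ClusterRole",
			Name:     "e2e-test-privileged-psp", // assumed role name
		},
		Subjects: []rbacv1.Subject{{
			Kind:      rbacv1.ServiceAccountKind,
			Name:      "default",
			Namespace: namespace,
		}},
	}
}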