Automatic merge from submit-queue (batch tested with PRs 57572, 57512, 57770). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

RBD Plugin: Pass monitor addresses in a comma-separated list instead of trying them one by one.

**What this PR does / why we need it**:

In production, monitors may crash or have network problems. If we try monitors one by one, the `rbd` command hangs for a long time whenever it hits an unconnectable monitor (e.g. `rbd map -m <unconnectable_host_ip>` on Linux 4.4 timed out after 6 minutes). This is unacceptable.

Instead, we can simply pass a comma-separated list of monitor addresses to the `rbd` command-line utility. The kernel rbd/libceph modules pick a monitor randomly and try them one by one, so `rbd` succeeds quickly if there is at least one good monitor in the list.

The [docs](http://docs.ceph.com/docs/jewel/man/8/rbd/#cmdoption-rbd-m) for the `-m` option of `rbd` are wrong: the `rbd` utility simply passes the `-m <mon>` parameter to the kernel rbd/libceph modules, which have accepted a comma-separated list of one or more monitor addresses (e.g. `ip1[:port1][,ip2[:port2]...]`) since their first version in Linux (see 602adf4002/net/ceph/ceph_common.c (L239)). Also, libceph chooses the monitor randomly, so we can pass all addresses without randomizing them ourselves (see 602adf4002/net/ceph/mon_client.c (L132)). From what I saw, there is no need to iterate over monitor hosts one by one.
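To illustrate the idea, here is a minimal Go sketch of building the `rbd map` argument list with all monitors joined into a single `-m` value. The helper name `buildMapArgs` and the sample image, pool, id, key, and addresses are hypothetical and are not taken from the plugin's actual code; only the flags shown in this description (`--pool`, `--id`, `-m`, `--key=`) are used.

```go
package main

import (
	"fmt"
	"strings"
)

// buildMapArgs is a hypothetical helper (not the plugin's real function).
// It assembles the arguments for `rbd map`, passing every monitor address
// in one comma-separated `-m` value instead of one address per invocation.
func buildMapArgs(image, pool, id, key string, monitors []string) []string {
	return []string{
		"map", image,
		"--pool", pool,
		"--id", id,
		// libceph accepts ip1[:port1][,ip2[:port2]...] and picks a monitor itself.
		"-m", strings.Join(monitors, ","),
		"--key=" + key,
	}
}

func main() {
	// All values below are placeholders chosen for illustration.
	args := buildMapArgs("myimage", "mypool", "admin", "secret",
		[]string{"10.0.0.1:6789", "10.0.0.2:6789"})
	fmt.Println("rbd", strings.Join(args, " "))
}
```

With this shape, a single `rbd map` invocation replaces the per-monitor retry loop, and choosing a reachable monitor is left to libceph.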
**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

Logs from running `rbd map` against an unconnectable monitor address on Linux 4.4:

```
root@myhost:~# uname -a
Linux myhost 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@myhost:~# time rbd map kubernetes-dynamic-pvc-941ff4d2-b951-11e7-8836-049fca8e58df --pool <pool> --id <id> -m <unconnectable_host_ip> --key=<password>
rbd: sysfs write failed
2017-12-20 18:55:11.810583 7f7ec56863c0 0 monclient(hunting): authenticate timed out after 300
2017-12-20 18:55:11.810638 7f7ec56863c0 0 librados: client.<id> authentication error (110) Connection timed out
rbd: couldn't connect to the cluster!
In some cases useful info is found in syslog - try "dmesg | tail" or so.
rbd: map failed: (110) Connection timed out

real 6m0.018s
user 0m0.052s
sys 0m0.064s
```

If we pass a comma-separated list of monitors instead, `rbd map` succeeds quickly as long as one of them is reachable:

```
root@myhost:~# time rbd map kubernetes-dynamic-pvc-941ff4d2-b951-11e7-8836-049fca8e58df --pool <pool> --id <id> -m <unconnectable_host_ip>,<good_host_ip> --key=<password>
/dev/rbd3

real 0m0.426s
user 0m0.008s
sys 0m0.008s
```

**Release note**:

```release-note
NONE
```