There is a reference error in the parameter "client_interface" in the "_
nova-console-compute-init.sh.tpl" file, now fix it.
Change-Id: I0b1bdd348e1f424afda9aa2183c0e876afd12968
The current copyright refers to a non-existent group
"openstack helm authors" with often out-of-date references that
are confusing when adding a new file to the repo.
This change removes all references to this copyright by the
non-existent group and any blank lines underneath.
Change-Id: Ia035037e000f1bf95202fc07b8cd1ad0fc019094
This patch adds ability to unhardcode readiness/
liveness probes timings. Moreover it introduces
RPC_PROBE_TIMEOUT and RPC_PROBE_RETRIES variables
which are passed to health probe script and
allow to unhardcode RPCtest timeout and number of
retries
Change-Id: I2498a14e97557feafbd45c8df3c683f8500026e6
[0] added route command to identify multiple default routes.
In some deployments, route command is not available which set the
client_interface value incorrectly. In this case VNC clinet tries
to connect to default host 127.0.0.1 and fails.
[0] https://review.opendev.org/#/c/696187
Change-Id: I4a936af053114988e0b70048e276a71833c5638e
In this patchset, the iSCSI protocol support is added
to enable Cinder to use iSCSI based storage backends.
Bootable volumes are not supported, only VM attached
volumes are supported for this initial patchset.
Change-Id: I1b35290b62d2cebae4bd8be62126a53f230ac6c0
Changed Nova and Neutron health-probe script to exit if previous
probe process is still running.
The health-probe has RPC call timeout of 60 seconds and has 2
retries. In worst case scenario the probe process can run a little
over 180 seconds. Changing the periodSeconds so that probe starts
after previous one is complete. Also changing timeoutSeconds value
a little to give little more extra time for the probe to finish.
Increasing the liveness probe periods as they are not do critical
which will reduce the resource usage for the probes.
Co-authored-by: Randeep Jalli <rj2083@att.com>
Change-Id: Ife1c381d663c1e271a5099bdc6d0dfefb00d8d73
It was observed that when increasing amount of
conductor workers from default "1" to higher value
the readiness probe fails to check rabbitmq connections
for conductor processes - it happens since the script is trying
to obtain rabbitmq connections for parent conductor process
which in case of workers>1 doesn`t open rabbit connections
but spawns child processes which handle rabbitmq
connections instead.
This patch removes the "check-all-pids" option, keeps the logic
but simplifies and fastens he code - instead of checking all
processes when "check-all-pids" option was set (however
regardless of "sock_count value" if only one process opens connection
the check returns positive result) processes will be checked one-by-one
until the first one with open rabbitmq connection(s) is
found.
Change-Id: I72be0bbdefcba77a55b6ceed6e192c9621c069eb
This patch set adds in a capability for the user to defaultly use a
FQDN for the nova compute hostname and the hypervisor hostname when
the host is not explicitly specified in the .Values.conf override.
Change-Id: I3243068dfe91ebb97b3885002296a0f454822ec5
Co-authored-by: Drew Walters <andrew.walters@att.com>
Signed-off-by: Tin Lam <tin@irrational.io>
Because it's almost time for expiring on some python version, OpenStack client
running on that version generates some messages for warning. Two scripts on
nova Fixed by this PS get version information using the OpenStack client
without any protection for this kinds of messages. This PS gives a little
more sophisticated way of it.
Change-Id: I2896c76e012b9acbf1e725276ba9c0b74789fa54
This patchset adds the capability to the Nova chart to be able to wait
for a percentage of the compute nodes/hypervisors to become ready/available
before continuing on with the deployment. It will be disabled by default,
because this is a feature that may or may not be needed in production
deployments.
Change-Id: I971151a663afc87e7d62efa4ab3723c5472a3736
This PS udpates the nova compute start script to account for cases where
there may be multiple default routes to the outside world.
Change-Id: Ibd051c2577a0ab67aa2a5284fc9ccab799c28953
Signed-off-by: Pete Birley <pete@port.direct>
Python psutil library has not been consistent in behavior
a. gives trucated process names at times
b. the truncated names sometimes contain path to Python instead
of the program name Python runs
Change-Id: I99b77a4c28761a2187e59be4e562d5893ef3caa9
Python 2 is sunsetting in Jan 2020. We should not be finding python 2
explicitly. This patch removes those calls.
Change-Id: Ie6c9ad77097e662393c5fdd26490ebef25bdc3de
Signed-off-by: Tin Lam <tin@irrational.io>
This PS allows the db connection string for the singular cell that OSH
currently supports to be updated, and also uses the full connection
string for the transport url.
Change-Id: I700133263273e04dad5b3e69d5e1f8255323e560
Signed-off-by: Pete Birley <pete@port.direct>
If the transport url changes, cell needs to be updated to use new
transport.
Change-Id: I1a931b5ce272a731be710c43f3fea08abc79af71
Signed-off-by: Pete Birley <pete@port.direct>
There can be more than one RabbitMQ node in
transport_url in conf file when RabbitMQ is
configured in HA mode.
Change-Id: I9721e2e33212918d402bce295c02b1869dce67f7
This PS updates the charts to use the htk function recently introduced
to allow oslo.messaging clients ans servers to directly hit their
backends rather than using either DNS or K8S svc based routing.
Depends-On: I5150a64bd29fa062e30496c1f2127de138322863
Change-Id: I458b4313c57fc50c8181cedeca9919670487926a
Signed-off-by: Pete Birley <pete@port.direct>
If an image is built with python3 therefore libapache2-mod-wsgi-py3
module have to be installed accordingly but the module doesn't create
/var/run/apache2 directory which is APACHE_RUN_DIR in apache configuration
file so apache can't start without it due to the fact that the directory
is used to make there pid, run, etc files.
Change-Id: Ic92b095e9d7636c3ed833241bd3badbb4bb6e552
Currently there is fake rpc call "pod_health_probe_method_ignore_errors"
that is passed to the service, just to find out if it is responding. Because
such method does not exist, it is needed to catch and handle the exception
that is inevitably thrown by the service.
While this is technically working correctly, the exceptions pollute the
log files and make it harder for user to see possible real errors.
This is how the error looks like:
ERROR oslo_messaging.rpc.server [-] Exception during message handling: oslo_messaging.rpc.dispatcher.UnsupportedVersion: Endpoint does not support RPC version 1.0. Attempted method: pod_health_probe_method_ignore_errors
ERROR oslo_messaging.rpc.server Traceback (most recent call last):
ERROR oslo_messaging.rpc.server File "/var/lib/openstack/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
ERROR oslo_messaging.rpc.server File "/var/lib/openstack/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
ERROR oslo_messaging.rpc.server raise UnsupportedVersion(version, method=method)
ERROR oslo_messaging.rpc.server oslo_messaging.rpc.dispatcher.UnsupportedVersion: Endpoint does not support RPC version 1.0. Attempted method: pod_health_probe_method_ignore_errors
This situation is new since https://review.openstack.org/#/c/639711/
which (correctly) increased the default level of logging. Before 639711
error messages from oslo (both real and ones that could be ignored) were not
present in nova logs at all.
Fortunatelly, nova's BaseAPI class provides 'ping' method that is can
be used for this basic purpose by all nova components.
Change-Id: I0062e74bed399206becb8d9e00f9ec805da864a3
When novnc pod is re-run because of host reboot and so on,
novnc pod has existing volume /tmp/usr/share, which has 0444 permissions.
So init container occurs an error while it tries to copy asset files.
cp: cannot create regular file '/tmp/usr/share/novnc/index.html': Permission denied
With -f option, the init container can copy without errors.
Change-Id: I56d928b7f4a30a6be29b47560357a3b4f5eec764
Signed-off-by: hagun.kim <hagun.kim@samsung.com>
With this patch we allow for a more easy way of overriding some of
the values that may be used in other distros while maintainting the
default values if those values are not overriden
The following values are introduced to be overriden:
conf:
security:
software:
apache2:
conf_dir:
site_dir:
mods_dir
binary:
extra_flags:
a2enmod:
a2dismod:
On which:
* conf_dir: directory where to drop the config files for apache vhosts
* site_dir: directory where to drop the enabled virtualhosts
* mods_dir: directory where to drop any mod configuration
* binary: the binary to use for launching apache
* extra_flags: any flags that will be passed to the apache binary call
* a2enmod: mods to enable
* a2dismod: mods to disable
* security: security configuration for apache
Notice that if there is no overrides given, it should not affect anything
and the templates will not be changed as the default values are set
to what they used to be
Change-Id: I4fcfde78c5c8fa65956aeae55108ffa1f10e6972
Only start the sshd container of nova-compute pod if the capability is
enabled. Defaults to off to allow cases where nova docker image does
not have ssh packages to run cleanly.
Story: 2003463
Task: 30441
Change-Id: I3acf5b654ecda23a93f6c28e865e1bbee14370aa
Signed-off-by: Gerry Kopec <Gerry.Kopec@windriver.com>
- Fix .ssh/config file mapping
- Move private key from nova-compute-ssh container to nova-compute
container.
- Map private and public keys to configmap-ssh which will default to
the appropriate file permissions.
- Add additional config to /etc/ssh/sshd_config to allow passwordless
root logins over appropriate subnet passed in from overrides.
- Remove chmods from sshd bash script as they are failing.
Depends on helm-toolkit supporting multiple containers per daemonset
pod.
Story: 2003463
Task: 24723
Change-Id: Idd2e802c293f1e14991ee787ade9a4936fb373ff
Signed-off-by: Gerry Kopec <Gerry.Kopec@windriver.com>
On some services it looks like the parent pid does not connect to
rabbitmq and its the children the ones that do instead, for example
in nova-scheduler from rocky version onwards.
The current health check only checks for the main parent pid to see
if it has an active connection to the rabbitmq port.
This patch adds a flag to allow the health probe to check all processes
for the mysql/rabbit connection instead of skipping any children process.
It also enables it by default for nova-scheduler as it wont affect older versions
than only run 1 process, but will work on later versions where
the main process forks.
Change-Id: I9677fd2aff11b563ab18059927ca12d5ace107ce
Under python3 an Exception no longer has the message attribute,
instead you can just str the exception to get the error message
Change-Id: Ibf88ae6b73f3bafcc2b99bb01e31bf8c25021e47
Health probe for Nova pods is used for both liveness
and readiness probe.
nova-compute, nova-conductor, nova-consoleauth and nova-scheduler:
Check if the rpc socket status on the nova pods to rabbitmq and
database are in established state.
sends an RPC call with a non-existence method to component's queue.
Probe is success if agent returns with NoSuchMethod error.
If agent is not reachable or fails to respond in time,
returns failure to probe.
novnc/spice proxy: uses Kubernetes tcp probe on corresponding ports
they expose.
Added code to catch nova config file not present exception.
Change-Id: Ib8e4b93486588320fd2d562c3bc90b65844e52e5
This patch set host_interface for update host_ip information in compute
node.
Currently helm chart defines the value of my_ip set "0.0.0.0",
therefore host_ip of compute node is null.
$ nova hypervisor-show {uuid}
+---------------------------+------------------------------------------+
| Property | Value |
+---------------------------+------------------------------------------+
| cpu_info_arch | x86_64 |
.
.
| host_ip | None |
Through this patch, OpenStack can provide appropriate values for
the required field.
Change-Id: I05f929cb2c777582c177e8c7a64b9fd431d554ec
The update makes sure the Openstack service's cephx
user capabilities match best practices in terms of
security permissions after a site or software update.
Change-Id: I70e7f620accb186da2013ba95472777c25739cc1
Some environment which is enabled zeroconf network, it returns
multilines. This PS make to get the one ip address correctly.
Change-Id: I577f02908b76b280d8fa87acec25d96c3f556e47
This option is useful in environments where the live-migration traffic
can impact the network plane significantly.
A separate network for live-migration traffic can then use this config
option and avoids the impact on the management network.
Change-Id: Id16c95e77730e5b244cf5bc69beb0e549c979701
This PS adds a cron job that reaps dead services from Nova.
Change-Id: I59e74c7520b0341d7cb7ebddd4c21e459e9c2049
Signed-off-by: Pete Birley <pete@port.direct>
nova-console-compute-init.sh and nove_consolt-proxy-init.sh generates
incorrect configuration if there exists multiple IP address on default
interface.
To solve this problem, we pickup first IP address if there exists
multiple IP on that interface.
Change-Id: Iaadd2e71d624122e68fdd628771df21cd61c0784