Currently there is fake rpc call "pod_health_probe_method_ignore_errors"
that is passed to the service, just to find out if it is responding. Because
such method does not exist, it is needed to catch and handle the exception
that is inevitably thrown by the service.
While this is technically working correctly, the exceptions pollute the
log files and make it harder for user to see possible real errors.
This is how the error looks like:
ERROR oslo_messaging.rpc.server [-] Exception during message handling: oslo_messaging.rpc.dispatcher.UnsupportedVersion: Endpoint does not support RPC version 1.0. Attempted method: pod_health_probe_method_ignore_errors
ERROR oslo_messaging.rpc.server Traceback (most recent call last):
ERROR oslo_messaging.rpc.server File "/var/lib/openstack/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
ERROR oslo_messaging.rpc.server File "/var/lib/openstack/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
ERROR oslo_messaging.rpc.server raise UnsupportedVersion(version, method=method)
ERROR oslo_messaging.rpc.server oslo_messaging.rpc.dispatcher.UnsupportedVersion: Endpoint does not support RPC version 1.0. Attempted method: pod_health_probe_method_ignore_errors
This situation is new since https://review.openstack.org/#/c/639711/
which (correctly) increased the default level of logging. Before 639711
error messages from oslo (both real and ones that could be ignored) were not
present in nova logs at all.
Fortunatelly, nova's BaseAPI class provides 'ping' method that is can
be used for this basic purpose by all nova components.
Change-Id: I0062e74bed399206becb8d9e00f9ec805da864a3
This PS adds emptydirs backing the /tmp directory in pods, which
is required in most cases for full operation when using a read only
filesystem backing the container.
Additionally some yaml indent issues are resolved.
Change-Id: I9df8f70e913b911ff755600fa2f669d9c5dcb928
Signed-off-by: Pete Birley <pete@port.direct>
When novnc pod is re-run because of host reboot and so on,
novnc pod has existing volume /tmp/usr/share, which has 0444 permissions.
So init container occurs an error while it tries to copy asset files.
cp: cannot create regular file '/tmp/usr/share/novnc/index.html': Permission denied
With -f option, the init container can copy without errors.
Change-Id: I56d928b7f4a30a6be29b47560357a3b4f5eec764
Signed-off-by: hagun.kim <hagun.kim@samsung.com>
There is currently no testing of the Leap 15 images in OSH.
This addresses it by:
- Using the values_overrides folder according to the multi-os
spec, creating value override files there for changes that
needs to happen on Leap 15 images.
- Point to the right images using the previously created folder,
to allow using those in CI easily.
- Change CI to use previously created overrides.
Depends-On: https://review.openstack.org/#/c/651501
Change-Id: I520d3676195c62b253a19397c86b0d0fbabee710
With this patch we allow for a more easy way of overriding some of
the values that may be used in other distros while maintainting the
default values if those values are not overriden
The following values are introduced to be overriden:
conf:
security:
software:
apache2:
conf_dir:
site_dir:
mods_dir
binary:
extra_flags:
a2enmod:
a2dismod:
On which:
* conf_dir: directory where to drop the config files for apache vhosts
* site_dir: directory where to drop the enabled virtualhosts
* mods_dir: directory where to drop any mod configuration
* binary: the binary to use for launching apache
* extra_flags: any flags that will be passed to the apache binary call
* a2enmod: mods to enable
* a2dismod: mods to disable
* security: security configuration for apache
Notice that if there is no overrides given, it should not affect anything
and the templates will not be changed as the default values are set
to what they used to be
Change-Id: I4fcfde78c5c8fa65956aeae55108ffa1f10e6972
This change adds the keystonemiddleware audit paste filter[0]
and enables it for the nova-api services.
This provides the ability to audit API requests for nova.
[0] https://docs.openstack.org/keystonemiddleware/latest/audit.html
Change-Id: Ic6df044d83f4dee581c9cc0405f61d926e45bcab
Using {{- if for the volume mounts caused them to be added inline with
the previous line.
Removing the - from the if expression makes them be properly aligned on
the next line
Change-Id: Ia5e28366fb1f2ae7420b7f5217c10cbb94bc48ab
Only start the sshd container of nova-compute pod if the capability is
enabled. Defaults to off to allow cases where nova docker image does
not have ssh packages to run cleanly.
Story: 2003463
Task: 30441
Change-Id: I3acf5b654ecda23a93f6c28e865e1bbee14370aa
Signed-off-by: Gerry Kopec <Gerry.Kopec@windriver.com>
- Fix .ssh/config file mapping
- Move private key from nova-compute-ssh container to nova-compute
container.
- Map private and public keys to configmap-ssh which will default to
the appropriate file permissions.
- Add additional config to /etc/ssh/sshd_config to allow passwordless
root logins over appropriate subnet passed in from overrides.
- Remove chmods from sshd bash script as they are failing.
Depends on helm-toolkit supporting multiple containers per daemonset
pod.
Story: 2003463
Task: 24723
Change-Id: Idd2e802c293f1e14991ee787ade9a4936fb373ff
Signed-off-by: Gerry Kopec <Gerry.Kopec@windriver.com>
This adds a nonvoting apparmor check job to openstack-helm, which
allows for the removal of default apparmor profiles from the nova
chart. This job also includes overrides for using the default
docker apparmor profile for the neutron chart
Change-Id: I8f407f24b7f10c5d7cf10f21f73671f7e6c72767
without the dependencies in the values.yaml, the role and rolebinding will
not be created by helm-toolkit as it uses those to create and generate the
role/rolebinding for the accounts
Change-Id: I711d5fc4a2a376a29daf526fc420790ea9cacf25
Currently there are issues with using the memcache_pool backend as
the memcache driver for nova under python3[0][1] which doesnt seem
like they have a quick fix or something that is backportable to
rocky
This moves the default cache from oslo_cache.memcache_pool to
dogpile.cache.memcached so we can move forward with python3
enabled images.
[0] https://bugs.launchpad.net/cloud-archive/+bug/1812672
[1] https://bugs.launchpad.net/oslo.cache/+bug/1812935
Change-Id: I65a4770c374357a8e1c80d904bcd4af36217448f
This PS tells nova to make rabbitmq queues ha when available.
Change-Id: I965d18ea5d5cdf5ab54bb33c6a46b4a92e039c5e
Signed-off-by: Pete Birley <pete@port.direct>
On some services it looks like the parent pid does not connect to
rabbitmq and its the children the ones that do instead, for example
in nova-scheduler from rocky version onwards.
The current health check only checks for the main parent pid to see
if it has an active connection to the rabbitmq port.
This patch adds a flag to allow the health probe to check all processes
for the mysql/rabbit connection instead of skipping any children process.
It also enables it by default for nova-scheduler as it wont affect older versions
than only run 1 process, but will work on later versions where
the main process forks.
Change-Id: I9677fd2aff11b563ab18059927ca12d5ace107ce
Under python3 an Exception no longer has the message attribute,
instead you can just str the exception to get the error message
Change-Id: Ibf88ae6b73f3bafcc2b99bb01e31bf8c25021e47
If user wants to add an extra volumeMounts/volume to a pod,
amd uses override values e.g. like this
pod:
mounts:
nova_placement:
init_container: null
nova_placement:
volumeMounts:
- name: nova-etc
...
helm template parser complains with
Warning: The destination item 'nova_placement' is a table and ignoring the source 'nova_placement' as it has a non-table value of: <nil>
So when we create empty values for such keys in values.yaml, the source
will be present and warning does not need to be shown.
Change-Id: Ib8dc53c3a54e12014025de8fafe16fbe9721c0da
Health probe for Nova pods is used for both liveness
and readiness probe.
nova-compute, nova-conductor, nova-consoleauth and nova-scheduler:
Check if the rpc socket status on the nova pods to rabbitmq and
database are in established state.
sends an RPC call with a non-existence method to component's queue.
Probe is success if agent returns with NoSuchMethod error.
If agent is not reachable or fails to respond in time,
returns failure to probe.
novnc/spice proxy: uses Kubernetes tcp probe on corresponding ports
they expose.
Added code to catch nova config file not present exception.
Change-Id: Ib8e4b93486588320fd2d562c3bc90b65844e52e5
The current helm chart defaults drops logs of any warnings
(and above) for any logger outside of the namespace
of the deployed chart.
This is a problem, as logging could reveal important information for
operators. While this could be done with a value override, there
is no reason to hide warning, errors, or critical information that
are happening in the cycle of the operation of the software
deployed with the helm charts. For example, nothing would get
logged in oslo_service, which is a very important part of running
OpenStack.
This fixes it by logging to stdout all the warnings (and above)
for OpenStack apps.
Change-Id: I16f77f4cc64caf21b21c8519e6da34eaf5d31498
the defaults in Python [0] and oslo.log [1] are such that when using
separate config file for logging configuration (log-config-append)
the log fomat of dates containes miliseconds twice (as in sec,ms.ms)
which is exactly what is currently seen in logs of OpenStack services
deployed by openstack-helm.
When not provided with datefmt log formatter option, Python effectively
uses '%Y-%m-%d %H:%M:%S,%f' [0] as a default time formatting string to
render `%(asctime)s`, but the defaults in oslo.log add another `.%f`
to it [1].
Since `log-date-format` oslo.log option has no effect when using
log-config-append, we need to explicitly set date format to avoid double
miliseconds rendering in date of log entries.
[0] 6ee41793d2/Lib/logging/__init__.py (L427-L428)
[1] http://git.openstack.org/cgit/openstack/oslo.log/tree/oslo_log/_options.py?id=7c5f8362b26313217b6c248e77be3dc8e2ef74a5#n148
Change-Id: I47aa7ce96770d94b905b56d6fe4abad428f01047
This patch set adds "startingDeadlineSeconds" field to cronJobs.
When the field is not set, the controller counts how many missed
jobs occured from the last scheduled time till now. And if it happends
more than 100 time the job will not be scheduled. To avoid this
the "startingDeadlineSeconds" field should be set to sufficient period
of time. In this case the controller counts how many missed jobs occured
during this period of time. The value of the field should be less than
time (in seconds) needed for running >100 jobs (according to schedule).
Change-Id: I3bf7c7077b55ca5a3421052bd0b59b70c9bbcf24
This adds the release-uuid annotation to the pod spec for all
replication controller templates in the openstack-helm charts
Change-Id: I0159f2741c27277fd173208e7169ff657bb33e57
This removes the NovaImages.list_images test from the rally
tests defined in the nova chart, as the updated rally version
seemingly doesn't include this test. This caused the multinode
periodic job to fail.
See: http://zuul.openstack.org/build/9628003399d640e683945260d9738ade
Change-Id: I9515fc3fee192ee6636e85a745071f93ff86c051
This patch set host_interface for update host_ip information in compute
node.
Currently helm chart defines the value of my_ip set "0.0.0.0",
therefore host_ip of compute node is null.
$ nova hypervisor-show {uuid}
+---------------------------+------------------------------------------+
| Property | Value |
+---------------------------+------------------------------------------+
| cpu_info_arch | x86_64 |
.
.
| host_ip | None |
Through this patch, OpenStack can provide appropriate values for
the required field.
Change-Id: I05f929cb2c777582c177e8c7a64b9fd431d554ec
This patchset enables and moves the securityContext: runAsUser to the pod
level, and uses a non-root user (UID != 0) wherever applicable.
Depends-On: I95264c933b51e2a8e38f63faa1e239bb3c1ebfda
Change-Id: I81f6e11fe31ab7333a3805399b2e5326ec1e06a7
Signed-off-by: Tin Lam <tin@irrational.io>