We will try to import only the modules that have a corresponding configuration section.
I.e. if there is only a zookeeper section in the config, Patroni will try to import only `patroni.dcs.zookeeper` and skip `etcd`, `consul`, and `kubernetes`.
This approach has two benefits:
1. When the corresponding dependencies were not installed, Patroni was showing INFO messages like `Failed to import smth`, which looked scary.
2. It reduces memory usage, because some of these dependencies are heavy.
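A minimal sketch of how such on-demand importing could look (the helper function is hypothetical; only the module names mirror `patroni.dcs.*`):

```python
import importlib
import logging

logger = logging.getLogger(__name__)

DCS_MODULES = ('consul', 'etcd', 'kubernetes', 'zookeeper')


def load_dcs_modules(config):
    """Import only the patroni.dcs submodules that have a configuration section."""
    loaded = {}
    for name in DCS_MODULES:
        if name not in config:
            continue  # no config section -- don't even try to import it
        try:
            loaded[name] = importlib.import_module('patroni.dcs.' + name)
        except ImportError as exc:
            # The message is now only shown for modules the user actually configured.
            logger.info('Failed to import patroni.dcs.%s: %r', name, exc)
    return loaded
```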
* Implement proper tests for `multiprocessing.set_start_method()`
* Exclude some watchdog code from coverage (it is used only for behave tests)
* Properly use `os.path.join()` for Windows compatibility
* Import DCS modules in `features/environment.py` on demand. This allows running behave tests against the chosen DCS without installing all dependencies.
* Remove some unused behave code
* Fix some minor issues in the `dcs.kubernetes` module
There is an opinion that LIST requests with a labelSelector are expensive for the K8s API, and Patroni was doing two such requests per HA loop (LIST pods and LIST endpoints/configmaps).
To efficiently detect object changes we will switch to the LIST+WATCH approach.
The initial LIST request populates the ObjectCache and events from the WATCH request update it.
In addition to that, the ObjectCache is also updated after performing UPDATE operations on the K8s objects. To avoid race conditions, every write into the ObjectCache compares the resource_version of the old and the new object and is rejected if the new resource_version value is smaller than the old one.
The disadvantage of such an approach is that it requires keeping three connections to the K8s API open from each Patroni pod (previously it was two).
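A rough sketch of this staleness check (class and attribute names here are illustrative, not the actual Patroni implementation):

```python
import threading


class ObjectCache:
    """Cache of K8s objects populated by LIST and kept up to date by WATCH and UPDATE results."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = {}

    def set(self, name, new_obj):
        """Store new_obj unless the cache already holds a newer resource_version."""
        with self._lock:
            old_obj = self._objects.get(name)
            # Reject stale updates: a smaller resource_version means an older state of
            # the same object, e.g. a WATCH event that arrived after our own UPDATE.
            if old_obj and int(new_obj.metadata.resource_version) < int(old_obj.metadata.resource_version):
                return False
            self._objects[name] = new_obj
            return True
```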
Yesterday I deployed this feature branch on our biggest K8s cluster, with ~300 Patroni pods.
The CPU utilization on the K8s master nodes immediately dropped from ~20% to ~10% (i.e. halved), and the incoming traffic on the master nodes dropped ~7-8 times!
Last but not least, we see more or less the same impact on the etcd cluster behind the K8s master nodes: CPU utilization dropped nearly by half and outgoing traffic ~7-8 times.
Starting from PostgreSQL 12 the following recovery parameters can be changed without a restart, but Patroni didn't support that yet:
* archive_cleanup_command
* promote_trigger_file
* recovery_end_command
* recovery_min_apply_delay
In future postgres releases this list will be extended and Patroni will support it automatically.
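One way to keep this future-proof is to ask postgres itself whether a parameter is reloadable: the `context` column of `pg_settings` says whether a change needs a restart. A hedged sketch (the helper name is made up, and `psycopg2` is an assumption):

```python
import psycopg2


def requires_restart(conn, name):
    """Return True if changing the given parameter requires a postmaster restart."""
    with conn.cursor() as cur:
        cur.execute("SELECT context FROM pg_settings WHERE name = %s", (name,))
        row = cur.fetchone()
        # Only 'postmaster' parameters need a restart; 'sighup', 'superuser',
        # 'user', etc. can be applied with a reload.
        return row is not None and row[0] == 'postmaster'


# On PostgreSQL 12+ this should return False for the parameters listed above,
# e.g. recovery_min_apply_delay (on older versions they are not GUCs at all).
conn = psycopg2.connect('dbname=postgres')
print(requires_restart(conn, 'recovery_min_apply_delay'))
```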
The start of postgres happens in two stages:
1. First, Patroni waits for the postgres port to be open
2. After that, it waits for postgres to start accepting connections
There is a default timeout of 60 seconds for both stages (in total).
When the port isn't open, pg_isready exits with code=2.
If postgres is rejecting connections due to recovery, exit code=1.
In most cases postgres quickly opens the port and pg_isready starts returning 1, but in rare cases the whole timeout could be spent in stage 1.
After that, the HA loop keeps waiting for postgres to start, but it executes only the check from stage 2. Since the pg_isready exit code is still 2, Patroni was falsely assuming that the start failed, without taking into account that the postmaster process is up and running.
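For reference, the mapping of pg_isready exit codes to the two stages could look roughly like this (a simplified sketch, not the actual Patroni check; 0 = accepting connections, 1 = rejecting connections, e.g. still in recovery, 2 = no response):

```python
import subprocess
import time


def wait_for_postgres(host, port, timeout=60):
    """Wait for postgres to accept connections, distinguishing the two start stages."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        rc = subprocess.call(['pg_isready', '-h', host, '-p', str(port)])
        if rc == 0:  # stage 2 finished: postgres accepts connections
            return True
        # rc == 1: the port is open but connections are rejected (stage 1 finished).
        # rc == 2: the port is not open yet -- on its own this must not be treated
        # as "start failed", because the postmaster process may still be starting.
        time.sleep(1)
    return False
```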
Fixes https://github.com/zalando/patroni/issues/1160
The previous documentation was wrong and produced the following error when used:
Exception when parsing list {[{"name": "postgresql", "port": 5432}]}
After removing the surrounding braces, the error goes away and the endpoint is updated with the correct port name.
Previously the check_recovery_conf() function was only checking whether primary_conninfo had changed, never taking other recovery parameters into account.
Fixes https://github.com/zalando/patroni/issues/1201
Not doing so makes it hard to implement callbacks in bash and can eventually lead to a situation where two callbacks are running at the same time. If we fail to kill the child process, we still wait for it to finish.
The same problem could happen with a custom bootstrap; therefore, if we have to kill the custom bootstrap process, we also kill all of its child subprocesses.
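The underlying idea can be sketched with process groups (a POSIX-only sketch, not the actual Patroni code): start the script in its own session, so that killing the group also takes down any children a bash callback may have spawned.

```python
import os
import signal
import subprocess


def run_callback(cmd, timeout):
    """Run a callback in its own process group and kill the whole group on timeout."""
    # start_new_session=True puts the child into a new session/process group,
    # so a bash script and all of its children can be terminated together.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except OSError:
            pass  # even if the kill failed, we still wait for the child to finish
        return proc.wait()
```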
Closes https://github.com/zalando/patroni/issues/1238
When the system is under IO stress, `os.listdir()` could take a few seconds (or even minutes) to execute, which badly affects the HA loop of Patroni and could even cause the leader key to disappear from DCS due to the lack of updates.
There is a better and less expensive way to check that PGDATA is not empty: instead of doing the `os.listdir()` we simply check for the presence of the `global/pg_control` file in it.
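Something along these lines (a sketch, not the exact Patroni code):

```python
import os


def data_directory_empty(data_dir):
    """Treat PGDATA as empty unless it contains the global/pg_control file."""
    # A single stat() of a well-known file instead of listing a potentially
    # huge directory on a system under IO stress.
    return not os.path.isfile(os.path.join(data_dir, 'global', 'pg_control'))
```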
Recently it has happened twice that people tried to deploy a new cluster while the postgres data directory was neither empty nor valid. In this case Patroni was still creating the initialize key in DCS and trying to start postgres.
Now it will complain about a non-empty, invalid postgres data directory and exit.
Close https://github.com/zalando/patroni/issues/1216
The /history endpoint shows the content of the `history` key in DCS.
The /cluster endpoint shows all cluster members and some service info, like pending and scheduled restarts or switchovers.
In addition to that, implement `patronictl history`.
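The new endpoints can be queried like any other Patroni REST API endpoint, for example (host and port are illustrative):

```python
import json
import urllib.request

# Hypothetical Patroni REST API address
base_url = 'http://localhost:8008'

for endpoint in ('/cluster', '/history'):
    with urllib.request.urlopen(base_url + endpoint) as resp:
        print(endpoint, json.dumps(json.load(resp), indent=2))
```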
Close #586, #675, #1133
In addition to that, try to protect against the case when some recovery parameters are set in one of the included config files, by explicitly setting their values to an empty string on postgres 12.
Simplifies https://github.com/zalando/patroni/pull/1208
Specifically, there was a chance that `patronictl reinit --force` was overridden by recover, and we could end up in a situation where Patroni was trying to start postgres while the basebackup was still running.
* make it possible to use client certificates with REST API
* define a separate PatroniRequest class which handles all communication
* refactor patronictl to use the new class
* make Ha use the new class instead of calling requests.get. The old call wasn't taking certificates and basic-auth into account
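A rough sketch of the idea behind such a class, built directly on urllib3 (the config keys and class internals shown here are assumptions, not the actual Patroni API):

```python
import urllib3


class PatroniRequest:
    """Single place for REST API communication, so certificates and basic-auth are always applied."""

    def __init__(self, config):
        rest = config.get('restapi', {})
        self._pool = urllib3.PoolManager(
            cert_file=rest.get('certfile'),  # client certificate
            key_file=rest.get('keyfile'),
            ca_certs=rest.get('cafile'),
            cert_reqs='CERT_REQUIRED' if rest.get('cafile') else 'CERT_NONE',
        )
        auth = rest.get('authentication') or {}
        self._headers = urllib3.make_headers(
            basic_auth='{0}:{1}'.format(auth['username'], auth['password'])) if auth else {}

    def get(self, url):
        return self._pool.request('GET', url, headers=self._headers)
```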
Close#898
It is possible that some config files are not controlled by Patroni, and when somebody does a reload via the REST API or by sending SIGHUP to the Patroni process, the usual expectation is that postgres will also be reloaded. But this didn't happen when there were no changes in the postgresql section of the Patroni config.
For example, one might replace ssl_cert_file and ssl_key_file on the filesystem; starting from PostgreSQL 10 this just requires a reload, but Patroni wasn't doing it.
In addition to that, fix the issue with handling of `wal_buffers`: the default value depends on `shared_buffers` and `wal_segment_size`, and therefore Patroni was exposing pending_restart even when the new value in the config was explicitly set to -1 (the default).
Close https://github.com/zalando/patroni/issues/1198
Watch requests to the K8s API either stream data or close the connection on a timeout. In any case it requires a second connection to be open, but opening a new connection every 10 seconds is more expensive for both Patroni and the K8s API.
Switching to the streaming model also brings other benefits: we can watch not only the leader object, but also the config, and wake up the Patroni main thread if the config has changed.
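Conceptually, the streaming watch looks like this (a sketch using `requests` against the raw K8s API, here via a local `kubectl proxy`; the URL and label selector are illustrative):

```python
import json
import requests

# One long-lived watch connection instead of a new LIST request every ~10 seconds.
# `watch=true` makes the API server stream newline-delimited JSON events until it
# decides to close the connection on its own timeout.
url = 'http://localhost:8001/api/v1/namespaces/default/endpoints'
params = {'labelSelector': 'application=patroni', 'watch': 'true'}
with requests.get(url, params=params, stream=True, timeout=(5, 65)) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        event = json.loads(line)
        # event['type'] is ADDED/MODIFIED/DELETED, event['object'] is the new state
        print(event['type'], event['object']['metadata']['name'])
```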
The PatroniLogger object is instantiated in the Patroni constructor, and down the road there might be a fatal error causing the Patroni process to exit, but a live thread prevents the normal shutdown.
In order to mitigate the issue and not lose the ability to use the logging infrastructure, we will switch to the QueueLogger only when the thread was explicitly started from the Patroni.run() method.
Continuation of https://github.com/zalando/patroni/pull/1178
Since it is based on a Thread with daemon set to True, the shutdown of the logger was very likely to happen too early, which was causing some lines not to appear at the destination.
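The standard-library building blocks behind such a two-phase setup look roughly like this (a simplified sketch, not the PatroniLogger implementation):

```python
import logging
import queue
from logging.handlers import QueueHandler, QueueListener

log_queue = queue.Queue()
root = logging.getLogger()

# Until the background thread is explicitly started, records go straight to the
# stream handler, so nothing is lost if the process exits before Patroni.run().
stream_handler = logging.StreamHandler()
root.addHandler(stream_handler)


def start_queue_logging():
    """Switch to queue-based logging only once the worker thread is really running."""
    listener = QueueListener(log_queue, stream_handler)
    listener.start()
    root.handlers[:] = [QueueHandler(log_queue)]
    return listener  # listener.stop() flushes the queue during a normal shutdown
```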
Close https://github.com/zalando/patroni/issues/1173