patroni

mirror of https://github.com/outbackdingo/patroni.git synced 2026-01-27 18:20:05 +00:00

Author	SHA1	Message	Date
Alexander Kukushkin	96b75fa7cb	Special handling of check_recovery_conf for v12+ (#2292 ) When starting as a replica it may take some time before Postgres starts accepting new connections, but meanwhile, it could happen that the leader transitioned to a different member and the `primary_conninfo` must be updated. On pre v12 Patroni regularly checks `recovery.conf` in order to check that recovery parameters match the expectation. Starting from v12 recovery parameters were converted to GUC's and Patroni gets current values from the `pg_settings` view. The last one creates a problem when it takes more than a minute for Postgres to start accepting new connections. Since Patroni attempts to execute at least `pg_is_in_recovery()` every HA loop, and it is raising an exception, the `check_recovery_conf()` effectively wasn't reachable until recovery is finished, but it changed when #2082 was introduced. As a result of #2082 we got the following behavior: 1. Up to v12 (not including) everything was working as expected 2. v12 and v13 - Patroni restarting Postgres after 1m of recovery 3. v14+ - the `check_recovery_conf()` is not executed because the `replay_paused()` method raising an exception. In order to properly handle changes of recovery parameters or leader transitioned to a different node on v12+, we will rely on the cached values of recovery parameters until Postgres becomes ready to execute queries. Close https://github.com/zalando/patroni/issues/2289	2022-05-12 07:45:49 +02:00
Michael Banck	2d15e0dae6	Add target_session_attrs=read-write to standby_leader primary_conninfo (#2193 ) This allows to have multiple hosts in a standby_cluster and ensures that the standby leader follows the main cluster's new leader after a switchover. Partially addresses #2189	2022-02-10 15:50:14 +01:00
Alexander Kukushkin	fce889cd04	Compatibility with psycopg 3.0 (#2088 ) By default `psycopg2` is preferred. The `psycopg>=3.0` will be used only if `psycopg2` is not available or its version is too old.	2021-11-19 14:32:54 +01:00
Alexander Kukushkin	250328b84b	Use cached role as a fallback when postgres is slow (#2082 ) In some extreme cases Postgres could be so slow that the normal monitoring query doesn't finish in a few seconds. It results in the exception being raised from the `Postgresql._cluster_info_state_get()` method, which could lead to the situation that postgres isn't demoted on time. In order to make it reliable we will catch the exception and use the cached state of postgres (`is_running()` and `role`) to determine whether postgres is running as a primary. Close https://github.com/zalando/patroni/issues/2073	2021-10-07 16:08:21 +02:00
Alexander Kukushkin	d394b63c9f	Release the leader lock when pg_controldata reports "shut down" (#2067 ) Due to different reasons, it could happen that WAL archiving on the primary stuck or significantly delayed. If we try to do a switchover or shut it down, the shutdown will take forever and will not finish until the whole backlog of WALs is processed. In the meantime, Patroni keeps updating the leader lock, which prevents other nodes from starting the leader race even if it is known that they received/applied all changes. The `Database cluster state:` is changed to `"shut down"` after: - all data is fsynced to disk and the latest checkpoint is written to WAL - all streaming replicas confirmed that they received all changes (including the latest checkpoint) - at the same time, the archiver process continues to do its job and the postmaster process is still running. In order to solve this problem and make the switchover more reliable/fast in a case when `archive_command` is slow/failing, Patroni will remove the leader key immediately after `pg_controldata` started reporting PGDATA as `"shut down"` cleanly and it verified that there is at least one replica that received all changes. If there are no replicas that fulfill the condition the leader key isn't removed and the old behavior is retained, i.e. Patroni will keep updating it.	2021-10-05 10:55:35 +02:00
Alexander Kukushkin	0ceb59b49d	Write prev LSN to before checkpoint to optime if wal_achive=on (#1889 ) The #1527 introduced a feature of updating `/optime/leader` with the location of the last checkpoint after the Postgres was shutdown cleanly. If wal archiving is enabled, Postgres always switching the WAL file before writing the checkpoint shutdown record. Normally it is not an issue, but for databases without too much write activity it could lead to the situation that the visible replication lag becomes equal to the size of a single WAL file. In fact, the previous WAL file is mostly empty and contains only a few records. Therefore it should be safe to report the LSN of the SWITCH record before the shutdown checkpoint. In order to do that, Patroni first gets the output of the pg_controldata and based on it calls pg_waldump two times: * The first call reads the checkpoint record (and verifies that this is really the shutdown checkpoint). * The next call reads the previous record and in case if it is the 'xlog switch' (for 9.3 and 9.4) or 'SWITCH' (for 9.5+), the LSN of the SWITCH record is written to the `/optime/leader`. In case of any mismatch, failure to call pg_waldump or parse its output, the old behavior is retained, i.e. `Latest checkpoint location` from the pg_controldata is used. Close https://github.com/zalando/patroni/issues/1860	2021-07-05 09:29:39 +02:00
Alexander Kukushkin	6616acff58	Postpone writing postgresql.conf when joining running Postgres 12+ (#1956 ) When joining already running Postgres, Patroni ensures that config files are set according to expectations. With recovery parameters converted to GUCs in Postgres v12 it became a little problem, because when the `Postgresql` object is being created it is not yet known where the given replica is supposed to stream from. It resulted in postgresql.conf first being written without recovery parameters, and on the next run of HA loop Patroni noticing inconsistencies and updating the config one more time. For Postgres v12 it is not a big issue, but for v13+ it resulted in interruption of streaming replication.	2021-06-30 09:11:12 +02:00
Alexander Kukushkin	f3420e2db5	Compatibility with PostgreSQL 14 (#1926 ) PostgreSQL 14 changed the behavior of replicas when certain parameters (like for example `max_connections`) are changed (increased): https://github.com/postgres/postgres/commit/15251c0a. Instead of immediately exiting Postgres 14 pauses replication and waits for actions from the operator. Since the `pg_is_wal_replay_paused()` returning `True` is the only indicator of such a change, Patroni on the replica will call the `pg_wal_replay_resume()`, which would cause either continue replication or shutdown (like previously). So far Patroni was never calling `pg_wal_replay_resume()` on its own, therefore, to remain backward compatible it will call it only for PostgreSQL 14+.	2021-06-25 13:41:45 +02:00
Alexander Kukushkin	eaa98e71e3	Fix bug with unix socket connections (#1933 ) When the unix_socket_directories is not known Patroni was immediately going back to tcp connection via the localhost. The bug was introduced in https://github.com/zalando/patroni/pull/1865	2021-05-10 09:53:25 +02:00
melrifa	6d6b504cb8	Add support for patroni replication user socket connection (#1865 ) Close #1866	2021-04-20 09:43:05 +02:00
Alexander Kukushkin	9edbe7e3f7	Fix little issues with custom bootstrap (#1891 ) 1. Set hot_standby=off only when we do PITR 2. Restart postgres after PITR is done to avoid warnings 3. Address invalid config issue https://github.com/zalando/patroni/issues/1870#issuecomment-800088643	2021-03-29 08:06:12 +02:00
Alexander Kukushkin	c7173aadd7	Failover logical slots (#1820 ) Effectively, this PR consists of a few changes: 1. The easy part: In case of permanent logical slots are defined in the global configuration, Patroni on the primary will not only create them, but also periodically update DCS with the current values of `confirmed_flush_lsn` for all these slots. In order to reduce the number of interactions with DCS the new `/status` key was introduced. It will contain the json object with `optime` and `slots` keys. For backward compatibility the `/optime/leader` will be updated if there are members with old Patroni in the cluster. 2. The tricky part: On replicas that are eligible for a failover, Patroni creates the logical replication slot by copying the slot file from the primary and restarting the replica. In order to copy the slot file Patroni opens a connection to the primary with `rewind` or `superuser` credentials and calls `pg_read_binary_file()` function. When the logical slot already exists on the replica Patroni periodically calls `pg_replication_slot_advance()` function, which allows moving the slot forward. 3. Additional requirements: In order to ensure that primary doesn't cleanup tuples from pg_catalog that are required for logical decoding, Patroni enables `hot_standby_feedback` on replicas with logical slots and on cascading replicas if they are used for streaming by replicas with logical slots. 4. When logical slots are copied to the replica there is a timeframe when it could be not safe to use them after promotion. Right now there is no protection from promoting such a replica. But, Patroni will show the warning with names of the slots that might be not safe to use. Compatibility. The `pg_replication_slot_advance()` function is only available starting from PostgreSQL 11. For older Postgres versions Patroni will refuse to create the logical slot on the primary. The old "permanent slots" feature, which creates logical slots right after promotion and before allowing connections, was removed. Close: https://github.com/zalando/patroni/issues/1749	2021-03-25 16:18:23 +01:00
Mark Mercado	09f2f579d7	Quick attempt at Prometheus (#1848 ) Close https://github.com/zalando/patroni/issues/318	2021-03-04 12:37:29 +01:00
krishna	b3dc765e6d	Choose synchronous nodes based on replication lag (#1786 ) This commit makes it possible to configure the maximum lag (`maximum_lag_on_syncnode`) after which Patroni will "demote" the node from synchronous and replace it with another node. The previous implementation always tried to stick to the same synchronous nodes (even if they are not optimal ones).	2021-02-02 15:45:02 +01:00
Alexander Kukushkin	89a15a2df4	Fix small issues with ignore-slots feature (#1797 ) When there is no config key in DCS Patroni shouldn't try accessing ignore_slots, otherwise an exception is raised. In addition to that implement missing unit-tests and fix linting issues in behave tests.	2020-12-16 18:10:12 +01:00
Alexander Kukushkin	0a1f389686	Release 2.0.0 (#1680 ) * update release notes * bump version * change the default alignment in patronictl table output to `left` * add missing tests * add missing pieces to the documentation	2020-09-02 15:35:04 +02:00
Alexander Kukushkin	13e24d832d	Advanced validation of PostgreSQL parameters (#1674 ) So far Patroni was performing a comparison of the old value (in the `pg_settings`) with the new value (from Patroni configuration or from DCS) in order to figure out if reload or restart is required when the parameter has been changed. If the given parameter was missing in the `pg_settings` Patroni was ignoring it and not writing into the `postgresql.conf`. In case if Postgres is not running, no validation has been performed and parameters and values were written into the config as it is. It is not a very common mistake, but people tend to mistype parameter names or values. Also, it happens that some parameters are removed in specific Postgres versions and some new are added (e.g. `checkpoint_segments` replaced with `min_wal_size` and `max_wal_size` in 9.5 or `wal_keep_segments` was replaced with `wal_keep_size` in 13). Writing nonexistent parameters or invalid values into the `postgresql.conf` makes postgres unstartable. This change doesn't solve the issue 100%, but at least approaching this goal very close.	2020-09-01 16:26:57 +02:00
Sergey Dudoladov	950eff27ad	Optional fencing script (pre_promote) (#1099 ) Call a fencing script after acquiring the leader lock. If the script didn't finish successfully - don't promote but remove leader key Close https://github.com/zalando/patroni/issues/1567	2020-09-01 07:50:39 +02:00
Feike Steenbergen	e3bc546dd5	Move WAL and tablespaces after a failed init (#1631 ) For init processes that use a symlinked WAL directory, or use custom scripts that create new tablespaces, these directories should also be renamed after a failed init attempt, as currently the following errors occur if the first init attempt failed, but a second one might succeed: fixing permissions on existing directory /var/lib/postgresql/data ... ok initdb: error: directory "/var/lib/postgresql/wal/pg_wal" exists but is not empty [...] File "/usr/lib/python3/dist-packages/patroni/ha.py", line 1173, in post_bootstrap self.cancel_initialization() File "/usr/lib/python3/dist-packages/patroni/ha.py", line 1168, in cancel_initialization raise PatroniException('Failed to bootstrap cluster') patroni.exceptions.PatroniException: 'Failed to bootstrap cluster' In the remove_data_directory function the same happens for removing the data directory, it seems the same kind of thing should also happen when moving a data directory. To ensure the data directory can still be used, the symlinks will point to the renamed directories.	2020-08-17 16:12:33 +02:00
ksarabu1	1ab709c5f0	Multi Sync Standby Support (#1594 ) The new parameter `synchronous_node_count` is used by Patroni to manage number of synchronous standby databases. It is set to 1 by default. It has no effect when synchronous_mode is set to off. When enabled, Patroni manages precise number of synchronous standby databases based on parameter synchronous_node_count and adjusts the state in DCS & synchronous_standby_names as members join and leave. This functionality can be further extended to support Priority (FIRST n) based synchronous replication & Quorum (ANY n) based synchronous replication in future.	2020-08-14 11:51:07 +02:00
Alexander Kukushkin	3341c898ff	Add Etcd v3 protocol support via api gRPC-gateway (#1162 ) The only python-etcd3 client working directly via gRPC still supports only a single endpoint, which is not very nice for high-availability. Since Patroni is already using a heavily hacked version of python-etcd with smart retries and auto-discovery out-of-the-box, I decided to enhance the existing code with limited support of v3 protocol via gRPC-gateway. Unfortunately, watches via gRPC-gateway requires us to open and keep the second connection to the etcd. Known limitations: * The very minimal supported version is 3.0.4. On earlier versions transactions don't work due to bugs in grpc-gateway. Without transactions we can't do atomic operations, i.e. leader locks. * Watches work only starting from 3.1.0 * Authentication works only starting from 3.3.0 * gRPC-gateway does not support authentication using TLS Common Name. This is because gRPC-proxy terminates TLS from its client so all the clients share a cert of the proxy: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/authentication.md#using-tls-common-name	2020-07-31 14:33:40 +02:00
Alexander Kukushkin	ad5c686c11	Take advantage of pg_stat_wal_recevier (#1513 ) So far Patroni was parsing `recovery.conf` or querying `pg_settings` in order to get the current values of recovery parameters. On PostgreSQL earlier than 12 it could easily happen that the value of `primary_conninfo` in the `recovery.conf` has nothing to do with reality. Luckily for us, on PostgreSQL 9.6+ there is a `pg_stat_wal_receiver` view, which contains current values of `primary_conninfo` and `primary_slot_name`. The password field is masked through, but this is fine, because authentication happens only during opening the connection. All other parameters we compare as usual. Another advantage of `pg_stat_wal_recevier` - it contains the current timeline, therefore on 9.6+ we don't need to use the replication connection trick if walreceiver process is alive. If there is no walreceiver process available or it is not streaming we will stick to old methods.	2020-05-15 18:04:24 +02:00
Alexander Kukushkin	08b3d5d20d	Move ensure_clean_shutdown into rewind module (#1528 ) Logically fits there better	2020-05-15 16:22:57 +02:00
Alexander Kukushkin	30aa355eb5	Shorten and beautify history log output (#1526 ) when Patroni is trying to figure out the necessity of pg_rewind it could write the content history file from the primary into the log. The history file is growing with every failover/switchover and eventually starts taking too many lines in the log, most of them are not so much useful. Instead of showing the raw data, we will show only 3 lines before the current replica timeline and 2 lines after.	2020-05-15 16:14:25 +02:00
Alexander Kukushkin	7cf0b753ab	Update optime/leader with checkpoint location after clean shut down (#1527 ) Potentially this information could be used in order to make sure that there is no data loss on switchover.	2020-05-15 16:13:16 +02:00
Alexander Kukushkin	0d957076ca	Improve compatibility with PostgreSQL 12 and 13 (#1523 ) There were two new connection parameters introduced: 1. `gssencmode` in 12 2. `channel_binding` in 13	2020-05-13 13:13:25 +02:00
Alexander Kukushkin	fe23d1f2d0	Release 1.6.5 (#1503 ) * bump version * update release notes * implement missing unit-tests and format code.	2020-04-23 16:02:01 +02:00
ksarabu1	e3335bea1a	Master stop timeout (#1445 ) ## Feature: Postgres stop timeout Switchover/Failover operation hangs on signal_stop (or checkpoint) call when postmaster doesn't respond or hangs for some reason(Issue described in [1371](https://github.com/zalando/patroni/issues/1371)). This is leading to service loss for an extended period of time until the hung postmaster starts responding or it is killed by some other actor. ### master_stop_timeout The number of seconds Patroni is allowed to wait when stopping Postgres and effective only when synchronous_mode is enabled. When set to > 0 and the synchronous_mode is enabled, Patroni sends SIGKILL to the postmaster if the stop operation is running for more than the value set by master_stop_timeout. Set the value according to your durability/availability tradeoff. If the parameter is not set or set <= 0, master_stop_timeout does not apply.	2020-04-15 12:18:49 +02:00
Alexander Kukushkin	4a29caa9d3	On role change callback didn't fire on failed primary (#1420 ) Bug was introduced in https://github.com/zalando/patroni/pull/703 Close https://github.com/zalando/patroni/issues/1418	2020-02-27 12:22:44 +01:00
Alexander Kukushkin	80ce61876e	Don't create permanent physical slot with name of the primary (#1392 ) It is a regular issue that primary is recycling WALs when one of the replicas is down for a long time. So far there were only two solutions for such a problem and both of them are not perfect: 1. Increase `wal_keep_segments`, but it is hard to guess the good value. 2. Use continuous archiving and PITR, but it is not always possible. This PR is introducing the way to solve the problem for static clusters, with a fixed number of nodes and names that never change. You just need to list the names of all nodes in the `slots` so the primary will not remove the slot when the node is down (not registered in DCS). Of course, the primary will not create the permanent slot which is matching its own name. Usage example: let's assume you have a cluster with nodes named abc1, abc2, and abc3. You have to run `patronictl edit-config` and put the following snippet into the configuration: ```yaml slots: abc1: type: physical abc2: type: physical abc3: type: physical ``` If the node abc2 is the primary, it will always create slots for abc1 and abc3 even if they are not running, but will not create slot abc2. Other nodes will behave the same. Close #280	2020-02-20 10:07:43 +01:00
Alexander Kukushkin	1461d7d4b8	Allow certain recovery parameters be defined in the custom_conf (#1335 ) Fixes https://github.com/zalando/patroni/issues/1333	2020-01-15 12:41:07 +01:00
Igor Yanchenko	ea76a40845	Make sure postgresql.pgpass is a file or it does not exist (#1337 ) Also make sure that it is located in the writable directory.	2020-01-15 12:40:41 +01:00
Alexander Kukushkin	16d1ffdde7	Update timeline on standby cluster (#1332 ) Fixes https://github.com/zalando/patroni/issues/1031	2019-12-20 12:56:00 +01:00
Alexander Kukushkin	0693fe7dd0	Housekeeping (#1315 ) * Reduce memory usage by patroni init process * More cleanup in setup.py * Implement missing tests	2019-12-04 11:28:46 +01:00
Alexander Kukushkin	e1d569ad75	Inherit CaseInsensitiveDict from urllib3 HTTPHeaderDict (#1302 ) It might look like a hack, but the API is stable enough and didn't change in the past 3+ years.	2019-12-02 12:14:59 +01:00
Alexander Kukushkin	7793887ea7	Fix tests on windows (#1303 ) and disable junit, it produces a deprecation warning	2019-11-27 14:57:33 +01:00
Alexander Kukushkin	5ea73d50ed	Make it possible to apply some recovery params without restart (#1260 ) Starting from PostgreSQL 12 the following recovery parameters could be changed without restart, but Patroni didn't yet support it: * archive_cleanup_command * promote_trigger_file * recovery_end_command * recovery_min_apply_delay In future postgres releases this list will be extended and Patroni will support it automatically.	2019-11-11 16:18:23 +01:00
Alexander Kukushkin	29ac77b6e7	Compare all recovery parameters (#1208 ) Previously check_recovery_conf() function was only checking whether primary_conninfo has changed and never taking into account all other recovery parameters. Fixes https://github.com/zalando/patroni/issues/1201	2019-10-30 12:30:09 +01:00
Alexander Kukushkin	9e87b00d36	Kill callback child processes when it is necessary (#1242 ) Not doing so makes it hard to implement callbacks in bash and eventually can lead to the situation when two callbacks are running at the same time. In case if we failed to kill the child process we will still wait for it to finish. The same problem could happen with custom bootstrap, therefore if we happen to kill the custom bootstrap process we also kill all child subprocesses. Closes https://github.com/zalando/patroni/issues/1238	2019-10-29 12:44:18 +01:00
Alexander Kukushkin	828585079f	Improve workflow when PGDATA is not empty during bootstrap (#1217 ) Recently it has happened two times when people tried to deploy the new cluster but postgres data directory wasn't empty and also wasn't valid. In this case Patroni was still creating initialize key in DCS and trying to start the postgres up. Now it will complain about non-empty invalid postgres data directory and exit. Close https://github.com/zalando/patroni/issues/1216	2019-10-25 14:09:44 +02:00
Alexander Kukushkin	0947ac1e43	Fix race condition in postmaster_start_time() (#1243 ) when it is executed not from the main thread we need to create a new cursor object.	2019-10-24 11:23:34 +02:00
Alexander Kukushkin	f4623c4e8e	Build recovery params in a separate method (#1219 ) In addition to that try to protect from the case when some recovery parameters are set in one of included files by explicitly setting their value to an empty string on postgres 12. Simplifies https://github.com/zalando/patroni/pull/1208	2019-10-11 20:18:06 +02:00
Alexander Kukushkin	3d29cb7e50	Perform pg_ctl reload regardless of config changes (#1204 ) It is possible that some config files are not controlled by Patroni and when somebody is doing reload via REST API or by sending SIGHUP to Patroni process the usual expectation is that postgres will also be reloaded, but it didn't happen when there were no changes in the postgresql section of Patroni config. For example one might replace ssl_cert_file and ssl_key_file on the filesystem and starting from PostgreSQL 10 it just requires a reload, but Patroni wasn't doing it. In addition to that fix the issue with handling of `wal_buffers`. The default value depends on `shared_buffers` and `wal_segment_size` and therefore Patroni was exposing pending_restart when the new value in the config was explicitly set to -1 (default). Close https://github.com/zalando/patroni/issues/1198	2019-10-10 14:49:30 +02:00
Alexander Kukushkin	1572c02ced	Use passfile in the primary_conninfo instead of password (#1194 ) Fixed a few minor issues related to the #1134 and #1122 Close https://github.com/zalando/patroni/issues/1185	2019-10-09 18:04:14 +02:00
Alexander Kukushkin	0b1b1e3b54	Compatibility with postgresql 12 (#1068 ) * use `SHOW primary_conninfo` instead of parsing config file on pg12 * strip out standby and recovery parameters from postgresql.auto.conf before starting the postgres 12 Patroni config remains backward compatible. Despite for example `restore_command` converted to a GUC starting from postgresql 12, in the Patroni configuration you can still keep it in the `postgresql.recovery_conf` section. If you put it into `postgresql.parameters.restore_command`, that will also work, but it is important not to mix both ways: ```yaml # is OK postgresql: parameters: restore_command: my_restore_command archive_cleanup_command: my_archive_cleanup_command # is OK postgresql: recovery_conf: restore_command: my_restore_command archive_cleanup_command: my_archive_cleanup_command # is NOT ok postgresql: parameters: restore_command: my_restore_command recovery_conf: archive_cleanup_command: my_archive_cleanup_command ```	2019-08-02 16:00:55 +02:00
Alexander Kukushkin	a4bd6a9b4b	Refactor postgresql class (#1060 ) * Convert postgresql.py into a package * Factor out cancellable process into a separate class * Factor out connection handler into a separate class * Move postmaster into postgresql package * Factor out pg_rewind into a separate class * Factor out bootstrap into a separate class * Factor out slots handler into a separate class * Factor out postgresql config handler into a separate class * Move callback_executor into postgresql package This is just a careful refactoring, without code changes.	2019-05-21 16:02:47 +02:00
Alexander Kukushkin	bba9066315	Make it possible to run pg_rewind without superuser on pg11+ (#1035 ) * expose the current patroni version in DCS * expose `checkpoint_after_promote` flag in DCS as an indicator that pg_rewind could be safely executed * other nodes will wait until this flag is set instead of connecting as superuser and issuing the CHECKPOINT * define `postgresql.authentication.rewind` with credentials for pg_rewind in patroni configuration files. * create user for pg_rewind if postgres is 11+ * grant execute on functions required for pg_rewind to rewind user	2019-05-02 14:07:26 +02:00
Alexander Kukushkin	f0b784fe7f	Manage pg_ident.conf with Patroni (#1037 ) This functionality works similarly to the `pg_hba`: If the `postgresql.pg_ident` is defined in the config file or DCS, Patroni will write its value to pg_ident.conf, however, if `postgresql.parameters.ident_file` is defined, Patroni will assume that pg_ident is managed from outside and not update the file.	2019-04-23 16:16:53 +02:00
Pavlo Golub	b53a29c022	Fix unit-tests for Windows (#1014 ) Closes #1013	2019-04-02 13:58:17 +02:00
Alexander Kukushkin	e38fe78b56	Fix callbacks behavior (mostly for standby cluster) (#998 ) First of all, this patch changes the behavior of `on_start`/`on_restart` callbacks, they will be called only when postgres is started or restarted without role changes. In case if the member is promoted or demoted only the `on_role_change` callback will be executed. Before that `on_role_change` was never called for standby leader, only `on_start`/`on_restart` and with a wrong role argument. In addition to that, the REST API will return standby_leader role for the leader of the standby cluster. Closes https://github.com/zalando/patroni/issues/988	2019-03-29 10:28:07 +01:00
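
Commit 30aa355eb5 above describes trimming the timeline history written to the log to 3 lines before the current replica timeline and 2 lines after. A minimal sketch of that windowing follows; it is not Patroni's actual code. The function name `shorten_history`, the tuple layout of a history entry, and the choice to keep the current timeline's own line inside the window are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical shape of one history entry: (timeline, switchpoint LSN, reason).
HistoryLine = Tuple[int, str, str]


def shorten_history(history: List[HistoryLine], current_timeline: int,
                    before: int = 3, after: int = 2) -> List[HistoryLine]:
    """Keep only the entries around `current_timeline`.

    Returns up to `before` entries preceding the entry for the current
    timeline, that entry itself (an assumption), and up to `after`
    entries following it. If the timeline is not in the history, the
    full history is returned unchanged.
    """
    idx: Optional[int] = next(
        (i for i, line in enumerate(history) if line[0] == current_timeline),
        None,
    )
    if idx is None:
        return list(history)
    start = max(0, idx - before)
    end = min(len(history), idx + after + 1)
    return history[start:end]


# Example: ten timelines, replica currently on timeline 6.
example = [(tl, "0/%d000000" % tl, "no recovery target specified")
           for tl in range(1, 11)]
window = shorten_history(example, 6)
```

With the example above, `window` covers timelines 3 through 8; a replica on the first timeline simply gets a shorter window because there is nothing before it.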

237 Commits