Previously, synchronous nodes were selected based only on replication lag, so a node with the `nofailover` tag had the same chance of becoming synchronous as any other node. That behavior was both confusing and dangerous, because if the primary failed, the failover couldn't happen automatically.
Close https://github.com/zalando/patroni/issues/2089
1. Avoid doing CHECKPOINT if `pg_control` is already updated.
2. Explicitly call `ensure_checkpoint_after_promote()` right after bootstrap has finished successfully.
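For illustration, a minimal sketch of the idea, assuming a libpq connection to the just-promoted node is available (the helper and its arguments are hypothetical, not Patroni's actual API):

```
import psycopg2

def checkpoint_after_promote(conn_kwargs, expected_timeline):
    # Run an explicit CHECKPOINT after promote only when pg_control does not
    # yet reflect the new timeline. `expected_timeline` is assumed to be known
    # by the caller.
    with psycopg2.connect(**conn_kwargs) as conn:
        conn.autocommit = True
        with conn.cursor() as cur:
            # pg_control_checkpoint() (PostgreSQL 9.6+) exposes the timeline of
            # the last checkpoint recorded in pg_control.
            cur.execute("SELECT timeline_id FROM pg_control_checkpoint()")
            if cur.fetchone()[0] != expected_timeline:
                cur.execute("CHECKPOINT")
```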
1. The `client_address` tuple may have more than two elements in the case of IPv6 (see the sketch below).
2. Return `cluster_unlocked` only when the value is true and handle it accordingly in `do_GET_metrics()`.
3. Return `cluster_unlocked` and `dcs_last_seen` even if Postgres isn't running or queries are timing out.
Close https://github.com/zalando/patroni/issues/2113
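A minimal sketch of the IPv6 fix in a `BaseHTTPRequestHandler` subclass (the class and method names here are illustrative, not Patroni's actual handler):

```
from http.server import BaseHTTPRequestHandler

class RestApiHandler(BaseHTTPRequestHandler):
    def _client_ip(self):
        # self.client_address is (host, port) for IPv4 sockets but
        # (host, port, flowinfo, scope_id) for IPv6, so never unpack it as a
        # fixed two-element tuple -- take the first element instead.
        return self.client_address[0]
```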
When deciding whether the ZNode should be updated we rely on the cached version of the cluster, which is updated only when member ZNodes are deleted/created or the `/status`, `/sync`, `/failover`, `/config`, or `/history` ZNodes are updated.
That is, after an update of the current member ZNode succeeds, the cache becomes stale and all further updates are always performed even if the value didn't change. To solve this, we introduce a new attribute in the ZooKeeper class and use it to memorize the actual value for later comparison.
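A simplified sketch of the idea, assuming a connected kazoo client (class layout and names are illustrative, not the real implementation):

```
class ZooKeeper:
    def __init__(self, client, member_path):
        self._client = client            # a connected kazoo.client.KazooClient
        self._member_path = member_path
        self._last_member_data = None    # the new attribute described above

    def touch_member(self, data: bytes) -> bool:
        if data == self._last_member_data:
            return True                  # value unchanged -- skip the write
        self._client.set(self._member_path, data)   # KazooClient.set()
        self._last_member_data = data    # memorize what was actually written
        return True
```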
Specifically, if `postgresql.unix_socket_directories` is not set.
In this case Patroni is supposed to use only the port in the connection string, but the `get_replication_connection_cursor()` method defaulted to `host='localhost'`.
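A hedged sketch of the intended behavior (the helper is hypothetical): pass `host` only when it is actually known, so that with `host` omitted libpq falls back to its default unix socket directory instead of `localhost`:

```
import psycopg2

def get_replication_connection(port, host=None):
    kwargs = {'dbname': 'postgres', 'replication': 'database', 'port': port}
    if host:                     # only add host when explicitly configured
        kwargs['host'] = host
    return psycopg2.connect(**kwargs)
```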
Add a configuration option (`set_acls`) for Zookeeper DCS so that Kazoo will apply a default ACL for each znode that it creates. The intention is to improve security of the znodes when a single Zookeeper cluster is used as the DCS for multiple Patroni clusters.
Zookeeper [does not apply an ACL to child znodes](https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#sc_ZooKeeperAccessControl), so permissions can't be set at the `scope` level and then be inherited by other znodes that Patroni creates.
Kazoo instead [provides an option for configuring a default_acl](https://kazoo.readthedocs.io/en/latest/api/client.html#kazoo.client.KazooClient.__init__) that will be applied on node creation.
Example configuration in Patroni might then be:
```
zookeeper:
  set_acls:
    CN=principal1: [ALL]
    CN=principal2:
    - READ
```
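For illustration, one way such a mapping could be translated into kazoo's `default_acl` using `kazoo.security.make_acl()`; the `x509` scheme is an assumption here and should match the authentication scheme actually used:

```
from kazoo.client import KazooClient
from kazoo.security import make_acl

def build_default_acl(set_acls):
    acls = []
    for principal, permissions in set_acls.items():
        # e.g. ['READ'] -> {'read': True}, ['ALL'] -> {'all': True}
        perms = {p.lower(): True for p in permissions}
        acls.append(make_acl('x509', principal, **perms))
    return acls

# kazoo applies default_acl to every znode it creates
client = KazooClient(hosts='localhost:2181',
                     default_acl=build_default_acl({'CN=principal1': ['ALL'],
                                                    'CN=principal2': ['READ']}))
```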
1. Two `TypeError` exceptions raised from the `ApiClient.request()` method.
2. Use the `_retry()` wrapper function instead of a callable object in `_update_leader_with_retry()` when trying to work around concurrent updates of the leader object.
In some extreme cases Postgres could be so slow that the normal monitoring query doesn't finish within a few seconds. This results in an exception being raised from the `Postgresql._cluster_info_state_get()` method, which could lead to Postgres not being demoted on time.
To make this reliable, we catch the exception and use the cached state of Postgres (`is_running()` and `role`) to determine whether Postgres is running as a primary, as sketched below.
Close https://github.com/zalando/patroni/issues/2073
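A hedged sketch of that fallback; the helper, the state key, and the attribute names are illustrative rather than Patroni's exact internals:

```
def is_primary(postgresql):
    try:
        # normal path: the monitoring query tells us whether we are in recovery
        return not postgresql._cluster_info_state_get('in_recovery')
    except Exception:
        # the query timed out or failed -- fall back to the cached process
        # state and the last known role instead of assuming "not primary"
        return postgresql.is_running() and postgresql.role == 'master'
```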
While demoting due to a failure to update the leader lock, it could happen that the DCS goes completely down and the `get_cluster()` call raises an exception.
If not handled properly, this results in Postgres remaining stopped until the DCS recovers.
Sphinx's `add_stylesheet()` has been deprecated for a long time and was removed in recent versions of Sphinx. If available, use `add_css_file()` instead.
Close #2079.
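A small sketch of the conditional usage in `docs/conf.py` ('custom.css' is just a placeholder file name):

```
def setup(app):
    # Sphinx renamed add_stylesheet() to add_css_file() in 1.8 and removed the
    # old name in later releases, so prefer the new method when it exists.
    add_css = getattr(app, 'add_css_file', None) or app.add_stylesheet
    add_css('custom.css')
```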
For various reasons, WAL archiving on the primary may become stuck or significantly delayed. If we try to do a switchover or shut the primary down, the shutdown will take forever and will not finish until the whole backlog of WALs is processed.
In the meantime, Patroni keeps updating the leader lock, which prevents other nodes from starting the leader race even if it is known that they received/applied all changes.
The `Database cluster state:` is changed to `"shut down"` after:
- all data has been fsynced to disk and the latest checkpoint has been written to WAL
- all streaming replicas have confirmed that they received all changes (including the latest checkpoint)

At the same time, the archiver process continues to do its job and the postmaster process is still running.
To solve this problem and make the switchover faster and more reliable when `archive_command` is slow or failing, Patroni removes the leader key immediately after `pg_controldata` starts reporting PGDATA as cleanly `"shut down"` and it has verified that there is at least one replica that received all changes. If no replica fulfills this condition, the leader key isn't removed and the old behavior is retained, i.e. Patroni keeps updating it.
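A hypothetical sketch of that check (all names are illustrative, not Patroni's real API):

```
def can_remove_leader_key(controldata, replica_received_lsns, shutdown_checkpoint_lsn):
    # 1. pg_control must already report a clean shutdown ...
    if controldata.get('Database cluster state') != 'shut down':
        return False
    # 2. ... and at least one replica must have confirmed receiving everything
    #    up to the shutdown checkpoint; otherwise keep updating the leader key.
    return any(lsn >= shutdown_checkpoint_lsn for lsn in replica_received_lsns)
```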
This field records the last time (as a Unix epoch timestamp) a cluster member successfully communicated with the DCS. This is useful to identify and/or analyze network partitions.
Also, expose `dcs_last_seen` in the `MemberStatus` class and its `from_api_response()` method.
The bigger the gap between the slot's flush LSN and the LSN we want to advance to, the more time it takes for the call to finish.
Once the call starts failing, the "lag" will grow more or less indefinitely, which has the following negative side effects:
1. The size of `pg_wal` on the replica will grow.
2. Since `hot_standby_feedback` is forcefully enabled, the primary will stop cleaning up dead tuples.
I.e., we are not only in danger of running out of disk space, but also increase the chances of transaction ID wraparound.
To mitigate this, we set `statement_timeout` to 0 before calling `pg_replication_slot_advance()`.
Since the call happens from the main HA loop and could take more than `loop_wait`, the next heartbeat run could be delayed.
There is also a possibility that the call takes longer than `ttl` and the member key/session in DCS for the given replica expires. However, the slot LSN in DCS is updated by the primary every `loop_wait` seconds, so we don't expect the slot advance call to take significantly longer than `loop_wait`, and the chances of the member key/session expiring are therefore very low.
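A minimal sketch of the call, assuming an open cursor on the replica (the wrapper function itself is illustrative):

```
def advance_replication_slot(cursor, slot_name, target_lsn):
    # Disable statement_timeout for this session: advancing a slot that is far
    # behind can legitimately take longer than the configured timeout, and
    # aborting the call repeatedly only makes the backlog grow further.
    cursor.execute("SET statement_timeout TO 0")
    cursor.execute("SELECT pg_replication_slot_advance(%s, %s)",
                   (slot_name, target_lsn))
```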
Starting from v10, `pg_basebackup` creates a temporary replication slot for WAL streaming, and Patroni was trying to drop it because the slot name looked unknown. To fix it, we skip all temporary slots when querying the `pg_replication_slots` view.
Another option to solve the problem would be running `pg_basebackup` with the `--slot=current_node_name` option, but unfortunately, at the moment when `pg_basebackup` is executed, we don't yet know the major version (the `--slot` option was added in v9.6).
Ref: https://github.com/zalando/patroni/issues/2046#issuecomment-912521502
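A minimal sketch of the adjusted query (column selection and the helper are simplified/illustrative):

```
def list_permanent_slots(cursor):
    # Temporary slots (e.g. the one pg_basebackup creates for WAL streaming)
    # are excluded, so they are never treated as "unknown" slots to be dropped.
    cursor.execute("SELECT slot_name, slot_type FROM pg_replication_slots"
                   " WHERE NOT temporary")
    return cursor.fetchall()
```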
If Postgres crashed (for example due to running out of disk space) and fails to start because of that, Patroni tries too eagerly to recover it and produces too many log messages.
Commit 93a0bf2390 changed the behavior of `pg_settings.pending_restart`. Mostly it will not cause issues, because people very rarely remove values from the config, but one case is special: if Patroni is restarted for an upgrade and finds that Postgres is up and running, it rewrites `postgresql.conf` and performs a reload. As a result, recovery parameters are removed from the config for Postgres v12+ and the `pending_restart` flag is falsely set. To partially mitigate the problem before it is fixed in Postgres, we skip recovery parameters when checking the `pending_restart` flags in `pg_settings`.
In addition to that, remove two parameters from the validator because they were reverted from Postgres v14.
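For illustration, a hedged sketch of the `pending_restart` check with recovery parameters skipped; the parameter set below is only a subset and an assumption, not Patroni's actual list:

```
RECOVERY_PARAMETERS = {'restore_command', 'recovery_target_timeline',
                       'primary_conninfo', 'primary_slot_name',
                       'recovery_min_apply_delay'}

def pending_restart_settings(cursor):
    cursor.execute("SELECT name FROM pg_settings WHERE pending_restart")
    return [name for (name,) in cursor.fetchall()
            if name not in RECOVERY_PARAMETERS]
```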
Add support for an etcd SRV name suffix as described in the etcd docs:
> The -discovery-srv-name flag additionally configures a suffix to the SRV name that is queried during discovery. Use this flag to differentiate between multiple etcd clusters under the same domain. For example, if discovery-srv=example.com and -discovery-srv-name=foo are set, the following DNS SRV queries are made:
>
> _etcd-server-ssl-foo._tcp.example.com
> _etcd-server-foo._tcp.example.com
All tests pass, but this has not been tested on a live etcd system yet... Please take a look and send feedback.
Resolves #2028
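For illustration, the naming scheme quoted above translates to something like the following (the helper is hypothetical):

```
def etcd_srv_names(domain, srv_name=None):
    # with domain 'example.com' and srv_name 'foo' this returns
    # ['_etcd-server-ssl-foo._tcp.example.com', '_etcd-server-foo._tcp.example.com']
    suffix = '-' + srv_name if srv_name else ''
    return ['_etcd-server-ssl{0}._tcp.{1}'.format(suffix, domain),
            '_etcd-server{0}._tcp.{1}'.format(suffix, domain)]
```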
This adds the `selector` to the Patroni Kubernetes StatefulSet spec.
Without it, one gets errors like
```
error: error validating "patroni_k8s.yaml": error validating data: ValidationError(StatefulSet.spec): missing required field "selector" in io.k8s.api.apps.v1.StatefulSetSpec; if you choose to ignore these errors, turn validation off with --validate=false
```
(as mentioned in #1867)
If configured, only IPs that match the rules are allowed to call unsafe endpoints.
In addition to that, it is possible to automatically include the IPs of cluster members in the list.
If neither of the above is configured, the old behavior is retained.
Partially address https://github.com/zalando/patroni/issues/1734
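A hypothetical sketch of such a check using the standard `ipaddress` module (names and structure are illustrative):

```
import ipaddress

def is_allowed(client_ip, allowlist, member_ips=()):
    # the client IP must fall into one of the configured networks, or match
    # the IP of a cluster member when that option is enabled
    ip = ipaddress.ip_address(client_ip)
    networks = (ipaddress.ip_network(rule, strict=False) for rule in allowlist)
    return any(ip in net for net in networks) or client_ip in member_ips
```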
#1820 introduced a new key in DCS, named `/status`, which contains the leader optime and possibly the flush LSNs of logical slots.
For Etcd3 and Raft we should use updates of this key for syncing the HA loop across nodes (before, we were using `/optime/leader`).
- Resolve the node IP for every connection attempt
- Handle exceptions from connection failures caused by failed DNS resolution
- Set PySyncObj DNS cache timeouts aligned with `loop_wait` and `ttl`

In addition to that, postpone the leader race for freshly started Raft nodes. This helps in the situation where the leader node was alone and demoted Postgres, and after that a replica arrives and quickly takes the leader lock without really performing the leader race.
Close https://github.com/zalando/patroni/issues/1930, https://github.com/zalando/patroni/issues/1931
#1527 introduced a feature of updating `/optime/leader` with the location of the last checkpoint after Postgres was shut down cleanly.
If WAL archiving is enabled, Postgres always switches the WAL file before writing the shutdown checkpoint record. Normally this is not an issue, but for databases without much write activity it could lead to the visible replication lag becoming equal to the size of a single WAL file, even though the previous WAL file is mostly empty and contains only a few records.
Therefore it should be safe to report the LSN of the SWITCH record that precedes the shutdown checkpoint.
To do that, Patroni first gets the output of `pg_controldata` and, based on it, calls `pg_waldump` twice:
* The first call reads the checkpoint record (and verifies that it is really the shutdown checkpoint).
* The second call reads the previous record; if it is an 'xlog switch' (for 9.3 and 9.4) or 'SWITCH' (for 9.5+) record, the LSN of the SWITCH record is written to `/optime/leader`.

In case of any mismatch, or a failure to call `pg_waldump` or parse its output, the old behavior is retained, i.e. the `Latest checkpoint location` from `pg_controldata` is used.
Close https://github.com/zalando/patroni/issues/1860
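A hedged sketch of the two `pg_waldump` calls; the parsing is simplified and the function is illustrative, not Patroni's actual code:

```
import re
import subprocess

def switch_lsn_before_shutdown_checkpoint(wal_dir, timeline, checkpoint_lsn):
    def read_record(lsn):
        # read exactly one record starting at the given LSN
        return subprocess.check_output(
            ['pg_waldump', '-p', wal_dir, '-t', str(timeline), '-s', lsn, '-n', '1'],
            universal_newlines=True)

    # 1. the record at "Latest checkpoint location" must really be the shutdown checkpoint
    checkpoint_record = read_record(checkpoint_lsn)
    if 'CHECKPOINT_SHUTDOWN' not in checkpoint_record:
        return None                                   # mismatch -> keep the old behavior
    prev = re.search(r'prev ([0-9A-Fa-f]+/[0-9A-Fa-f]+)', checkpoint_record)
    if not prev:
        return None

    # 2. if the previous record is a WAL switch, report its LSN instead of the
    #    checkpoint location
    if 'SWITCH' in read_record(prev.group(1)):
        return prev.group(1)
    return None
```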
If a `PyInstaller`-frozen application using `patroni` is placed in a location that contains a dot in its path, the `dcs_modules()` function in `patroni.dcs` breaks because `pkgutil.iter_importers()` treats the given path as a package name.
Ref: pyinstaller/pyinstaller#5944
This can be avoided altogether by not passing a path to `iter_importers()`, because `PyInstaller`'s `FrozenImporter` is a singleton and is registered as a top-level finder.
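A short illustration with `pkgutil` (the path containing a dot is a hypothetical example):

```
import pkgutil

# Problematic: because '/opt/my.app/patroni' contains a dot,
# pkgutil.iter_importers() interprets it as a dotted package name and tries to
# import the "package" containing it, which blows up under PyInstaller.
# importers = list(pkgutil.iter_importers('/opt/my.app/patroni'))

# Safe: called without an argument, iter_importers() yields all registered
# top-level finders, and PyInstaller's FrozenImporter singleton is among them.
importers = list(pkgutil.iter_importers())
```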
Old versions of `kazoo` immediately discarded all requests to Zookeeper if the connection was in the `SUSPENDED` state. This was absolutely fine, because Patroni handles retries on its own.
Starting from 2.7, kazoo started queueing requests instead of discarding them, and as a result the Patroni HA loop was getting stuck until the connection to Zookeeper was reestablished, so Postgres was not demoted.
In order to return to the old behavior we override the `KazooClient._call()` method.
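A hedged sketch of such an override (details may differ from the real implementation): fail requests fast while the connection is not usable, instead of letting kazoo queue them:

```
from kazoo.client import KazooClient
from kazoo.exceptions import SessionExpiredError
from kazoo.protocol.states import KeeperState

class PatroniKazooClient(KazooClient):
    def _call(self, request, async_object):
        # While the connection is being (re)established kazoo >= 2.7 would
        # queue the request; fail it immediately so Patroni's own retry logic
        # and the HA loop keep running.
        if self._state == KeeperState.CONNECTING:
            async_object.set_exception(SessionExpiredError())
            return False
        return super(PatroniKazooClient, self)._call(request, async_object)
```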
In addition to that, we ensure that the `Postgresql.reset_cluster_info_state()` method is called even if the DCS request failed (the order of calls was changed in #1820).
Close https://github.com/zalando/patroni/issues/1981