If restarted in pause, Patroni was discarding `synchronous_standby_names` from `postgresql.conf` because in the internal cache this value was set to `None`. As a result synchronous replication transitioned to a broken state: no synchronous replicas according to `synchronous_standby_names`, and Patroni not selecting/setting new synchronous replicas (another bug).
To solve the problem of the broken initial state and to avoid similar issues with other GUCs we will read the GUC values if Patroni is joining a running Postgres.
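For illustration, a minimal sketch (not Patroni's actual code) of reading live GUC values from an already running Postgres; the connection string and the helper name are assumptions:

```python
import psycopg2


def read_gucs_from_running_postgres(dsn, names):
    """Return the current values of the given GUCs as reported by the server.

    The point is to seed the internal cache from the running instance
    instead of assuming the values are None.
    """
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute('SELECT name, setting FROM pg_catalog.pg_settings'
                        ' WHERE name = ANY(%s)', (list(names),))
            return dict(cur.fetchall())


# e.g.: read_gucs_from_running_postgres('dbname=postgres', ['synchronous_standby_names'])
```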
Patroni changes `synchronous_standby_names` and the `/sync` key in a very specific order: first we add nodes to `synchronous_standby_names`, and only after they are recognized as synchronous are they added to the `/sync` key. When removing nodes the order is reversed: they are first removed from the `/sync` key and only after that from `synchronous_standby_names`.
As a result Patroni expects that either the actual synchronous nodes match the nodes listed in the `/sync` key, or the new candidates for synchronous nodes do not match the nodes listed in the `/sync` key. If `synchronous_standby_names` was removed from `postgresql.conf`, either manually or due to the bug (#2876), the state becomes inconsistent because of the wrong order of updates.
To resolve the inconsistent state we introduce additional checks and update the `/sync` key with the actual names of synchronous nodes (usually an empty set).
Patroni doesn't watch all changes of member keys, in order to not create too much load on ZooKeeper, but only subscribes to changes (ZNodes added or deleted) in the `/member` directory. Therefore, when some important fields in the value are updated, we remove and recreate the ZNode in order to notify the leader or other members.
The leader should remove (and recreate) the member key only when the `checkpoint_after_promote` value changes, and replicas only when the `state` changes to/from `running`.
We don't care about the `version` field, because the Patroni version can't change without a restart, which will cause the ZooKeeper `session_id` to change anyway.
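A minimal sketch of this notification trick with `kazoo`; the function and parameter names are illustrative, not Patroni's API:

```python
from kazoo.client import KazooClient


def touch_member(zk: KazooClient, member_path: str, new_value: bytes) -> None:
    """Recreate the member ZNode so that watchers of the parent directory are notified.

    A plain set() would not fire the children watch on the member directory,
    so the ZNode is deleted and created again with the updated value.
    """
    if zk.exists(member_path):
        zk.delete(member_path)
    zk.create(member_path, new_value, ephemeral=True)
```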
This fix will hopefully reduce failures of behave tests on GH Actions.
Despite being validated by `IntValidator`, some GUCs couldn't be cast directly to `int` because they include a suffix, for example `128MB`.
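A minimal sketch of the kind of conversion that is needed; the helper name and the unit table are illustrative:

```python
import re

# Memory units understood by Postgres, expressed in bytes (illustrative subset).
MEMORY_UNITS = {'B': 1, 'kB': 1024, 'MB': 1024 ** 2, 'GB': 1024 ** 3, 'TB': 1024 ** 4}


def guc_to_int(value: str) -> int:
    """Convert a GUC value such as '128MB' or '16384' to a plain integer."""
    match = re.match(r'^\s*(\d+)\s*([A-Za-z]*)\s*$', value)
    if not match:
        raise ValueError('not an integer GUC value: {0!r}'.format(value))
    number, unit = match.groups()
    return int(number) * (MEMORY_UNITS[unit] if unit else 1)


assert guc_to_int('128MB') == 128 * 1024 * 1024
```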
Close https://github.com/zalando/patroni/issues/2879
WARNING messages are produced by `urllib3` if Patroni is quickly restarted.
Instead we will check that the node is listening on a given port. This is enough to detect name clashes, while an HTTP request could raise an exception in a few other cases, which might cause false negatives.
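A minimal sketch of such a port check using the standard `socket` module; the function name is illustrative:

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# If the address another member advertises under our name is already
# listening, we can assume that a node with a clashing name is running.
```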
Close https://github.com/zalando/patroni/issues/2881
1. Unit tests should not really try accessing any resources.
2. Accessing external resources results in significantly longer execution time of unit tests on Windows.
In addition to that, perform the request with a 3s timeout. Usually this is more than enough to figure out whether the resource is accessible.
Followup on #2724
As reported by @ants on Slack, it could happen that `received_tli` is ahead of the replayed timeline; therefore, when Postgres is running, we should stop using it when deciding on pg_rewind and rely only on `IDENTIFY_SYSTEM` via a replication connection.
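For illustration, a hedged sketch of obtaining the timeline with `IDENTIFY_SYSTEM` over a physical replication connection using `psycopg2` (the DSN and function name are placeholders; Patroni's own connection handling differs):

```python
import psycopg2
from psycopg2.extras import PhysicalReplicationConnection


def timeline_via_replication_connection(dsn: str) -> int:
    """Ask the walsender for the current timeline instead of trusting received_tli."""
    conn = psycopg2.connect(dsn, connection_factory=PhysicalReplicationConnection)
    try:
        with conn.cursor() as cur:
            # IDENTIFY_SYSTEM returns (systemid, timeline, xlogpos, dbname)
            cur.execute('IDENTIFY_SYSTEM')
            return int(cur.fetchone()[1])
    finally:
        conn.close()
```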
Prior to this commit `patronictl query` worked only if the `-r` argument was provided to the command. Otherwise it would face issues:
* If neither `-r` nor `-m` were provided:
```
$ PGPASSWORD=zalando patronictl -c postgres0.yml query -U postgres -c "SHOW PORT"
2023-09-12 17:45:38 No connection to role=None is available
```
* If only `-m` was provided:
```
$ PGPASSWORD=zalando patronictl -c postgres0.yml query -U postgres -c "SHOW PORT" -m postgresql0
2023-09-12 17:46:15 No connection to member postgresql0 is available
```
This issue was a regression introduced by `4c3e0b9382820524239d2aa4d6b95379ef1291db` through PR #2687.
Through that PR we decided to move the common logic used to check mutual exclusiveness of the `--role` and `--member` arguments to the `get_any_member` function.
However, prior to that change the `role` variable would assume the default value of `any` in the `query` method before `get_any_member` was called, which was no longer the case after the change.
This commit fixes that issue by adding handling for `role=None` in the `get_cursor` function. As `role` defaulting to `any` is handled in a sub-call to `get_any_member`, it is safe for `get_cursor` to return the cursor when `role=None`.
Unit tests were updated accordingly.
References: PAT-204.
Due to a race condition Patroni was falsely assuming that the standby should be restarted because some recovery parameters (primary_conninfo or similar) were changed.
Close https://github.com/zalando/patroni/issues/2834
When running with the leader lock Patroni was just setting the `role` label to `master`, so the `kubernetes.standby_leader_label_value` feature effectively never worked.
Now it is fixed, but in order to not introduce breaking changes we set the default value of `standby_leader_label_value` to `master`.
Cluster.get_replication_slots() didn't take into account that there cannot be logical replication slots on replicas of a standby cluster. It was only skipping logical slots for the standby_leader, while replicas were expecting that they would have to copy them over.
Also, on replicas in a standby cluster these logical slots were falsely added to the `_replication_slots` dict.
Postgres supports two permission modes for the data directory:
1. owner only
2. group readable
By default the first one is used because it provides better security. But sometimes people want to run a backup tool under a user different from postgres, and in this case the second option becomes very useful. Unfortunately it didn't work correctly because Patroni was creating files with owner-only access permissions.
This PR changes the behavior: permissions on files and directories created by Patroni will be calculated based on the permissions of PGDATA, i.e. they will get group read access when necessary.
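A simplified sketch of how such permission calculation could look; the helper names are illustrative and the real implementation handles more cases:

```python
import os
import stat


def file_mode_for_pgdata(pgdata: str) -> int:
    """Return 0o640 if PGDATA is group readable, otherwise 0o600."""
    pgdata_mode = stat.S_IMODE(os.stat(pgdata).st_mode)
    return 0o640 if pgdata_mode & stat.S_IRGRP else 0o600


def write_config_file(pgdata: str, path: str, data: bytes) -> None:
    """Create a file whose permissions follow the PGDATA permission mode."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, file_mode_for_pgdata(pgdata))
    with os.fdopen(fd, 'wb') as f:
        f.write(data)
```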
Close #1899
Close #1901
In issue #2735 it was discussed that there should be some warning around PostgreSQL parameters that do not pass validation.
This commit ensures something is logged for parameters that fail validation and therefore fall back to default values.
Close #2735
Close #2740
Consider the following situation:
1. node1 is stressed so much that the Patroni heartbeat can't run regularly and the leader lock expires.
2. node2 notices that there is no leader, takes the lock, promotes, and gets into a situation like the one described in 1.
3. Patroni on node1 finally wakes up, notices that Postgres is running as a primary, although without the leader lock, and "happily" acquires the lock.
That is, node1 discarded the promotion of node2, and after that it will not be possible to join node2 back to the cluster, because pg_rewind is not possible when the two nodes are on the same timeline.
To partially mitigate the problem we introduce an additional timeline check. If Postgres is running as a primary, Patroni will consider it a perfect candidate only if its timeline isn't behind the last known cluster timeline recorded in the `/history` key. If the Postgres timeline is behind the cluster timeline, Postgres will be demoted to read-only. Further behavior depends on the `maximum_lag_on_failover` and `check_timeline` settings.
Since the `/history` key isn't updated instantly after promotion, there is still a short period of time when the issue could happen, but it seems that it is close to impossible to make it more reliable.
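A schematic sketch of the extra check; it assumes the `/history` key parses into a list of entries that start with the abandoned timeline number (an illustrative assumption about the format):

```python
def primary_timeline_is_acceptable(postgres_timeline, history_lines):
    """Return False if a running primary is behind the last known cluster timeline."""
    if not history_lines:
        return True
    # The newest cluster timeline is the last abandoned timeline plus one.
    last_cluster_timeline = int(history_lines[-1][0]) + 1
    return postgres_timeline >= last_cluster_timeline
```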
Close https://github.com/zalando/patroni/issues/2779
It was reported in #2779 that the primary was constantly logging messages like `Synchronous replication key updated by someone else`.
It happened after Patroni was stuck due to resource starvation. Key updates are performed using the create_revision/mod_revision field, whose value is taken from the internal cache. Hence, this is a clear symptom of a stale cache.
Similar issues in the K8s implementation were addressed by invalidating the cache and restarting watcher connections every time an update failed due to a resource_version mismatch, so we do the same for Etcd3.
1. Take client certificates only from the `ctl` section. Motivation: sometimes there are server-only certificates that can't be used as client certificates. As a result neither Patroni nor patronictl works correctly, even if the `--insecure` option is used.
2. Document that if `restapi.verify_client` is set to `required` then client certificates **must** be provided in the `ctl` section.
3. Add support for `ctl.authentication` and prefer to use it over `restapi.authentication`.
4. Silence the annoying InsecureRequestWarning when `patronictl -k` is used, so that the behavior becomes similar to `curl -k`.
Includes:
* renaming of `_unready_logical_slots` to better represent that it is a processing queue which is emptied on successful completion.
* ensuring that return type is consistent.
* making logic variable names explicit to help explain how the decision of whether a slot is "ready" is made.
To do that we use the `pg_stat_get_wal_receiver()` function, which is available since 9.6. For older versions the `patronictl list` output and REST API responses remain as before.
If there is no WAL receiver process we check whether `restore_command` is set and show the state as `in archive recovery`.
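A sketch of how the state could be derived on the replica; the exact query and fallbacks in Patroni may differ:

```python
def replication_state(cursor, restore_command):
    """Derive the replication state shown in patronictl list and the REST API."""
    cursor.execute('SELECT status FROM pg_catalog.pg_stat_get_wal_receiver()')
    row = cursor.fetchone()
    if row and row[0] == 'streaming':
        return 'streaming'
    # No (active) wal receiver: report archive recovery only if restore_command is set.
    return 'in archive recovery' if restore_command else None
```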
Example of `patronictl list` output:
```bash
$ patronictl list
+ Cluster: batman -------------+---------+---------------------+----+-----------+
| Member      | Host           | Role    | State               | TL | Lag in MB |
+-------------+----------------+---------+---------------------+----+-----------+
| postgresql0 | 127.0.0.1:5432 | Leader  | running             | 12 |           |
| postgresql1 | 127.0.0.1:5433 | Replica | in archive recovery | 12 |         0 |
+-------------+----------------+---------+---------------------+----+-----------+
$ patronictl list
+ Cluster: batman -------------+---------+-----------+----+-----------+
| Member      | Host           | Role    | State     | TL | Lag in MB |
+-------------+----------------+---------+-----------+----+-----------+
| postgresql0 | 127.0.0.1:5432 | Leader  | running   | 12 |           |
| postgresql1 | 127.0.0.1:5433 | Replica | streaming | 12 |         0 |
+-------------+----------------+---------+-----------+----+-----------+
```
Example of REST API response:
```bash
$ curl -s localhost:8009 | jq .
{
  "state": "running",
  "postmaster_start_time": "2023-07-06 13:12:00.595118+02:00",
  "role": "replica",
  "server_version": 150003,
  "xlog": {
    "received_location": 335544480,
    "replayed_location": 335544480,
    "replayed_timestamp": null,
    "paused": false
  },
  "timeline": 12,
  "replication_state": "in archive recovery",
  "dcs_last_seen": 1688642069,
  "database_system_identifier": "7252327498286490579",
  "patroni": {
    "version": "3.0.3",
    "scope": "batman"
  }
}
$ curl -s localhost:8009 | jq .
{
  "state": "running",
  "postmaster_start_time": "2023-07-06 13:12:00.595118+02:00",
  "role": "replica",
  "server_version": 150003,
  "xlog": {
    "received_location": 335544816,
    "replayed_location": 335544816,
    "replayed_timestamp": null,
    "paused": false
  },
  "timeline": 12,
  "replication_state": "streaming",
  "dcs_last_seen": 1688642089,
  "database_system_identifier": "7252327498286490579",
  "patroni": {
    "version": "3.0.3",
    "scope": "batman"
  }
}
```
- make sure that physical replication slots are created even before the promote has happened (when the async executor is busy with the promote).
- execute `txid_current()` with `synchronous_commit=off` so it doesn't accidentally wait for absent synchronous standbys when `synchronous_mode_strict` is enabled and `synchronous_standby_names=*`; these standbys can't connect because the replication slots weren't there yet (see the sketch after this list).
- `synchronous_standby_names` wasn't set to `*` after bootstrap with `synchronous_mode` and `synchronous_mode_strict`.
- add `-c statement_timeout=0` to `PGOPTIONS` when executing `post_bootstrap` script.
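A sketch of the `txid_current()` call from the second item, assuming a non-autocommit `psycopg2`-style cursor so that `SET LOCAL` covers the commit:

```python
def transaction_id_without_sync_wait(cursor):
    """Get a transaction id without waiting for (absent) synchronous standbys.

    With synchronous_mode_strict and synchronous_standby_names='*' the commit
    could otherwise hang while no standby is connected yet.
    """
    cursor.execute("SET LOCAL synchronous_commit TO 'off'")
    cursor.execute('SELECT pg_catalog.txid_current()')
    return cursor.fetchone()[0]
```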
Close https://github.com/zalando/patroni/issues/2738
If we know for sure that a few moments ago Postgres was still running as a primary, and we still hold the leader lock and can successfully update it, then we can safely start Postgres back not in recovery. That allows us to avoid bumping the timeline without a reason and hopefully improves reliability, because it addresses issues similar to #2720.
In addition to that, remove the `if self.state_handler.is_starting()` check from the `recover()` method. This branch could never be reached because the `starting` state is handled earlier in `_run_cycle()`. Besides that, remove the redundant `self._crash_recovery_executed`.
P.S. For now we do not cover cases when Patroni was killed along with Postgres.
Let's consider that we have just started Patroni, there is no leader, and `pg_controldata` reports `Database cluster state` as `shut down`. It feels logical to use `Latest checkpoint location` and `Latest checkpoint's TimeLineID` to do the usual leader race and start directly as a primary, but it could be totally wrong. The thing is that we run `postgres --single` before executing `pg_rewind` if the standby wasn't shut down cleanly. As a result the `Database cluster state` transitions from `in archive recovery` to `shut down`, but if such a node becomes the leader, the timeline must be increased.
When starting, check if a node with the same name is registered in DCS and try to query its REST API.
If the REST API is accessible, exit with an error.
Close #1804
Revert to using `ssl._ssl._test_decode_cert`
A change has been included as part of Patroni 3.0.3 release: use
public functions instead of `ssl._ssl._test_decode_cert` to get
serial number of certificates.
There was a slight bug in that implementation: it was only loading
the certificates through `load_verify_locations`, but missed getting
the certificates back through `get_ca_certs`. As a consequence
Patroni was no longer able to reload the REST API cert on SIGHUP.
An attempt to fix that issue was made through commit
`20f578f09f3aa604e5288710d4fd4e611152ed5f`. However, even with the
correct call of `get_ca_certs`, a corner case was detected where
`load_verify_locations` would skip loading a certificate: if it was
issued with `CA:FALSE`. That essentially means the implementation is
still buggy in that situation. See [CPython](c283a0cff5/Modules/_ssl.c (L4618C14-L4619))
for the underlying problem.
In order to get back a functional implementation again we are reverting
the code to use the private function `ssl._ssl._test_decode_cert`.
We can later study a possible more elegant alternative for solving this,
if any.
---------
Signed-off-by: Israel Barth Rubio <israel.barth@enterprisedb.com>
- the `_in_flight` attribute is accessed from multiple threads and must be protected with a mutex when it is changed (see the sketch after this list)
- allow adding tasks to `_in_flight.group` from the `sync_pg_dist_node()` method when the timeout is reached. Not doing so might result in an indefinite transaction if the REST API request from a worker node failed.
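A minimal sketch of the locking pattern; the class and attribute names are simplified placeholders, not the actual Citus handler:

```python
import threading


class InFlightTracker:
    """Toy illustration: every change of the shared attribute holds the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = None  # task (transaction) currently being executed, if any

    def set_in_flight(self, task):
        with self._lock:
            self._in_flight = task

    def get_in_flight(self):
        with self._lock:
            return self._in_flight
```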
* Use YAML files to validate Postgres GUCs through Patroni.
Patroni used to have a static list of Postgres GUCs validators in
`patroni.postgresql.validator`.
One problem with that approach, for example, is that it would not
allow GUCs from custom Postgres builds to be validated/accepted.
The idea that we had to work around that issue was to move the
validators from the source code to an external and extendable source.
With that Patroni will start reading the current validators from that
external source plus whatever custom validators are found.
From this commit onwards Patroni will read and parse all YAML files
that are found under the `patroni/postgresql/available_parameters`
directory to build its Postgres GUCs validation rules.
All the details about how this works can be found in the docstring
of the introduced function `_load_postgres_gucs_validators`.
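A simplified sketch of what reading such a directory could look like; the real `_load_postgres_gucs_validators` builds validator objects and performs more checks:

```python
import pathlib

import yaml


def load_validator_files(directory):
    """Merge all YAML files from the given directory into one mapping."""
    rules = {}
    for path in sorted(pathlib.Path(directory).glob('*.y*ml')):
        with path.open() as f:
            rules.update(yaml.safe_load(f) or {})
    return rules
```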
When using a custom Postgres distribution it may be the case that the Postgres binaries are compiled with names different from the ones used by the community Postgres distribution.
With that in mind we implemented a new set of settings for Patroni, so the user is able to override the default binary names with custom binary names through the new `postgresql.bin_name` section in the local configuration.
References: PAT-17.
- abstract_main only creates a Config object using the passed configfile
and instantiates the passed daemon class
- common args parser is extracted into a separate func that is called
from daemons' main funcs (specific args can be added afterwards)
Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
Pass a reference to the last known leader object in order to avoid obtaining it from the `AbstractDCS.cluster` cache.
This change is useful for Consul, Etcd3 and Zookeeper implementations.
It didn't take into account the fact that we can get a new lease after changing the TTL. In this case we have to update the existing leader key with the new lease.
To solve it we introduce an 'on transaction failure' callback. Schematically, the whole workflow looks like this:
```python
txn(
    compare=(old_value == self._name),
    success=put(self.leader_path, self._name, self._lease),
    failure=txn(
        compare=(create_revision == '0'),
        success=put(self.leader_path, self._name, self._lease)
    )
)
```
The problem was introduced in d98d6d0b02c9cc67464a7b1b31b1c5570c26e12d
- Always pass etcd3 key revision as a string
- Make sure the leader key isn't unconditionally overwritten. It may happen that the leader heart-beat loop didn't run properly and the session has expired. In this case the leader may create a new session and a new leader key. But, there are chances that the other node already created a leader key and we don't want to overwrite it.
- Try to sync HA loops between nodes by adding 0.5 seconds to timeout on non-leader nodes
- fix --validate-config not to error out if bin_dir is an empty string in the yaml config
- mention bin_dir optionality in the docs
- validate bin_dir even if it is not in the yaml config (add optional
default value for Optional config params in validator)
- make rewind user optional
Make it return the new `SyncState` object in order to avoid reading the new cluster state in `Ha.process_sync_replication()`.
Now it is a small optimization, but it will become very handy in the quorum commit feature.
- added pyrightconfig.json with typeCheckingMode=strict
- added type hints to all files except api.py
- added type stubs for dns, etcd, consul, kazoo, pysyncobj and other modules
- added type stubs for psycopg2 and urllib3 with some little fixes
- fixed most of the issues reported by pyright
- remaining issues will be addressed later, along with enabling CI linting task
The two cases we have in mind are:
* In spite of following all best practices client-side, logical replication connections can sometimes hang the Postgres shutdown sequence. We'd like to SIGTERM any misbehaving logical replication connections which remain after X seconds; these will inevitably get killed anyway on master stop timeout (see the sketch after this list).
* remove the "role=master" label on the current primary when not using k8s as DCS. Waiting until after Postgres fully stops can sometimes be too long for this.
* Pause pgbouncer connections before switchover
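For the first use case, a hedged sketch of what such a callback script could do; the query is illustrative and identifying "misbehaving" connections precisely is left out:

```python
import psycopg2

# Illustrative only: terminate walsenders that serve active logical replication slots.
TERMINATE_LOGICAL_WALSENDERS = """
SELECT pg_terminate_backend(active_pid)
  FROM pg_catalog.pg_replication_slots
 WHERE slot_type = 'logical' AND active_pid IS NOT NULL
"""


def terminate_logical_replication(dsn):
    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    try:
        with conn.cursor() as cur:
            cur.execute(TERMINATE_LOGICAL_WALSENDERS)
    finally:
        conn.close()
```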
Close #2596