Fix the oversight of 193c73f
We need to set the global config from the local cache if cluster.config is not initialized.
If there is nothing written into the DCS (yet), we still need the setup information for decision making (e.g., whether it is a standby cluster).
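A minimal sketch of that fallback, using hypothetical names (`effective_config`, `local_cache`) rather than Patroni's actual API:

```python
# Hypothetical sketch: prefer the cluster configuration stored in the DCS,
# fall back to the locally cached copy when nothing has been written yet.
def effective_config(cluster, local_cache):
    if cluster.config and cluster.config.data:
        return cluster.config.data   # configuration already present in the DCS
    return local_cache               # e.g. lets us detect a standby cluster on first start
```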
Add support for the ``nostream`` tag. If set to ``true``, the node will not use the replication protocol to stream WAL. Instead it will rely on archive recovery (if ``restore_command`` is configured) and on polling ``pg_wal``/``pg_xlog``. It also disables copying and synchronization of permanent logical replication slots on the node itself and on all its cascading replicas. Setting this tag on the primary node has no effect.
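As an illustration only (the function name is made up, this is not Patroni's actual implementation), on a replica the tag roughly amounts to the following decision:

```python
# Illustrative sketch of how the ``nostream`` tag affects a replica.
def wants_streaming(tags: dict, is_primary: bool) -> bool:
    if is_primary:
        return False                        # the tag has no effect on the primary
    # With nostream: true the node relies on restore_command / pg_wal polling instead.
    return not tags.get('nostream', False)
```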
We use the term _cluster_ in Patroni with the following five meanings:
1. PostgreSQL cluster: a cluster of PostgreSQL instances that share the same system identifier.
2. MPP cluster: a cluster of PostgreSQL clusters in which one acts as the coordinator and the others act as workers.
3. Coordinator cluster: a PostgreSQL cluster that plays the role of 'coordinator' within an MPP cluster.
4. Worker cluster: a PostgreSQL cluster that plays the role of 'worker' within an MPP cluster.
5. Patroni cluster: any cluster managed by Patroni can be called a Patroni cluster, but we usually use this term to refer to a single PostgreSQL cluster or an MPP cluster.
The main issue was that the configuration for the Citus handler and for the DCS existed in two places, while ideally AbstractDCS should not know many details about what kind of MPP is in use.
To solve the problem we first dynamically create an object implementing the AbstractMPP interface, which serves as the configuration for the DCS. Later this object is used to instantiate the class implementing the AbstractMPPHandler interface.
This is just a starting point, which does some heavy lifting. As a next step, all variables named after Citus in files other than patroni/postgresql/mpp/citus.py should be renamed.
In other words, this commit takes over the most complex part of #2940, which was never implemented.
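A rough sketch of the two-step pattern described above (heavily simplified; only the class names follow the commit, the factory is illustrative):

```python
class AbstractMPP:
    """Lightweight MPP configuration object, safe to hand over to AbstractDCS."""
    def __init__(self, config: dict) -> None:
        self._config = config


class AbstractMPPHandler(AbstractMPP):
    """Full handler, instantiated later and bound to the Postgresql object."""
    def __init__(self, postgresql, config: dict) -> None:
        super().__init__(config)
        self._postgresql = postgresql


class Citus(AbstractMPP):
    pass


class Null(AbstractMPP):
    """No-op implementation used when no MPP is configured."""
    pass


def get_mpp(config: dict) -> AbstractMPP:
    # Dynamically pick the implementation based on the configuration.
    return Citus(config['citus']) if 'citus' in config else Null({})
```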
Co-authored-by: zhjwpku <zhjwpku@gmail.com>
1. extract the `GlobalConfig` class into its own module
2. make the module instantiate the `GlobalConfig` object on load and replace its entry in sys.modules with this instance (as sketched below)
3. don't pass the `GlobalConfig` object around, but use the `patroni.global_config` module everywhere
4. move `ignore_slots_matchers`, `max_timelines_history`, and `permanent_slots` from `ClusterConfig` to `GlobalConfig`
5. add a `use_slots` property to global_config and remove duplicated code from `Cluster` and `Postgresql.ConfigHandler`
Besides that, improve the readability of a couple of checks in ha.py and the formatting of the `/config` key when it is saved from patronictl.
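A condensed sketch of the pattern from point 2 (not the actual module, the property set is trimmed): the module replaces its own entry in `sys.modules` with a `GlobalConfig` instance, so every `import patroni.global_config` yields the same object.

```python
# global_config.py (sketch)
import sys
from typing import Any, Dict


class GlobalConfig:

    def __init__(self) -> None:
        self._config: Dict[str, Any] = {}

    def update(self, config: Dict[str, Any]) -> None:
        self._config = config

    @property
    def use_slots(self) -> bool:
        # single place for the check previously duplicated in Cluster and ConfigHandler
        return bool(self._config.get('postgresql', {}).get('use_slots', True))


# importing the module gives callers the instance, not the module itself
sys.modules[__name__] = GlobalConfig()
```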
If archiving is enabled, the `Postgresql.latest_checkpoint_location()` method returns the LSN of the prev (SWITCH) record, which points to the beginning of the WAL file. This is done to make it possible to safely promote a replica that recovers WAL files from the archive and wasn't streaming when the primary was stopped (the primary doesn't archive this WAL file).
However, in certain cases using the LSN of the SWITCH record caused an unnecessary pg_rewind if the replica didn't manage to replay the shutdown checkpoint record before it was promoted.
To mitigate the problem we need to check that the replica received/replayed exactly the shutdown checkpoint LSN. At the same time we still write the LSN of the SWITCH record to the `/status` key when releasing the leader lock.
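Roughly, the new check boils down to something like the following (names are illustrative, not the actual methods):

```python
# Hedged sketch: promotion without pg_rewind is considered safe only if the
# replica caught up to the shutdown checkpoint of the old primary, while the
# /status key still gets the more conservative SWITCH record LSN.
def received_shutdown_checkpoint(receive_lsn: int, replay_lsn: int,
                                 shutdown_checkpoint_lsn: int) -> bool:
    return max(receive_lsn, replay_lsn) >= shutdown_checkpoint_lsn
```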
The priority is configured with the `failover_priority` tag. Possible values range from `0` to infinity, where `0` means the node will never become the leader, which is the same as setting the `nofailover` tag to `true`. Consequently, only one of the `failover_priority` or `nofailover` tags should be set in the configuration file.
The failover priority kicks in only when more than one node has the same receive/replay LSN and is ahead of the other nodes in the cluster. In this case the node with the higher `failover_priority` value is preferred. If there is a node with a higher receive/replay LSN, it will become the new leader even if it has a lower `failover_priority` (except when the priority is set to 0).
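A simplified sketch of how the tie-breaking works (illustrative only; the real leader race involves more checks):

```python
def best_candidate(members):
    """members: list of (name, wal_lsn, failover_priority) tuples."""
    eligible = [m for m in members if m[2] > 0]        # priority 0 behaves like nofailover
    if not eligible:
        return None
    # The highest LSN always wins; priority only breaks ties between equally advanced nodes.
    return max(eligible, key=lambda m: (m[1], m[2]))[0]


print(best_candidate([('a', 100, 1), ('b', 100, 2), ('c', 90, 5)]))  # -> 'b'
```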
Close https://github.com/zalando/patroni/issues/2759
- Fixed issues with the has_permanent_slots() method. It didn't take into account permanent physical slots for members and falsely concluded that there are no permanent slots.
- Write to the status key only LSNs for permanent slots, not for every slot that exists on the primary (see the sketch after this list).
- Include pg_current_wal_flush_lsn() in the slots feedback, so that slots on standby nodes can be advanced.
- Improved behave tests:
  - Verify that permanent slots are properly created on standby nodes
  - Verify that permanent slots are properly advanced, including DCS failsafe mode
  - Verify that only permanent slots are written to the `/status` key
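The filtering from the second bullet is conceptually just (hypothetical helper name):

```python
# Only slots declared as permanent make it into the /status key,
# instead of every slot that happens to exist on the primary.
def status_slots(slots_on_primary: dict, permanent_slots: dict) -> dict:
    return {name: lsn for name, lsn in slots_on_primary.items() if name in permanent_slots}
```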
Patroni changes `synchronous_standby_names` and the `/sync` key in a very specific order: first we add nodes to `synchronous_standby_names`, and only after they are recognized as synchronous are they added to the `/sync` key. When removing nodes the order is reversed: they are first removed from the `/sync` key and only after that from `synchronous_standby_names`.
As a result Patroni expects that either the actual synchronous nodes match the nodes listed in the `/sync` key, or that new candidates for synchronous nodes don't match the nodes listed in the `/sync` key. If `synchronous_standby_names` was removed from `postgresql.conf` manually or due to the bug (#2876), the state becomes inconsistent because of the wrong order of updates.
To resolve the inconsistent state we introduce additional checks and update the `/sync` key with the actual names of synchronous nodes (usually an empty set).
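Conceptually the extra check looks like this (names are hypothetical, not the actual DCS API):

```python
def reconcile_sync_key(sync_key_members: set, actual_sync_members: set, update_sync_key) -> None:
    # If the /sync key and reality diverged (e.g. synchronous_standby_names vanished),
    # rewrite the key with the nodes that are actually synchronous right now.
    if sync_key_members != actual_sync_members:
        update_sync_key(actual_sync_members)
```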
It represents the `/status` key in the DCS and makes it easier to introduce new values stored in the `/status` key without the need to refactor all DCS implementations.
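A minimal sketch of such a wrapper (the field set and the old/new formats are illustrative):

```python
from typing import Any, Dict, NamedTuple, Optional


class Status(NamedTuple):
    last_lsn: int                        # LSN written when the leader lock is released
    slots: Optional[Dict[str, int]]      # permanent slot name -> confirmed LSN

    @classmethod
    def from_node(cls, value: Any) -> 'Status':
        if isinstance(value, dict):        # newer JSON format
            return cls(int(value.get('optime', 0)), value.get('slots'))
        return cls(int(value or 0), None)  # legacy format: the key held just the LSN
```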
- Don't set the leader in the failover key from patronictl failover
- Show a warning and execute a switchover if the leader option is provided to the patronictl failover command
- Be more precise in the log messages
- Allow failing over to an async candidate in synchronous mode
- Check whether the candidate is the same as the leader specified via the API
- Fix and extend some tests
- Add documentation
Almost the same logic was used in three different places:
1. `Patroni` class
2. `Member` class
3. `_MemberStatus` class
Now they all inherit from the newly introduced `Tags` class (sketched below).
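A condensed sketch of the shared base class (only a subset of properties, for illustration):

```python
import abc
from typing import Any, Dict, Optional


class Tags(abc.ABC):

    @property
    @abc.abstractmethod
    def tags(self) -> Dict[str, Any]:
        """Tags as loaded from the local configuration or from the member key in DCS."""

    @property
    def nofailover(self) -> bool:
        return bool(self.tags.get('nofailover', False))

    @property
    def replicatefrom(self) -> Optional[str]:
        return self.tags.get('replicatefrom')
```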
* Refactor is_failover_possible()
Move all the members filtering inside the function.
* Remove check_synchronous parameter
* Add the sync_mode_is_active() method and use it everywhere it is appropriate
* Reduce nesting
---------
Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
Postgres supports two types of permissions:
1. owner only
2. group readable
By default the first one is used because it provides better security. But sometimes people want to run a backup tool under a user that is different from postgres, and in this case the second option becomes very useful. Unfortunately it didn't work correctly, because Patroni was creating files with owner-only permissions.
This PR changes the behavior: permissions on files and directories created by Patroni will be calculated based on the permissions of PGDATA. I.e., they will get group-readable access when necessary.
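The calculation is roughly the following (sketch; the helper name is made up):

```python
import os
import stat


def modes_for_pgdata(pgdata: str):
    """Derive file and directory modes from the permissions of PGDATA."""
    pg_mode = stat.S_IMODE(os.stat(pgdata).st_mode)
    if pg_mode & stat.S_IRWXG:          # PGDATA is group accessible
        return 0o640, 0o750             # files, directories
    return 0o600, 0o700                 # owner-only (the default)
```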
Close #1899, Close #1901
Consider the following situation:
1. node1 is stressed so much that the Patroni heart-beat can't run regularly and the leader lock expires.
2. node2 notices that there is no leader, gets the lock, promotes, and ends up in the situation described in 1.
3. Patroni on node1 finally wakes up, notices that Postgres is running as a primary without a leader lock, and "happily" acquires the lock.
That is, node1 discarded node2's promotion, and after that it will not be possible to join node2 back to the cluster, because pg_rewind is not possible when two nodes are on the same timeline.
To partially mitigate the problem we introduce an additional timeline check. If postgres is running as a primary, Patroni will consider it a perfect candidate only if its timeline isn't behind the last known cluster timeline recorded in the `/history` key. If the postgres timeline is behind the cluster timeline, postgres will be demoted to read-only. Further behavior depends on the `maximum_lag_on_failover` and `check_timeline` settings.
Since the `/history` key isn't updated instantly after promotion, there is still a short period of time when the issue could happen, but it seems close to impossible to make this more reliable.
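Illustrative sketch of the extra check (not the actual method names; the `/history` key holds one row per finished timeline):

```python
def may_keep_primary(node_timeline: int, cluster_history: list) -> bool:
    # The last /history row describes the previous timeline, so the current
    # cluster timeline is one higher (or 1 if there is no history at all).
    cluster_timeline = int(cluster_history[-1][0]) + 1 if cluster_history else 1
    # A primary without the leader lock stays a candidate only if it isn't
    # behind the cluster timeline; otherwise it is demoted to read-only.
    return node_timeline >= cluster_timeline
```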
Close https://github.com/zalando/patroni/issues/2779
To do that we use the `pg_stat_get_wal_receiver()` function, which is available since 9.6. For older versions the `patronictl list` output and REST API responses remain as before.
If there is no WAL receiver process, we check whether `restore_command` is set and show the state as `in archive recovery`.
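The decision itself is simple; roughly (illustrative names, simplified):

```python
def replication_state(wal_receiver_status, restore_command):
    """wal_receiver_status: the status reported by pg_stat_get_wal_receiver(), or None."""
    if wal_receiver_status == 'streaming':
        return 'streaming'
    if restore_command:
        return 'in archive recovery'
    return None   # nothing to report, keep the previous behaviour
```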
Example of `patronictl list` output:
```bash
$ patronictl list
+ Cluster: batman -------------+---------+---------------------+----+-----------+
| Member      | Host           | Role    | State               | TL | Lag in MB |
+-------------+----------------+---------+---------------------+----+-----------+
| postgresql0 | 127.0.0.1:5432 | Leader  | running             | 12 |           |
| postgresql1 | 127.0.0.1:5433 | Replica | in archive recovery | 12 |         0 |
+-------------+----------------+---------+---------------------+----+-----------+
$ patronictl list
+ Cluster: batman -------------+---------+-----------+----+-----------+
| Member      | Host           | Role    | State     | TL | Lag in MB |
+-------------+----------------+---------+-----------+----+-----------+
| postgresql0 | 127.0.0.1:5432 | Leader  | running   | 12 |           |
| postgresql1 | 127.0.0.1:5433 | Replica | streaming | 12 |         0 |
+-------------+----------------+---------+-----------+----+-----------+
```
Example of REST API response:
```bash
$ curl -s localhost:8009 | jq .
{
  "state": "running",
  "postmaster_start_time": "2023-07-06 13:12:00.595118+02:00",
  "role": "replica",
  "server_version": 150003,
  "xlog": {
    "received_location": 335544480,
    "replayed_location": 335544480,
    "replayed_timestamp": null,
    "paused": false
  },
  "timeline": 12,
  "replication_state": "in archive recovery",
  "dcs_last_seen": 1688642069,
  "database_system_identifier": "7252327498286490579",
  "patroni": {
    "version": "3.0.3",
    "scope": "batman"
  }
}
$ curl -s localhost:8009 | jq .
{
  "state": "running",
  "postmaster_start_time": "2023-07-06 13:12:00.595118+02:00",
  "role": "replica",
  "server_version": 150003,
  "xlog": {
    "received_location": 335544816,
    "replayed_location": 335544816,
    "replayed_timestamp": null,
    "paused": false
  },
  "timeline": 12,
  "replication_state": "streaming",
  "dcs_last_seen": 1688642089,
  "database_system_identifier": "7252327498286490579",
  "patroni": {
    "version": "3.0.3",
    "scope": "batman"
  }
}
```
- make sure that physical replication slots are created even before the promote has happened (while the async executor is busy with the promote).
- execute `txid_current()` with `synchronous_commit=off` so that it doesn't accidentally wait for absent synchronous standbys when `synchronous_mode_strict` is enabled and `synchronous_standby_names=*`. These standbys can't connect because the replication slots weren't there yet.
- `synchronous_standby_names` wasn't set to `*` after bootstrap with `synchronous_mode` and `synchronous_mode_strict`.
- add `-c statement_timeout=0` to `PGOPTIONS` when executing the `post_bootstrap` script.
Close https://github.com/zalando/patroni/issues/2738
If we know for sure that a few moments ago postgres was still running as a primary, and we still hold the leader lock and can successfully update it, then we can safely start postgres back not in recovery. That allows us to avoid bumping the timeline without a reason and hopefully improves reliability, because it addresses issues similar to #2720.
In addition, remove the `if self.state_handler.is_starting()` check from the `recover()` method. This branch could never be reached, because the `starting` state is handled earlier in `_run_cycle()`. Besides that, remove the redundant `self._crash_recovery_executed`.
P.S. We currently do not cover cases when Patroni was killed along with Postgres.
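In a nutshell, the decision is (sketch with illustrative names):

```python
def start_mode(was_primary: bool, holds_leader_lock: bool, lock_updated: bool) -> str:
    # Start straight as a primary only when we know postgres was a primary moments
    # ago and the leader lock is still ours and was just successfully updated.
    if was_primary and holds_leader_lock and lock_updated:
        return 'start as primary'       # avoids bumping the timeline without a reason
    return 'start in recovery'          # the safe default
```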
Let's consider that we have just started Patroni, there is no leader, and `pg_controldata` reports `Database cluster state` as `shut down`. It feels logical to use `Latest checkpoint location` and `Latest checkpoint's TimeLineID` to do the usual leader race and start directly as a primary, but this could be totally wrong. The thing is that we run `postgres --single` if a standby wasn't shut down cleanly before executing `pg_rewind`. As a result the `Database cluster state` transitions from `in archive recovery` to `shut down`, but if such a node becomes a leader, the timeline must be increased.