PostgreSQL 13 finally made it possible to change `primary_conninfo` without a restart; a reload is enough. However, when the role changes from `replica` to `standby_leader` we want to call only the `on_role_change` callback and skip `on_reload`, because they would duplicate each other.
It could happen that the WAL segment required for `pg_rewind` no longer exists in `pg_wal`, and therefore `pg_rewind` can't find the checkpoint location before the point of divergence.
Starting with PostgreSQL 13, `pg_rewind` can use the `restore_command` to fetch missing WALs, but we can do better than that.
On older PostgreSQL versions Patroni parses the stdout and stderr of the failed rewind attempt, tries to fetch the missing WAL by calling the `restore_command`, and repeats the attempt.
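A rough sketch of the idea (the regular expression and helper name are invented here; Patroni's actual parsing is more thorough):

```python
import re
import subprocess

# Sketch only: find the missing WAL segment name in the pg_rewind output
# and fetch it with restore_command before retrying the rewind.
MISSING_WAL = re.compile(r'could not open file ".*?([0-9A-F]{24})"')

def fetch_missing_wal(rewind_output, restore_command, pg_wal_dir):
    match = MISSING_WAL.search(rewind_output)
    if not match:
        return False
    segment = match.group(1)
    # restore_command understands %f (file name) and %p (target path)
    cmd = restore_command.replace('%f', segment).replace('%p', pg_wal_dir + '/' + segment)
    return subprocess.call(cmd, shell=True) == 0
```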
1. Between the get_cluster() and update_leader() calls the K8s leader object might be updated from outside, so the resource version no longer matches (error code=409). Since we are watching for all changes, the ObjectCache will likely have the most up-to-date version and we take advantage of that. There is still a chance of hitting a race condition, but it is smaller than before. Other DCS implementations are free of this issue: in Etcd the update is based on a value comparison, while Zookeeper and Consul rely on the session mechanism.
2. If the update still fails, recheck the resource version of the leader object, verify that the current node is still the leader there, and repeat the call (a sketch of this logic follows below).
P.S. The leader race still relies on the version of the leader object as it was during the get_cluster() call.
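A hypothetical sketch of this retry logic; every name in it is invented for illustration:

```python
class K8sConflict(Exception):
    """Stand-in for the HTTP 409 error (resource version mismatch)."""

# All names (patch_leader, object_cache, leader_path, ...) are invented here.
def update_leader_with_retry(patch_leader, object_cache, leader_path, my_name, resource_version):
    try:
        return patch_leader(resource_version)
    except K8sConflict:
        # The ObjectCache watches all changes, so it likely holds a newer copy.
        cached = object_cache.get(leader_path)
        if cached and cached['annotations'].get('leader') == my_name:
            # We are still the leader according to the latest copy: retry once.
            return patch_leader(cached['resource_version'])
        return False
```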
In addition to that, the handling of K8s API errors is fixed: we should retry on 500, not on 502.
Close https://github.com/zalando/patroni/issues/1589
The `SSLSocket` performs the TLS handshake immediately on accept, which effectively blocks the whole REST API thread if the client side doesn't send any data.
To solve the issue we defer the handshake until the thread serving the request has started.
The solution is a bit hacky, but thread-safe.
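A simplified sketch of the approach using only the standard library (Patroni's actual implementation differs in details such as error handling):

```python
import ssl
from http.server import ThreadingHTTPServer

# Sketch: accept the connection without the TLS handshake and perform
# the handshake in the per-request worker thread instead.
class DeferredHandshakeServer(ThreadingHTTPServer):
    def __init__(self, addr, handler_class, certfile, keyfile):
        super().__init__(addr, handler_class)
        self._ssl_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        self._ssl_ctx.load_cert_chain(certfile, keyfile)

    def get_request(self):
        sock, addr = self.socket.accept()
        # do_handshake_on_connect=False keeps the accept loop from blocking on slow clients
        return self._ssl_ctx.wrap_socket(sock, server_side=True,
                                         do_handshake_on_connect=False), addr

    def finish_request(self, request, client_address):
        request.do_handshake()  # runs in the per-request thread, not in the accept loop
        super().finish_request(request, client_address)
```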
Close https://github.com/zalando/patroni/issues/1545
The Zookeeper implementation heavily relies on a cached version of the cluster view in order to minimize the number of requests. Stale member information is fine for the Patroni workflow because it basically relies only on member names and tags.
`GET /cluster` is a different case: since it is exposed externally it might be used for monitoring purposes, and therefore we should show up-to-date member information.
We don't need to rewind when:
1. the replayed location of the former replica is not ahead of the switchpoint
2. the end of the checkpoint record on the former primary is the same as the switchpoint
In order to get the end of the checkpoint record we run `pg_waldump` and parse its output, as sketched below.
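A rough sketch of the parsing; the regular expression and the 8-byte alignment assumption are ours, and Patroni's actual code may differ:

```python
import re
import subprocess

# pg_waldump prints lines like:
#   rmgr: XLOG  len (rec/tot): 114/114, tx: 0, lsn: 0/07000028, prev ..., desc: CHECKPOINT_SHUTDOWN ...
CHECKPOINT_RE = re.compile(r'len \(rec/tot\):\s*\d+/\s*(\d+), tx:\s*\d+, '
                           r'lsn: ([0-9A-Fa-f]+)/([0-9A-Fa-f]+),.*desc: CHECKPOINT')

def checkpoint_end_lsn(pg_waldump, wal_dir, timeline, segment):
    # segment is the WAL file that contains the checkpoint record
    out = subprocess.run([pg_waldump, '-p', wal_dir, '-t', str(timeline), segment],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        match = CHECKPOINT_RE.search(line)
        if match:
            tot_len, hi, lo = int(match.group(1)), int(match.group(2), 16), int(match.group(3), 16)
            # assumption for this sketch: the record end is the start LSN plus
            # the total length, rounded up to 8 bytes (MAXALIGN)
            return (hi << 32) + lo + ((tot_len + 7) & ~7)
```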
Close https://github.com/zalando/patroni/issues/1493
The standby cluster doesn't know about leader elections in the main cluster and therefore the usual mechanisms of detecting divergences don't work. For example, it could happen that the standby cluster is ahead of the new primary of the main cluster and must be rewound.
There is a way to detect that a new timeline has been created: check for the presence of a new history file in pg_wal. If a new file is there, we start the usual procedure of making sure that we can continue streaming, or run pg_rewind.
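A minimal illustration of the check, assuming the standard naming of timeline history files (`%08X.history`); Patroni's real check is more involved:

```python
import os

def new_timeline_history_exists(pg_wal_dir, current_timeline):
    # timeline history files are named <TLI>.history with the timeline as 8 hex digits
    history_file = '{0:08X}.history'.format(current_timeline + 1)
    return os.path.exists(os.path.join(pg_wal_dir, history_file))
```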
`touch_member()` could be called from the finally block of `_run_cycle()`. If it raised an exception, the whole Patroni process crashed.
In order to avoid future crashes we wrap `_run_cycle()` into a try..except block and ask the user to report a BUG.
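A simplified sketch of the wrapper (the function name here is ours; in Patroni this lives in the HA/daemon code):

```python
import logging

logger = logging.getLogger('patroni')

def safe_run_cycle(ha):
    try:
        return ha._run_cycle()
    except Exception:
        logger.exception('Unexpected exception in the HA loop')
        return 'Unexpected exception raised, please report it as a BUG'
```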
Close https://github.com/zalando/patroni/issues/1529
When deciding whether the running replica can stream from the new primary or must be rewound we should use the replayed location, therefore we extract the received and replayed locations independently.
The part of the query that extracts the timeline and locations is reused in the REST API.
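For illustration, with PostgreSQL 10+ function names (not necessarily Patroni's exact query, which also extracts the timeline):

```python
# The received and replayed locations are selected independently so that the
# streaming-vs-rewind decision can be based on the replayed one.
RECEIVE_REPLAY_SQL = """
SELECT pg_catalog.pg_last_wal_receive_lsn(),
       pg_catalog.pg_last_wal_replay_lsn()
"""
```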
So far Patroni has parsed `recovery.conf` or queried `pg_settings` in order to get the current values of recovery parameters. On PostgreSQL earlier than 12 it could easily happen that the value of `primary_conninfo` in `recovery.conf` has nothing to do with reality. Luckily for us, PostgreSQL 9.6+ provides the `pg_stat_wal_receiver` view, which contains the current values of `primary_conninfo` and `primary_slot_name`. The password field is masked, though, but this is fine because authentication happens only when the connection is opened. All other parameters are compared as usual.
Another advantage of `pg_stat_wal_receiver` is that it contains the current timeline, so on 9.6+ we don't need to use the replication connection trick if the walreceiver process is alive.
If there is no walreceiver process or it is not streaming, we stick to the old methods.
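For illustration, a query along these lines could be used on 9.6+ (documented columns of `pg_stat_wal_receiver`; not necessarily Patroni's exact statement):

```python
# conninfo, slot_name and received_tli are documented columns of the view.
WAL_RECEIVER_SQL = """
SELECT conninfo, slot_name, received_tli
  FROM pg_catalog.pg_stat_wal_receiver
 WHERE status = 'streaming'
"""
```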
When Patroni is trying to figure out whether pg_rewind is necessary, it could write the content of the history file from the primary into the log. The history file grows with every failover/switchover and eventually takes up too many lines in the log, most of which are not very useful.
Instead of showing the raw data, we will show only 3 lines before the current replica timeline and 2 lines after.
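A sketch of the trimming, assuming the history file has already been parsed into (timeline, lsn, reason) tuples; the function name is ours:

```python
def trim_history(history_lines, replica_timeline):
    # keep 3 lines before the current replica timeline and 2 lines after it
    before = [line for line in history_lines if line[0] < replica_timeline]
    after = [line for line in history_lines if line[0] >= replica_timeline]
    return before[-3:] + after[:2]
```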
Replicas wait for the checkpoint indication via the member key of the leader in DCS. The key is normally updated only once per HA loop.
Without waking the main thread up, replicas would have to wait up to `loop_wait` seconds longer than necessary.
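A minimal illustration of the wake-up mechanism, not Patroni's actual code: the main loop sleeps on an event instead of a plain sleep, so another thread can interrupt the wait early.

```python
import threading

class Wakeable:
    def __init__(self):
        self._event = threading.Event()

    def sleep(self, timeout):
        self._event.wait(timeout)
        self._event.clear()

    def wakeup(self):
        # e.g. called on a replica once the leader key indicates the checkpoint is done
        self._event.set()
```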
In dynamic environments it is common that etcd nodes change their IP addresses during a rolling upgrade. If the etcd node Patroni is currently connected to is upgraded last, it could happen that the cached topology no longer contains any live node, so the request can't be retried and fails completely, usually resulting in demotion of the primary.
To partially overcome the problem, Patroni already performs a periodic (every 5 minutes) rediscovery of the etcd cluster topology, but with very fast node rotation there was still a chance of hitting the issue.
This PR is an attempt to address the problem. If the list of nodes is exhausted, Patroni tries to perform the initial discovery again via an external mechanism, such as resolving A or SRV DNS records, and if the new list is different from the original one, Patroni uses it as the new etcd cluster topology.
In order to deal with TCP issues the connect_timeout is set to max(read_timeout/2, 1). This makes the list of members exhaust faster, but leaves time to perform topology rediscovery and another attempt.
The third issue addressed by this PR: it could happen that the DNS names of the etcd nodes didn't change but the IP addresses are new, therefore we clean up the internal DNS cache when doing topology rediscovery.
Besides that, this commit makes the `_machines_cache` property pretty much static: it is updated only when the topology has changed, which helps to avoid concurrency issues.
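A rough sketch of the rediscovery step under the assumptions named in the comments (the SRV record name and URL format are ours):

```python
import socket

import dns.resolver  # dnspython; newer versions spell query() as resolve()

def rediscover_machines(srv_domain=None, hosts=(), port=2379, protocol='http'):
    machines = set()
    if srv_domain:
        # assumption: etcd clients are published under _etcd-client._tcp.<domain>
        for rr in dns.resolver.query('_etcd-client._tcp.' + srv_domain, 'SRV'):
            machines.add('{0}://{1}:{2}'.format(protocol, rr.target.to_text(True), rr.port))
    for host in hosts:  # plain A/AAAA resolution, bypassing any cached addresses
        for _af, _kind, _proto, _cname, sa in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
            machines.add('{0}://{1}:{2}'.format(protocol, sa[0], port))
    return machines
```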
It is safe to call pg_rewind on a replica only when pg_control on the primary contains information about the latest timeline. Postgres usually does an immediate checkpoint right after promote, and in most cases this works just fine. Unfortunately, we regularly receive complaints that it takes too long (minutes) until the checkpoint is done and replicas can't perform the rewind, while doing the checkpoint manually helps immediately. So Patroni now does the same: once the promotion has happened and Postgres is no longer running in recovery, we explicitly issue a checkpoint.
We intentionally don't use the AsyncExecutor here, because we want the HA loop to continue its normal flow.
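A simplified version of the logic (connection handling omitted; the function name is ours):

```python
# Once Postgres has left recovery after the promote, issue an explicit
# CHECKPOINT so that pg_control carries the new timeline and replicas can rewind.
def checkpoint_after_promote(cursor):
    cursor.execute('SELECT pg_catalog.pg_is_in_recovery()')
    if not cursor.fetchone()[0]:
        cursor.execute('CHECKPOINT')
        return True
    return False
```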
## Feature: Postgres stop timeout
A switchover/failover operation hangs on the signal_stop (or checkpoint) call when the postmaster doesn't respond or hangs for some reason (issue described in [1371](https://github.com/zalando/patroni/issues/1371)). This leads to a loss of service for an extended period of time, until the hung postmaster starts responding or is killed by some other actor.
### master_stop_timeout
The number of seconds Patroni is allowed to wait when stopping Postgres; effective only when synchronous_mode is enabled. When set to a value > 0 and synchronous_mode is enabled, Patroni sends SIGKILL to the postmaster if the stop operation runs for longer than master_stop_timeout. Set the value according to your durability/availability trade-off. If the parameter is not set or is set to a value <= 0, master_stop_timeout does not apply.
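A hedged sketch of how such a timeout could be enforced, not the exact Patroni implementation:

```python
import os
import signal
import time

# Poll the stopping postmaster and escalate to SIGKILL after master_stop_timeout.
def stop_with_timeout(postmaster_pid, is_running, master_stop_timeout):
    deadline = None
    if master_stop_timeout and master_stop_timeout > 0:
        deadline = time.time() + master_stop_timeout
    while is_running(postmaster_pid):
        if deadline is not None and time.time() > deadline:
            os.kill(postmaster_pid, signal.SIGKILL)
            break
        time.sleep(1)
```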
$ python3 patronictl.py -c postgresql0.yml list
Error: Provided config file postgresql0.yml not existing or no read rights. Check the -c/--config-file parameter
It is a common issue that the primary recycles WALs when one of the replicas is down for a long time. So far there were only two solutions to this problem, and neither of them is perfect:
1. Increase `wal_keep_segments`, but it is hard to guess a good value.
2. Use continuous archiving and PITR, but it is not always possible.
This PR introduces a way to solve the problem for static clusters with a fixed number of nodes and names that never change. You just need to list the names of all nodes in `slots`, so the primary will not remove a slot when the corresponding node is down (not registered in DCS).
Of course, the primary will not create a permanent slot matching its own name.
Usage example: let's assume you have a cluster with nodes named *abc1*, *abc2*, and *abc3*.
You have to run `patronictl edit-config` and put the following snippet into the configuration:
```yaml
slots:
  abc1:
    type: physical
  abc2:
    type: physical
  abc3:
    type: physical
```
If the node *abc2* is the primary, it will always create slots for *abc1* and *abc3* even if they are not running, but will not create slot *abc2*.
Other nodes will behave the same.
Close #280
During shutdown Patroni tries to update its status in the DCS.
If the DCS is inaccessible, an exception might be raised. The lack of exception handling prevented the logger thread from stopping.
Fixes https://github.com/zalando/patroni/issues/1344
Upon the start of Patroni and Postgres, make sure that unix_socket_directories and stats_temp_directory exist, or try to create them. Patroni will exit if it fails to create them.
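A simple illustration of the check (the real Patroni code differs in details):

```python
import os
import sys

def ensure_directories(directories):
    for directory in directories:  # e.g. unix_socket_directories, stats_temp_directory
        if directory and not os.path.isdir(directory):
            try:
                os.makedirs(directory)
            except OSError as exc:
                sys.exit('Could not create directory {0}: {1}'.format(directory, exc))
```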
Close https://github.com/zalando/patroni/issues/863