patroni

mirror of https://github.com/outbackdingo/patroni.git synced 2026-01-27 18:20:05 +00:00

Author	SHA1	Message	Date
Alexander Kukushkin	48514db84b	Take into account current role when deciding on removal of member ZNode (#2884 ) Patroni doesn't watch all changes of member keys in order to not create too much load on ZooKeeper, but only subscribes to changes (ZNodes added or deleted) in the `/member` directory. Therefore when some important fields in the value are updated we remove and recreate the ZNode in order to notify the leader or other members. The leader should remove the member key only when the `checkpoint_after_promote` value is changed and replicas when the `state` is changed to/from `running`. We don't care about the `version` field, because Patroni version can't be changed without restart, which will cause the ZooKeeper `session_id` to change anyway. This fix hopefully will reduce failures of behave tests on GH Actions.	2023-09-26 09:12:31 +02:00
Feike Steenbergen	4725f12f9a	Allow integer gucs without units in validation (#2734 ) Previously, integer gucs, for example `max_connections` would not pass the validation, as these settings have no unit, if and only if they were specified as a string. This causes problems if the `max_connections` is configured in `patroni.yaml` as a string, for example, the following configuration would not result in the right `max_connections` settings, as `max_connections` is configured as a string: bootstrap: dcs: postgresql: parameters: log_checkpoints: "on" log_connections: "off" max_connections: "57" Allowing a user to specify all parameters as a string was accepted before in Patroni and also seems very useful, as many of us will be using Ansible/Helm/Golang to build a Patroni configuration, in which creating a `map[string]string` is easier than having to deal with data types. Attempts to address issue #2735 Regression was introduced in `76b3b99de2`	2023-07-10 13:44:54 +02:00
Alexander Kukushkin	24af774adb	Attempt to reduce behave flakiness on MacOS (#2645 ) Sometimes MacOS workers are so slow that Postgres shutdown might take more than 30s-40s, which breaks a test with replica reinit in parallel with the primary restart, because basebackup() makes only two attempts and during the pause the replica remains running with empty PGDATA. In addition to that increase timeouts in ignore_slots test. Close #2637	2023-04-13 12:21:08 +02:00
Alexander Kukushkin	4c3af2d1a0	Change master->primary/leader/member (#2541 ) keep as much backward compatibility as possible. Following changes were made: 1. All internal checks are performed as `role in ('master', 'primary')` 2. All internal variables/functions/methods are renamed 3. `GET /metrics` endpoint returns `patroni_primary` in addition to `patroni_master`. 4. Logs are changed to use leader/primary/member/remote depending on the context 5. Unit-tests are using only role = 'primary' instead of 'master' to verify that 1 works. 6. patronictl still supports old syntax, but also accepts `--leader` and `--primary`. 7. `master_(start|stop)_timeout` is automatically translated to `primary_(start|stop)_timeout` if the last one is not set. 8. updated the documentation and some examples Future plan: in the next major release switch role name from `master` to `primary` and maybe drop `master` altogether. The Kubernetes implementation will require more work and keep two labels in parallel. Label values should probably be configurable as described in https://github.com/zalando/patroni/issues/2495.	2023-01-27 07:40:24 +01:00
Alexander Kukushkin	4872ac51e0	Citus integration (#2504 ) Citus cluster (coordinator and workers) will be stored in DCS as a fleet of Patroni logically grouped together: ``` /service/batman/ /service/batman/0/ /service/batman/0/initialize /service/batman/0/leader /service/batman/0/members/ /service/batman/0/members/m1 /service/batman/0/members/m2 /service/batman/ /service/batman/1/ /service/batman/1/initialize /service/batman/1/leader /service/batman/1/members/ /service/batman/1/members/m1 /service/batman/1/members/m2 ... ``` Where 0 is a Citus group for coordinator and 1, 2, etc are worker groups. Such hierarchy allows reading the entire Citus cluster with a single call to DCS (except Zookeeper). The get_cluster() method will be reading the entire Citus cluster on the coordinator because it needs to discover workers. For the worker cluster it will be reading the subtree of its own group. Besides that we introduce a new method get_citus_coordinator(). It will be used only by worker clusters. Since there is no hierarchical structures on K8s we will use the citus group suffix on all objects that Patroni creates. E.g. ``` batman-0-leader # the leader config map for the coordinator batman-0-config # the config map holding initialize, config, and history "keys" ... batman-1-leader # the leader config map for worker group 1 batman-1-config ... ``` Citus integration is enabled from patroni.yaml: ```yaml citus: database: citus group: 0 # 0 is for coordinator, 1, 2, etc are for workers ``` If enabled, Patroni will create the database, citus extension in it, and INSERTs INTO `pg_dist_authinfo` information required for Citus nodes to communicate between each other, i.e. 'password', 'sslcert', 'sslkey' for superuser if they are defined in the Patroni configuration file. When the new Citus coordinator/worker is bootstrapped, Patroni adds `synchronous_mode: on` to the `bootstrap.dcs` section. 
Besides that, Patroni takes over management of some Postgres GUCs: - `shared_preload_libraries` - Patroni ensures that the "citus" is added to the first place - `max_prepared_transactions` - if not set or set to 0, Patroni changes the value to `max_connections*2` - `wal_level` - automatically set to logical. It is used by Citus to move/split shards. Under the hood Citus is creating/removing replication slots and they are automatically added by Patroni to the `ignore_slots` configuration to avoid accidental removal. The coordinator primary actively discovers worker primary nodes and registers/updates them in the `pg_dist_node` table using citus_add_node() and citus_update_node() functions. Patroni running on the coordinator provides the new REST API endpoint: `POST /citus`. It is used by workers to facilitate controlled switchovers and restarts of worker primaries. When the worker primary needs to shut down Postgres because of restart or switchover, it calls the `POST /citus` endpoint on the coordinator and the Patroni on the coordinator starts a transaction and calls `citus_update_node(nodeid, 'host-demoted', port)` in order to pause client connections that work with the given worker. Once the new leader is elected or Postgres is started back up, it performs another call to the `POST /citus` endpoint, which does another `citus_update_node()` call with the actual hostname and port and commits the transaction. After the transaction is committed, the coordinator reestablishes connections to the worker node and client connections are unblocked. If clients don't run long transactions the operation finishes without client visible errors, but only a short latency spike. All operations on the `pg_dist_node` are serialized by Patroni on the coordinator. This gives more control and makes it possible to ROLLBACK a transaction in progress if its lifetime exceeds a certain threshold and there are other worker nodes that should be updated.	2023-01-24 16:14:58 +01:00
Alexander Kukushkin	580530b30f	Behave tests on Windows (#2432 ) Windows doesn't support `SIGTERM`, but our behave tests in the majority of cases rely on Patroni graceful shutdown. In order to emulate the behaviour we introduced the new REST API endpoint `POST /sigterm`. The endpoint works only on Windows and when `BEHAVE_DEBUG` environment variable is set. Besides that some minor adjustments in behave tests were done. Mainly related to backslash-slash handling. In addition to that improve test coverage on Windows by properly mocking access to filesystem and avoiding calling `subprocess.call()`. Specifically, symlink creation on Windows requires Admin privileges and there is no `true.exe`.	2022-10-21 12:24:24 +02:00
Alexander Kukushkin	ead798d9ac	Speed up behave tests by always using loop_wait=2 (#2361 ) run time is reduced from ~5m30s to ~5m	2022-07-18 15:23:55 +02:00
Alexander Kukushkin	4215565cb4	Rearrange tests (#2146 ) - remove codacy steps: they removed legacy organizations and there seems to be no easy way of installing codacy app to the Zalando GH. - Don't run behave on MacOS: recently worker became way too slow - Disable behave for combination of kubernetes and python 2.7 - Remove python 3.5 (it will be removed by GH from workers in January) and add 3.10 - Run behave with 3.6 and 3.9 instead of 3.5 and 3.8	2021-12-21 09:36:22 +01:00
Alexander Kukushkin	d24051c31c	Optimize case when we don't have permanent logical slots (#2121 ) The unnecessary call of SlotsHandler.process_permanent_slots() results in one additional query to `pg_replication_slots` view every HA loop.	2021-11-30 14:20:55 +01:00
Alexander Kukushkin	8a8409999d	Change the behavior in pause (#1687 ) 1. Don't call bootstrap if PGDATA is missing/empty, because it might be on purpose, with someone/something working on it. 2. Consider postgres running as a leader in pause not healthy if pg_control sysid doesn't match with the /initialize key (empty initialize key will allow the "race" and the leader will "restore" initialize key). 3. Don't exit on sysid mismatch in pause, only log a warning. 4. Cover corner cases when Patroni started in pause with empty PGDATA and it was restored by somebody else 5. Empty string is a valid `recovery_target`.	2020-09-18 08:25:00 +02:00
Alexander Kukushkin	e95e54b94e	Handle correctly health-checks for standby cluster (#1553 ) Close https://github.com/zalando/patroni/issues/1388	2020-06-05 10:37:02 +02:00
Alexander Kukushkin	a5ff38a034	Improve behave tests (#1313 ) Hopefully, make them less flaky	2019-12-02 10:33:44 +01:00
Alexander Kukushkin	367d787ff9	Implement /history and /cluster endpoints (#1191 ) The /history endpoint shows the content of the `history` key in DCS. The /cluster endpoint shows all cluster members and some service info like pending and scheduled restarts or switchovers. In addition to that implement `patronictl history` Close #586 Close #675 Close #1133	2019-10-22 17:19:02 +02:00
Alexander Kukushkin	3d29cb7e50	Perform pg_ctl reload regardless of config changes (#1204 ) It is possible that some config files are not controlled by Patroni and when somebody is doing reload via REST API or by sending SIGHUP to Patroni process the usual expectation is that postgres will also be reloaded, but it didn't happen when there were no changes in the postgresql section of Patroni config. For example one might replace ssl_cert_file and ssl_key_file on the filesystem and starting from PostgreSQL 10 it just requires a reload, but Patroni wasn't doing it. In addition to that fix the issue with handling of `wal_buffers`. The default value depends on `shared_buffers` and `wal_segment_size` and therefore Patroni was exposing pending_restart when the new value in the config was explicitly set to -1 (default). Close https://github.com/zalando/patroni/issues/1198	2019-10-10 14:49:30 +02:00
wilfriedroset	2384d9e735	Add API route /health (#1079 ) close #119	2019-06-11 15:22:52 +02:00
Alexander Kukushkin	1a0876e5ca	Refactor acceptance tests to improve stability (#884 ) Hope it will crash less often when executed on travis against k8s	2018-11-30 12:40:56 +01:00
Alexander Kukushkin	87e9aab04c	Improve tests (#778 ) * Implement missing unit-tests * Add acceptance tests for ISSUE #776 * Update list of classifiers, keywords and authors	2018-08-29 11:29:37 +02:00
Alexander Kukushkin	18786464a1	Rename failover to switchover and make new failover work without leader (#588 ) In addition to that implement /switchover endpoint as an alias to /failover endpoint and implement more checks like: * candidate must be provided for a failover * switchover can't be scheduled in a pause state * and so on Fixes https://github.com/zalando/patroni/issues/585 Fixes https://github.com/zalando/patroni/issues/520	2018-01-05 15:17:56 +01:00
Alexander Kukushkin	25aa49b240	Run one manual failover test via rest API instead of patronictl and bump Patroni version	2017-07-31 11:18:01 +02:00
Alexander Kukushkin	322aa45e09	BUGFIX: patronictl edit-config didn't work with zookeeper (#492 ) When updating config key we should use `ClusterConfig.index` instead of `ClusterConfig.modify_index`. The second one should be used by Patroni internally to check that key was really changed, because when key is deleted and recreated its version always starts from the same value: 0. In addition to that use patronictl instead of http PATCH in some of acceptance tests to change cluster config. Fixes https://github.com/zalando/patroni/issues/491	2017-07-31 11:07:00 +02:00
Alexander Kukushkin	39f5f7982c	Scheduled failovers in 1 second don't work reliably with loop_wait=2	2017-01-13 11:25:07 +01:00
Alexander Kukushkin	1f829a4b34	Switch to trusty and run acceptance tests with postgres 9.6	2017-01-13 09:32:38 +01:00
Alexander Kukushkin	1e573aec8f	Do session/renew call to Consul when update_leader is called (#336 )	2016-10-10 10:05:55 +02:00
Alexander Kukushkin	33ff372ef6	Always try to rewind on manual failover	2016-09-01 11:08:26 +02:00
Alexander Kukushkin	1dcdd6eaa0	Acceptance tests for pause mode	2016-08-30 16:50:07 +02:00
Alexander Kukushkin	366ed9cc52	fix pep8 formatting and implement missing tests	2016-08-29 15:39:24 +02:00
Murat Kabilov	a47a2bceff	Manage scheduled restarts using patronictl (#248 ) Manage scheduled restarts using patronictl	2016-08-09 12:54:48 +02:00
Oleksii Kliukin	ffd27b5705	Rename with_pending_restart to restart_pending.	2016-07-13 11:07:37 +02:00
Oleksii Kliukin	bf95b75489	Use the parameter that really sets the pending_restart flag.	2016-07-11 18:20:15 +02:00
Oleksii Kliukin	c91eda8d78	Merge branch 'master' into feature/scheduled_restarts	2016-07-11 12:56:24 +02:00
Oleksii Kliukin	29845dd383	Restart the node according to the schedule. The scheduled restart data structures are now independent of those used by the normal restarts. This would be fixed in subsequent commits. Add the behave tests, that cover the POST /restart (but not DELETE).	2016-06-23 10:43:54 +02:00
Alexander Kukushkin	fcde17583c	Acceptance tests for patronictl Call patronictl.py when it's possible instead of doing REST API calls.	2016-06-16 15:06:18 +02:00
Alexander Kukushkin	24822bd9ac	Returning 304 for POST, PATCH, PUT is not a good idea	2016-06-06 10:50:42 +02:00
Alexander Kukushkin	ebb9e252d8	Rename restart_pending to pending_restart for compatibility	2016-06-02 09:31:30 +02:00
Alexander Kukushkin	1c30948ef9	Implement PUT /config and enhance some checks	2016-06-01 17:06:31 +02:00
Alexander Kukushkin	f7912991a8	Reshuffle acceptance tests one more time	2016-05-30 12:37:14 +02:00
Alexander Kukushkin	e085c866dc	Reshuffle acceptance tests Move dynamic config tests from basic_replication to patroni_api	2016-05-30 11:30:41 +02:00
Alexander Kukushkin	073ef3784f	Implement PATCH /config	2016-05-27 16:29:33 +02:00
Alexander Kukushkin	d57310bbc0	Fix one more corner-case It could take up to 10 seconds to create replication slot. In addition to that when replica fails to connect to the master via streaming replication it doesn't retry immediately, but with some timeout (5 seconds). 10 + 5 == 15, which causes replication check scenarios to fail.	2016-04-13 14:09:45 +02:00
Alexander Kukushkin	b4e86f0809	Make it possible to schedule failover in less than 10 seconds But only when API request was posted to the leader	2016-04-13 13:32:39 +02:00
Alexander Kukushkin	15d30a2d35	Try to stabilize acceptance tests	2016-04-13 13:32:39 +02:00
Alexander Kukushkin	24a2ea6cef	Refactor acceptance tests to make them work against ZooKeeper and make it easier to implement controllers for new DCS, i.e. consul	2016-04-10 10:37:43 +02:00
Alexander Kukushkin	e6af18f0bb	Former leader was not able to reattach to cluster without pg_rewind It was shut down correctly and I expected such 'join' working, but it was not, because new leader didn't have enough time to catch up with the master before promote.	2016-03-24 14:45:21 +01:00
Alexander Kukushkin	54055c1ff8	Rename ambiguous `Failover.member` to candidate But! 'member' is still accepted by REST API and also name 'member' is used to store/read this value to/from DCS (for backward compatibility)	2016-03-18 15:59:47 +01:00
Alexander Kukushkin	42d798a3de	acceptance tests on travis	2016-03-10 17:19:10 +01:00
Oleksii Kliukin	3f1c34f557	Add tests for the scheduled failover. The actual amount of time to establish the master and the replication after the scheduled failover seems sufficient (15 seconds with the failover in 10 seconds), but occasionally leads to test failures. This is unlikely to be a test issue and should be investigated inside the patroni.	2016-03-02 19:39:12 +01:00
Oleksii Kliukin	069440be15	Improve the "replication work" sentence definition. Add an ability to specify the origin and the destination for the replication works clause. Use this ability in the API promotion test to ensure the replication from the former replica to the former master.	2016-03-02 15:43:44 +01:00
Oleksii Kliukin	24ebcc72f6	Add more tests for the restart and promotion.	2016-03-01 22:07:18 +01:00
Oleksii Kliukin	0d44e3eb7c	Add simple API tests for 2 nodes, to be extended.	2016-02-26 18:00:11 +01:00
Oleksii Kliukin	4e9ebf48a8	Add API tests for a stand-alone node. Bugfixes. Add tests for patroni API. Fix test failures when an already running etcd is used.	2016-02-26 17:37:37 +01:00

50 Commits
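
The GUC handling described in the Citus integration commit (4872ac51e0) can be illustrated with a minimal sketch. This is not Patroni's actual code: the function name `apply_citus_gucs` is hypothetical, and the fallback of 100 for `max_connections` is an assumption based on the PostgreSQL default. It only mirrors the three rules stated in the commit message, and it accepts values given as strings, as allowed by commit 4725f12f9a.

```python
def apply_citus_gucs(parameters):
    """Return a copy of `parameters` adjusted per the rules in the commit message.

    Hypothetical helper for illustration only; not part of Patroni.
    """
    params = dict(parameters)

    # Rule 1: "citus" must be in first place in shared_preload_libraries.
    raw = str(params.get("shared_preload_libraries", ""))
    libs = [lib.strip() for lib in raw.split(",") if lib.strip()]
    libs = ["citus"] + [lib for lib in libs if lib != "citus"]
    params["shared_preload_libraries"] = ",".join(libs)

    # Rule 2: if max_prepared_transactions is not set or is 0,
    # use max_connections * 2 (100 is an assumed fallback here).
    if int(params.get("max_prepared_transactions", 0) or 0) == 0:
        max_connections = int(params.get("max_connections", 100))
        params["max_prepared_transactions"] = max_connections * 2

    # Rule 3: wal_level is always set to logical.
    params["wal_level"] = "logical"
    return params


if __name__ == "__main__":
    print(apply_citus_gucs({
        "max_connections": "57",
        "shared_preload_libraries": "pg_stat_statements, citus",
    }))
```

An explicitly configured non-zero `max_prepared_transactions` is left untouched in this sketch, which matches the "if not set or set to 0" wording of the commit.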