First of all, this patch changes the behavior of the `on_start`/`on_restart` callbacks: they will be called only when postgres is started or restarted without a role change. If the member is promoted or demoted, only the `on_role_change` callback will be executed.
Before that, `on_role_change` was never called for a standby leader; only `on_start`/`on_restart` were, and with a wrong role argument.
In addition to that, the REST API will return the `standby_leader` role for the leader of a standby cluster.
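For reference, a minimal sketch of the callback configuration these changes affect (the script path is hypothetical; each callback script is invoked with the action, the role and the cluster name as arguments):
```yaml
postgresql:
  callbacks:
    # executed only when postgres is started or restarted without a role change
    on_start: /usr/local/bin/patroni_callback.sh       # hypothetical script
    on_restart: /usr/local/bin/patroni_callback.sh
    # executed on promote/demote, and now also for a standby leader
    on_role_change: /usr/local/bin/patroni_callback.sh
```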
Closes https://github.com/zalando/patroni/issues/988
Permanent replication slots are preserved on failover/switchover, that is, Patroni on the new primary will create the configured replication slots right after promoting.
Slots can be configured with the help of `patronictl edit-config`.
The initial configuration can also be done in `bootstrap.dcs`:
```yaml
slots:
  permanent_physical_1:
    type: physical
  permanent_logical_1:
    type: logical
    database: foo
    plugin: pgoutput
```
It is the responsibility of the operator to make sure that there are no name clashes between the replication slots automatically created by Patroni for members and the permanent replication slots.
Closes https://github.com/zalando/patroni/issues/656
* Take and apply some parameters from controldata when starting as replica
https://www.postgresql.org/docs/10/static/hot-standby.html#HOT-STANDBY-ADMIN
There is a set of parameters whose values on the replica must not be smaller than on the primary, otherwise the replica will refuse to start:
* max_connections
* max_prepared_transactions
* max_locks_per_transaction
* max_worker_processes
It might happen that the values of these parameters in the global configuration are not set high enough, which makes it impossible to start a replica without human intervention. Usually this happens when we bootstrap a new cluster from a basebackup.
As a solution to this problem we will take the values of the above parameters from the pg_controldata output and, if the values in the global configuration are not high enough, apply the values taken from pg_controldata and set the `pending_restart` flag.
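For illustration, these parameters live in the dynamic configuration, e.g. under `bootstrap.dcs` (the values below are just the PostgreSQL defaults, not recommendations):
```yaml
bootstrap:
  dcs:
    postgresql:
      parameters:
        # if pg_controldata of the source cluster reports higher values,
        # Patroni will apply those instead and set the pending_restart flag
        max_connections: 100
        max_prepared_transactions: 0
        max_locks_per_transaction: 64
        max_worker_processes: 8
```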
* Use ConfigMaps or Endpoints for leader elections and to keep cluster state
* Label pods with a postgres role
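A rough sketch of the corresponding `kubernetes` configuration section (the namespace and label values are illustrative):
```yaml
kubernetes:
  namespace: default       # illustrative
  labels:
    application: patroni   # illustrative; used to find objects belonging to the cluster
  # keep the leader lock and cluster state in Endpoints instead of ConfigMaps
  use_endpoints: true
```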
* Change the behavior of pip install. From now on it will not install all dependencies; you have to explicitly specify the DCS you want to use Patroni with: `pip install patroni[etcd,zookeeper,kubernetes]`
* Only activate watchdog while master and not paused
We don't really need the protections while we are not master. This way
we only need to tickle the watchdog when we are updating the leader key or
while demotion is happening.
As implemented, we might fail to notice that we need to shut down the watchdog if
someone demotes postgres and removes the leader key behind Patroni's back.
There are probably other similar cases. Basically, if the administrator
is being actively stupid they might get unexpected restarts. That seems
fine.
* Add configuration change support. Change MODE_REQUIRED to disable leader eligibility instead of closing Patroni.
The watchdog timeout is changed during the next keepalive when the ttl is changed. The watchdog driver and requirement can also be switched online.
When the watchdog mode is `required` and the watchdog setup does not work, the effect is similar to `nofailover`. Add `watchdog_failed` to the status API to signify this. It is True only when the watchdog does not work **AND** it is required.
* Reset implementation when config changed while active.
* Add watchdog safety margin configuration
Defaults to 5 seconds. Basically this is the maximum amount of time
that can pass between the calls to `dcs.update_leader()` and
`watchdog.keepalive()`, which are called right after each other. Should
be safe for pretty much any sane scenario and allows the default
settings not to trigger the watchdog when DCS is not responding.
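Putting the watchdog options mentioned above together, a minimal configuration sketch (`/dev/watchdog` is the typical Linux software watchdog device; adjust to your setup):
```yaml
watchdog:
  # off, automatic or required; with `required`, a broken watchdog setup
  # makes the node ineligible for leadership (similar to nofailover)
  mode: required
  device: /dev/watchdog
  # maximum time allowed between dcs.update_leader() and watchdog.keepalive()
  safety_margin: 5
```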
* Cancel bootstrap if watchdog activation fails
The system would have demoted itself anyway during the next HA loop. Doing it
in bootstrap gives at least some other node a chance to try bootstrapping,
in the hope that it is configured correctly.
If all nodes are unable to activate, they will continue to try until the
disk is filled with moved datadirs. Perhaps not ideal behavior, but as
the situation is unlikely to resolve itself without administrator
intervention it doesn't seem too bad.
The task of restoring a cluster from a backup or cloning an existing cluster into a new one has been floating around for some time. It was kind of possible to achieve by doing a lot of manual actions and was very error prone. So I came up with the idea of making the way we bootstrap a new cluster configurable.
In short - we want to be able to run a custom script instead of running initdb.
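A rough sketch of how such a custom bootstrap could be configured (the method name and restore script are hypothetical; `bootstrap.method` points at a section that describes the command to run instead of initdb):
```yaml
bootstrap:
  # run the method below instead of initdb
  method: clone_from_backup                     # hypothetical method name
  clone_from_backup:
    command: /usr/local/bin/restore_backup.sh   # hypothetical restore script
```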
For backward compatibility this feature is not enabled by default. To enable it you have to set `postgresql.use_unix_socket: true`.
If the feature is enabled and `unix_socket_directories` is defined and non-empty, Patroni will use the first suitable value from it to connect to the local postgres cluster.
If `unix_socket_directories` is not defined, Patroni will assume that the default value should be used, will not pass `host` in the command line arguments and will omit it from the connection url.
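A minimal sketch of the relevant configuration (the directory path is illustrative):
```yaml
postgresql:
  use_unix_socket: true
  parameters:
    # Patroni will pick the first suitable directory from this value
    unix_socket_directories: /var/run/postgresql
```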
Solves: https://github.com/zalando/patroni/issues/61
In addition to the above, this commit solves a couple of bugs:
* a manual failover with pg_rewind in the paused state was broken
* psycopg2 (or libpq, I am not really sure which exactly) doesn't mark the cursor's connection as closed when we use a unix socket and an `OperationalError` occurs. We will close such connections on our own.
Previously we were running pg_rewind only in a limited number of cases:
* when we knew postgres was a master (no recovery.conf in the data dir)
* when we were doing a manual switchover to a specific node (no
guarantee that this node is the most up-to-date)
* when a given node has the nofailover tag (it could be ahead of the new master)
This approach was kind of working in most of the cases, but sometimes we
were executing pg_rewind when it was not necessary, and in some other
cases we were not executing it although it was needed.
The main idea of this PR is to first figure out whether we really need
to run pg_rewind by analyzing the timeline id, LSN and history file on the master
and the replica, and run it only if it's needed.
Originally Exhibitor was supported in the ZooKeeper class, and the
configuration for Exhibitor was also taken from the `zookeeper` section of
the yaml config file. In fact, Exhibitor just extends ZooKeeper; now
this is reflected in the code and Exhibitor got its own section in
the config.yaml file. This will make it easier to configure Exhibitor
hosts and port via environment variables when PR#211 is merged.
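The new section could look roughly like this (host names and the Exhibitor REST port are illustrative):
```yaml
exhibitor:
  hosts:
    - exhibitor-node-1    # illustrative
    - exhibitor-node-2
  port: 8181              # illustrative Exhibitor REST API port
```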
Without sudo travis is executing build tasks using docker, and the waiting
time in this case is really small, usually not longer than 10 seconds.
postgresql-9.5 is installed via addons.apt.packages (without sudo),
but ports 5432 and 5433 are busy, so I had to adjust environment.py to
assign ports from a higher range.
And a few words about the build tasks:
The first task is used for executing unit tests for all the different python versions.
The second one is used for executing acceptance tests against etcd.
The third one is used for executing acceptance tests against zookeeper.
Acceptance tests are executed with python2.7 and python3.5.
In addition to that, I've introduced caching of the python virtual environment.
It really helps to reduce the time needed to install python modules.