A few times we observed that the Patroni HA loop was blocked for a few minutes because it could not write logs to stderr. This is a very rare condition which we have hit so far only on k8s. This commit makes Patroni resilient to this kind of problem. All log messages are first written into an in-memory queue and later asynchronously flushed to stderr or to a file from a separate thread.
The maximum queue size is configurable and the default value is 1000. This should be enough to keep more than one hour of log messages with default settings when the Patroni cluster operates normally (without big issues).
If we hit the maximum size of the queue, further log messages are discarded until the queue size is reduced. The number of discarded messages is reported in the log later.
In addition to that, the number of non-flushed and discarded messages (if there are any) will be reported via the Patroni REST API as:
```json
"logger_queue_size": X,
"logger_records_lost": Y`
```
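For quick inspection, these counters can be read from the status endpoint. A minimal sketch, assuming the REST API listens on `127.0.0.1:8008`:

```python
import json
import urllib.request

# Fetch the node status from the Patroni REST API and print the logger counters
status = json.loads(urllib.request.urlopen('http://127.0.0.1:8008/patroni').read().decode())
print(status.get('logger_queue_size'), status.get('logger_records_lost'))
```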
* Convert postgresql.py into a package
* Factor out cancellable process into a separate class
* Factor out connection handler into a separate class
* Move postmaster into postgresql package
* Factor out pg_rewind into a separate class
* Factor out bootstrap into a separate class
* Factor out slots handler into a separate class
* Factor out postgresql config handler into a separate class
* Move callback_executor into postgresql package
This is just careful refactoring, without any behavior changes.
Using `ssl.wrap_socket` is deprecated and was still allowing soon-to-be-deprecated protocols like TLS 1.1.
Now we use `ssl.create_default_context()` to produce a secure SSL context to wrap the REST API server's socket.
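As an illustration, a minimal server-side sketch of this approach (the certificate paths and port are hypothetical):

```python
import socket
import ssl

# Build a server-side context with secure defaults (SSLv2/SSLv3 disabled;
# newer Python versions also disable TLS 1.0/1.1 by default)
ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
ctx.load_cert_chain(certfile='/etc/patroni/server.crt', keyfile='/etc/patroni/server.key')

server = socket.socket()
server.bind(('', 8008))
server.listen(5)
secure_server = ctx.wrap_socket(server, server_side=True)  # replaces ssl.wrap_socket(...)
```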
The recently released psycopg2 is split into two different packages, psycopg2 and psycopg2-binary, which could be installed at the same time into the same place on the filesystem. In order to reduce the dependency-hell problem, we let the user choose how to install psycopg2. There are a few options available and they are reflected in the documentation.
This PR also changes the following behavior:
* `pip install patroni` will fail if psycopg2 is not installed
* Patroni will check psycopg2 upon start and fail if it can't be found or is outdated (see the sketch below).
Closes https://github.com/zalando/patroni/issues/1021
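A minimal sketch of such a startup check; the minimum version used here is an assumption, not necessarily the value enforced by Patroni:

```python
# Hypothetical startup check: bail out early if psycopg2 is missing or too old
try:
    import psycopg2
except ImportError:
    raise SystemExit('FATAL: psycopg2 is not installed')

# psycopg2.__version__ looks like "2.8.3 (dt dec pq3 ext lib=...)"
version = tuple(int(x) for x in psycopg2.__version__.split(' ')[0].split('.'))
if version < (2, 5, 4):  # assumed minimum version
    raise SystemExit('FATAL: psycopg2 is too old: {0}'.format(psycopg2.__version__))
```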
The latest timeline is calculated from the `/history` key in DCS. In case there is no such key or it contains some garbage we consider the node healthy.
Closes https://github.com/zalando/patroni/issues/890
* Always run `pg_rewind` against the remote master
* Always use the remote master as the source when "recovering" stopped standby leader
* Use remote master as the source when "recovering" the node in the unhealthy cluster
* Use the local dbname as the fallback when doing `pg_rewind` from the remote master
* `no_replication_slot` is the allowed key in the `RemoteMember` object
* Make it possible to "bootstrap" the new `standby_cluster` with an existing (and valid) data directory. There is one prerequisite though: there must be no `patroni.dynamic.json` file in it!
* Use `shutil.move` instead of `os.replace`, which is only available from Python 3.3
* Introduce standby-leader health-check and consul service
* Improve unit tests, some lines were not covered
* Rename `assertEquals` -> `assertEqual`, due to a deprecation warning
If Patroni gets partitioned it starts receiving stale information from DCS.
We can't use this information to determine that we hold the leader key.
Instead, we record in the `Ha` object the actual result of the acquire/update lock operation and report the node as the leader only if it was successful.
P.S.: Despite responding with 200 to `GET /master`, postgres was still running read-only.
It is possible to change a lot of parameters at runtime (including `restapi.listen`) by updating the Patroni config file and sending SIGHUP to the Patroni process.
If something was misconfigured, it was throwing a weird exception and breaking the `restapi` thread.
This PR makes the error message friendlier and avoids breaking the `restapi` thread.
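For reference, a hypothetical way to trigger such a reload from Python (the pid file path is illustrative):

```python
import os
import signal

# Ask a running Patroni process to re-read its configuration file
with open('/var/run/patroni.pid') as f:  # illustrative path
    os.kill(int(f.read().strip()), signal.SIGHUP)
```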
It didn't directly affect either failover or switchover, but in some rare cases success was reported too early, when the former leader released the lock: `Failed over to "None" instead of "desired-node"`.
In addition to that, this commit improves logs and status messages by differentiating between failover and switchover.
It is very easy to get the current timeline on the master by executing
```sql
SELECT ('x' || SUBSTR(pg_walfile_name(pg_current_wal_lsn()), 1, 8))::bit(32)::int
```
Unfortunately the same method doesn't work when postgres is in recovery. Therefore, on replicas we will use a replication connection for that. In order to avoid opening and closing a replication connection on every HA loop, we cache the result as long as its value matches the timeline of the master.
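For illustration, a replication connection can report the current timeline via `IDENTIFY_SYSTEM`; a minimal sketch with psycopg2 (the DSN is hypothetical):

```python
import psycopg2
import psycopg2.extras

# A physical replication connection is required to issue replication-protocol commands
conn = psycopg2.connect('host=127.0.0.1 port=5432 user=replicator',
                        connection_factory=psycopg2.extras.PhysicalReplicationConnection)
cur = conn.cursor()
cur.execute('IDENTIFY_SYSTEM')
system_id, timeline, xlogpos, dbname = cur.fetchone()
print('current timeline:', timeline)
```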
Also this PR introduces a new key in DCS: `/history`. It will contain a JSON-serialized object with the timeline history in a format similar to the usual history files (see the example after the list). The differences are:
* The second column is the absolute WAL position in bytes, instead of an LSN
* Optionally there might be a fourth column: a timestamp (the mtime of the history file)
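A hypothetical `/history` value, after deserialization, might look like this (the concrete numbers and timestamps are made up):

```python
# One entry per finished timeline:
# [timeline, absolute WAL position in bytes, reason, optional timestamp]
history = [
    [1, 25331968, 'no recovery target specified', '2019-01-01T10:00:00+00:00'],
    [2, 50663936, 'no recovery target specified', '2019-02-01T10:00:00+00:00'],
]
```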
Make it possible to cancel a running task if you want to reinitialize a replica.
There are two possible ways to trigger it:
1. patronictl will ask whether you want to cancel an already running task if an attempt to trigger reinitialize has failed
2. if you are using the `--force` argument with `patronictl reinit`
In addition to that, implement additional checks around manual failover and recovery when synchronous_mode is enabled.
* Comparison must be case insensitive
* Do not send keepalives if watchdog is not active
* Avoid activating watchdog in a pause mode
* Set correct postgres state in pause mode
* Don't try to run queries from API if postgres is stopped
Originally fetch_nodes_statuses was returning a tuple, later it was wrapped into the namedtuple _MemberStatus, and recently _MemberStatus was extended with the watchdog_failed field, but api.py was still relying on the plain tuple and checking failover limitations on its own instead of calling the `failover_limitation` method.
* Only activate watchdog while master and not paused
We don't really need the protections while we are not the master. This way we only need to tickle the watchdog when we are updating the leader key or while a demotion is happening.
As implemented, we might fail to shut down the watchdog if someone demotes postgres and removes the leader key behind Patroni's back. There are probably other similar cases. Basically, if the administrator is being actively stupid they might get unexpected restarts. That seems fine.
* Add configuration change support. Change MODE_REQUIRED to disable leader eligibility instead of closing Patroni.
Changes watchdog timeout during the next keepalive when ttl is changed. Watchdog driver and requirement can also be switched online.
When watchdog mode is `required` and the watchdog setup does not work, the effect is similar to nofailover. Add watchdog_failed to the status API to signify this. It is True only when the watchdog does not work **AND** is required.
* Reset implementation when config changed while active.
* Add watchdog safety margin configuration
Defaults to 5 seconds. Basically this is the maximum amount of time
that can pass between the calls to `dcs.update_leader()` and
`watchdog.keepalive()`, which are called right after each other. This should
be safe for pretty much any sane scenario and allows the default
settings to not trigger the watchdog when the DCS is not responding.
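To illustrate the timing relationship, a sketch under the assumption that the watchdog timeout is derived from the TTL minus the safety margin (the names are not Patroni internals):

```python
ttl = 30                                # leader key TTL in the DCS
safety_margin = 5                       # watchdog.safety_margin, the default
watchdog_timeout = ttl - safety_margin  # fence postgres before the leader key can expire

# As long as watchdog.keepalive() runs within `safety_margin` seconds after
# dcs.update_leader(), the node is fenced before another member can take the lock.
```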
* Cancel bootstrap if watchdog activation fails
The system would have demoted itself anyway on the next HA loop. Doing it
during bootstrap at least gives some other node a chance to try bootstrapping
in the hope that it is configured correctly.
If all nodes are unable to activate the watchdog they will keep trying until the
disk fills up with moved data directories. Perhaps not ideal behavior, but since
the situation is unlikely to resolve itself without administrator
intervention it doesn't seem too bad.
Previously pg_ctl waited for a timeout and then happily carried on, considering PostgreSQL to be running. This caused PostgreSQL to show up in listings as running when it actually was not, and caused a race condition that resulted in either a failover, a crash recovery, or a crash recovery interrupted by a failover and a missed rewind.
This change adds a master_start_timeout parameter and introduces a new state for the main run_cycle loop: starting. When master_start_timeout is zero we will fail over as soon as there is a failover candidate. Otherwise PostgreSQL will be started, but once master_start_timeout expires we will stop it and release the leader lock if failover is possible. Once failover succeeds or fails (no leader and no one to take the role), we continue with normal processing. While we are waiting for the master start timeout we handle manual failover requests.
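An illustrative sketch of the decision made while in the new starting state (hypothetical function and variable names, not Patroni's actual code):

```python
def while_starting_as_master(master_start_timeout, seconds_since_start, has_failover_candidate):
    """Decide what to do while postgres is still starting and we hold the leader lock."""
    if master_start_timeout == 0 and has_failover_candidate:
        return 'release the leader lock and fail over immediately'
    if seconds_since_start > master_start_timeout and has_failover_candidate:
        return 'stop postgres, release the leader lock and let a replica take over'
    return 'keep waiting for postgres to finish starting (manual failover is still honored)'
```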
* Introduce a timeout parameter for restarts.
When the restart timeout is set, the master becomes eligible for failover after that timeout expires, regardless of master_start_timeout. Immediate restart calls will wait for this timeout to pass, even when the node is a standby.
Previously replicas were always watching the leader key (even if postgres was not running there). It was not a big issue, but it was not possible to interrupt such a watch when postgres successfully started up or stopped. It was also delaying the update_member call, so we had somewhat stale information in DCS for up to `loop_wait` seconds. This commit changes that behavior. If the async_executor is busy starting, stopping or restarting postgres, we will not watch the leader key but instead wait for an event from the async_executor for up to `loop_wait` seconds. The async executor fires such an event only if the function it was calling returned something that evaluates to boolean True (sketched below).
Such functionality is really needed to change the way we decide whether pg_rewind is necessary. That decision will require a running local postgres, so it is really important for us to get such a notification as soon as possible.
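A minimal sketch of the signaling described above (illustrative, not Patroni's actual code):

```python
import threading

finished_event = threading.Event()

def run_async(func, *args):
    """Run func in a background thread and fire the event only on a truthy result."""
    def wrapper():
        if func(*args):           # e.g. postgres finished starting or stopping
            finished_event.set()  # wake up the HA loop early
    threading.Thread(target=wrapper).start()

# In the HA loop: instead of watching the leader key, wait up to loop_wait seconds
loop_wait = 10
woke_up_early = finished_event.wait(timeout=loop_wait)
finished_event.clear()
```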
* Replace pytz.UTC with dateutil.tz.tzutc; it helps to reduce memory usage by more than 4 MB...
* Fix the check of the Python version: 0x0300000 => 0x3000000
* Update leader key before restart and demote
Adds a new configuration variable synchronous_mode. When enabled, Patroni will manage synchronous_standby_names to enable synchronous replication whenever there are healthy standbys available. With synchronous mode enabled, Patroni will automatically fail over only to a standby that was synchronously replicating at the time of the master failure. This effectively means zero lost user-visible transactions.
To enforce the synchronous failover guarantee, Patroni stores the current synchronous replication state in the DCS using strict ordering: first enable synchronous replication, then publish the information. A standby can use this to verify that it was indeed a synchronous standby before the master failed and is therefore allowed to fail over.
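A sketch of this ordering (hypothetical function names, not Patroni's actual code):

```python
def switch_sync_standby(postgresql, dcs, new_sync_member):
    # 1. Make the standby synchronous on the master first ...
    postgresql.set_synchronous_standby_names(new_sync_member)
    # 2. ... and only then publish it to the DCS.  If we crash in between,
    #    the DCS still points at a standby that really was synchronous.
    dcs.write_sync_state(leader=postgresql.name, sync_standby=new_sync_member)
```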
We can't enable multiple standbys as synchronous, allowing PostgreSQL to pick one, because we can't know which one was actually set to be synchronous on the master when it failed. This means that on standby failure commits will be blocked on the master until the next run_cycle iteration. TODO: figure out a way to poke Patroni to run sooner, or allow PostgreSQL to pick one without the possibility of lost transactions.
On graceful shutdown standbys will disable themselves by setting a nosync tag for themselves and waiting for the master to notice and pick another standby. This adds a new mechanism for Ha to publish dynamic tags to the DCS.
When the synchronous standby goes away or disconnects, a new one is picked and synchronous replication on the master is switched over to it. If no synchronous standby exists, Patroni disables synchronous replication (synchronous_standby_names=''), but not synchronous_mode. In this case, only the node that was previously the master is allowed to acquire the leader lock.
Added acceptance tests and documentation.
Implementation by @ants with extensive review by @CyberDem0n.
When Patroni was "joining" an already running postgres it was not calling callbacks, which in some cases caused issues (a callback could be used to change routing/load-balancer configuration or to assign/remove a floating (service) IP).
In addition to that, we should `start` postgres instead of `restart`-ing it when doing recovery, because in this case the 'on_start' callback should be called instead of 'on_restart'.
* Make different kazoo timeouts dependent on loop_wait
ping timeout ~ 1/2 * loop_wait
connect_timeout ~ 1/2 * loop_wait
Originally these values were calculated from the negotiated session timeout
and didn't work very well, because it was taking significant time to
figure out that the connection was dead and reconnect (up to the session timeout),
leaving us no time to retry.
* Address the code review
Fix return value in the should_run_scheduled_action and the comments.
Correct the json composition in the scheduled_restart test.
Fix the delete in case there is no scheduled restart.
Fix the usage of format in the logger output.
Fix the indentation in the evaluate_scheduled_restart.
Fix the condition related to the body_is_optional in the do_POST_restart.
Fix a few typos in the error messages.
Fix the _read_json_content.
Make the scheduled restart unit-tests a bit less ugly
Make sure the scheduled restart flag is cleared when the
postmaster_start_time changes after the restart was scheduled.
Additionally, separate the logic of checking the restart conditions
into its own function in order to support conditions for normal
restarts as well.
The scheduled restart data structures are now independent of those
used by normal restarts. This will be fixed in subsequent
commits.
Add behave tests that cover POST /restart (but not DELETE).
The scheduled restart API extends the already existing restart
endpoint by processing the parameters in the request body.
Only one scheduled restart at a time is supported. The DELETE method
on the /restart endpoint is used to remove an existing scheduled restart.
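For illustration, scheduling a restart might look like this (the field names in the body are assumptions, not taken from this description):

```python
import json
import urllib.request

# Hypothetical request body: when to restart and an optional condition
body = json.dumps({
    'schedule': '2030-01-01T10:00:00+00:00',
    'restart_pending': True,
}).encode()

req = urllib.request.Request('http://127.0.0.1:8008/restart', data=body,
                             headers={'Content-Type': 'application/json'}, method='POST')
print(urllib.request.urlopen(req).read().decode())
```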
This also removes some other tricks with overriding the handle_one_request and finish
methods of the parent class, which were necessary only to make the OPTIONS
request from haproxy work with python2 but in fact still did not
work with python3. Instead of doing all that magic we should simply
give haproxy what it wants: an HTTP response code and nothing
more.