90 Commits

Author SHA1 Message Date
Alexander Kukushkin
a4bd6a9b4b Refactor postgresql class (#1060)
* Convert postgresql.py into a package
* Factor out cancellable process into a separate class
* Factor out connection handler into a separate class
* Move postmaster into postgresql package
* Factor out pg_rewind into a separate class
* Factor out bootstrap into a separate class
* Factor out slots handler into a separate class
* Factor out postgresql config handler into a separate class
* Move callback_executor into postgresql package

This is just a careful refactoring, without code changes.
2019-05-21 16:02:47 +02:00
Alexander Kukushkin
4a4258fc3f Mock external resources (#995)
Unit tests should not accidentally hit a running Postgres, the DCS or the filesystem unless we explicitly want them to.
2019-03-12 10:39:42 +01:00
Alexander Kukushkin
c64d51f79c Better support for static etcd cluster (#986)
If `etcd.use_proxies` is set to true, Patroni will stick to the list of hosts specified in `etcd.hosts` and avoid doing topology discovery. This mode might be useful when you know that you connect to the etcd cluster via a set of proxies or when the etcd cluster has a static topology.
2019-03-07 11:36:36 +01:00
Alexander Kukushkin
76d1b4cfd8 Minor fixes (#808)
* Use `shutil.move` instead of `os.replace`, which is available only from Python 3.3
* Introduce standby-leader health-check and consul service
* Improve unit tests, some lines were not covered
* Rename `assertEquals` -> `assertEqual`, due to a deprecation warning
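
The `shutil.move` point above can be illustrated with a minimal sketch. The helper name `write_via_temp_file` is hypothetical and is not taken from Patroni; it only shows the general pattern of writing to a temporary file and then moving it into place with a call that exists on both Python 2 and Python 3.

```python
import os
import shutil
import tempfile


def write_via_temp_file(path, content):
    """Write content to a temp file in the target directory, then move it into place.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(content)
        # shutil.move is available on Python 2 and 3; os.replace only from 3.3
        shutil.move(tmp_path, path)
    except Exception:
        # Do not leave the temporary file behind if anything went wrong
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Creating the temporary file in the same directory as the target keeps the move on one filesystem, where `shutil.move` can rely on a plain rename.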
2018-09-19 16:32:33 +02:00
Alexander Kukushkin
5ce18a8045 Improve protection of DCS being accidentally wiped (#680)
We already have a lot of logic in place to prevent failover in such a case and to restore all keys, but an accidental removal of the `/config` key was effectively switching off pause mode for one cycle of the HA loop.
2018-05-18 11:18:58 +02:00
Alexander Kukushkin
89a11fed07 Don't rediscover etcd cluster topology when watch timed out (#630)
but switch to the next node if it is possible.

Fixes https://github.com/zalando/patroni/issues/628
2018-02-26 18:48:30 +01:00
Alexander Kukushkin
03c2a85d23 Expose current timeline in DCS and via API (#591)
It is very easy to get the current timeline on the master by executing
```sql
SELECT ('x' || SUBSTR(pg_walfile_name(pg_current_wal_lsn()), 1, 8))::bit(32)::int
```

Unfortunately the same method doesn't work when postgres is_in_recovery. Therefore we will use replication connection for that on the replicas. In order to avoid opening and closing replication connection on every HA loop we will cache the result if its value matches with the timeline of the master.
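
The SQL above relies on the fact that the first 8 hexadecimal digits of a WAL file name encode the timeline ID. A minimal Python sketch of the same conversion, with a hypothetical function name, could look like this:

```python
def timeline_from_wal_file_name(wal_file_name):
    """Return the timeline ID encoded in the first 8 hex digits of a WAL file name."""
    return int(wal_file_name[:8], 16)


# Example: a segment on timeline 3
print(timeline_from_wal_file_name('000000030000000000000010'))
```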

This PR also introduces a new key in DCS: `/history`. It will contain a JSON-serialized object with the timeline history in a format similar to the usual history files. The differences are:
* The second column is the absolute WAL position in bytes, instead of an LSN
* Optionally there might be a fourth column: a timestamp (the mtime of the history file)
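
To make the first difference concrete: an LSN written as `X/Y` is a 64-bit position whose high and low 32 bits are printed in hexadecimal. The sketch below converts it to an absolute byte position. The function name and the exact shape of the serialized entry are assumptions for illustration, not the format Patroni necessarily writes.

```python
import json


def lsn_to_bytes(lsn):
    """Convert an LSN such as '0/3000060' into an absolute WAL position in bytes."""
    high, low = lsn.split('/')
    return (int(high, 16) << 32) + int(low, 16)


# Hypothetical history entry: timeline, absolute position, reason
history_entry = [1, lsn_to_bytes('0/3000060'), 'no recovery target specified']
print(json.dumps([history_entry]))
```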
2018-01-05 15:25:56 +01:00
Alexander Kukushkin
0e01bb33bb Improve patronictl reinit (#576)
Make it possible to cancel a running task if you want to reinitialize a replica.
There are two possible ways to trigger it:
1. patronictl will ask whether you want to cancel the already running task if an attempt to trigger reinitialize has failed
2. if you are using the `--force` argument with `patronictl reinit`
2018-01-04 10:31:44 +01:00
Alexander Kukushkin
b6425cab85 Allow to specify multiple hosts for etcd (#589)
This list will be used for the initial discovery of etcd cluster members.
If for some reason this list of hosts is exhausted during work, Patroni will return to the initial list.

In addition to that improve ipv6 compatibility by using a special function for splitting host and port.
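
Splitting on the last colon alone breaks for IPv6 literals, which is why a dedicated function helps. The following is a minimal sketch under the assumption that IPv6 addresses with a port are written in brackets; it is not Patroni's actual helper, and the name and signature are illustrative.

```python
def split_host_port(value, default_port):
    """Split 'host[:port]' into (host, port), accepting bracketed IPv6 literals."""
    value = value.strip()
    if value.startswith('['):
        # Bracketed IPv6 literal: "[::1]" or "[::1]:2379"
        host, _, rest = value[1:].partition(']')
        port = rest[1:] if rest.startswith(':') else ''
    elif value.count(':') == 1:
        # Hostname or IPv4 address with an explicit port
        host, _, port = value.partition(':')
    else:
        # No port, or a bare IPv6 literal whose colons belong to the address
        host, port = value, ''
    return host, int(port) if port else default_port
```

For example, `split_host_port('[::1]:4001', 2379)` yields the host `::1` with port `4001`, while `split_host_port('etcd1', 2379)` falls back to the default port.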

Fixes https://github.com/zalando/patroni/issues/523
2018-01-04 10:25:06 +01:00
Alexander Kukushkin
4328c15010 Make Patroni Kubernetes native (#500)
* Use ConfigMaps or Endpoints for leader elections and to keep cluster state
* Label pods with a postgres role
* Change the behavior of pip install. From now on it will not install all dependencies; you have to explicitly specify the DCS you want to use Patroni with: `pip install patroni[etcd,zookeeper,kubernetes]`
2017-12-08 16:55:00 +01:00
Ants Aasma
856a13e24c Remove error spinning on etcd failure and reduce log spam (#429)
When all etcd servers refuse connections during a watch, the call will fail with an exception and will be immediately retried. This creates a huge amount of log spam, potentially creating additional issues on top of losing the DCS. This patch takes note if etcd failures are repeating and, starting from the second failure, will sleep for a second before retrying. It additionally omits the stack trace after the first failure in a streak of failures.
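
The behavior described above can be sketched roughly as follows. This is not the code from the patch: `watch_with_backoff`, `watch_once` and the `attempts` limit are hypothetical names introduced only to show the idea of logging the stack trace once and pausing from the second consecutive failure onwards.

```python
import logging
import time

logger = logging.getLogger(__name__)


def watch_with_backoff(watch_once, attempts, sleep=time.sleep):
    """Call watch_once up to `attempts` times, pausing between repeated failures."""
    failures_in_a_row = 0
    for _ in range(attempts):
        try:
            return watch_once()
        except Exception:
            failures_in_a_row += 1
            if failures_in_a_row == 1:
                # First failure in a streak: log with the full stack trace
                logger.exception('watch failed')
            else:
                # Repeated failure: short message only, then wait a second
                logger.error('watch failed again (%d failures in a row)', failures_in_a_row)
                sleep(1)
    return None
```

Passing `sleep` as a parameter is a design choice of this sketch that makes the pause easy to observe or replace in tests.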
2017-04-20 12:40:15 +02:00
Alexander Kukushkin
1ed91a93c6 Handle EtcdEventIndexCleared and EtcdWatcherCleared exceptions (#387)
In this case it doesn't make sense to retry, because it brings nothing
but produces a lot of exceptions in the log...
2017-02-16 17:07:09 +01:00
Alexander Kukushkin
c6252bc004 Don't resolve url hostnames manually but monkey patch urllib3 (#385)
Replacing hostnames with IP addresses was causing certificate verification to
fail. Instead of doing that, we will monkey patch the urllib3
functionality which does name resolution. It should work without
problems even for https connections.
2017-01-18 13:46:02 +01:00
Alexander Kukushkin
711d53980f Calling self._load_machines_cache() method on timeout was causing a switch to
a new server every 5 minutes
2017-01-12 17:30:18 +01:00
Alexander Kukushkin
d138a8db17 AT for master_start_timeout + minor fixes (#361) 2016-12-09 12:02:41 +01:00
Ants Aasma
1290b30b84 Introduce starting state and master start timeout. (#295)
Previously pg_ctl waited for a timeout and then happily carried on, considering PostgreSQL to be running. This caused PostgreSQL to show up in listings as running when it actually was not, and caused a race condition that resulted in either a failover, a crash recovery, or a crash recovery interrupted by a failover and a missed rewind.

This change adds a master_start_timeout parameter and introduces a new state for the main run_cycle loop: starting. When master_start_timeout is zero we will fail over as soon as there is a failover candidate. Otherwise PostgreSQL will be started, but once master_start_timeout expires we will stop and release leader lock if failover is possible. Once failover succeeds or fails (no leader and no one to take the role) we continue with normal processing. While we are waiting for the master timeout we handle manual failover requests.

* Introduce timeout parameter to restart.

When restart timeout is set master becomes eligible for failover after that timeout expires regardless of master_start_time. Immediate restart calls will wait for this timeout to pass, even when node is a standby.
2016-12-08 14:44:27 +01:00
Alexander Kukushkin
ec78777778 Implement simple asynchronous dns-resolve cache (#360) 2016-12-07 13:16:26 +01:00
Alexander Kukushkin
b299b12f58 Various configuration parameters for etcd (#358)
* Add https and auth support for etcd

Also implement support of PATRONI_ETCD_URL and PATRONI_ETCD_SRV
environment variables

* Implement etcd.proxy, etcd.cacert, etcd.cert and etcd.key support

Now it should be possible to set up a fully encrypted connection to etcd
with authorization.
2016-12-06 16:40:21 +01:00
Alexander Kukushkin
038b5aed72 Improve leader watch functionality (#356)
Previously replicas were always watching the leader key (even if
postgres was not running there). It was not a big issue, but it
was not possible to interrupt such a watch when postgres
started up or stopped successfully. It was also delaying the
update_member call, so we had somewhat stale information in DCS for up
to `loop_wait` seconds. This commit changes that behavior. If the
async_executor is busy starting, stopping or restarting postgres, we
will not watch the leader key but will wait for an event from
async_executor for up to `loop_wait` seconds. The async executor will
fire such an event only if the function it was calling returned
something that evaluates to boolean True.

This functionality is needed to change the way we decide whether
pg_rewind is necessary. That decision will require a local postgres to
be running, and for us it is really important to get such a
notification as soon as possible.
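
A rough sketch of the event mechanism described above, using only the standard library. The class `AsyncTask` and its attribute names are hypothetical and much simpler than Patroni's async executor; the sketch only shows an event that is fired when the background function returns something truthy.

```python
import threading


class AsyncTask:
    """Run a function in a background thread and signal only truthy results."""

    def __init__(self):
        self.finished_ok = threading.Event()

    def run_async(self, func):
        def worker():
            # Fire the event only if the function returned something truthy
            if func():
                self.finished_ok.set()

        thread = threading.Thread(target=worker)
        thread.start()
        return thread
```

A main loop could then call `task.finished_ok.wait(loop_wait)` instead of watching the leader key, and would wake up as soon as the background start or stop succeeded.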
2016-11-22 16:22:30 +01:00
Ants Aasma
7e53a604d4 Add synchronous replication support. (#314)
Adds a new configuration variable synchronous_mode. When enabled Patroni will manage synchronous_standby_names to enable synchronous replication whenever there are healthy standbys available. With synchronous mode enabled Patroni will automatically fail over only to a standby that was synchronously replicating at the time of master failure. This effectively means zero lost user visible transactions.

To enforce the synchronous failover guarantee Patroni stores current synchronous replication state in the DCS, using strict ordering, first enable synchronous replication, then publish the information. Standby can use this to verify that it was indeed a synchronous standby before master failed and is allowed to fail over.

We can't enable multiple standbys as synchronous and let PostgreSQL pick one, because we can't know which one was actually set to be synchronous on the master when it failed. This means that on standby failure commits will be blocked on the master until the next run_cycle iteration. TODO: figure out a way to poke Patroni to run sooner or allow PostgreSQL to pick one without the possibility of lost transactions.

On graceful shutdown standbys will disable themselves by setting a nosync tag for themselves and waiting for the master to notice and pick another standby. This adds a new mechanism for Ha to publish dynamic tags to the DCS.

When the synchronous standby goes away or disconnects a new one is picked and Patroni switches master over to the new one. If no synchronous standby exists Patroni disables synchronous replication (synchronous_standby_names=''), but not synchronous_mode. In this case, only the node that was previously master is allowed to acquire the leader lock.

Added acceptance tests and documentation.

Implementation by @ants with extensive review by @CyberDem0n.
2016-10-19 16:12:51 +02:00
Alexander Kukushkin
19c80df442 Try to mitigate EtcdEventIndexCleared exception (#287)
This error is sent by etcd when Patroni is doing a "watch" on the leader
key, which is never updated after creation, while the etcd cluster
receives a lot of updates, which clears the history of events.

Instead of watching on modifiedIndex + 1 we will watch on X-Etcd-Index,
which is probably still available...
2016-09-02 13:44:47 +02:00
Alexander Kukushkin
413a84836b Update etcd topology only after original request succeeds (#254)
There is no point in trying to update the topology until the original
request has been performed. Also, for us it is more important to execute
the original request than to keep the topology of the etcd cluster in sync.

In addition to that, implement the same retry-timeout logic in the
`machines` property that is already used in the `api_execute` method.
2016-08-10 10:17:37 +02:00
Murat Kabilov
a47a2bceff Manage scheduled restarts using patronictl (#248)
2016-08-09 12:54:48 +02:00
Alexander Kukushkin
876cfdfb2d Fix retry logic in etcd.py
The Client class takes care of retrying when the connection to an etcd
node fails. It calculates the number of retries and the timeout
depending on the etcd cluster size.

The Etcd class should not retry when an EtcdConnectionFailed exception
is raised (this case is already handled in the Client).

Besides that, adjust retry timeouts in the Client class.
2016-06-29 15:30:54 +02:00
Alexander Kukushkin
c8b5003b86 Set __do_not_watch flag when ttl needs to be changed
It's more readable compared to `reset_cluster`
2016-06-01 13:41:49 +02:00
Alexander Kukushkin
b3ada161cf Implement possibility to configure retry_timeout globally
Previously it was hardcoded all over the place.
2016-05-31 10:30:53 +02:00
Alexander Kukushkin
45cbc8ca70 Implement acceptance test for dynamic configuration functionality
and fix some bugs revealed by acceptance tests
2016-05-26 10:16:24 +02:00
Alexander Kukushkin
7827951c8c Dynamic configuration 2016-05-25 14:17:05 +02:00
Alexander Kukushkin
0c2aad98a3 Move dcs implementations into dcs package 2016-05-19 10:57:18 +02:00
Alexander Kukushkin
1741fa7e0f Minimize number of references to dcs implementations from tests
where it is not necessary (test_ha, test_ctl, etc...)
It will simplify further refactoring and make it possible to install
implementations of AbstractDCS independently of each other.
2016-05-19 10:00:32 +02:00
Alexander Kukushkin
bcbc080350 urllib3.exceptions.HTTPError fixes for python 3.5.1
Somehow, when you import only urllib3, it's not possible to work with the
urllib3.exceptions.HTTPError exception (it looks like it is imported
from some other place). `from urllib3.exceptions import HTTPError` solves
the problem.
2016-04-24 14:18:34 +02:00
Alexander Kukushkin
3a7d2c3874 Remove unused code from unit tests 2016-03-21 20:48:17 +01:00
Alexander Kukushkin
0e0c8ed8d7 Implement delete_cluster interface for all available dcs
In addition to that, rename the confusing `Etcd.client` and
`ZooKeeper.client` to `_client`. This attribute is available from
AbstractDCS and people had the wrong impression that it provides the same
interface for different DCS implementations, which is obviously not the
case. For Etcd it has type etcd.Client and for ZooKeeper, KazooClient.
2016-03-15 16:25:48 +01:00
Alexander Kukushkin
01afd09ca2 Migrate to python-etcd 0.4.3
Although this release was very buggy, it has really nice features:
* EtcdWatchTimedOut exception is raised when `watch` call timed out
* it supports SRV autodiscovery

Since we already implemented our own SRV discovery this feature is not
really interesting for us, but it solves the problem of having two
requirements files for different python versions, because python-etcd
will install dnspython or dnspython3 as a dependency.

In order to fix https://github.com/jplana/python-etcd/issues/152 and
https://github.com/jplana/python-etcd/pull/154 I had to override
`api_execute` method.
2016-03-12 15:49:42 +01:00
Alexander Kukushkin
cb38e50ac1 Remove unused code 2016-02-26 08:50:53 +01:00
Alexander Kukushkin
f079a9f308 remove unused code 2016-02-17 12:34:22 +01:00
Alexander Kukushkin
a875e93f2e Merge branch 'master' of github.com:zalando/patroni into feature/scheduled_failover_squashed 2016-02-17 12:14:10 +01:00
Alexander Kukushkin
df9b8fed2e Improve quality of code by resolving issues found by quantifiedcode and codacy 2016-02-12 12:23:49 +01:00
Feike Steenbergen
1e2fdac891 Scheduled Failover tests
Add tests for the scheduled failover feature, also add more and better tests for patronictl.
2016-02-10 14:19:41 +01:00
Feike Steenbergen
bce96df177 Add attributes to Mocked classes 2016-01-29 13:29:51 +01:00
Oleksii Kliukin
c003af294a Merge pull request #82 from zalando/feature/patroni_cli_or_ctl_tbd
Feature/patroni cli or ctl tbd
2015-11-18 16:17:02 +01:00
Feike Steenbergen
ca4d9eaaf9 Patronictl: Expand tests to increase coverage 2015-11-18 11:51:24 +01:00
Oleksii Kliukin
fef7d45208 Handle unexpected exceptions in etcd.
Previously, patroni would die after receiving an exception
other than RetryFailedError or etcd.EtcdException from etcd.
We have observed an AttributeError raised by etcd on some
occasions. With this change, we demote ourselves but do not
terminate on such exceptions.
2015-11-17 16:08:58 +01:00
Oleksii Kliukin
84db64e0d5 Merge branch 'master' of https://github.com/zalando/patroni into feature/nofailover 2015-10-26 10:41:51 +01:00
Oleksii Kliukin
b7b47ffd79 Add support for the nofailover tag. 2015-10-23 10:11:38 +02:00
Alexander Kukushkin
2c7e3f60cc Make it possible to override the default namespace (/service/) from a config file
If the namespace is not specified in a config file, /service/ is
used.
It's also possible to use just '/' as a namespace. It means we would
have the following structure:
  /scope1
  /scope2
  ...
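
How the namespace and the scope combine into a key prefix can be sketched as follows. The function `cluster_base_path` is a hypothetical illustration, not the code from this commit; only the default `/service/` and the `'/'` case come from the message above.

```python
def cluster_base_path(scope, namespace='/service/'):
    """Join a namespace and a cluster scope into one absolute key prefix."""
    parts = [part for part in (namespace.strip('/'), scope.strip('/')) if part]
    return '/' + '/'.join(parts)


print(cluster_base_path('scope1'))       # default namespace
print(cluster_base_path('scope1', '/'))  # root namespace
```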
2015-10-21 15:34:55 +02:00
Alexander Kukushkin
c4a6dd48d3 remove debug print statement 2015-10-21 11:09:37 +02:00
Alexander Kukushkin
0096b6b06f Schedule update of machines cache when api_execute call has failed
Such a situation could happen if we replaced all etcd nodes except the
one that was used by patroni. After replacing the last node, patroni
will try to execute the request on all other nodes from machines_cache,
but none of them are available. The machines cache would become empty
and patroni would stick to the latest node that was available in the
machines_cache and would never try to refresh machines_cache, from DNS
for example.

Currently the machines cache is refreshed only when a request to the
etcd cluster has failed, but it should probably be done periodically,
for example every minute...
2015-10-21 10:56:43 +02:00
Alexander Kukushkin
d8f4b09478 use Event.wait instead of sleep
It makes it possible to interrupt the "sleep", for example from the API

plus small bugfix: catch ValueError exception from json.loads
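
The difference between `time.sleep` and `Event.wait` can be shown with a small standard-library sketch. The class and method names here are hypothetical; the point is only that a waiting thread can be woken up early by another thread, such as an API handler.

```python
import threading
import time


class InterruptibleSleeper:
    def __init__(self):
        self._event = threading.Event()

    def sleep(self, timeout):
        """Block for up to `timeout` seconds; return True if woken up early."""
        woken = self._event.wait(timeout)
        self._event.clear()
        return woken

    def wakeup(self):
        """Interrupt a sleep in progress (or the next one)."""
        self._event.set()


sleeper = InterruptibleSleeper()
# Another thread interrupts the sleep shortly after it starts
threading.Timer(0.05, sleeper.wakeup).start()
started = time.time()
woken_early = sleeper.sleep(5)
elapsed = time.time() - started
```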
2015-10-02 10:26:48 +02:00
Alexander Kukushkin
d09875a056 refactoring:
1. run touch_member from the main loop
2. move code which takes care of long-running tasks into a separate class
3. change format of data stored in a DCS: use json instead of url
4. change Member class: from now on it deserializes everything into the data property
5. rework API: from now on it takes into account the state of the current node in a dcs
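
As an illustration of points 3 and 4, a member value stored as JSON can be decoded into a single dictionary. The function name, the sample keys and the choice to return an empty dictionary for non-JSON values are assumptions made for this sketch, not details taken from the commit.

```python
import json


def parse_member_value(value):
    """Decode a member value stored as JSON; return {} if it is not a JSON object."""
    try:
        data = json.loads(value)
    except (TypeError, ValueError):
        # Not valid JSON, for example an old-style plain URL value
        return {}
    return data if isinstance(data, dict) else {}


# Hypothetical member payload
member_data = parse_member_value('{"state": "running", "role": "replica"}')
print(member_data.get('state'))
```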
2015-10-01 17:06:42 +02:00