`ssl.wrap_socket` is deprecated and still allowed soon-to-be-deprecated protocols like TLS 1.1.
Now we use `ssl.create_default_context()` to produce a secure SSL context for wrapping the REST API server's socket.
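A minimal sketch of the new approach (certificate paths and the port are illustrative):

```python
import socket
import ssl

# unlike the old ssl.wrap_socket() helper, a default context comes with
# secure settings: SSLv2/SSLv3 disabled, sane cipher selection, etc.
ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
ctx.load_cert_chain(certfile='/etc/patroni/server.crt',
                    keyfile='/etc/patroni/server.key')

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(('', 8008))
sock.listen(5)

# wrap the listening socket; accepted connections are TLS-protected
secure_sock = ctx.wrap_socket(sock, server_side=True)
```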
Patroni works great behind a load balancer like HAProxy. API routes are handy for directing traffic to a valid node depending on its role: one route for writes directed to the primary node and one route for reads directed to the replicas. In a two-node setup, when we lose a node, there's only one host left: the primary. Read traffic would then be dropped, even though the primary can handle it. This commit adds read-only and read-write routes. The read-only route enables reads on the primary; the read-write route is an alias for the '/master', '/primary', or '/leader' route.
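For illustration, a hedged `haproxy.cfg` fragment that health-checks the new routes; hostnames, ports, and tuning values are placeholders:

```
listen postgres_write
    bind *:5000
    option httpchk GET /read-write
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008

listen postgres_read
    bind *:5001
    option httpchk GET /read-only
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server node1 node1:5432 maxconn 100 check port 8008
    server node2 node2:5432 maxconn 100 check port 8008
```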
As @ants complained on Slack, it breaks the workflow when the same location is used both for bootstrap and for further archiving.
For pg_upgrade we can achieve the same result by modifying the postgres config in the upgrade script, which is part of Spilo.
The recently released psycopg2 was split into two different packages, psycopg2 and psycopg2-binary, which could be installed at the same time into the same place on the filesystem. To reduce the dependency hell, we let the user choose how to install psycopg2. The available options are reflected in the documentation.
This PR also changes the following behavior:
* `pip install patroni` will fail if psycopg2 is not installed
* Patroni will check psycopg2 upon start and fail if it can't be found or is outdated (a sketch of such a check is shown below)
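A hedged sketch of what such a startup check could look like; the minimum version shown is an assumption, not necessarily the one Patroni enforces:

```python
import sys

MIN_PSYCOPG2 = (2, 5, 4)  # assumed minimum version, for illustration only

def check_psycopg2():
    try:
        import psycopg2
    except ImportError:
        sys.exit('FATAL: psycopg2 module is not installed')
    # psycopg2.__version__ looks like '2.7.7 (dt dec pq3 ext lo64)'
    version = tuple(int(x) for x in psycopg2.__version__.split(' ')[0].split('.'))
    if version < MIN_PSYCOPG2:
        sys.exit('FATAL: psycopg2 is outdated, at least %s is required'
                 % '.'.join(map(str, MIN_PSYCOPG2)))
```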
Closes https://github.com/zalando/patroni/issues/1021
1. Fix a race condition on shutdown. It is very annoying when you cancel behave tests but postgres remains running.
2. Dump `pg_controldata` output to the logs when "recovering" a stopped postgres, as sketched below. It will help to investigate some annoying issues.
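A minimal sketch of such a dump, assuming the data directory path is known:

```python
import logging
import subprocess

logger = logging.getLogger(__name__)

def log_pg_controldata(data_dir):
    # run pg_controldata against the stopped cluster and mirror its
    # output into our own log for later investigation
    try:
        output = subprocess.check_output(['pg_controldata', data_dir],
                                         universal_newlines=True)
        logger.info('pg_controldata:\n%s', output)
    except (OSError, subprocess.CalledProcessError) as e:
        logger.warning('Failed to run pg_controldata: %r', e)
```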
First of all, this patch changes the behavior of the `on_start`/`on_restart` callbacks: they will be called only when postgres is started or restarted without a role change. When the member is promoted or demoted, only the `on_role_change` callback will be executed.
Before that, `on_role_change` was never called for a standby leader; only `on_start`/`on_restart` were, and with a wrong role argument.
In addition to that, the REST API will return the `standby_leader` role for the leader of a standby cluster.
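For reference, the affected callbacks are configured under the `postgresql` section; the script path below is illustrative:

```yaml
postgresql:
  callbacks:
    on_start: /usr/local/bin/patroni_callback.sh
    on_restart: /usr/local/bin/patroni_callback.sh
    on_role_change: /usr/local/bin/patroni_callback.sh
```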
Closes https://github.com/zalando/patroni/issues/988
`dcs.cluster` and `dcs.get_cluster()` use the same lock, so when a `get_cluster` call is slow due to DCS slowness, it also blocks the `dcs.cluster` call, which in turn makes health-check requests slow.
In addition to that, transfer the postmaster pid to the Patroni process with the help of `multiprocessing.Pipe` instead of using stdin/stdout pipes.
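A minimal sketch of the pid handover via `multiprocessing.Pipe`; the child below only stands in for the code that actually starts postgres:

```python
import multiprocessing

def start_postmaster(conn):
    # the real code would start postgres here and report its pid;
    # we send a placeholder value back through the pipe instead
    conn.send(12345)
    conn.close()

if __name__ == '__main__':
    reader, writer = multiprocessing.Pipe(duplex=False)
    proc = multiprocessing.Process(target=start_postmaster, args=(writer,))
    proc.start()
    postmaster_pid = reader.recv()  # arrives over the pipe, not via stdout
    proc.join()
    print('postmaster pid:', postmaster_pid)
```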
Closes https://github.com/zalando/patroni/issues/992
`os.path.relpath` depends on being able to resolve the current working directory.
This fails if Patroni was started in a directory that is later unlinked from the filesystem, raising an unnecessary exception when loading from DCS.
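The failure mode is easy to reproduce:

```python
import os
import tempfile

d = tempfile.mkdtemp()
os.chdir(d)
os.rmdir(d)  # the current working directory no longer exists

# with no explicit start, relpath() calls os.getcwd(), which now raises
os.path.relpath('/etc')  # FileNotFoundError on Python 3
```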
If `etcd.use_proxies` is set to true, Patroni will stick to the list of hosts specified in `etcd.hosts` and avoid doing topology discovery. This mode might be useful when you know you connect to the etcd cluster via a set of proxies, or when the etcd cluster has a static topology.
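Example configuration (hostnames are placeholders):

```yaml
etcd:
  use_proxies: true
  hosts:
    - etcd-proxy1:2379
    - etcd-proxy2:2379
```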
1. Multi-stage build with an extensive cleanup of useless files and optional image compression
2. Start three-node etcd cluster
3. Start three-node Patroni cluster
4. One container with haproxy
5. All container names are prefixed with "demo-" and don't have suffixes
6. Decommission the dev_patroni_cluster.sh script; docker-compose is now the de facto standard.
7. Provide more examples in the docker/README.md
It might happen that the standby cluster is configured to be created from, and to replay WAL files from, a source different from the one used when it is not running in standby mode. This is necessary to avoid writing WAL files and backups into the old location after promotion.
The easiest way to achieve this behavior is passing a `RemoteMember` object to the `Postgresql.clone` method instead of the usual `Member` object.
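A hedged sketch, assuming `RemoteMember` is built from a member name and a data dict (constructor details may differ; `postgresql` stands for Patroni's `Postgresql` instance):

```python
from patroni.dcs import RemoteMember

source = RemoteMember('clone_source', {
    'conn_url': 'postgres://replicator@archive-host:5432/postgres',
    # replay WAL from the bootstrap location, not the running cluster's one
    'restore_command': 'cp /wal_archive/%f %p',
})
postgresql.clone(source)  # instead of the usual Member
```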
If there is no service defined, k8s assumes that the endpoint is orphaned and removes it.
Patroni tries to create the service (only when `use_endpoints` is enabled) in the following cases:
1. Upon start
2. When it tries to (re-)create the config endpoint
If for some reason creation of the service failed, Patroni will retry it on every cycle of the HA loop. It usually fails due to a lack of permissions; if you don't want to grant such permissions to the service account used by Patroni, you can create the service explicitly in the deployment manifest, as in the example below.
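A hedged example of such a manifest; note the absence of a selector, so k8s leaves the Patroni-managed endpoint alone (names and labels are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: patroni-demo
  labels:
    application: patroni
    cluster-name: patroni-demo
spec:
  # no selector: Patroni manages the endpoint subsets itself
  ports:
  - port: 5432
    targetPort: 5432
```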
According to the Consul documentation, the actual response timeout is increased by a small random amount of additional wait time added to the supplied maximum wait time, to spread out the wake-up times of any concurrent requests. It adds up to wait/16 of additional time to the maximum duration.
In our case we will add wait/15 or 1 second, whichever is bigger.
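In code, the effective client-side timeout could be derived like this (a sketch, with `wait` in seconds):

```python
def effective_timeout(wait):
    # Consul adds up to wait/16 of jitter on top of the supplied maximum
    # wait time, so pad our timeout by wait/15 (but at least 1 second)
    return wait + max(wait / 15.0, 1)

assert effective_timeout(30) == 32.0  # 30s wait -> 2s of padding
```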
Fixes: https://github.com/zalando/patroni/issues/945
This information will help to detect stale replicas.
In addition to that, Host will include ':{port}' if the port value isn't the default or if more than one member is running on the same host.
Fixes: https://github.com/zalando/patroni/issues/942
If pg_rewind is disabled or can't be used, the former master could fail to start as a new replica due to diverged timelines. In this case, the only way to fix it is to wipe the data directory and reinitialize.
So far Patroni was able to remove the data directory only after a failed attempt to run pg_rewind. This commit fixes that.
If `postgresql.remove_data_directory_on_diverged_timelines` is set, Patroni will wipe the data directory and reinitialize the former master automatically.
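The corresponding setting:

```yaml
postgresql:
  remove_data_directory_on_diverged_timelines: true
```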
Fixes: https://github.com/zalando/patroni/issues/941
The latest timeline is calculated from the `/history` key in DCS. If there is no such key, or it contains garbage, we consider the node healthy.
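A hedged sketch of that calculation, assuming `/history` stores a JSON array of `[timeline, lsn, reason, ...]` rows:

```python
import json

def latest_timeline(history_value):
    # return the latest timeline, or None when the key is missing or
    # contains garbage -- in which case the node is considered healthy
    try:
        history = json.loads(history_value)
        return int(history[-1][0]) + 1  # last switchover row + 1
    except (TypeError, ValueError, IndexError):
        return None
```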
Closes https://github.com/zalando/patroni/issues/890
libpq allows opening a connection without explicitly specifying either a username or a password. Depending on the situation, it relies either on the `pgpass` file or on the `trust` authentication method in `pg_hba.conf`.
Since pg_rewind also uses libpq, it can work the same way.
Fixes https://github.com/zalando/patroni/issues/928
It allows changing logging settings at runtime by updating the config and doing a reload, or by sending `SIGHUP` to the Patroni process.
Important! Environment variable names related to logging were renamed and the documentation updated accordingly. For compatibility reasons Patroni still accepts `PATRONI_LOGLEVEL` and `PATRONI_FORMAT`, but some other logging-related variables that were introduced only recently (between releases) will stop working. This should be fine, since the new version hasn't been released yet and it is very unlikely that anybody besides the authors of the corresponding PRs is using them.
Example of log section in the config file:
```yaml
log:
  dir: /where/to/write/patroni/logs  # if not specified, write logs to stderr
  file_size: 50000000  # 50MB
  file_num: 10  # keep a history of 10 files
  dateformat: '%Y-%m-%d %H:%M:%S'
  loggers:  # increase log verbosity for etcd.client and urllib3
    etcd.client: DEBUG
    urllib3: DEBUG
```
1. Log only debug-level messages on any kind of error
2. Update the regexp for matching postgres aux processes to make it compatible with postgres 11
Fixes https://github.com/zalando/patroni/issues/914
We want to avoid archiving WAL and history files until the cluster is fully functional. It should really help when a custom bootstrap involves pg_upgrade.