Add a field to the API to tell whether a master exists from Patroni's point of view.
This can be useful when you have an alert based on Auto Scaling Groups: the ASG decides to shut down the current master and spin up a new instance, but the shutdown of the current master gets stuck. In this situation the current master is no longer part of the ASG, yet Patroni and Postgres are still alive on the instance, which means no replica will be promoted yet. This leads to a false alert saying that your cluster doesn't have any master node.
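As a rough illustration, an alerting check could consume such a field like this. The field name `cluster_unlocked`, the endpoint and the port are assumptions for the sketch, not something this change guarantees:

```python
import requests

# Hedged sketch: ask Patroni whether any member currently holds the leader key.
# 'cluster_unlocked' is an assumed field name; db-node:8008 is an assumed address.
status = requests.get('http://db-node:8008/', timeout=2).json()
if status.get('cluster_unlocked'):
    print('ALERT: no member of the cluster holds the leader key')
else:
    print('Patroni still sees a master; the ASG-based alert would be a false positive')
```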
This adds INFO log messages that clearly state whether configuration values were seen as changed by Patroni after a SIGHUP/reload and warrant reloading (or whether nothing was changed and no reloading is necessary).
This ended up being a lot simpler than I had imagined once I found postgresql.py:reload_config().
I added a log line in config.py:reload_local_configuration(), since that function short-circuits the process early if the local config hasn't changed. The final determination of whether values have changed and need reloading happens in postgresql.py:reload_config().
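A rough sketch of the intent, not the actual Patroni code (names and structure are simplified for illustration):

```python
import logging

logger = logging.getLogger(__name__)

def reload_local_configuration(old_config, load_config_file):
    """Return the new local configuration if it changed, otherwise None (sketch only)."""
    new_config = load_config_file()
    if new_config == old_config:
        # Short-circuit: nothing changed locally, so nothing needs reloading.
        logger.info('No local configuration items changed, nothing to reload')
        return None
    logger.info('Local configuration changed, reloading')
    return new_config
```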
If Patroni gets partitioned, it starts receiving stale information from the DCS.
We can't use this information to determine that we still hold the leader key.
Instead, we record in the Ha object the actual result of acquiring/updating the leader lock and report ourselves as the leader only if that operation was successful.
P.S. Despite responding with 200 on `GET /master`, Postgres was still running read-only.
In really rare cases it was causing the following behavior:
```
2018-07-31 10:35:30,302 INFO: starting as a secondary
2018-07-31 10:35:30,309 INFO: Lock owner: postgresql0; I am postgresql1
2018-07-31 10:35:30,310 INFO: Demoting master during restarting after failure
2018-07-31 10:35:30,381 INFO: postmaster pid=17709
2018-07-31 10:35:30,386 INFO: lost leader lock during restarting after failure
2018-07-31 10:35:30,388 ERROR: Exception during CHECKPOINT
```
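A hedged sketch of the approach (names are simplified; only the idea of recording the acquire/update result in the Ha object comes from the change itself):

```python
class Ha:
    """Hedged sketch; the real class lives in ha.py and is far more involved."""

    def __init__(self, dcs):
        self.dcs = dcs
        self._is_leader = False  # outcome of our last write to the leader key

    def update_lock(self):
        # Record whether the write against the DCS actually succeeded.
        self._is_leader = self.dcs.update_leader()
        return self._is_leader

    def has_lock(self):
        # The cluster view read from a partitioned DCS may be stale;
        # claim leadership only if our own lock operation succeeded.
        return self._is_leader
```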
* `async` is a keyword in Python 3.7:
```
Setting up patroni (1.4.4-1) ...
  File "/usr/lib/python3/dist-packages/patroni/ha.py", line 610
    'offline': dict(stop='fast', checkpoint=False, release=False, offline=True, async=False),
                                                                                ^
SyntaxError: invalid syntax
```
Fix #750 by replacing the dict member "async" with "async_req".
* requirements.txt: update to a new kubernetes package version compatible with Python 3.7
'patronictl remove' deletes the cluster configuration (stored either in configmaps or endpoints) and cannot be run from the postgres pod without 'delete' on those objects being granted to the pod service account.
Add an EnvironmentFile directive to read in a configuration file with environment variables. The "-" prefix means the unit can still start if the file doesn't exist.
This allows users to keep sensitive information like the SUPERUSER/REPLICATION passwords in that environment file, separate from a YAML config file that might be deployed from source control.
Patroni relies on params to determine the timeout and the number of retries when executing API requests to Consul. Starting from v1.1.0, python-consul changed its internal API and started
using a `list` instead of a `dict` to pass query parameters. This change broke the "watch" functionality.
Fixes https://github.com/zalando/patroni/issues/742 and
https://github.com/zalando/patroni/issues/734
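A hypothetical compatibility helper to illustrate the breakage (the helper is not the actual patch, just a sketch of the dict-vs-list difference):

```python
def find_param(params, name):
    """Look up a query parameter whether python-consul passed a dict
    (< 1.1.0) or a list of (name, value) tuples (>= 1.1.0)."""
    if isinstance(params, dict):
        return params.get(name)
    for key, value in params or []:
        if key == name:
            return value
    return None

# For a blocking "watch" request python-consul >= 1.1.0 builds something like:
params = [('index', '1234'), ('wait', '30s')]
wait = find_param(params, 'wait')  # '30s' -> used to size the HTTP timeout and retries
```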
Currently the informational message logged is beyond confusing. This
improves the logging so there is some indication of what the message is
about and that it is somewhat normal. Changes by @ants
Fix the discrepancy in the values of max_wal_senders and max_replication_slots between the sample postgres.yml files and the hard-coded defaults in Patroni, bumping the former to 10.
Contributed by @dtseiler
It is possible to change many parameters at runtime (including `restapi.listen`) by updating the Patroni config file and sending SIGHUP to the Patroni process.
If something was misconfigured, a weird exception was thrown and the `restapi` thread broke.
This PR makes the error message friendlier and avoids breaking the `restapi` thread.
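A hedged sketch of the guard (the function and the `rebind` helper are purely illustrative, not the actual REST API server code):

```python
import logging

logger = logging.getLogger(__name__)

def apply_new_listen_address(server, new_listen):
    """Try to rebind the REST API to a new 'host:port'; keep the old socket on failure."""
    try:
        host, port = new_listen.rsplit(':', 1)
        server.rebind(host, int(port))  # hypothetical helper on the server object
    except Exception as exc:
        # Don't let a bad restapi.listen value kill the restapi thread; explain
        # what went wrong and keep serving on the previous address instead.
        logger.error('Failed to apply restapi.listen=%r: %r; keeping the old address',
                     new_listen, exc)
```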
* Take and apply some parameters from controldata when starting as a replica
https://www.postgresql.org/docs/10/static/hot-standby.html#HOT-STANDBY-ADMIN
There is a set of parameters whose values on the replica must not be smaller than on the primary, otherwise the replica will refuse to start:
* max_connections
* max_prepared_transactions
* max_locks_per_transaction
* max_worker_processes
It might happen that the values of these parameters in the global configuration are not set high enough, which makes it impossible to start a replica without human intervention. This usually happens when we bootstrap a new cluster from a basebackup.
As a solution, we take the values of the above parameters from the pg_controldata output and, if the values in the global configuration are not high enough, apply the values from pg_controldata and set the `pending_restart` flag.
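A minimal sketch of the idea, assuming the pg_controldata output has already been parsed into a dict (the field names below are the standard pg_controldata labels, but the helper itself is illustrative, not Patroni's actual code):

```python
# Map PostgreSQL GUC names to the corresponding pg_controldata output fields.
CONTROLDATA_FIELDS = {
    'max_connections': 'max_connections setting',
    'max_prepared_transactions': 'max_prepared_xacts setting',
    'max_locks_per_transaction': 'max_locks_per_xact setting',
    'max_worker_processes': 'max_worker_processes setting',
}

def adjust_replica_parameters(parameters, controldata):
    """Bump replica parameters to at least the primary's values; return whether a restart is pending."""
    pending_restart = False
    for guc, field in CONTROLDATA_FIELDS.items():
        required = int(controldata.get(field, 0))
        if required > int(parameters.get(guc, 0)):
            parameters[guc] = required  # value recorded by the primary wins
            pending_restart = True
    return pending_restart
```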
Do not exit when the cluster system ID is empty or doesn't pass the validation check. In that case the cluster most likely needs a reinit; mention that in the result message.
Avoid terminating Patroni, because otherwise the reinit cannot happen.
We already have a lot of logic in place to prevent failover in such a case and to restore all keys, but an accidental removal of the `/config` key was effectively switching off pause mode for one cycle of the HA loop.
Upon start, the postmaster process performs various safety checks if there is a postmaster.pid file in the data directory. Even though Patroni has already detected that the running process referenced by postmaster.pid is not a postmaster, the new postmaster might fail to start because it thinks that postmaster.pid is already locked.
Important!!! Unlinking postmaster.pid isn't an option in this case, because it comes with a lot of nasty race conditions.
Luckily there is a workaround: we can pass the pid from postmaster.pid in the `PG_GRANDPARENT_PID` environment variable and the postmaster will ignore it.
You are more likely to hit this problem if you run Patroni and Postgres in a Docker container.
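A hedged sketch of the workaround (the function is illustrative; only the PG_GRANDPARENT_PID trick comes from the change itself):

```python
import os

def env_for_new_postmaster(data_dir):
    """Build the environment for starting a new postmaster while a stale postmaster.pid exists."""
    env = os.environ.copy()
    try:
        with open(os.path.join(data_dir, 'postmaster.pid')) as f:
            stale_pid = f.readline().strip()
        # The postmaster treats the pid named here as part of its own ancestry
        # (normally pg_ctl sets this), so it won't refuse to start just because
        # that pid still exists and is named in the old lock file.
        env['PG_GRANDPARENT_PID'] = stale_pid
    except (IOError, OSError):
        pass  # no pid file or unreadable: nothing to work around
    return env
```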
On Kubernetes 1.10.0 I experienced an issue where calls to `patch_or_create` were failing when bootstrapping a cluster. The call was failing because `self._leader_observed_subsets` was `None` instead of `[]`.
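A hedged guess at the shape of the fix (the surrounding class and method are illustrative; only the attribute name comes from the description above):

```python
class Kubernetes:
    """Illustrative container for the attribute mentioned above."""

    def __init__(self):
        self._leader_observed_subsets = []

    def observe_leader_endpoint(self, endpoint):
        # endpoint.subsets can legitimately be None while bootstrapping;
        # patch_or_create must still be given a list.
        self._leader_observed_subsets = endpoint.subsets or []
```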
It didn't directly affect either failover or switchover, but in some rare cases success was reported too early, when the former leader released the lock: `Failed over to "None" instead of "desired-node"`.
In addition, this commit improves logs and status messages by differentiating between failover and switchover.
Patroni can attach itself to an already running PostgreSQL instance. If that is the first instance "seen" in the given cluster, Patroni for that instance will create the initialize key, grab the leader key and, if the instance is running as a replica, promote it.
Because of this behavior, when a cluster with a master and one or more replicas gets Patroni added to each node, it is imperative to start running Patroni on the master node before getting to the replicas.
This commit changes this weird behavior: Patroni will now abort on start if there is no initialize key in DCS and Postgres is running as a replica.
Closes https://github.com/zalando/patroni/issues/655
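A hedged sketch of the new check (names are illustrative, not the actual Patroni code):

```python
import logging
import sys

logger = logging.getLogger(__name__)

def ensure_sane_start(cluster, postgresql):
    """Abort if we would attach to a running replica of a cluster that was never initialized in DCS."""
    if cluster.initialize is None and postgresql.is_running() and not postgresql.is_leader():
        logger.error('Running as a replica, but there is no initialize key in DCS; '
                     'start Patroni on the master node first')
        sys.exit(1)
```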
Because they are indeed case-insensitive.
Most of the parameters have snake_case names, but there are three exceptions to this rule: DateStyle, IntervalStyle and TimeZone.
In fact, if you specify timezone = 'some/tzn' it still works, but Patroni wasn't able to find 'timezone' in pg_settings and was stripping this parameter out.
We will use a CaseInsensitiveDict to keep postgresql.parameters. This change affects only the "final" configuration. That means that if you put "duplicates" (work_mem vs WORK_MEM) into the Patroni YAML or into the cluster config, they are resolved only at the last stage; for example, you will still see both values if you use `patronictl edit-config`.
Fixes https://github.com/zalando/patroni/issues/649
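A small illustration of the lookup semantics using requests' CaseInsensitiveDict (whether Patroni reuses that class or ships its own equivalent is not specified here; the example only shows the behavior):

```python
from requests.structures import CaseInsensitiveDict

parameters = CaseInsensitiveDict({'TimeZone': 'Europe/Berlin', 'work_mem': '16MB'})

# Lookups ignore case, so a value spelled differently from pg_settings.name
# is no longer stripped out of the final configuration.
print(parameters['timezone'])   # Europe/Berlin
print(parameters['WORK_MEM'])   # 16MB
```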