
PostgreSQL? Yes, this is going to be the first PostgreSQL post that I publish. I'm going to write about my experience with PostgreSQL replication, possible error scenarios, workarounds, and best practices. For this post I'm going to stick to PostgreSQL 9.6 on CentOS 7, but I'm sure the procedure is more or less the same in other Linux environments.

Introduction

PostgreSQL offers several solutions for maintaining replicas, and there are open source and commercial third-party tools available as well. Needless to say, choosing the best-fit solution in the early stages is vital and can save a lot of time.

Streaming replication, which I'm going to elaborate on here, depends on transaction log files. Slave/secondary servers consume a stream of Write Ahead Log (WAL) records in order to keep up to date with the master/primary server. This is a single-master replication mechanism. Even though streaming replication is asynchronous by default, we are going to focus on synchronous mode here.

Environment

At this point, we have the following setup. For the sake of simplicity, I have assigned the IP addresses of the master node and the slave node to x.x.x.master and x.x.x.slave respectively.

  • Master Node: x.x.x.master
    • CentOS 7
    • PostgreSQL 9.6 installed & configured
  • Slave Node: x.x.x.slave
    • CentOS 7
    • PostgreSQL 9.6 installed & configured

Make sure the servers can communicate with each other.
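As a quick sanity check, something like the following can be run from the slave node (a minimal sketch; x.x.x.master is the placeholder used above, and the port probe assumes bash):

# check that the master node is reachable (assumes ICMP is allowed)
ping -c 3 x.x.x.master

# once PostgreSQL is listening on the master, probe port 5432 via bash's /dev/tcp
timeout 3 bash -c '</dev/tcp/x.x.x.master/5432' && echo "port 5432 reachable"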

Let's begin by implementing a basic synchronous streaming replication setup.

Replication Settings on Master Node

  • Create a user named replicator

This is the user that we are going to use to initiate streaming replication.

postgres=# CREATE USER replicator REPLICATION PASSWORD 'pppp' LOGIN;
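To verify that the user was created with the replication attribute, a standard catalog query can be used:

postgres=# SELECT rolname, rolreplication FROM pg_roles WHERE rolname = 'replicator';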

  • Add an entry to the pg_hba.conf

The pg_hba.conf file normally resides in the /var/lib/pgsql/9.6/data/ directory. By adding the following entry, we allow access from the slave node to the master node.

host    replication     replicator      x.x.x.slave/32         md5
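Changes to pg_hba.conf take effect on a configuration reload; a full restart, which we do below anyway after editing postgresql.conf, also works:

postgres=# SELECT pg_reload_conf();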

  • Configure postgresql.conf to start the WAL sender process
    • The postgresql.conf file normally resides in the /var/lib/pgsql/9.6/data directory.
    • wal_level defines how much information is written to the WAL. For streaming replication we need replica.
    • We set synchronous_commit to on, in order to confirm that all the changes made by a transaction have been transferred to the slave node before the client gets the acknowledgement. (The default value is also on.)
    • max_wal_senders must be set higher than 0 to enable replication. It is advisable to set this to a slightly higher value than the number of slave nodes.
    • wal_keep_segments defines the number of WAL segments retained at a given time in the pg_xlog directory. We will discuss this value later.
    • Finally, by setting synchronous_standby_names to a non-empty value, we enable the synchronous mode of this replication. It can hold multiple names, but here we use only one standby node.
    • Also make sure listen_addresses is set so that the master accepts connections from the slave node; by default PostgreSQL listens only on localhost.

## WRITE AHEAD LOG

wal_level                 = replica
synchronous_commit        = on

## REPLICATION

max_wal_senders           = 3
wal_keep_segments         = 8
synchronous_standby_names = 'slave1'

Restart the postgresql service once done.
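On CentOS 7 with the PGDG packages, the service unit is typically named postgresql-9.6 (adjust if your installation differs):

sudo systemctl restart postgresql-9.6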

Replication Settings on Slave Node

Let's make the changes on the slave node. Before that, we need to stop the postgresql service.

  • Make a base backup in slave node using master db cluster

Before receiving WAL files, it is necessary to have a base backup of the master node. Even though there are many ways to achieve this, the preferred candidate is the pg_basebackup tool. Make sure to empty the data directory of the slave node before executing the following command. (Back up any config files inside the data directory if you have already configured any settings; I will assume this is a fresh installation of PostgreSQL.)
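For example, on a fresh installation the slave's old cluster could be cleared like this (destructive; double-check the path first):

# the postgresql service must already be stopped; this removes the old cluster files
sudo rm -rf /var/lib/pgsql/9.6/data/*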

pg_basebackup -h x.x.x.master -D /var/lib/pgsql/9.6/data/ -P -U replicator --xlog-method=stream

Provide the password for the user replicator and we are good to go.

After successfully taking the base backup, make sure the postgres user has sufficient permissions to the data directory and its subdirectories.
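A typical way to do that, assuming the default PGDG paths (PostgreSQL refuses to start if the data directory is group- or world-accessible):

sudo chown -R postgres:postgres /var/lib/pgsql/9.6/data
sudo chmod 700 /var/lib/pgsql/9.6/data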

  • Configure postgresql.conf file to set up the slave node

By setting hot_standby to on, we allow queries to run during recovery (a slave node is always in recovery mode unless in a failover scenario).

hot_standby = on

  • Create recovery.conf file in the data directory
    • This file can be created as a new file in the data directory, or you can copy the sample file located in /usr/pgsql-9.6/share/ to the data directory and make the following changes.
    • standby_mode must be set to on in order to start this node as a slave node.
    • primary_conninfo has the necessary parameters to connect to the master in order to receive WAL files. (application_name is set to the value of synchronous_standby_names defined on the master node.)

standby_mode     = on

primary_conninfo = 'host=x.x.x.master port=5432 user=replicator password=pppp application_name=slave1'

Start the postgresql service once done.
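Once the slave is up, a quick way to verify that it is running in recovery mode is the following query, which should return t on a correctly configured standby:

postgres=# SELECT pg_is_in_recovery();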

Test it!

By running the following command on the master node, we can get information about all the connected standby nodes.

select * from pg_stat_replication;

Note: Large differences between pg_current_xlog_location and the sent_location field might indicate that the master server is under heavy load, while differences between sent_location and pg_last_xlog_receive_location on the standby might indicate network delay, or that the standby is under heavy load.
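A minimal sketch of such a check, run on the master node; pg_xlog_location_diff() expresses the gap in bytes (the column names are those of the 9.6 pg_stat_replication view):

-- bytes of WAL generated on the master but not yet sent to each standby
select application_name, state, sync_state,
       pg_xlog_location_diff(pg_current_xlog_location(), sent_location) as send_lag_bytes
from pg_stat_replication;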

Additional Notes

  • PostgreSQL streaming replication does not support multiple master servers.
  • The slave node is read-only in this setup, so you cannot execute data-modifying transactions on it (see the example after this list).
  • If the server returns an error when starting up, a good place to start troubleshooting is the postgresql log files, located in pg_log inside the data directory.
  • Configuration files are always messy. Keeping them clean can be useful not only to enhance readability but also to save some precious time. I found a nice article about keeping them clean.
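For example, attempting a write on the slave node fails with a read-only error (the table name here is just for illustration):

postgres=# CREATE TABLE replication_test (id int);
ERROR:  cannot execute CREATE TABLE in a read-only transaction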

This is a basic setup of synchronous replication. For the sake of simplicity, topics like WAL archiving and failover mechanisms have been omitted from this post.