Create DR Cluster for PostgreSQL

1. Introduction

There are three methods for creating a DR cluster, each with two options. A PostgreSQL configuration must be created before attempting to create a DR Cluster.

WAL (Write Ahead Logs) Streaming
- Hot Standby
- Warm Standby
WAL (Write Ahead Logs) file Archiving
- Hot Standby
- Warm Standby
WAL (Write Ahead Logs) file Shipping
- Hot Standby
- Warm Standby

For Hot Standby, the standby database is accessible by the application as a read-only cluster, and the databases under the cluster can be queried and used for reporting purposes. This can happen while the standby cluster databases are still getting synced by the primary cluster.

The Warm Standby Cluster databases are not accessible by the application but are still getting synced by the primary cluster.

The process of creating a configuration is same for all the three method of creating a DR cluster

To create a DR cluster, you can either click on setup now? or “Set Up Disaster Recovery”

2. Create DR Cluster using WAL Streaming Mode

2.1 Explanation

Below is the diagrammatic and explanation of steps that happen on the background when we run the creation of DR cluster using the streaming mode.

Trigger CSD in streaming mode

Create a Replication Slot on Primary

Run the following SQL query on the primary cluster to create a physical replication slot named SLOT_NAME

SELECT * FROM pg_create_physical_replication_slot(SLOT_NAME, false, false) WHERE NOT EXISTS (SELECT * FROM pg_replication_slots WHERE slot_name = SLOT_NAME)

Tar PostgreSQL Config Files on Primary Server:
- postgres.conf
- pg_ident.conf
- pg_hba.conf
Transfer Config Files to Standby Server
- Transfer the tarball containing PostgreSQL config files from the primary cluster to the standby cluster
Data Directory Setup Using pg_basebackup
- Initiate a streaming backup from the standby cluster to the primary cluster . This process retrieves all files from the primary cluster’s data directory and includes the previously established replication slot on step 2
```
/opt/rh/rh-postgresql13/root/usr/bin/pg_basebackup -D /var/lib/pgsql/data/13/main -F plain -X stream -S dbvisit_2 -R --progress --verbose --checkpoint=fast -d postgres://postgres@postgresqlcentos1:5432
```
Config Directory Setup
- Extract files from tarball received in step 4 and place them in the target configuration directory on the standby server
- Configure the PostgreSQL instance on the standby cluster for streaming replication based on the CSD parameter and the replication slot created earlier
If the operating system is Windows, register a Windows service for the PostgreSQL cluster
Restart Standby cluster
- After setting up the data directory and config directory, restart the standby cluster.
Applying Configuration Changes
- Applying the latest changes on configuration coming CSD and persisting configuration data on the Control Center

2.2 Streaming mode using Dbvisit

WAL Streaming replication allows real-time transfer of primary cluster database changes to standby. The databases are kept in sync. There are three processes wal receiver, wal sender and startup. The wal receiver process sends the LSN(Log sequence Number) up until the standby (slave) has replayed and send it to the primary. The wal sender process sends the WAL data until the latest LSN, starting from the LSN sent by the wal receiver and the same process writes the data to the WAL Segments. The startup process on the slave replays the data written to the WAL segment.

A few parameters need to be updated on the primary cluster to ensure that the streaming happens.

In Postgresql.conf file we need to ensure that the listen_address parameter is set to provide a list of IP addresses that it listens to. This is usually the standby server address. In the below example we have kept the listen_address to “*” so it listens to all the addresses.

We also have to change the pg_hba.conf file to provide connections to the standby server and also add replication connections from localhost to the remote host.

host    all             all             10.0.2.23/24 (Address of remote server)          trust

host    replication     all                 10.0.2.22/30   (Address of local server needed after Graceful Switchover)        trust  
host    replication     all                 10.0.2.23/30  (Address of remote server)         trust

Line 1 of the above code block is just for remote connection

lines 3 &4 are mainly for replication connections that do for all privileged users (usually postgres or some specific user). Line 3 is a local server address and is mainly used to ensure that the switchback happens without any issues.

3. Create DR Cluster using WAL Archive Mode

3.1 Explanation

Below is a detailed explanation of how the DR cluster is created in the background.

Trigger CSD in archive mode.
Create Base Backup Files
- Generate the essential base backup files using the following command (By default Postgres is installed under /usr/pgsql-<version> directory)
```
$PG_HOME/bin/pg_basebackup -D tempBackupDir -F tar -X stream -R --progress --verbose 
```
- The aforementioned command will generate the subsequent files to be transferred to the Standby to facilitate the cluster restoration process.
  - base.tar: PostgreSQL base files, including settings and etc
  - pg_wal.tar: wal data files for standby to restore
  - tblspc_xxxxx.tar: customized table space files
  - backup_manifest: backup metadata, is used for pg_verifybackup to verify the backup
Tar the following PostgreSQL configuration files on the primary server
- Postgresql.conf
- pg_ident.conf
- pg_hba.conf
Transfer Backup Files from Primary to Standby, which are generated on step 3 and step 2
Data directory setup
- On the Standby, receive and extract all files generated from Step 2 and Step 3. Place these files in the designated data folder, previously allocated, before initiating CSD.
Config directory setup
- Extract the files from the tarball received in Step 4 and place them into the target configuration directory on the standby server.
- Configure PostgreSQL in archive mode as per the Dbvisit DR cluster creation parameter
If the OS is Windows, register the Windows service for the cluster
After configuring the data and config directories, restart the standby PostgreSQL server
Apply Configuration Changes from CSD
- Applying the latest changes on configuration coming CSD and persisting configuration data on the Control Center

3.2 Steps for creating DR cluster using Archive Mode

To start the DR cluster creation using archive mode

The Primary Cluster will be restarted to enable the archive mode and also provide the location for archive and restore command along with archive timeout parameter. This is updated on postgresql.conf file.

The primary cluster will be restarted even if the archive mode is already enabled to update other parameters.

The creation of the DR cluster can be started either by clicking on Set up now? ( 1 ) or from the ACTIONS menu by clicking on “Set Up Disaster Recovery” ( 2 )

The top screen displays the usual configuration details

Configuration name of the PostgreSQL for which DR cluster will be created
The primary hostname and the status of the PostgreSQL cluster (ONLINE)
The standby hostname and also mentioning that no standby cluster exists
The locations of the standby cluster datafiles. This location is recommended to be the same as the primary, but this can be changed and provided a custom location. For Linux the permission of these folders must be like below.

-bash-4.2$ ls -lrt /u01
drwx------.  5 postgres postgres  4096 Aug 16 16:41 tablespace
drwx------. 20 postgres postgres  4096 Aug 26 12:06 PG5432-12

The location of the configuration files (postgresql.conf, pg_hba.conf, and pg_indent.conf). This location can be different from the primary data directory, but the example below has these two in the same directory. Keeping the same as the primary is recommended, but this can also be different in the standby.
The tablespace location of the primary is in a different location from the default data directory. The above code snippet also shows the permission of these folders.
The tablespace location of the standby. This folder can also be different from the primary but it is recommended to keep the directory structures same as primary.

After this choose the WAL file archiving method with a Hot or Warm standby option.

For this example, a Hot standby is created that can be accessible for reporting purposes as the databases in the cluster are in read only mode.
The WAL file archiving method is used to create the standby cluster.
The location on a primary where the base backup is taken and the location to which it's transferred in the standby. This location provided is a default location but it's
recommended to provide a different location to accommodate backups that are larger in size.
The custom backup location of primary and standby in which the WAL archive files are copied in the primary and transferred to standby.
The archive timeout in seconds. This corresponds to the archive_timeout parameter in postgresql.conf. After 60 sec, a checkpoint is forced on primary triggering a WAL archive file creation to sync with the standby. This would ensure that the standby is always 60 seconds behind the primary.
By, default the observer is enabled that will start monitoring the configuration as soon the DR cluster is created.
The primary cluster will be restarted irrespective of whether the archive_mode is off or on in the postgresql.conf file.
Click start to create the DR cluster.

After successful completion. The summary displayed will show the time it took to create the DR cluster ( 1 ) and also the step by steps for the DR cluster ( 2 ).

CSD for Log shipping Mode

Trigger CSD in archive mode

Create Replication Slot on Primary

Run the following SQL query on the primary cluster to create a physical replication slot named SLOT_NAME

SELECT * FROM pg_create_physical_replication_slot(SLOT_NAME, false, false) WHERE NOT EXISTS (SELECT * FROM pg_replication_slots WHERE slot_name = SLOT_NAME)

Create Base Backup Files
1. Generate the essential base backup files using the following command
```
/opt/rh/rh-postgresql13/root/usr/bin/pg_basebackup -D tempBackupDir -F tar -X stream -R --progress --verbose -S slot_name 
```
2. The aforementioned command will generate the subsequent files, which are to be transferred to the Standby in order to facilitate the cluster restoration process.
  1. base.tar: postgresql base files, including setting and etc
  2. pg_wal.tar: wal data files for standby to restore
  3. tblspc_xxxxx.tar: customized table space files
  4. backup_manifest: backup meta data, is used for pg_verifybackup to verify the backup
3. Note: Different archive mode base backups and log shipping backups also include the replication_slot within the base backup files and transfer it to the standby.
Tar the following PostgreSQL configuration files on the primary server
1. postgres.conf
2. pg_ident.conf
3. pg_hba.conf
Transfer Backup Files from Primary to Standby, which are generated on step 3 and step 2.b
Data directory setup
1. On the standby server, receive and extract all files generated from Step 2.b and Step 3. Place these files in the designated data folder, previously allocated before initiating CSD.
Config directory setup
1. Extract the files from the tarball received in Step 4 and place them into the target configuration directory on the standby server.
2. Configure PostgreSQL in archive mode as per the CSD parameter
If the OS is windows, register windows service for the cluster
After configuring the data and config directories, restart the standby PostgreSQL cluster
Apply Configuration Changes from CSD
1. Applying the latest changes on configuration coming CSD and persisting configuration data on Control Center