Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Table of Contents

1. Introduction

The graceful switchover is a zero data loss process for role reversal between the primary and standby clusters. At the end of this process, the standby and primary clusters change roles, and the standby cluster becomes the new primary, and the primary cluster becomes the new standby. Even though this process involves a little downtime, but is seamless and does not require any manual intervention.

Note

Dbvisit agentmanager or controlcenter does not manage automatic startup of PostgreSQL cluster after server reboots. The services must be managed either by system startup scripts or can be done manually from the control center or directly from the server itself.

2. Technical Explanation

  1. Database Status

    1. Primary server The primary cluster should be “ONLINE“ and the standby server cluster should be “HOT STANDBY“/”WARM STANDBY” STANDBY.” 

Drawio
mVer2
zoom1
simple0
inComment0
custContentId

...

3680108964
pageId3673063630
lbox1
diagramDisplayName

...

Untitled Diagram-1693262547481.drawio
contentVer

...

3
revision

...

3
baseUrlhttps://dbvisit.atlassian.net/wiki
diagramName

...

Untitled Diagram-1693262547481.drawio
pCenter0
width

...

963.

...

1764705882354
links
tbstyle
height

...

2305.

...

  1. Trigger Graceful Switchover

  2. Disable Control Center Behaviors

    • Disable Observer: Turn off the observer node to ensure a smooth switchover without conflicting activities

    • Disable Auto Update: Suspend any automatic update processes during the switchover procedure

  3. Prepare for Sync Between Primary and Standby 1

    • If the backup mode is set to "Archive," generate a new WAL file as a reference point for synchronization checks

      Code Block
      SELECT * from pg_switch_wal()
    • Otherwise skip

  4. Stop Original Primary Cluster

  5. Prepare for Sync Between Primary and Standby 2

    • If the backup mode is "Archive," skip this step as it's already done in step 3

    • If the backup mode is "Log Shipping," transmit all unapplied WAL files to the standby server via dbvnet

      Code Block
      start_lsn:
      can be retrieved from backupInfo
      
      end_lsn:
      pg_controldata -D dataDir
      
      Any wal file with LSN: LSN > start_lsn && LSN <= end_lsn 
    • If the backup mode is "Streaming," there's no need to sync as the WAL sync is accomplished before the primary cluster shutdown

  6. Update Original Primary PostgreSQL Configuration

    • Adjust the original primary PostgreSQL configuration based on the backup mode. Convert its role to become the new standby database.

  7. Sync Between Primary and Standby for Switchover 1

    • If the backup mode is "Archive," wait for the WAL file generated in step 3 to be applied in the original standby database.

    • Otherwise skip

  8. Sync Between Primary and Standby for Switchover 2

    • If the backup mode is "Log Shipping," apply the WAL file transferred in step 5 to the original standby database.

    • Otherwise skip

  9. Stop Original Standby Cluster

  10. Update Original Standby PostgreSQL Configuration

    • Modify the original standby PostgreSQL configuration according to the backup mode, converting its role to the new primary database

  11. Switch Role and Persist New Configuration Data

  12. Start New Primary Database

  13. Refresh Replication Slot on Primary (if Streaming Backup)

Code Block
SELECT * FROM pg_drop_replication_slot($1) WHERE EXISTS (SELECT * FROM pg_replication_slots WHERE slot_name = SLOT_NAME)
SELECT * FROM pg_create_physical_replication_slot(SLOT_NAME, false, false)

...

Start New Standby Database

...

Prepare for Sync validation Between Primary and Standby

  • If the backup mode is "Archive," create a new WAL file as a reference for synchronization checks.

  • If the backup mode is "Log Shipping," generate a new WAL file and transfer it to the new standby server.

  • If the backup mode is "Streaming," verify that the current LSN on the new primary is applied in the new standby database.

...

Sync Between Primary and Standby for Validation 1

  • If the backup mode is "Archive," wait for the WAL file generated in step 15 to be applied in the new standby database.

  • Otherwise skip

...

Sync Between Primary and Standby for Validation 2

  • If the backup mode is "Log Shipping," apply the WAL file transferred in step 15 to the new standby database.

  • Otherwise skip

...

Enable Control Center Behaviors

  • Enable Observer

  • Enable Auto Update: Restore the automatic update processes

Execute User Script on New Primary and Standby Servers

...

4999999999995

3. Dbvisit Switchover

3.1 Switchover with WAL Streaming Mode

The first and foremost step is to ensure that the status of the configuration is In Sync before proceeding with the switchover. It is also important that the application/users are not making any changes to the production cluster.

...

  1. The streaming mode configuration is In Sync.

  2. The message indicates that the configuration is ready for switchover.

  3. User scripts that can be executed post the switchover process on new primary and standby.

  4. Click Start to start the switchover process.

...

  1. The time taken for the switchover process to complete.

  2. The steps for the switchover process.

...

  1. The configuration is In Sync with WAL Streaming method on the reverse direction.

  2. The observer is enabled once the switchover is completed and is now monitored.

  3. The new primary server and Cluster status show ONLINE on port 5435.

  4. The Streaming method is used by the cluster to sync between standby and primary.

  5. The new standby server and cluster status showing HOT STANDBY on port 5435

...

3.2 Switchover with the WAL Archive method.

The first and foremost step is to ensure that the status of the configuration is In Sync before proceeding with the switchover. It is also important that the application/users are not making any changes to the production cluster.

  1. The Configuration name and the status of the configuration showing In Sync.

  2. The method used for the sync process. The archive method is used here.

  3. Click on the Graceful Switchover to start the process.

...

  1. The archiving mode configuration is In Sync.

  2. The message indicates that the configuration is ready for switchover.

  3. User scripts that can be executed post the switchover process on new primary and standby.

  4. Click Start to start the switchover process.

...

Image Added

  1. The configuration is In Sync with the WAL Archiving method in the reverse direction.

  2. The observer is enabled once the switchover is completed and is now monitored.

  3. The new primary server and Cluster status show ONLINE on port 5432.

  4. The Streaming method is used by the cluster to sync between standby and primary.

  5. The new standby server and cluster status showing HOT STANDBY on port 5432

...

3.3 Switchover with the WAL Shipping method.

  1. The Configuration name and the status of the configuration show time stamp difference between standby and primary.

  2. The method used for the sync process. The shipping method is used here.

  3. Click on the Graceful Switchover to start the process.

...

  1. The shipping mode configuration and the time difference between standby and primary

  2. The message indicates that the configuration is ready for switchover.

  3. User scripts that can be executed post the switchover process on new primary and standby.

  4. Click Start to start the switchover process.

Image Added

Image Added

  1. The configuration is In Sync with the WAL Shipping method in the reverse direction.

  2. The observer is enabled once the switchover is completed and is now monitored.

  3. The new primary server and Cluster status show ONLINE on port 5433.

  4. The WAL Shipping method is used by the cluster to sync between standby and primary.

  5. The new standby server and cluster status showing HOT STANDBY on port 5433.

Image Added