Oracle Replication RAC and HA Best Practices for Linux

This article discusses the best practices, challenges, and known issues you may encounter when configuring Dbvisit Replicate on Oracle RAC for high availability.

Introduction

What are the challenges of RAC with Dbvisit Replicate?

The database instances can go up and down

Instances can be added and dropped

The machine running Replicate can go down

Replicate may be connected to a database instance that goes down

The concepts are similar for any clusterware; the examples here use Oracle Clusterware 11.2.

Best Practices

Location of Dbvisit Replicate

To provide HA for Dbvisit Replicate, run it on a machine configured in a cluster. The simplest way is to run it on the cluster that runs the database itself.

Shared Files

For mine, the MINE_PLOG directory must be on a shared filesystem. If fetcher is used, MINE_STAGING_DIR must be shared as well.

For apply, the APPLY_STAGING_DIR must be on a shared filesystem.

The dbvrep executable and the DDC locations must be identical on all nodes. If the DDC files are not on shared storage, they must be copied to each node.

This requirement may be made obsolete in future versions. The archive logs must be on shared storage (cluster file system or ASM).
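The shared-storage requirements above can be spot-checked with a small script. This is a minimal sketch, not part of Dbvisit Replicate: the function names and the marker file name are hypothetical, and the directory paths must be taken from your own MINE_PLOG, MINE_STAGING_DIR and APPLY_STAGING_DIR settings.

```shell
#!/bin/sh
# Hypothetical helpers for checking Replicate directories on each cluster node.

# Report whether a directory exists and is writable on this node.
check_dir() {
    dir="$1"
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "OK $dir"
    else
        echo "MISSING_OR_READONLY $dir"
        return 1
    fi
}

# Run on the first node: record this node's name in a marker file.
write_marker() {
    uname -n > "$1/.dbvrep_shared_check"
}

# Run on another node: a truly shared directory shows the first node's name.
read_marker() {
    cat "$1/.dbvrep_shared_check"
}

# Example (paths are placeholders):
#   node1$ check_dir /path/to/mine_plog && write_marker /path/to/mine_plog
#   node2$ read_marker /path/to/mine_plog
```

If read_marker on the second node fails or prints nothing, the directory is probably local to each node rather than shared.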

Database Connectivity

Dbvisit Replicate keeps a connection to the source (for mine and fetcher) or target (apply) database. Console keeps a connection to both of them, but only if it is used to change the configuration.

Best practice: Configure a separate TNS identifier for each database.

Best practice: For each source node, configure the TNS identifier to connect to the local instance only. If possible, do that for target nodes as well.

This ensures that when Replicate is running on a node, it connects to one instance only, which limits its dependencies to the current node. The database load imposed by mine on the source is very low, so no load balancing is needed. Apply currently executes only one SQL statement at a time, so its load is also not high – but this will change in the future.
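One way to pin a TNS identifier to the local instance is to name the instance in CONNECT_DATA. The sketch below prints such a tnsnames.ora entry; the function name and every value in the usage line (alias, host, service, instance) are hypothetical and must be replaced with your own.

```shell
#!/bin/sh
# Print a tnsnames.ora entry that connects to one named instance only.
# Arguments: alias host port service_name instance_name
local_tns_entry() {
    alias_name="$1"; host="$2"; port="$3"; service="$4"; instance="$5"
    cat <<EOF
$alias_name =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = $host)(PORT = $port))
    (CONNECT_DATA =
      (SERVICE_NAME = $service)
      (INSTANCE_NAME = $instance)
    )
  )
EOF
}

# Example with placeholder values; append the output to tnsnames.ora on node 1:
#   local_tns_entry ERP_LOCAL rac1-vip.example.com 1521 erp erp1
```

Each node would carry the same alias but point it at its own host and instance, so the Replicate configuration stays identical wherever the process runs.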

Best practice: Configure TAF (transparent application failover) for this TNS identifier for the machine running the console (if it is run on one node of the source, configure TAF for the apply connection only).

This is optional and it just limits the impact of a node going down on the console.
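For the console connection, TAF can be requested in the TNS entry itself through a FAILOVER_MODE clause. This is a sketch under assumptions: the function name, the retry and delay values, and the host and service names in the usage line are illustrative only.

```shell
#!/bin/sh
# Print a tnsnames.ora entry with a basic TAF (FAILOVER_MODE) clause.
# Arguments: alias host port service_name
taf_tns_entry() {
    alias_name="$1"; host="$2"; port="$3"; service="$4"
    cat <<EOF
$alias_name =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = $host)(PORT = $port))
    (CONNECT_DATA =
      (SERVICE_NAME = $service)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 20)(DELAY = 5))
    )
  )
EOF
}

# Example with placeholder values (a SCAN or cluster-wide address is assumed):
#   taf_tns_entry ERP_CONSOLE rac-scan.example.com 1521 erp
```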

Application Connectivity

For each cluster, set up an Application VIP managed by the Clusterware. Use this address as the node name under which the Replicate processes run.

This IP will always be assigned to just one node and the Replicate process will follow it in case of node failure.

To create an application VIP, use the following syntax:

# appvipcfg create -network=network_number -ip=ip_address -vipname=vip_name -user=user_name
[-group=group_name] [-failback=0|1]

Where network_number is the number of the network, ip_address is the IP address, vip_name is the name of the VIP (dbvrepVIP in the examples that follow), and user_name is root (root privileges are needed to reconfigure network interfaces). The default value of the failback option is 0. If you set the option to 1, the VIP (and therefore Replicate as well) fails back to the original node when that node becomes available again.
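As an illustration of the syntax, the wrapper below fills in the options with concrete values. It is only a sketch: the function name is hypothetical, and the network number and IP address in the usage line are placeholders for your own.

```shell
#!/bin/sh
# Create an application VIP owned by root (run as root).
# Arguments: network_number ip_address [vip_name] [failback]
create_app_vip() {
    network="$1"; ip="$2"; vip="${3:-dbvrepVIP}"; failback="${4:-0}"
    appvipcfg create -network="$network" -ip="$ip" \
        -vipname="$vip" -user=root -failback="$failback"
}

# Example with placeholder values:
#   create_app_vip 1 192.0.2.50 dbvrepVIP
```

Leaving failback at 0 keeps the VIP, and with it Replicate, on the surviving node after the original node returns, which avoids a second interruption.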

As root, allow the Oracle Grid Infrastructure software owner (e.g. oracle) to run the script to start the VIP.

# crsctl setperm resource dbvrepVIP -u user:oracle:r-x

To check the status, run:

$ crsctl status res dbvrepVIP -p

As oracle user, start the VIP:
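The start command itself is not shown in this copy of the article. A likely form, assuming the VIP was named dbvrepVIP as in the commands above, uses crsctl start resource; the wrapper function name is hypothetical.

```shell
#!/bin/sh
# Start the application VIP (run as the oracle user).
# Argument: [vip_name], defaulting to the name used in this article.
start_vip() {
    vip="${1:-dbvrepVIP}"
    if command -v crsctl >/dev/null 2>&1; then
        crsctl start resource "$vip"
    else
        echo "crsctl not found in PATH; would run: crsctl start resource $vip" >&2
        return 127
    fi
}

# Example:
#   start_vip dbvrepVIP
```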

Configure and Test

Configure Dbvisit Replicate as usual, using the previously configured TNS identifiers for the source and target databases and the Application VIPs for the locations where the processes will run (*_LISTEN_INTERFACE settings).

Start the replication (use the nodes where the Application VIPs currently reside) and test that your configuration works.

Create action scripts

Create scripts that will manage (start/stop/check/clean/abort) the dbvrep processes. Use the script provided and:

modify the settings at the top of the script (DDC name, process name, paths)

save the script to a directory and make it executable by oracle user; do this on all cluster nodes

An example name for such a script is /home/oracle/REP/action_script_MINE.scr.
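To show what Clusterware expects from an action script, here is a generic skeleton. It is not the script provided by Dbvisit: the PID-file approach, the variable names, and the START_CMD placeholder are all assumptions, and START_CMD must be replaced with the real command that starts your dbvrep process.

```shell
#!/bin/sh
# Hypothetical skeleton of a Clusterware action script (NOT the Dbvisit one).
# Clusterware calls it with one argument: start, stop, check, clean or abort.
# Exit status 0 means success (for check: the process is running).

PROCESS_NAME="${PROCESS_NAME:-MINE}"
PIDFILE="${PIDFILE:-/tmp/dbvrep_${PROCESS_NAME}.pid}"
START_CMD="${START_CMD:-sleep 300}"   # placeholder; set to your dbvrep start command

# True if the PID recorded in PIDFILE belongs to a live process.
is_running() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

action() {
    case "$1" in
        start)
            if is_running; then return 0; fi
            # START_CMD is intentionally unquoted so its arguments are split.
            $START_CMD >/dev/null 2>&1 &
            echo $! > "$PIDFILE"
            ;;
        stop|clean|abort)
            if is_running; then kill "$(cat "$PIDFILE")"; fi
            rm -f "$PIDFILE"
            ;;
        check)
            is_running
            ;;
        *)
            echo "usage: $0 {start|stop|check|clean|abort}" >&2
            return 1
            ;;
    esac
}

# Dispatch only when an argument is given, as Clusterware always does.
if [ $# -gt 0 ]; then
    action "$1"
    exit $?
fi
```

The important contract is the exit status of check: Clusterware polls it and restarts or relocates the resource when it reports failure.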

Create a resource for Replicate

As oracle, create the cluster resource for the dbvrep process:

Use any name for resource_name, e.g. dbvrepMINE_DDCname like dbvrepMINE_ERP.

For the action script, use the path to the script created in the previous section.

For start dependencies, use the resource name of your database (like ora.orcl.db) and, if ASM is used, ora.asm for the ASM resource.

The resource was created with “restore” (the default) AUTO_START setting and thus Clusterware will try to start it automatically if it was running last time the cluster was shut down.

The resource was created with default placement policy and thus any node will be eligible to host the resource. Use placement=favored/restricted and hosting_members/server_pools to change this.
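The command that creates the resource is not shown in this copy of the article. The sketch below is one plausible form using crsctl add resource with the cluster_resource type. The function name, the CHECK_INTERVAL value, and the dependency on the application VIP are assumptions; add ora.asm to the hard dependency list if ASM is used.

```shell
#!/bin/sh
# Register a dbvrep process as a cluster resource (run as oracle).
# Arguments: resource_name action_script db_resource [vip_name]
add_dbvrep_resource() {
    resource_name="$1"; action_script="$2"; db_resource="$3"
    vip="${4:-dbvrepVIP}"

    # Assumption: the resource should start only where the VIP and database run.
    attrs="ACTION_SCRIPT=$action_script, CHECK_INTERVAL=30"
    attrs="$attrs, START_DEPENDENCIES='hard($vip,$db_resource) pullup($vip)'"
    attrs="$attrs, STOP_DEPENDENCIES='hard($vip)'"

    crsctl add resource "$resource_name" -type cluster_resource -attr "$attrs"
}

# Example using the names from this article:
#   add_dbvrep_resource dbvrepMINE_ERP \
#       /home/oracle/REP/action_script_MINE.scr ora.orcl.db
```

With a hard start dependency and pullup on the VIP, the Replicate resource is started on whichever node currently holds the VIP and follows it after a relocation.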

How does this interact with the VIP?

Manage the processes

Stop the replication if it was started manually.

To manage the replication processes, use:

Do not use “shutdown” from the console; this would cause Clusterware to restart the process.
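The management commands are not shown in this copy of the article. Since the processes are now cluster resources, they would normally be controlled through crsctl; the wrapper below is a sketch with a hypothetical function name, using the example resource name from this article.

```shell
#!/bin/sh
# Control a dbvrep cluster resource through Clusterware.
# Arguments: action resource_name [target_node]
dbvrep_resource() {
    action="$1"; resource_name="$2"
    case "$action" in
        start|stop|status)
            crsctl "$action" resource "$resource_name"
            ;;
        relocate)
            # Move the resource to another node, named by the third argument.
            crsctl relocate resource "$resource_name" -n "$3"
            ;;
        *)
            echo "usage: dbvrep_resource {start|stop|status|relocate} name [node]" >&2
            return 2
            ;;
    esac
}

# Examples:
#   dbvrep_resource start  dbvrepMINE_ERP
#   dbvrep_resource stop   dbvrepMINE_ERP
#   dbvrep_resource status dbvrepMINE_ERP
```

Stopping through crsctl tells Clusterware the stop is intentional, so it does not restart the process.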

Unsupported Operations

Dbvisit Replicate does not support online addition or dropping of cluster instances or, to be more precise, of redo threads. Before such an operation, the Dbvisit Replicate mine process for that database must be stopped.

Known Issues

Running appvipcfg fails with error:

Reason: Oracle bug 9964188

Workaround: edit appvipcfg.pl and remove the "\" in front of USR_ORA_VIP. See also the patch shown in the bug description at My Oracle Support.
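The edit can be scripted with sed. This is a sketch that assumes the backslash sits immediately before the text USR_ORA_VIP in your copy of appvipcfg.pl; the function name is hypothetical, and the result should be compared against the backup and the patch in the bug description before use.

```shell
#!/bin/sh
# Remove the backslash in front of USR_ORA_VIP, keeping a backup copy.
# Argument: path to appvipcfg.pl
fix_appvipcfg() {
    file="$1"
    # Keep the original next to the edited file as <file>.orig.
    cp -p "$file" "$file.orig" || return 1
    # '\\' in the sed pattern matches one literal backslash.
    sed 's/\\USR_ORA_VIP/USR_ORA_VIP/g' "$file.orig" > "$file"
}

# Example (the path is a placeholder; locate the file in your Grid home):
#   fix_appvipcfg /path/to/appvipcfg.pl
#   diff /path/to/appvipcfg.pl.orig /path/to/appvipcfg.pl
```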