DR Considerations of Bidirectional Transactional Replication

Bidirectional Transactional Replication1

Bidirectional transactional replication, also known as two-way transactional replication, permits a server to be both a publisher and a subscriber to the same data. Because the servers that participate in the replication will replicate any changes to the other servers, any changes are not propagated back to the originating server.

For example, if you have two servers (Server A and Server B), the servers are said to be in bidirectional transactional replication if both of the following conditions are true:

  • The changes that are made to Table T1 at Server A are replicated to Table T1 at Server B.

  • The changes that are made to Table T1 at Server B are replicated to Table T1 at Server A.

Therefore, if a change originates from Server A, the change is replicated to Server B, but Server B does not propagate the same change back to Server A. Replication uses a loopback detection mechanism that the distributor uses to determine whether to send the changes back to originating server

Challenges When Using Replicate for DR

For a successful implementation of Bi-Directional Replication a careful analysis is required. You must understand what the supported and unsupported objects are, business requirements and more importantly how to identify and resolve conflicts.

Dbvisit Replicate is an Asynchronous solution. It is possible that there may be LAG at times depending on the transaction volume, network issues or when processes are down. This LAG can cause data inconsistencies. To avoid this situation make sure that there is very little or no LAG or have proper SLAs.

Deploy transactional replication only in scenarios where conflicts can be avoided instead of resolved.

Why is it important to design for bidirectional replication?

If an application can be sharded or partitioned across databases involved in replication we can mostly eliminate conflicts because each site will operate on their own set of data. For example, one record should not be modified on both sites at the same time.

If this is an existing application or 3rd party software and cannot be controlled then data is going to be modified on both sides. It is best to impose some business rules for ownership of records to automate the resolution of conflicts as described below in a full section dedicated to this topic.

What About Half-Duplex Configurations: One Active Site and a Second Warm Site?

Half-duplex bidirectional configuration is where the replication processes are configured in both directions but are only active in one direction at a time. This can be useful in situations where a fallback position is required during a migration or upgrade. The reverse replication can be enabled after the migration/upgrade. This keeps the old system in sync with the new should the need arise to revert back to the old system after a period of time.

The important point here is that all traffic must be drained from the active site. The passive site should be caught up (no lag) from the former active site before enabling traffic on the second site from customers. If this is not possible, then conditions as described above can occur. In this case automated handlers/logic must be put in place to handle them.

There are special considerations with triggers as well. With an active/ passive configuration they can be set to ENABLED on source side and DISABLED on target side. If target becomes the active site then triggers need to be DISABLED on the former active site & ENABLED on the new active site. If not triggers can fire and corrupt data. Another option is described below as well in a separate section for Triggers below.

What about Sequences in Bidirectional Replication?

Sequences are not supported in Bi-Directional Replication. You must deploy one of the methods below to overcome it.

  • ODD/EVEN

  • RANGE

  • CONCATENATE

You may also choose to customize the sequence definition at each replication site such that the values retrieved are unique across all sites. It is critical that table primary keys and unique indexes are unique for each database participating in an active-active replication environment.

What about Triggers?

Dbvisit Replicate does not replicate triggers. However, triggers on target tables of any replication deserve special consideration. The reason being is that the trigger on the source that makes changes on the source will be replicated over to the target database so there is no reason for the triggers to fire again on the target. Therefore, in bi-directional replication, triggers should not be fired if the DML is being executed by the APPLY or APPLY1 process.

Configuration variable APPLY_SET_TRIGGER_FIRE_ONCE or boolean variable DBVREP.DBRSAPPLY_PKG.IS_DBREPLICATE_SESSION can be used to control trigger execution depending upon the Oracle version.

Oracle 10.2.0.5, 11.2.0.2 and 11.2.0.3: APPLY_SET_TRIGGER_FIRE_ONCEIn Oracle 10.2.0.5, 11.2.0.2 and 11.2.0.3, if APPLY_SET_TRIGGER_FIRE_ONCE variable is set to YES (Default) then the triggers do not fire for changes done by the APPLY process. Therefore, APPLY.APPLY_SET_TRIGGER_FIRE_ONCEand APPLY1.APPLY_SET_TRIGGER_FIRE_ONCE are set to YES in bi-direction replication.

NoteDBMS_DDL.SET_TRIGGER_FIRING_PROPERTY must be set to TRUE (Default) in Oracle.

Oracle versions 11.2.0.4 and above: DBRSAPPLY_PKG.IS_DBREPLICATE_SESSIONFor Oracle versions 11.2.0.4 and above the APPLY_SET_TRIGGER_FIRE_ONCE will not work, as Oracle has changed the licensing model. The workaround is to add the following condition in the trigger so the trigger will not be executed when APPLY or APPLY1 process executes DMLs. For example:

CREATE OR REPLACE TRIGGER TRG_TEMP_SCHEMA_TABLE BEFORE UPDATE ON SCHEMA.TABLE FOR EACH ROW BEGIN IF DBVREP.DBRSAPPLY_PKG.IS_DBREPLICATE_SESSION=FALSE THEN :new.last_updated:=systimestamp; END IF; END; /

What Types of Conflicts Can I Encounter With Bi-Directional & Multi-Master Replication?

The whole next section in this document is dedicated to this topic.

Conflicts in Bidirectional Transactional Replication

When you make changes on a server that is participating in replication, the changes are replicated to all other participating servers. During this replication conflicts may occur and replication may fail. The following list describes the possible conflicts and the ways that you can avoid these conflicts. These are a few basic examples:

Uniqueness ConflictThis conflict results from an attempt to insert records with the same primary key from two different sites.Suggested ActionTo avoid this problem make sure that you use different keys on each server that participates in the replication. To do so, allocate a predetermined range of keys to each server that participates in the replication. You can also use a composite key on each server.Delete ConflictThis type of conflict occurs when one transaction deletes a row that another transaction updates (before the delete is propagated). When you update a record that has been deleted on another server, the UPDATE statement affects zero rows on the server where the record has been deleted, and the replication fails with an error.Suggested ActionTo avoid this problem, perform one of the following steps:

  • Do not allow any actual deletes but instead logically delete records by setting a deactivation flag in the record. 
    or

  • Look for a record before the DELETE statement. That is, Delete_check the stored procedure to check other sites for the record. If no record exists, bypass the UPDATE statement to deleted the record on all subscribers. This may be slow. What if a DB is down? How long do we wait for a check to respond?

Update ConflictThis conflict is caused by simultaneous update operations on the same record. This can occur when you update a column in a record that is updated at the same time on another server. The data may be different on the two servers.Suggested ActionTo avoid this problem, determine if the different columns in the same record are updated at the same time, and then take any necessary action:

  • Compare the current values of all the columns against their values before the update in a stored procedure. Note: This may be slow.

  • Use a trigger to check/compare the values on your other site(s). Things to consider are:

    • What if a DB needing to be checked is down or slow to respond?

    • How long do we wait for the check to respond / time out?


    or

  • Add a column to the table to represent the row version. Compare its current value with its value before the update. 
    or

  • Add a timestamp to the record itself and define business rules for handling this type of conflict with a handler:Latest Timestamp ValueWith this simple technique, you apply updates as they are received. Based on the timestamp value, the most recent updates overlay prior updates. This approach can result in situations where one user’s update gets overlaid by a more recent update.Earliest Timestamp ValueThis mechanism is the opposite of the latest timestamp value, in that the first update overlays subsequent updates. As you would expect, not many shops use this method, but it is an option.Minimum and Maximum ValueThis mechanism may be used when the advanced replication facility detects a conflict with a column group. The advanced replication facility calls the minimum value conflict resolution method. This compares the new value from the originating site with the current value from the destination site for a designated column in the column group. You must designate that column when you select the minimum value conflict resolution method.Groups priority ValueUsing this method, some groups have priority (a higher rank) over other groups. Therefore, the update associated with the highest-ranked group gets the update.Site Priority ValueIn this method, all master sites are NOT created equal. Some remote sites will have priority over other sites.

Note: Each deployment may require a different approach to resolve these conflicts, depending on business requirements. These conflicts are easier to resolve when only two servers are involved. When more than two servers are involved, you may be able to use stored procedures to determine which server originated the changes. Then an update stored procedure can be used to update the records if Server C does not know if the change originated in Server A or in Server B. Unlike merge replication, transactional replication is not designed to resolve conflicts.

[1] http://support.microsoft.com/kb/820675

Donna Zehl January 10, 2015 04:48