How to update primary keys (PK) when using logical replication

Most people would agree with me when I say that updating primary keys (PK) is never a good idea because of foreign key constraints and integration or relationships with downstream and upstream systems. By updating the primary keys these relationships can be broken. When using replication, there are further complications that make updating primary keys a very bad idea. However there are situations that dictate that primary keys must be updated. This could happen when there is a merger between different companies or when systems are consolidated or integrated. For example companyA buys companyB. Both companies may share the same customers but they are in different systems and have different primary keys. When the merger happens you may want to keep the systems separate but at least have the same primary in both systems so that they are easily recognized as being the same customer. So one of the systems requires updating of the primary keys. Let's take a simple example to see how primary keys can be updated when using logical based replication such as Golden Gate, Streams or Dbvisit Replicate We will take the scott.dept table as we are all familiar with this table. Anyone not familiar with this table please see 'Oracle FAQ Scott table'. The data in the table looks as follows:

SQL> select * from dept; DEPTNO DNAME LOC ---------- -------------- ------------- 10 ACCOUNTING NEW YORK 20 RESEARCH DALLAS 30 SALES CHICAGO 40 OPERATIONS BOSTON

There is a foreign key relationship on the dept table, so if we try and update the primary key we get the following error:

SQL> update dept set deptno = 12 where deptno = 10; update dept set deptno = 12 where deptno = 10 * ERROR at line 1: ORA-02292: integrity constraint (SCOTT.FK_DEPTNO) violated - child record found

To update the primary key, the foreign key constraint on table EMP has to be disabled.

SQL> alter table EMP disable constraint FK_DEPTNO; Table altered.

Now that we have disabled the constraint, the primary key can be updated:

(note that the EMP table also has to be updated to set deptno from 10 to 12) Because we have (logical based) replication turned on, the update will also be replicated to the target site. In this case we are using Dbvisit Replicate for our replication. We can see that in the Dbvisit Replicate console, we have a conflict after we execute the update on our source database. By using the "list conflict" command we can inspect the conflict and see the actual SQL and the error that is causing the conflict:

The reason for the conflict is that we did not disable the foreign key constraint on the target database. Once we do this, then the conflict will automatically resolve because of the "RETRY" option of the conflict. A conflict handled as "RETRY" will automatically rerun the SQL at certain intervals. Once the constraint has been disabled, the update statement will complete and the primary key will have been updated. The data on the source and target database now looks as follows:

So if a single primary key value is updated as single rows, there is typically not an issue with the replication as long as all the foreign key constraints have been disabled and all the child or parent tables have also been updated. After this has been done, the contraints can be enabled again. When valid SQL on the source database is not valid when using replication Typically though, primary keys are updated in batches and this causes problems with logical based replication. For example the following SQL statement is a valid statement and will update 2 of the primary keys in a single SQL statement by swapping their values:

So DEPTNO=20 will become DEPTNO=30, and DEPTNO=30 will become DEPTNO=20. This statement will run happily on the source database and the result will be:

As can be seen, the values of DEPTNO 20 and 30 are reversed. With logical based replication each set-based SQL operation is broken down into a row-by-row operation (for more information about this please see 3 Principles of logical based replication) and therefore this valid SQL statement will cause a conflict on the target database. The following conflict will be seen as:

The SQL statement in the conflict is a single row SQL statement as expected. It tries to update DEPTNO and set it to 30 where DEPTNO = 20. Because DEPTNO = 20 already exists (in the primary key constraint) it violates the unique constraint of the PK. Because we had not committed the data yet on the source database, we can issue a rollback and that will clear the conflict on the target (when using Golden Gate, the conflict will not be seen until a commit is issued, because Golden Gate only replicates committed data). How do we update multiple primary keys at once without causing a conflict on the target database? One way would be drop the primary key constraint on the target database, but this can cause other issues because we may end up with duplicate values in the primary keys. The best solution is to issue single row by row SQL statements on the source database and use temporary values:

This will replicate without conflicts and so the resulting data on both source and target will be:

To put our DEPT data back the way it was since this was only a test, the following SQL can be used:

Summary It is not a good idea to update the primary keys of a table especially if you are using logical based replication. However if it becomes necessary to do this, ensure you are aware of the consequences and have a strategy in place that will successfully replicate the updated primary keys to the target database without causing conflicts.