The 3 Fundamentals of Oracle Replication [Part 2 of 3]

This is part 2 of 3 in the series on "The 3 Fundamentals Principles of Oracle Replication". Part 1 can be found here.

The second fundamental principle of logical based Oracle replication is:
For the data in the table to be replicated, each row in the table must be uniquely identified.

To uniquely identify each row the where clause must contain one of the following:

  • Primary key

  • Unique key

  • All columns in tables (may still not uniquely identify row)

So it is not necessary for the table to have a primary key, as long as each row can be uniquely identified.

Example
To illustrate why this is necessary I will use an example. We have the following table SALES with the following columns:

CREATE TABLE SALES (PROD_ID NUMBER NO NULL, CUST_ID NUMBER NOT NULL, QUANTITY_SOLD NUMBER(10,2) NOT NULL, AMOUNT_SOLD NUMBER(10,2) NOT NULL, AMOUNT_RECEIVED NUMBER(10,2), SALES_STATUS VARCHAR2(10) NOT NULL );

We have the following update statement in our source database:

update SALES set AMOUNT_RECEIVED = 120 where SALES_STATUS = 'OVERDUE'

There is no primary key involved in our update statement, so this statement could update more than 1 record. But lets assume that it only updates 1 record on the source database with the following values:

PROD_ID = 101 CUST_ID = 201 QUANTITY_SOLD = 233 AMOUNT_SOLD = 1299 AMOUNT_RECEIVED is NULL SALES_STATUS = 'OVERDUE'

The generated SQL for the target database is translated into a single row update statement according to fundamental principle number 1.
The actual SQL that is generated depends upon if there is a primary key, a unique key, or no keys.

Primary key
If this table had a primary key which consists of column: PROD_ID, then the primary key is used to uniquely identify the row. The SQL statement that is run against the target database will have the following predicates added to the where clause:

  • Primary key

  • Old value of the updated column

This means that the SQL to run against the target database will look like this:

Adding the primary key to the where clause ensures that only 1 record is updated. Unqiue key If this table had a unique key which consists of column: PROD_ID and CUST_ID, then the unique key is used to uniquely identify the row. The SQL statement that is run against the target database will have the following predicates added to the where clause:

  • Unique key

  • Old value of the updated column

This means that the SQL to run against the target database will look like this:

Adding the unique key to the where clause ensures that only 1 record is updated.

No primary or unique key
If this table did not have a primary or unique key, then all the columns in the table are used to uniquely identify the row. The SQL statement that is run against the target database will have the following predicates added to the where clause:

  • All columns of the table (excluding LOBS and special datatypes)

  • Old value of the updated column

This means that the SQL to run against the target database will look like this:

Adding all the columns to the where clause ensures that only 1 record is updated if this second principle is true (each row in the table must be uniquely identified).

What happens if there is no primary key, no unique key, and all the columns in the table still do not uniquely identify a single row in the target table? This would mean that a conflict occurs. Conflicts can cause data divergence and lost updates.

Part 3, the final installation in this series, where I will discuss conflicts, can be found here: The 3 Fundamentals of Oracle Replication [Part 3 of 3].