03 - Data Instantiation

What is Data Instantiation

Before data replication can start, the target database must have a copy of the database objects that are going to be replicated. The data in those objects must also be completely in sync at the point at which replication starts. Replication can only keep data in sync from a certain point in time onward; all data from before that point must be copied manually as a one-off task.

This one-off copy is called Data Instantiation, and the Oracle SCN (System Change Number) plays a key part in it.

The Oracle SCN is captured at the point at which replication starts, which fixes a consistent snapshot of the data at that point in time. This SCN is then used in two ways:

  1. Everything up to the SCN must be copied manually. This is the Data Instantiation. In this example it is done with Oracle DataPump, but other methods, such as using a standby database, are also available.
  2. Everything after the SCN is replicated by the logical replication tool (in this case Dbvisit Replicate, but it could also be Oracle GoldenGate).
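
To get a feel for what an SCN looks like, the current value can be read from the source database. The sketch below is only an illustration and is not how Dbvisit Replicate captures its SCN. It assumes sqlplus is on the PATH and reuses the system/manager login and the SOURCE alias that appear elsewhere in this lab; the function name current_scn is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the current SCN of the source database.
# Assumption: sqlplus is on the PATH and system/manager@SOURCE is a valid login.
current_scn() {
  sqlplus -s system/manager@SOURCE <<'EOF'
set heading off feedback off pagesize 0
select current_scn from v$database;
exit
EOF
}
```

Calling current_scn twice a few seconds apart should show the number increasing, since the SCN advances as the database commits work.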

1. Ensure that the REPOE schema on the target is clean. Do this by executing the record_count.bash script in the $HOME/scripts directory of the source machine. The script shows the record counts for the source and target tables, and it also shows whether the target tables exist. At this stage they should not exist, which is what you want. If they do exist, the DataPump script will fail.

cd $HOME/scripts
./record_count.bash

The output will be similar to:

[oracle@source scripts]$ ./record_count.bash

TABLE_NAME                     SOURCE      TARGET
------------------------------ ----------- -----------
ADDRESSES                      750003      *No Table*
CARD_DETAILS                   750003      *No Table*
CUSTOMERS                      500003      *No Table*
INVENTORIES                    900131      *No Table*
LOGON                          1191500     *No Table*
ORDERENTRY_METADATA            4           *No Table*
ORDERS                         714895      *No Table*
ORDER_ITEMS                    2143687     *No Table*
PRODUCT_DESCRIPTIONS           1000        *No Table*
PRODUCT_INFORMATION            1000        *No Table*
WAREHOUSES                     1000        *No Table*

11 rows selected.


Sum of orders         TTORCL_SRC         TTORCL_TRG
------------- ------------------ ------------------
ORDERS         $3,572,944,731.00               $.00


The output should indicate there are no tables in the REPOE schema in the target database. 
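
If you prefer not to check the TARGET column by eye, the check can be sketched as a small filter over the script output. This is a minimal sketch that assumes the column layout shown above (table name, source count, then either a count or *No Table*); the function name check_target_clean is hypothetical and is not part of the lab scripts.

```shell
#!/bin/bash
# Hypothetical helper: read record_count.bash output on stdin and print the
# name of every table that already exists on the target.
# Exit status 0 means no target tables were found, 1 means at least one exists.
check_target_clean() {
  awk '
    # Table rows are the only lines whose second field is a plain number.
    $2 ~ /^[0-9]+$/ {
      # "*No Table*" is split by awk into the two fields "*No" and "Table*".
      if (!($3 == "*No" && $4 == "Table*")) { print $1; found = 1 }
    }
    END { exit (found ? 1 : 0) }
  '
}
```

For example, ./record_count.bash | check_target_clean prints nothing and returns 0 when the target schema is empty.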

A database link is a schema object in one database that enables you to access objects in another database. After you have created a database link, you can use it to refer to tables and views in the other database. Oracle DataPump uses the database link to read the data from the remote database (in our example, the source database) and insert it into the target database.

The APPLY.sh script created by the Setup Wizard instantiates the target schema using DataPump export/import over a database link. This database link must be created for the script to work.

2. On the target server, set up the database link as the user SYSTEM, pointing to the source database.

sqlplus system/manager@TARGET
SQL> CREATE DATABASE LINK SOURCE CONNECT TO system IDENTIFIED BY manager USING 'SOURCE';
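
Once the link exists, a quick query through it is one way to confirm that it resolves before DataPump relies on it. The sketch below is an optional check, not a step created by the Setup Wizard. It assumes sqlplus is on the PATH and reuses the system/manager@TARGET login from this step; the function name check_source_link is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: run a trivial query through the SOURCE database link.
# Assumption: sqlplus is on the PATH and system/manager@TARGET is a valid login.
check_source_link() {
  sqlplus -s system/manager@TARGET <<'EOF'
set heading off feedback off pagesize 0
select 'link ok' from dual@SOURCE;
exit
EOF
}
```

If the link or the SOURCE alias is wrong, sqlplus reports an ORA- error instead of returning the row.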

This database link is needed by the DataPump script that the Setup Wizard created as APPLY.sh. The content of APPLY.sh is similar to the following. DO NOT COPY THE CONTENT; it is only an example.

impdp SYSTEM/manager@TARGET table_exists_action=TRUNCATE network_link=SOURCE directory=DATA_PUMP_DIR flashback_scn=581609 tables=REPOE.ADDRESSES,REPOE.CARD_DETAILS,REPOE.CUSTOMERS,REPOE.INVENTORIES,REPOE.LOGON,REPOE.ORDERENTRY_METADATA,REPOE.ORDERS,REPOE.ORDER_ITEMS,REPOE.PRODUCT_DESCRIPTIONS,REPOE.PRODUCT_INFORMATION,REPOE.WAREHOUSES   logfile=REPOE_WAREHOUSES.log JOB_NAME=DP_replicate_0001

 

The flashback_scn value (581609 in this example) sets the consistency point of the load: the DataPump script above loads the data as it was at this SCN, and Dbvisit Replicate replicates all changes made after this SCN.
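
Because the SCN in your own APPLY.sh will differ from the example, it can be handy to pull the value out of the script rather than read the long command line. This is a minimal sketch that assumes the script contains a flashback_scn=<number> parameter as shown above; the function name extract_flashback_scn is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the first flashback_scn value found in a script.
# $1 is the path to the DataPump script, for example $HOME/replicate/APPLY.sh.
extract_flashback_scn() {
  # Keep only the digits that follow "flashback_scn=" on lines that contain it.
  sed -n 's/.*flashback_scn=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}
```

For example, extract_flashback_scn $HOME/replicate/APPLY.sh prints the SCN that your instantiation will use.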

3. On the source server, as oracle, in the $HOME/replicate directory, execute the APPLY.sh script. This script will take approximately 3 to 9 minutes to complete, depending on your host machine.

cd $HOME/replicate
./APPLY.sh 

The output will be similar to:

[oracle@source replicate]$ ./APPLY.sh

Import: Release 11.2.0.2.0 - Production on Thu May 25 02:58:10 2017

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit Production
Starting "SYSTEM"."DP_REPLICATE_0001":  SYSTEM/********@TARGET table_exists_action=TRUNCATE network_link=SOURCE directory=DATA_PUMP_DIR flashback_scn=581609 tables=REPOE.ADDRESSES,REPOE.CARD_DETAILS,REPOE.CUSTOMERS,REPOE.INVENTORIES,REPOE.LOGON,REPOE.ORDERENTRY_METADATA,REPOE.ORDERS,REPOE.ORDER_ITEMS,REPOE.PRODUCT_DESCRIPTIONS,REPOE.PRODUCT_INFORMATION,REPOE.WAREHOUSES logfile=REPOE_WAREHOUSES.log JOB_NAME=DP_replicate_0001
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 616.6 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . imported "REPOE"."INVENTORIES"                       900131 rows
. . imported "REPOE"."ORDER_ITEMS"                      2143687 rows
. . imported "REPOE"."ORDERS"                            714895 rows
. . imported "REPOE"."ADDRESSES"                         750003 rows
. . imported "REPOE"."CUSTOMERS"                         500003 rows
. . imported "REPOE"."CARD_DETAILS"                      750003 rows
. . imported "REPOE"."LOGON"                            1191500 rows
. . imported "REPOE"."PRODUCT_DESCRIPTIONS"                1000 rows
. . imported "REPOE"."PRODUCT_INFORMATION"                 1000 rows
. . imported "REPOE"."ORDERENTRY_METADATA"                    4 rows
. . imported "REPOE"."WAREHOUSES"                          1000 rows
Processing object type TABLE_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type TABLE_EXPORT/TABLE/INDEX/INDEX
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type TABLE_EXPORT/TABLE/INDEX/FUNCTIONAL_AND_BITMAP/INDEX
Processing object type TABLE_EXPORT/TABLE/INDEX/STATISTICS/FUNCTIONAL_AND_BITMAP/INDEX_STATISTICS
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Job "SYSTEM"."DP_REPLICATE_0001" successfully completed at 03:01:34

4. On the source machine, as oracle, in the $HOME/scripts directory, execute the record_count.bash script to check the record counts between the two databases.

cd $HOME/scripts
./record_count.bash

TABLE_NAME                     SOURCE      TARGET
------------------------------ ----------- -----------
ADDRESSES                      750003      750003
CARD_DETAILS                   750003      750003
CUSTOMERS                      500003      500003
INVENTORIES                    900131      900131
LOGON                          1191500     1191500
ORDERENTRY_METADATA            4           4
ORDERS                         714895      714895
ORDER_ITEMS                    2143687     2143687
PRODUCT_DESCRIPTIONS           1000        1000
PRODUCT_INFORMATION            1000        1000
WAREHOUSES                     1000        1000

11 rows selected.

Sum of orders             SOURCE             TARGET
------------- ------------------ ------------------
ORDERS         $3,572,944,731.00  $3,572,944,731.00


Because there was no transaction activity in the source database, the source and target database record counts should be the same.
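
The comparison of the two count columns can also be sketched as a filter over the script output. This is a minimal sketch that assumes the column layout shown above (table name, source count, target count); the function name find_count_mismatches is hypothetical and is not part of the lab scripts.

```shell
#!/bin/bash
# Hypothetical helper: read record_count.bash output on stdin and print every
# table whose source and target counts differ.
# Exit status 0 means all counts match, 1 means at least one mismatch.
find_count_mismatches() {
  awk '
    # Table rows are the only lines whose second field is a plain number.
    $2 ~ /^[0-9]+$/ && $2 != $3 {
      print $1, "source=" $2, "target=" $3
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

For example, ./record_count.bash | find_count_mismatches prints nothing and returns 0 when every table matches.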

The source and target databases are now in sync and we can start the replication processes.