One of the most complex issues in Data Warehousing is how to detect changes on your source system. Most data warehouses do a complete dump and load of all the data during the nightly load. With Dbvisit Replicate and the new Audit functionality, only data that has been changed can be loaded real time into the Data Warehouse. This allows for reduced processing and real time updates of the Data Warehouse.
The following operations are captured and written into a special created Dbvisit Replicate stage table in the target database:
The following operations are NOT captured:
The Audit function can also be used to capture auditing information for analysis purposes.
The Audit information can be turned on during the setup wizard.
Dbvisit Replicate creates a copy of the source table as a staging table in the target database and inserts a record into this staging table every time there is a change in the source table (for Update, Delete, Insert). This means that even with an updated or delete in the source system, a new record is created in the target (or staging) table. The staging table holds a new record for every change in the source system.
The loading of the data warehouse process can use this staging table to process all new updates into the data warehouse. When the record has been processed and loaded into the data warehouse, the record can be removed from the staging table to indicate that the change has been loaded.
The staging table contains 2 columns for every 1 column on the source table. There is an OLD and NEW value. If a column is updated on the source table, then the replicated staging table will contain both the OLD and the NEW value.
Audit information is also written to the staging table.
This audit information contains:
The source database contains a table called SALES, and we want to capture all changes to this table into a staging table.
Dbvisit Replicate will create the DDL for the staging table which will contain:
For every update, delete or insert transaction on the source database, a new record will be inserted into the staging table with the old and new values as they are applicable.
The OPERATION column of the staging table will contain the operation. This can be:
Example of how the Audit function works in Dbvisit Replicate to facilitate the ETL process for real-time loading of data warehouses. Length - 6:10
The Audit function is turned on during the setup wizard.
There are 5 steps to turning on the Audit functionality:
Once Audit is turned on, the setup wizard will ask more questions to specifically configure Audit for each particular need:
The Audit function can also be turn on manually. This can be done in the dbvrep console.
To manually turn on Audit for a table called AVI.SALES, issue the following commands:
PREPARE TABLE AVI.SALES CDCAUDIT TABLE AVI.SALES INSERT YES CDCAUDIT TABLE AVI.SALES DELETE YES CDCAUDIT TABLE AVI.SALES UPDATE YES CDCAUDIT TABLE AVI.SALES OLDCOL_PREFIX OLD_ CDCAUDIT TABLE AVI.SALES NEWCOL_PREFIX NEW_ CDCAUDIT TABLE AVI.SALES OPCOL OPERATION CDCAUDIT TABLE AVI.SALES SCN SCN CDCAUDIT TABLE AVI.SALES TIMESTAMP_CHANGE DATE_CHANGE CDCAUDIT TABLE AVI.SALES XIDCOL TRANSACTION_ID CDCAUDIT TABLE AVI.SALES TIMESTAMP_COMMIT DATE_COMMIT CDCAUDIT TABLE AVI.SALES SID SID CDCAUDIT TABLE AVI.SALES SERIAL SERIAL CDCAUDIT TABLE AVI.SALES CUSER CURRENT_USER CDCAUDIT TABLE AVI.SALES LUSER LOGON_USER CDCAUDIT TABLE AVI.SALES CLIINFO CLIENT_INFO CDCAUDIT TABLE AVI.SALES OSUSER OS_USER CDCAUDIT TABLE AVI.SALES MACHINE MACHINE CDCAUDIT TABLE AVI.SALES OSTERM OS_TERM CDCAUDIT TABLE AVI.SALES OSPROC OS_PROC_ID CDCAUDIT TABLE AVI.SALES OSPROG OS_PROG |
This also turns on all auditing columns available.
The changes are inserted as they are fed from mine, i.e. they can stay inserted but uncommitted until the transaction commits on the source database. When using the Audit option, this means that the application that reads the Audit data must not rely solely on the SCN/time of the change to determine whether a record was processed (that's what the commit SCN is for). And it also means that the purging of old partitions must not be too aggressive, as they can still contain uncommitted rows. |
Options to purge data out of the Audit inserted staging tables are:
The changes are inserted as they are fed from mine, i.e. they can stay inserted but uncommitted until the transaction commits on the source database. This means that the application that reads the Audit data must not rely solely on the SCN/time of the change to determine whether a record was processed (that's what the commit SCN is for). And it also means that the purging of old partitions must not be too aggressive, as they can still contain uncommitted rows. The commit SCN is inserted as NULL and updated only when the commit actually comes in. (This means the transaction id should be indexed if commit SCN column is used and that you need row movement enabled if you use that as partition key.) |