Oracle DBMS, commonly known as Oracle Database, is a database management system produced and marketed by Oracle Corporation. First released as a commercial application in 1979, it is widely used for Online Transaction Processing (OLTP), Data Warehousing (DW), or a combination of the two.
The Technology of Oracle Change Data Capture
The concept of Oracle CDC (Change Data Capture) is rather simple. It is a set of software design patterns used to monitor and track all changes in a database so that actions, such as business analytics, can be taken on those changes.
Oracle CDC is also used to identify, integrate, and deliver the changes made at the source database of an enterprise. This helps to speed up data warehousing and data integration across the enterprise, because only the changed data has to be moved rather than entire tables.
Oracle CDC offers a largely non-intrusive way to run replication processes with little drop or lag in database performance. Typical uses include migrating databases to the cloud without shutting down the systems and offloading analytics queries from production databases to data repositories. Incremental data, that is, data changed since the last run, can also be extracted from various sources and moved to a data warehouse.
The most critical aspect of Oracle CDC is capturing and preserving the state of the data over time. For this reason it is most often found in data warehouse environments, although it can be carried out in any database or data storage system. CDC can be implemented at several layers, from application logic down to physical storage, either in a single layer or in a combination of layers.
For organizations using the Oracle database, Oracle CDC helps to increase operational efficiency and reduce data warehousing costs. This is possible because CDC extracts and transfers incremental data in real time, as soon as a change is tracked in the source database.
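The idea of extracting only the data changed since the last run can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than Oracle, and the table name order_changes, its columns, and the helper functions are assumptions made up for this illustration, not part of any Oracle API.

```python
import sqlite3

# In-memory stand-in for a change table; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_changes ("
    " seq INTEGER PRIMARY KEY,"  # increasing change sequence number
    " op TEXT, order_id INTEGER, status TEXT)"
)

def record_change(op, order_id, status):
    """Append one change record (I = insert, U = update, D = delete)."""
    conn.execute(
        "INSERT INTO order_changes (op, order_id, status) VALUES (?, ?, ?)",
        (op, order_id, status),
    )

def extract_since(last_seq):
    """Return the changes recorded after last_seq and the new high-water mark."""
    rows = conn.execute(
        "SELECT seq, op, order_id, status FROM order_changes"
        " WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    new_mark = rows[-1][0] if rows else last_seq
    return rows, new_mark

record_change("I", 1, "new")
record_change("U", 1, "paid")
first_batch, mark = extract_since(0)      # first run sees both changes

record_change("D", 1, "paid")
second_batch, mark = extract_since(mark)  # second run sees only the delete
```

Each run passes in the high-water mark returned by the previous run, so the warehouse load never re-reads changes it has already processed.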
The Evolution of Oracle Change Data Capture
Oracle introduced the change data capture feature in version 9i in 2001, more than two decades after the launch of the Oracle database management system. Initially, Oracle CDC worked through triggers placed on the tables in the source database. However, database administrators found this process complex and tedious.
Oracle addressed this in version 10g with a thorough overhaul of the existing technology, in which CDC read the redo logs of the database. Used in tandem with Oracle Streams, Oracle's replication tool, this made it possible to capture and transmit all changes to data without using triggers.
This log-based model of Oracle CDC became very popular and was widely used by businesses. Unfortunately, after Oracle 12c was released, Oracle decided to withdraw support for CDC. Users can still capture change data from Oracle, but only through Oracle GoldenGate, Oracle's very expensive replication software.
The Functioning of Oracle Change Data Capture
Before starting Oracle CDC, it is necessary to set up the required infrastructure and a journalizing mode that will capture and record the changes made to the existing database. Oracle Data Integrator supports two journalizing modes. The first is Simple Journalizing, where changes are identified for one specific datastore at a time. The second is Consistent Set Journalizing, where changes are identified across a group of datastores together. This mode preserves referential integrity between the datastores in the group.
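The difference between the two modes can be sketched as follows. This is only an illustration of the idea, not how Oracle Data Integrator implements journalizing; the datastore names orders and order_lines and all function names are assumptions made up for the example.

```python
import itertools

# One shared sequence so that changes in different datastores can be ordered.
_seq = itertools.count(1)
journals = {"orders": [], "order_lines": []}

def journalize(datastore, key):
    """Record that the row identified by key changed in the given datastore."""
    journals[datastore].append((next(_seq), key))

def simple_changes(datastore):
    """Simple mode: changes for a single datastore, viewed in isolation."""
    return [key for _, key in journals[datastore]]

def consistent_set_changes(datastores):
    """Consistent-set mode: changes for a group, in the order they occurred."""
    merged = [(seq, ds, key) for ds in datastores for seq, key in journals[ds]]
    return [(ds, key) for _, ds, key in sorted(merged)]

journalize("orders", 100)            # parent row
journalize("order_lines", (100, 1))  # child row referencing order 100
journalize("orders", 101)
journalize("order_lines", (101, 1))

lines_only = simple_changes("order_lines")
grouped = consistent_set_changes(["orders", "order_lines"])
```

In the grouped result every order appears before its order lines, which is what allows a consumer to apply the changes without violating a foreign key between the two datastores.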
There are two types of Oracle CDC.
Synchronous Change Data Capture
Here, triggers are placed on the source tables. Whenever existing data is changed, the triggers fire and write entries to a change table as part of the same transaction. A designated user, acting as the change data publisher, is given access to the source tables and to the namespace in which the changes are tracked and captured.
Next, a change set and change tables are created, to which consumers of the modified data subscribe. Before the changes can be copied to the target database, a script has to be developed that applies the data to the target. The drawback of Synchronous CDC is that the triggers often adversely affect the performance of the source database.
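A minimal sketch of the trigger-based approach is shown below. It uses SQLite through Python's sqlite3 module rather than Oracle, so the trigger syntax differs from PL/SQL, and the table names customers and customers_ct, the trigger names, and the apply_changes helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);

-- Change table: one row per captured insert, update, or delete.
CREATE TABLE customers_ct (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT, id INTEGER, name TEXT);

CREATE TRIGGER customers_ai AFTER INSERT ON customers BEGIN
    INSERT INTO customers_ct (op, id, name) VALUES ('I', NEW.id, NEW.name);
END;
CREATE TRIGGER customers_au AFTER UPDATE ON customers BEGIN
    INSERT INTO customers_ct (op, id, name) VALUES ('U', NEW.id, NEW.name);
END;
CREATE TRIGGER customers_ad AFTER DELETE ON customers BEGIN
    INSERT INTO customers_ct (op, id, name) VALUES ('D', OLD.id, OLD.name);
END;
""")

# Ordinary DML on the source table; the triggers fill the change table.
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO customers VALUES (2, 'Bob')")
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")
conn.execute("DELETE FROM customers WHERE id = 2")

changes = conn.execute(
    "SELECT op, id, name FROM customers_ct ORDER BY seq"
).fetchall()

def apply_changes(target, changes):
    """Apply captured changes to a target, modelled here as a plain dict."""
    for op, row_id, name in changes:
        if op == "D":
            target.pop(row_id, None)
        else:  # 'I' and 'U' both set the current value
            target[row_id] = name
    return target

target = apply_changes({}, changes)
```

Because each trigger runs inside the transaction that modifies the source table, every insert, update, and delete pays for an extra write, which is the performance cost described above.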
Asynchronous Change Data Capture
In this process, changes are captured from the redo log files after a SQL statement that performs a DML operation has been committed. Since the modified data is not captured as part of the transaction that changed the source tables, the capture has no effect on that transaction. Asynchronous Oracle CDC has three modes, HotLog, Distributed HotLog, and AutoLog, and is based on Oracle Streams, which has since been discontinued.
Both types of Oracle CDC have their advantages, and the choice depends on the particular needs of the organization.
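The log-based approach can be sketched in a few lines. This is a simplified model, assuming a redo log that is just an append-only list; the RedoLog and LogCapture classes are hypothetical and do not reflect the actual redo log format or the Oracle Streams API.

```python
class RedoLog:
    """Append-only log written by the database as transactions run."""

    def __init__(self):
        self.entries = []

    def write(self, txn_id, op, row):
        self.entries.append({"txn": txn_id, "op": op, "row": row})

    def commit(self, txn_id):
        self.entries.append({"txn": txn_id, "op": "COMMIT", "row": None})


class LogCapture:
    """Separate process that reads the log and emits only committed changes."""

    def __init__(self, log):
        self.log = log
        self.position = 0   # how far into the log we have read
        self.pending = {}   # changes of transactions not yet committed

    def poll(self):
        captured = []
        for entry in self.log.entries[self.position:]:
            if entry["op"] == "COMMIT":
                captured.extend(self.pending.pop(entry["txn"], []))
            else:
                self.pending.setdefault(entry["txn"], []).append(
                    (entry["op"], entry["row"])
                )
        self.position = len(self.log.entries)
        return captured


log = RedoLog()
capture = LogCapture(log)

log.write(1, "INSERT", {"id": 1, "name": "Ada"})
log.write(2, "UPDATE", {"id": 7, "name": "Bob"})
log.commit(1)
first = capture.poll()    # only transaction 1 is committed so far

log.commit(2)
second = capture.poll()   # transaction 2 is emitted once it commits
```

The source transactions only append to the log, which they would do anyway; the capture runs afterwards and on its own schedule, which is why it does not slow the transactions down.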
Benefits of Database Extraction with Oracle Change Data Capture
Oracle CDC is particularly useful for database extraction.
The most critical benefit is that it captures every change (Insert, Update, and Delete) in real time, as soon as it takes place in the source tables. Without CDC, identifying changed rows typically means comparing full copies of a table or relying on timestamp columns. This is especially difficult for Update and Delete operations, because the previous values or the deleted rows are no longer available in the table.
Further, Oracle CDC provides an easy-to-use publish and subscribe interface through the DBMS_LOGMNR_CDC_PUBLISH and DBMS_LOGMNR_CDC_SUBSCRIBE packages. Without CDC, this step would take substantial manual effort and, even then, would be prone to errors.
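The publish and subscribe model can be illustrated with a small sketch. It shows the general pattern only: the Publisher and Subscription classes and their method names are hypothetical and are not the signatures of the Oracle packages named above.

```python
class Publisher:
    """Owns the change table and publishes captured changes into it."""

    def __init__(self):
        self.change_table = []

    def publish(self, op, row):
        self.change_table.append((op, row))


class Subscription:
    """One subscriber's private window over the published changes."""

    def __init__(self, publisher):
        self.publisher = publisher
        self.low = 0    # start of the current window
        self.high = 0   # end of the current window (exclusive)

    def extend_window(self):
        """Make all changes published so far visible to this subscriber."""
        self.high = len(self.publisher.change_table)

    def view(self):
        """Return the changes inside the current window."""
        return self.publisher.change_table[self.low:self.high]

    def purge_window(self):
        """Mark the current window as processed."""
        self.low = self.high


publisher = Publisher()
subscription = Subscription(publisher)

publisher.publish("I", {"id": 1})
publisher.publish("U", {"id": 1})
subscription.extend_window()
batch_one = subscription.view()   # both published changes
subscription.purge_window()

publisher.publish("D", {"id": 1})
subscription.extend_window()
batch_two = subscription.view()   # only the change published since the purge
```

Each subscriber tracks its own window, so several consumers can read the same change table at different speeds without interfering with each other.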
These are some of the key benefits of using Oracle CDC.