Oracle® GoldenGate: Tutorial for Oracle to Oracle
Version 11.2
July 2013
This tutorial provides a quick overview of Oracle to Oracle database replication using
Extract and Replicat for version 11.2 and above. Both Classic and Integrated Extract are
demonstrated in this tutorial. For more detailed information, please consult the Oracle
GoldenGate Administration Guide.
This tutorial may be read to get a general overview of how Extract and Replicat operate.
Alternatively, you can follow along each step of the way.
Prerequisites
If you plan to execute the instructions in the tutorial, make sure all software is already
installed. If implementing the new 11.2 feature, Integrated Extract, you must be using an
11.2.0.3 source database with the 11.2.0.3 Database specific bundle patch for Integrated
Extract 11.2.x (Doc ID 1411356.1).
The following table describes items that are referred to throughout the tutorial. You will
need to identify your installation-specific values and substitute them as you go.
The source Oracle database tables used in these examples can be created and loaded with
sample data using the following commands.
% sqlplus userid/password
SQLPLUS> @/ggs/demo_ora_create
SQLPLUS> @/ggs/demo_ora_insert
SQLPLUS> exit
The target Oracle database tables used in these examples can be created using the
following commands.
% sqlplus userid/password
SQLPLUS> @/ggs/demo_ora_create
SQLPLUS> exit
This section outlines the steps required in each phase of database load and replication.
Extract and Replicat work together to keep the databases in sync in near real time via
incremental transaction replication. Set this up by:
- Starting the Manager program on both the source and target systems.
- Adding supplemental transaction log data for update operations.
- Running the real-time Extract to retrieve the incremental changed data from the
  Oracle tables and store it in trail files on the target Unix system.
Once the target database is created, it can be loaded with data from the Oracle source
database. Load the target database with OGG tools by:
- Running the initial load Extract to retrieve, convert and output data from the Oracle
  tables.
- Running the initial load Replicat to insert the initial data into the target database.
Once Extract and Replicat are running, changes are replicated continuously. At this point,
we will also demonstrate the following functions:
- How to retrieve information on Extract and Replicat status.
- How to gracefully stop replication.
- How to restart replication with transaction integrity.
Commands throughout the tutorial make specific references to directories, file names,
checkpoint group names, begin times, etc. Unless otherwise noted, these items do not
have to correspond exactly in your environment; they are used to illustrate concrete
examples. Where the prompt is written GGSCI (unixserver1) > the command should be
executed on the source system. (unixserver2) indicates the target system.
Prepare the Database for Replication and Start Capturing Changes
Before the initial load is started, supplemental logging needs to be enabled and real-time
extract started. All changes occurring against source tables are automatically detected by
Extract, then formatted and transferred near real-time to temporary files on the Unix file
system. After initial load is completed, the data is read from these files and replicated to
the target database by the Replicat.
By default, Oracle only logs changed columns for update operations. Normally, this
means that primary key columns are not logged during an update operation. However,
Replicat requires the primary key columns in order to apply the update on the target
system. The ADD TRANDATA command in GGSCI is used to cause Oracle to log primary
key columns for all updates to the table. In addition, minimal supplemental logging must be
enabled at the database level.
To enable minimal supplemental logging at the database level, issue the following
command on the source Unix system.
$ sqlplus userid/password
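Inside SQL*Plus, minimal supplemental logging is normally enabled with an ALTER DATABASE statement. The sketch below assumes the session has the privilege to alter the database; the log switch and the verification query are optional additions rather than steps taken from this tutorial.

```
SQLPLUS> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
SQLPLUS> ALTER SYSTEM SWITCH LOGFILE;
SQLPLUS> SELECT supplemental_log_data_min FROM v$database;
SQLPLUS> exit
```

The query should report YES once minimal supplemental logging is in effect.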
To add supplemental log data at the table level, issue the following commands on the
source Unix system.
$ cd /ggs
$ ggsci
The DBLOGIN command establishes a database connection for the specified user. The
user is prompted for a password.
The ADD TRANDATA command causes Oracle to log primary key columns for all update
operations on the specified table.
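The two commands described above can be sketched as follows. Here userid and schema are the tutorial's placeholders for your own login and table owner, and the closing INFO TRANDATA check is an optional addition.

```
GGSCI (unixserver1) > DBLOGIN USERID userid
GGSCI (unixserver1) > ADD TRANDATA schema.TCUSTMER
GGSCI (unixserver1) > ADD TRANDATA schema.TCUSTORD
GGSCI (unixserver1) > INFO TRANDATA schema.TCUSTMER
```

INFO TRANDATA reports whether supplemental logging is enabled for the named table, which makes it a convenient way to confirm the step before starting Extract.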
Prior to executing data extraction, which moves data from Oracle to Oracle in the Unix
environment, perform the following task.
- Add the PORT parameter to the Manager parameter file. This ensures that Server
  Collector processes can be dynamically created on the remote system to receive and
  log data created by Extract.
The Server Collector program receives incoming data over a TCP/IP connection from
Extract on the source Unix system, and then outputs the incremental changes to
temporary storage on the target Unix system. The Server Collector is automatically
started by the Manager process at the request of the Extract program whenever moving
data between systems.
Before starting Manager, you must edit Manager’s startup parameter file (called
/ggs/dirprm/mgr.prm) and add the PORT parameter. You can do this manually with a
Unix editor, or you can use GGSCI to start the vi program for you with these commands:
$ cd /ggs
$ ggsci
GGSCI (unixserver1) > EDIT PARAMS MGR
In either case, add the following text to the mgr.prm file, then save the file and quit.
PORT 7809
If your target Oracle database is on another system, repeat the above steps on the target
system. Start the target Manager process with the START MANAGER command.
GGSCI (unixserver2) > EDIT PARAMS MGR
PORT 7809
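Once the PORT parameter is saved, Manager can be started and checked from GGSCI on each system. A minimal sketch, shown here for the target system:

```
GGSCI (unixserver2) > START MANAGER
GGSCI (unixserver2) > INFO MANAGER
```

INFO MANAGER reports whether Manager is running and on which port, so a mistyped PORT value shows up before Extract tries to connect.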
Most Extract parameters are entered into a parameter file. You may create your
parameter file manually using vi or another editor on Unix. The parameter file has
the same name as the group, in this case EXTORA. Issue the following
command on the source system to launch vi from GGSCI:
GGSCI (unixserver1) > EDIT PARAMS EXTORA
You can also edit the file /ggs/dirprm/extora.prm directly from any other text editor.
Enter the following parameters into EXTORA. Note that the ordering of the parameters is
important.
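--
-- Extract parameter file to capture TCUSTORD
-- and TCUSTMER changes
--
EXTRACT EXTORA
-- userid and password are placeholders for your GoldenGate database login
USERID userid, PASSWORD password
-- unixserver2 is a placeholder for the target host; 7809 is the Manager PORT
RMTHOST unixserver2, MGRPORT 7809
RMTTRAIL /ggs/dirdat/rt
TABLE schema.TCUSTMER;
TABLE schema.TCUSTORD;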
Parameters explained:
EXTRACT EXTORA identifies the particular extract checkpoint group with which this
parameter file is associated.
USERID and PASSWORD must match an existing account in the Oracle database. The
active Oracle database is indicated by the user’s ORACLE_SID environment variable. For
Integrated Extract this user must have been granted Admin Privileges by using the
‘DBMS_GOLDENGATE_AUTH’ package executed by a SYS account.
RMTHOST identifies the system to which to output extracted database changes and must
be specified before RMTTRAIL. MGRPORT specifies the port number on which Manager
has been configured to listen for requests for Server Collectors. RMTHOST can also be
specified as a standard TCP/IP host name.
RMTTRAIL specifies the file set to which database changes will be output. Changes
detected on any TABLE specified below this entry will be output to the remote trail.
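For Integrated Extract, the privilege grant mentioned under USERID is typically made once by a SYS account before Extract is registered. A minimal sketch, assuming a SYSDBA connection on the source database and using USERID as a placeholder for the GoldenGate database user:

```
$ sqlplus / as sysdba
SQLPLUS> EXEC DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE('USERID')
SQLPLUS> exit
```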
In order to enable Integrated Extract mode, the Extract must be registered in the database
with the GGSCI REGISTER EXTRACT command. With the LOGRETENTION option, the same
command can also be used to enable an Extract in Classic mode to work with Oracle
Recovery Manager to retain the archive logs needed for recovery.
To register the Extract to the database for Integrated Extract, issue the following
commands on the source Unix system.
$ cd /ggs
$ ggsci
The DBLOGIN command establishes a database connection for the specified user. The
user is prompted for a password.
The REGISTER command with the DATABASE option enables Integrated Capture mode for
the Extract group. In this mode, Extract integrates with the database logmining server to
receive change data in the form of logical change records (LCR). Extract does not read the
redo logs. Extract performs capture processing, filtering, transformation, and other
requirements.
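The registration step described above can be sketched as follows, with userid as a placeholder for your GoldenGate database login. For a Classic Extract that only needs log retention, the last command would instead be REGISTER EXTRACT EXTORA LOGRETENTION.

```
GGSCI (unixserver1) > DBLOGIN USERID userid
GGSCI (unixserver1) > REGISTER EXTRACT EXTORA DATABASE
```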
Checkpoints enable both Extract and Replicat to process data continuously from one run
to another. Checkpoints enable Extract and Replicat to be restarted while ensuring that
all records are replicated once and only once.
Extract requires two checkpoints: one into the Oracle redo log, which is the source of all
database changes, and one into the remote extract trails. Remote extract trails are a series
of temporary files created on the target Unix system that contain extracted changes.
To set up these checkpoints on your source system, issue the following commands on
Unix.
$ cd /ggs
$ ggsci
Classic Extract
GGSCI (unixserver1) > ADD EXTRACT EXTORA, TRANLOG, BEGIN NOW
Integrated Extract
GGSCI (unixserver1) > ADD EXTRACT EXTORA, INTEGRATED TRANLOG, BEGIN NOW
The ADD EXTRACT command establishes an Extract checkpoint group name. The
TRANLOG clause specifies the transaction log as the data source. The INTEGRATED
TRANLOG clause adds this Extract in integrated capture mode. In this mode, Extract
integrates with the database logmining server, which passes logical change records (LCR)
directly to Extract. Extract does not read the redo log. The BEGIN NOW clause causes
Extract to process database operations occurring at or after the time the Extract group
was added. Alternatively, you can specify a date and time instead of the keyword NOW.
To define the extract remote trail on your source system, issue the ADD RMTTRAIL
command from GGSCI on Unix.
The ADD RMTTRAIL command establishes a checkpoint into a remote extract trail. After
each file in this trail reaches approximately 10 megabytes, Extract creates the next file in
the sequence. Files will be named rt000000, rt000001, rt000002 and so on. These files
are the source of input to the Replicat program. Note that instead of rt you could
specify any two-character prefix.
Choose group names, destination files and sizes appropriate for your environment.
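A minimal sketch of the trail definition, using the trail path and group name from this tutorial and a MEGABYTES value that matches the 10-megabyte file size described above:

```
GGSCI (unixserver1) > ADD RMTTRAIL /ggs/dirdat/rt, EXTRACT EXTORA, MEGABYTES 10
```

The path must match the RMTTRAIL entry in the EXTORA parameter file; otherwise Extract has no checkpoint for the trail it is told to write.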
The Extract program for capturing database changes is initiated from GGSCI on the
source Unix system.
$ /ggs/ggsci
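Within GGSCI, starting the change-capture Extract and confirming that it is up can be sketched as:

```
GGSCI (unixserver1) > START EXTRACT EXTORA
GGSCI (unixserver1) > INFO EXTRACT EXTORA
```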
Parameters that affect all GoldenGate processes are defined in the GLOBALS parameter
file. This file must be named GLOBALS (uppercase, without an extension) and located in
your installation directory. You may create your parameter file manually from the
command shell using vi or another editor on Unix.
Edit this file using GGSCI on the target system:
$ cd /ggs
$ ggsci
GGSCI (unixserver2) > EDIT PARAMS ./GLOBALS
Add the following parameter to establish the name of the checkpoint table.
CHECKPOINTTABLE schema.ggchkpt
Exit the GGSCI session to activate the new GLOBALS parameter on the target system.
Now start a new GGSCI session and create the checkpoint table on the target system.
$ ggsci
The database checkpoint table will be created on the target system. In this example, the
name is ggchkpt.
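Creating the checkpoint table can be sketched as follows, with userid as a placeholder for the target database login. Issued without a table name, ADD CHECKPOINTTABLE uses the name set by CHECKPOINTTABLE in the GLOBALS file.

```
GGSCI (unixserver2) > DBLOGIN USERID userid
GGSCI (unixserver2) > ADD CHECKPOINTTABLE
```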
As with Extract, you must create a Replicat parameter file, either manually with vi
or another editor on Unix, or from GGSCI. In this example, we create the file
/ggs/dirprm/repora.prm for this purpose. Make sure the file is saved in a text format.
Note that the ordering of the parameters is important.
To launch the vi program from within GGSCI on the target Unix system:
$ cd /ggs
$ ggsci
GGSCI (unixserver2) > EDIT PARAMS REPORA
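--
-- REPLICAT parameter file to replicate changes
-- for TCUSTORD and TCUSTMER.
--
REPLICAT REPORA
-- Deleted PURGEOLDEXTRACTS parameter
-- userid and password are placeholders for your GoldenGate database login
USERID userid, PASSWORD password
ASSUMETARGETDEFS
-- Example discard file path; PURGE empties the file at startup
DISCARDFILE /ggs/dirrpt/repora.dsc, PURGE
MAP schema.TCUSTMER, TARGET schema.TCUSTMER;
MAP schema.TCUSTORD, TARGET schema.TCUSTORD;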
Parameters explained:
REPLICAT REPORA associates this parameter file with the REPORA checkpoint established
via GGSCI. This also implicitly establishes the extract trail /ggs/dirdat/rt as the source of
data to replicate.
ASSUMETARGETDEFS allows Replicat to assume that the source Oracle tables are
structured like the target tables. This eliminates the need to retrieve table definitions
from the source system. Source definitions must be generated using DEFGEN if a source
table is structured differently than the target.
DISCARDFILE determines where records from operations that fail during replication are
output. The discard file is useful for debugging problems during the replication process.
This file will contain any rejected rows and the associated causes. Any existing contents
are purged at startup when PURGE is specified (APPEND could be specified instead).
USERID and PASSWORD must match an existing account in the Oracle database. The
active Oracle database is indicated by the user’s ORACLE_SID environment variable.
Each MAP entry establishes a relationship between a source Oracle table and a target
Oracle table.
Note that no END parameter is specified. The omission of END means that Replicat will
continue to run until explicitly stopped by the user (or a fatal error occurs). END is
specified when Replicat is run in batch mode versus online mode.
In addition, any replication error will cause Replicat to abort (for example, a duplicate
record condition). See the documentation on the following Replicat parameters to
customize error response: HANDLECOLLISIONS, OVERRIDEDUPS, INSERTMISSINGUPDATES
and REPERROR. Note that restart issues are discussed later in this tutorial.
The Replicat checkpoint establishes an initial position into the extract trail created by
Extract. By default, this will always be the first record extracted. The checkpoint is
updated after each transaction, ensuring that all data is processed from run to run.
To set up the Replicat checkpoint, issue the following commands on the target Unix
system.
The ADD REPLICAT command establishes the extract trail (EXTTRAIL) created by Extract as
the source of information to replicate. REPORA is the name given to this checkpoint
group.
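A minimal sketch of the Replicat checkpoint setup, using the trail and group names from this tutorial. It assumes the checkpoint table named in GLOBALS already exists on the target.

```
GGSCI (unixserver2) > ADD REPLICAT REPORA, EXTTRAIL /ggs/dirdat/rt
```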
This section demonstrates the basic features of GoldenGate’s Oracle to Oracle initial load
facilities.
Initial Data Extract, Conversion and Load
At this point, data can be extracted directly from the Oracle tables, converted, and then
loaded into Oracle tables. Initial load from Oracle to Oracle involves the following tasks.
- Extract is configured as a batch task to retrieve data directly from the source tables
  and send the data directly to a Replicat batch task.
- Replicat is configured as a batch task to populate target tables.
Extract parameters are entered into an Extract parameter file you create via any text
editor on the source system. The following file (named /ggs/dirprm/initext.prm) is an
example of a file created for this purpose. Note that the ordering of the parameters is
critical.
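--
-- Extract parameter file to capture TCUSTORD
-- and TCUSTMER initial data for Replicat
-- (userid, password and unixserver2 are placeholders)
EXTRACT INITEXT
USERID userid, PASSWORD password
RMTHOST unixserver2, MGRPORT 7809
RMTTASK REPLICAT, GROUP INITREP
TABLE schema.TCUSTMER;
TABLE schema.TCUSTORD;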
Parameters explained:
EXTRACT is the group name used for this batch task, which retrieves data directly from
the tables rather than the redo log and sends the data to the RMTTASK Replicat group.
USERID and PASSWORD must match an existing account in the Oracle database. The active
Oracle database is indicated by the user’s ORACLE_SID environment variable.
The RMTHOST parameter specifies the TCP/IP address of the target Unix system to which
the data is moved. MGRPORT specifies the well known port number on which Manager
has been configured to listen for requests for Extract Server Collectors. RMTHOST can
also be specified as a standard TCP/IP host name.
The RMTTASK entry determines where the extracted data is output on the target, in this
case, a Replicat group named INITREP.
Each TABLE entry specifies a table from which to extract data, and a TARGET structure for
the data. If the source and target column names are the same, no other parameters are
required. If your columns have different names, refer to the COLMAP statement on how to
explicitly map each column. Make sure you change the schema to match the owner of
the table.
Note also that a semi-colon (;) is required at the end of the TABLE entry.
As with Extract, you must create a Replicat parameter file. Launch GGSCI on the target
system, issue the EDIT PARAMS INITREP command, and enter the following parameters.
Note that the ordering of the parameters is important.
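--
-- REPLICAT parameter file to replicate initial changes
-- for TCUSTMER.
--
REPLICAT INITREP
-- userid and password are placeholders for your GoldenGate database login
USERID userid, PASSWORD password
ASSUMETARGETDEFS
DISCARDFILE /ggs/dirrpt/tcustmer.dsc, APPEND
-- Wildcard map: each table in the source schema maps to the same-named target table
MAP schema.*, TARGET schema.*;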
Parameters explained:
REPLICAT names the batch task group, which must match the group name specified in the
source Extract parameter file.
ASSUMETARGETDEFS tells Replicat to assume that the source Oracle tables are structured
like the target tables. This eliminates the need to retrieve table definitions from the
source system. Source definitions must be generated using DEFGEN if a source table is
structured differently than the target.
DISCARDFILE determines where operations that fail during replication are output. The
discard file is useful for debugging problems during the replication process. It will
contain any rejected rows and the associated causes. When PURGE is specified, existing
contents are purged at startup; with APPEND, as in this example, new records are added
to the existing file.
USERID and PASSWORD must match an existing account in the Oracle database. The
active Oracle database is indicated by the user’s ORACLE_SID environment variable.
Each MAP entry establishes a relationship between a source Oracle table and a target
Oracle table. In this example, we used a wildcard map.
In addition, any replication error will cause Replicat to abort (for example, a duplicate
record condition). See the documentation on the following Replicat parameters to
customize error response: HANDLECOLLISIONS, OVERRIDEDUPS, INSERTMISSINGUPDATES
and REPERROR. Note that restart issues are discussed later in this tutorial.
Extract must be defined as a batch task within GGSCI before it may be started. Use GGSCI
to configure the Extract Task with the following command on the source system.
Replicat must be defined as a batch task within GGSCI before it may be started. Use GGSCI
to configure the Replicat Task with the following command on the target system.
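The two batch task definitions described above can be sketched as follows. SOURCEISTABLE marks the Extract as reading directly from the tables, and SPECIALRUN marks the Replicat as a one-time task; the group names are the ones used in this tutorial's parameter files.

```
GGSCI (unixserver1) > ADD EXTRACT INITEXT, SOURCEISTABLE
GGSCI (unixserver2) > ADD REPLICAT INITREP, SPECIALRUN
```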
Running Extract
The Extract program is initiated either in online mode with GGSCI or in batch mode from
the Unix shell on the source system. When Extract is run for the initial load, run it as
the batch task defined above.
Note that the output will be sent to the Extract’s report file. To view the file, use the
following GGSCI command.
When Extract completes, you will see statistics indicating how many records were output
into the target tables.
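Starting the initial load and then reading its report can be sketched as follows. With an RMTTASK configuration, the Replicat task on the target is normally started by Manager at Extract's request, so only the Extract is started by hand.

```
GGSCI (unixserver1) > START EXTRACT INITEXT
GGSCI (unixserver1) > VIEW REPORT INITEXT
```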
Apply Change Data that was captured during Initial Loading of Data
At this point, you will need to apply the changes that were occurring on the source
system while the initial load was being executed.
This tutorial utilized the OGG initial load facilities. When the source database is active
and you are using OGG tools for doing the initial load, the parameter
HANDLECOLLISIONS must be added to the replicat parameter file. This parameter will
handle duplicate and/or missing record conditions while the replicat is becoming
current with the source. For the documented best practice of instantiating from an Oracle
source, please refer to Oracle GoldenGate Best Practices: Instantiation from an Oracle
Source Database (Doc ID 1276058.1)
Running Replicat
$ cd /ggs
$ ggsci
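Within GGSCI on the target, starting the change-apply Replicat and checking its status can be sketched as:

```
GGSCI (unixserver2) > START REPLICAT REPORA
GGSCI (unixserver2) > INFO REPLICAT REPORA
```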
If HANDLECOLLISIONS is enabled in the parameter file, turn it off once the REPORA
Replicat is current. Comment out the parameter in the REPORA parameter file so that it
stays off after a restart, and turn it off in the running process by issuing the
following from the target Unix command prompt.
$ ggsci
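Turning collision handling off in the running process can be sketched with a single SEND command, assuming the REPORA group from this tutorial:

```
GGSCI (unixserver2) > SEND REPLICAT REPORA, NOHANDLECOLLISIONS
```

The SEND command changes only the running process. The parameter file still needs HANDLECOLLISIONS commented out, or the setting returns at the next start.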
At this point, you can add some demo transactions to the source Oracle database with the
following Unix commands on the source system.
$ sqlplus userid/password
SQLPLUS> @/ggs/demo_ora_misc
Both Extract and Replicat echo parameters, as well as diagnostic messages, to their
respective report files.
In order to view the report file while Extract is running, issue the following command
from GGSCI on Unix (EXTORA is the name of the EXTRACT group).
The Replicat report can be viewed on the target system in a similar manner.
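The report commands for the two groups in this tutorial can be sketched as:

```
GGSCI (unixserver1) > VIEW REPORT EXTORA
GGSCI (unixserver2) > VIEW REPORT REPORA
```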
Any errors in the parameter file are output to the report and out files. Once the message
“Run-Time Warnings” appears in the report, all parameters have been validated and
data processing has begun.
Discard files also serve as a source for debugging replication problems. Frequently,
looking in the discard file at specific record values and error numbers is the fastest path
to problem resolution.
To obtain the status and history of an Extract process, issue the following commands.
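A minimal sketch of the status commands for the EXTORA group. The DETAIL option adds the trail and checkpoint history to the basic status display.

```
GGSCI (unixserver1) > INFO EXTRACT EXTORA
GGSCI (unixserver1) > INFO EXTRACT EXTORA, DETAIL
```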
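The output of the INFO command will include a line for the remote trail like this,
giving the trail name, sequence number, RBA, and maximum file size in megabytes:
/ggs/dirdat/rt 0 14960 10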
The statistics show the history of Extract runs for the EXTORA extract checkpoint group
and the current file in the remote extract trail. Also indicated is the current status of the
process, in this case RUNNING.
Similarly, to obtain the status and history of a Replicat process, issue the following
commands.
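The corresponding commands for the REPORA group on the target can be sketched as:

```
GGSCI (unixserver2) > INFO REPLICAT REPORA
GGSCI (unixserver2) > INFO REPLICAT REPORA, DETAIL
```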
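The output will include a line like this, showing the trail file that Replicat reads
and its starting position:
/ggs/dirdat/rt000000 * Initialized * First Record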
Both Extract and Replicat can be stopped via the GGSCI utility. To stop Extract, run
GGSCI from the source system on which Extract is running and issue the following
command.
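Stopping the two groups used in this tutorial can be sketched as follows, each command issued on the system where the process runs:

```
GGSCI (unixserver1) > STOP EXTRACT EXTORA
GGSCI (unixserver2) > STOP REPLICAT REPORA
```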
Extract outputs how many inserts, updates and deletes it captured after normal stoppage
into the REPORT file. View the REPORT file with the VIEW REPORT command described
earlier.
Replicat outputs how many inserts, updates and deletes were applied to the target
database in the standard out file and the report file after the process is stopped normally.
Generating interim reports is also possible without stopping Extract and Replicat. See
the Extract and Replicat REPORT parameter for more details, and the GGSCI
SEND…REPORT commands.
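An interim report can be requested from a running process with the SEND command. A minimal sketch for the two groups in this tutorial:

```
GGSCI (unixserver1) > SEND EXTRACT EXTORA, REPORT
GGSCI (unixserver2) > SEND REPLICAT REPORA, REPORT
```

The statistics are written to the process report file, which can then be read with VIEW REPORT.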
Both Extract and Replicat are restarted in the same manner as they were originally
started, whether or not the previous run ended gracefully or abnormally.
The extraction and replication processes automatically start where they left off, while
ensuring no transactions are missed or duplicated. This integrity is guaranteed by the
checkpoints maintained by each process.
Hopefully, this tutorial has provided a quick overview of what you need to do in order to
replicate data from Oracle to Oracle. Undoubtedly, you will eventually fine-tune this
process in your own environment.
Refer to the Oracle GoldenGate Reference Guide and the Oracle GoldenGate
Administration Guide for additional information.