ODI Performance Tips

--ELT Architecture

--Oracle Data Integrator can be configured with Parallel Agents for load balancing, but not necessarily for speeding up transformations.
--Oracle Data Integrator uses the benefits of set-based transformation processing in any technology to achieve unparalleled transformation speed.
The execution server and staging area can be changed according to the requirements. Extractions from different sources can run in parallel, and loading into the target can be multithreaded.
---Oracle Data Integrator consists of one or more Java agents, which can run on Source, Target, or dedicated hardware.
---Since each Agent uses a Repository for metadata storage, it is typically advisable to locate the Repository in a common physical location.
---For best overall performance, the general guidance is to perform staging, joins and transformations on the server which has the largest number of CPUs available.
----Data is merged within the staging area from the loading tables into an integration table. Transformations running in the staging area occur at the time when source data is read from the temporary loading tables. The integration table is similar in structure to the target.
----Data integrity is checked on this integration table, and specific strategies (update, slowly changing dimensions) involving both the target and the integration table are applied at that time.
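The loading-table, integration-table and target sequence described above can be sketched with an in-memory SQLite database standing in for the staging area. This is only an illustration under stated assumptions: the table names (load_customer, int_customer, err_customer, tgt_customer), the trim/uppercase transformation and the "id and country must not be null" integrity rule are all hypothetical, and the SQL that real ODI knowledge modules generate will differ.

```python
import sqlite3


def run_elt_flow(source_rows):
    """Sketch of an E-LT flow: load -> transform -> check -> integrate."""
    db = sqlite3.connect(":memory:")  # stand-in for the staging/target server
    db.executescript("""
        CREATE TABLE load_customer (id INTEGER, name TEXT, country TEXT);
        CREATE TABLE int_customer  (id INTEGER, name TEXT, country TEXT);
        CREATE TABLE err_customer  (id INTEGER, name TEXT, country TEXT);
        CREATE TABLE tgt_customer  (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
        INSERT INTO tgt_customer VALUES (1, 'OLD NAME', 'FR');
    """)

    # 1. Loading: source data lands unchanged in the temporary loading table.
    db.executemany("INSERT INTO load_customer VALUES (?, ?, ?)", source_rows)

    # 2. Transformation happens as set-based SQL while reading the loading table.
    db.execute("""
        INSERT INTO int_customer (id, name, country)
        SELECT id, UPPER(TRIM(name)), country FROM load_customer
    """)

    # 3. Integrity check on the integration table (hypothetical rule):
    #    rows with a missing id or country are moved to an error table.
    bad = "id IS NULL OR country IS NULL"
    db.execute(f"INSERT INTO err_customer SELECT * FROM int_customer WHERE {bad}")
    db.execute(f"DELETE FROM int_customer WHERE {bad}")

    # 4. Integration strategy (simple incremental update):
    #    update rows that already exist in the target, then insert new ones.
    db.execute("""
        UPDATE tgt_customer
        SET name    = (SELECT i.name    FROM int_customer i WHERE i.id = tgt_customer.id),
            country = (SELECT i.country FROM int_customer i WHERE i.id = tgt_customer.id)
        WHERE id IN (SELECT id FROM int_customer)
    """)
    db.execute("""
        INSERT INTO tgt_customer (id, name, country)
        SELECT id, name, country FROM int_customer
        WHERE id NOT IN (SELECT id FROM tgt_customer)
    """)

    target = db.execute(
        "SELECT id, name, country FROM tgt_customer ORDER BY id"
    ).fetchall()
    errors = db.execute("SELECT COUNT(*) FROM err_customer").fetchone()[0]
    db.close()
    return target, errors
```

Every step after the initial load is a single SQL statement executed by the database engine, which is the set-based behaviour these notes attribute to the E-LT approach.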
----One of the most time-consuming operations in Data Integration is moving data between servers/machines/applications (i.e., network latency). The second most time-consuming operation is joining data sets (i.e., computational overhead).
---When filtering source data, it is recommended to execute the filters on the source servers to reduce the data flow from the source to the staging area.
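The effect of filtering on the source can be illustrated by counting how many rows cross the "network" in each case. In this sketch an in-memory SQLite database plays the role of the source server; the orders table, its status column and the function names are hypothetical and not part of ODI.

```python
import sqlite3


def make_source():
    """Build a hypothetical source with 1000 orders, 100 of them OPEN."""
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
    src.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(i, "OPEN" if i % 10 == 0 else "CLOSED") for i in range(1000)],
    )
    return src


def extract_then_filter(src):
    """Filter in the staging area: every source row is transferred first."""
    rows = src.execute("SELECT id, status FROM orders").fetchall()
    kept = [r for r in rows if r[1] == "OPEN"]
    return len(rows), len(kept)  # (rows transferred, rows kept)


def filter_then_extract(src):
    """Filter on the source: only matching rows are transferred."""
    rows = src.execute(
        "SELECT id, status FROM orders WHERE status = ?", ("OPEN",)
    ).fetchall()
    return len(rows), len(rows)  # (rows transferred, rows kept)
```

Both functions keep the same 100 rows, but the first moves all 1000 rows out of the source while the second moves only the 100 that are needed.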
---When joining source tables, if the expected source set resulting from the join is smaller than the sum of the two sources, the join should be performed on the source. If the expected source set after the join is larger than the sum of the two sources, the join should be performed on the staging area, for example in the case of a cross join.
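The join-placement rule above amounts to a simple comparison of estimated sizes. The helper below is a hypothetical sketch, not an ODI API: it takes estimated row counts as inputs, and it assumes that the case where the join result equals the sum of the two sources (which the rule does not cover) falls back to the staging area.

```python
def choose_join_location(rows_a: int, rows_b: int, expected_join_rows: int) -> str:
    """Return where to run a join, based on estimated row counts.

    "source"  -> the join shrinks the data, so join before moving it.
    "staging" -> the join expands the data, so move the smaller inputs first.
    """
    if expected_join_rows < rows_a + rows_b:
        return "source"
    # Assumption: ties are treated like the "larger" case.
    return "staging"
```

For example, joining 1000 orders to 50 customers on a key yields at most 1000 rows, fewer than the 1050 input rows, so the join belongs on the source. A cross join of the same tables yields 50000 rows, so it belongs on the staging area.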
---Using loaders: typically, data is extracted by the source system's native loader to a file; this file is then copied to the target machine and loaded using the target's native loader.
---Server-to-server communication through technologies similar to Oracle's database links or SQL Server's linked servers.
-----File-to-table loading through technology-specific methods such as Oracle External Tables.
---It is recommended to avoid unnecessary checks when working with large volumes, and to avoid using unnecessarily complex integration knowledge modules.
----If the agent needs to have data flowing through it, it should not be installed on a machine with no spare resources. In Oracle Data Integrator 11g, you can adjust the ODI_INIT_HEAP and ODI_MAX_HEAP parameters in the odiparams configuration file to define the agent's JVM initial and maximum heap size. In Oracle Data Integrator 12c, the ODI_INIT_HEAP and ODI_MAX_HEAP parameters are located in the setODIDomainEnv script file within an ODI domain bin directory.
----Note that the parent agent still takes charge of the creation of the session in the repository. Load balancing separates the creation and execution of a session. If you want the parent agent to take charge of some of the sessions and not just distribute the sessions, then it must be linked to itself.
----The "Use pre-emptive load balancing" Agent property allows for redistribution of queued sessions to agents as sessions finish. Otherwise, sessions are not redistributed.
----The Oracle Data Integrator components deployed in WebLogic Server also benefit from the capabilities of the WLS cluster for scalability, including JDBC Connection Pooling and Load Balancing.
----The core E-LT differentiation of Oracle Data Integrator, plus advanced tuning features, Knowledge Module optimizations, and Parallel Agent sessions, are all ways in which ODI provides superior performance compared to traditional engine-based ETL tools.
----Starting from ODI 11g it is also possible to install the ODI agents in WebLogic Server to achieve High-Availability (HA) through a clustered deployment.
