Sunday, April 2, 2017

External Table

Oracle External Table

External tables are defined as tables that do not resides in the database allows you to access data that is stored outside the database in files. Oracle provides a couple of access driver “ORACLE_LOADER” and “ORACLE_DATAPUMP” to read this external data. You can select, join, or sort the external table but you can't  perform DML operation ( INSERT, UPDATE and DELETE). External tables also support view and synonyms creation but you can not create indexes on external tables.

External table feature provides a valuable mean for performing ETL task in data warehousing. External tables support 3 DDL operations.
  • Create external tables
  • Alter external tables
  • Drop externa tables


Creating External table and Load data

Sample data (dept.txt)
deptno,dname,location
10,Accounting,New York
20,Research,Dallas
30,Sales,Chicago
40,Operations,Boston

CONN / AS SYSDBA
CREATE OR REPLACE DIRECTORY ext_data_dir as “c:/users/xxx/oracle/data”;

GRANT READ, WRITE ON DIRECTORY ext_data_dir TO SCOTT;

CONN SCOTT/TIGER

CREATE TABLE DEPT_EXTERNAL (
     deptno    NUMBER(3),
     dname     VARCHAR2(40),
     location  VARCHAR2(60)
) ORGANIZATION EXTERNAL 
     TYPE   ORACLE_LOADER
     DEFAULT DIRECTORY ext_data_dir
     ACCESS PARAMETERS
     ( 
       RECORDS DELIMITED BY NEWLINE
       FIELDS TERMINATED BY ‘,’
       MISSING FIELD VALUES ARE NULL
      (
         deptno, dname, location)
       )
     )
     LOCATION(‘dept.txt’) /* can add more than one file */
)
PARALLEL 5
REJECT LIMIT UNLIMITED;

if you omit the TYPE specification, ORACLE_LOADER is used by default as access driver. you can use ORALCE_DATAPUMP  access driver

External tables can be altered using the heap table alter commands for PARALLEL, ADD COLUMN, MODIFY COLUMN, SET UNUSED, DROP COLUMN and RENAME COLUMN. Apart from this you can alter the reject limit, default directory, access parameters and location

ALTER TABLE DEPT_EXTERNAL REJECT LIMIT 50;
ALTER TABLE DEPT_EXTERNAL DEFAULT DIRECTORY <new_directory>;
ALTER TABLE DEPT_EXTERNAL ACCESS PARAMETERS( FIELDS TERMINATED BY ‘:’);
ALTER TABLE DEPT_EXTERNAL LOCATION(‘dept2.txt’,’dept3.txt’,…);



Drop External table is same as normal heap table drop statement.

Tuesday, December 31, 2013

Introduction to Golden Gate

In 2009, Oracle acquired Golden Gate to strengthen its data replication technology as part of future strategies within data warehouse and real time transaction database. Before this acquisition, Oracle  had its own replication technology which mainly constitutes basic replication, advance replication and stream replication.

Oracle's basic replication process had two different methods: log based and trigger based. In log based replication, snapshot schema and DB links had to set up between source and target database. Data was transported from online redo log to target database over network.

In advance replication Oracle included multiple master replications from multiple environments and trigger based replication method.

In later versions of Oracle 9i, stream replication was introduced with improvements in its earlier replication technologies. It uses log based replication where committed transaction’s online redo log of source system is mined to target system over network. Also a new background process to manage communication and operations of replication activities has been introduced

However these replication technologies couldn't stand long as it involved complicacy and complex software development to harness the power of Oracle to Non Oracle database environment to allow the transaction to be moved between environment.

In 1990, a small software company Golden Gate came with different approach on data replication between cross platform. Instead of using different formats, Golden Gate implemented a uniform format and used a command prompt GGSCI (Golden Gate Software Command Interface) to perform data replication operation. Oracle Goldengate 11g supports both DDL and DML operation. Legacy Oracle 8i version or prior requires an upgrade to 9i or later to implement the Goldengate.

Goldengate supports all popular databases such as Oracle, MySQL, DB2, Sybase, Teradata, SQL server etc.

Goldengate contains three installable components which can be downloaded separately from Oracle's website and can be installed on both, source and target system depending on requirement. Below is the list of installable components for Goldengate:
  •  Oracle Goldengate which includes GGSCI, DEFGEN, LOGDUMP and REVERSE.
  • Oracle Goldengate Director which includes Director Service application, Monitor Agent, Director Client and Administrator client.
  • Oracle Goldengate Veridata which includes Agent, CLI, Web interface.

 Architecture

Manager Process
  • Start and manage Goldengate processes such as collector.
  • Manage port numbers.
  • Trail Management.
  • Create  error, event and reports.
  • Process command GGSCI.

Extract Process – Extract DML and DDL data from source database. 

Data Pump Process: Option process but recommended. Read records from source trails, process them and passes over to target system.

Collector Process: This background process runs on Target system.  It writes records on remote trail.

Replicate Process: Reads Remote trail, process them and applies changes to target Database.


Trail: It is a Goldengate intermittent file written in proprietary format by either extract process (source system) or collector background process (target system).

Wednesday, December 4, 2013

Introduction to Oracle Cluster Table


Table clustering is an optional technique of storing data. In table clustering, rows from one or more tables those are associated with each other on common data entity are physically stored adjacent to each other on same data block. This common entity is called ‘Cluster key’. Identifying the cluster key is the first step to implement the oracle table clustering.

Once good candidate for cluster key is identified, two more challenges come on the way to make it success. First challenge is avoid disk block chaining and second challenge is large variance or standard deviation.

The Disk Block chaining is very mush possible in cluster table. If combined size of all rows of tables in single row cluster exceeds the size of disk block that would lead to immediate disk block chaining.  Chaining is linking together two separate disk block.  Disk block chaining reduces the advantage of Oracle Cluster table.
Second Challenge come through variance in Cluster row size which causes either significant disk block chaining or wasted disk space.

When creating oracle cluster, an optional size parameter should be used to set the max number of cluster key with their corresponding rows allowed within disk block. Oracle divides available disk block space by this optional parameter to calculate the maximum no of cluster keys. In data warehouse system where data is almost static and does not grow, this optional parameter would not necessary. Whenever update or Insert statement executes this causes Cluster row gets larger. It is quite possible that Cluster row will significantly increase in size. If accumulated increase in size of all cluster rows consumes all the available remaining space within a disk block, Disk chaining will happen. This is undesirable and require expensive disk seek operation to access it.
On other hand large variance in the size of cluster rows can cause either disk chaining or either wastage of disk space. By looking both these challenges in Oracle cluster table optional size parameter play a vital role. If average cluster row size is small, there will be more Cluster Row with in disk block. On the other hand average cluster row size is big which will reduce the MAXIMUM limit.

In select statement execution sorting and join filter are almost mandatory operation performed before the result gets displayed.  By introducing Oracle Cluster Table both these operations skip from query execution. The fields of the cluster key are more often indexed, so high maintenance indexes for these fields are reduced.
Example of crusting
EMP and DEPT table are associated to each other via DEPTNO column and it is primary key

Query execution for heap table is explained below. In later section Oracle Cluster table is explained in detail.
SELECT E.EMPNO,
         E.ENAME,
         E.SAL,
         E.COMM,
         E.HIREDATE,
         D.DEPNTO,
         D.DNAME,
         D.LOC
    FROM EMP E, DEPT D
   WHERE E.DEPTNO = D.DEPTNO
ORDER BY D.DEPTNO;

The IO operation of the above query will involve following steps.
  1. Traversing DEPTNO index to get the next dept no.
  2. Fetch rows from DEPT table that correspond to DEPTNO using ROWID identified from Index  scan. 
  3. Find each row from EMP table correspond to DEPTNO unless there is another index available on DEPTNO.
  4. Fetch Row from EMP table.
 
Once the cluster table created, the IO operation of the above query will involve below steps.
  1. Traversing through DEPTNO ( cluster key) index. 
  2. Fetch cluster row, consisting row from DEPT table and all rows from EMP table correspond to DEPTNO.

Create cluster, Table and Index

CREATE CLUSTER CLUS_EMP_DEPT (DEPTNO NUMBER (3)) -- Cluster Key
    SIZE 600                      -- Average Cluster Row size in bytes
    TABLESPACE MY_TABLESPACE      --if table space omitted then it
    STORAGE (INITIAL 500K         -- will create in default tablespace
             NEXT    500K        -- Cluster index is created by
             MINEXTENTS 2        -- default
             PCTINCREASE 20);

CREATE TABLE EMP
(
   EMPNO    NUMBER (5) PRIMARY KEY,
   ENAME    VARCHAR2 (15) NOT NULL,
   DEPTNO   NUMBER (3)    REFERENCES DEPT
)
CLUSTER CLUS_EMP_DEPT ( DEPTNO );
 
CREATE TABLE DEPT
(
   DEPTNO   NUMBER (3) PRIMARY KEY,
   DNAME    VARCHAR2 (20),
   LOC      VARCHAR2 (30)
)
CLUSTER CLUS_EMP_DEPT ( DEPTNO );
 
CREATE INDEX CLUS_EMP_DEPT_INDEX                          -- explicitly create
   ON CLUSTER CLUS_EMP_DEPT                               -- index if required
   TABLESPACE MY_INDEX_TABLESPACE
   STORAGE (INITIAL 80 K NEXT 50 K MINEXTENTS 2 MAXEXTENTS 12 PCTINCREASE 20);
 
 

Wednesday, October 23, 2013

Autonomous Transactions


When an autonomous transactions is called it leaves context of calling program, perform an independent task without affecting calling program state. It is necessary to have COMMIT or ROLLBACK statement as last statement within autonomous transaction program unit executable section to avoid the run time oracle error 'ORA-06519: active autonomous transaction detected'.
It can be used to monitor the progress of a program unit and record the event/error message generated during program execution.

Following type of PL/SQL block can use PRAGMA autonomous transactions
      ·         Stored procedure and function
      ·         Procedure or function defined in declaration section of anonymous block
      ·         Packaged procedure and Function
      ·         Type Methods

An example to understand autonomous transactions

CREATE OR REPLACE PROCEDURE RecordError (errmsg IN VARCHAR2)
AS
   PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO errorlog (id, errmsg, lstime)
        VALUES (seq_errorlog.NEXTVAL, errmsg, SYSDATE);
   COMMIT;
END RecordError;
/
BEGIN
   FOR X IN 1 .. 100
   LOOP
      RecordError  ('Autonomous transactions message - ' || X);
      INSERT INTO errorlog (id, errmsg, lstime)
           VALUES (
                     seq_errorlog.NEXTVAL,
                     'Calling Program Message - ' || X,
                     SYSDATE);
   END LOOP;
   ROLLBACK;
END;
/
 
Execution of anonymous block will create the records in ERRORLOG table generated through RECORDERROR procedure only.
 

Thursday, July 5, 2012

Print Message using user defined procedure

Print_line user defined function will help you to print the line which is more than 255 bytes.

create procedure print_line (
                 string_in IN VARCHAR2,
                 split_in  IN PLS_INTEGER DEFAULT 255
                 ) as
begin
   for i in 1 .. ceil(length(string_in)/split_in) loop
      dbms_output.put_line(substr(string_in,split_in*(i-1)+1,split_in));
   end loop;
end;
/

create or replace public synonym put_line for put_line;
grant execute on put_line to public;

Thursday, April 26, 2012

Continue Statement

Finally continue statement added to oracle 11g PL/SQL language. It signals an immediate end to a loop iteration and return to first statement of in the loop.
e.g. Addition of odd numbers.
DECLARE
    total BINARY_INTEGER := 0;
BEGIN
    FOR i in 1..10 LOOP
        IF MOD(i/2) = 0 THEN
          CONTINUE;
        END IF;
        SUM = SUM + i;
    END LOOP,
    DBMS_OUTPUT.PUT_LINE('Sum of odd numbers is '|| sum );
END;
/

Saturday, April 21, 2012

PL/ SQL Optimization

 

PL/ SQL Optimizer

Oracle 10g introduced  a new PL/SQL optimization feature to optimize the PL/SQL code by rearranging codes for better performance. PLSQL optimizer do this for you. Optimizer is enabled by default for interpreted p-code and native compilation. The default value for the optimizer is set to 2 but it can be unset or modify compiler aggressiveness.

PLSQL_OPTIMIZE_LEVEL = 0     no optimization
PLSQL_OPTIMIZE_LEVEL = 1     moderate optimize may leave unused code and exception
PLSQL_OPTIMIZE_LEVEL = 2     Aggressive optimize may rearrange source code flow

ALTER SESSION SET PLSQL_OPTIMIZER_LEVEL = 1;

ALTER PROCEDURE procedure_01 COMPILE plsql_optimizer_level =1;

ALTER PROCEDURE procedure_02 COMPILE reuse settings;

Subprogram Inlining

Subprogram inlining replaces a subprogram invocation from the copy of invoked subprogram. To allow subprogram inlining invoking set PLSQL_OPTIMIZE_LEVEL  to 2 (default value) or set it to 3. Subprogram inlining can also be done by using Pragma INLINE. PRAGMA INLINE can enable or disable subprogram inlining.

PRAGMA INLINE (subprogram, YES);              # enable subprogram inlining

PRAGMA INLINE (subprogram, NO);               # disable subprogram inlining

Candidates for Tuning

  • Older code that does not take advantage of new PL/SQL language features.
  • Older dynamic SQL statements written with the DBMS_SQL package.
  • Code that spends much time processing SQL statements.

  • Functions invoked in queries, which might run millions of times.

  • Code that spends much time looping through query results.

  • Code that does many numeric computations.

External Table

Oracle External Table External tables are defined as tables that do not resides in the database allows you to access data that is stor...