Wednesday, December 4, 2013

Introduction to Oracle Cluster Table


Table clustering is an optional technique of storing data. In table clustering, rows from one or more tables those are associated with each other on common data entity are physically stored adjacent to each other on same data block. This common entity is called ‘Cluster key’. Identifying the cluster key is the first step to implement the oracle table clustering.

Once good candidate for cluster key is identified, two more challenges come on the way to make it success. First challenge is avoid disk block chaining and second challenge is large variance or standard deviation.

The Disk Block chaining is very mush possible in cluster table. If combined size of all rows of tables in single row cluster exceeds the size of disk block that would lead to immediate disk block chaining.  Chaining is linking together two separate disk block.  Disk block chaining reduces the advantage of Oracle Cluster table.
Second Challenge come through variance in Cluster row size which causes either significant disk block chaining or wasted disk space.

When creating oracle cluster, an optional size parameter should be used to set the max number of cluster key with their corresponding rows allowed within disk block. Oracle divides available disk block space by this optional parameter to calculate the maximum no of cluster keys. In data warehouse system where data is almost static and does not grow, this optional parameter would not necessary. Whenever update or Insert statement executes this causes Cluster row gets larger. It is quite possible that Cluster row will significantly increase in size. If accumulated increase in size of all cluster rows consumes all the available remaining space within a disk block, Disk chaining will happen. This is undesirable and require expensive disk seek operation to access it.
On other hand large variance in the size of cluster rows can cause either disk chaining or either wastage of disk space. By looking both these challenges in Oracle cluster table optional size parameter play a vital role. If average cluster row size is small, there will be more Cluster Row with in disk block. On the other hand average cluster row size is big which will reduce the MAXIMUM limit.

In select statement execution sorting and join filter are almost mandatory operation performed before the result gets displayed.  By introducing Oracle Cluster Table both these operations skip from query execution. The fields of the cluster key are more often indexed, so high maintenance indexes for these fields are reduced.
Example of crusting
EMP and DEPT table are associated to each other via DEPTNO column and it is primary key

Query execution for heap table is explained below. In later section Oracle Cluster table is explained in detail.
SELECT E.EMPNO,
         E.ENAME,
         E.SAL,
         E.COMM,
         E.HIREDATE,
         D.DEPNTO,
         D.DNAME,
         D.LOC
    FROM EMP E, DEPT D
   WHERE E.DEPTNO = D.DEPTNO
ORDER BY D.DEPTNO;

The IO operation of the above query will involve following steps.
  1. Traversing DEPTNO index to get the next dept no.
  2. Fetch rows from DEPT table that correspond to DEPTNO using ROWID identified from Index  scan. 
  3. Find each row from EMP table correspond to DEPTNO unless there is another index available on DEPTNO.
  4. Fetch Row from EMP table.
 
Once the cluster table created, the IO operation of the above query will involve below steps.
  1. Traversing through DEPTNO ( cluster key) index. 
  2. Fetch cluster row, consisting row from DEPT table and all rows from EMP table correspond to DEPTNO.

Create cluster, Table and Index

CREATE CLUSTER CLUS_EMP_DEPT (DEPTNO NUMBER (3)) -- Cluster Key
    SIZE 600                      -- Average Cluster Row size in bytes
    TABLESPACE MY_TABLESPACE      --if table space omitted then it
    STORAGE (INITIAL 500K         -- will create in default tablespace
             NEXT    500K        -- Cluster index is created by
             MINEXTENTS 2        -- default
             PCTINCREASE 20);

CREATE TABLE EMP
(
   EMPNO    NUMBER (5) PRIMARY KEY,
   ENAME    VARCHAR2 (15) NOT NULL,
   DEPTNO   NUMBER (3)    REFERENCES DEPT
)
CLUSTER CLUS_EMP_DEPT ( DEPTNO );
 
CREATE TABLE DEPT
(
   DEPTNO   NUMBER (3) PRIMARY KEY,
   DNAME    VARCHAR2 (20),
   LOC      VARCHAR2 (30)
)
CLUSTER CLUS_EMP_DEPT ( DEPTNO );
 
CREATE INDEX CLUS_EMP_DEPT_INDEX                          -- explicitly create
   ON CLUSTER CLUS_EMP_DEPT                               -- index if required
   TABLESPACE MY_INDEX_TABLESPACE
   STORAGE (INITIAL 80 K NEXT 50 K MINEXTENTS 2 MAXEXTENTS 12 PCTINCREASE 20);
 
 

No comments:

Post a Comment

External Table

Oracle External Table External tables are defined as tables that do not resides in the database allows you to access data that is stor...