Table clustering is an optional
technique of storing data. In table clustering, rows from one or more tables those
are associated with each other on common data entity are physically stored
adjacent to each other on same data block. This common entity is called ‘Cluster
key’. Identifying the cluster key is the first step to implement the oracle table
clustering.
Once good candidate for cluster
key is identified, two more challenges come on the way to make it success.
First challenge is avoid disk block chaining and second challenge is large
variance or standard deviation.
The Disk Block chaining is very
mush possible in cluster table. If combined size of all rows of tables in
single row cluster exceeds the size of disk block that would lead to immediate disk
block chaining. Chaining is linking together
two separate disk block. Disk block
chaining reduces the advantage of Oracle Cluster table.
Second Challenge come through variance in Cluster row size
which causes either significant disk block chaining or wasted disk space.
When creating oracle cluster, an
optional size parameter should be used to set the max number of cluster key
with their corresponding rows allowed within disk block. Oracle divides
available disk block space by this optional parameter to calculate the maximum
no of cluster keys. In data warehouse system where data is almost static and
does not grow, this optional parameter would not necessary. Whenever update or
Insert statement executes this causes Cluster row gets larger. It is quite possible
that Cluster row will significantly increase in size. If accumulated increase in
size of all cluster rows consumes all the available remaining space within a
disk block, Disk chaining will happen. This is undesirable and require expensive
disk seek operation to access it.
On other hand large variance in
the size of cluster rows can cause either disk chaining or either wastage of
disk space. By looking both these challenges in Oracle cluster table optional
size parameter play a vital role. If average cluster row size is small, there
will be more Cluster Row with in disk block. On the other hand average cluster
row size is big which will reduce the MAXIMUM limit.
In select statement execution sorting
and join filter are almost mandatory operation performed before the result gets
displayed. By introducing Oracle Cluster
Table both these operations skip from query execution. The fields of the
cluster key are more often indexed, so high maintenance indexes for these
fields are reduced.
Example of crusting EMP and DEPT table are associated to each other via DEPTNO column and it is primary key
Query execution for heap table is explained below. In later
section Oracle Cluster table is explained in detail.
SELECT E.EMPNO,
E.ENAME,
E.SAL,
E.COMM,
E.HIREDATE,
D.DEPNTO,
D.DNAME,
D.LOC
FROM EMP E, DEPT D
WHERE E.DEPTNO = D.DEPTNO
ORDER BY D.DEPTNO;
The IO operation of the above query will involve following steps.
- Traversing DEPTNO index to get the next dept no.
- Fetch rows from DEPT table that correspond to DEPTNO using ROWID identified from Index scan.
- Find each row from EMP table correspond to DEPTNO unless there is another index available on DEPTNO.
- Fetch Row from EMP table.
Once the cluster table created, the IO operation of the above query
will involve below steps.
- Traversing through DEPTNO ( cluster key) index.
- Fetch cluster row, consisting row from DEPT table and all rows from EMP table correspond to DEPTNO.
Create cluster, Table and Index
CREATE CLUSTER CLUS_EMP_DEPT (DEPTNO NUMBER (3)) -- Cluster Key
SIZE 600 -- Average Cluster Row size in bytes
TABLESPACE MY_TABLESPACE --if table space omitted then it
STORAGE (INITIAL 500K -- will create in default tablespace
NEXT 500K -- Cluster index is created by
MINEXTENTS 2 -- default
PCTINCREASE 20);
CREATE TABLE EMP
(
EMPNO NUMBER (5) PRIMARY KEY,
ENAME VARCHAR2 (15) NOT NULL,
DEPTNO NUMBER (3) REFERENCES DEPT
)
CLUSTER CLUS_EMP_DEPT ( DEPTNO );
(
DEPTNO NUMBER (3) PRIMARY KEY,
DNAME VARCHAR2 (20),
LOC VARCHAR2 (30)
)
CLUSTER CLUS_EMP_DEPT ( DEPTNO );
CREATE INDEX CLUS_EMP_DEPT_INDEX -- explicitly create
ON CLUSTER CLUS_EMP_DEPT -- index if required
TABLESPACE MY_INDEX_TABLESPACE
STORAGE (INITIAL 80 K NEXT 50 K MINEXTENTS 2 MAXEXTENTS 12 PCTINCREASE 20);
No comments:
Post a Comment