Oracle tuning: Blocksize and index tree structures
There has been a great debate about the structures of Oracle index trees and whether they are important to Oracle tuning. Guru Don Burleson gives his opinion on the matter.
There has been a great debate about the structures of Oracle index trees and whether they are important to Oracle tuning, and many articles have attempted to describe the inner working of these important Oracle performance facilitators. Several new books have appeared on the subject, Oracle Index Management Secrets and Oracle SQL tuning and CBO Internals, by Kim Floss, President of the International Oracle Users Group (IOUG).
As we may know, Oracle offers a wealth of index structures, each with their own benefits and drawbacks:
- B-tree indexes - This is the standard tree index that Oracle has been using since the earliest releases.
- Bitmap indexes - Bitmap indexes are used where an index column has a relatively small number of distinct values (low cardinality). These are super-fast for read-only databases, but are not suitable for systems with frequent updates
- Bitmap join indexes - This is an index structure whereby data columns from other tables appear in a multi-column index of a junction table. This is the only create index syntax to employ a SQL-like from clause and where clause.
create bitmap index part_suppliers_state on inventory( parts.part_type, supplier.state ) from inventory i, parts p, supplier s where i.part_id=p.part_id and i.supplier_id=p.supplier_id;
While the debate continues to rage about index rebuilding, there are some areas of index management where everyone agrees. Internally, the structure of an Oracle B*tree index is very similar to a UNIX inode structure. Each data block within the index serves as a "node" in the index tree, with the bottom nodes (leaf blocks), containing pairs of symbolic keys and ROWID values.
Inside Oracle b-tree indexes
In order to properly manage the blocks, Oracle controls the allocation of pointers within each data block. As an Oracle tree grows (via inserting rows into the table), Oracle fills the block, and when full it splits, creating new index nodes (data blocks) to manage the symbolic keys within the index.
Hence, an Oracle index block may contain two types of pointers:
- Pointers to other index nodes (data blocks)
- ROWID pointers to specific table rows
Oracle manages the allocation of pointers within index blocks, and this is the reason why we are unable to specify a PCTUSED value (the freelist re-link threshold) for indexes. When we examine an index block structure, we see that the number of entries within each index node is a function of two values:
- The length of the symbolic key
- The blocksize for the index tablespace
Because the blocksize affects the number of keys within each index node, it follows that the blocksize will have an effect on the structure of the index tree. All else being equal, large 32k blocksizes will have more keys, resulting in a flatter index than the same index created in a 2k tablespace. A large blocksize will also reduce the number of consistent gets during index access, improving performance for scattered reads access.
Each data block within the index contains "nodes" in the index tree, with the bottom nodes (leaf blocks), containing pairs of symbolic keys and ROWID values. As an Oracle tree grows (via inserting rows into the table), Oracle fills the block, and when the block is full, it splits, creating new index nodes (data blocks) to manage the symbolic keys within the index. Hence, an Oracle index block may contain pointers to other index nodes or ROWID/Symbolic-key pairs.
Index behavior and Oracle blocksize
Because the blocksize affects the number of keys within each index block, it follows that the blocksize will have an effect on the structure of the index tree. All else being equal, large 32k blocksizes will have more keys per block, resulting in a flatter index than the same index created in a 2k tablespace.
Today, most Oracle tuning experts utilize the multiple blocksize feature of Oracle because it provides buffer segregation and the ability to place objects with the most appropriate blocksize to reduce buffer waste. Some of the world record Oracle benchmarks use very large data buffers and multiple blocksizes.
According to an article by Christopher Foot, author of the OCP Instructors Guide for Oracle DBA Certification, larger block sizes can help in certain situations:
"A bigger block size means more space for key storage in the branch nodes of B-tree indexes, which reduces index height and improves the performance of indexed queries."
In any case, there appears to be evidence that block size affects the tree structure, which supports the argument that data blocks affect the structure of the tree.
You can use the large (16-32K) blocksize data caches to contain data from indexes or tables that are the object of repeated large scans. Does such a thing really help performance? A small but revealing test can answer that question.
For the test, the following query will be used against a 9i database that has a database block size of 8K, but also has the 16K cache enabled along with a 16K tablespace:
select count(*) from eradmin.admission where patient_id between 1 and 40000;
The ERADMIN.ADMISSION table has 150,000 rows in it and has an index build on the PATIENT_ID column. An EXPLAIN of the query reveals that it uses an index range scan to produce the desired end result:
Execution Plan ---------------------------------------------------------- SELECT STATEMENT Optimizer=CHOOSE (Cost=41 Card=1 Bytes=4) 1 0 SORT (AGGREGATE) 2 1 INDEX (FAST FULL SCAN) OF 'ADMISSION_PATIENT_ID' (NON-UNIQUE) (Cost=41 Card=120002 Bytes=480008)
Executing the query (twice to eliminate parse activity and to cache any data) with the index residing in a standard 8K tablespace produces these runtime statistics:
Statistics --------------------------------------------------- 0 recursive calls 0 db block gets 421 consistent gets 0 physical reads 0 redo size 371 bytes sent via SQL*Net to client 430 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
To test the effectiveness of the new 16K cache and 16K tablespace, the index used by the query will be rebuilt into the 16K tablespace that has the exact same characteristics as the original 8K tablespace, except for the larger blocksize:
alter index eradmin.admission_patient_id rebuild nologging noreverse tablespace indx_16k;
Once the index is nestled firmly into the 16K tablespace, the query is re-executed (again twice) with the following runtime statistics being produced:
Statistics --------------------------------------------------- 0 recursive calls 0 db block gets 211 consistent gets 0 physical reads 0 redo size 371 bytes sent via SQL*Net to client 430 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
As you can see, the amount of logical reads has been reduced in half simply by using the new 16K tablespace and accompanying 16K data cache. Clearly, the benefits of properly using the new data caches and multi-block tablespace feature of Oracle9i and above are worth your investigation and trials in your own database.
Regularly scheduled index rebuilds?
Another area of the debate is about whether a set of rules can be determined to identify when performance will improve from an index rebuild. Many Oracle shops schedule periodic index rebuilding, and report measurable speed improvements after they rebuild their Oracle b-tree indexes.
In an OracleWorld 2003 presentation titled Oracle Database 10g: The Self-Managing Database by Sushil Kumar, Kumar states that the Automatic Maintenance Tasks (AMT) Oracle10g feature will automatically detect and rebuild sub-optimal indexes.
"AWR provides the Oracle Database 10g a very good 'knowledge' of how it is being used. By analyzing the information stored in AWR, the database can identify the need of performing routine maintenance tasks, such as optimizer statistics refresh, rebuilding indexes, etc. The Automated Maintenance Tasks infrastructure enables the Oracle Database to automatically perform those operations."
However there are also arguments against scheduled index rebuilding. Some Oracle in-house experts maintain that Oracle indexes are super-efficient at space re-use and access speed and that a b-tree index rarely needs rebuilding. They claim that a reduction in Logical I/O (LIO) should be measurable, and if there were any benefit to index rebuilding, someone would have come up with "provable" rules.
Conclusion
The evidence is clear that the multiple blocksize feature improves the performance of Oracle indexes and that there are cases where query speed is improved by rebuilding indexes. It is hoped that the new Oracle10g AMT will allow for the automated detection and rebuilding of sub-optimal index structures.
If you like Oracle tuning ticks, you might enjoy my latest book Creating a Self-tuning Oracle Database by Rampant TechPress. It's only $9.95 (I don't think it is right to charge a fortune for books!).
About the Author
Donald K. Burleson has been a DBA for more than 20 years and provides Oracle consulting for systems that require high performance. The author of more than 30 books, Burleson provides Oracle consulting at www.dba-oracle.com and remote Oracle support at www.remote-dba.net.