How does multilevel indexing improve the efficiency of searching an index file?

358views

written 7.6 years ago by

A multilevel index considers the index file, which we will now refer to as the first (or base) level of a multilevel index, as an ordered file with a distinct value for each K(i).
Because a single-level index is an ordered file, we can create a primary index to the index itself;  In this case, the original index file is called the first-level index and the index to the index is called the second-level index.
We can repeat the process, creating a third, fourth ... top level until all entries of the top level fit in one disk block
A multi-level index can be created for any type of firstlevel index (primary, secondary, clustering) as long as the first-level index consists of more than one disk block
However, insertion and deletion of new index entries is a severe problem because every level of the index is an ordered file

enter image description here

Multilevel indexing improve the efficiency of searching an index file in following way:

1) A multilevel index reduces the number of blocks accessed when searching for a record, given its indexing field value.

2) To retain the benefits of using multilevel indexing while reducing index insertion and deletion problems,

3) Designers adopted a multilevel index that leaves some space in each of its blocks for inserting new entries. This is called a dynamic multilevel index and is often implemented by using data structures called B-trees and B+-trees.

B+ Tree

A B+ tree is a balanced binary search tree that follows a multi-level index format. The leaf nodes of a B+ tree denote actual data pointers. B+ tree ensures that all leaf nodes remain at the same height, thus balanced. Additionally, the leaf nodes are linked using a link list; therefore, a B+ tree can support random access as well as sequential access.

Structure of B+ Tree

Every leaf node is at equal distance from the root node. A B+ tree is of the order n where nis fixed for every B+ tree.

enter image description here

Internal nodes

Internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node.
At most, an internal node can contain n pointers.

Leaf nodes

Leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
At most, a leaf node can contain n record pointers and n key values.
Every leaf node contains one block pointer P to point to next leaf node and forms a linked list.

B+ Tree Insertion

B+ trees are filled from bottom and each entry is done at the leaf node.
If a leaf node overflows −
- Split node into two parts.
- Partition at i = ⌊(m+1)/2⌋.
- First i entries are stored in one node.
- Rest of the entries (i+1 onwards) are moved to a new node.
- ith key is duplicated at the parent of the leaf.
If a non-leaf node overflows −
- Split node into two parts.
- Partition the node at i = ⌈(m+1)/2⌉.
- Entries up to i are kept in one node.
- Rest of the entries are moved to a new node.

B+ Tree Deletion

B+ tree entries are deleted at the leaf nodes.
The target entry is searched and deleted.

If it is an internal node, delete and replace with the entry from the left position.
After deletion, underflow is tested,
If distribution is not possible from left, then

Distribute from the nodes right to it.
If distribution is not possible from left or from right, then

Merge the node with left and right to it.

ADD COMMENT EDIT