0
9.6kviews
Explain Optimal Binary Search Tree with an Example?
4
481views

### OPTIMAL BINARY SEARCH TREES

• A Binary Search Tree is one of the most important data structures which contain a set of elements with the operations of searching, insertion, and deletion.

• An Optimal Binary Search Tree is the one for which the average number of comparisons in a search is the smallest possible, if the probability of searching each elements is given.

• Consider four keys A, B, C, and D to be searched for with probabilities $0.1,0.2,0.4$, and $0.3$, respectively.

• The below figure depicts two out of 14 possible binary search trees containing these keys.(A) (B)

• The average number of comparisons in a successful search in the first of these trees is

$0.1 \cdot 1+0.2 \cdot 2+0.4 \cdot 3+0.3 \cdot 4=2.9$,

and for the second one it is

$0.1 .2+0.2 .1+0.4 .2+0.3 .3=2.1$.

• Neither of these two trees is, in fact, optimal.

• The total number of binary search trees with $\mathrm{n}$ keys is equal to the nth Catalan number,

$$c(n)=\frac{1}{n+1}\left(\begin{array}{c} 2 n \\ n \end{array}\right) \quad \text { for } n\gt0, \quad c(0)=1,$$

• Let $a_{1}, \ldots, a_{n}$ be distinct keys ordered from the smallest to the largest and let $p_{1}, \ldots, p_{n}$ be the probabilities of searching for them.

• Let $\mathrm{C}(\mathrm{i}, \mathrm{j})$ be the smallest average number of comparisons made in a successful search in a binary search tree $T_{i}^{j}$.- Suppose the root contains key $a_{k}$, the left subtree $T_{i}^{k-1}$ contains keys $a_{i}, \ldots, a_{k-1}$ optimally arranged, and the right subtree $T_{j}^{k+1}$ contains keys $a_{k+1}, \ldots, a_{j}$ also optimally arranged.

• If we count tree levels starting with 1 to make the comparison numbers equal the keys' levels, the following recurrence relation is obtained: \begin{aligned} C(i, j)=& \min _{i \leq k \leq j}\left\{p_{k} \cdot 1+\sum_{s=i}^{k-1} p_{s} \cdot\left(\text { level of } a_{s} \text { in } T_{i}^{k-1}+1\right)\right.\\ &\left.+\sum_{s=k+1}^{j} p_{s} \cdot\left(\text { level of } a_{s} \text { in } T_{k+1}^{j}+1\right)\right\} \\ =& \min _{i \leq k \leq j}\left\{\sum_{s=i}^{k-1} p_{s} \cdot \text { level of } a_{s} \text { in } T_{i}^{k-1}+\sum_{s=k+1}^{j} p_{s} \cdot \text { level of } a_{s} \text { in } T_{k+1}^{j}+\sum_{s=i}^{j} p_{s}\right\} \\ =& \min _{i \leq k \leq j}\{C(i, k-1)+C(k+1, j)\}+\sum_{s=i}^{j} p_{s} . \end{aligned}

• The recurrence relation is $$C(i, j)=\min _{i \leq k \leq j}\{C(i, k-1)+C(k+1, j)\}+\sum_{s=i}^{j} p_{s} \quad \text { for } 1 \leq i \leq j \leq n .$$

• Here $\mathrm{C}(\mathrm{i}, \mathrm{i}-1)=0$ for $1 \leq \mathrm{i} \leq \mathrm{n}+1$ [represents empty tree] and $\mathrm{C}(\mathrm{i}, \mathrm{i})=\mathrm{p}_{\mathrm{i}}$, for $1 \leq \mathrm{i} \leq \mathrm{n}$ [represents an one node tree],

• The algorithm computes $C(1, n)$, the average number of comparisons for successful searches in the optimal binary tree.

• To get the optimal tree, another two-dimensional table to record the value of $\mathrm{k}$ for which it is minimum.

$\underline{\text { EXAMPLE }}$

Construct an optimal binary search tree for the given set of keys

Key A B C D
Probability 0.1 0.2 0.4 0.3

Initial tables will be: here $\mathrm{C}(\mathrm{i}, \mathrm{i}-1)=0$

for $1 \leq \mathrm{i} \leq \mathrm{n}+1 \& \mathrm{C}(\mathrm{i}, \mathrm{i})=\mathrm{p}_{\mathrm{i}}$

0 1 2 3 4
1 0 0.1
2 0 0.2
3 0 0.4
4 0 0.3
5 0
0 1 2 3 4
0 1 2 3 4
1 1
2 2
3 3
4 4
5

\begin{aligned} \mathrm{C}(1,2) &=\min \left\{\begin{array}{l}\text { for } \mathrm{k}=1: \mathrm{C}[1,0]+\mathrm{C}[2,2]+\sum_{s=1}^{2} p_{s} \\ \left.\text { for } \mathrm{k}=2: \mathbb{C}_{2}, 1,1\right]+\mathrm{C}[3,2]+\sum_{s=1}^{2} p_{s}\end{array}\right.\\ &=\min [0+0.2+0.3,0.1+0+0.3] \\ &=\min [0.5,0.4] \\ &=0.4 \\ \mathrm{C}(2,3) &=\min \left\{\begin{array}{l}\text { for } \mathrm{k}=2: \mathrm{C}[2,1]+\mathrm{C}[3,3]+\sum_{s=2}^{3} p_{s} \\ \text { for } \mathrm{k}=3: \mathrm{C}[2,2]+\mathrm{C}[4,3]+\sum_{s=2}^{3} p_{s}\end{array}\right.\\ &=\min [0+0.4+0.6,0.2+0+0.6] \\ &=\min [1.0,0.8] \\ &=0.8 \end{aligned}

\begin{aligned} C(1,4) &=\min \left\{\begin{array}{l} \text { for } \mathrm{k}=1: \mathrm{C}[1,0]+\mathrm{C}[2,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=2: \mathrm{C}[1,1]+\mathrm{C}[3,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=3: \mathrm{C}[1,2]+\mathrm{C}[4,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=4: \mathrm{C}[1,3]+\mathrm{C}[5,4]+\sum_{s=1}^{4} p_{s} \end{array}\right.\\ &=\min [0+1.4+1.0,0.1+1.0+1.0,0.4+0.3+1.0,1.1+0+1.0] \\ &=\min [2.4,2.1,1.3 .2 .1] \\ &=1.7 \end{aligned}

Now the tables becomes

0 1 2 3 4
1 0 0.1 0.4
2 0 0.2 0.8
3 0 0.4 1.0
4 0 0.3
5 0
0 1 2 3 4
1 1 2
2 2 3
3 3 3
4 4
5

\begin{aligned} C(1,4) &=\min \left\{\begin{array}{l}\text { for } \mathrm{k}=1: \mathrm{C}[1,0]+\mathrm{C}[2,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=2: \mathrm{C}[1,1]+\mathrm{C}[3,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=3: \mathrm{C}[1,2]+\mathrm{C}[4,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=4: \mathrm{C}[1,3]+\mathrm{C}[5,4]+\sum_{s=1}^{4} p_{s}\end{array}\right.\\ &=\min [0+1.4+1.0,0.1+1.0+1.0,0.4+0.3+1.0,1.1+0+1.0] \\ &=\min [2.4,2.1,1.3 .2 .1] \\ &=1.7 \end{aligned}

\begin{aligned} C(1,4) &=\min \left\{\begin{array}{l}\text { for } \mathrm{k}=1: \mathrm{C}[1,0]+\mathrm{C}[2,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=2: \mathrm{C}[1,1]+\mathrm{C}[3,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=3: \mathrm{C}[1,2]+\mathrm{C}[4,4]+\sum_{s=1}^{4} p_{s} \\ \text { for } \mathrm{k}=4: \mathrm{C}[1,3]+\mathrm{C}[5,4]+\sum_{s=1}^{4} p_{s}\end{array}\right.\\ &=\min [0+1.4+1.0,0.1+1.0+1.0,0.4+0.3+1.0,1.1+0+1.0] \\ &=\min [2.4,2.1,1.3 .2 .1] \\ &=1.7 \end{aligned}

Now the tables becomes

0 1 2 3 4
1 1 2
2 2 3
3 3 3
4 4
5
0 1 2 3 4
1 1 2 3 3
2 2 3 3
3 3 3
4 4
5
• Thus, the average number of key comparisons in the optimal tree is equal to $1.7$.

• Since $R(1,4)=3$, the root of the optimal tree contains the third key, i.e., $C$. - Since its a binary search tree, Its left subtree is made up of keys A and B, and its right subtree contains just the key D - To find the specific structure of these subtrees, - In the root table since $R(1,2)=2$, the root of the optimal tree containing $A$ and $B$ is $B$,- Thus, the average number of key comparisons in the optimal tree is equal to $1.7$. - Since $R(1,4)=3$, the root of the optimal tree contains the third key, i.e., $C$. - Since its a binary search tree, Its left subtree is made up of keys A and B, and its right subtree contains just the key D - To find the specific structure of these subtrees, - In the root table since $R(1,2)=2$, the root of the optimal tree containing $A$ and $B$ is $B$, with $\mathrm{A}$ being its left child. - Since R $(4,4)=4$, the root of this one-node optimal tree is its only key D.

• The below figure represents the optimal tree