Question: Bond Energy Algorithm

Mumbai University > Computer Engineering > Sem 6 > Distributed Database

Marks: 10M

Year: May 2016

 modified 2.1 years ago  • written 2.1 years ago by Barkha • 750
1. The bond energy algorithm (BEA) was developed for database design, where it is used to decide how to group data and how to place it physically on disk.

2. It can be used to cluster attributes based on usage and then perform logical or physical design accordingly. With BEA, the affinity (bond) between database attributes is based on common usage.

3. This bond is used by the clustering algorithm as a similarity measure. The measure counts the number of times two attributes are used together in a given period of time. To compute it, all common queries must first be identified.

4. The idea is that attributes that are used together form a cluster and should be stored together. In a distributed database, each resulting cluster is called a vertical fragment and may be stored at a different site from the other fragments.

5. The basic steps of this clustering algorithm are:

i. Create an attribute affinity matrix in which each entry indicates the affinity between the two associated attributes. The entries in this similarity matrix are based on the frequency of common usage of attribute pairs.

ii. The BEA then converts this similarity matrix to a BOND matrix in which the entries represent a type of nearest-neighbor bonding based on the probability of co-access. The BEA rearranges rows and columns so that similar attributes appear close together in the matrix.

iii. Finally, the designer draws boxes around regions in the matrix with high similarity.
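Step i can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute names, the query-to-attribute `usage` map, and the access frequencies in `freq` are made-up values, and `affinity_matrix` is a hypothetical helper name. The diagonal is filled with each attribute's total usage, which is one common convention rather than something the answer specifies.

```python
def affinity_matrix(attrs, usage, freq):
    """Build the attribute affinity matrix.

    attrs: ordered list of attribute names
    usage: query name -> set of attributes that query accesses
    freq:  query name -> number of times the query runs in the period
    """
    index = {a: i for i, a in enumerate(attrs)}
    n = len(attrs)
    aff = [[0] * n for _ in range(n)]
    for query, used in usage.items():
        # Every pair of attributes touched by the same query gains
        # that query's frequency (the pair (a, a) fills the diagonal).
        for a in used:
            for b in used:
                aff[index[a]][index[b]] += freq[query]
    return aff


# Hypothetical workload: four attributes, four queries.
attrs = ["A1", "A2", "A3", "A4"]
usage = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}
freq = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}

aff = affinity_matrix(attrs, usage, freq)
for name, row in zip(attrs, aff):
    print(name, row)
```

With this workload, A1 and A3 end up with affinity 45 and A2 and A4 with affinity 75, so those pairs are the natural candidates for being stored together.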

The clustered matrix produced by these steps is illustrated in the figure. The two shaded boxes mark the attributes that have been grouped together into two clusters.
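The reordering in step ii can be sketched as a greedy column placement. The sketch assumes the usual textbook formulation, which the answer itself does not spell out: the bond between two columns is the sum of the products of their entries, and a new column is inserted at the position where `2*bond(left, new) + 2*bond(new, right) - 2*bond(left, right)` is largest. The affinity values and the names `bond`, `contribution`, and `bea_order` are hypothetical.

```python
def bond(aff, x, y):
    """Bond between columns x and y; None stands for an empty border column."""
    if x is None or y is None:
        return 0
    return sum(row[x] * row[y] for row in aff)


def contribution(aff, left, mid, right):
    """Net gain in bond energy from placing column mid between left and right."""
    return (2 * bond(aff, left, mid)
            + 2 * bond(aff, mid, right)
            - 2 * bond(aff, left, right))


def bea_order(aff):
    """Return a column order that greedily maximizes neighbor bonds."""
    n = len(aff)
    order = list(range(min(n, 2)))  # the first two columns are placed as-is
    for k in range(2, n):
        best_pos, best_gain = 0, None
        for pos in range(len(order) + 1):
            left = order[pos - 1] if pos > 0 else None
            right = order[pos] if pos < len(order) else None
            gain = contribution(aff, left, k, right)
            if best_gain is None or gain > best_gain:
                best_pos, best_gain = pos, gain
        order.insert(best_pos, k)
    return order


# Hypothetical symmetric affinity matrix for attributes A1..A4.
aff = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]

order = bea_order(aff)
print([f"A{i + 1}" for i in order])  # ['A1', 'A3', 'A2', 'A4']
```

Because the matrix is symmetric, the rows are permuted in the same order as the columns. In the reordered matrix A1 sits next to A3 and A2 next to A4, which is where a designer would draw the two boxes.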

Two attributes Ai and Aj have a high affinity if they are frequently used together in database applications. At the heart of the BEA is the global affinity measure. Suppose that a database schema consists of n attributes {A1, A2, ..., An}. The global affinity measure, AM, is defined as