1
32kviews
Write steps of Girvan- Newman Algorithm. Explain clustering of Social-Network Graphs using GN algorithm with example?
5
2.3kviews
• In order to find out between edges, we need to calculate shortest paths from going through each of the edges.

• Girvan - Newman Algorithm visits each node X once and computes the number of shortest paths from X to each of the other nodes that go through each of the edges.

• The algorithm begins by performing a breadth first search [BFS] of the graph, starting at the node X.

• The edges that go between node at the same level can never be a part of a shortest path from X.

• Edges DAG edge will be part of at-least one shortest path from root X.

• Consider the graph as follows:

Following figure shows the BFS representation starting at node E. solid edges are DAG edges and dashed edges connect nodes at the same level.

• The second step of GN algorithm is to label each node by the number of shortest paths that reach it from the root.

• Start by labeling the root 1, then, from the top down, label each node Y by the sum of the labels of its parents.

• The labeling of nodes is shown in above figure.

• The final step is to calculate for each edge e the sum over nodes Y of the fraction of shown paths from the root X to Y than go through e.

• Each node other than root given a credit of 1.

• This credit may be divided among nodes and edges above.

• The rules for calculation are as follows:

• Each leaf in DAG gets a credit.

• Each non leaf node gets a credit equals to 1 plus the sum of the credits of the DAG edges from that node to the level

• A DAG edge e entering node Z from the level above is given a share of the credit of Z. proportional to the fraction of shortest paths from the root to Z that go through e.

• After performing the credit calculate with each node as the root, we sum the credits for each edge.

• As each shortest path will have been discovered twice, we must divide the result/credit for each edge by 2.

• Following graph shows the calculation for DAG starting from node E.

• To complete the betweeness calculation, we have to repeat this calculation for every node as the root and sum the contributions.

• After calculations, following graph shows final betweenness values:

• We can cluster by taking the in order to increasing betweenness and add them to the graph at a time.

• We can remove edge with highest value to cluster the graph.

• In the example graph we remove edge BD to get two communities as follows: