0
23kviews
Explain the Load balancing approach in distributed system.
4
448views

## A Taxonomy of Load-Balancing Algorithms

3. Dynamic

4. Static

5. Deterministic

6. Probabilistic

7. Centralized

8. Distributed

9. Cooperative

10. Non-cooperative

Type of load-balancing algorithms

Static versus Dynamic:

Static algorithms use only information about the average behavior of the system Static algorithms ignore the current state or load of the nodes in the system Dynamic algorithms collect state information and react to system state if it changed Static algorithms are much more simpler Dynamic algorithms are able to give significantly better performance

Load-balancing approach Type of static load-balancing algorithms

Deterministic versus Probabilistic

Deterministic algorithms use the information about the properties of the nodes and the characteristic of processes to be scheduled Probabilistic algorithms use information of static attributes of the system (e.g. number of nodes, processing capability, topology) to formulate simple process placement rules Deterministic approach is difficult to optimize Probabilistic approach has poor performance

Load-balancing approach Type of dynamic load-balancing algorithms

Centralized versus Distributed

Centralized approach collects information to server node and makes assignment decision Distributed approach contains entities to make decisions on a predefined set of nodes Centralized algorithms can make efficient decisions, have lower fault-tolerance Distributed algorithms avoid the bottleneck of collecting state information and react faster

Load-balancing approach Type of distributed load-balancing algorithms

Cooperative versus Non-cooperative

In Non-cooperative algorithms entities act as autonomous ones and make scheduling decisions independently from other entities

In Cooperative algorithms distributed entities cooperate with each other Cooperative algorithms are more complex and involve larger overhead Stability of Cooperative algorithms are better

Issues in designing Load-balancing algorithms

1 .Load estimation policy:.determines how to estimate the workload of a node

2. Process transfer policy: determines whether to execute a process locally or remote

State information exchange policy: determines how to exchange load information among nodes

Location policy: determines to which node the transferable process should be sent

Priority assignment policy: determines the priority of execution of local and remote processes

Migration limiting policy: determines the total number of times a process can migrate

To balance the workload on all the nodes of the system, it is necessary to decide how to measure the workload of a particular node

Some measurable parameters (with time and node dependent factor) can be the following:

Total number of processes on the node

Resource demands of these processes

Instruction mixes of these processes

Architecture and speed of the node’s processor

Several load-balancing algorithms use the total number of processes to achieve big efficiency

In some cases the true load could vary widely depending on the remaining service time, which can be measured in several way:

Memoryless method assumes that all processes have the same expected remaining service time, independent of the time used so far Past repeats assumes that the remaining service time is equal to the time used so far Distribution method states that if the distribution service times is known, the associated process’s remaining service time is the expected remaining time conditioned by the time already used

None of the previous methods can be used in modern systems because of periodically running processes and daemons An acceptable method for use as the load estimation policy in these systems would be to measure the CPU utilization of the nodes

Central Processing Unit utilization is defined as the number of CPU cycles actually executed per unit of real time

It can be measured by setting up a timer to periodically check the CPU state (idle/busy)

Process transfer policy I.for Load-balancing algorithms

Most of the algorithms use the threshold policy to decide on whether the node is lightly-loaded or heavily-loaded

Threshold value is a limiting value of the workload of node which can be determined by

Static policy: predefined threshold value for each node depending on processing capability

Dynamic policy: threshold value is calculated from average workload and a predefined constant

Below threshold value node accepts processes to execute, above threshold value node tries to transfer processes to a lightly-loaded node Single-threshold policy may lead to unstable algorithm because under loaded node could turn to be overloaded right after a process migration To reduce instability double-threshold policy has been proposed which is also known as high-low policy

Process transfer policy III. for Load-balancing algorithms

Double threshold policy

When node is in overloaded region new local processes are sent to run remotely, requests to accept remote processes are rejected

When node is in normal region new local processes run locally, requests to accept remote processes are rejected

When node is in under loaded region new local processes run locally, requests to accept remote processes are accepted

Location policy I. for Load-balancing algorithms

Threshold method

Policy selects a random node, checks whether the node is able to receive the process, and then transfers the process. If node rejects, another node is selected randomly. This continues until probe limit is reached.

Shortest method

L distinct nodes are chosen at random, each is polled to determine its load. The process is transferred to the node having the minimum value unless its workload value prohibits to accept the process.

Simple improvement is to discontinue probing whenever a node with zero load is encountered.

Location policy II. for Load-balancing algorithms

Bidding method

Nodes contain managers (to send processes) and contractors (to receive processes)

Managers broadcast a request for bid, contractors respond with bids (prices based on capacity of the contractor node) and manager selects the best offer

Winning contractor is notified and asked whether it accepts the process for execution or not

Full autonomy for the nodes regarding scheduling

Difficult to decide a good pricing policy

Location policy III. for Load-balancing algorithms

Pairing

Contrary to the former methods the pairing policy is to reduce the variance of load only between pairs

Each node asks some randomly chosen node to form a pair with it

If it receives a rejection it randomly selects another node and tries to pair again

Two nodes that differ greatly in load are temporarily paired with each other and migration starts

State information exchange policy I. for Load-balancing algorithms

Dynamic policies require frequent exchange of state information, but these extra messages arise two opposite impacts:

Increasing the number of messages gives more accurate scheduling decision

Increasing the number of messages raises the queuing time of messages

State information policies can be the following:

Broadcast when state changes

On-demand exchange

Exchange by polling

State information exchange policy II. for Load-balancing algorithms

Each node broadcasts its state information after the elapse of every T units of time

Problem: heavy traffic, fruitless messages, poor scalability since information exchange is too large for networks having many nodes

Broadcast when state changes

Avoids fruitless messages by broadcasting the state only when a process arrives or departures

Further improvement is to broadcast only when state switches to another region (double-threshold policy)

State information exchange policy III. for Load-balancing algorithms

On-demand exchange

In this method a node broadcast a State-Information-Request message when its state switches from normal to either underloaded or overloaded region.

The pair is broken as soon as the migration is over

A node only tries to find a partner if it has at least two processes

On receiving this message other nodes reply with their own state information to the requesting node

Further improvement can be that only those nodes reply which are useful to the requesting node

Exchange by polling To avoid poor scalability (coming from broadcast messages) the partner node is searched by polling the other nodes on by one, until poll limit is reached