List down the issues in designing Distributed Load Balancing technique. Explain one technique of logical clock synchronization.

This question appears in Mumbai University > Parallel And Distributed System subject

Marks: 10 M

Year: Dec 14

1 Answer

Designing a good load-balancing algorithm is a difficult task because of the following issues:

i. Load estimation policy for Load-balancing algorithms

  • The first issue in a load-balancing algorithm is deciding which method to use to estimate the workload of a particular node.
  • A node's workload can be estimated from measurable parameters such as:
  • Total number of processes on the node.
  • Resource demands of these processes.
  • Instruction mixes of these processes.
  • Architecture and speed of the node's processor.
  • Several load-balancing algorithms use the total number of processes on a node as its load measure, for the sake of efficiency.
  • However, the true load can vary widely depending on the remaining service time of the processes on a node, which can be estimated in several ways, such as the memoryless method, past repeats, and the distribution method.
  • An acceptable load estimation policy in such systems is to measure the CPU utilization of the nodes. This can be done by setting up a timer that periodically checks the CPU state (idle/busy).
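A minimal Python sketch of this timer-based estimation (not from the original answer; `sample_cpu_state` is a hypothetical stand-in for a real OS probe such as reading `/proc/stat`):

```python
import random  # only used to simulate the CPU-state probe


def sample_cpu_state():
    """Hypothetical probe: True if the CPU is busy, False if idle.
    A real implementation would read an OS counter instead."""
    return random.random() < 0.6  # assume the CPU is busy ~60% of the time


def estimate_utilization(samples=1000):
    """Estimate CPU utilization as the fraction of timer ticks
    on which the CPU was observed busy."""
    busy = sum(1 for _ in range(samples) if sample_cpu_state())
    return busy / samples
```

The more frequently the timer fires, the more accurate the estimate, at the cost of extra sampling overhead.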

ii. Process transfer policy for Load-balancing algorithms

  • The idea of this policy is to transfer processes from heavily loaded nodes to lightly loaded nodes.
  • Most algorithms use a threshold policy to decide whether a node is lightly or heavily loaded. The threshold value is a limiting value of a node's workload and can be determined by:

Static policy: Predefined threshold value for each node depending on processing capability.

Dynamic policy: Threshold value is calculated from the average workload and a predefined constant.

  • Below the threshold value a node accepts processes to execute; above it, the node tries to transfer processes to a lightly loaded node.
  • A single-threshold policy may lead to an unstable algorithm, because an underloaded node can become overloaded immediately after receiving a migrated process.
  • To reduce this instability, a double-threshold policy has been proposed, which is also known as the high-low policy.
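The high-low policy above can be sketched in a few lines of Python (an illustrative sketch; the threshold values `low` and `high` are assumed, not from the original answer):

```python
def classify(load, low, high):
    """Classify a node's load region under the double-threshold
    (high-low) policy: 'underloaded', 'normal', or 'overloaded'."""
    if load < low:
        return "underloaded"
    if load > high:
        return "overloaded"
    return "normal"


def should_transfer(load, low=2, high=5):
    """Only overloaded nodes try to transfer processes; nodes in the
    normal band neither send nor request work, which damps the
    oscillation seen with a single threshold."""
    return classify(load, low, high) == "overloaded"
```

A node in the `normal` region absorbs small load fluctuations without triggering migrations in either direction.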

iii. Location policy for Load-balancing algorithms

  • The selection of the destination node for process execution is made by the location policy of a scheduling algorithm.
  • The policies used can be:

Threshold method

  • This policy selects a node at random and checks whether that node is able to receive the process; if so, the process is transferred. If the node rejects the transfer, another node is selected at random. This continues until the probe limit is reached.

Shortest method

  • In this method M distinct nodes are chosen at random and each is polled to determine its load. The process is transferred to the node with the minimum load value, unless that node's workload prohibits it from accepting the process. As an improvement, probing is discontinued as soon as a node with zero load is encountered.

Bidding method

  • In this method nodes contain managers (to send processes) and contractors (to receive processes).
  • Managers broadcast a request for bids, and contractors respond with bids; the manager then selects the best offer.
  • The winning contractor is notified and asked whether it accepts the process for execution. A contractor is never forced to accept a remote process.

Pairing

  • The pairing policy is used to reduce the variance of load between pairs of nodes in the system.
  • Each node asks some randomly chosen node to form a pair with it. If it receives a rejection, it randomly selects another node and tries to pair again.
  • Two nodes that differ greatly in load are temporarily paired with each other, and migration starts; the pair is broken as soon as the migration is over. A node tries to find a partner only if it has at least two processes.

iv. State information exchange policy for Load-balancing algorithms

  • Dynamic policies require frequent exchange of state information among the nodes.
  • These extra messages have two opposite effects: increasing the number of messages yields more accurate scheduling decisions, but it also raises the queuing time of messages. The state information policies that can be used are:

Periodic broadcast

  • Each node broadcasts its state information after every T units of time have elapsed.
  • The problems this causes are heavy network traffic and poor scalability.

Broadcast when state changes

  • This method avoids fruitless message exchange by broadcasting the state only when a process arrives at or departs from a node.
  • A further improvement is to broadcast only when the state switches to another region (e.g., from normal to overloaded).

On-demand exchange

  • In this method a node broadcasts a State-Information-Request message when its state switches from the normal region to either the under-loaded or the overloaded region.
  • On receiving this message, other nodes reply with their own state information to the requesting node.
  • A further improvement is that only those nodes whose state is useful to the requesting node send a reply.

Exchange by polling

  • The polling method overcomes the poor scalability caused by broadcast messages: a node does not need to broadcast its state information to every other node.
  • Instead, a partner node is searched for by polling the other nodes one by one, until a suitable partner is found or the poll limit is reached.

v. Priority assignment policy for Load-balancing algorithms

In a distributed operating system that supports process migration, a priority assignment rule for scheduling the local and remote processes of a node must be planned.

One of the following priority assignment rules may be used:

Selfish priority assignment rule

  • Local processes are given higher priority than remote processes.
  • This rule has the worst response time performance of the three policies, because remote processes are penalized.
  • The best response time performance is achieved for local processes.

Altruistic priority assignment rule

  • Remote processes are given higher priority than local processes in this rule.
  • It has the best response time performance of the three policies.
  • Remote processes incur lower delays than local processes, since remote processes are given preferential treatment over the local workload.

Intermediate priority assignment rule

  • Priority in this rule is decided based on the number of local and remote processes on a node.
  • When the number of local processes is greater than or equal to the number of remote processes, local processes are given higher priority; otherwise remote processes are given higher priority than local processes.
  • The performance of this rule lies between that of the two policies above; its overall response time performance is close to that of the altruistic policy.
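The intermediate rule reduces to a one-line decision, sketched here in Python (illustrative; function and parameter names are assumptions):

```python
def assign_priority(num_local, num_remote):
    """Intermediate priority assignment rule: if local processes are at
    least as numerous as remote ones, local processes get the higher
    priority; otherwise remote processes do."""
    return "local" if num_local >= num_remote else "remote"
```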

vi. Migration limiting policy for Load-balancing algorithms

This policy determines the total number of times a process can migrate.

One of the following two policies may be used-

a. Uncontrolled: A remote process arriving at a node is treated just like a process originating at that node, so a process may be migrated any number of times.

b. Controlled: This avoids the instability of the uncontrolled policy by using a migration count parameter to fix a limit on the number of times a process can migrate.

A long process may be allowed to migrate more times compared to a short process.
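A sketch of the controlled policy in Python (illustrative; the limit values are assumptions, not from the original answer):

```python
def may_migrate(migration_count, limit=1, long_process=False, long_limit=3):
    """Controlled migration-limiting policy: a process may migrate only
    while its migration count is below its limit. A long process is
    granted a higher limit than a short one (values are illustrative)."""
    allowed = long_limit if long_process else limit
    return migration_count < allowed
```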

Explain one technique of logical clock synchronization.

  • Lamport timestamps do not say anything about the relationship between two events just by comparing their time values; in particular, they do not capture causality.
  • The receipt of an article always causally precedes the posting of a reaction. If causal relationships are to be maintained within a group of processes, then the receipt of the reaction to an article should always follow the receipt of that article.
  • If two articles or reactions are independent, their order of delivery should not matter.
  • Causality can be captured using vector timestamps. A vector timestamp VT(a) for an event a has the property that if VT(a) < VT(b) for an event b, then event a is known to causally precede event b.
  • Vector timestamps are constructed by letting each process Pi maintain a vector Vi with the following properties: Vi[i] is the number of events that have occurred at Pi, and if Vi[j] = k then Pi knows that k events have occurred at Pj.
  • The first property is maintained by incrementing Vi[i] at the occurrence of each new event that happens at process Pi.
  • The second property is maintained by piggy-backing vectors along with sent messages.
  • A receiver is thereby informed of the number of events that have occurred at Pi, and also of how many events at other processes had taken place before Pi sent message m.
  • When Pj receives m with timestamp vt, it adjusts its own vector by setting each entry Vj[k] to max{Vj[k], vt[k]}.
  • The adjusted vector now reflects, for each process, the number of events that Pj knows to have occurred there.