**1 Answer**

The factors that improve the convergence of the Error Back Propagation Algorithm (EBPA) are called learning factors. They are as follows:

- Initial weights
- Steepness of activation function
- Learning constant
- Momentum
- Network architecture
- Necessary number of hidden neurons.

**1. Initial weights:**

- The weights of the network to be trained are typically initialized to small random values.
- The initialization strongly affects the ultimate solution.
- If all weights start with equal values, and if the solution requires that unequal weights be developed, the network may not train properly.
- Unless the network is disturbed by random factors or by the random character of the input patterns during training, the representation may continuously result in symmetric weights.
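As a minimal sketch of this point (the layer sizes and the $\pm 0.5$ range are illustrative assumptions, not prescribed by the algorithm), small random initialization might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: I input nodes, J hidden neurons, K outputs.
I, J, K = 4, 3, 2

# Small random initial weights break the symmetry that equal
# initial weights would otherwise preserve throughout training.
W_hidden = rng.uniform(-0.5, 0.5, size=(J, I))
W_output = rng.uniform(-0.5, 0.5, size=(K, J))

# By contrast, an all-equal initialization (e.g. zeros) gives every
# hidden neuron an identical gradient, so unequal weights can never
# develop and the network fails to train properly.
W_bad = np.zeros((J, I))
```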

**2. Steepness of activation function:**

- The neuron's continuous activation function $f(net,\lambda)$ is characterized by its steepness factor $\lambda$.
- The derivative $f'(net)$ of the activation function also serves as a multiplying factor in building the components of the error signal vectors $\delta_o$ and $\delta_y$.
- Thus both the choice and the shape of the activation function strongly affect the speed of network learning.
- The derivative of the activation function can be written as follows:
- $f'(net)=\dfrac{2\lambda \exp(-\lambda \, net)}{\left(1+\exp(-\lambda \, net)\right)^2}$
- It reaches a maximum value of $\frac{\lambda}{2}$ at $net=0$.

**3. Learning constant:**

- The effectiveness and convergence of the error back propagation learning algorithm depend significantly on the value of the learning constant.
- In general, however, the optimum value of the learning constant depends on the problem being solved, and there is no single learning constant value suitable for different training cases.
- When broad minima yield small gradient values, a larger learning constant will result in more rapid convergence.
- However, for problems with steep and narrow minima, a small learning constant must be chosen to avoid overshooting the solution.
- This leads to the conclusion that the learning constant should be chosen experimentally for each problem.

**4. Momentum method:**

- The purpose of the momentum method is to accelerate the convergence of the error back propagation learning algorithm.
- The method involves supplementing the current weight adjustment with a fraction of the most recent weight adjustment. This is done according to the formula:
- $\Delta w(t) = -\eta \, \nabla E(t) + \alpha \, \Delta w(t-1)$
- where $t$ and $t-1$ indicate the current and the most recent training step, respectively, $\eta$ is the learning constant, and $\alpha$ is a user-selected positive momentum constant, typically between 0.1 and 0.9.
- The second term of the equation is called the momentum term. The momentum term helps to speed up convergence and also to bring the network out of local minima.
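A minimal sketch of this update rule on a toy one-dimensional error surface (the quadratic error, learning constant, and momentum value here are illustrative assumptions, not values from the text):

```python
import numpy as np

def momentum_update(w, grad_E, prev_delta, eta=0.1, alpha=0.5):
    """One weight adjustment with a momentum term:

        delta_w(t) = -eta * grad_E(t) + alpha * delta_w(t-1)

    eta is the learning constant; alpha the momentum constant.
    Returns the new weights and the adjustment just applied.
    """
    delta_w = -eta * grad_E + alpha * prev_delta
    return w + delta_w, delta_w

# Toy error surface E(w) = w^2, whose gradient is 2w.
w = np.array([1.0])
prev = np.zeros_like(w)
for _ in range(50):
    w, prev = momentum_update(w, 2.0 * w, prev)

# w has been driven close to the minimum at w = 0; the momentum
# term reuses a fraction of the previous step to speed this up.
```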

**5. Network architecture:**

One of the most important attributes of a layered neural network design is the choice of architecture (the network sizes, expressed through $I$, $J$, and $K$).

**Number of input nodes:** This is simply determined by the dimension (size) of the input vector to be classified. The input vector size usually corresponds to the total number of distinct features of the input pattern.

**Number of output neurons:**

For a network functioning as a classifier, $K$ can be equal to the number of classes.

The network can be trained to indicate the class number as equal to the active output number. The number of output neurons can be reduced by using a binary-coded class number: for example, a four-class classifier can be trained using only two output neurons, with outputs 00, 01, 10, and 11 for classes 0, 1, 2, and 3, respectively.
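A small sketch of this binary coding scheme (the helper name and the four-class case are just the example above):

```python
import math

def binary_code(class_index, num_classes):
    """Encode a class index as a list of 0/1 target outputs.

    Only ceil(log2(num_classes)) output neurons are needed,
    instead of one neuron per class.
    """
    bits = max(1, math.ceil(math.log2(num_classes)))
    return [(class_index >> i) & 1 for i in reversed(range(bits))]

# Four classes encoded with only two output neurons: 00, 01, 10, 11.
codes = [binary_code(c, 4) for c in range(4)]
# codes == [[0, 0], [0, 1], [1, 0], [1, 1]]
```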

**6. Necessary number of hidden neurons:**

The problem of choosing the size of the hidden layer is under intensive study, with no conclusive answer available.

One formula can be used to find how many hidden-layer neurons $J$ need to be used to achieve classification into $M$ classes in $n$-dimensional pattern space.
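The answer does not state the formula itself; one commonly cited counting result (due to Mirchandani and Cao) bounds the number of separable regions $M$ that $J$ hyperplanes, one per hidden neuron, can create in $n$-dimensional space: $M(J,n) = \sum_{k=0}^{n} \binom{J}{k}$. A sketch under that assumption:

```python
from math import comb

def max_regions(J, n):
    """Maximum number of regions M that J hyperplanes (hidden
    neurons) can carve out of n-dimensional pattern space:
        M(J, n) = sum over k = 0..n of C(J, k),
    where C(J, k) = 0 for k > J.  This is an upper bound on the
    classes that are separable, not a training guarantee.
    """
    return sum(comb(J, k) for k in range(n + 1))

def hidden_neurons_needed(M, n):
    """Smallest J whose region count reaches at least M classes."""
    J = 0
    while max_regions(J, n) < M:
        J += 1
    return J

# Example: in 2-dimensional pattern space, 3 hidden neurons can
# separate at most 1 + 3 + 3 = 7 regions.
```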