Explain perceptron learning rule convergence theorem


written 5.1 years ago by

teamques10 ★ 70k

Perceptron Convergence Theorem:

In the classification of linearly separable patterns belonging to two classes only, the training task for the classifier is to find a weight vector $w$ such that:

$w^tx>0$ for each $x\in X_1$

$w^tx<0$ for each $x\in X_2$

Completion of training with the fixed correction training rule for any initial weight vector and any correction increment constant leads to the following weights:

$w^*=w^{k_0}=w^{k_0+1}=w^{k_0+2}=\cdots$

with $w^*$ as the solution vector for the inequalities above.

Integer $k_0$ is the training step number starting at which no more misclassification occurs, and thus no weight adjustments take place for $k\geq k_0$ (with $k_0\geq 0$).

This theorem is called the "Perceptron Convergence Theorem".

The Perceptron Convergence Theorem states that a classifier for two linearly separable classes of patterns is always trainable in a finite number of training steps.

In summary, the training of a single discrete perceptron two-class classifier requires a change of weights if and only if a misclassification occurs.
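
A sketch of why the number of corrections is finite, under assumptions that are not part of the statement above: training starts from $w^0=0$, the correction increment is 1, and every pattern from $X_2$ has been negated so that a solution $w^*$ satisfies $w^{*t}x>0$ for all training patterns. Let $\alpha=\min_x w^{*t}x$ and $\beta=\max_x \|x\|^2$ over the training patterns. After $k$ corrections:

$w^{*t}w^k\geq k\alpha$ (each correction adds a pattern $x$ with $w^{*t}x\geq\alpha$)

$\|w^k\|^2\leq k\beta$ (a correction is made only when $w^tx\leq 0$, so the squared norm grows by at most $\|x\|^2$)

Combining these with the Cauchy-Schwarz inequality $(w^{*t}w^k)^2\leq\|w^*\|^2\|w^k\|^2$ gives

$k\leq \beta\|w^*\|^2/\alpha^2$

so under these assumptions the number of corrections cannot grow without limit.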

If the reason for misclassification is $w^tx<0$, then all weights are increased in proportion to $x_i$. If $w^tx>0$, then all weights are decreased in proportion to $x_i$.
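
The correction described above can be sketched as a single update step. This is a minimal illustration, not a reference implementation: the function names `sgn` and `correct`, the default `eta=0.5`, the encoding of the desired response as +1 or -1, and the choice to treat a net input of exactly 0 as +1 are all assumptions made for the example.

```python
def sgn(v):
    """Sign function; a net input of exactly 0 is treated as +1 (assumed convention)."""
    return 1 if v >= 0 else -1


def correct(w, x, d, eta=0.5):
    """Return the weights after one correction step on pattern x.

    w   -- current weights (list of floats)
    x   -- input pattern (list of floats, same length as w)
    d   -- desired response, +1 or -1 (assumed encoding)
    eta -- correction increment (hypothetical default)
    """
    net = sum(wi * xi for wi, xi in zip(w, x))
    y = sgn(net)
    if y == d:
        # Correctly classified: no weight change.
        return list(w)
    # Misclassified: d - y is +2 when the net input was too low and -2 when it
    # was too high, so each weight moves in proportion to its own input x_i.
    return [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]


# Net input is -1 but the desired response is +1, so the weights are increased.
print(correct([0.0, 1.0, -1.0], [1.0, 2.0, 3.0], 1))   # [1.0, 3.0, 2.0]
```

With `eta=0.5` the factor `eta * (d - y)` is exactly +1 or -1, so a correction simply adds or subtracts the pattern.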

Summary of the Perceptron Convergence Algorithm:

Variables and Parameters:

$x(n)=(m+1)$ by 1 input vector

$=[+1,x_1(n),x_2(n),\ldots,x_m(n)]^T$

$w(n)=(m+1)$ by 1 weight vector

$=[b(n),w_1(n),w_2(n),\ldots,w_m(n)]^T$

$b(n)$= bias

$y(n)=$ actual response

$d(n)=$ desired response ($+1$ if $x(n)\in X_1$, $-1$ if $x(n)\in X_2$)

$\eta=$ learning rate parameter, a positive constant less than unity

Initialization: Set $w(0)=0$, then perform the following computations for time steps $n=1,2,\ldots$

Activation: At time step $n$, activate the perceptron by applying input vector $x(n)$ and desired response $d(n)$.

Computation of actual response: Compute the actual response of the perceptron:

$y(n)=\text{sgn}[w^T(n)x(n)]$

Adaptation of weight vector: Update the weight vector of the perceptron:

$w(n+1)=w(n)+\eta [d(n)-y(n)]x(n)$

Continuation: Increment time step $n$ by 1 and go back to the Activation step.
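
The steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the function name `train_perceptron`, the `max_epochs` cap, the stopping test (one full pass with no weight change), starting from zero weights, and treating a net input of exactly 0 as +1 are choices made for the example.

```python
def sgn(v):
    """Sign function; a net input of exactly 0 is treated as +1 (assumed convention)."""
    return 1 if v >= 0 else -1


def train_perceptron(samples, eta=0.5, max_epochs=100):
    """Train a single perceptron on (features, d) pairs, with d in {+1, -1}.

    Returns (w, converged), where w = [b, w_1, ..., w_m].
    max_epochs is a hypothetical safety cap for data that is not separable.
    """
    m = len(samples[0][0])
    w = [0.0] * (m + 1)                        # Initialization (zero start assumed)
    for _ in range(max_epochs):
        changed = False
        for features, d in samples:
            x = [1.0] + list(features)         # Activation: x(n) = [+1, x_1, ..., x_m]
            net = sum(wi * xi for wi, xi in zip(w, x))
            y = sgn(net)                       # Computation of actual response
            if y != d:
                # Adaptation: w(n+1) = w(n) + eta * [d(n) - y(n)] * x(n)
                w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
                changed = True
        if not changed:                        # a full pass with no misclassification
            return w, True
    return w, False


# Logical AND is linearly separable, so training should stop after a few passes.
and_samples = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights, converged = train_perceptron(and_samples)
print(weights, converged)
```

Once a full pass produces no correction, the weights stay fixed on every later pass, which is the behaviour the convergence theorem describes for separable data.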
