Explain perceptron convergence theorem



Perceptron Convergence Theorem:

In the classification of linearly separable patterns belonging to only two classes, the training task for the classifier is to find a weight vector $w$ such that

$w^Tx>0 \hspace{0.4cm} \text{for each } x\in X_1$

$w^Tx<0 \hspace{0.4cm} \text{for each } x\in X_2$

Completion of training with the fixed correction training rule for any initial weight vector and any correction increment constant leads to the following weights:

$w^*=w^{k_0}=w^{k_0+1}=w^{k_0+2}=\cdots$

where $w^*$ is a solution vector of the inequalities above.

The integer $k_0 \geq 0$ is the training step number starting at which no more misclassifications occur, and thus no further weight adjustments take place.

This result is known as the "Perceptron Convergence Theorem".

Perceptron Convergence theorem states that a classifier for two linearly separable classes of patterns is always trainable in a finite number of training steps.

In summary, the training of a single discrete-perceptron two-class classifier requires a change of weights if and only if a misclassification occurs.

If the reason for misclassification is $w^Tx<0$, then all weights are increased in proportion to $x_i$. If $w^Tx>0$, then all weights are decreased in proportion to $x_i$.
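A short worked step may help show why this correction moves the weights in the right direction. The symbol $c>0$ for the correction increment constant and the primed vector $w'$ for the updated weights are notation assumed here for illustration. For a pattern $x\in X_1$ that is misclassified because $w^Tx<0$, the update adds a multiple of $x$:

$w'=w+cx \hspace{0.4cm} \Rightarrow \hspace{0.4cm} w'^Tx=w^Tx+c\|x\|^2>w^Tx$

For a pattern $x\in X_2$ that is misclassified because $w^Tx>0$, the update subtracts a multiple of $x$:

$w'=w-cx \hspace{0.4cm} \Rightarrow \hspace{0.4cm} w'^Tx=w^Tx-c\|x\|^2<w^Tx$

In both cases the response $w^Tx$ for the offending pattern shifts toward the correct sign, although a single step does not necessarily classify it correctly.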

Summary of the Perceptron Convergence Algorithm:

Variables and Parameters:

$x(n)=(m+1)$-by-1 input vector

$=[+1,x_1(n),x_2(n),\ldots,x_m(n)]^T$

$w(n)=(m+1)$-by-1 weight vector

$=[b(n),w_1(n),w_2(n),\ldots,w_m(n)]^T$

$b(n)=$ bias

$y(n)=$ actual response

$d(n)=$ desired response

$\eta=$ learning rate parameter, a positive constant less than unity

1. Initialization: Set $w(0)=0$, then perform the following computations for time steps $n=1,2,\ldots$

2. Activation: At time step n, activate the perceptron by applying input vector x(n) and desired response d(n).

3. Computation of actual response: Compute the actual response of the perceptron:

$y(n)=\text{sgn}[w^T(n)x(n)]$

4. Adaptation of weight vector: Update the weight vector of the perceptron:

$w(n+1)=w(n)+\eta [d(n)-y(n)]x(n)$

5. Continuation: Increment time step n by 1 and go back to step 2.
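The five steps above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation. The function names `sgn`, `train_perceptron` and `predict`, the `max_epochs` cap, the choice $\eta=0.5$, the convention sgn(0) = -1, and the toy AND-style data set are all assumptions made for the example. The loop is assumed to return to the activation step (not to initialization) so that the learned weights are kept between steps.

```python
def sgn(v):
    # Signum: +1 for positive input, -1 otherwise.
    # Mapping sgn(0) to -1 is an assumption; the text does not fix the tie-break.
    return 1 if v > 0 else -1


def train_perceptron(samples, labels, eta=0.5, max_epochs=100):
    """Train on m-dimensional samples with desired responses d in {+1, -1}.

    Returns the weight vector [b, w_1, ..., w_m] and the number of passes used.
    """
    m = len(samples[0])
    w = [0.0] * (m + 1)                      # Step 1: w(0) = 0, w[0] is the bias b(n)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x_raw, d in zip(samples, labels):
            x = [1.0] + list(x_raw)          # Step 2: augmented input [+1, x_1, ..., x_m]
            y = sgn(sum(wi * xi for wi, xi in zip(w, x)))   # Step 3: actual response
            if y != d:
                errors += 1
            # Step 4: w(n+1) = w(n) + eta * [d(n) - y(n)] * x(n)
            # The bracket is zero when y == d, so weights change only on a mistake.
            w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]
        if errors == 0:                      # a full pass with no misclassification
            return w, epoch
    return w, max_epochs                     # cap reached (e.g. data not separable)


def predict(w, x_raw):
    # Apply the trained weights to a new, non-augmented input.
    x = [1.0] + list(x_raw)
    return sgn(sum(wi * xi for wi, xi in zip(w, x)))


# Toy linearly separable problem (logical AND with +1 / -1 targets).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]

w, epochs = train_perceptron(samples, labels)
print("weights:", w, "passes:", epochs)
print("predictions:", [predict(w, x) for x in samples])
```

Because the data are linearly separable, the theorem guarantees that the loop exits through the `errors == 0` branch after finitely many corrections. The `max_epochs` cap is only a safeguard for data that are not separable, where the rule would otherwise keep cycling.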
