(i) Linear prediction is the problem of predicting a future value of a random process from observations of its past values. Consider one-step prediction, in which the next sample is predicted as a weighted linear combination of the past values x(n - 1), x(n - 2), ..., x(n - p). The linearly predicted value of x(n) is given by:

$$ \hat{x}(n) = - \sum_{k=1}^{p} a_p(k) x(n-k) $$

where the $a_p(k)$ are the prediction coefficients of the $p^{th}$-order predictor.
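As a minimal sketch of the one-step predictor, the snippet below evaluates the weighted sum for a given set of coefficients. The function name `predict_next`, the list layout (`a[k-1]` holds $a_p(k)$), and the choice to treat samples before the start of the record as zero are illustrative assumptions, not part of the original notes.

```python
def predict_next(x, a, n):
    """One-step linear prediction of x(n) from x(n-1), ..., x(n-p).

    a[k-1] holds a_p(k) for k = 1..p (hypothetical layout).
    Samples before the start of x are treated as zero (an assumption).
    """
    p = len(a)
    acc = 0.0
    for k in range(1, p + 1):
        if n - k >= 0:
            acc += a[k - 1] * x[n - k]
    # Note the leading minus sign in the definition of the predictor
    return -acc


# Second-order example: a_2(1) = -2, a_2(2) = 1 gives
# x_hat(n) = 2 x(n-1) - x(n-2), i.e. straight-line extrapolation.
x = [1.0, 2.0, 3.0]
a = [-2.0, 1.0]
x_hat = predict_next(x, a, 3)  # predicts the sample after 1, 2, 3
```

With these coefficients the predictor continues the ramp, so `x_hat` is 4.0. In practice the coefficients would be chosen to minimize the mean-square prediction error rather than fixed by hand.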

(ii) For $p^{th}$-order prediction, the difference between x(n) and $\hat{x}(n)$ is called the forward prediction error, denoted $f_p(n)$:

$$ \begin{aligned} f_p(n) &= x(n) - \hat{x}(n) \\ &= x(n) + \sum_{k=1}^{p} a_p(k) x(n-k) \\ &= \sum_{k=0}^{p} a_p(k) x(n-k), \qquad a_p(0) = 1 \end{aligned} $$
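The two forms of the forward prediction error can be checked numerically. The sketch below computes $f_p(n)$ once as $x(n) - \hat{x}(n)$ and once as the single sum starting at $k = 0$ with $a_p(0) = 1$. The helper name `error_two_ways` and the zero-padding of samples before the start of the record are assumptions made for illustration.

```python
def error_two_ways(x, a_pred, n):
    """Return f_p(n) computed as x(n) - x_hat(n) and as the k = 0..p sum.

    a_pred[k-1] holds a_p(k) for k = 1..p (hypothetical layout).
    """
    p = len(a_pred)
    # Form 1: subtract the predicted value from the actual sample
    x_hat = -sum(a_pred[k - 1] * x[n - k] for k in range(1, p + 1) if n - k >= 0)
    direct = x[n] - x_hat
    # Form 2: single sum over k = 0..p with a_p(0) = 1 prepended
    a_full = [1.0] + list(a_pred)
    summed = sum(a_full[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
    return direct, summed


x = [1.0, 2.0, 3.0, 5.0]
a_pred = [-2.0, 1.0]
direct, summed = error_two_ways(x, a_pred, 3)
```

Here the predictor expects 4.0 at n = 3 but the sample is 5.0, so both forms give an error of 1.0.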

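(iii) Linear prediction can be viewed as linear filtering in which the predictor is embedded in the filter: the predictor output $\hat{x}(n)$ is subtracted from x(n). The resulting system is called the prediction-error filter, with input sequence x(n) and output sequence $ f_p(n) $. Because the predictor uses only a finite number of past input samples, the prediction-error filter is FIR. Placing the same predictor in a feedback loop instead gives the all-pole inverse filter, which reconstructs x(n) from $ f_p(n) $.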
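The relationship between the FIR prediction-error filter and the all-pole structure can be sketched as a round trip: filter x(n) to obtain $f_p(n)$, then feed $f_p(n)$ through the recursion $x(n) = f_p(n) - \sum_{k=1}^{p} a_p(k) x(n-k)$. The function names, the coefficient values, and the zero initial conditions below are assumptions chosen for illustration.

```python
def prediction_error(x, a):
    """FIR analysis: f(n) = sum_{k=0}^{p} a[k] * x(n-k), with a[0] == 1."""
    return [
        sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
        for n in range(len(x))
    ]


def synthesize(f, a):
    """All-pole synthesis: x(n) = f(n) - sum_{k=1}^{p} a[k] * x(n-k)."""
    x = []
    for n in range(len(f)):
        # Feedback term uses previously reconstructed output samples
        feedback = sum(a[k] * x[n - k] for k in range(1, len(a)) if n - k >= 0)
        x.append(f[n] - feedback)
    return x


a = [1.0, -2.0, 1.0]          # a_2(0) = 1, a_2(1) = -2, a_2(2) = 1
x = [1.0, 2.0, 3.0, 5.0]
f = prediction_error(x, a)    # feedforward (FIR) path
x_rec = synthesize(f, a)      # feedback (all-pole) path
```

With zero initial conditions in both directions, `x_rec` matches `x` exactly for this example.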

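(iv) An FIR filter is always stable, because all of its poles lie at the origin. It has linear phase only when its coefficients are symmetric or antisymmetric, which is not generally true of a prediction-error filter.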

(v) The prediction-error equation above describes an FIR filter with system function: $$ A_p(Z) = \sum_{k=0}^{p}a_p(k)Z^{-k} $$

where, by definition, $ a_p(0) = 1 $.
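To make the system function concrete, the following sketch evaluates $A_p(Z)$ at arbitrary points and on the unit circle. The function names and the first-order coefficient set are hypothetical and chosen only so the values are easy to verify by hand.

```python
import cmath


def system_function(a, z):
    """Evaluate A_p(z) = sum_{k=0}^{p} a[k] * z**(-k), with a[0] == 1."""
    return sum(a[k] * z ** (-k) for k in range(len(a)))


def frequency_response(a, omega):
    """Evaluate A_p on the unit circle, z = exp(j * omega)."""
    return system_function(a, cmath.exp(1j * omega))


a = [1.0, -0.5]                         # A_1(z) = 1 - 0.5 z^{-1}
dc_gain = system_function(a, 1.0)       # value at z = 1
zero_check = system_function(a, 0.5)    # z = 0.5 is the zero of A_1(z)
nyquist = frequency_response(a, cmath.pi)
```

For this example the zero at z = 0.5 lies inside the unit circle, and the magnitude response rises from 0.5 at $\omega = 0$ to about 1.5 at $\omega = \pi$.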

(vi)