Multiple Linear Regression Problem.

Predict the value of Y for subject 6 from the given dataset that contains values for X1, X2, and Y by using a Multiple Regression Model.

| Subject | Y | X1 | X2 |
|---|---|---|---|
| 1 | -3.7 | 3 | 8 |
| 2 | 3.5 | 4 | 5 |
| 3 | 2.5 | 5 | 7 |
| 4 | 11.5 | 6 | 3 |
| 5 | 5.7 | 2 | 1 |
| 6 | ? | 3 | 2 |

The Multiple Linear Regression Model with n independent variables is written as follows:

$$Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_nX_n + u$$

Where,

Y = The variable to be predicted (dependent variable)

X1, ..., Xn = The variables used to predict Y (independent variables)

a = The intercept

b1, ..., bn = The slope (regression coefficient) of each independent variable

u = The regression residual (error term)
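As a minimal sketch of how the model above turns coefficients into a prediction, the Python function below evaluates the right-hand side without the residual u, which is unknown for a new observation. The name `predict` and its argument layout are illustrative assumptions, and the numbers in the call are made up.

```python
def predict(a, b, x):
    """Evaluate a + b[0]*x[0] + ... + b[n-1]*x[n-1] (hypothetical helper)."""
    if len(b) != len(x):
        raise ValueError("need exactly one coefficient per independent variable")
    return a + sum(bi * xi for bi, xi in zip(b, x))


# Made-up numbers, only to show the call: 1 + 2*4 + 3*5
print(predict(1.0, [2.0, 3.0], [4.0, 5.0]))  # 24.0
```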

Formulae

For two independent variables, the intercept (a) and the regression coefficients (b1, b2) are computed with the formulas below.

Intercept (a)

$$a = \overline Y - b_1(\overline X_1) - b_2(\overline X_2)$$

Regression Coefficients (b1, b2)

$$b_1 = \frac {(\sum x_2^2)(\sum x_1y) - (\sum x_1x_2)(\sum x_2y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2}$$

$$b_2 = \frac {(\sum x_1^2)(\sum x_2y) - (\sum x_1x_2)(\sum x_1y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2}$$

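Where the lowercase sums are sums of squares and cross-products of deviations from the means, computed from the raw totals over the N observations: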

$$\sum x_1^2 = \sum X_1X_1 - \frac {(\sum X_1)(\sum X_1)}{N}$$

$$\sum x_2^2 = \sum X_2X_2 - \frac {(\sum X_2)(\sum X_2)}{N}$$

$$\sum x_1y = \sum X_1Y - \frac {(\sum X_1)(\sum Y)}{N}$$

$$\sum x_2y = \sum X_2Y - \frac {(\sum X_2)(\sum Y)}{N}$$

$$\sum x_1x_2 = \sum X_1X_2 - \frac {(\sum X_1)(\sum X_2)}{N}$$

$$\overline Y = \frac {\sum Y}{N}$$

$$\overline X_1 = \frac {\sum X_1}{N}$$

$$\overline X_2 = \frac {\sum X_2}{N}$$
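The formulas above can be collected into one small routine. The following is a sketch rather than a reference implementation: the function name `fit_two_predictors`, the variable names, and the toy data (generated from Y = 1 + 2X1 + 3X2) are assumptions made for illustration.

```python
def fit_two_predictors(x1, x2, y):
    """Return (a, b1, b2) for Y = a + b1*X1 + b2*X2 via the deviation-sum formulas."""
    n = len(y)
    if n == 0 or not (len(x1) == len(x2) == n):
        raise ValueError("x1, x2 and y must be non-empty and of equal length")

    sum_x1, sum_x2, sum_y = sum(x1), sum(x2), sum(y)

    # Deviation sums built from raw totals, as in the formulas above.
    s11 = sum(v * v for v in x1) - sum_x1 * sum_x1 / n
    s22 = sum(v * v for v in x2) - sum_x2 * sum_x2 / n
    s1y = sum(u * v for u, v in zip(x1, y)) - sum_x1 * sum_y / n
    s2y = sum(u * v for u, v in zip(x2, y)) - sum_x2 * sum_y / n
    s12 = sum(u * v for u, v in zip(x1, x2)) - sum_x1 * sum_x2 / n

    denom = s11 * s22 - s12 ** 2
    if denom == 0:
        # A zero denominator means X1 and X2 carry no independent information.
        raise ValueError("X1 and X2 are perfectly collinear (or constant)")

    b1 = (s22 * s1y - s12 * s2y) / denom
    b2 = (s11 * s2y - s12 * s1y) / denom
    a = sum_y / n - b1 * sum_x1 / n - b2 * sum_x2 / n
    return a, b1, b2


# Toy data generated from Y = 1 + 2*X1 + 3*X2, so the fit should recover 1, 2, 3.
a, b1, b2 = fit_two_predictors([0, 1, 0, 1], [0, 0, 1, 1], [1, 3, 4, 6])
print(a, b1, b2)
```

The shared denominator is computed once and checked, because both coefficient formulas break down when it is zero.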

## Step 1

First, calculate all the values required in the above formulae.

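| Subject | Y | X1 | X2 | X1X2 | X1X1 | X2X2 | X1Y | X2Y |
|---|---|---|---|---|---|---|---|---|
| 1 | -3.7 | 3 | 8 | 24 | 9 | 64 | -11.1 | -29.6 |
| 2 | 3.5 | 4 | 5 | 20 | 16 | 25 | 14 | 17.5 |
| 3 | 2.5 | 5 | 7 | 35 | 25 | 49 | 12.5 | 17.5 |
| 4 | 11.5 | 6 | 3 | 18 | 36 | 9 | 69 | 34.5 |
| 5 | 5.7 | 2 | 1 | 2 | 4 | 1 | 11.4 | 5.7 |
| SUM | 19.5 | 20 | 24 | 99 | 90 | 148 | 95.8 | 45.6 |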
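The column totals in the SUM row can be reproduced with a few lines of Python. This is only a sketch: the names `rows` and `totals` are assumptions, and the data are the five complete subjects from the problem statement.

```python
# (Y, X1, X2) for subjects 1-5, copied from the dataset.
rows = [(-3.7, 3, 8), (3.5, 4, 5), (2.5, 5, 7), (11.5, 6, 3), (5.7, 2, 1)]

totals = {
    "Y": sum(y for y, x1, x2 in rows),
    "X1": sum(x1 for y, x1, x2 in rows),
    "X2": sum(x2 for y, x1, x2 in rows),
    "X1X2": sum(x1 * x2 for y, x1, x2 in rows),
    "X1X1": sum(x1 * x1 for y, x1, x2 in rows),
    "X2X2": sum(x2 * x2 for y, x1, x2 in rows),
    "X1Y": sum(x1 * y for y, x1, x2 in rows),
    "X2Y": sum(x2 * y for y, x1, x2 in rows),
}

# Print each total rounded for display; floating-point noise is hidden by :g.
for name, value in totals.items():
    print(f"{name:>4} = {value:g}")
```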

## Step 2

Then substitute these totals into the formulae above to obtain the deviation sums and, from them, the regression coefficients b1 and b2:

$$\sum x_1^2 = \sum X_1X_1 - \frac {(\sum X_1)(\sum X_1)}{N} = 90 - \frac {20 \times 20}{5} = 10$$

$$\sum x_2^2 = \sum X_2X_2 - \frac {(\sum X_2)(\sum X_2)}{N} = 148 - \frac {24 \times 24}{5} = 32.8$$

$$\sum x_1y = \sum X_1Y - \frac {(\sum X_1)(\sum Y)}{N} = 95.8 - \frac {20 \times 19.5}{5} = 17.8$$

$$\sum x_2y = \sum X_2Y - \frac {(\sum X_2)(\sum Y)}{N} = 45.6 - \frac {24 \times 19.5}{5} = -48$$

$$\sum x_1x_2 = \sum X_1X_2 - \frac {(\sum X_1)(\sum X_2)}{N} = 99 - \frac {20 \times 24}{5} = 3$$

$$b_1 = \frac {(\sum x_2^2)(\sum x_1y) - (\sum x_1x_2)(\sum x_2y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2}$$

$$b_1 = \frac {(32.8 \times 17.8) - (3 \times (-48))}{(10 \times 32.8) - (3)^2} = 2.2816$$

$$b_2 = \frac {(\sum x_1^2)(\sum x_2y) - (\sum x_1x_2)(\sum x_1y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2}$$

$$b_2 = \frac {(10 \times (-48)) - (3 \times 17.8)}{(10 \times 32.8) - (3)^2} = -1.672$$
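One way to double-check the arithmetic in this step is to redo it with exact rational numbers, so that no rounding creeps into the intermediate sums. The sketch below uses Python's standard `fractions` module with the totals from the SUM row; the variable names are assumptions.

```python
from fractions import Fraction as F

n = 5
sum_y, sum_x1, sum_x2 = F("19.5"), F(20), F(24)
sum_x1x2, sum_x1x1, sum_x2x2 = F(99), F(90), F(148)
sum_x1y, sum_x2y = F("95.8"), F("45.6")

# Deviation sums, computed exactly.
s11 = sum_x1x1 - sum_x1 * sum_x1 / n   # 10
s22 = sum_x2x2 - sum_x2 * sum_x2 / n   # 32.8
s1y = sum_x1y - sum_x1 * sum_y / n     # 17.8
s2y = sum_x2y - sum_x2 * sum_y / n     # -48
s12 = sum_x1x2 - sum_x1 * sum_x2 / n   # 3

denom = s11 * s22 - s12 ** 2           # 319
b1 = (s22 * s1y - s12 * s2y) / denom
b2 = (s11 * s2y - s12 * s1y) / denom

# Convert to float only for display.
print(f"b1 = {float(b1):.4f}, b2 = {float(b2):.4f}")
```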

## Step 3

Calculate the value of Intercept a

$$a = \overline Y - b_1(\overline X_1) - b_2(\overline X_2) = \frac {19.5}{5} - \frac {2.2816 \times 20}{5} - \frac {(-1.672) \times 24}{5} = 3.9 - 9.1264 + 8.0256 = 2.7992$$

## Step 4

The final Regression Equation or Model looks as follows:

$$Y = 2.7992 + 2.2816X_1 - 1.672X_2$$

Therefore, for the given X1 = 3 and X2 = 2 (subject 6), the value of Y is calculated as follows:

$$Y = 2.7992 + (2.2816 \times 3) - (1.672 \times 2) = 2.7992 + 6.8448 - 3.344$$

$$Y \approx 6.30$$
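As a hedged cross-check of the last two steps, the sketch below keeps b1 and b2 at full precision (taken as the ratios from Step 2), derives the intercept from the means, and evaluates the model for subject 6. It also confirms that the residuals of the five fitted subjects sum to zero, which always holds for a least-squares fit with an intercept. The helper name `fitted` is an assumption. With unrounded coefficients the prediction comes out at about 6.30.

```python
# Coefficients carried at full precision from the Step 2 ratios.
b1 = 727.84 / 319   # (32.8*17.8 - 3*(-48)) / (10*32.8 - 3**2)
b2 = -533.4 / 319   # (10*(-48) - 3*17.8)   / (10*32.8 - 3**2)
a = 19.5 / 5 - b1 * 20 / 5 - b2 * 24 / 5


def fitted(x1, x2):
    """Model value a + b1*x1 + b2*x2 (hypothetical helper)."""
    return a + b1 * x1 + b2 * x2


# (Y, X1, X2) for subjects 1-5.
rows = [(-3.7, 3, 8), (3.5, 4, 5), (2.5, 5, 7), (11.5, 6, 3), (5.7, 2, 1)]
residuals = [y - fitted(x1, x2) for y, x1, x2 in rows]
print(f"sum of residuals = {sum(residuals):.2e}")  # zero up to rounding error

y6 = fitted(3, 2)
print(f"a = {a:.4f}, predicted Y for subject 6 = {y6:.2f}")
```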