Friday, November 18, 2011

Calculate Regression Coefficient

Linear regression trend line


One of the most basic tools for engineering or scientific analysis is linear regression. This technique starts with a data set in two variables. The independent variable is usually called "x" and the dependent variable is usually called "y." The goal of the technique is to identify the line, y = mx + b, that best approximates the data set, where m is the slope and b is the y-intercept. This trend line can show, graphically and numerically, the relationship between the dependent and independent variables. From this regression analysis, a value for correlation can also be calculated.


Instructions


1. Identify and separate the x and y values of your data points. If you are using a spreadsheet, enter them into adjacent columns. There should be the same number of x and y values. If not, the calculation will be inaccurate, or the spreadsheet function will return an error.


x = (6, 5, 11, 7, 5, 4, 4)


y = (2, 3, 9, 1, 8, 7, 5)


2. Calculate the average of the x values and the average of the y values by dividing the sum of the values in each set by the number of values in that set. These averages will be referred to as "x_avg" and "y_avg."


x_avg = (6 + 5 + 11 + 7 + 5 + 4 + 4) / 7 = 6


y_avg = (2 + 3 + 9 + 1 + 8 + 7 + 5) / 7 = 5
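Steps 1 and 2 can be sketched in Python as follows. This is only an illustration using the sample data above; the length check is an added safeguard and not part of the original instructions.

```python
# Sample data from step 1
x = [6, 5, 11, 7, 5, 4, 4]
y = [2, 3, 9, 1, 8, 7, 5]

# Added safeguard: both sets must contain the same number of values
if len(x) != len(y):
    raise ValueError("x and y must contain the same number of values")

# Step 2: average = sum of the values / number of values
x_avg = sum(x) / len(x)
y_avg = sum(y) / len(y)

print(x_avg, y_avg)  # prints: 6.0 5.0
```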


3. Create two new data sets by subtracting the x_avg value from each x value and the y_avg value from each y value.


x1 = (6 - 6, 5 - 6, 11 - 6, 7 - 6 ... )


x1 = (0, -1, 5, 1, -1, -2, -2)


y1 = (2 - 5, 3 - 5, 9 - 5, 1 - 5, ... )


y1 = (-3, -2, 4, -4, 3, 2, 0)
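A minimal Python sketch of step 3, assuming the same sample data, might look like this. The names x1 and y1 follow the text. Each list of deviations should sum to zero, which makes a quick sanity check.

```python
x = [6, 5, 11, 7, 5, 4, 4]
y = [2, 3, 9, 1, 8, 7, 5]

x_avg = sum(x) / len(x)
y_avg = sum(y) / len(y)

# Step 3: subtract the average from every value in each set
x1 = [xi - x_avg for xi in x]
y1 = [yi - y_avg for yi in y]

print(x1)  # prints: [0.0, -1.0, 5.0, 1.0, -1.0, -2.0, -2.0]
print(y1)  # prints: [-3.0, -2.0, 4.0, -4.0, 3.0, 2.0, 0.0]
```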


4. Multiply each x1 value by the corresponding y1 value, in order.


x1y1 = (0 * -3, -1 * -2, 5 * 4, ... )


x1y1 = (0, 2, 20, -4, -3, -4, 0)


5. Square each x1 value.


x1^2 = (0^2, (-1)^2, 5^2, ... )


x1^2 = (0, 1, 25, 1, 1, 4, 4)


6. Calculate the sums of the x1y1 values and x1^2 values.


sum_x1y1 = 0 + 2 + 20 - 4 - 3 - 4 + 0 = 11


sum_x1^2 = 0 + 1 + 25 + 1 + 1 + 4 + 4 = 36
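Steps 4 through 6 could be written in Python roughly as below. The deviations are entered directly from step 3, and the variable name sum_x1_sq stands in for "sum_x1^2" because "^" is not valid in a Python identifier.

```python
# Deviations from step 3
x1 = [0, -1, 5, 1, -1, -2, -2]
y1 = [-3, -2, 4, -4, 3, 2, 0]

# Step 4: multiply each x1 value by the corresponding y1 value
x1y1 = [a * b for a, b in zip(x1, y1)]

# Step 5: square each x1 value
x1_sq = [a ** 2 for a in x1]

# Step 6: sum both lists
sum_x1y1 = sum(x1y1)
sum_x1_sq = sum(x1_sq)

print(x1y1)                 # prints: [0, 2, 20, -4, -3, -4, 0]
print(x1_sq)                # prints: [0, 1, 25, 1, 1, 4, 4]
print(sum_x1y1, sum_x1_sq)  # prints: 11 36
```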


7. Divide "sum_x1y1" by "sum_x1^2" to get the regression coefficient, which is the slope m of the trend line y = mx + b.


sum_x1y1 / sum_x1^2 = 11 / 36 = 0.306
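All seven steps can be combined into one small function. The sketch below is one possible arrangement and not a definitive implementation. The function name regression_line is hypothetical, the two error checks are added safeguards, and the intercept calculation (b = y_avg - m * x_avg) is an extra step that the instructions above do not cover.

```python
def regression_line(x, y):
    """Return (m, b) for the trend line y = m*x + b (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")

    # Step 2: averages
    x_avg = sum(x) / len(x)
    y_avg = sum(y) / len(y)

    # Steps 3 to 6: deviations, products, squares and their sums
    sum_x1y1 = sum((xi - x_avg) * (yi - y_avg) for xi, yi in zip(x, y))
    sum_x1_sq = sum((xi - x_avg) ** 2 for xi in x)

    # Added safeguard: identical x values would mean dividing by zero
    if sum_x1_sq == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    # Step 7: regression coefficient (slope)
    m = sum_x1y1 / sum_x1_sq

    # Extra step, not in the instructions above: the intercept
    b = y_avg - m * x_avg
    return m, b


x = [6, 5, 11, 7, 5, 4, 4]
y = [2, 3, 9, 1, 8, 7, 5]

m, b = regression_line(x, y)
print(round(m, 3), round(b, 3))  # prints: 0.306 3.167
```

With the sample data, the slope matches the 0.306 calculated by hand in step 7.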






