In this section we discuss a special type of regression called **simple linear regression**. In this special case of regression analysis the relationship between the response variable \(y\) and the predictor variable \(x\) is given in the form of a **linear** equation

\[y= a + bx\text{,}\]

where \(a\) and \(b\) are constants. The number \(a\) is called the **intercept** and defines the point where the regression line intersects the \(y\)-axis (\(x=0\)). The number \(b\) is called the **regression coefficient** and measures the slope of the **regression line**; thus, \(b\) indicates by how much the \(y\)-value changes when the \(x\)-value increases by 1 unit. The adjective **simple** refers to the fact that the outcome variable is related to a single predictor. The model is a **deterministic model**, as it gives an exact relationship between \(x\) and \(y\).

Let us consider a simple example. Given a population of \(n = 3\) points with Cartesian coordinates \((x_i,y_i)\) of \((1,6)\), \((2,8)\) and \((3,10)\). These points fall on a straight line and can thus be described by a linear equation model of the form \(y= a + bx\), with intercept \(a=4\) and slope \(b=2\).
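A minimal sketch in Python confirms that all three points lie exactly on the line \(y = 4 + 2x\), so no error term is needed:

```python
# Deterministic model y = a + b*x for the three example points.
points = [(1, 6), (2, 8), (3, 10)]
a, b = 4, 2  # intercept and slope from the text

# Every point satisfies the linear equation exactly.
on_line = all(y == a + b * x for x, y in points)
print(on_line)  # True
```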

In many cases, however, the relationship between two variables \(x\) and \(y\) is not exact. This is due to the fact that the response variable \(y\) is affected by other unknown and/or random processes that are not fully captured by the predictor variable \(x\). In such a case the data points do not line up on a straight line; however, the data may still follow an underlying linear relationship. In order to take these unknowns into consideration, a **random error term**, denoted by \(\epsilon\), is added to the linear model equation, resulting in a **probabilistic model**, in contrast to the deterministic model from above.

\[y = a + b x + \epsilon\]

where the error term \(\epsilon\) is assumed to consist of independent, normally distributed values, \(\epsilon \sim N(0, \sigma^2)\).
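The probabilistic model can be simulated as follows; note that the parameter values \(a=4\), \(b=2\), \(\sigma=1\) and the \(x\)-grid are illustrative choices, not taken from the text:

```python
import numpy as np

# Probabilistic model: y = a + b*x + eps, with eps ~ N(0, sigma^2).
rng = np.random.default_rng(0)
a, b, sigma = 4.0, 2.0, 1.0  # illustrative parameter values
x = np.linspace(0.0, 10.0, 50)
eps = rng.normal(loc=0.0, scale=sigma, size=x.size)  # random error term
y = a + b * x + eps  # observations scatter around the underlying line
```

Because of the error term, the simulated points no longer fall exactly on the line \(y = a + bx\).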

In linear regression modelling the following assumptions are made about the model (Mann 2012).

- The random error term \(\epsilon\) has a mean equal to zero for each \(x\).
- The errors associated with different observations are independent.
- For any given \(x\), the distribution of errors is normal.
- The distribution of errors for each \(x\) has the same (constant) standard deviation, which is denoted by \(\sigma_{\epsilon}\).

Let us consider another example. This time we take a random sample of size \(n = 8\) from a population. In order to emphasize that the values of the intercept and slope are calculated from sample data, \(a\) and \(b\) are denoted by \(\beta_0\) and \(\beta_1\), respectively. In addition, the error term \(\epsilon\) is denoted by \(e\). Thus, \(\beta_0\), \(\beta_1\) and \(e\) are estimates, based on sample data, of the population parameters \(a\), \(b\) and \(\epsilon\).

\[\hat y = \beta_0 + \beta_1 x \text{,}\]

where \(\hat y\) is the **estimated or predicted value of \(y\)** for any given value of \(x\).
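As a sketch of how \(\beta_0\) and \(\beta_1\) could be obtained in practice, the following fits a straight line to a sample of \(n = 8\) points by ordinary least squares (`np.polyfit`); the data values are made up for illustration and do not appear in the text:

```python
import numpy as np

# A made-up sample of n = 8 (x, y) pairs that roughly follow y = 4 + 2x.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([6.1, 8.3, 9.6, 12.2, 13.8, 16.4, 17.9, 20.1])

# Least-squares fit of a degree-1 polynomial: returns (slope, intercept).
beta1, beta0 = np.polyfit(x, y, deg=1)
y_hat = beta0 + beta1 * x  # predicted value for each x
```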

The error \(e_i\) for a particular pair of values \((x_i,y_i)\), also called the **residual**, is computed as the difference between the observed value \(y_i\) and the predicted value \(\hat y_i\).

\[e_i = y_i - \hat y_i\]

Depending on the data, the residual \(e_i\) is negative if \(y_i\) plots below the regression line and positive if \(y_i\) plots above the regression line.
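With a small illustrative data set (the values are invented, not from the text), the residuals can be computed directly from their definition; their signs show which points lie above or below the fitted line:

```python
import numpy as np

# Illustrative data: four points that do not fall exactly on one line.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([6.5, 7.6, 10.4, 11.5])

# Least-squares fit, then residuals e_i = y_i - y_hat_i.
beta1, beta0 = np.polyfit(x, y, deg=1)
y_hat = beta0 + beta1 * x
e = y - y_hat  # positive: point above the line; negative: point below
```

For a least-squares fit the residuals always sum to (numerically) zero.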