Polynomial regression is a special case of linear regression in which the relationship between the predictor variable x and the response variable y is modeled by a kth-degree polynomial in x. In other words, the model includes second-order and higher powers of the variable alongside the original linear term. Including these higher powers makes the relation between y and x nonlinear. The model is nevertheless a linear model, because the expected response is a linear function of the coefficients (βi). The model equation can be written as
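$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_k x^k + \epsilon,$$
where $\epsilon$ is the error term. The fitted value $\hat{y}$ is the same polynomial evaluated with the estimated coefficients, without the error term.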
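To make the linearity in the coefficients concrete, consider the quadratic case k = 2 as a small worked illustration. The auxiliary variables z_1 and z_2 are notation introduced here for illustration only.

```latex
\begin{aligned}
z_1 &= x, \qquad z_2 = x^2, \\
E[y \mid x] &= \beta_0 + \beta_1 z_1 + \beta_2 z_2 .
\end{aligned}
```

Written this way, the quadratic model is a multiple linear regression with the two predictors z_1 and z_2, which is why the fitting procedure of linear regression carries over unchanged.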
The values of the coefficients are determined by fitting the polynomial to the observational data (y). As in simple linear regression discussed in the previous section, this is done by minimizing the sum of squared errors (SSE), given by the equation
$$SSE = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$$
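For reference, the minimization can be written compactly in matrix form. This is the standard least-squares result, sketched under the assumption that the n observations contain at least k + 1 distinct values of x. The symbols X (the matrix of powers of x), the bold vectors and the estimate β̂ are notation introduced for this sketch.

```latex
\begin{aligned}
X &= \begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^k \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^k
\end{pmatrix}, \\
SSE(\beta) &= \lVert \mathbf{y} - X\beta \rVert^2, \\
\frac{\partial\, SSE}{\partial \beta} &= -2\, X^\top (\mathbf{y} - X\beta) = 0
\;\Longrightarrow\; X^\top X \hat{\beta} = X^\top \mathbf{y}, \\
\hat{\mathbf{y}} &= X \hat{\beta}.
\end{aligned}
```

Under the stated assumption the matrix X has full column rank, so the coefficient estimate that minimizes the SSE is unique.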
Fitting a polynomial to observations raises the problem of choosing the order k of the polynomial. Choosing the right order is a question of model comparison, also called model selection. To keep it simple, we use the root-mean-square error (RMSE), defined by
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$$
to evaluate the goodness-of-fit of the model.
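The following is a minimal Python sketch of this comparison, not the course's own implementation. It fits polynomials of order k = 1 to 4 by solving the normal equations of the least-squares problem and reports the RMSE of each fit. The function names `fit_polynomial`, `predict` and `rmse`, as well as the small synthetic data set, are hypothetical and chosen only for illustration. The sketch assumes at least k + 1 distinct x values, otherwise the elimination step divides by zero.

```python
import math


def fit_polynomial(x, y, k):
    """Return least-squares coefficients [b0, ..., bk] of a kth-order polynomial."""
    m = k + 1
    # Normal equations A b = rhs with A[r][c] = sum(x^(r+c)), rhs[r] = sum(y * x^r).
    A = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    rhs = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= factor * A[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    beta = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(A[r][c] * beta[c] for c in range(r + 1, m))
        beta[r] = (rhs[r] - tail) / A[r][r]
    return beta


def predict(beta, x):
    """Evaluate the fitted polynomial at every value in x."""
    return [sum(b * xi ** j for j, b in enumerate(beta)) for xi in x]


def rmse(y, y_hat):
    """Root-mean-square error between observations and fitted values."""
    return math.sqrt(sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat)) / len(y))


# Hypothetical data: a quadratic trend plus a fixed, made-up perturbation.
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
noise = [0.1, -0.2, 0.15, -0.05, 0.0, 0.1, -0.15, 0.2, -0.1]
y = [1.0 + 2.0 * xi - 0.5 * xi ** 2 + e for xi, e in zip(x, noise)]

rmse_by_order = {}
for k in range(1, 5):
    beta = fit_polynomial(x, y, k)
    rmse_by_order[k] = rmse(y, predict(beta, x))
    print(f"k={k}: RMSE={rmse_by_order[k]:.4f}")
```

Note that the RMSE computed on the same data used for fitting cannot increase when k grows, because each lower-order polynomial is a special case of the higher-order one. A low RMSE on the fitting data alone is therefore not sufficient to pick k, and the error is more informative when evaluated on data that were not used for fitting.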
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by email at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.