Polynomial regression is a special case of linear regression in which the relationship between the predictor variable \(x\) and the response variable \(y\) is modeled by a \(k^{th}\)-degree polynomial in \(x\). In other words, the model includes second-order and higher powers of the variable alongside the original linear term. These higher powers make the relation between \(y\) and \(x\) nonlinear. The model is nevertheless a linear model, because the expected response is a linear function of the coefficients \((\beta_i)\). The model equation can be written as

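\[y = \beta_0+\beta_1x+\beta_2x^2+\dots+\beta_kx^k+\epsilon\text{.}\]

Here \(\epsilon\) is the error term, and \(\hat y\) denotes the value predicted by the fitted polynomial, that is, the right-hand side evaluated with the estimated coefficients and without \(\epsilon\).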
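To make the "linear in the coefficients" point concrete, the following minimal sketch builds the matrix of powers \(1, x, x^2, \dots, x^k\) and computes the polynomial as a plain matrix-vector product. The helper name `design_matrix` and the coefficient values are assumptions chosen for illustration only.

```python
import numpy as np


def design_matrix(x, k):
    """Return the n x (k+1) matrix with columns 1, x, x^2, ..., x^k."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, N=k + 1, increasing=True)


# Made-up predictor values and coefficients (beta_0, beta_1, beta_2).
x = np.array([0.0, 1.0, 2.0, 3.0])
beta = np.array([1.0, -2.0, 0.5])

X = design_matrix(x, k=2)

# The polynomial is a matrix-vector product: linear in beta,
# although it is quadratic in x.
y_poly = X @ beta
print(y_poly)
```

Because the polynomial is just `X @ beta`, scaling or adding coefficient vectors scales or adds the resulting values, which is exactly what makes the usual linear regression machinery applicable.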

The values of the coefficients are determined by fitting the
polynomial to the observational data \((y)\). As in simple linear regression
discussed in the previous section, this is done by minimizing the
**sum of squared errors (SSE)**, given by the equation

\[SSE = \sum_{i=1}^n \epsilon_i^2 = \sum_{i=1}^n (y_i - \hat y_i)^2\text{.}\]
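As a hedged illustration of the fitting step, the sketch below estimates the coefficients by least squares with `numpy.linalg.lstsq` and then evaluates the SSE from the formula above. The data are synthetic (a quadratic plus Gaussian noise, generated with a fixed seed), and the order `k = 2` is an assumption made for the example, not a recommendation.

```python
import numpy as np

# Synthetic data: a known quadratic plus noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 30)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

k = 2  # polynomial order assumed for this example

# Columns 1, x, ..., x^k, so beta_hat is ordered beta_0, ..., beta_k.
X = np.vander(x, N=k + 1, increasing=True)

# Least-squares solution: the beta that minimizes sum((y - X @ beta)**2).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta_hat                    # fitted values
sse = float(np.sum((y - y_hat) ** 2))   # sum of squared errors

print("estimated coefficients:", beta_hat)
print("SSE:", sse)
```

At the minimum, the residuals `y - y_hat` are orthogonal to every column of `X`, which is a quick way to check that the returned coefficients really minimize the SSE.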

Fitting a polynomial to observations raises the problem of
choosing the order \(k\) of the
polynomial. Choosing an appropriate order is the subject of an
important concept called **model comparison** or **model selection**. To keep it simple,
we use the **root-mean-square error (RMSE)**,
defined by

\[RMSE = \sqrt{\frac{\sum_{i=1}^n (y_i - \hat y_i)^2}{n}}\]

to evaluate the goodness-of-fit of the model.
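The comparison of polynomial orders by RMSE can be sketched as follows. This is a minimal example under stated assumptions: the data are synthetic (a cubic plus noise), the range of orders 1 to 6 is arbitrary, and `rmse` is a helper defined here rather than a library function. The fits use `numpy.polyfit`, which returns the coefficients with the highest power first.

```python
import numpy as np


def rmse(y, y_hat):
    """Root-mean-square error between observations and fitted values."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


# Synthetic data: a cubic trend plus noise (illustrative only).
rng = np.random.default_rng(42)
x = np.linspace(-2.0, 2.0, 40)
y = 0.5 * x**3 - x + rng.normal(scale=0.5, size=x.size)

rmse_by_order = {}
for k in range(1, 7):
    coeffs = np.polyfit(x, y, deg=k)   # highest power first
    y_hat = np.polyval(coeffs, x)      # fitted values for order k
    rmse_by_order[k] = rmse(y, y_hat)

for k, value in rmse_by_order.items():
    print(f"order {k}: RMSE = {value:.4f}")
```

When the RMSE is computed on the same data used for fitting, it cannot increase as \(k\) grows, because each lower-order polynomial is a special case of the next higher one. The order at which the RMSE stops improving noticeably is therefore more informative than the smallest value alone.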