In this section we conduct a logistic regression analysis on the dwd data set. The goal of the logistic regression analysis is to predict the occurrence of hot days at any particular weather station in Germany. A hot day, referred to in German as Heißer Tag is defined as a day with temperatures $$\ge 30°C$$. The occurrence of hot days is modeled as a binary classification problem.

In the first step we build our baseline model. In a second step we build another regression model, the PCA model, for which we use explanatory variables extracted by PCA beforehand (see previous section). At the end of the study we compare the modeling results with principal components as explanatory variables to the modeling results of a unconstrained logistic regression model.