In this section we conduct a logistic regression analysis on the dwd data set. The goal of the analysis is to predict the occurrence of hot days at any particular weather station in Germany. A hot day, referred to in German as *Heißer Tag*, is defined as a day on which the temperature reaches $\geq 30$ °C. The occurrence of hot days is modeled as a binary classification problem: a hot day either occurs (1) or does not occur (0).
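The binary response described above can be sketched in a few lines of Python. This is only a minimal illustration of the thresholding step: the function name `is_hot_day`, the constant `HOT_DAY_THRESHOLD_C` and the temperature values are hypothetical and are not taken from the dwd data set.

```python
# Threshold from the definition of a hot day (>= 30 °C).
# All names and values below are hypothetical, for illustration only.
HOT_DAY_THRESHOLD_C = 30.0


def is_hot_day(temperature_c: float, threshold_c: float = HOT_DAY_THRESHOLD_C) -> int:
    """Return 1 if the temperature meets or exceeds the threshold, else 0."""
    return int(temperature_c >= threshold_c)


# Made-up temperature readings in °C
temperatures_c = [24.5, 30.0, 31.2, 29.9]

# Binary response variable: 1 = hot day, 0 = no hot day
hot_day = [is_hot_day(t) for t in temperatures_c]
print(hot_day)  # [0, 1, 1, 0]
```

Note that the comparison is inclusive, so a reading of exactly 30 °C counts as a hot day.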
In the first step we build our baseline model. In the second step we build another regression model, the PCA model, which uses explanatory variables extracted beforehand by PCA (see previous section). At the end of the study we compare the results of the model with principal components as explanatory variables to the results of an unconstrained logistic regression model.
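To make the difference between the two models concrete, both can be written in the usual logistic form. The notation is an assumption made for this sketch and is not taken from the original text: $\mathbf{x}_i$ denotes the explanatory variables of station $i$, $y_i \in \{0, 1\}$ the hot-day indicator, $\bar{\mathbf{x}}$ the vector of variable means, and $\mathbf{W}_k$ the matrix holding the first $k$ principal directions. We also assume here that the baseline model uses the original variables directly.

$$
\text{Baseline model:} \quad P(y_i = 1 \mid \mathbf{x}_i) = \frac{1}{1 + e^{-(\beta_0 + \boldsymbol{\beta}^\top \mathbf{x}_i)}}
$$

$$
\text{PCA model:} \quad \mathbf{z}_i = \mathbf{W}_k^\top (\mathbf{x}_i - \bar{\mathbf{x}}), \qquad P(y_i = 1 \mid \mathbf{z}_i) = \frac{1}{1 + e^{-(\gamma_0 + \boldsymbol{\gamma}^\top \mathbf{z}_i)}}
$$

Under these assumptions the two models differ only in their inputs: the PCA model replaces the original variables $\mathbf{x}_i$ with the $k$ principal component scores $\mathbf{z}_i$, so it estimates $k + 1$ coefficients instead of one per original variable plus an intercept.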
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.