In this section we conduct a logistic regression analysis on the dwd data set. The goal of the analysis is to predict the occurrence of hot days at any particular weather station in Germany. A hot day, referred to in German as Heißer Tag, is defined as a day with a maximum temperature of \(\geq 30\,^{\circ}\mathrm{C}\). The occurrence of hot days is modeled as a binary classification problem.
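To make the binary setup concrete, the model can be written down in its standard form. This is only a sketch of the generic logistic regression model. The symbols \(y_i\), \(\mathbf{x}_i\), \(\beta_0\) and \(\boldsymbol{\beta}\) are notation assumed here for illustration and are not taken from the dwd data set.

\[
y_i =
\begin{cases}
1 & \text{if hot days occur at station } i \\
0 & \text{otherwise}
\end{cases}
\qquad
P(y_i = 1 \mid \mathbf{x}_i) = \frac{1}{1 + e^{-(\beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_i)}}
\]

Here \(\mathbf{x}_i\) denotes the vector of explanatory variables for station \(i\). A station is typically assigned to the class "hot day" when the predicted probability exceeds a chosen threshold, for example \(0.5\).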

In a first step we build our baseline model. In a second step we build another regression model, the PCA model, which uses the explanatory variables extracted by PCA beforehand (see previous section). At the end of the study we compare the results of the model with principal components as explanatory variables to the results of an unconstrained logistic regression model.
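The two-step workflow described above might look roughly like the following sketch. It does not use the dwd data set. The data are simulated, and the variable names (`altitude`, `latitude`, `sunshine`, `hot_day`), the number of retained principal components and the 0.5 classification threshold are all assumptions made purely for illustration.

```r
# Simulated stand-in data (NOT the dwd data set); all names are hypothetical
set.seed(42)
n <- 200
X <- data.frame(
  altitude = rnorm(n),
  latitude = rnorm(n),
  sunshine = rnorm(n)
)

# Hypothetical binary response: 1 = hot days occur, 0 = no hot days
eta <- 0.5 + 1.2 * X$sunshine - 0.8 * X$altitude
hot_day <- rbinom(n, size = 1, prob = plogis(eta))

# Step 1: baseline model, logistic regression on the original variables
baseline <- glm(hot_day ~ .,
                data = cbind(X, hot_day = hot_day),
                family = binomial)

# Step 2: PCA model, logistic regression on the first two principal components
pca <- prcomp(X, center = TRUE, scale. = TRUE)
pcs <- as.data.frame(pca$x[, 1:2])
pca_model <- glm(hot_day ~ .,
                 data = cbind(pcs, hot_day = hot_day),
                 family = binomial)

# Compare both models by their in-sample classification accuracy
accuracy <- function(model) {
  pred <- as.integer(predict(model, type = "response") >= 0.5)
  mean(pred == hot_day)
}
acc_baseline <- accuracy(baseline)
acc_pca <- accuracy(pca_model)
```

The accuracy here is computed on the same data used for fitting, which keeps the sketch short. A proper comparison would evaluate both models on held-out data.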


The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.