The term machine learning is closely associated with the terms artificial intelligence and deep learning. The three are often described as nested subsets of one another:

\[\textbf{Deep Learning} \subset \textbf{Machine Learning} \subset \textbf{Artificial Intelligence}\]

But what do these terms actually describe? According to the Oxford English Dictionary, they are defined as follows:

**Artificial Intelligence**, *n. Computing*, The capacity of computers or other machines to exhibit or simulate intelligent behaviour.

**Machine Learning**, *n. Computing*, The capacity of computers to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyse and infer from patterns in data.

**Deep Learning**, *n. Computing*, A type of machine learning considered to be in some way more dynamic or complete than others; esp. machine learning based on artificial neural networks in which multiple layers of processing are used to extract progressively more features from data.

In this project, we will skip the term artificial intelligence and focus on machine learning. Furthermore, we will provide an example of deep learning at the end.

In recent years, the number of machine learning applications has exploded due to increased computing power and storage capacity. Example applications cover a wide range, from technical applications like

- fraud detection in banking,
- personalized recommendations on services like Amazon or Netflix,
- self-driving cars,

through artistic and literary applications like

- paintings in the style of the old masters,
- writing texts with ChatGPT,

to academic applications in the geosciences like

- improving weather forecasts,
- analysis of satellite images.

To get an idea about the differences between a classical and a machine learning approach, let us consider an example. Suppose we have satellite imagery of agricultural regions and we want to classify the regions according to the crops grown.

In a classical approach, we would start with some ideas from theory, e.g. different crops have different colors, different reflectivity, different seasonal cycles, and so on. Then we combine these ideas or patterns with images of known crop types to build a model that tells us what a particular crop would look like when seen from a satellite. Finally, we can apply this model to our unknown data and classify it.

In machine learning, we reverse this approach. We start with data that has already been labeled and apply algorithms and statistical methods to it. It is the algorithm itself that then learns the patterns from the data to distinguish the crops. In most cases, these patterns are implicit and not easily comparable to the patterns from the classical approach. However, we can use the result of the algorithm as a numerical model that we can then apply to our unknown data to classify it.
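The contrast between the two approaches can be sketched in a few lines of plain Python. This is only an illustration under strong assumptions: each field is reduced to a single made-up reflectance-like number, the crop names and all values are invented, and the "learning" step is a one-split decision stump chosen for brevity. The names `classify`, `learn_threshold` and `labelled_fields` are hypothetical.

```python
# Made-up, already labelled samples: (feature value, crop type).
labelled_fields = [
    (0.21, "wheat"), (0.25, "wheat"), (0.30, "wheat"),
    (0.55, "maize"), (0.61, "maize"), (0.66, "maize"),
]


def classify(value, threshold):
    """Assign a crop type by comparing the feature with a threshold."""
    return "maize" if value > threshold else "wheat"


# Classical approach: the threshold comes from prior knowledge (assumed here).
CLASSICAL_THRESHOLD = 0.5


def learn_threshold(samples):
    """Machine learning approach: choose the threshold from the labelled data.

    Candidate thresholds are the midpoints between neighbouring feature
    values; the one with the fewest misclassified samples is returned.
    """
    values = sorted(value for value, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum(classify(v, threshold) != label for v, label in samples)

    return min(candidates, key=errors)


learned_threshold = learn_threshold(labelled_fields)

# Both models can now be applied to a new, unlabelled value.
new_value = 0.47
print("classical:", classify(new_value, CLASSICAL_THRESHOLD))
print("learned:  ", classify(new_value, learned_threshold))
```

In the classical version the decision rule is written down by hand. In the learned version the same kind of rule is derived from the labelled samples, so the two can disagree on borderline values.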

There are three main types of machine learning: **supervised learning**, **unsupervised learning** and **reinforcement learning**.

In supervised learning, we have data consisting of various features (aka input, attribute, predictor variable) and a target (aka outcome, output, (class) label, response variable). The goal is to train a model that describes a relationship between the features and the target and allows us to predict the outcome for unlabeled input data.

In a typical workflow, we pass training data, containing features and the corresponding targets, to the algorithm. The algorithm then uses the features to learn a model by comparing its predicted output with the provided correct output and adjusting the model accordingly. After the learning step, the model is fixed, and we can apply it to new data with unknown output to predict the target.
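The workflow above can be illustrated with a minimal perceptron written in plain Python. This is a sketch, not a reference implementation: the two-feature training data are made up, the hyperparameters (`epochs`, `learning_rate`) are arbitrary assumptions, and the function names are hypothetical.

```python
def predict(model, features):
    """Apply the fixed model: 1 if the weighted sum is positive, else 0."""
    weights, bias = model
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(train_features, train_targets, epochs=20, learning_rate=0.1):
    """Learn weights by comparing predictions with the known targets."""
    weights = [0.0] * len(train_features[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(train_features, train_targets):
            error = target - predict((weights, bias), features)
            # Weights only change when the prediction was wrong (error != 0).
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


# Made-up training data: two features per sample and a binary target.
train_features = [[0.2, 0.1], [0.3, 0.2], [0.1, 0.3],
                  [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
train_targets = [0, 0, 0, 1, 1, 1]

# Learning step.
model = train_perceptron(train_features, train_targets)

# Prediction step on new data with unknown output.
new_features = [[0.15, 0.2], [0.85, 0.8]]
new_predictions = [predict(model, x) for x in new_features]
print(new_predictions)
```

The two steps are kept separate on purpose: `train_perceptron` sees features and targets, while `predict` only sees features and the fixed model.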

Supervised learning is mainly used for:

- Classification, i.e. the prediction of a discrete label
- Regression, i.e. the prediction of a continuous value

Classifying satellite images by crop type is an example of supervised learning used for classification. The features are the values of the different color channels for each pixel of the image. The target is the crop type.

Predicting the temperature from weather station observations would be an example of a regression task.
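A regression task of this kind can be sketched with simple linear regression fitted by ordinary least squares. The example assumes a single feature (station elevation) and uses made-up values; real weather station data would involve more predictors and noise. The helper name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observations: station elevation (km) and air temperature (deg C).
elevation_km = [0.0, 0.5, 1.0, 1.5, 2.0]
temperature_c = [15.2, 11.9, 8.7, 5.4, 2.3]

slope, intercept = fit_line(elevation_km, temperature_c)

# Predict a continuous value for a station that was not in the training data.
predicted_temperature = intercept + slope * 1.2
print(round(slope, 2), round(intercept, 2), round(predicted_temperature, 2))
```

Unlike the classification case, the prediction is a number on a continuous scale rather than one of a fixed set of labels.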

The following list is an incomplete overview of algorithms used for supervised learning. Some of them are covered in this project.

- Perceptron
- Linear regression
- Logistic regression
- Support vector machine
- K-nearest neighbor
- Decision tree
- Artificial neural network

In unsupervised learning, the training data has no known output value. Therefore, the goal is to find some structure in the data that allows it to be grouped. The typical workflow is similar to supervised learning, but the algorithm must cluster the data without knowing about a specific outcome. After the learning step, we can then apply the model to new data and predict its label, i.e. the cluster it belongs to.

Unsupervised learning is mainly used for clustering, i.e. defining clusters/groups in the data from the data structure itself, and dimensionality reduction, i.e. compressing data onto a lower-dimensional subspace. For example, to group weather patterns into distinct weather types, we could cluster the data.
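Clustering can be sketched with a small k-means implementation in plain Python. Several assumptions are made for the sake of a short, reproducible example: the observations (temperature in deg C, precipitation in mm) are made up, the features are left unscaled, and the initial centroids are simply the first `k` points, whereas practical implementations usually choose them randomly. All function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points]


def k_means(points, k, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [list(p) for p in points[:k]]  # deterministic start (assumption)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assign(points, centroids)


# Made-up daily observations: (temperature in deg C, precipitation in mm).
observations = [[2.0, 80.0], [25.0, 5.0], [3.0, 75.0],
                [1.0, 90.0], [27.0, 2.0], [24.0, 8.0]]

centroids, labels = k_means(observations, k=2)
print(labels)

# After the learning step, a new observation is assigned to a cluster.
new_label = assign([[26.0, 4.0]], centroids)[0]
print(new_label)
```

No targets are passed to `k_means`. The cluster indices it returns are arbitrary identifiers, and interpreting them as weather types is left to the analyst.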

The following list is an incomplete overview of algorithms used for unsupervised learning. Some of them are covered in this project.

Reinforcement learning is different from the other two types of learning. The goal is to train an agent to perform actions in a given environment. Typically, no correct action is known in advance (as opposed to supervised learning, where the correct outcome is known). Instead, the agent tries to maximize a cumulative reward through trial and error.

A classical example of this type of learning is a game-playing AI, for example for chess, where the agent has to decide on the next move. The environment is the chessboard and the positions of the different pieces. The reward is positive if the agent wins the game and negative if it loses. There are also ideas to use reinforcement learning in the field of geoscience, such as predicting the dynamics of wildfires or the effect of stratospheric aerosol injection.
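Chess is far too large for a short example, so the following sketch replaces it with a hypothetical toy game: an agent walks along a corridor of five cells, wins (+1) at the right end and loses (-1) at the left end. It uses tabular Q-learning with epsilon-greedy exploration, one common reinforcement learning technique chosen here purely for illustration. The environment, the reward values and the hyperparameters are all assumptions.

```python
import random

ACTIONS = ("right", "left")  # ties between equal Q-values go to the first entry
LOSE, START, WIN = 0, 2, 4   # corridor cells 0..4, the agent starts in the middle


def step(state, action):
    """Environment: move one cell and return (next_state, reward, done)."""
    next_state = state + 1 if action == "right" else state - 1
    if next_state == WIN:
        return next_state, 1.0, True
    if next_state == LOSE:
        return next_state, -1.0, True
    return next_state, 0.0, False


def greedy(q, state):
    """Action with the highest estimated cumulative reward in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error over many episodes."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(LOSE, WIN + 1) for a in ACTIONS}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Explore with probability epsilon, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q


q_values = q_learning()
policy = {state: greedy(q_values, state) for state in range(LOSE + 1, WIN)}
print(policy)
```

The agent is never told which move is correct. It only receives a reward at the end of each episode, and the learned values propagate that reward back to earlier decisions.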

The following list is an incomplete overview of algorithms used for reinforcement learning. They are not covered in this project. However, if you are interested, you might want to start with these algorithms.

**Citation**

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: *Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.*