To illustrate the most popular types of compositional graphics, we use a data set on the geochemistry of a sediment core recovered from the eastern Juyanze palaeolake in north-western China (Hartmann and Wünnemann, 2009). We load the data set and prepare it for the analysis.

library(compositions)
# load data set
g36 <- read.csv("http://userpages.fu-berlin.de/soga/data/raw-data/G36chemical.txt",
                sep='\t',
                row.names = 'Sample')
# exclude columns of no interest 
g36 <- g36[, -c(1:5, ncol(g36)-1, ncol(g36))]
# rename columns for better readability
colnames(g36) <- gsub(".mg.g", "", colnames(g36))

The data set consists of 177 samples (rows) and 11 variables (columns).

str(g36)
## 'data.frame':    177 obs. of  11 variables:
##  $ Ba : num  0.178 0.172 0.204 0.167 0.122 0.181 0.22 0.101 0.255 0.114 ...
##  $ Ca : num  61.3 55.2 96.4 53 36 ...
##  $ Fe : num  15.2 15.2 14 16.1 10.9 ...
##  $ K  : num  3.59 2.79 3.9 4.23 2.46 ...
##  $ Mg : num  77.3 31.5 59 55.9 97.5 ...
##  $ Mn : num  0.326 0.324 0.317 0.356 0.218 0.45 0.575 0.221 0.501 0.222 ...
##  $ Na : num  18.78 9.84 18.33 14.19 16.34 ...
##  $ PO4: num  0.865 0.785 0.976 1.002 0.667 ...
##  $ S  : num  24.6 15.4 18.2 14 22.7 ...
##  $ Sr : num  0.485 0.446 0.74 0.352 0.324 0.496 0.507 0.081 0.782 0.068 ...
##  $ Cl : num  27.6 14.9 27.4 24.7 26.1 ...

The elements Ca, Mg, Sr, Fe, Mn, K, Na, S, Ba, and PO4 were measured by inductively coupled plasma optical emission spectrometry (ICP-OES) after aqua regia digestion, and chloride (Cl) was determined by Mohr titration of the extracted fluid. All concentrations are given in mg/g.


Scatterplots of components

In many cases a scatterplot is the first choice for exploratory data analysis. In compositional data analysis, however, it is important to realize that scatterplots are neither scaling invariant nor perturbation invariant, and they are not subcompositionally coherent (Aitchison and Egozcue, 2005). It is not guaranteed that the plot of a closed subcomposition exhibits similar or even compatible patterns with the plot of the original data set; thus, a regression line drawn in such a plot cannot be trusted (van den Boogaart and Tolosana-Delgado, 2013).

par(mfrow = c(1,3))

# plot 1
plot(g36[, c("Ca", "Mg")], pch = 16, main = 'No transformation')
abline(lm(Mg~Ca, data = g36), col = 'red', lty = 2)
legend('topright', legend = 'OLS', lty = 2, col = 'red', cex = 0.7)

# plot 2
plot(clo(g36)[, c("Ca", "Mg")], pch = 16, main = 'Closure, full data set')
abline(lm(Mg~Ca, data = as.data.frame(clo(g36))), col = 'red', lty = 2)
legend('topright', legend = 'OLS', lty = 2, col = 'red', cex = 0.7)

# plot 3
subcomp <- c("Ba", "Ca", "Mg", "K", "Mn", "PO4", "Sr")
plot(clo(g36[, subcomp])[, c("Ca", "Mg")], pch = 16, main = 'Closure, subcomposition')
abline(lm(Mg~Ca, data = as.data.frame(clo(g36[, subcomp]))), col = 'red', lty = 2)
legend('topright', legend = 'OLS', lty = 2, col = 'red', cex = 0.7)

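Note that this is another example of subcompositional incoherence: the apparent relationship between Ca and Mg, and with it the fitted regression line, depends on whether we plot the raw data, the closed full data set or the closed subcomposition.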
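The same effect can be checked numerically without any plot. The following is a minimal sketch on synthetic, made-up data (not the G36 core); the part names A to E and the helper close_rows() are assumptions introduced only for this illustration. It compares the correlation of two parts in the raw data, after closure of all parts, and after closure of a subcomposition, and then shows that the log-ratio of the two parts is unaffected by the choice of subcomposition.

```r
# Synthetic data: 200 samples with five strictly positive parts (made up)
set.seed(1)
n <- 200
raw <- data.frame(
  A = rlnorm(n, meanlog = 3, sdlog = 0.3),
  B = rlnorm(n, meanlog = 3, sdlog = 0.3),
  C = rlnorm(n, meanlog = 5, sdlog = 0.8),
  D = rlnorm(n, meanlog = 1, sdlog = 0.2),
  E = rlnorm(n, meanlog = 1, sdlog = 0.2)
)

# Hypothetical helper: closure of each row to a constant sum of 1
close_rows <- function(x) {
  m <- as.matrix(x)
  as.data.frame(m / rowSums(m))
}

full <- close_rows(raw)                      # closure over all five parts
sub  <- close_rows(raw[, c("A", "B", "D")])  # closure over a subcomposition

# Correlation of A and B under the three representations
cor_raw  <- cor(raw$A, raw$B)
cor_full <- cor(full$A, full$B)
cor_sub  <- cor(sub$A, sub$B)
round(c(raw = cor_raw, full = cor_full, sub = cor_sub), 2)

# The log-ratio of A to B does not depend on the other parts
lr_full <- log(full$A / full$B)
lr_sub  <- log(sub$A / sub$B)
all.equal(lr_full, lr_sub)
```

The three correlations differ because closure ties each part to the sum of whichever parts happen to be included, whereas the ratio A/B is the same in every representation.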


Ternary diagrams

A compositional data set may consist of many variables; for graphical purposes, however, one seldom represents all parts at once. In many cases the visualization of the compositional feature space is restricted to three-part (sub)compositions. For three parts, the simplex is an equilateral triangle with vertices at \(A = [\kappa, 0, 0]\), \(B = [0, \kappa, 0]\) and \(C = [0, 0, \kappa]\), where \(\kappa\) is the closure constant; its graphical representation is the so-called ternary diagram. Ternary diagrams represent the data as compositional and relative.
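To make the geometry concrete, here is a minimal base-R sketch of how a three-part composition can be mapped to a point inside an equilateral triangle. The helper ternary_xy() and the chosen vertex layout (A at the lower left, B at the lower right, C at the top) are assumptions made for this illustration; plotting packages may orient the triangle differently.

```r
# Hypothetical helper: map three-part compositions to 2D triangle coordinates
ternary_xy <- function(comp) {
  comp <- as.matrix(comp)
  stopifnot(ncol(comp) == 3, all(comp >= 0))
  closed <- comp / rowSums(comp)  # closure with kappa = 1
  # Assumed vertex layout: A = (0, 0), B = (1, 0), C = (0.5, sqrt(3) / 2)
  cbind(
    x = closed[, 2] + 0.5 * closed[, 3],
    y = sqrt(3) / 2 * closed[, 3]
  )
}

pts <- rbind(
  A        = c(1, 0, 0),     # pure first part: vertex A
  B        = c(0, 1, 0),     # pure second part: vertex B
  C        = c(0, 0, 1),     # pure third part: vertex C
  centre   = c(1, 1, 1),     # equal parts: barycentre of the triangle
  rescaled = c(20, 20, 20)   # same proportions, different total
)
xy <- ternary_xy(pts)
xy
```

The rows centre and rescaled map to the same point, which is what is meant by the diagram representing the data as relative: only the proportions matter, not the totals.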

To plot a ternary diagram in R we make use of the compositions package. The package ships with the acomp() function, which applies a closure to the data vector or matrix passed to it and assigns the resulting object to a special object class, the acomp class. An acomp object provides the means to analyse compositions in the framework of the Aitchison simplex. If we apply the generic plot() function to a three-part composition of class acomp, we get a ternary diagram in return.

xc <- acomp(g36, c("Mg", "Ca", "Fe"))
plot(xc)

Another possibility to plot ternary diagrams is provided by the ggtern package, an extension to the functionality of the ggplot2 package. First, we install the package by typing install.packages('ggtern') into the console, and then we make the package available by calling the library() function.

For illustration we use the same data as in the example above. After selecting the three components of interest (Mg, Ca, Fe), we apply the closure to the selection and then plot a ternary diagram using the ggtern() function. The package comes with many additional graphical features, which are described in more detail in the package documentation.

library(ggtern)
# data preparation
data <- as.data.frame(clo(g36[, c("Mg","Ca","Fe")]))

# plotting
ggtern(data = data, aes(Mg, Fe, Ca)) +
  geom_point(alpha = 0.5, size = 2, color = "black") +
  theme_rgbw()

Note that if the three parts differ too much in magnitude, the data tend to plot along an edge or close to a vertex of the triangle.
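A quick numerical check illustrates why this happens. The sketch below uses a single made-up three-part sample (the values are invented for illustration and are not taken from the G36 data set) in which one part is orders of magnitude larger than the others.

```r
# Made-up sample in mg/g with one dominant part (illustrative values only)
sample_raw <- c(Ca = 250, Fe = 2, Ba = 0.2)

# Closure to a constant sum of 1
sample_closed <- sample_raw / sum(sample_raw)
round(sample_closed, 4)

# Share of the dominant part after closure
dominant_share <- max(sample_closed)
dominant_share > 0.99
```

After closure the dominant part takes up almost the entire unit sum, so in a ternary diagram such a sample sits practically on the vertex of that part and the variation in the two minor parts becomes invisible.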

Important notice
> ggtern was removed from the CRAN repository on May 7th, 2023 because a maintenance request was ignored. You can use the Ternary package instead:


The Ternary package with the functions TernaryPlot() and TernaryPoints():

#install.packages("Ternary")
library(Ternary)
par(mar = c(2,2,2,2))

TernaryPlot(
  alab = "Mg",
  blab = "Ca",
  clab = "Fe",
  lab.offset = 0.16,
  lab.col = "black",
  point = "up",
  clockwise = TRUE,
  isometric = TRUE,
  padding = 0.08,
  col = "#FFFFFF",
  grid.lines = 10,
  grid.col = "pink" ,
  grid.lty = "solid",
  grid.minor.col = "lightblue",
  grid.minor.lty = "solid",
  grid.minor.lwd = 1,
  axis.lty = "solid",
  axis.labels = TRUE,
  axis.cex = 0.8,
  axis.font = 1,
  axis.rotate = TRUE,
  axis.tick = TRUE,
  ticks.length = 0.025,
  axis.col =  "black" ,
  ticks.col = "lightgrey"
)

TernaryPoints(g36[,c("Mg","Ca","Fe")],
  type = "p",
  cex = 1.2,
  pch = 16,
  lwd = 1,
  lty = "solid",
  col = "lightgreen" 
)


Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.