In this section we walk through the EMMA algorithm and carry out the calculations in R along the way. For this purpose we use the df.soil.samples data set provided here. This synthetic toy data set was created by randomly mixing four natural end-members $$(q=4)$$, each corresponding to a particular transport regime (alluvial sediments, dune sediments, loess deposits and overbank sediments).

The data set consists of 45 rows, one per soil sample, and 99 columns. Each column corresponds to a particular grain size class given on the Krumbein phi $$(\phi)$$ scale.

$$\phi = -\log_2 \frac{D}{D_0}\text{,}$$

where $$D$$ is the diameter of the particle or grain in millimeters, and $$D_0$$ is a reference diameter, equal to $$1$$ mm. Let us look at the structure of the data set (reduced to the first 20 samples and grain size classes):

# load the data
str(df.soil.samples[1:20, 1:20])
## 'data.frame':    20 obs. of  20 variables:
##  $ 0   : num  3.04e-04 2.27e-04 7.75e-05 3.80e-04 3.79e-04 ...
##  $ 0.1 : num  3.36e-04 2.48e-04 9.03e-05 4.17e-04 4.16e-04 ...
##  $ 0.2 : num  0.000375 0.000271 0.000111 0.000462 0.000458 ...
##  $ 0.31: num  0.000525 0.000297 0.000151 0.00052 0.000564 ...
##  $ 0.41: num  0.00071 0.000326 0.000234 0.000879 0.000687 ...
##  $ 0.51: num  0.001038 0.000361 0.000406 0.001238 0.000887 ...
##  $ 0.61: num  0.001605 0.000407 0.000745 0.001821 0.001218 ...
##  $ 0.71: num  0.002546 0.000473 0.00138 0.002727 0.001753 ...
##  $ 0.82: num  0.004021 0.000574 0.002499 0.004064 0.002584 ...
##  $ 0.92: num  0.006204 0.000734 0.00434 0.00593 0.003813 ...
##  $ 1.02: num  0.009236 0.000993 0.007166 0.008382 0.005532 ...
##  $ 1.12: num  0.01318 0.00141 0.0112 0.01142 0.0078 ...
##  $ 1.22: num  0.01797 0.00205 0.01655 0.01495 0.01063 ...
##  $ 1.33: num  0.0234 0.00301 0.0231 0.01881 0.01393 ...
##  $ 1.43: num  0.02908 0.00439 0.03048 0.02276 0.01757 ...
##  $ 1.53: num  0.03453 0.00629 0.03807 0.02652 0.02132 ...
##  $ 1.63: num  0.03922 0.00878 0.04511 0.02983 0.02493 ...
##  $ 1.73: num  0.0427 0.0119 0.0508 0.0324 0.0282 ...
##  $ 1.84: num  0.0446 0.0155 0.0547 0.0342 0.0308 ...
##  $ 1.94: num  0.0449 0.0195 0.0565 0.0349 0.0327 ...
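
Because the column names are $$\phi$$ values, it can be handy to translate them back into grain diameters. The following is a minimal sketch of the conversion implied by the definition of $$\phi$$; the helper names `phi_to_mm` and `mm_to_phi` and the example values are hypothetical and are not part of the original analysis.

```r
# Convert Krumbein phi values to grain diameters and back.
# D0 is the reference diameter (1 mm), as in the definition of phi.
phi_to_mm <- function(phi, D0 = 1) D0 * 2^(-phi)
mm_to_phi <- function(D, D0 = 1) -log2(D / D0)

phi_example <- c(0, 1, 2, 10)            # illustrative phi values (assumed)
d_um <- phi_to_mm(phi_example) * 1000    # diameters in micrometres
round(d_um, 3)
## [1] 1000.000  500.000  250.000    0.977
```

Larger $$\phi$$ values therefore mean finer grains: each unit step on the $$\phi$$ scale halves the diameter.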

Let us plot the first of the 45 samples:

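# the column names hold the phi values; convert them to numbers for plotting
phi <- as.numeric(colnames(df.soil.samples))
plot(phi, as.numeric(df.soil.samples[1, ]), type = "l", ylab = "")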

We see a multimodal grain size distribution ranging from $$\phi = 0$$ to $$\phi = 9.9$$, which corresponds to grain sizes ranging from $$1000 \; \mu \text{m}$$ (coarse sand) down to roughly $$1 \; \mu \text{m}$$ (clay). If we plot our 45 samples on top of the 4 natural end-members, we see that each soil sample is a mixture of the 4 natural end-members. The goal of EMMA is to unmix these end-members and thus recover the underlying transport regimes.
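
To make the mixing idea concrete, here is a small sketch of how samples can arise as mixtures of end-members. Everything in it is an assumption made for illustration: the evenly spaced $$\phi$$ grid, the bell-shaped end-members with made-up modes, and the uniformly drawn mixing proportions are not the end-members or proportions behind df.soil.samples.

```r
set.seed(1)
phi_grid <- seq(0, 10, length.out = 99)   # assumed grid of 99 phi classes

# Hypothetical end-members: unimodal curves, each normalised to sum to 1
modes <- c(1.5, 3, 5.5, 8)                # made-up modal phi values
V <- t(sapply(modes, function(m) {
  d <- dnorm(phi_grid, mean = m, sd = 0.7)
  d / sum(d)
}))                                       # q x p matrix (4 x 99)

# Random mixing proportions: one row per sample, each row sums to 1
n <- 45
M <- matrix(runif(n * length(modes)), nrow = n)
M <- M / rowSums(M)                       # n x q matrix (45 x 4)

# Each sample is a convex combination of the end-members
X <- M %*% V                              # n x p matrix (45 x 99)

# Samples in grey, end-members on top in colour
matplot(phi_grid, t(X), type = "l", lty = 1, col = "grey",
        ylim = range(V), xlab = "phi", ylab = "")
matlines(phi_grid, t(V), lty = 1, lwd = 2, col = 1:4)
```

In this toy setting the mixing is done forwards, from `M` and `V` to `X`. EMMA works in the opposite direction: it starts from a matrix like `X` and tries to estimate the end-members and the mixing proportions.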