In the R software factor analysis is implemented by the
`factanal()`

function of the build-in *stats* package.
The function performs maximum-likelihood factor analysis on a covariance
matrix or data matrix. The number of factors to be fitted is specified
by the argument `factors`

. Further, the factor scores may be
calculated either by using *Thompson’s estimator* or
*Bartlett’s weighted least-squares scores method*. The particular
method is specified by an additional argument
`scores = "regression"`

or `scores = "Bartlett"`

.
Moreover, by the additional argument `rotation`

the
transformation of the factors may be specified by either
`rotation = "varimax"`

for orthogonal rotation,
`rotation = "varimax"`

for oblique rotation or
`rotation = "none"`

.

Let us get our hands dirty and apply a factor analysis on a the
*food-texture* data set. We already discussed the data set in the
section on principal component analysis, so you are probably
familiar with the data set.

This open source data set is available here and
describes *texture measurements of a pastry-type food*.

```
food <- read.csv("https://userpage.fu-berlin.de/soga/data/raw-data/food-texture.csv",
row.names = "X")
str(food)
```

```
## 'data.frame': 50 obs. of 5 variables:
## $ Oil : num 16.5 17.7 16.2 16.7 16.3 19.1 18.4 17.5 15.7 16.4 ...
## $ Density : int 2955 2660 2870 2920 2975 2790 2750 2770 2955 2945 ...
## $ Crispy : int 10 14 12 10 11 13 13 10 11 11 ...
## $ Fracture: int 23 9 17 31 26 16 17 26 23 24 ...
## $ Hardness: int 97 139 143 95 143 189 114 63 123 132 ...
```

The data set consists of 50 rows (observations) and 5 columns (features/variables). The features are:

- Oil: percentage oil in the pastry
- Density: the product’s density (the higher the number, the more dense the product)
- Crispy: a crispiness measurement, on a scale from 7 to 15, with 15 being more crispy.
- Fracture: the angle, in degrees, through which the pasty can be slowly bent before it fractures.
- Hardness: a sharp point is used to measure the amount of force required before breakage occurs.

`factanal()`

function callIn addition to the data set the `factanal()`

function
requires an estimate of the number of factors
(`factanal(data, factors = n)`

). This is a tricky aspect of
factor analysis. If we have a hypothesis about the latent variables we
may start with an informed guess. If we do not have any clue about the
number of factors and the number of variables in the data set is not too
large, one may simply try out several values for initializing the model.
Another, more sophisticated approach is to use principal component
analysis to get a good initial estimate of the number of factors.

In this simple example we just make a guess and set the number of
factor to be \(2\). Further, we keep
the defaults for the scores (`score = "none"`

) and the
rotation (`rotation = "varimax"`

).

`food_fa <- factanal(food, factors = 2)`

Before we interpret the results of the factor analysis recall the
basic idea behind it. Factor analysis creates linear combinations of
factors to abstract the variable’s underlying communality. To the extent
that the variables have an underlying communality, fewer factors capture
most of the variance in the data set. This allows us to aggregate a
large number of observable variables in a model to represent an
underlying concept, making it easier to understand the data. The
variability in our data, \(\mathbf X\),
is given by \(\mathbf\Sigma\), and its
estimate \(\hat{\mathbf \Sigma}\) is
composed of the variability explained by the factors explained be a
linear combination of the factors (**communality**) and of
the variability, which can not be explained by a linear combination of
the factors (**uniqueness**).

\[\hat{ \mathbf{\Sigma}} = \underbrace{\hat{\mathbf\Lambda}\hat{\mathbf\Lambda}^T}_{\text{communality}} + \underbrace{\hat{\mathbf\Psi}}_{\text{uniqueness}}\]

`food_fa`

```
##
## Call:
## factanal(x = food, factors = 2)
##
## Uniquenesses:
## Oil Density Crispy Fracture Hardness
## 0.334 0.156 0.042 0.256 0.407
##
## Loadings:
## Factor1 Factor2
## Oil -0.816
## Density 0.919
## Crispy -0.745 0.635
## Fracture 0.645 -0.573
## Hardness 0.764
##
## Factor1 Factor2
## SS loadings 2.490 1.316
## Proportion Var 0.498 0.263
## Cumulative Var 0.498 0.761
##
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 0.27 on 1 degree of freedom.
## The p-value is 0.603
```

The model output starts with the function call to remind us on the specifications of our function call.

The first chunk provides the

*uniquenesses*, which range from \(0\) to \(1\). The uniqueness, sometimes referred to as*noise*, corresponds to the proportion of variability, which can not be explained by a linear combination of the factors. This is the \(\hat {\mathbf\Psi}\) in the equation above. A high uniqueness for a variable indicates that the factors do not account well for its variance.

`food_fa$uniquenesses`

```
## Oil Density Crispy Fracture Hardness
## 0.3338599 0.1555255 0.0422238 0.2560235 0.4069459
```

The next section is the

*loadings*, which range from \(-1\) to \(1\). This is the \(\hat{\mathbf \Lambda}\) in the equation above. The loadings are the contribution of each original variable to the factor. Variables with a high loading are well explained by the factor. Notice there is no entry for certain variables. That is because R does not print loadings less than \(0.1\). This is meant to help us spot groups of variables. Type`help(loadings)`

in your console for further details.By squaring the loading we compute the fraction of the variable’s total variance explained by the factor (Teetor 2011). This proportion of the variability is denoted as

**communality**. Another way to calculate the communality is to subtract the uniquenesses from \(1\). An appropriate factor model results in low values for uniqueness and high values for communality.

`apply(food_fa$loadings^2, 1, sum) # communality`

```
## Oil Density Crispy Fracture Hardness
## 0.6661398 0.8444745 0.9577762 0.7439766 0.5930539
```

`1 - apply(food_fa$loadings^2, 1, sum) # uniqueness`

```
## Oil Density Crispy Fracture Hardness
## 0.3338602 0.1555255 0.0422238 0.2560234 0.4069461
```

The table beneath the loadings shows the proportion of variance explained by each factor. The row

*Cumulative Var*gives the cumulative proportion of variance explained. These numbers range from \(0\) to \(1\). The row*Proportion Var*gives the proportion of variance explained by each factor, and the row*SS loadings*gives the sum of squared loadings. This is sometimes used to determine the value of a particular factor. A factor is worth keeping if the SS loading is greater than \(1\) (*Kaiser’s rule*).The last section of the function output shows the results of a hypothesis test. The null hypothesis, \(H_0\), is that the number of factors in the model, in our example 2 factors, is sufficient to capture the full dimensionality of the data set. Conventionally, we reject \(H_0\) if the

*p-value*is less than \(0.05\). Such a result indicates that the number of factors is too small. In contrast, we do not reject \(H_0\) if the*p-value*exceeds \(0.05\). Such a result indicates that there are likely enough (or more than enough) factors capture the full dimensionality of the data set (Teetor 2011). The high*p-value*in our example above leads us to not reject the \(H_0\), and indicates that we fitted an appropriate model. This hypothesis test is available thanks to our method of estimation, maximum likelihood. Note that if you provide a covariance matrix to the`factanal()`

function and not a data frame, the hypothesis test is not provided if we do not explicitly provide the number of observations (`n.obs`

) as an additional argument to the function call.

Recall the factor analysis model: \(\hat{ \mathbf{\Sigma}} = \hat{\mathbf\Lambda}\hat{\mathbf\Lambda}^T + \hat{\mathbf\Psi}\).

Using our factor model `food.fa`

we may calculate \(\hat {\mathbf\Sigma}\) and compare it to
the observed correlation matrix, \(\mathbf{S}\), by simple matrix algebra. The
`%*%`

operator performs matrix multiplication. The
`t()`

function transposes a matrix. The `diag()`

function takes a vector of \(k\)
numbers and creates a \(k \times k\)
matrix with the numbers on the diagonal and \(0\)s everywhere else.

```
Lambda <- food_fa$loadings
Psi <- diag(food_fa$uniquenesses)
S <- food_fa$correlation
Sigma <- Lambda %*% t(Lambda) + Psi
```

We now subtract the fitted correlation matrix, (`Sigma`

)
from our observed correlation matrix (`S`

). We also round the
result to 6 digits.

`round(S - Sigma, 6)`

```
## Oil Density Crispy Fracture Hardness
## Oil 0.000000 0.000001 -0.002613 -0.018220 -0.000776
## Density 0.000001 0.000000 -0.001081 -0.007539 -0.000320
## Crispy -0.002613 -0.001081 0.000000 0.000000 0.000005
## Fracture -0.018220 -0.007539 0.000000 0.000000 0.000033
## Hardness -0.000776 -0.000320 0.000005 0.000033 0.000000
```

The resulting matrix is called the **residual matrix**.
Numbers close to \(0\) indicate that
our factor model is a good representation of the underlying concept.

The purpose of a rotation is to produce factors with a mix of high and low loadings and few moderate-sized loadings. The idea is to give meaning to the factors and this helps us interpret them. From a mathematical viewpoint, there is no difference between a rotated and unrotated matrix. The fitted model is the same, the uniquenesses are the same, and the proportion of variance explained is the same.

Let us fit three factor models, one with no rotation, one with varimax rotation, and one with promax rotation, and make a scatter plot of the first and second loadings.

```
food_fa_none <- factanal(food, factors = 2, rotation = "none")
food_fa_varimax <- factanal(food, factors = 2, rotation = "varimax")
food_fa_promax <- factanal(food, factors = 2, rotation = "promax")
par(mfrow = c(1, 3))
plot(food_fa_none$loadings[, 1],
food_fa_none$loadings[, 2],
xlab = "Factor 1",
ylab = "Factor 2",
ylim = c(-1, 1),
xlim = c(-1, 1),
main = "No rotation")
abline(h = 0, v = 0)
plot(food_fa_varimax$loadings[, 1],
food_fa_varimax$loadings[, 2],
xlab = "Factor 1",
ylab = "Factor 2",
ylim = c(-1, 1),
xlim = c(-1, 1),
main = "Varimax rotation")
text(food_fa_varimax$loadings[, 1] - 0.08,
food_fa_varimax$loadings[, 2] + 0.08,
colnames(food),
col = "blue")
abline(h = 0, v = 0)
plot(food_fa_promax$loadings[, 1],
food_fa_promax$loadings[, 2],
xlab = "Factor 1",
ylab = "Factor 2",
ylim = c(-1, 1),
xlim = c(-1, 1),
main = "Promax rotation")
abline(h = 0, v = 0)
```

Now comes the tricky aspect in factor analysis: Interpreting the
factors themselves. If two variables both have large loadings for the
same factor, then we know they have something in common. As a researcher
we have to understand the data and its meaning in order to give a name
to that common ground. Taking a look on the figures above it appears
that factor 1 accounts for pastry, which is dense and can be bend a lot
before it breaks. Whereas factor 2 accounts for pastry that is crispy
and hard to break apart. So if we need to name these factors we would
probably call them *soft pastry* (factor 1) and *hard
pastry* (factor 2).

**Citation**

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us via mail by soga[at]zedat.fu-berlin.de.

You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follow: *Hartmann,
K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis
using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.*