A vector is a collection of elements. The most common types are character, logical, integer, numeric, and factor.
There are many ways to initialize a vector object.
x <- numeric(0)
class(x)
## [1] "numeric"
y <- "cat"
class(y)
## [1] "character"
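As a brief sketch of some other ways to initialize a vector, base R offers one constructor per type, plus the generic vector() function. The variable names below are illustrative only.

```r
# Constructor functions create a vector of the given length,
# filled with the default value of that type.
n <- numeric(3) # 0 0 0
ch <- character(2) # "" ""
lg <- logical(2) # FALSE FALSE
int <- integer(2) # 0 0, stored as integers

# vector() takes the type as a string.
z <- vector("numeric", length = 3)
identical(n, z)
## [1] TRUE
class(int)
## [1] "integer"
length(numeric(0)) # an empty vector has length zero
## [1] 0
```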
We can combine elements into a vector using the c() (combine) function.
aa <- c("John", "Sahra", "I")
aa
## [1] "John" "Sahra" "I"
Or create vectors from a sequence of numbers using either the colon operator : or the seq() function.
s1 <- 1:25
s1
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
s2 <- seq(from = 1, to = 25, by = 2)
s2
## [1] 1 3 5 7 9 11 13 15 17 19 21 23 25
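Besides a step size, seq() also accepts a target length, and sequences may run downwards. The following is a small sketch; the names s3, s4 and s5 are only illustrative.

```r
# Ask for a fixed number of equally spaced values instead of a step size.
s3 <- seq(from = 1, to = 25, length.out = 5)
s3
## [1]  1  7 13 19 25

# seq_len(n) is a safe shorthand for the integers 1 to n.
s4 <- seq_len(5)
s4
## [1] 1 2 3 4 5

# A negative step produces a decreasing sequence.
s5 <- seq(from = 10, to = 1, by = -3)
s5
## [1] 10  7  4  1
```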
Vector Arithmetics
A very convenient feature in R is that arithmetic operations on vectors are performed member-by-member.
Suppose we have two vectors \(\mathbf u\) and \(\mathbf v\).
u <- c(10, 30, 50, 70, 90)
v <- c(20, 40, 60, 80, 100)
If we multiply vector \(\mathbf u\) by 100, we would get a vector with each of its members multiplied by 100.
u * 100
## [1] 1000 3000 5000 7000 9000
Similarly, if we add \(\mathbf u\) and \(\mathbf v\), the result corresponds to the sum of the corresponding members from \(\mathbf u\) and \(\mathbf v\).
u + v
## [1] 30 70 110 150 190
The same holds true for subtraction, multiplication and division.
u - v
## [1] -10 -10 -10 -10 -10
u * v
## [1] 200 1200 3000 5600 9000
u / v
## [1] 0.5000000 0.7500000 0.8333333 0.8750000 0.9000000
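To make "member-by-member" concrete, the sketch below spells out the addition as an explicit loop and compares it with the vectorized form. The loop is for illustration only; the vectorized expression is the idiomatic choice.

```r
u <- c(10, 30, 50, 70, 90)
v <- c(20, 40, 60, 80, 100)

# Add the i-th members one at a time.
total <- numeric(length(u))
for (i in seq_along(u)) {
  total[i] <- u[i] + v[i]
}
total
## [1]  30  70 110 150 190

# The vectorized expression gives the same result.
identical(total, u + v)
## [1] TRUE
```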
The Recycling Rule
What if we combine two vectors of unequal length in an arithmetic operation?
In such a case R applies the recycling rule: the values of the shorter vector are repeated until it matches the length of the longer one.
For example, the following vectors \(\mathbf v\) and \(\mathbf w\) have different lengths and their sum is computed by recycling values of the shorter vector \(\mathbf v\).
v <- c(1, 2, 3)
w <- c(100, 200, 300, 400, 500, 600, 700, 800, 900)
v + w
## [1] 101 202 303 401 502 603 701 802 903
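The recycling step can be written out by hand with rep(), which may help to see what R does internally. The sketch also covers the case where the longer length is not a multiple of the shorter one; R still recycles there but issues a warning, which is suppressed below only to keep the example quiet.

```r
v <- c(1, 2, 3)
w <- c(100, 200, 300, 400, 500, 600, 700, 800, 900)

# Repeat v until it is as long as w.
v_long <- rep(v, length.out = length(w))
v_long
## [1] 1 2 3 1 2 3 1 2 3
identical(v_long + w, v + w)
## [1] TRUE

# Lengths 2 and 3: the first value of a is reused for the third sum.
# Without suppressWarnings() R would warn about the mismatched lengths.
a <- c(1, 2)
b <- c(10, 20, 30)
res <- suppressWarnings(a + b)
res
## [1] 11 22 31
```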
Consider one more example, which showcases a nice application of the recycling rule. Let us create a vector \(\mathbf m\), a vector of fives of length 20. We may apply the handy rep() function to construct such a vector.
l <- 20 # length
m <- rep(5, l)
m
## [1] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
Now, if we want to turn every second item into the number \(-5\), we simply apply the recycling rule by multiplying \(\mathbf m\) with the vector c(1, -1).
c(1, -1) * m
## [1] 5 -5 5 -5 5 -5 5 -5 5 -5 5 -5 5 -5 5 -5 5 -5 5 -5
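As a cross-check, the same alternating vector can be built directly with rep(), and the multiplier pattern can be varied. This is only a sketch; alt1 and alt2 are illustrative names.

```r
l <- 20 # length
m <- rep(5, l)

alt1 <- c(1, -1) * m # recycling the pattern (1, -1)
alt2 <- rep(c(5, -5), times = l / 2) # repeating the pair (5, -5) ten times
identical(alt1, alt2)
## [1] TRUE

# A pattern of length three flips every third item instead.
c(1, 1, -1) * rep(5, 6)
## [1]  5  5 -5  5  5 -5
```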
Vector Indexing
In order to retrieve values of a vector we use the single square bracket [] operator.
Let us create a vector \(\mathbf s\) with five entries.
s <- c("aa", "bb", "cc", "dd", "ee")
In order to access the second member of the vector we use the index position 2.
s[2]
## [1] "bb"
If we provide a negative index, the member whose position has the same absolute value as the negative index is stripped from the result. The original vector is not changed.
s[-2]
## [1] "aa" "cc" "dd" "ee"
If an index is out-of-range, a missing value will be reported via the symbol NA.
s[10]
## [1] NA
A vector can be sliced by a numeric index vector.
s[c(2, 5)]
## [1] "bb" "ee"
To produce a vector slice between two indexes, we apply the colon operator :.
s[2:5]
## [1] "bb" "cc" "dd" "ee"
A vector can be sliced from a given vector with a logical index vector, which has the same length as the original vector. Its members are TRUE if the corresponding members in the original vector are to be included in the slice, and FALSE otherwise.
n <- c(FALSE, TRUE, FALSE, TRUE, FALSE)
s[n]
## [1] "bb" "dd"
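In practice the logical index vector is rarely typed by hand; it usually comes from a comparison. A minimal sketch, using a made-up numeric vector x:

```r
x <- c(4, 11, 7, 2, 15)

# A comparison returns a logical vector of the same length as x.
keep <- x > 5
keep
## [1] FALSE  TRUE  TRUE FALSE  TRUE

# Use it as an index to keep only the matching members.
x[keep]
## [1] 11  7 15

# which() converts the logical vector into index positions.
which(keep)
## [1] 2 3 5
```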
A matrix is a collection of data elements arranged in a two-dimensional rectangular layout. In R we create a matrix with the matrix() function. We specify the data argument, the desired number of rows and columns via the nrow and ncol arguments, and the byrow argument, which determines whether the matrix is filled by columns (the default) or by rows (byrow = TRUE).
M <- matrix(
  data = c(2, 4, 3, 1, 5, 7), # the data elements
  nrow = 2, # number of rows
  ncol = 3, # number of columns
  byrow = TRUE # fill matrix by rows
)
M
## [,1] [,2] [,3]
## [1,] 2 4 3
## [2,] 1 5 7
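For comparison, the sketch below fills the same data by columns, which is the default when byrow is omitted. The names vals, by_col and by_row are illustrative.

```r
vals <- c(2, 4, 3, 1, 5, 7)

# Default: the data run down the first column, then the second, and so on.
by_col <- matrix(vals, nrow = 2, ncol = 3)
by_col
##      [,1] [,2] [,3]
## [1,]    2    3    5
## [2,]    4    1    7

# Filling by rows is the same as filling the transposed shape by columns
# and transposing the result.
by_row <- matrix(vals, nrow = 2, ncol = 3, byrow = TRUE)
identical(by_row, t(matrix(vals, nrow = 3, ncol = 2)))
## [1] TRUE
```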
An element at the mth row, nth column of \(\mathbf M\) can be accessed by the expression \(\mathbf M[m,n]\).
M[1, 3] # element at 1st row, 3rd column
## [1] 3
The entire mth row can be extracted as \(\mathbf M[m,]\).
M[2, ] # the 2nd row
## [1] 1 5 7
Similarly, the entire nth column can be extracted as \(\mathbf M[,n]\).
M[, 3] # the 3rd column
## [1] 3 7
Of course we can also extract more than one row or column at a time.
M[, c(1, 3)]
## [,1] [,2]
## [1,] 2 3
## [2,] 1 7
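One detail worth knowing: extracting a single row or column returns a plain vector, not a matrix. If the matrix structure is needed, the drop = FALSE argument of the bracket operator keeps it. A short sketch:

```r
M <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 2, ncol = 3, byrow = TRUE)

# A single row is simplified to a vector, which has no dimensions.
r <- M[2, ]
dim(r)
## NULL

# With drop = FALSE the result stays a 1 by 3 matrix.
r_mat <- M[2, , drop = FALSE]
dim(r_mat)
## [1] 1 3
```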
By applying the cbind() and rbind() functions we combine matrices horizontally and vertically, respectively. Both functions return a matrix.
M
## [,1] [,2] [,3]
## [1,] 2 4 3
## [2,] 1 5 7
cbind(M, M)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 2 4 3 2 4 3
## [2,] 1 5 7 1 5 7
rbind(M, M)
## [,1] [,2] [,3]
## [1,] 2 4 3
## [2,] 1 5 7
## [3,] 2 4 3
## [4,] 1 5 7
Note that by using these commands we may easily extend any matrix with a vector of appropriate length. We check the dimensions of a matrix object by applying the dim() function, or by using the nrow() and ncol() commands.
no_of_rows <- dim(M)[1]
no_of_cols <- dim(M)[2]
dim(M)
## [1] 2 3
nrow(M)
## [1] 2
ncol(M)
## [1] 3
Let us create two appropriate vectors …
v1 <- rep(1, no_of_rows)
v2 <- rep(2, no_of_cols)
… and combine them with the matrix \(\mathbf M\).
cbind(M, v1)
##            v1
## [1,] 2 4 3  1
## [2,] 1 5 7  1
rbind(M, v2)
##    [,1] [,2] [,3]
##       2    4    3
##       1    5    7
## v2    2    2    2
Matrix Algebra
R provides a rich environment for matrix algebra, also referred to as linear algebra.
To demonstrate, we construct two matrices: \(\mathbf M\), a 2 by 2 matrix, and \(\mathbf N\), a 3 by 2 matrix.
M <- matrix(c(1, 2, 3, 4), nrow = 2)
M
## [,1] [,2]
## [1,] 1 3
## [2,] 2 4
N <- matrix(c(9, 8, 7, 6, 5, 4), nrow = 3)
N
## [,1] [,2]
## [1,] 9 6
## [2,] 8 5
## [3,] 7 4
Basic algebraic operations (+ and -)
M + M
## [,1] [,2]
## [1,] 2 6
## [2,] 4 8
N - N
## [,1] [,2]
## [1,] 0 0
## [2,] 0 0
## [3,] 0 0
Scalar multiplication and scalar division
M * 10
## [,1] [,2]
## [1,] 10 30
## [2,] 20 40
N / 10
## [,1] [,2]
## [1,] 0.9 0.6
## [2,] 0.8 0.5
## [3,] 0.7 0.4
Transpose of a matrix
The t() function returns the transpose of a matrix. We can check the dimensions before and after by applying the dim() function.
dim(N)
## [1] 3 2
t(N)
## [,1] [,2] [,3]
## [1,] 9 8 7
## [2,] 6 5 4
dim(t(N))
## [1] 2 3
Element-wise multiplication and division
M * M
## [,1] [,2]
## [1,] 1 9
## [2,] 4 16
M / M
## [,1] [,2]
## [1,] 1 1
## [2,] 1 1
Matrix multiplication (inner product)
M %*% M
## [,1] [,2]
## [1,] 7 15
## [2,] 10 22
Note that M %*% N will cause an error,
Error in base::"%*%"(x, y) : non-conformable arguments
because the number of columns of \(\mathbf M\) (2) does not match the number of rows of \(\mathbf N\) (3).
dim(M)
## [1] 2 2
dim(N)
## [1] 3 2
Transposing the matrix \(\mathbf N\) solves that issue.
dim(M)
## [1] 2 2
dim(t(N))
## [1] 2 3
M %*% t(N)
## [,1] [,2] [,3]
## [1,] 27 23 19
## [2,] 42 36 30
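The conformability condition can be checked before multiplying. The helper conformable() below is hypothetical, not part of base R; it simply compares the inner dimensions. Note that swapping the order of the operands is another way to obtain a valid product here.

```r
M <- matrix(c(1, 2, 3, 4), nrow = 2)
N <- matrix(c(9, 8, 7, 6, 5, 4), nrow = 3)

# Hypothetical helper: A %*% B is defined when ncol(A) equals nrow(B).
conformable <- function(A, B) ncol(A) == nrow(B)

conformable(M, N)
## [1] FALSE
conformable(M, t(N))
## [1] TRUE
conformable(N, M)
## [1] TRUE

# N is 3 by 2 and M is 2 by 2, so N %*% M is a 3 by 2 matrix.
N %*% M
##      [,1] [,2]
## [1,]   21   51
## [2,]   18   44
## [3,]   15   37
```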
Inverse of a square matrix
solve(M)
## [,1] [,2]
## [1,] -2 1.5
## [2,] 1 -0.5
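A quick way to convince ourselves that solve() returned the inverse is to multiply it with the original matrix and compare against the identity matrix. Because of floating-point rounding, the sketch uses all.equal() rather than an exact comparison. It also shows the two-argument form solve(M, b), which solves the linear system M x = b; the right-hand side b is made up for illustration.

```r
M <- matrix(c(1, 2, 3, 4), nrow = 2)
M_inv <- solve(M)

# M times its inverse should be the 2 by 2 identity matrix.
all.equal(M %*% M_inv, diag(2))
## [1] TRUE

# Solve M x = b for x without forming the inverse explicitly.
b <- c(5, 6)
x <- solve(M, b)
all.equal(as.vector(M %*% x), b)
## [1] TRUE
```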
Row means and column means
As computing the sums and means of the rows and columns of a matrix is a very common task, R provides the rowMeans(), rowSums(), colMeans(), and colSums() functions for convenience.
N # Matrix N
## [,1] [,2]
## [1,] 9 6
## [2,] 8 5
## [3,] 7 4
rowMeans(N) # Returns vector of row means.
## [1] 7.5 6.5 5.5
rowSums(N) # Returns vector of row sums.
## [1] 15 13 11
colMeans(N) # Returns vector of column means.
## [1] 8 5
colSums(N) # Returns vector of column sums.
## [1] 24 15
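For summaries other than sums and means, one general-purpose option is the base R apply() function, which applies a function over the rows (margin 1) or columns (margin 2) of a matrix. A minimal sketch:

```r
N <- matrix(c(9, 8, 7, 6, 5, 4), nrow = 3)

# Margin 1 works row by row and reproduces rowMeans().
apply(N, 1, mean)
## [1] 7.5 6.5 5.5
all.equal(apply(N, 1, mean), rowMeans(N))
## [1] TRUE

# Margin 2 works column by column; here we take the column maxima.
apply(N, 2, max)
## [1] 9 6
```

The dedicated functions such as rowMeans() are typically faster, so apply() is mainly useful when no dedicated function exists.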
Lists are R objects which may contain elements of different types, such as numbers, strings, vectors, matrices, functions, or even other lists. Hence, this data structure is often used as a container to store or organize R objects. A list is created by using the list() function.
Let us create a list L containing strings, numbers, vectors, logical values and a matrix.
v1 <- 1978
v2 <- c("Love", "Hate")
v3 <- seq(10, 100, 10)
v4 <- c(TRUE, TRUE, FALSE)
v5 <- matrix(v3, nrow = 2)
L <- list(v1, v2, v3, v4, v5)
class(L)
## [1] "list"
By calling the str() command we may inspect the structure of the list object.
str(L)
## List of 5
## $ : num 1978
## $ : chr [1:2] "Love" "Hate"
## $ : num [1:10] 10 20 30 40 50 60 70 80 90 100
## $ : logi [1:3] TRUE TRUE FALSE
## $ : num [1:2, 1:5] 10 20 30 40 50 60 70 80 90 100
Using the single square bracket [] operator we retrieve a slice of the list. L[5] returns a slice containing the 5th element of the list object L, which in our case corresponds to the matrix object.
L[5]
## [[1]]
## [,1] [,2] [,3] [,4] [,5]
## [1,] 10 30 50 70 90
## [2,] 20 40 60 80 100
Note that the slice is still a list object.
class(L[5])
## [1] "list"
In order to reference a list element directly, we have to use the double square bracket [[]] operator.
L[[5]]
## [,1] [,2] [,3] [,4] [,5]
## [1,] 10 30 50 70 90
## [2,] 20 40 60 80 100
class(L[[5]])
## [1] "matrix" "array"
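The difference between the two operators becomes clearer when indexing is chained. The sketch below uses a shortened version of the list for brevity.

```r
L <- list(1978, c("Love", "Hate"), seq(10, 100, 10))

# [[ ]] extracts the element itself, so a second index applies to the element.
L[[2]][1]
## [1] "Love"

# [ ] returns a sub-list, so indexing it again still yields a list.
class(L[2][1])
## [1] "list"

# [ ] accepts several positions at once; [[ ]] extracts a single element.
length(L[c(1, 3)])
## [1] 2
```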
A very convenient feature of lists is that we may assign names to list elements, and hence, reference them by name instead of a numeric index. We can either provide the names of the elements during the construction of the list or use the names() function.
list(
"scalar" = v1,
"character_vector" = v2,
"numeric_vector" = v3,
"logical_vector" = v4,
"matrix" = v5
)
## $scalar
## [1] 1978
##
## $character_vector
## [1] "Love" "Hate"
##
## $numeric_vector
## [1] 10 20 30 40 50 60 70 80 90 100
##
## $logical_vector
## [1] TRUE TRUE FALSE
##
## $matrix
## [,1] [,2] [,3] [,4] [,5]
## [1,] 10 30 50 70 90
## [2,] 20 40 60 80 100
However, for the purpose of this tutorial we apply the names() function.
names(L) <- c(
"scalar", "character_vector",
"numeric_vector", "logical_vector",
"matrix"
)
Now we can slice the list by using the element names.
L[c("scalar", "numeric_vector")]
## $scalar
## [1] 1978
##
## $numeric_vector
## [1] 10 20 30 40 50 60 70 80 90 100
Again, in order to reference a list element directly, we have to use the double square bracket [[]] operator.
L[["scalar"]]
## [1] 1978
Alternatively, a named list element can also be referenced directly with the $ operator.
L$matrix
## [,1] [,2] [,3] [,4] [,5]
## [1,] 10 30 50 70 90
## [2,] 20 40 60 80 100
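The two ways of referencing by name differ in a subtle way: on lists, $ accepts an unambiguous prefix of a name, whereas [[]] requires the exact name by default. In addition, [[]] works with a name stored in a variable, which $ does not. A small sketch with a shortened list; the variable key is illustrative.

```r
L <- list(scalar = 1978, character_vector = c("Love", "Hate"))

# $ performs partial matching on list names.
L$sca
## [1] 1978

# [[ ]] matches exactly unless told otherwise.
L[["sca"]]
## NULL
L[["sca", exact = FALSE]]
## [1] 1978

# A name held in a variable can only be used with [[ ]].
key <- "character_vector"
L[[key]]
## [1] "Love" "Hate"
```

Spelling out full names is usually the safer habit, since partial matches can silently change meaning when elements are added later.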
We can also manipulate list objects: add, delete, and update list elements. Note that assigning to a name that does not yet exist appends the new element at the end of the list.
L["new_element"] <- "I am a new element at the end of the list"
L
## $scalar
## [1] 1978
##
## $character_vector
## [1] "Love" "Hate"
##
## $numeric_vector
## [1] 10 20 30 40 50 60 70 80 90 100
##
## $logical_vector
## [1] TRUE TRUE FALSE
##
## $matrix
## [,1] [,2] [,3] [,4] [,5]
## [1,] 10 30 50 70 90
## [2,] 20 40 60 80 100
##
## $new_element
## [1] "I am a new element at the end of the list"
L[1] <- NULL # Remove the first element.
L
## $character_vector
## [1] "Love" "Hate"
##
## $numeric_vector
## [1] 10 20 30 40 50 60 70 80 90 100
##
## $logical_vector
## [1] TRUE TRUE FALSE
##
## $matrix
## [,1] [,2] [,3] [,4] [,5]
## [1,] 10 30 50 70 90
## [2,] 20 40 60 80 100
##
## $new_element
## [1] "I am a new element at the end of the list"
L[3] <- "updated element"
L
## $character_vector
## [1] "Love" "Hate"
##
## $numeric_vector
## [1] 10 20 30 40 50 60 70 80 90 100
##
## $logical_vector
## [1] "updated element"
##
## $matrix
## [,1] [,2] [,3] [,4] [,5]
## [1,] 10 30 50 70 90
## [2,] 20 40 60 80 100
##
## $new_element
## [1] "I am a new element at the end of the list"
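With bracket assignment, new elements land at the end of the list. To place an element at another position, one option is the base R append() function with its after argument. The sketch below uses a small made-up list and also shows updating and deleting by name with [[]].

```r
L <- list(a = 1, b = "two", c = TRUE)

# Insert a new element after the first position.
L2 <- append(L, list(new = "inserted"), after = 1)
names(L2)
## [1] "a"   "new" "b"   "c"

# Update an element by name; the whole vector is stored as one element.
L2[["b"]] <- c(2, 2, 2)
L2$b
## [1] 2 2 2

# Delete an element by name.
L2[["c"]] <- NULL
names(L2)
## [1] "a"   "new" "b"
```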
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by e-mail at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.