In R, decimal values are called numeric. Unless told otherwise, R stores every number as a numeric value. Assigning a decimal value to a variable (e.g. num) therefore gives that variable the numeric type.
num <- 33.33
num
## [1] 33.33
class(num)
## [1] "numeric"
Even if we assign a whole number to a variable iteg, it is still stored as a numeric value.
iteg <- 33
iteg
## [1] 33
class(iteg)
## [1] "numeric"
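To make the point above more concrete, here is a minimal sketch that inspects how R stores a whole number written without the L suffix. The variable name whole is only an illustration and does not appear in the original material.

```r
whole <- 33          # no L suffix, so R stores a double-precision number

typeof(whole)        # internal storage type
## [1] "double"
is.numeric(whole)    # numeric covers both doubles and integers
## [1] TRUE
is.integer(whole)    # but this value is not of integer type
## [1] FALSE
```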
In R, integer values correspond to the mathematical set of integers (whole numbers). To create an integer variable in R, we state this explicitly by appending the suffix L to the number, or by calling the as.integer() function.
iteg <- 50L
iteg
## [1] 50
class(iteg)
## [1] "integer"
iteg2 <- as.integer(25)
iteg2
## [1] 25
class(iteg2)
## [1] "integer"
iteg2 <- as.integer(5.55)
iteg2
## [1] 5
class(iteg2)
## [1] "integer"
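As the last example suggests, as.integer() does not round: it drops the fractional part. The short sketch below illustrates this for a positive and a negative value, and shows one possible way to round first with round(). The variable names are hypothetical.

```r
truncated_pos <- as.integer(5.55)    # fractional part is dropped
truncated_pos
## [1] 5

truncated_neg <- as.integer(-5.55)   # truncation is towards zero
truncated_neg
## [1] -5

rounded <- as.integer(round(5.55))   # round first if rounding is intended
rounded
## [1] 6
```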
A character object is used to represent string values in R. It is created by assigning a string (a sequence of characters) to a variable, or by converting objects into character values with the as.character() function:
chr <- "soga"
class(chr)
## [1] "character"
p <- as.character(3.14)
class(p)
## [1] "character"
There are many functions available for working with character objects.
Two character values can be concatenated with the paste() or the paste0() function. The sep argument of paste() defines how the values are separated: the default sep = " " inserts a single space, sep = "," inserts a comma, and sep = "" inserts nothing, which makes the call equivalent to paste0().
first <- "Francis Ford"
last <- "Coppola"
paste(first, last)
## [1] "Francis Ford Coppola"
paste0(first, last)
## [1] "Francis FordCoppola"
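The effect of the sep argument can be sketched with the same two variables. The separators chosen here are just examples.

```r
first <- "Francis Ford"
last <- "Coppola"

with_comma <- paste(last, first, sep = ", ")   # custom separator
with_comma
## [1] "Coppola, Francis Ford"

no_space <- paste(first, last, sep = "")       # empty separator
no_space
## [1] "Francis FordCoppola"

identical(no_space, paste0(first, last))       # same result as paste0()
## [1] TRUE
```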
We may extract a substring by applying the substr() function, which takes the positions of the first and the last character to keep.
my_string <- "I prefer statistics over beer and coffee"
substr(my_string, start = 3, stop = 22)
## [1] "prefer statistics ov"
Or play around with upper and lower case transformations.
toupper(my_string)
## [1] "I PREFER STATISTICS OVER BEER AND COFFEE"
tolower(my_string)
## [1] "i prefer statistics over beer and coffee"
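These functions can also be combined. As a small sketch, the following capitalises the first letter of a word using substr(), toupper(), paste0() and the base R function nchar(), which returns the number of characters. The example string and variable names are hypothetical.

```r
word <- "statistics"   # hypothetical example string

first_letter <- toupper(substr(word, start = 1, stop = 1))   # "S"
rest <- substr(word, start = 2, stop = nchar(word))          # everything after the first letter

capitalised <- paste0(first_letter, rest)
capitalised
## [1] "Statistics"
```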
Many more functions for string manipulation are available (see, for example, the stringr package).
Exercise: Use the variables defined below to reproduce the given output. Combine them with the paste() function.
name1 <- "Alan"
age1 <- 20
name2 <- "Marie"
age2 <- 19
### your code here
paste(
name1, "is", age1, "and", name2, "is", age2, "years old. Together they are", age1 + age2,
"years old and on average", (age1 + age2) / 2, "years old."
)
## [1] "Alan is 20 and Marie is 19 years old. Together they are 39 years old and on average 19.5 years old."
Logical values result from comparisons between values, so-called logical expressions. The result of a logical expression is either TRUE or FALSE.
x <- 100
y <- 200 # sample values
x > y
## [1] FALSE
x < y
## [1] TRUE
The data type reserved for logical values in R is called logical. In other programming languages, and in general, this data type is also referred to as boolean.
z <- 100 == 100
class(z)
## [1] "logical"
Standard logical operations are > (greater than), < (less than), >= (greater than or equal), <= (less than or equal), == (equal), & (logical AND), | (logical OR), and ! (logical NEGATION).
a <- TRUE
b <- FALSE
a & b # are a and b both TRUE?
## [1] FALSE
a | b # is at least one of a and b TRUE?
## [1] TRUE
!a # invert the logical value of a
## [1] FALSE
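Comparisons and logical operators can be combined into a single expression. The sketch below reuses the sample values of x and y from above. The thresholds 50 and 150 are arbitrary and only serve as an illustration.

```r
x <- 100
y <- 200 # sample values

in_range <- x > 50 & x < 150    # both comparisons must be TRUE
in_range
## [1] TRUE

any_large <- x > 150 | y > 150  # at least one comparison must be TRUE
any_large
## [1] TRUE

not_equal <- !(x == y)          # negate the result of a comparison
not_equal
## [1] TRUE
```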
Exercise: Check whether the value stored in the variable word is a palindrome, using logical expressions and the substr() function.
word <- "kayak"
### your code here
(substr(word, 1, 1) == substr(word, 5, 5)) & (substr(word, 2, 2) == substr(word, 4, 4))
## [1] TRUE
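The solution above relies on the word having exactly five characters. As an alternative sketch that works for a word of any length, we can split the word into single characters with the base R function strsplit(), reverse them with rev(), and compare the result with the original. This is one possible approach, not the solution intended by the exercise.

```r
word <- "kayak"

chars <- strsplit(word, "")[[1]]                # vector of single characters
reversed <- paste(rev(chars), collapse = "")    # glue the reversed characters together

word == reversed                                # TRUE if word reads the same backwards
## [1] TRUE
```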
Factors are used to represent categorical data. They are a very useful class for statistical analysis and for plotting. Note that factors can be ordered or unordered.
Factors can be created using the factor() function.
weather <- factor(c("rainy", "sunny", "sunny", "rainy"))
Factors are stored as integers, and each unique integer has an associated label, known as a level. By default, R sorts the levels alphabetically.
weather
## [1] rainy sunny sunny rainy
## Levels: rainy sunny
R assigns 1 to the level “rainy” and 2 to the level “sunny”. We can check this by using the str() function:
str(weather)
## Factor w/ 2 levels "rainy","sunny": 1 2 2 1
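The levels and the underlying integer codes can also be inspected separately. A minimal sketch using the base R functions levels() and as.integer():

```r
weather <- factor(c("rainy", "sunny", "sunny", "rainy"))

levels(weather)       # the labels, sorted alphabetically
## [1] "rainy" "sunny"

as.integer(weather)   # the integer codes behind each element
## [1] 1 2 2 1
```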
If the order of the levels is meaningful (e.g. “low”, “medium”, “high”), we can specify it with the levels argument and mark the factor as ordered with ordered = TRUE:
flood_level <- factor(
  c("low", "high", "medium", "high", "low", "medium"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)
min(flood_level)
## [1] low
## Levels: low < medium < high
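Because the levels are ordered, comparisons between them are meaningful as well. The following sketch rebuilds the same factor and shows max() together with a comparison against a single level. The variable name at_least_medium is hypothetical.

```r
flood_level <- factor(
  c("low", "high", "medium", "high", "low", "medium"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

max(flood_level)                                # highest level that occurs
## [1] high
## Levels: low < medium < high

at_least_medium <- flood_level >= "medium"      # element-wise comparison
at_least_medium
## [1] FALSE  TRUE  TRUE  TRUE FALSE  TRUE
```

For an unordered factor such as weather, min(), max() and comparisons with >= are not meaningful, which is why ordered = TRUE is needed here.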
Citation
The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by email at soga[at]zedat.fu-berlin.de.
Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.