Numeric

In R, decimal values are called numeric. By default, R stores any number written without a suffix as a numeric value. When we assign a decimal value to a variable (e.g. num), the variable is of numeric type.

num <- 33.33
num
## [1] 33.33
class(num)
## [1] "numeric"

Even if we assign a whole number to a variable iteg, it is still stored as a numeric value.

iteg <- 33
iteg
## [1] 33
class(iteg)
## [1] "numeric"
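
As a small sketch of what "stored as numeric" means in practice, the snippet below inspects a whole number with typeof() and is.integer(). The variable name x is only an illustrative choice.

```r
# A whole number written without the L suffix is stored as a double
x <- 33
class(x)
## [1] "numeric"
typeof(x)
## [1] "double"
is.integer(x)
## [1] FALSE

# is.numeric() is TRUE for doubles as well as for true integers
is.numeric(x)
## [1] TRUE
is.numeric(33L)
## [1] TRUE
```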

Integer

In R, integer values correspond to the mathematical set of integers (whole numbers). To create an integer variable in R, we state this explicitly by appending the suffix L to the number or by calling the as.integer() function. Note that as.integer() drops the fractional part of a decimal value rather than rounding it.

iteg <- 50L
iteg
## [1] 50
class(iteg)
## [1] "integer"
iteg2 <- as.integer(25)
iteg2
## [1] 25
class(iteg2)
## [1] "integer"
iteg2 <- as.integer(5.55)
iteg2
## [1] 5
class(iteg2)
## [1] "integer"
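
The conversion of 5.55 to 5 above suggests that as.integer() truncates instead of rounding. The following sketch illustrates this, and shows one possible way to round first if that is what is wanted. The variable names are illustrative only.

```r
# as.integer() drops the fractional part (truncation toward zero)
truncated_pos <- as.integer(5.55)
truncated_pos
## [1] 5
truncated_neg <- as.integer(-5.55)
truncated_neg
## [1] -5

# To round to the nearest whole number, call round() before converting
rounded <- as.integer(round(5.55))
rounded
## [1] 6
class(rounded)
## [1] "integer"
```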

Character

A character object is used to represent string values in R. It is created by assigning a string, a sequence of characters, to a variable or by converting objects into character values with the as.character() function:

chr <- "soga"
class(chr)
## [1] "character"
p <- as.character(3.14)
class(p)
## [1] "character"
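
To illustrate the conversion described above, this short sketch converts a number to a character value and back again with as.numeric(). The variable q is an illustrative name.

```r
# Convert a number to a character value
p <- as.character(3.14)
p
## [1] "3.14"

# A character value cannot be used in arithmetic directly;
# convert it back to numeric first
q <- as.numeric(p)
class(q)
## [1] "numeric"
q + 1
## [1] 4.14
```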

There are many functions available to work on character objects.

Two or more character values can be concatenated with the paste() or the paste0() function. The sep argument of paste() defines how the values are separated: the default sep = " " inserts a space, sep = "," inserts a comma, and sep = "" inserts nothing, which makes the call equivalent to paste0().

first <- "Francis Ford"
last <- "Coppola"
paste(first, last)
## [1] "Francis Ford Coppola"
paste0(first, last)
## [1] "Francis FordCoppola"
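
A brief sketch of the sep argument, reusing the two variables from above:

```r
first <- "Francis Ford"
last <- "Coppola"

# A custom separator: comma followed by a space
paste(last, first, sep = ", ")
## [1] "Coppola, Francis Ford"

# An empty separator gives the same result as paste0()
paste(first, last, sep = "")
## [1] "Francis FordCoppola"
identical(paste(first, last, sep = ""), paste0(first, last))
## [1] TRUE
```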

We may extract a substring by applying the substr() function.

my_string <- "I prefer statistics over beer and coffee"
substr(my_string, start = 3, stop = 22)
## [1] "prefer statistics ov"
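
The start and stop arguments are character positions counted from 1, and both ends are included. As a small sketch, nchar() can be combined with substr() to address positions relative to the end of the string. The variable n is an illustrative name.

```r
my_string <- "I prefer statistics over beer and coffee"

# Total number of characters in the string
n <- nchar(my_string)
n
## [1] 40

# Characters 10 to 19 (both included)
substr(my_string, start = 10, stop = 19)
## [1] "statistics"

# The last six characters
substr(my_string, start = n - 5, stop = n)
## [1] "coffee"
```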

We can also convert a string to upper or lower case.

toupper(my_string)
## [1] "I PREFER STATISTICS OVER BEER AND COFFEE"
tolower(my_string)
## [1] "i prefer statistics over beer and coffee"

Many more functions for string manipulation are available (see for example the stringr package).


Exercise: Using the variables defined below, reproduce the given output by combining them with the paste() function!

name1 <- "Alan"
age1 <- 20
name2 <- "Marie"
age2 <- 19

### your code here
Solution:
paste(
  name1, "is", age1, "and", name2, "is", age2, "years old. Together they are", age1 + age2,
  "years old and on average", (age1 + age2) / 2, "years old."
)
## [1] "Alan is 20 and Marie is 19 years old. Together they are 39 years old and on average 19.5 years old."



Logical

Logical values result from comparisons between values, so-called logical expressions. The result of a logical expression is either TRUE or FALSE.

x <- 100
y <- 200 # sample values

x > y
## [1] FALSE
x < y
## [1] TRUE

The data type reserved for logical values in R is called logical. In other programming languages, and in general, this data type is also called Boolean.

z <- 100 == 100
class(z)
## [1] "logical"

Standard comparison operators are >, <, >=, <=, == (equal), and != (not equal). Standard logical operators are & (logical AND), | (logical OR), and ! (logical NEGATION).

a <- TRUE
b <- FALSE
a & b # are a and b both TRUE?
## [1] FALSE
a | b # is at least one of a and b TRUE?
## [1] TRUE
!a # invert the logical value of a
## [1] FALSE
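
The comparison and logical operators also work element by element on vectors. The sketch below uses a vector of made-up sample values (temps is a hypothetical name) and shows that TRUE counts as 1 when logical values are summed.

```r
temps <- c(5, 12, 8, 20) # hypothetical sample values

# Element-wise comparison returns one logical value per element
temps > 10
## [1] FALSE  TRUE FALSE  TRUE

# Combine two conditions with &
temps > 6 & temps < 15
## [1] FALSE  TRUE  TRUE FALSE

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
sum(temps > 10)
## [1] 2

# any() and all() reduce a logical vector to a single value
any(temps > 10)
## [1] TRUE
all(temps > 10)
## [1] FALSE
```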


Exercise: Check whether the value stored in the variable word is a palindrome by using logical expressions and the substr() function!

word <- "kayak"

### your code here
Solution:
(substr(word, 1, 1) == substr(word, 5, 5)) & (substr(word, 2, 2) == substr(word, 4, 4))
## [1] TRUE
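
The solution above is tied to a word of exactly five letters. As a possible generalisation (a sketch, not part of the original exercise), a word of any length can be split into single characters and compared with its reverse. The helper is_palindrome() is a hypothetical name and not a base R function.

```r
# is_palindrome() is a hypothetical helper, not part of base R
is_palindrome <- function(word) {
  chars <- strsplit(word, "")[[1]] # split the word into single characters
  identical(chars, rev(chars))     # compare with the reversed sequence
}

is_palindrome("kayak")
## [1] TRUE
is_palindrome("canoe")
## [1] FALSE
```

Note that this sketch is case-sensitive, so a word such as "Kayak" would first need to be passed through tolower().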



Factor

Factors are used to represent categorical data. They are a very useful class for statistical analysis and for plotting. Note that factors can be ordered or unordered.

Factors can be created using the factor() command.

weather <- factor(c("rainy", "sunny", "sunny", "rainy"))

Factors are stored as integers and have labels, known as levels, associated with these unique integers. By default, R sorts the levels in alphabetical order.

weather
## [1] rainy sunny sunny rainy
## Levels: rainy sunny

R will assign 1 to the level “rainy” and 2 to the level “sunny”. We can check this by using the str() function:

str(weather)
##  Factor w/ 2 levels "rainy","sunny": 1 2 2 1
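
The levels and the underlying integer codes can also be inspected separately. A minimal sketch using the same weather factor:

```r
weather <- factor(c("rainy", "sunny", "sunny", "rainy"))

# The labels, in the order R assigned them
levels(weather)
## [1] "rainy" "sunny"
nlevels(weather)
## [1] 2

# The integer codes stored for each element
as.integer(weather)
## [1] 1 2 2 1
```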

If the order is meaningful (e.g. “low”, “medium”, “high”), we can specify it with the levels argument and mark the factor as ordered with ordered = TRUE:

flood_level <- factor(
  c("low", "high", "medium", "high", "low", "medium"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

min(flood_level)
## [1] low
## Levels: low < medium < high
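
Because the levels of an ordered factor have a rank, other order-based operations work as well. The following sketch rebuilds flood_level so that it runs on its own, then compares each element against a level.

```r
flood_level <- factor(
  c("low", "high", "medium", "high", "low", "medium"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

# The highest level that occurs in the data
max(flood_level)
## [1] high
## Levels: low < medium < high

# Element-wise comparison against a level
flood_level >= "medium"
## [1] FALSE  TRUE  TRUE  TRUE FALSE  TRUE

# Number of observations at "medium" or above
sum(flood_level >= "medium")
## [1] 4
```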

Citation

The E-Learning project SOGA-R was developed at the Department of Earth Sciences by Kai Hartmann, Joachim Krois and Annette Rudolph. You can reach us by email at soga[at]zedat.fu-berlin.de.

Creative Commons License
You may use this project freely under the Creative Commons Attribution-ShareAlike 4.0 International License.

Please cite as follows: Hartmann, K., Krois, J., Rudolph, A. (2023): Statistics and Geodata Analysis using R (SOGA-R). Department of Earth Sciences, Freie Universitaet Berlin.