Having covered Python's built-in data types in the previous chapter, we now introduce the most important data structures in Python. In contrast to the basic data types, data structures are ordered arrangements and combinations of values of the built-in data types. They are used to store more complex data and are therefore essential for data science and statistics. Lists, tuples and dictionaries will be discussed. The concepts of indexing and slicing, described in the previous chapter, are essential for working with lists and tuples; dictionaries are accessed by key instead.
Note: By definition, every object is also a data structure.
Note: It is helpful to imagine data structures as a collection of things.
Lists are probably the handiest and most flexible type of container. A list consists of individual elements of potentially different data types. Lists are declared with square brackets []. Individual elements of a list can be selected using the syntax <list_name>[<index>].
Note: In lists, too, indexing starts at 0, so the first element has index 0.
Let's have a look at an example. We define a list that contains different data types:
a_list = ["blueberry", "strawberry", "pineapple", 1, True]
type(a_list)
list
To access the first element:
a_list[0]
'blueberry'
To select the last element of the list:
a_list[-1]
True
We can compare the data types of two elements with each other:
type(a_list[0]) == type(a_list[-1])
False
As you can see, even though all elements are stored in the same list, the data types of the individual elements can differ:
type(a_list[0])
str
type(a_list[-1])
bool
To reverse the order of a list, you can use slice notation with a step of -1, written [::-1]. This returns a reversed copy and leaves the original list unchanged:
a_list[::-1]
[True, 1, 'pineapple', 'strawberry', 'blueberry']
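The slice notation <list>[<start>:<stop>:<step>] can be sketched a little further. The list fruits below is a hypothetical example, not part of the original text:

```python
# Hypothetical example list (an assumption, not taken from the text above)
fruits = ["blueberry", "strawberry", "pineapple", "mango", "kiwi"]

first_two = fruits[:2]        # elements at index 0 and 1
middle = fruits[1:4]          # index 1 up to, but not including, index 4
every_second = fruits[::2]    # every second element, starting at index 0
reversed_copy = fruits[::-1]  # a step of -1 walks the list backwards

print(first_two)
print(middle)
print(every_second)
print(reversed_copy)
print(fruits)  # slicing returns new lists; the original is unchanged
```

Because a slice always returns a new list, it is safe to use when the original order is still needed afterwards.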
To add an element to a list that is already defined, you do not have to initialise the list again. You add an element at the end of a list by using the <list>.append(<element>) method:
a_list
['blueberry', 'strawberry', 'pineapple', 1, True]
len(a_list)
5
a_list.append("a new thing")
a_list
['blueberry', 'strawberry', 'pineapple', 1, True, 'a new thing']
len(a_list)
6
Sometimes it is useful to read the last element of a list and remove it at the same time. This is done with the <list>.pop() method, which removes the last element and returns it:
a_list.pop()
'a new thing'
a_list.pop()
True
a_list.pop()
1
a_list
['blueberry', 'strawberry', 'pineapple']
len(a_list)
3
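Taken together, append() and pop() let a list behave like a stack (last in, first out). The following is a minimal sketch with a hypothetical list named stack; passing an index to pop() removes the element at that position instead of the last one:

```python
# Hypothetical example: using a list as a simple stack
stack = []
stack.append("a")
stack.append("b")
stack.append("c")

last = stack.pop()    # removes and returns the last element
first = stack.pop(0)  # removes and returns the element at index 0

print(last)   # c
print(first)  # a
print(stack)  # ['b']
```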
If the elements of your list can be compared with each other, for example because they all have the same data type, you can sort the list in place with the <list>.sort() method:
a_list.sort()
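The <list>.reverse() method reverses the order of the elements in place. Applied to a list that has just been sorted in ascending order, it yields the list in descending order: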
a_list.reverse()
a_list
['strawberry', 'pineapple', 'blueberry']
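As a hedged sketch of some sorting alternatives: sort() accepts a reverse=True argument to sort in descending order in a single step, and the built-in sorted() returns a new list instead of changing the original. The variable names below are illustrative assumptions:

```python
fruits = ["strawberry", "pineapple", "blueberry"]

fruits.sort()              # in place, ascending
ascending = list(fruits)   # keep a copy of this state

fruits.sort(reverse=True)  # in place, descending, in a single step
descending = list(fruits)

# sorted() builds a new list; key=len sorts by string length
by_length = sorted(fruits, key=len)

# Elements that cannot be compared with each other cannot be sorted
mixed = ["blueberry", 1, True]
try:
    mixed.sort()
    sort_failed = False
except TypeError:
    sort_failed = True

print(ascending)
print(descending)
print(by_length)
print(sort_failed)
```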
Note: There are more methods associated with the list object. A full overview is given in the Python documentation.
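We won't say a whole lot about tuples except to mention that they basically work just like lists, with two major exceptions: they are declared with parentheses () instead of square brackets [], and once a tuple is created it cannot be changed (it is immutable).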
You'll see tuples come up throughout the Python language, and over time you'll develop a feel for when to use them.
In general, they're often used instead of lists when the items of a collection are not supposed to change:
a_tuple = (1,2,3,4,5)
type(a_tuple)
tuple
If you try to change an element of a tuple, you get a corresponding error message:
a_tuple[2] = 15
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [20], in <cell line: 1>()
----> 1 a_tuple[2] = 15

TypeError: 'tuple' object does not support item assignment
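The following minimal sketch illustrates immutability and two common tuple idioms. The names point and new_point are hypothetical and not part of the original example:

```python
# Hypothetical example: a tuple holding (x, y) coordinates
point = (3, 4)

# Unpacking assigns each element to its own variable
x, y = point

# Item assignment raises a TypeError because tuples are immutable
try:
    point[0] = 10
except TypeError as err:
    message = str(err)

# To "change" a tuple, build a new one, for example via a list
as_list = list(point)
as_list[0] = 10
new_point = tuple(as_list)

# A tuple with a single element needs a trailing comma
single = (5,)

print(x, y)
print(message)
print(point, new_point)
print(type(single))
```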
Finally, we introduce dictionaries. This data structure is widely used in Python because looking up an element by its key is very fast, even in a very large collection. Another difference from the list object is that dictionaries organize their data as key-value pairs. That means the index used in lists is replaced by keys. To access an element you have to know the associated key:
key $\to$ value
Let's have a look at an example:
my_dict = {"Marry" : 22 , "Frank" : 33 }
Note: Dictionaries are defined with curly brackets {}. In this example every key is a string, so the keys are put in quotes "". In general, any hashable object, such as a number or a tuple, can serve as a key. Each key is separated from its associated value by a colon :.
To print the content of a dictionary, just type its name:
my_dict
{'Marry': 22, 'Frank': 33}
To select a specific value of a dictionary, put its key in square brackets [<key>]:
my_dict["Marry"]
22
my_dict["Frank"]
33
To add a new key-value pair to an already defined dictionary:
my_dict["Anne"] = 13
my_dict
{'Marry': 22, 'Frank': 33, 'Anne': 13}
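The same assignment syntax also overwrites the value of a key that already exists. A minimal sketch, using a hypothetical dictionary ages so that my_dict from the example above stays untouched:

```python
# Hypothetical dictionary, separate from my_dict above
ages = {"Marry": 22, "Frank": 33}

ages["Anne"] = 13   # new key: the pair is added
ages["Marry"] = 23  # existing key: the old value is overwritten

removed = ages.pop("Frank")  # removes the key and returns its value
del ages["Anne"]             # removes the key without returning anything

print(removed)  # 33
print(ages)     # {'Marry': 23}
```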
If you try to access a key that does not exist, you get an error message (a KeyError). With the <dictionary>.get() method you can instead return a default value of your choice, for example a custom message:
my_dict.get("Heidi", "Danger no entry found!")
'Danger no entry found!'
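The different ways of dealing with a missing key can be compared side by side. This is a sketch with a hypothetical dictionary ages; the variable names are assumptions for illustration:

```python
# Hypothetical dictionary, separate from my_dict above
ages = {"Marry": 22, "Frank": 33}

# Square brackets raise a KeyError for a missing key
try:
    ages["Heidi"]
except KeyError as err:
    missing_key = err.args[0]

# get() returns the given default instead of raising an error
default_value = ages.get("Heidi", "Danger no entry found!")

# Without a default, get() returns None
no_default = ages.get("Heidi")

# The in operator checks whether a key exists
has_frank = "Frank" in ages

print(missing_key)
print(default_value)
print(no_default)
print(has_frank)
```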
To retrieve all keys of a dictionary, the <dictionary>.keys() method is used. It returns a view object, which can be converted to a list with list() if needed:
my_dict.keys()
dict_keys(['Marry', 'Frank', 'Anne'])
Accordingly, you can ask for all values in a dictionary by using the <dictionary>.values() method:
my_dict.values()
dict_values([22, 33, 13])
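Keys and values can also be retrieved together as pairs with the items() method, which is convenient in loops. A minimal sketch, again with a hypothetical dictionary ages:

```python
# Hypothetical dictionary, separate from my_dict above
ages = {"Marry": 22, "Frank": 33, "Anne": 13}

key_list = list(ages.keys())      # view converted to a plain list
value_list = list(ages.values())
pairs = list(ages.items())        # list of (key, value) tuples

# Loop over key-value pairs, unpacking each tuple
for name, age in ages.items():
    print(f"{name} is {age} years old")

# Key with the largest value: compare keys by their associated value
oldest = max(ages, key=ages.get)
print(oldest)
```

Dictionaries keep their insertion order (guaranteed since Python 3.7), so keys, values and items are returned in the order in which the pairs were added.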
Citation
The E-Learning project SOGA-Py was developed at the Department of Earth Sciences by Annette Rudolph, Joachim Krois and Kai Hartmann. You can reach us via mail by soga[at]zedat.fu-berlin.de.
Please cite as follows: Rudolph, A., Krois, J., Hartmann, K. (2023): Statistics and Geodata Analysis using Python (SOGA-Py). Department of Earth Sciences, Freie Universitaet Berlin.