In the following sections we introduce basic operations for time series analysis. We discuss the following topics:
As mentioned in the previous section, there are several different ways to work with time series in Python. Hence, it is important to be aware of the object class and, with it, the data representation. This representation dictates which functions are available for loading, processing, analyzing, printing, and plotting the time series data.
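To illustrate how the representation determines the available functions, here is a minimal sketch with made-up toy values (not taken from the station data). The variable names are chosen for illustration only. It holds the same four values as a NumPy array, as a pandas Series with a DatetimeIndex, and as a pandas DataFrame with a date column, and shows that the month is reached in a different way in each pandas representation.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, used only for illustration
values = [2.8, 1.1, 5.2, 9.0]
dates = pd.date_range("2000-01-01", periods=4, freq="MS")  # month starts

as_array = np.array(values)                                # no date information
as_series = pd.Series(values, index=dates)                 # dates live in the index
as_frame = pd.DataFrame({"Date": dates, "value": values})  # dates live in a column

for obj in (as_array, as_series, as_frame):
    print(type(obj).__name__)

# Series with a DatetimeIndex: date attributes are available on the index
months_from_series = as_series.index.month.tolist()

# DataFrame with a date column: date attributes go through the .dt accessor
months_from_frame = as_frame["Date"].dt.month.tolist()

print(months_from_series, months_from_frame)
```

The plain array carries no dates at all, so any date-based operation would need a separate array of timestamps.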
For the purpose of demonstration we load the monthly (ts_FUB_monthly), daily (ts_FUB_daily) and hourly (ts_FUB_hourly) time series data for the weather station Berlin-Dahlem (FU) into Python. We can do that by using the pandas.read_json() function. Check out the previous section on data sets used to remind yourself how we processed the data.
# First, let's import the needed libraries.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime
# Load the monthly data and parse the date column.
ts_FUB_monthly = pd.read_json("../data/ts_FUB_monthly.json")
ts_FUB_monthly["Date"] = pd.to_datetime(
    ts_FUB_monthly["Date"], format="%Y-%m-%d", errors="coerce"
)
# Load the daily data and parse the date column.
ts_FUB_daily = pd.read_json("../data/ts_FUB_daily.json")
ts_FUB_daily["MESS_DATUM"] = pd.to_datetime(
    ts_FUB_daily["MESS_DATUM"], format="%Y-%m-%d", errors="coerce"
)
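ts_FUB_hourly = pd.read_json("../data/ts_FUB_hourly.json")
# The hourly timestamps carry a time component, so no date-only format is
# passed here; recent pandas versions parse formats strictly and would
# otherwise coerce these values to NaT.
ts_FUB_hourly["MESS_DATUM"] = pd.to_datetime(
    ts_FUB_hourly["MESS_DATUM"], errors="coerce"
)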
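Because errors="coerce" silently turns every value that cannot be parsed into NaT, it can be worth counting the missing timestamps after the conversion. A minimal sketch, using a small made-up series (the values are illustrative and not taken from the station data):

```python
import pandas as pd

# Hypothetical raw date strings; the last entry is deliberately invalid
raw = pd.Series(["2021-12-30", "2021-12-31", "not a date"])

# Unparseable entries become NaT instead of raising an error
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

# Count how many entries could not be converted
n_failed = int(parsed.isna().sum())
print(f"{n_failed} of {len(parsed)} timestamps could not be parsed")
```

The same check could be applied to the date columns of the three data sets loaded above.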
First, we check the object classes of the three data sets:
print(type(ts_FUB_monthly))
print(str(ts_FUB_monthly))
<class 'pandas.core.frame.DataFrame'>
           Date  rainfall
0    1719-01-01      2.80
1    1719-02-01      1.10
2    1719-03-01      5.20
3    1719-04-01      9.00
4    1719-05-01     15.10
...         ...       ...
3631 2021-08-01     17.43
3632 2021-09-01     15.55
3633 2021-10-01     10.49
3634 2021-11-01      6.28
3635 2021-12-01      2.19

[3636 rows x 2 columns]
print(type(ts_FUB_daily))
print(str(ts_FUB_daily))
<class 'pandas.core.frame.DataFrame'>
      MESS_DATUM  Temp  Rain
0     1950-01-01  -3.2   2.2
1     1950-01-02   1.0  12.6
2     1950-01-03   2.8   0.5
3     1950-01-04  -0.1   0.5
4     1950-01-05  -2.8  10.3
...          ...   ...   ...
26293 2021-12-27  -3.7   0.0
26294 2021-12-28  -0.5   1.5
26295 2021-12-29   4.0   0.3
26296 2021-12-30   9.0   3.2
26297 2021-12-31  12.8   5.5

[26298 rows x 3 columns]
print(type(ts_FUB_hourly))
print(str(ts_FUB_hourly))
<class 'pandas.core.frame.DataFrame'>
                MESS_DATUM  rainfall
0      2002-01-28 11:00:00       0.0
1      2002-01-28 13:00:00       0.0
2      2002-01-28 15:00:00       1.7
3      2002-01-28 18:00:00       1.1
4      2002-01-28 21:00:00       0.0
...                    ...       ...
174018 2021-12-31 19:00:00       0.7
174019 2021-12-31 20:00:00       0.7
174020 2021-12-31 21:00:00       0.1
174021 2021-12-31 22:00:00       0.1
174022 2021-12-31 23:00:00       0.0

[174023 rows x 2 columns]
As the output shows, all three data sets are of class pandas.DataFrame. Each individual column of such a data frame is a pandas.Series.
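If a single column is needed as a pandas.Series indexed by date, one option is to move the date column into the index. The following is a minimal sketch with a small made-up data frame (the values and the quarterly aggregation are chosen only for illustration); the same pattern should apply to the data sets above, with their respective date column names.

```python
import pandas as pd

# Hypothetical toy data frame with a date column and one value column
frame = pd.DataFrame(
    {
        "Date": pd.date_range("2021-01-01", periods=6, freq="MS"),
        "rainfall": [1.0, 2.5, 0.0, 3.2, 4.1, 0.7],
    }
)

# Move the dates into the index and select one column: the result is a Series
series = frame.set_index("Date")["rainfall"]
print(type(series).__name__)        # class of the selected column
print(type(series.index).__name__)  # class of its index

# With a DatetimeIndex, time-based methods such as resample() become available
quarterly_sum = series.resample("QS").sum()
print(quarterly_sum)
```

Keeping the dates in the index rather than in a column is a design choice; it makes date-based slicing and resampling more direct, while the column layout used in this section is convenient for plotting with explicit x and y arguments.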
Now let us plot the monthly data with the plot() function.
plt.figure(figsize=(18, 6))
plt.plot(ts_FUB_monthly.Date, ts_FUB_monthly.rainfall)
plt.show()
Exercise: Plot the daily and hourly data sets using the plot() function.
## Your code here...
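# Two stacked panels: temperature on top, rainfall below.
fig, ax = plt.subplots(2, 1, figsize=(18, 8))
# Plot against the date column so the x-axis shows dates, not row numbers.
ax[0].plot(ts_FUB_daily["MESS_DATUM"], ts_FUB_daily["Temp"])
ax[0].set_title("Temp")
ax[1].plot(ts_FUB_daily["MESS_DATUM"], ts_FUB_daily["Rain"], color="orange")
ax[1].set_title("Rain")
plt.show()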
## Your code here...
plt.figure(figsize=(18, 6))
plt.plot(ts_FUB_hourly.MESS_DATUM, ts_FUB_hourly.rainfall)
plt.show()