Add Metadata for later filtering¶
# Load data set
import pandas as pd
import datetime
df = pd.read_csv("https://raw.githubusercontent.com/retomarek/edap/main/edap/sampleData/flatElectricity.csv",
sep = ";")
df
df['time'] = pd.to_datetime(df['time'], format='%Y-%m-%d %H:%M:%S')
df = df.set_index("time", drop=True)
df = df[["FlatA_Ele"]]
df.rename({'FlatA_Ele': 'value'}, axis=1, inplace=True)
df.head()
| value | |
|---|---|
| time | |
| 2018-11-30 23:45:00 | 5619.889 |
| 2018-12-01 00:00:00 | 5619.904 |
| 2018-12-01 00:15:00 | 5619.929 |
| 2018-12-01 00:30:00 | 5619.952 |
| 2018-12-01 00:45:00 | 5619.973 |
Year, Month, Day, Day of Week¶
To e.g. group, filter and aggregate data we need eventually the date splitted up in day, month and year
df['year'] = df.index.year
df['month'] = df.index.month
df['day'] = df.index.day
df['dateString'] = df.index.strftime("%Y-%m-%d")
df['hour'] = df.index.hour
df['minute'] = df.index.minute
df['second'] = df.index.second
df['dayofweek'] = df.index.dayofweek # 0=Mo, 6=Su
df['dayofweekString'] = df.index.strftime('%a')
df["timeDec"] = df["hour"] + df["minute"]/60 + df["second"]/3600
df.head()
| value | year | month | day | hour | minute | second | dayofweek | dayofweekString | timeDec | |
|---|---|---|---|---|---|---|---|---|---|---|
| time | ||||||||||
| 2018-11-30 23:45:00 | 5619.889 | 2018 | 11 | 30 | 23 | 45 | 0 | 4 | Fri | 23.75 |
| 2018-12-01 00:00:00 | 5619.904 | 2018 | 12 | 1 | 0 | 0 | 0 | 5 | Sat | 0.00 |
| 2018-12-01 00:15:00 | 5619.929 | 2018 | 12 | 1 | 0 | 15 | 0 | 5 | Sat | 0.25 |
| 2018-12-01 00:30:00 | 5619.952 | 2018 | 12 | 1 | 0 | 30 | 0 | 5 | Sat | 0.50 |
| 2018-12-01 00:45:00 | 5619.973 | 2018 | 12 | 1 | 0 | 45 | 0 | 5 | Sat | 0.75 |