Tags: python, pandas, dataframe, time-series, shift

Shift a DataFrame and fill up the resulting NaN


I want to create a DataFrame that contains the hourly heat demand of n consumers (here n=5) for 10 hours: a DataFrame called "Village" with n columns (one per consumer) and 10 rows (one per hour). All consumers follow the same demand profile; the only difference is that it is shifted by a random number of hours. The random number follows a normal distribution.

I managed to create a list of discrete numbers that follow a normal distribution, and I managed to create a DataFrame with n columns in which the same demand profile is shifted by that random number.

The problem I can't solve is that NaN appears, instead of the shifted hours being filled with the values that were cut off by the shift.

Example: if the demand profile is shifted by 1 hour (as for consumer 5), "NaN" appears as the demand in the first hour. Instead of "NaN", I would like the value of the 10th hour of the original demand profile (4755.005240) to appear. So rather than shifting the values of the demand profile, I want to "rotate" them.

   heat_demand
0  1896.107462
1  1964.878199
2  2072.946499
3  2397.151402
4  3340.292937
5  4912.195496
6  6159.893152
7  5649.024821
8  5157.805271
9  4755.005240

    Consumer 1   Consumer 2   Consumer 3   Consumer 4   Consumer 5
0  1896.107462          NaN  1964.878199          NaN          NaN
1  1964.878199          NaN  2072.946499          NaN  1896.107462
2  2072.946499          NaN  2397.151402          NaN  1964.878199
3  2397.151402  1896.107462  3340.292937  1896.107462  2072.946499
4  3340.292937  1964.878199  4912.195496  1964.878199  2397.151402
5  4912.195496  2072.946499  6159.893152  2072.946499  3340.292937
6  6159.893152  2397.151402  5649.024821  2397.151402  4912.195496
7  5649.024821  3340.292937  5157.805271  3340.292937  6159.893152
8  5157.805271  4912.195496  4755.005240  4912.195496  5649.024821
9  4755.005240  6159.893152          NaN  6159.893152  5157.805271

Could someone maybe give me a hint how to solve that problem? Thanks a lot already in advance and kind regards

Luise

import numpy as np
import pandas as pd
import os

path = os.path.dirname(os.path.abspath(__file__))

#Create a list with discrete numbers following normal distribution
n = 5
timeshift_1h = np.random.normal(loc=0.1085, scale=1.43825, size=n)
timeshift_1h = np.round(timeshift_1h).astype(int)
print ("Time Shift in h:", timeshift_1h)

#Read the Standard Load Profile
cols = ["heat_demand"]
df_StandardLoadProfile = pd.read_excel(os.path.join(path, '10_h_example.xlsx'),usecols=cols)
print(df_StandardLoadProfile)

#Create a df for n consumers, whose demand equals a shifted StandardLoadProfile.
#It is shifted by a random amount of hours, that is taken from the list timeshift_1h
list_consumers = list(range(1,n+1))
Village=pd.DataFrame()
for i in list_consumers:
    a = timeshift_1h[i-1]
    name = "Consumer {}".format(i)
    # shift() pads the vacated rows with NaN, which is the problem described above
    Village[name] = df_StandardLoadProfile.shift(a)
print(Village)

Solution

  • There's a very nice NumPy function for that use case, namely np.roll (see the NumPy documentation). It takes an array and shifts it by the number of steps given in shift; elements that roll off one end are reintroduced at the other, so no NaN appears.

    For your example, this could look like the following:

    import pandas as pd
    import numpy as np
    
    df = pd.read_csv("demand.csv")
    df['Consumer 1'] = np.roll(df["heat_demand"], shift=1)
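To cover all n consumers rather than a single column, the same call can be applied once per shift. The following is only a minimal sketch: it hard-codes the ten demand values from the question instead of reading a file, and the shifts (0, 3, -1, 3, 1) are an assumption read off the question's example output, standing in for the randomly drawn ones. The names profile and village are illustrative.

```python
import numpy as np
import pandas as pd

# Demand profile copied from the question (10 hourly values).
profile = pd.Series(
    [1896.107462, 1964.878199, 2072.946499, 2397.151402, 3340.292937,
     4912.195496, 6159.893152, 5649.024821, 5157.805271, 4755.005240],
    name="heat_demand",
)

# Assumed shifts, read off the question's example table; in the real script
# these would come from np.round(np.random.normal(...)).astype(int).
timeshift_1h = np.array([0, 3, -1, 3, 1])

# One column per consumer; np.roll wraps values around instead of dropping them.
village = pd.DataFrame(
    {
        "Consumer {}".format(i): np.roll(profile.to_numpy(), shift)
        for i, shift in enumerate(timeshift_1h, start=1)
    },
    index=profile.index,
)
print(village)
```

With these shifts, Consumer 5 (shifted by 1 hour) starts with 4755.005240, the value of the 10th hour of the original profile, and no column contains NaN.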