python pandas validation matplotlib time-series chart

How to combine two line graphs for data validation during time series analysis in Python


I have separated my training data to carry out validation and was trying to follow an example on how to graph the training and validation data in one time series chart where the lines would connect and change colors at the validation data (see picture for example I was following).

I am following the example fairly closely (adjusted for my own data) but am getting two separate charts. The example's code and output:

Train.Count.plot(figsize=(15,8), title= 'Daily Ridership', fontsize=14, label='train') 
valid.Count.plot(figsize=(15,8), title= 'Daily Ridership', fontsize=14, label='valid') 
plt.xlabel("Datetime") plt.ylabel("Passenger count") plt.legend(loc='best') plt.show()

[Image: chart from the example followed]

My code (using different data that is set up to provide the sum of values in pandas dataframes bananas_train and bananas_val) is as follows:

bananas_train.plot(figsize=(15,8), title= 'Bananas Daily', fontsize=14, label='train')
bananas_val.plot(figsize=(15,8), title= 'Bananas Daily', fontsize=14, label='valid')
plt.xlabel("Date")
plt.ylabel("Quantity Picked")
plt.legend(loc='best')
plt.show()

I split the last line of the original code into separate statements because it raised syntax errors as written.

I have attempted to redo with the following code and still get two separate graphs:

#attempt 2
plt.figure(figsize=(28,8))
ax1 = bananas_train.plot(color='blue', grid=True, label='train')
ax2 = bananas_val.plot(color='red', grid=True, label='valid')
plt.xlabel("Date")
plt.ylabel("Quantity Picked")
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
plt.legend(h1+h2, l1+l2, loc='best')
plt.show()

My result shows two separate time series charts rather than one. Any ideas on how I can get these two charts to merge?

Edit: If I follow the pages that were listed as duplicates of this question ("Plotting multiple dataframes using pandas functionality" and "Plot different DataFrames in the same figure"), I get a graph with both lines, but with the x-axis matching up from the beginning rather than continuing from where the first line left off. Here is my code where I attempted to follow those directions and got this result:

#attempt 3
ax = bananas_train.plot()
bananas_val.plot(ax=ax)
plt.show()

I need my result to continue the line graphed with the newly merged data, not overlap the lines.


Solution

  • Next time it would be nice to include some sample data in your question. I am sure there are better ways of achieving what you want but I will list one here:

    import pandas as pd
    import matplotlib.pyplot as plt

    # get the length of your training set, assuming your data are in
    # 2 dataframes (test and check respectively), each with a column named 'count'
    t_len = len(test['count'].index)

    # concatenate the two columns into one series with a continuous 0..n-1 index
    test_check = pd.concat([test['count'], check['count']], ignore_index=True)

    # create 2 copies, assigning 0s to the training and checking rows respectively
    test_1 = test_check.copy()
    check_1 = test_check.copy()

    test_1.iloc[0:t_len] = 0   # training rows zeroed, checking values kept
    check_1.iloc[t_len:] = 0   # checking rows zeroed (the split point is t_len, not c_len)

    # make the plot; both calls are on a Series, so the second reuses the first one's axes
    test_1.plot(figsize=(15,8), color='blue')
    check_1.plot(figsize=(15,8), color='red')
    plt.show()
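
One side effect of filling with 0 is that each line is drawn along zero outside its own range. A possible variation, offered only as a sketch, masks the unwanted rows with NaN instead, since matplotlib leaves gaps at NaN points. The function names split_for_plot and plot_train_valid are made up for illustration, and the sketch assumes each input is a single pandas Series (for example one column of the question's dataframes), with the validation rows following the training rows in order.

```python
import matplotlib.pyplot as plt
import pandas as pd


def split_for_plot(train, valid):
    """Return two equal-length Series, each NaN outside its own segment.

    Hypothetical helper: `train` and `valid` are assumed to be pandas Series.
    """
    # Stack the two series end to end on a fresh 0..n-1 index.
    full = pd.concat([train, valid], ignore_index=True)
    positions = pd.Series(range(len(full)), index=full.index)
    n_train = len(train)

    # Rows outside each segment become NaN, which matplotlib does not draw.
    train_part = full.where(positions < n_train)
    # Start the validation segment at the last training point so the
    # two colored lines touch instead of leaving a one-step gap.
    valid_part = full.where(positions >= n_train - 1)
    return train_part, valid_part


def plot_train_valid(train, valid, ax=None):
    """Draw both segments on one axes, changing color at the split."""
    train_part, valid_part = split_for_plot(train, valid)
    if ax is None:
        _, ax = plt.subplots(figsize=(15, 8))
    # Passing the same ax to both calls keeps everything on one chart.
    train_part.plot(ax=ax, color="blue", label="train")
    valid_part.plot(ax=ax, color="red", label="valid")
    ax.legend(loc="best")
    return ax
```

To use it, pass the two series to plot_train_valid, set the axis labels on the returned axes, and then call plt.show(). Because the index is rebuilt with ignore_index=True, the x-axis shows row positions rather than dates.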