Tags: python, numpy, time-series, difference, logarithm

Return to natural numbers from a logarithmic difference in Python


I'm working with time series data and have transformed numbers to logarithmic differences with numpy.

df['dlog']= np.log(df['columnx']).diff()

Then I made predictions with that transformation.

How can I return to normal numbers?


Solution

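    • Reversing the transformation shouldn't be necessary for the existing rows, because columnx still exists in df
    • .diff() calculates the difference between each Series element and another element in the same Series; by default that is the previous element (periods=1).
      • The first row of dlog is NaN, so dlog alone does not determine the original values. Stepping back through the transformation requires a "base" number (e.g. np.log(764677)); with it, the log values are the base plus the cumulative sum of dlog, and np.exp returns them to the original scale.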
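    import numpy as np
    import pandas as pd
    df = pd.DataFrame({'columnx': np.random.randint(1, 1_000_000, size=100)})  # start at 1 so np.log never sees 0
    df['dlog'] = np.log(df['columnx']).diff()  # log difference to the previous row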
    

    Output:

     columnx      dlog
      764677       NaN
      884574  0.145653
      621005 -0.353767
      408960 -0.417722
      248456 -0.498352
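
The reversal described in the bullets above can be sketched as follows. This is a minimal illustration, not the only way to do it: it assumes the first original value of columnx is known and uses it as the base, and the column name restored is made up for the example. The input values are the five rows shown in the output above.

```python
import numpy as np
import pandas as pd

# Sample values copied from the output shown above
df = pd.DataFrame({'columnx': [764677, 884574, 621005, 408960, 248456]})
df['dlog'] = np.log(df['columnx']).diff()

# Assumption: the first original value is known and serves as the base
base_log = np.log(df['columnx'].iloc[0])

# The cumulative sum of the differences (first NaN treated as 0)
# rebuilds the log series from the base
log_levels = base_log + df['dlog'].fillna(0).cumsum()

# np.exp maps the log series back to the original scale
df['restored'] = np.exp(log_levels)

print(df)
```

The restored column should match columnx up to floating-point rounding.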
    

    Undo np.log with np.exp

    • Use np.exp to transform from a logarithmic to linear scale.
    df = pd.DataFrame({'columnx': np.random.randint(1, 1_000_000, size=100)})  # start at 1 so np.log never sees 0
    df['log'] = np.log(df['columnx'])
    df['linear'] = np.exp(df['log'])  # bracket access avoids clashes with DataFrame attributes
    

    Output:

     columnx        log    linear
      412863  12.930871  412863.0
      437565  12.988981  437565.0
      690926  13.445788  690926.0
      198166  12.196860  198166.0
      427894  12.966631  427894.0
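
For the predictions mentioned in the question, the same idea applies: np.exp undoes the logarithm once the predicted differences have been accumulated from a known starting point. The sketch below assumes the predictions are log differences for consecutive future periods and that the last observed value of columnx is available. The names last_observed, pred_dlog and pred_levels, and all the numbers, are hypothetical and only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical last observed value of columnx (the base for the forecast)
last_observed = 427894

# Hypothetical predicted log differences for the next three periods
pred_dlog = pd.Series([0.02, -0.01, 0.03])

# Each predicted level is the last observed value multiplied by
# exp of the running sum of the predicted log differences
pred_levels = last_observed * np.exp(pred_dlog.cumsum())

print(pred_levels)
```

Multiplying by the exponential of the running sum is equivalent to adding the running sum to np.log(last_observed) and then applying np.exp.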
    

    Further Notes: