pandas numpy interpolation

set new index for pandas DataFrame (interpolating?)


I have a DataFrame where the index is NOT time. I need to re-scale all of the values from an old index which is not equi-spaced, to a new index which has different limits and is equi-spaced.

The first and last values in the columns should stay as they are (although they will have the new, stretched index values assigned to them).

Example code is:

import numpy as np
import pandas as pd
%matplotlib inline

index = np.asarray((2, 2.5, 3, 6, 7, 12, 15, 18, 20, 27))
x = np.sin(index / 10)

df = pd.DataFrame(x, index=index)
df.plot();

newindex = np.linspace(0, 29, 100)

How do I create a DataFrame where the index is newindex and the new x values are interpolated from the old x values?

The first new x value should be the same as the first old x value. Ditto for the last x value. That is, there should be no NaNs at the beginning and no repeated copies of the last old x value at the end.

The others should be interpolated to fit the new equi-spaced index.

I tried df.interpolate() but couldn't work out how to interpolate against the newindex.

Thanks in advance for any help.


Solution

  • This works well:

    import numpy as np
    import pandas as pd
    
    def interp(df, new_index):
        """Return a new DataFrame with all column values linearly
        interpolated to the new_index values."""
        df_out = pd.DataFrame(index=new_index)
        df_out.index.name = df.index.name
    
        # DataFrame.iteritems() was removed in pandas 2.0; items() is the equivalent.
        for colname, col in df.items():
            df_out[colname] = np.interp(new_index, df.index, col)
    
        return df_out
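
    Note that np.interp does not extrapolate: outside the old index range it returns the first or last old value. With the question's newindex (0 to 29) and old index (2 to 27), the rows below 2 and above 27 are therefore copies of the end values. If the old index should instead be stretched onto the new limits, as the question describes, one option is to rescale the old index before interpolating. A minimal sketch, assuming both indexes are numeric and sorted in increasing order; interp_stretched is a hypothetical helper name, not part of pandas or NumPy:

```python
import numpy as np
import pandas as pd


def interp_stretched(df, new_index):
    """Stretch df's index onto the limits of new_index, then
    linearly interpolate every column at the new_index values.

    Assumes both indexes are numeric and sorted in increasing order.
    """
    old = df.index.to_numpy(dtype=float)
    new = np.asarray(new_index, dtype=float)

    # Linearly map the old limits [old[0], old[-1]] onto [new[0], new[-1]].
    stretched = np.interp(old, [old[0], old[-1]], [new[0], new[-1]])

    # Interpolate each column against the stretched index.
    data = {col: np.interp(new, stretched, df[col].to_numpy())
            for col in df.columns}
    return pd.DataFrame(data, index=pd.Index(new, name=df.index.name))


# Usage with the data from the question.
index = np.asarray((2, 2.5, 3, 6, 7, 12, 15, 18, 20, 27))
df = pd.DataFrame(np.sin(index / 10), index=index)
newindex = np.linspace(0, 29, 100)

result = interp_stretched(df, newindex)
```

    Here the first and last rows of result carry the first and last old values, and everything in between is linearly interpolated on the stretched axis. Whether stretching or clamping is wanted depends on what the index represents, so treat this as one possible reading of the question.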