
Pandas: filling gaps in a time series with a specific value


I have the following data frame:

    timestamp            value
    2018-02-26 09:13:00  0.972198
    2018-02-26 09:14:00  1.008504
    2018-02-26 09:15:00  1.011961
    2018-02-26 09:18:00  1.018950
    2018-02-26 09:19:00  1.008538
    2018-02-26 09:21:00  0.988535
    2018-02-26 09:22:00  0.944170
    2018-02-26 09:23:00  0.940284

I want to fill every gap in the timestamps with value = 2, so the output would look like:

    timestamp            value
    2018-02-26 09:13:00  0.972198
    2018-02-26 09:14:00  1.008504
    2018-02-26 09:15:00  1.011961
    2018-02-26 09:16:00  2.0
    2018-02-26 09:17:00  2.0
    2018-02-26 09:18:00  1.018950
    2018-02-26 09:19:00  1.008538
    2018-02-26 09:20:00  2.0
    2018-02-26 09:21:00  0.988535
    2018-02-26 09:22:00  0.944170
    2018-02-26 09:23:00  0.940284

I tried the following code to fill the gaps in timestamp first:

    df.reindex(index = 'timestamp')

but got the following error. What did I miss? Thanks!

TypeError                                 Traceback (most recent call last)
<ipython-input-5-cf75ce057c42> in <module>()
----> 1 df.reindex(index = 'timestamp')

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/frame.pyc in reindex(self, index, columns, **kwargs)
   2731     def reindex(self, index=None, columns=None, **kwargs):
   2732         return super(DataFrame, self).reindex(index=index, columns=columns,
-> 2733                                               **kwargs)
   2734 
   2735     @Appender(_shared_docs['reindex_axis'] % _shared_doc_kwargs)

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/generic.pyc in reindex(self, *args, **kwargs)
   2513         # perform the reindex on the axes
   2514         return self._reindex_axes(axes, level, limit, tolerance, method,
-> 2515                                   fill_value, copy).__finalize__(self)
   2516 
   2517     def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/frame.pyc in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   2677         if index is not None:
   2678             frame = frame._reindex_index(index, method, copy, level,
-> 2679                                          fill_value, limit, tolerance)
   2680 
   2681         return frame

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/frame.pyc in _reindex_index(self, new_index, method, copy, level, fill_value, limit, tolerance)
   2685         new_index, indexer = self.index.reindex(new_index, method=method,
   2686                                                 level=level, limit=limit,
-> 2687                                                 tolerance=tolerance)
   2688         return self._reindex_with_indexers({0: [new_index, indexer]},
   2689                                            copy=copy, fill_value=fill_value,

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/indexes/base.pyc in reindex(self, target, method, level, limit, tolerance)
   2865             target = self._simple_new(None, dtype=self.dtype, **attrs)
   2866         else:
-> 2867             target = _ensure_index(target)
   2868 
   2869         if level is not None:

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/indexes/base.pyc in _ensure_index(index_like, copy)
   4025             index_like = copy(index_like)
   4026 
-> 4027     return Index(index_like)
   4028 
   4029 

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/indexes/base.pyc in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    324                          **kwargs)
    325         elif data is None or is_scalar(data):
--> 326             cls._scalar_data_error(data)
    327         else:
    328             if (tupleize_cols and isinstance(data, list) and data and

/opt/conda/envs/python2/lib/python2.7/site-packages/pandas/core/indexes/base.pyc in _scalar_data_error(cls, data)
    676         raise TypeError('{0}(...) must be called with a collection of some '
    677                         'kind, {1} was passed'.format(cls.__name__,
--> 678                                                       repr(data)))
    679 
    680     @classmethod

TypeError: Index(...) must be called with a collection of some kind, 'time' was passed

Solution

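reindex expects a collection of new index labels, not the name of a column, which is why passing a plain string raises the TypeError above. Build the full range of timestamps first, then reindex against it. (If timestamp holds strings, convert it with pd.to_datetime first.)

  • Build the complete one-minute range with date_range:

    ts = pd.date_range(df.timestamp.min(), df.timestamp.max(), freq='1min')

  • Use set_index, then reindex and fillna:

    df.set_index('timestamp').reindex(ts).fillna(2.0).rename_axis('timestamp').reset_index()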
    
                 timestamp     value
    0  2018-02-26 09:13:00  0.972198
    1  2018-02-26 09:14:00  1.008504
    2  2018-02-26 09:15:00  1.011961
    3  2018-02-26 09:16:00  2.000000
    4  2018-02-26 09:17:00  2.000000
    5  2018-02-26 09:18:00  1.018950
    6  2018-02-26 09:19:00  1.008538
    7  2018-02-26 09:20:00  2.000000
    8  2018-02-26 09:21:00  0.988535
    9  2018-02-26 09:22:00  0.944170
    10 2018-02-26 09:23:00  0.940284
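
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt from the data in the question, and the helper name fill_time_gaps is hypothetical, introduced only for illustration. It passes fill_value to reindex instead of chaining fillna, which is a small variation on the answer above rather than the answer's exact code.

```python
import pandas as pd

# Sample data rebuilt from the question; 09:16, 09:17 and 09:20 are missing.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2018-02-26 09:13:00", "2018-02-26 09:14:00", "2018-02-26 09:15:00",
        "2018-02-26 09:18:00", "2018-02-26 09:19:00", "2018-02-26 09:21:00",
        "2018-02-26 09:22:00", "2018-02-26 09:23:00",
    ]),
    "value": [0.972198, 1.008504, 1.011961, 1.018950,
              1.008538, 0.988535, 0.944170, 0.940284],
})


def fill_time_gaps(frame, fill_value=2.0, freq="1min"):
    """Hypothetical helper: insert a row for every missing timestamp."""
    # Every timestamp from the first to the last observation, at `freq` spacing.
    ts = pd.date_range(frame["timestamp"].min(), frame["timestamp"].max(), freq=freq)
    return (
        frame.set_index("timestamp")
             # Rows for labels absent from the original index get `fill_value`.
             .reindex(ts, fill_value=fill_value)
             # The new index has no name; restore it before moving it to a column.
             .rename_axis("timestamp")
             .reset_index()
    )


filled = fill_time_gaps(df)
print(filled)
```

The difference between the two variants: fill_value only affects the rows that reindex adds, while fillna(2.0) would also overwrite any NaN that was already present in the original data. Which one is appropriate depends on whether pre-existing missing values should be treated the same way as missing timestamps.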