Search code examples
pythonpandasformattingtime-seriesseparator

Pandas: Separate two columns with missing separator


i have data like:

00052600150.00942615
00052601000.01014910
00052601050.02709672
00052601100.11454732
00052601150.23151254
00052601200.36262522
00052601250.66432348
00052601301.07723763
00052601351.26019487
00052601401.20568581

First 10 digits represent the timestep YYMMDDhhmm followed by a number

Its supposed to be 0005260010,0.00799872 where the first block is a timestep and the second a measurement.

Ive tried to read the data with pandas and convert it to a str but then i loose the leading zeros? Is there a way to separate a float by digit?

greetings


Solution

  • You can read the column as str and split your value by positions

    df = pd.read_csv('yourfile.csv', header=None, dtype='str', names=['col1'])
    df['time'] = pd.to_datetime(df.col1.str[:10], unit='s')
    df['value'] = (df.col1.str[10:]).astype('float')
    df
    

    Out:

                       col1                time     value
    0  00052600150.00942615 1970-03-02 21:06:55  0.009426
    1  00052601000.01014910 1970-03-02 21:08:20  0.010149
    2  00052601050.02709672 1970-03-02 21:08:25  0.027097
    3  00052601100.11454732 1970-03-02 21:08:30  0.114547
    4  00052601150.23151254 1970-03-02 21:08:35  0.231513
    5  00052601200.36262522 1970-03-02 21:08:40  0.362625
    6  00052601250.66432348 1970-03-02 21:08:45  0.664323
    7  00052601301.07723763 1970-03-02 21:08:50  1.077238
    8  00052601351.26019487 1970-03-02 21:08:55  1.260195
    9  00052601401.20568581 1970-03-02 21:09:00  1.205686