Search code examples
pythonpandasnumpydata-acquisition

Appending Rows to a Pandas DataFrame


I'm trying to read an analog signal from a Measurement-Computing Personal Measurement Device (PMD-1208FS) and then write it to a file with a corresponding timestamp for each observation. I would like to append to this file once every second with a new observation.

The PyUniversalLibrary allows me to read from the device, but I'm stuck trying to figure out how to save the information into a dataframe. This example was helpful for reading data from the PMD, but it doesn't provide any data logging examples.

The example below gets close to solving this problem, but the df.append(pd.DataFrame() function is not providing me with the desired result. This function ends up appending the most recent dataframe to the bottom of the previously saved one, rather than just appending the new data. The result is a dataframe with many duplicate dataframes in sequence.

Here's my code:

## Source libraries:
from __future__ import print_function
import UniversalLibrary as UL
import time, os, io, csv, datetime
import pandas as pd

## Specify PMD settings:
BoardNum = 0
Gain = UL.BIP5VOLTS
Chan = 0

## Create empty lists and a dataframe to fill:
co = [] ## carbon monoxide concentration in ppm
data = [] ## raw analog output between 0-5V
times = [] ## timestamp
df = pd.DataFrame()


## Set filepath:
filename = "~/pmd_data.csv"

while True:
    ts = time.time()
    DataValue = UL.cbAIn(BoardNum, Chan, Gain)
    EngUnits = UL.cbToEngUnits(BoardNum, Gain, DataValue)
    ppm = EngUnits * 10 ## 1 Volt = 10ppm of carbon monoxide
    data.append(EngUnits)
    times.append(datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S'))
    co.append(ppm)
    ## This line of code is not providing the desired result:
    df = df.append(pd.DataFrame({'co':ppm, 'volts':data, 'datetime':times})) 
    print(df)
    df.to_csv(filename, sep = ',', index = False, encoding = 'utf-8')
    time.sleep(1)

Current Output:

    co    datetime    volts
0    13.8    2017-05-03 15:57:19   1.38
1    13.8    2017-05-03 15:57:19   1.38    
2    13.9    2017-05-03 15:57:20   1.39
3    13.8    2017-05-03 15:57:19   1.38
4    13.9    2017-05-03 15:57:20   1.39
5    14.2    2017-05-03 15:57:21   1.42

Desired Output:

    co    datetime    volts
0    13.8    2017-05-03 15:57:19   1.38
1    13.9    2017-05-03 15:57:20   1.39
2    14.2    2017-05-03 15:57:21   1.42

Solution

  • Since you don't use the index specifically, I'd keep a counter and use it to add a new row to an existing dataframe.

    I'd rewrite the while loop like this

    ## Source libraries:
    from __future__ import print_function
    import UniversalLibrary as UL
    import time, os, io, csv, datetime
    import pandas as pd
    
    ## Specify PMD settings:
    BoardNum = 0
    Gain = UL.BIP5VOLTS
    Chan = 0
    
    ## Create empty lists and a dataframe to fill:
    df = pd.DataFrame(columns=['co', 'volts', 'datetime'])
    
    ## Set filepath:
    filename = "~/pmd_data.csv"
    
    counter = 0
    while True:
        ts = time.time()
        DataValue = UL.cbAIn(BoardNum, Chan, Gain)
        EngUnits = UL.cbToEngUnits(BoardNum, Gain, DataValue)
        ppm = EngUnits * 10 ## 1 Volt = 10ppm of carbon monoxide
        df.loc[counter] = pd.Series(dict(
                co=ppm, volts=EngUnits, datetime=ts
            ))
        ## This line of code is not providing the desired result:
        counter += 1
        df.to_csv(filename, sep = ',', index = False, encoding = 'utf-8')
        time.sleep(1)