python csv raspberry-pi export-to-csv sensors

CSV file not updating until script is terminated when continuously appending file


I recently started a project to make a data logging script for my Raspberry Pi. My goal is to make a program that can:

  1. Collect voltage data from the pH sensor that's hooked up to my Pi (that's the getData() function).
  2. Plot it on the screen in real time.
  3. Continuously append the readings to a CSV file every few seconds.

I've included my code below for both the function that collects sensor data and the function that plots and saves it. Everything works as expected, except that I can't see the new data added to the CSV file test2.csv until I terminate the code with Ctrl+C (the IDE is Thonny).

While the code is running, the CSV file is just blank; not even the heading "Date and Time, P0 (V), P1 (V)" is there, which makes me think that Python doesn't actually append new data to the file until the very end.

If I stop the script using the red "Stop/Restart Backend" button in Thonny, no new data is added to test2.csv. New data is only appended to test2.csv if I run the script and stop it with Ctrl+C, which isn't helpful because I want to be able to access the existing data without stopping the program.

Any idea on how to fix this so that the CSV file will update without me terminating the script?

My function for collecting data:

def getData():
    import board
    import busio
    i2c = busio.I2C(board.SCL, board.SDA)

    #import board module (ADS1115)
    import adafruit_ads1x15.ads1115 as ADS

    #import ADS1x15 library's version of AnalogIn
    from adafruit_ads1x15.analog_in import AnalogIn

    #create ADS object
    ads = ADS.ADS1115(i2c)
    ads.gain = 2/3

    #single ended mode read for pin 0 and 1
    chan = AnalogIn(ads, ADS.P0)
    chan1 = AnalogIn(ads, ADS.P1)
    return chan.voltage, chan1.voltage

Here is the part that actually plots and saves the data (and probably where things went wrong):

def DataLogger(file_name):
    #importing data from other function
    from getData import getData
    #importing/initializing for plotting/saving
    import matplotlib.pyplot as plt
    from time import sleep
    import csv
    plt.ion()
    voltage_list = []
    voltage1_list = []
    t = []
    ii = 0
    print("{:>5}\t{:>5}".format('P0','P1'))

    #create subplots and set axis
    fig, (ax1, ax2) = plt.subplots(2)
    fig.suptitle('Aquaponic Sensors')
    ax1.set_ylabel('pH (V)')
    ax1.set_ylim([2,4])
    ax2.set_ylabel('Temperature Voltage (V)')
    ax2.set_ylim([0,5])
    ax2.set_xlabel('Time (s)')

    #import date and time for timestamp
    from datetime import datetime

    #clear csv file on flash drive (opening in "w" mode already truncates it)
    loc = "/media/pi/68D2-7E93/" + file_name
    open(loc, "w").close()

    #save data into test.csv on flash drive by appending new row
    with open(loc,'a+',newline='') as file:
        writer = csv.writer(file)
        writer.writerow(["Date and Time","P0 (V)", "P1 (V)"])

        #define output of both channels
        while True:
            reading = getData()  #read both channels once per loop
            voltage = round(reading[0], 3)
            voltage1 = round(reading[1], 3)
            print("{:>5}\t{:>5}".format(voltage, voltage1)) #could remove
            #append new output to existing lists
            voltage_list.append(voltage)
            voltage1_list.append(voltage1)
            t.append(ii)
            ii = ii+1 #time counter
            sleep(1)
            #append data to csv file & plot
            if ii % 5 == 0:  #every fifth reading
                now = datetime.now()
                dt_string = now.strftime("%m/%d/%Y %H:%M:%S")
                writer.writerow([dt_string, voltage, voltage1])

                #plot the lists
                ax1.plot(t, voltage_list, 'tab:blue')
                ax2.plot(t, voltage1_list, 'tab:red')
                #actually draws plot
                plt.draw()
                plt.pause(0.001) #some weird thing that makes the plot update, doesn't work without this pause

This is the file I used to run DataLogger():

from DataLogger import DataLogger
DataLogger("test2.csv")

To run the file without any hardware connected, you can modify getData so that it just generates random numbers and feeds them into DataLogger:

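def getData():
    #stand-in for the sensor: returns two random "voltages" between 0 and 1
    #(no seed() call here; reseeding on every call would return the same pair each time)
    from random import random
    chanvoltage = random()
    chan1voltage = random()
    return chanvoltage, chan1voltage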

Solution

  • The answer is actually pretty simple and easily reproducible. Python does not write to files until you close the file that was opened.

    Proof:

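    file_name = 'demo.txt'  # hypothetical name; any writable path works

    with open(file_name, 'w') as file:
        file.write('1')
        input('Press enter to continue')  # first pause: '1' has not reached the file yet
        file.close()
    input('Press enter to continue')  # second pause: file contains 1
    with open(file_name, 'a+') as file:
        file.write('2')
        file.close()
        input('Press enter to continue')  # third pause: file contains 12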

    If you run this and look at the file at each pause, you will find that it is empty at the first pause, contains 1 at the second, and contains 12 at the third.

    In order to fix this, you should occasionally save the file by closing the opened file with .close() then reopening it.

    Edit:

    martineau made an important correction to my answer. I'm adding it here so that other viewers don't copy my mistake.

    Python does write to files before they're closed. Files generally have an associated buffer in memory where data goes initially, and when that buffer fills up its contents are actually written out to the physical file. When a file is closed, the buffer is written out as well, even if it is not yet full.

    To write the buffer out on demand, call the file object's .flush() method, which pushes the buffered data out to the physical file without closing it.

    Finally, in my example, I did open the file in append mode on the second call, instead of write mode, which changes the behavior from overwriting a file to appending data to the end.
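
As a rough illustration of the buffering described in this answer, here is a minimal sketch that writes a CSV header, reads the file back through a second handle before and after calling flush(), and compares the two. The scratch path and the rows_on_disk helper are made up for the demo and are not part of the question's code. The os.fsync call is an optional extra that may help on removable media such as a flash drive; that part is an assumption, not something tested on the asker's setup.

```python
import csv
import os
import tempfile

# Hypothetical scratch file so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "flush_demo.csv")


def rows_on_disk(csv_path):
    """Read the file through a separate handle, as another program would."""
    with open(csv_path, newline="") as f:
        return list(csv.reader(f))


with open(path, "w", newline="") as file:  # "w" also empties an existing file
    writer = csv.writer(file)
    writer.writerow(["Date and Time", "P0 (V)", "P1 (V)"])
    before_flush = rows_on_disk(path)  # usually empty: the row is still buffered

    file.flush()  # hand Python's buffer to the operating system
    os.fsync(file.fileno())  # optional: ask the OS to commit it to the drive
    after_flush = rows_on_disk(path)

print("before flush:", before_flush)
print("after flush: ", after_flush)
```

Applied to the question's DataLogger, the likely change is a single file.flush() call right after writer.writerow([dt_string, voltage, voltage1]) inside the loop, so each new row becomes visible while the script keeps running.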