I have a .csv file on disk, formatted so that I can easily read it into a pandas DataFrame, and I periodically append rows to it. I need this database to have a row index, so every time I write a new row I need to know the index of the last row written.
There are plenty of ways to do this:
I am curious if there is a way to just get that one cell directly, without having to read a whole bunch of extra information into memory. Any suggestions?
Reading only the index column still requires reading and parsing the whole file.
If no fields in the file are multiline, you could scan the file backwards from the end until you hit a newline, first skipping any trailing newline that follows the last row of data. The value right after that newline is your last index (assuming the index is the first column).
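A minimal sketch of the backward scan, assuming the index is the first column, no field contains an embedded newline, and the separator is a comma. The function name `last_index` and the `chunk_size` parameter are my own choices for illustration, not anything from pandas.

```python
import os

def last_index(path, chunk_size=4096, sep=","):
    """Return the first field of the last non-empty line, as text.

    Assumes no field contains an embedded newline.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        tail = b""
        # Read fixed-size chunks from the end until the tail holds a
        # newline that is not just the trailing one after the last row.
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            tail = f.read(step) + tail
            if b"\n" in tail.rstrip(b"\r\n"):
                break
    # Drop trailing line endings, then keep what follows the last newline.
    line = tail.rstrip(b"\r\n")
    line = line[line.rfind(b"\n") + 1:]
    return line.split(sep.encode(), 1)[0].decode()
```

Only the end of the file is read, so the cost does not grow with the number of rows. Note that on a file containing nothing but the header row this returns the first column name, and on an empty file it returns an empty string, so the caller has to handle those cases before converting the result to an integer.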
Storing the last index in another file would also be a possibility, but you would have to make sure both files stay consistent.
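Here is a rough sketch of the separate-file approach, assuming an integer index that starts at 0 and a single writer process. The helper name `append_with_sidecar` and the sidecar file itself are hypothetical; nothing here is a pandas feature.

```python
import csv
import os

def append_with_sidecar(csv_path, idx_path, fields):
    """Append one row with the next index and record that index in idx_path."""
    try:
        with open(idx_path) as f:
            last = int(f.read().strip())
    except FileNotFoundError:
        last = -1  # no rows written yet
    new = last + 1

    # Write the data row first and push it to disk.
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerow([new, *fields])
        f.flush()
        os.fsync(f.fileno())

    # Then swap in the new index via a temporary file, so the sidecar
    # never holds a partially written value.
    tmp_path = idx_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(str(new))
    os.replace(tmp_path, idx_path)
    return new
```

This does not fully solve the consistency problem: if the process dies between the two writes, the sidecar is one row behind the .csv, so a check against the actual last row at startup would still be a good idea.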
Another way would be to reserve a fixed number of bytes at the beginning of the file and overwrite them in place with the last index value, stored as a comment line. Your parser would then have to support comments, or be able to skip rows.
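The reserved-header idea could look roughly like this. The layout (a `#` followed by a right-aligned number in a fixed-width field), the width `HEADER_WIDTH`, and all function names are assumptions made for the sketch. It also assumes fields contain no commas or newlines, since the row is joined naively.

```python
import os

HEADER_WIDTH = 20  # hypothetical: digits reserved for the last index

def init_file(path, columns):
    """Create the file with a fixed-width comment line, then the column header."""
    with open(path, "w", newline="") as f:
        f.write("#" + str(-1).rjust(HEADER_WIDTH) + "\n")  # -1 means "no rows yet"
        f.write(",".join(columns) + "\n")

def read_last_index(path):
    """Read only the reserved bytes at the start of the file."""
    with open(path, "rb") as f:
        return int(f.read(1 + HEADER_WIDTH)[1:].decode())

def append_with_header(path, fields):
    """Append a row with the next index and update the reserved bytes in place."""
    new = read_last_index(path) + 1
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        f.write((",".join([str(new), *fields]) + "\n").encode())
        f.seek(1)  # just past the '#'
        f.write(str(new).rjust(HEADER_WIDTH).encode())
    return new
```

Because the replacement is always exactly `HEADER_WIDTH` bytes, the rest of the file never shifts. When loading the file, something like `pd.read_csv(path, skiprows=1)` should skip the reserved line; `comment="#"` would also work, as long as no data field contains a `#`.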