
How to add a "column" back to txt file after creating the column with pandas?


I have a .LAS file (in practice it behaves like a plain text file) that I converted to a DataFrame. I then added a new column with some important information to the DataFrame. Is there a way to write the LAS file back out in the same layout as the original, but with the new column included?

Here is what my LAS file looked like:

Text 1
Text 2
Text 3
Text 4
~A Stats1 Stats2 Stats3
     1       2     3
     6       6     7
     8       9     3

So I managed to convert the file to a DataFrame the way I wanted (without the header and '~A'):

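import pandas as pd

with open(r'C:...filename.las') as f:
    for l in f:
        if l.startswith('~A'):
            stats = l.split()[1:]  # column names that follow '~A'
            break
    # f is now positioned just past the '~A' line; the data columns are whitespace-separated
    data = pd.read_csv(f, names=stats, sep=r'\s+', engine='python')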

data

Stats1 Stats2 Stats3 Numbers
  1       2      3      1
  6       6      7      2
  8       9      3      3

Now suppose I add a new column, data['Numbers'], holding the values 1, 2, 3 (already shown in the output above). If I could write this back to my LAS file, it should look like:

```
Text 1
Text 2
Text 3
Text 4
~A Stats1 Stats2 Stats3 Numbers
     1       2     3       1
     6       6     7       2
     8       9     3       3
```

Does anyone know how I can do that?

If I just use np.savetxt('filename_edited.las', data, fmt="%s"), I get a new LAS file with the data I want, but without the header from the original file.

Thanks!


Solution

  • You need to save the header when reading the file, so you can write it back. Otherwise it's lost.

    To write the dataframe, you can use pandas.DataFrame.to_csv after you have written back the header text.

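    import pandas as pd

    with open('data.txt') as f:
        file_header = []
        for l in f:
            if l.startswith('~A'):
                stats = l.split()[1:]  # column names that follow '~A'
                break
            else:
                file_header.append(l)  # keep every line above '~A' so it can be written back

        # the remaining lines are the whitespace-separated data rows
        data = pd.read_csv(f, names=stats, sep=r'\s+', engine='python')

    # manipulate the dataframe to add a column or whatever
    data['numbers'] = [1, 2, 3]

    with open('data2.txt', 'w') as wf:
        data_str = data.to_csv(None)  # passing None returns the CSV text as a string
        for l in file_header:
            wf.write(l)
        wf.write('~A')
        wf.write(data_str)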
    

    And data2.txt will look like:

    Text 1
    Text 2
    Text 3
    Text 4
    ~A,Stats1,Stats2,Stats3,numbers
    0,1,2,3,1
    1,6,6,7,2
    2,8,9,3,3
    

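    to_csv uses a comma separator by default, but you can pass the sep argument to specify a different one; it must be a string of length 1. The leading 0, 1, 2 column is the DataFrame index, which to_csv also writes by default; pass index=False to omit it.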
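
To get closer to the layout asked for in the question (space-separated values and no index column), something like the following might work. This is only a sketch: `write_las_like` is a hypothetical helper name, the header lines and values are the toy data from the question, and it writes to an in-memory buffer instead of a real file so it can be run as-is. It does not reproduce the original column alignment, only the order and content of the lines.

```python
import io

import pandas as pd


def write_las_like(header_lines, data, out):
    """Write the saved header, a '~A' column line, then space-separated rows.

    Hypothetical helper: `out` is any writable text file object.
    """
    # Header lines were stored with their trailing newlines, so write them unchanged.
    out.writelines(header_lines)
    # Rebuild the '~A' line from the current column names (including new columns).
    out.write('~A ' + ' '.join(str(c) for c in data.columns) + '\n')
    # sep=' ' replaces the default comma, index=False drops the 0, 1, 2 row labels,
    # header=False because the column names are already on the '~A' line.
    out.write(data.to_csv(None, sep=' ', index=False, header=False))


# Toy data standing in for what was read from the original file (assumed values).
header_lines = ['Text 1\n', 'Text 2\n', 'Text 3\n', 'Text 4\n']
data = pd.DataFrame({'Stats1': [1, 6, 8], 'Stats2': [2, 6, 9], 'Stats3': [3, 7, 3]})
data['Numbers'] = [1, 2, 3]

buf = io.StringIO()
write_las_like(header_lines, data, buf)
result = buf.getvalue()
print(result)
```

To write a real file, pass a handle from `open('data2.txt', 'w')` in place of `buf`. If the aligned columns of the original matter, `data.to_string(index=False, header=False)` could be used for the rows instead of `to_csv`, since it pads values to a common width.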