Tags: python, csv, pandas, numpy, data-analysis

Reading data in Python pandas by defining the width of each column as a number of characters


I'm trying to read a file in which columns are separated by a variable number of spaces. Is there a way to read the file by defining the width of each column as the number of characters reserved for that column?

For example:

A B          C  D
- ---------- -- ---
1 foo        32 9.5
4 bar           5.4
5 foofoo_bar 44 

Let's say we have to read the above data. Notice that some entries are missing in columns C and D. However, the second line in the file (the one with the dashes) indicates the maximum number of characters each column can hold.

So the question is: given the maximum width of each column in the dataset, is there a way to read the dataset in Python using pandas or any other package?


Solution

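  • Use pandas.read_fwf(). It reads a table of fixed-width formatted lines into a DataFrame, with the column boundaries given by the colspecs (start/end positions) or widths parameter.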
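
As a minimal sketch of how this could look for the sample in the question: the code below assumes the dashed second line marks the exact extent of each column, and derives the (start, end) positions from it with a regular expression. The in-memory `lines` list and the helper name `read_dashed_table` are illustrative assumptions, not part of the original question; with a real file you would read the text from disk instead.

```python
import io
import re

import pandas as pd

# Sample data from the question, held in memory for illustration.
lines = [
    "A B          C  D",
    "- ---------- -- ---",
    "1 foo        32 9.5",
    "4 bar           5.4",
    "5 foofoo_bar 44",
]


def read_dashed_table(text):
    """Parse a fixed-width table whose second line is a row of dashes.

    Hypothetical helper: each run of dashes is assumed to span exactly
    the characters reserved for one column.
    """
    dash_line = text.splitlines()[1]
    # Each match's span is a half-open (start, end) interval, which is
    # the form read_fwf expects for colspecs.
    colspecs = [m.span() for m in re.finditer(r"-+", dash_line)]
    # Line 0 becomes the header; line 1 (the dashes) is skipped.
    return pd.read_fwf(io.StringIO(text), colspecs=colspecs, skiprows=[1])


df = read_dashed_table("\n".join(lines))
print(df)  # empty cells in C and D are read as NaN
```

If the column positions are already known, they can be passed directly, for example colspecs=[(0, 1), (2, 12), (13, 15), (16, 19)] for this sample. Note that the widths parameter describes contiguous fields, so with it the separating spaces would have to be counted as part of a column.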