pandas read textfile as dataframe

I have a textfile with hundreds of rows where each row looks like the following:

"LastName, FirstName MiddleName", 222555,X-150,2023,0.15,0.20,0.5,"1, 2, 10",--,1.5,5.10,report

The separators are normally commas except when within quotes. The above row each needs to be separated into the following columns:

LastName, FirstName MiddleName
222555
X-150
2023
0.15
0.20
0.5
1,2,10
--
1.5
5.10
report

Solution

Try something like this if you want to use Pandas. Just using read_csv should still work fine on your data, then you can split the columns that are in quotes into their own columns. Last, remove the columns with the quotes:

#### Making mock csv/text file
from io import StringIO
csv_file = StringIO('\n'.join(['"LastName, FirstName MiddleName", 222555,X-150,2023,0.15,0.20,0.5,"1, 2, 10",--,1.5,5.10,report']*3))
#########################

# Load in the file as a CSV
df = pd.read_csv(csv_file, header = None)

# Split the names into separate name columns
df[['Last', 'First', 'Middle']] = [x for x in df[0].str.replace(',', '').str.split(' ')]

# Split the numbers in quotes into 3 separate columns
df[['num1', 'num2', 'num3']] = [x for x in df[7].str.split(', ')]

# Remove the columns that you split
df = df.drop(df.columns[[0, 7]], axis = 1)

Output:

         1     2       3       4      5   6   8   9  10     11  Last            First       Middle  num1 num2 num3
0   222555  X-150   2023    0.15    0.2 0.5 --  1.5 5.1 report  LastName    FirstName   MiddleName  1   2   10
1   222555  X-150   2023    0.15    0.2 0.5 --  1.5 5.1 report  LastName    FirstName   MiddleName  1   2   10
2   222555  X-150   2023    0.15    0.2 0.5 --  1.5 5.1 report  LastName    FirstName   MiddleName  1   2   10