I am trying to read a text file into a nested list in Python. That is, I would like the output to be:
[['$5.79', 'Breyers Ice Cream', 'Homemade Vanilla', '48 oz'], ['$6.39', 'Haagen-dazs', 'Vanilla Bean Ice Cream', '1 pt'], ...]
The ultimate goal is to read the information into a pandas DataFrame for some exploratory analysis. The text file looks like this:
$5.79
Breyers Ice Cream
Homemade Vanilla
48 oz

$6.39
Haagen-dazs
Vanilla Bean Ice Cream
1 pt

$6.89
So Delicious
Dairy Free Coconutmilk No Sugar Added Dipped Vanilla Bars
4 x 2.3 oz

$5.79
Popsicle Fruit Pops Mango
12 ct
Here is what I have tried so far:
with open('sample.txt') as f:
    creams = f.read()
creams = creams.split("\n\n")
However, this returns:
['$5.79\nBreyers Ice Cream\nHomemade Vanilla\n48 oz', '$6.39\nHaagen-dazs\nVanilla Bean Ice Cream\n1 pt',
I have also tried list comprehensions, which look cleaner than the code above, but these attempts split on every newline rather than on the blank lines between entries. For example:
[x for x in open('<file_name>.txt').read().splitlines()]
#Gives
['$5.79', 'Breyers Ice Cream', 'Homemade Vanilla', '48 oz', '', '$6.39', 'Haagen-dazs', 'Vanilla Bean Ice Cream', '1 pt', '', '
I know I would need to nest a list within the list comprehension, but I'm unsure how to perform the split.
Note: This is my first posted question, so sorry for the length. I'm asking because there are similar questions, but none of them produce the outcome I want.
You are nearly there once you have the four-line groups separated. All that's left is to split the groups again by a single newline.
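with open('creams.txt', 'r') as f:
    creams = f.read()
# Split the text into groups on the blank line between entries
creams = creams.split("\n\n")
# Split each group into its individual lines
creams = [lines.split('\n') for lines in creams]
print(creams)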
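One caveat: if the file ends with a newline, or has more than one blank line between entries, the plain split("\n\n") can leave empty strings in the result. A slightly more defensive sketch is below. The helper name parse_records and the inline sample string are made up for illustration, and it assumes entries are separated by at least one blank line.

```python
import re

def parse_records(text):
    """Split text into records on blank lines, then each record into lines."""
    # strip() drops leading/trailing whitespace, so a final newline in the
    # file does not produce an empty trailing field.
    blocks = re.split(r"\n\s*\n", text.strip())
    # Split each block on single newlines and trim stray whitespace.
    return [[line.strip() for line in block.splitlines()] for block in blocks]

# Hypothetical sample standing in for the contents of the text file.
sample = (
    "$5.79\nBreyers Ice Cream\nHomemade Vanilla\n48 oz\n"
    "\n"
    "$6.39\nHaagen-dazs\nVanilla Bean Ice Cream\n1 pt\n"
)
records = parse_records(sample)
print(records)
```

With a real file you would pass in the result of f.read() instead of the sample string. Note that entries with fewer lines (like the three-line Popsicle entry in the question) come back as shorter inner lists, so they may need padding before the data goes into a DataFrame with fixed columns.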