I have a text file which contains text like this:
--------------------------------
I hate apples and love oranges.
He likes to ride bike.
--------------------------------
--------------------------------
He is a man of honour.
She loves to travel.
--------------------------------
I want to load this text file into a pandas DataFrame so that each row contains only the content between the separators. For example:
Row 1 should be like: I hate apples and love oranges. He likes to ride bike.
Row 2 should be like: He is a man of honour. She loves to travel.
Looks like you need to pre-process the text.
Try:
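import pandas as pd

filename = "data.txt"  # placeholder: path to your text file

res = []   # one joined string per block
temp = []  # lines of the block currently being read
with open(filename) as infile:
    for line in infile:
        val = line.strip()
        if val:  # skip blank lines
            if not val.startswith("-"):
                temp.append(val)
            else:
                # separator line: close the current block, if any
                if temp:
                    res.append(" ".join(temp))
                    temp = []

# keep a final block that is not followed by a separator
if temp:
    res.append(" ".join(temp))

df = pd.DataFrame(res, columns=["Test"])
print(df)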
Output:
                                                Test
0  I hate apples and love oranges. He likes to ri...
1        He is a man of honour. She loves to travel.
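
If some content lines might themselves begin with a dash, a stricter variant is to treat only lines made up entirely of dashes as separators. The following is a rough sketch of that idea, not part of the approach above: the helper name `split_blocks`, the `SEPARATOR` pattern and the inline `sample` text are made up for illustration, and it assumes the whole file fits in memory.

```python
import re

# Assumption: a separator is a line consisting only of dashes
# (optionally followed by spaces or tabs).
SEPARATOR = re.compile(r"^-+[ \t]*$", flags=re.MULTILINE)


def split_blocks(text):
    """Return one space-joined string per non-empty block between separators."""
    rows = []
    for chunk in SEPARATOR.split(text):
        # Strip each line and drop blank ones before joining.
        lines = [ln.strip() for ln in chunk.splitlines() if ln.strip()]
        if lines:
            rows.append(" ".join(lines))
    return rows


# Hypothetical sample mirroring the file in the question.
sample = """--------------------------------
I hate apples and love oranges.
He likes to ride bike.
--------------------------------
--------------------------------
He is a man of honour.
She loves to travel.
--------------------------------
"""

rows = split_blocks(sample)
for row in rows:
    print(row)
```

With a real file, you could read it with `open(filename).read()`, pass the text to `split_blocks`, and then build the frame with `pd.DataFrame(rows, columns=["Test"])` as before.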