I am trying to import this data to Pandas
. It imports correctly in Libreoffice with a header and 2 lines. However, in Pandas
, it does not seem to import. There are some newline characters inside that should be ignored. But in Pandas, they are being treated as a new line instead of single field with \n
inside it. Has anyone faced such issue. I've tried setting quotechar
and sep
parameters in Pandas
but cannot import it.
Thanks in advance
This should be imported as 2 rows in Pandas. However, it is being split into multiple rows.
Your problem is that you have "
as escape char (\"
) in your text that panda should ignore.
e.g.
7/20/16: Decreased STS, active flexion to 130, abduction to 100, weak right handed grip. Lack 6-8 \" IR on right, only to waist.
The \" after Lack 6-8 shouldn't be interpreted as quote character.
You have to tell pandas that.
That should work:
import pandas as pd
df = pd.read_csv("resources/data_to_post.csv", quotechar='"', escapechar='\\')
print(df)
id ... PlanGenerated
0 1 ... A course of physical therapy was ordered. Mobi...
1 2 ... The patient is instructed to return if pain or...
[2 rows x 17 columns]
only two rows then 6 as before.