Can anyone explain how comment='#' works when reading a CSV file in pandas with
pd.read_csv(..., comment='#', ...)? Sample code is below.
import pandas as pd

# file_messy and file_clean are assumed to be file paths defined elsewhere
# Read the raw file as-is: df1
df1 = pd.read_csv(file_messy)
# Print the output of df1.head()
print(df1.head(5))
# Read in the file with the correct parameters: df2
df2 = pd.read_csv(file_messy, delimiter=' ', header=3, comment='#')
# Print the output of df2.head()
print(df2.head())
# Save the cleaned up DataFrame to a CSV file without the index
df2.to_csv(file_clean, index=False)
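Here is an example of how the comment argument works. Everything from the comment character to the end of the line is ignored, so a line that starts with that character is skipped entirely: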
from io import StringIO
import pandas as pd

csv_string = """col1;col2;col3
1;4.4;99
#2;4.5;200
3;4.7;65"""
# Without comment argument
print(pd.read_csv(StringIO(csv_string), sep=";"))
#   col1  col2  col3
# 0    1   4.4    99
# 1   #2   4.5   200
# 2    3   4.7    65
# With comment argument
print(pd.read_csv(StringIO(csv_string), sep=";", comment="#"))
#    col1  col2  col3
# 0     1   4.4    99
# 1     3   4.7    65
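Two related details may help with the code in the question. As far as I know, the comment character also cuts off the rest of a line when it appears mid-line, and fully commented lines are not counted by the header argument, so header=3 in the question would count only lines that are not fully commented or blank. Here is a minimal sketch of both behaviours; the sample data, including the leading export note, is made up for illustration:

```python
from io import StringIO

import pandas as pd

# Made-up sample: a commented first line, a trailing comment, and a commented-out row
csv_string = """#exported by a hypothetical tool
col1;col2;col3
1;4.4;99#checked by hand
#2;4.5;200
3;4.7;65"""

df = pd.read_csv(StringIO(csv_string), sep=";", comment="#")

# The commented first line is ignored, so the header is still the col1;col2;col3 line
print(list(df.columns))

# The trailing comment is cut off and the commented-out row is dropped
print(df)
```

Note that comment has to be a single character, so a multi-character marker such as "//" cannot be passed this way.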