I have this txt file:
Tu 11:44:00 119 52913161 DETECTOR STATE 0001
Tu 11:44:00 119 52913161 DETECTOR STATE 1100
Tu 11:44:02 119 52913161 DETECTOR STATE 0000
Tu 11:44:02 119 52913161 DETECTOR STATE 1110
Tu 11:44:04 119 52913161 DETECTOR STATE 0000
Tu 11:44:04 119 52913161 DETECTOR STATE 0011
Which I open in Python (Jupyter Notebook) using this code:
import pandas as pd
data= pd.read_csv('EXPORT20171205114501_1.txt', sep=" ", header=None)
data.columns = ["day", "time", "street", "sensor", "type", "state", "bits"]
I get this output:
day time street sensor type state bits
0 Tu 11:44:00 119 52913161 DETECTOR STATE 1
1 Tu 11:44:00 119 52913161 DETECTOR STATE 1100
2 Tu 11:44:02 119 52913161 DETECTOR STATE 0
3 Tu 11:44:02 119 52913161 DETECTOR STATE 1110
4 Tu 11:44:04 119 52913161 DETECTOR STATE 0
5 Tu 11:44:04 119 52913161 DETECTOR STATE 11
How can keep the leading zeros?
I've tried the following options:
data = pd.read_csv('EXPORT20171205114501_1.txt', sep=" ", header=None, dtype={'column': object})
data = pd.read_csv('EXPORT20171205114501_1.txt', sep=" ", header=None).astype(str)
data = pd.read_csv('EXPORT20171205114501_1.txt', sep=" ", header=None, converters={'ColName': str})
None of the above options work. I've updated pandas to the latest version. Any idea?
Use parameter names for columns names by list and then is possible define column for parse to string
by dtype
parameter with dictionary
:
names = ["day", "time", "street", "sensor", "type", "state", "bits"]
df = pd.read_csv('EXPORT20171205114501_1.txt', sep="\s+", names=names, dtype={'bits':str})
print (df)
day time street sensor type state bits
0 Tu 11:44:00 119 52913161 DETECTOR STATE 0001
1 Tu 11:44:00 119 52913161 DETECTOR STATE 1100
2 Tu 11:44:02 119 52913161 DETECTOR STATE 0000
3 Tu 11:44:02 119 52913161 DETECTOR STATE 1110
4 Tu 11:44:04 119 52913161 DETECTOR STATE 0000
5 Tu 11:44:04 119 52913161 DETECTOR STATE 0011
And if need all columns as strings:
names = ["day", "time", "street", "sensor", "type", "state", "bits"]
df = pd.read_csv('EXPORT20171205114501_1.txt', sep="\s+", names=names, dtype=str)
print (df)
day time street sensor type state bits
0 Tu 11:44:00 119 52913161 DETECTOR STATE 0001
1 Tu 11:44:00 119 52913161 DETECTOR STATE 1100
2 Tu 11:44:02 119 52913161 DETECTOR STATE 0000
3 Tu 11:44:02 119 52913161 DETECTOR STATE 1110
4 Tu 11:44:04 119 52913161 DETECTOR STATE 0000
5 Tu 11:44:04 119 52913161 DETECTOR STATE 0011