I am trying to load a tab-delimited file into pandas with Python so I can run different queries on it. Unfortunately, the file formatting is not great (it loads fine in Excel, but I need to work with it programmatically).
When I do:
>>> print(df.columns)
Index([u'Domain Name ',
u'Current Bid ', u'Join By Date (ET)', u'Join By Date (PT)',
u'Bidders ', u'Seller ', u'TLD ', u'Length',
u'Words ',
u'Word Count',
u'Categories ',
u'Hyphens ', u'Numbers ', u'Auction Type'],
dtype='object')
How can I fix the file so that a simple df.query('TLD == "value"') would work?
I get a new file every 5 days, so I need to do it programmatically.
Note: please be patient, I am new to scripting and Python.
This should help clean up the column names by stripping the leading and trailing whitespace from each one:
df.columns = [x.strip() for x in df.columns]
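Since a new file arrives every few days, the load, cleanup, and query steps can be wrapped together. The following is only a minimal sketch: the helper name load_auction_file and the sample data are hypothetical stand-ins for the real file, and it assumes the only problem is stray whitespace around the header names.

```python
import io

import pandas as pd


def load_auction_file(path_or_buffer):
    """Read a tab-delimited file and strip whitespace from the column names.

    Hypothetical helper name; pass the path of each new file you receive.
    """
    df = pd.read_csv(path_or_buffer, sep="\t")
    # Same cleanup as the list comprehension above, using the string accessor
    df.columns = df.columns.str.strip()
    return df


# Made-up sample data imitating the padded headers shown in the question
raw = "Domain Name \tTLD \tLength\nexample.com\tcom\t7\nsample.net\tnet\t6\n"

df = load_auction_file(io.StringIO(raw))

# With the trailing space gone, the plain column name works in query()
result = df.query('TLD == "com"')
print(result)

# Names that still contain an inner space need backticks inside query()
by_name = df.query('`Domain Name` == "sample.net"')
print(by_name)
```

Stripping fixes names like 'TLD ', but names such as 'Domain Name' still contain a space, which is why the second query uses backticks (supported in recent pandas versions). Regular indexing such as df[df['Domain Name'] == 'sample.net'] is an alternative that works regardless.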