Pandas SetIndex with DatetimeIndex


I have a CSV file with the following contents:

Symbol, Date, Unix_Tick, OpenPrice, HighPrice, LowPrice, ClosePrice, volume,
AAPL, 2021-01-04 09:00:00, 1609750800, 133.31, 133.49, 133.02, 133.49, 25000
AAPL, 2021-01-04 09:01:00, 1609750860, 133.49, 133.49, 133.49, 133.49, 700
AAPL, 2021-01-04 09:02:00, 1609750920, 133.6, 133.6, 133.5, 133.5, 500

I tried to set a DatetimeIndex from the Date column like this:

import pandas as pd
import numpy as np

df = pd.read_csv(csvFile)
df = df.set_index(pd.DatetimeIndex(df["Date"]))

but I get KeyError: 'Date'.


Solution

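  • The file isn't strictly comma-separated: every comma is followed by a space. With the default separator, pandas keeps that space as part of each field, so the column is actually named " Date" (with a leading space) and df["Date"] raises a KeyError.

    You can either strip the whitespace from the column names: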

    df = pd.read_csv(csvFile)
    df.columns = df.columns.str.strip()
    df = df.set_index(pd.DatetimeIndex(df["Date"]))

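    or read the CSV file with ", " as the separator. pandas treats a multi-character separator as a regular expression, which requires the Python parsing engine:

    df = pd.read_csv(csvFile, sep=", ", engine="python")
    df = df.set_index(pd.DatetimeIndex(df["Date"]))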
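
A third option, not part of the answer above, is read_csv's skipinitialspace parameter, which skips the spaces that follow each delimiter. The sketch below is only an illustration: it inlines the question's sample rows through io.StringIO so that it runs on its own (an assumption; real code would pass csvFile instead), and it assumes the trailing comma in the header is really present in the file. It first shows the leading space in the parsed column names, then parses the dates and sets the index in the same read_csv call.

```python
import io

import pandas as pd

# Sample rows copied from the question, including the space after each
# comma and the trailing comma in the header (assumed to be in the real file).
csv_text = """Symbol, Date, Unix_Tick, OpenPrice, HighPrice, LowPrice, ClosePrice, volume,
AAPL, 2021-01-04 09:00:00, 1609750800, 133.31, 133.49, 133.02, 133.49, 25000
AAPL, 2021-01-04 09:01:00, 1609750860, 133.49, 133.49, 133.49, 133.49, 700
AAPL, 2021-01-04 09:02:00, 1609750920, 133.6, 133.6, 133.5, 133.5, 500
"""

# Default parsing keeps the space after each comma as part of the name.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())  # the date column shows up as ' Date', not 'Date'

# skipinitialspace=True drops the spaces that follow each delimiter,
# so "Date" can be parsed and used as the index directly.
df = pd.read_csv(
    io.StringIO(csv_text),
    skipinitialspace=True,
    parse_dates=["Date"],
    index_col="Date",
)

# The header's trailing comma produces an empty extra column; drop any
# column that contains no data at all.
df = df.dropna(axis=1, how="all")

print(type(df.index).__name__)  # DatetimeIndex
```

Compared with sep=", ", this keeps the default C parser and also tolerates rows where the space after a comma is missing.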