Tags: python, string, pandas, header

How to take the header automatically from a CSV file


My code reads a user-supplied file (CSV or TXT) and counts the number of words in it, but I have to hard-code the column header, which in my case is sentences. Is there a way to pick up the column name automatically from the file, so that the name is not fixed? Another file might use data as its header instead of sentences.

import pandas as pd
wdata=pd.read_csv(fileinput)
# total number of whitespace-separated words in the 'sentences' column
nowt=wdata['sentences'].str.split().map(len).sum()

csv examples (three images, not reproduced here)


Solution

  • If there is only one column, first specify a new column name with the names parameter, then skip the original CSV header with the skiprows parameter of read_csv:

    wdata=pd.read_csv(fileinput, names=['col name'], skiprows=1)
    

    If the file has no CSV header:

    wdata=pd.read_csv(fileinput, names=['col name'])
    

    After some discussion, it turned out that a column name never contains a space, while a data row always contains at least one. That makes it possible to tell the two apart, so a general solution is:

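    # read only the first line, which pandas treats as the header
    header = pd.read_csv(fileinput, nrows=0).columns[0]
    # a real column name has no spaces, so skip the first line only in that case
    skip = int(header.count(' ') == 0)
    df = pd.read_csv(fileinput, names=['col'], skiprows=skip)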
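
To tie the general solution back to the word count in the question, here is a minimal sketch. The function name count_words and the sample files are hypothetical, made up for illustration. It assumes a single-column file given as a path, a header without spaces, data rows with at least one space, and no commas inside the sentences (pandas would otherwise split them into extra columns).

```python
import os
import tempfile

import pandas as pd


def count_words(path):
    """Count whitespace-separated words in a single-column CSV/TXT file.

    Hypothetical helper: it relies on the heuristic from the answer,
    i.e. a header never contains a space and a data row always does.
    """
    # Read only the first line; pandas exposes it as the column label.
    first = str(pd.read_csv(path, nrows=0).columns[0])
    # Skip the first line only when it looks like a header (no spaces).
    skip = int(first.count(" ") == 0)
    # Use a fixed internal name so the rest of the code never depends
    # on whatever the file calls its column.
    df = pd.read_csv(path, names=["col"], skiprows=skip)
    return int(df["col"].astype(str).str.split().map(len).sum())


if __name__ == "__main__":
    # Made-up sample files: one with a header line, one without.
    with tempfile.TemporaryDirectory() as tmp:
        with_header = os.path.join(tmp, "with_header.csv")
        without_header = os.path.join(tmp, "without_header.txt")
        with open(with_header, "w") as f:
            f.write("sentences\nhello world\nthis is a test\n")
        with open(without_header, "w") as f:
            f.write("hello world\nthis is a test\n")
        print(count_words(with_header), count_words(without_header))  # prints: 6 6
```

If every input file is known to have a header, the heuristic is not needed: after a plain pd.read_csv(fileinput), the name can be read from wdata.columns[0] and used in place of the fixed 'sentences'.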