Search code examples
pythonpandasexport-to-csvexport-to-excel

Python pandas empty df but columns has elements


I have really irritating thing in my script and don't have idea what's wrong. When I try to filter my dataframe and then add rows to newone which I want to export to excel this happen.

File exports as empty DF, also print shows me that "report" is empty but when I try to print report.Name, report.Value etc. I got normal and proper output with elements. Also I can only export one column to excel not entire DF which looks like empty.... What can cause that strange accident?

So this is my script:

df = pd.read_excel('testfile2.xlsx')
report = pd.DataFrame(columns=['Type','Name','Value'])

for index, row in df.iterrows():
    if type(row[0]) == str:
        type_name = row[0].split(" ")
        if type_name[0] == 'const':
            selected_index = index
            report['Type'].loc[index] = type_name[1]
            report['Name'].loc[index] = type_name[2]
            report['Value'].loc[index] = row[1]

        else:
            for elements in type_name:
                report['Value'].loc[selected_index] += " " + elements

    elif type(row[0]) == float:
        df = df.drop(index=index)

print(report) #output - Empty DataFrame
print(report.Name) output - over 500 elements

Solution

  • You are trying to manipulate a series that does not exist which leads to the described behaviour.

    Doing what you did just with a way more simple example i get the same result:

    report = pd.DataFrame(columns=['Type','Name','Value'])
    report['Type'].loc[0] = "A"
    report['Name'].loc[0] = "B"
    report['Value'].loc[0] = "C"
    
    print(report) #empty df
    print(report.Name) # prints "B" in a series
    

    Easy solution: Just add the whole row instead of the three single values:

    report = pd.DataFrame(columns=['Type','Name','Value'])
    report.loc[0] = ["A", "B", "C"]
    

    or in your code:

    report.loc[index] = [type_name[1], type_name[2], row[1]]
    

    If you want to do it the same way you are doing it at the moment you first need to add an empty series with the given index to your DataFrame before you can manipulate it:

    report.loc[index] = pd.Series([])
    report['Type'].loc[index] = type_name[1]
    report['Name'].loc[index] = type_name[2]
    report['Value'].loc[index] = row[1]