Tags: python, dataframe, attributeerror, pandas.excelwriter

Data Frame to Excel - AttributeError: 'Series' object has no attribute 'columns'


I am trying to write a data frame to an Excel file. This has worked for me in the past, but this time, it's giving me an AttributeError.

The Code

I have a data frame called data that looks like this: [image: data frame called data]

I put it into this code:

# To find tf-idf values
textVal = data.text.values.astype('str')
vectorizer = TfidfVectorizer()
vectorizer.fit(textVal)
X = vectorizer.transform(textVal).toarray()
names = vectorizer.get_feature_names()
tfidf_dataframe = pd.DataFrame(X, columns = names)

# To print TF-IDF
writer = pd.ExcelWriter('tfidf_test.xlsx', engine='xlsxwriter')
tfidf_dataframe.to_excel(writer)
writer.save()
print("complete")

The tfidf_dataframe looks like this:

[image: tfidf_dataframe]

The Error Log

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-22-9ebe0a5d13a5> in <module>
     13 
     14 # To print TF-IDF
---> 15 tfidf_dataframe.to_excel(writer)
     16 
     17 # To print sentiment analysis

c:\users\matay\appdata\local\programs\python\python38\lib\site-packages\pandas\core\generic.py in to_excel(self, excel_writer, sheet_name, na_rep, float_format, columns, header, index, index_label, startrow, startcol, engine, merge_cells, encoding, inf_rep, verbose, freeze_panes)
   2162         from pandas.io.formats.excel import ExcelFormatter
   2163 
-> 2164         formatter = ExcelFormatter(
   2165             df,
   2166             na_rep=na_rep,

c:\users\matay\appdata\local\programs\python\python38\lib\site-packages\pandas\io\formats\excel.py in __init__(self, df, na_rep, float_format, cols, header, index, index_label, merge_cells, inf_rep, style_converter)
    403             self.df = df.reindex(columns=cols)
    404 
--> 405         self.columns = self.df.columns
    406         self.float_format = float_format
    407         self.index = index

c:\users\matay\appdata\local\programs\python\python38\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5268             or name in self._accessors
   5269         ):
-> 5270             return object.__getattribute__(self, name)
   5271         else:
   5272             if self._info_axis._can_hold_identifiers_and_holds_name(name):

AttributeError: 'Series' object has no attribute 'columns'

Any ideas on why I get this error message?


Solution

  • Are you sure this traceback comes from the code you posted? It seems to imply otherwise. In the traceback, the to_excel call is followed by a sentiment-analysis comment:

         14 # To print TF-IDF
    ---> 15 tfidf_dataframe.to_excel(writer)
         16 
         17 # To print sentiment analysis

    In the code you posted, it is followed by writer.save():

    # To print TF-IDF
    writer = pd.ExcelWriter('tfidf_test.xlsx', engine='xlsxwriter')
    tfidf_dataframe.to_excel(writer)
    writer.save()
    print("complete")

    I believe xlsxwriter is already the default engine for .xlsx files when it is installed, so you could also skip the writer and just do tfidf_dataframe.to_excel('tfidf_test.xlsx')
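
Since the error says a Series, not a DataFrame, reached the Excel formatter, it may help to check the object's type right before writing. Below is a minimal sketch of such a check. The helper names as_dataframe and write_excel are hypothetical, and the small frame is stand-in data, not the real tf-idf output.

```python
import pandas as pd


def as_dataframe(obj):
    """Return obj as a DataFrame, converting a Series; reject anything else."""
    if isinstance(obj, pd.DataFrame):
        return obj
    if isinstance(obj, pd.Series):
        # A Series has no .columns; to_frame() wraps it in a one-column DataFrame.
        return obj.to_frame()
    raise TypeError(f"expected DataFrame or Series, got {type(obj).__name__}")


def write_excel(obj, path):
    """Hypothetical helper: check the type, then let pandas choose the engine."""
    frame = as_dataframe(obj)
    # Passing a path instead of an ExcelWriter needs an installed engine
    # (xlsxwriter or openpyxl) but no explicit writer.save() call.
    frame.to_excel(path)


# Stand-in data (assumption); the real tfidf_dataframe comes from the question's code.
tfidf_dataframe = pd.DataFrame([[0.0, 0.7], [0.5, 0.0]], columns=["alpha", "beta"])

# Selecting one column yields a Series, which is the type named in the error.
print(type(tfidf_dataframe).__name__)
print(type(tfidf_dataframe["alpha"]).__name__)
print(list(as_dataframe(tfidf_dataframe["alpha"]).columns))
```

If the type printed in your notebook is not DataFrame, the name was probably rebound somewhere between building the frame and writing it.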

    My hunch is that this code lives in a module you're actively developing, and you didn't re-import it into your Jupyter environment after changing it.

    If that's the case, try:

    import importlib
    importlib.reload(module)
    

    Here module is the already-imported module that contains your code (the module object itself, not its name as a string).
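
To see why a stale import can make a traceback disagree with the file on disk, here is a small self-contained sketch using only the standard library. The module name tfidf_helpers_demo and its VERSION variable are made up for the demonstration. The script writes a throwaway module, imports it, edits the file, and shows that the running session keeps the old code until importlib.reload is called.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
MODULE_NAME = "tfidf_helpers_demo"

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / f"{MODULE_NAME}.py"
    sys.path.insert(0, tmp)
    try:
        # First version of the module on disk.
        module_path.write_text("VERSION = 'old'\n")
        importlib.invalidate_caches()
        module = importlib.import_module(MODULE_NAME)

        # Edit the file, as you would while developing the module.
        module_path.write_text("VERSION = 'new code'\n")

        # The session still holds the code from the first import.
        stale = module.VERSION

        # reload() re-executes the module's source in the existing module object.
        importlib.reload(module)
        reloaded = module.VERSION
    finally:
        # Clean up so the demo leaves no trace in the interpreter.
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

print("before reload:", stale)
print("after reload:", reloaded)
```

Names imported with from module import name are not updated by a reload, so those imports need to be run again afterwards.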