python pandas nlp tokenize

'int' object has no attribute 'lower' when calling tokenizer.fit_on_texts(d['column_name'])


tokenizer = Tokenizer(num_words=1000, split=' ')
tokenizer.fit_on_texts(d['column'].values)

x = tokenizer.texts_to_sequences(d['column'].values)

In the POS_words column, every row is a sentence listing skills (C#, Office365, ...), and some entries contain numbers such as +91.

I want to convert it into an array, but it throws this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-a59a11ef92f5> in <module>()
      1 tokenizer=Tokenizer(num_words=1000, split=' ')
----> 2 tokenizer.fit_on_texts(d['POS_words'].values)
      3 
      4 x=tokenizer.texts_to_sequences(d['POS_words'].values)
      5 #xtest=tokenizer.texts_to_sequences(test['POS_words'].values)

1 frames
/usr/local/lib/python3.7/dist-packages/keras_preprocessing/text.py in text_to_word_sequence(text, filters, lower, split)
     41     """
     42     if lower:
---> 43         text = text.lower()
     44 
     45     if sys.version_info < (3,):

AttributeError: 'int' object has no attribute 'lower'

Please tell me how to fix this.


Solution

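  • The issue is solved by converting the column to strings before fitting the tokenizer. As the traceback shows, the tokenizer calls text.lower() on each entry, so any entry stored as an int raises the AttributeError: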

    d['column'] = d['column'].astype(str)
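
To see why the cast helps, here is a minimal sketch that reproduces the failure without Keras or pandas. It makes some assumptions: the list `values` is made-up data standing in for d['POS_words'].values, and the helpers `find_non_strings` and `lower_all` are hypothetical names written for this illustration, not part of any library. `lower_all` only mimics the `text.lower()` step from the traceback.

```python
# Made-up stand-in for d['POS_words'].values: text mixed with a bare integer.
values = ["C# Office365", 91, "Python pandas"]


def find_non_strings(values):
    """Return (index, value) pairs for entries that are not str."""
    return [(i, v) for i, v in enumerate(values) if not isinstance(v, str)]


def lower_all(values):
    """Mimic the failing step in the traceback: call .lower() on each entry."""
    return [v.lower() for v in values]


# Locate the entries that would break the tokenizer.
offenders = find_non_strings(values)

# Reproduce the error on the uncleaned data.
try:
    lower_all(values)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Same idea as d['column'].astype(str), applied to a plain list:
# every entry becomes a str, so .lower() is always available.
cleaned = [str(v) for v in values]
lowered = lower_all(cleaned)

print(offenders)
print(error_message)
print(lowered)
```

With a real DataFrame, the same check could be run as find_non_strings(d['POS_words'].values) to see which rows hold non-string values before casting the column.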