My end goal is to implement ULMFiT using fastai to predict disaster tweets (as part of this Kaggle competition). I'm trying to read the tweets from a DataFrame, but for reasons unknown to me I'm stuck at the data loading stage. The method below does not work:
from fastai.text.all import *
train= pd.read_csv('../input/nlp-getting-started/train.csv')
dls_lm = (TextList.from_df(path,train,cols='text',is_lm=True)
.split_by_rand_pct(0.1)
#.label_for_lm()
.databunch(bs=64))
This throws NameError: name 'TextList' is not defined.
I can work around the problem with the code below:
dls_lm = DataBlock(
blocks=TextBlock.from_df('text', is_lm=True),
get_x=ColReader('text'),
splitter=RandomSplitter(0.1)
# hold out only 10% of the data for validation, in order to train on more of it
)
dls_lm = dls_lm.dataloaders(train, bs=64, seq_len=72)
Why does this work and not the previous method?
Notebook Link for reference.
Which version of fastai are you running?
import fastai
print(fastai.__version__)
The TextList class is from fastai v1, but your import path (fastai.text.all) belongs to fastai v2. In v2, TextList was replaced by TextBlock (https://docs.fast.ai/text.data.html#TextBlock). That's why the DataBlock version works, and it is the right way to handle this in v2.
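If you want to make the version check explicit before choosing an API, here is a minimal sketch. The helper names installed_fastai_version and suggest_text_api are made up for illustration and are not part of fastai; the only assumption is that the major version number tells the two APIs apart (1.x ships TextList, 2.x ships TextBlock).

```python
from importlib import metadata


def installed_fastai_version():
    """Return the installed fastai version string, or None if it is missing."""
    try:
        return metadata.version("fastai")
    except metadata.PackageNotFoundError:
        return None


def suggest_text_api(version):
    """Map a fastai version string to the text data API that release ships.

    Hypothetical helper: it only looks at the major version number.
    """
    if version is None:
        return "fastai is not installed"
    major = int(version.split(".")[0])
    if major >= 2:
        return "v2: DataBlock + TextBlock (from fastai.text.all import *)"
    return "v1: TextList (from fastai.text import *)"


print(suggest_text_api(installed_fastai_version()))
```

On a Kaggle kernel with a recent fastai this should report the v2 API, which matches the NameError you saw for TextList.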