I am trying to build a DataFrame with two columns: one holds each sentence, and the other holds the list of words I get after tokenizing that sentence.
So far, I have written the following code:
from nltk.tokenize import word_tokenize
example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']
def hi():
    for i in example:
        #print (word_tokenize(i),i)
        a = [i, word_tokenize(i)]
        print(a)
The expected output is a DataFrame with two columns: the original sentence and the tokens of that sentence.
Example:
Original Sentence | Tokens
My name is max | my, name, is, max
This is windows | This, is, windows
df['Original Sentence'] = a[0]
df['Tokens'] = a[1]
Or we can skip your function entirely:
df['Original Sentence'] = example
df['Tokens'] = [word_tokenize(i) for i in example]
EDIT:
Since it appears you do not have a dataframe to begin with, you can create one directly:
import pandas as pd
df = pd.DataFrame.from_dict({'Original Sentence': example,
                             'Tokens': [word_tokenize(i) for i in example]})
print(df)  # to see your dataframe
df.to_csv('mydata.csv')  # to output your dataframe to a CSV file
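One thing to watch: a column of Python lists is written to CSV as the list's string form (for example "['Mary', 'had', ...]"), whereas the expected output in the question shows comma-separated tokens. Here is a minimal sketch of joining the tokens into a string first. The helper name build_frame is hypothetical, and str.split is only a stand-in so the snippet runs without NLTK data; you would pass word_tokenize instead.

```python
import io

import pandas as pd

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit']


def build_frame(sentences, tokenize=str.split):
    """Hypothetical helper: one row per sentence, tokens joined with ', '.

    `tokenize` defaults to whitespace splitting (an assumption made so the
    sketch is self-contained); pass nltk's word_tokenize for real tokenization.
    """
    return pd.DataFrame({
        'Original Sentence': sentences,
        'Tokens': [', '.join(tokenize(s)) for s in sentences],
    })


df = build_frame(example)

# Write to an in-memory buffer instead of a file, just to inspect the CSV text.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

With NLTK installed and its tokenizer data downloaded, the call would be build_frame(example, tokenize=word_tokenize). Passing index=False simply leaves the row numbers out of the file.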
Other format:
df.to_sql(etc...) #Refer to comment below
To write the dataframe directly to a SQL database, setup specific to your database is required. See the pandas documentation for an example: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html
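As a rough illustration of what that setup can look like, here is a sketch that uses an in-memory SQLite database from Python's standard library. The table name 'sentences' is an assumption, and whitespace splitting again stands in for word_tokenize. The tokens are joined into a string first, because a list column is generally not something a SQL driver can store as-is.

```python
import sqlite3

import pandas as pd

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit']

# Join tokens into one string per row; s.split() is a stand-in for
# word_tokenize(s) so the sketch runs without NLTK.
df = pd.DataFrame({
    'Original Sentence': example,
    'Tokens': [', '.join(s.split()) for s in example],
})

# In-memory SQLite database; a real setup would connect to your own database.
conn = sqlite3.connect(':memory:')

# 'sentences' is a hypothetical table name; replace it if it already exists.
df.to_sql('sentences', conn, index=False, if_exists='replace')

# Read the table back to confirm what was stored.
stored = pd.read_sql_query('SELECT * FROM sentences', conn)
conn.close()
print(stored)
```

For other databases, the pandas documentation linked above describes passing an SQLAlchemy engine or connection in place of the sqlite3 connection.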