python pandas jupyter-notebook nlp data-preprocessing

Convert a single string inside a list into multiple strings inside that same list in pandas


I have a dataframe with a column of tags, where each list holds all of its tags as one single string:

tags
------------
['tag1' 'tag2']
['tag1']
['tag5' 'tag3' 'tag6']

How can I get the dataframe into the following format, with each tag as a separate string?

    tags
    ------------
    ['tag1','tag2']
    ['tag1']
    ['tag5','tag3','tag6']

Inspecting a single row shows that the list contains just one string:

print(df['tags'][0][0])
>>> "'tag1' 'tag2'"
print(df['tags'][0][0][0])
>>> "'"

Method 1: I tried splitting on spaces, but the quotes stay inside each tag:

def Convert(string): 
    li = list(string.split(" ")) 
    return li 
str1 = Convert(df['tags'][0][0])
print(str1)
>>> ["'tag1'", "'tag2'"]

Method 2: This also didn't work for me; the adjacent string literals get concatenated into one.

import ast
ast.literal_eval(df['tags'][0][0])
>>>'tag1tag2' 

Question 2: I want to count the total occurrences of each tag, but df['tags'].value_counts() doesn't work. It either counts each whole list as a single value or, if I modify the list, counts individual characters.


Solution

  • Is this what you are looking for?

    list1 = ["'tag1' 'tag2'"]
    result = [w.strip("'") for w in list1[0].split(' ')]
    print(f'converted\n{list1}\nto\n{result}')
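The same idea can probably be applied to the whole column, and it also covers the counting part (Question 2). This is only a sketch: the sample dataframe below is made up to mirror the question, and `split_tags` is a hypothetical helper name, not something from pandas. It assumes every cell is a list holding exactly one string of quoted, space-separated tags.

```python
import pandas as pd

# Hypothetical sample data mirroring the question: each cell is a list
# that holds one string such as "'tag1' 'tag2'".
df = pd.DataFrame({
    "tags": [
        ["'tag1' 'tag2'"],
        ["'tag1'"],
        ["'tag5' 'tag3' 'tag6'"],
    ]
})


def split_tags(cell):
    """Split the single quoted string in `cell` into a list of bare tags."""
    # split() with no argument also tolerates repeated whitespace;
    # strip("'") removes the quotes left around each tag.
    return [w.strip("'") for w in cell[0].split()]


# Replace each one-string list with a proper list of tags.
df["tags"] = df["tags"].apply(split_tags)

# explode() puts every tag on its own row, so value_counts() counts
# individual tags instead of whole lists or single characters.
tag_counts = df["tags"].explode().value_counts()

print(df["tags"].tolist())
print(tag_counts)
```

With the sample data above, `tag1` should come out with a count of 2 and every other tag with a count of 1. Note that `Series.explode` needs pandas 0.25 or newer.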