python, database, dataframe, data-analysis, data-cleaning

How can I split a column of combined categories into a list of unique categories using a for loop and a plain function?


I have a column in which each row holds several categories together, such as (Action|Adventure|Science Fiction|Thriller), (Action Adventure Science Fiction Thriller), and (Action|Crime|Thriller), as in the picture below. I want to create a function that builds a list of all the unique values in the column, so that I can later count how often each value occurs. I want something like List = [Action, Thriller, Adventure, ...].


Solution

  • I assume you are reading this column into Python as strings. An easy way to split each string into the list you want is the string method split(). Here is some example code.

    def get_genres(row):
        """
        :param row: a string containing a set of genres separated by '|'
        :return: a list of strings containing the genres of the row
        """
        return row.split('|')
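
To get from single rows to the unique list and the counts the question asks for, one option is to loop over the rows and tally each genre. The following is only a minimal sketch: `count_genres` and `sample_rows` are hypothetical names introduced here, the sample data is taken from the examples in the question, and it assumes every row is a string separated by '|' (rows separated only by spaces would need different handling, since "Science Fiction" itself contains a space). `get_genres` is repeated so the block runs on its own.

```python
from collections import Counter


def get_genres(row):
    """Split one '|'-separated string into a list of genres."""
    return row.split('|')


def count_genres(rows):
    """Hypothetical helper: tally how often each genre appears across all rows."""
    counts = Counter()
    for row in rows:
        # Add one to the tally for every genre found in this row.
        counts.update(get_genres(row))
    return counts


# Sample rows copied from the question; assumed to be '|'-separated strings.
sample_rows = [
    "Action|Adventure|Science Fiction|Thriller",
    "Action|Crime|Thriller",
]

genre_counts = count_genres(sample_rows)

# The Counter's keys are the unique genres, in order of first appearance.
unique_genres = list(genre_counts)

print(unique_genres)
print(genre_counts["Action"])
```

If the data lives in a pandas DataFrame, the same function could probably be called on the column itself, for example `count_genres(df["genres"].dropna())`, where the column name `genres` is an assumption because the question does not give it.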