python pandas nlp series lowercase

Trying to convert a column of text lists to lowercase, but it turns everything into NaN


I am currently trying to work with text data and I am relatively new at this. The column I'm trying to work with is the cast column, as shown below:

0    [Sam Worthington, Zoe Saldana, Sigourney Weave...
1    [Johnny Depp, Orlando Bloom, Keira Knightley, ...
2    [Daniel Craig, Christoph Waltz, Léa Seydoux, R...
3    [Christian Bale, Michael Caine, Gary Oldman, A...
4    [Taylor Kitsch, Lynn Collins, Samantha Morton,...
Name: cast, dtype: object 

What I want is to convert all the uppercase letters to lowercase. However, when I try to do it, every value is converted to NaN.

Here's the simple thing I've done:

data.cast=data.cast.str.lower()

Here's the output:

0      NaN
1      NaN
2      NaN
3      NaN
4      NaN
5      NaN
6      NaN
7      NaN
8      NaN
9      NaN
10     NaN
11     NaN
12     NaN
13     NaN
14     NaN
15     NaN
16     NaN
17     NaN
18     NaN
19     NaN
20     NaN
21     NaN
22     NaN
23     NaN
24     NaN
25     NaN
26     NaN
27     NaN
28     NaN
29     NaN
        ..

Can anyone help me understand what I'm doing wrong and how I could potentially fix it? Thank you for your time!!!


Solution

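  • You are applying a string method (.str.lower()) to a column whose values are lists, not strings. The .str accessor returns NaN for every element that is not a string, which is why the whole column becomes NaN. Instead, you need a simple function that lowercases each item in a list, such as: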

    def lower(l):
        # Lowercase each string in the list l
        return [x.lower() for x in l]
    

    Then use map to apply it to every row of the column:

    import pandas as pd

    # Two rows, each holding a list of strings
    data = pd.DataFrame([{'col':['Titi','Toto','Tutu']},{'col':['Tata','Toto','Tutu']}])
    data['col'] = data['col'].map(lower)
    data
    

    The result is:

        col
    0   [titi, toto, tutu]
    1   [tata, toto, tutu]
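
To tie this back to the cast column in the question, here is a minimal, self-contained sketch. The sample rows and the helper name lower_items are made up for illustration, and the missing row is an assumption added to show one way of guarding against non-list values; the real data may not need it.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's cast column.
data = pd.DataFrame({
    "cast": [
        ["Sam Worthington", "Zoe Saldana", "Sigourney Weaver"],
        ["Johnny Depp", "Orlando Bloom"],
        None,  # assumed missing row, to exercise the guard below
    ]
})

# Reproduce the problem: .str.lower() only handles string elements,
# so every list element comes back as a missing value.
broken = data["cast"].str.lower()
print(broken.isna().all())


def lower_items(items):
    """Lowercase every string in a list; return anything else unchanged."""
    if isinstance(items, list):
        return [x.lower() for x in items]
    return items


# Apply the helper row by row instead of using the .str accessor.
data["cast"] = data["cast"].map(lower_items)
print(data["cast"].tolist())
```

The isinstance check is optional. Without it, a row that holds a missing value instead of a list would raise an error inside the list comprehension, so it may be worth keeping if the column is not guaranteed to be complete.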