Python defaultdict using lambda function causes error: TypeError: <lambda>() missing 1 required positional argument: 'x'


I'm trying to map input text to associated numeric values, using a defaultdict to handle entries that are missing from the mapping.

Running the code below works, but when I add the default using a lambda I get an error.

z
Out[216]: 
0       $1,001
1       $1,001
2      $50,001
3      $15,001
4      $50,001
  
586     $1,001
587     $1,001
588     $1,001
589     $1,001
590     $1,001
Name: 0, Length: 591, dtype: object


amt_map = {
    "$1": 500,
    "$1,001": 2500, 
    "$15,001": 32500,
    "$50,001": 75000,
    "$100,001": 175000,
    "$250,001": 350000,
    "$500,001": 750000,
    "$1,000,001": 2000000, 
    "$5,000,001": 10000000,
    "25,000,001": 25000000  
        
    }  

z.map(amt_map) 
Out[220]: 
0       2500.0
1       2500.0
2      75000.0
3      32500.0
4      75000.0

Using a defaultdict with a lambda as the default causes an error, however:

from collections import defaultdict 

d = {
    "$1": 500,
    "$1,001": 2500, 
    "$15,001": 32500,
    "$50,001": 75000,
    "$100,001": 175000,
    "$250,001": 350000,
    "$500,001": 750000,
    "$1,000,001": 2000000, 
    "$5,000,001": 10000000,
    "25,000,001": 25000000  
        
    }      

amt_map = defaultdict(lambda x: x.replace('$',''), d)

z.map(amt_map) 
Traceback (most recent call last):

  File "/tmp/ipykernel_470946/75548175.py", line 1, in <module>
    z.map(amt_map)


  File "/home/chris/anaconda3/lib/python3.9/site-packages/pandas/core/base.py", line 825, in <lambda>
    mapper = lambda x: dict_with_default[x]

TypeError: <lambda>() missing 1 required positional argument: 'x'

Searching suggests this kind of error comes from a mismatch between the number of arguments a function is called with and the number the lambda accepts, but I don't see how or why that would apply here. The search result in question:

TypeError: <lambda>() missing 1 required positional argument: 'item'
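
For context, the mismatch can be reproduced without pandas. The sketch below uses a shortened, hypothetical one-entry mapping and shows that defaultdict calls its default_factory with no arguments, so a one-argument lambda fails as soon as a missing key is looked up. A zero-argument factory works, but it never sees the key.

```python
from collections import defaultdict

# default_factory is called with NO arguments when a key is missing,
# so a one-argument lambda cannot serve as the factory.
bad = defaultdict(lambda x: x.replace('$', ''), {"$1": 500})

try:
    bad["$7"]              # missing key -> default_factory() -> TypeError
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# A zero-argument factory is accepted, but it has no access to the key.
ok = defaultdict(lambda: -1, {"$1": 500})
fallback = ok["$7"]        # the constant default, unrelated to "$7"
```

This is why a default that depends on the key itself needs something other than default_factory.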


Solution

  • Per @wim's suggestion, I created a dict subclass that defines __missing__ to handle missing keys. The code below returns what I'm looking for, though I would prefer not to subclass if there is another way to achieve it:

    d = {
        "$1": 500,
        "$1,001": 2500, 
        "$15,001": 32500,
        "$50,001": 75000,
        "$100,001": 175000,
        "$250,001": 350000,
        "$500,001": 750000,
        "$1,000,001": 2000000, 
        "$5,000,001": 10000000,
        "25,000,001": 25000000  
            
        }    
    
    import pandas as pd

    class CleanAmt(dict):
        # dict calls __missing__(key) for absent keys, passing the key itself
        def __missing__(self, key):
            return pd.to_numeric(key.replace('$',''))

    z.map(CleanAmt(d))
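
One possible way to avoid the subclass is a plain lookup function that receives the key directly. This is only a sketch: the name clean_amt, the shortened mapping, and the sample values are hypothetical. The fallback also strips thousands separators, which is an assumption that goes beyond the original replace('$','').

```python
# Shortened, hypothetical subset of the mapping from the question.
d = {"$1": 500, "$1,001": 2500, "$15,001": 32500}

def clean_amt(key):
    """Return the mapped value, or parse the key itself when it is not in d."""
    if key in d:
        return d[key]
    # Fallback (an assumption): strip '$' and thousands separators, then parse.
    return int(key.replace('$', '').replace(',', ''))

values = ["$1,001", "$15,001", "$7,500"]
result = [clean_amt(v) for v in values]
# With a pandas Series, the same function can be passed directly: z.map(clean_amt)
```

Because Series.map accepts a function as well as a dict, the function sees every value, so no default_factory or __missing__ hook is needed.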