How to make custom data class subscriptable?


Consider this data class derived from the pydantic package:

from typing import List
from pydantic import BaseModel 


class Bucket(BaseModel):
    setting: List[str]
    fight_1: List[int]
    cause_1: List[str]

Let my_bucket be an instance of Bucket:

my_bucket = Bucket(setting=['some_value'], fight_1=[0], cause_1=['other_value'])

Basically, I would like to be able to do

my_bucket['setting']

and get back ['some_value'], but instead I get:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-cacbdc1698e9> in <module>
----> 1 my_bucket['setting']

TypeError: 'Bucket' object is not subscriptable

Solution

  • As the accepted answer mentioned, this can be done straightforwardly by implementing __getitem__ so that it returns getattr(self, item). (Calling double-underscore methods such as __getattribute__ directly should generally be avoided unless there is no other way to do it, and in this case there is: the built-in getattr.)

    However! It's worth noting that implementing __getitem__ does not make your object a dict, and that inexperienced programmers who come across objects that are subscriptable but don't otherwise act like dicts can make mistakes.

    In particular, let's take a pydantic object that you have added subscripting to:

    from typing import List
    from pydantic import BaseModel
    
    
    In [14]: class Bucket(BaseModel):
        setting: List[str]
        fight_1: List[int]
        cause_1: List[str]
    
        def __getitem__(self, item):
            return getattr(self, item)
    
    In [15]: bucket = Bucket(setting=['a'], fight_1=[1], cause_1=['c'])
    
    

    This object can, obviously, be subscripted:

    In [17]: bucket['setting']
    Out[17]: ['a']
    

    It can also, thanks to the __iter__ that Pydantic builds in, be iterated over:

    In [18]: for item in bucket:
                 print(item)
     
    ('setting', ['a'])
    ('fight_1', [1])
    ('cause_1', ['c'])
    

    However, you can't take its length:

    In [19]: len(bucket)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    Cell In[19], line 1
    ----> 1 len(bucket)
    
    TypeError: object of type 'Bucket' has no len()
    
    

    or set item values:

    In [20]: bucket['setting'] = ['b']
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    Cell In[20], line 1
    ----> 1 bucket['setting'] = ['b']
    
    TypeError: 'Bucket' object does not support item assignment
    
    

    You can fix those things by implementing __len__ and __setitem__. A simple way to do that might be:

    In [23]: class Bucket(BaseModel):
        setting: List[str]
        fight_1: List[int]
        cause_1: List[str]
    
        def __getitem__(self, item):
            return getattr(self, item)
    
        def __setitem__(self, item, value):
            setattr(self, item, value)
    
        def __len__(self):
            # a cheap and simple way, since pydantic objects are dictable
            # other methods differ between pydantic 1 and 2
            return len(dict(self))
    
    In [24]: bucket = Bucket(setting=['a'], fight_1=[1], cause_1=['c'])
    
    In [25]: len(bucket)
    Out[25]: 3
    
    In [26]: bucket['setting']
    Out[26]: ['a']
    
    In [27]: bucket['setting'] = ['b']
    
    In [28]: bucket['setting']
    Out[28]: ['b']
    

    And then there are other methods it should have: keys, and values, and items, and so on. You can implement these, though it's quite a pain in the butt, and everyone who uses this base class risks shadowing them if their pydantic object has a member named 'keys', 'values', etc. That can result in some pretty weird behavior.
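
    To give a feel for what that involves, here is a minimal sketch rather than a definitive implementation. DictLikeModel is a hypothetical name, not something pydantic provides. It relies only on the iteration behavior shown earlier (iterating a model yields (name, value) pairs). Note that keys() deliberately avoids dict(self): once an object has a keys method, dict() calls keys() and __getitem__ instead of iterating over pairs, so calling dict(self) inside keys() would recurse forever.

```python
from typing import List

from pydantic import BaseModel


class DictLikeModel(BaseModel):
    # Hypothetical helper base class for illustration only.

    def __getitem__(self, item):
        return getattr(self, item)

    def __len__(self):
        # Iterating a model yields (field_name, value) pairs.
        return sum(1 for _ in self)

    def keys(self):
        # Do not use dict(self) here: dict() would call keys() again.
        return [name for name, _ in self]

    def values(self):
        return [value for _, value in self]

    def items(self):
        return list(self)


class Bucket(DictLikeModel):
    setting: List[str]
    fight_1: List[int]
    cause_1: List[str]


bucket = Bucket(setting=['a'], fight_1=[1], cause_1=['c'])
print(bucket.keys())    # ['setting', 'fight_1', 'cause_1']
print(len(bucket))      # 3
print(dict(bucket))     # now built through keys() and __getitem__
```

    Any model that inherits from this base and declares a field called keys, values or items collides with these methods, which is the shadowing risk described above.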

    Also, there are still a couple of other problems here. First, you still can't use del, and you really don't want to implement that. Deleting items from a pydantic object is just asking for trouble. You could just have it get set to None, but then it would still exist in the dict-alike after you deleted it and that's also weird behavior.

    Second, this still has unwanted behaviors (and this is present in the accepted answer even without any additions):

    In [33]: bucket['__init__']
    Out[33]: <bound method BaseModel.__init__ of Bucket(setting=['b'], fight_1=[1], cause_1=['c'])>
    

    You won't find that by iterating, and you won't see it when you're looking at your keys (assuming you implemented that correctly), but it's there, and that's probably not what you want.
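
    One hedged way to keep attributes like __init__ out of reach is to look the key up in dict(self) instead of calling getattr. This is a sketch that assumes no keys method has been added to the model, so that dict(self) is built from the (name, value) pairs the model yields when iterated. Raising KeyError for unknown names is my choice here, made to mirror what a dict does.

```python
from typing import List

from pydantic import BaseModel


class Bucket(BaseModel):
    setting: List[str]
    fight_1: List[int]
    cause_1: List[str]

    def __getitem__(self, item):
        # Only names that show up when iterating the model are allowed,
        # so methods and dunder attributes are not reachable by subscript.
        fields = dict(self)
        if item not in fields:
            raise KeyError(item)
        return fields[item]


bucket = Bucket(setting=['a'], fight_1=[1], cause_1=['c'])
print(bucket['setting'])  # ['a']

try:
    bucket['__init__']
except KeyError as exc:
    print('rejected:', exc)
```

    Building a dict on every lookup is wasteful, but it keeps the sketch independent of the pydantic version.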

    You can get around all this (either by using the dict() trick above, or by looking at __fields__ or model_fields for pydantic v1 and v2 respectively) but I'm not sure you want to. It sounds a lot like work. Maybe, instead, if what you want is a dict, you should just convert the thing into a dict.
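
    For comparison, converting to a dict looks roughly like this. The sketch uses the built-in dict(), which gives a shallow conversion of the top-level fields and behaves the same in both major pydantic versions. If nested models need converting too, bucket.dict() in pydantic v1 or bucket.model_dump() in v2 are the usual choices.

```python
from typing import List

from pydantic import BaseModel


class Bucket(BaseModel):
    setting: List[str]
    fight_1: List[int]
    cause_1: List[str]


bucket = Bucket(setting=['a'], fight_1=[1], cause_1=['c'])

# Shallow conversion: each top-level field becomes a key of a real dict.
as_dict = dict(bucket)

print(as_dict['setting'])  # ['a']
print(len(as_dict))        # 3

# Everything a dict supports now works, including assignment and del.
as_dict['setting'] = ['b']
del as_dict['cause_1']
print(sorted(as_dict))     # ['fight_1', 'setting']

# Rebinding a key in the dict leaves the model itself unchanged.
print(bucket.setting)      # ['a']
```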