How to parse a JSON string into a dict with pydantic


I am using Boto3 to retrieve configuration information from our AWS instance, and I am trying to use pydantic to keep my data clean as I parse it. I haven't used dataclasses or pydantic before. AWS returns the information as JSON, but one nested dictionary comes back as a string:

{
'accountId': '12354',  
'arn': 'arn:aws:lambda:ap-southeast-2:1234-NotificationForwarder',  
'availabilityZone': 'Not Applicable',  
'awsRegion': 'ap-southeast-2',  
'configuration': '{"functionName":"aws-controltower-NotificationForwarder","functionArn":"arn:aws:lambda:ap-southeast-2:1234:function:aws-controltower-NotificationForwarder","runtime":"python3.6","role":"arn:aws:iam::1234:role/aws-controltower-ForwardSnsNotificationRole","handler":"index.lambda_handler","codeSize":473,"description":"SNS '  
'message forwarding function for aggregating account '  
'notifications.","timeout":60,"memorySize":128,"lastModified":"2020-06-23T16:11:27.781+0000","codeSha256":"blah\\u003d","version":"$LATEST","tracingConfig":{"mode":"PassThrough"},"revisionId":"blah","layers":[],"state":"Active","lastUpdateStatus":"Successful","fileSystemConfigs":[],"packageType":"Zip","architectures":["x86_64"],"ephemeralStorage":{"size":512},"snapStart":{"applyOn":"None","optimizationStatus":"Off"}}',  
'configurationItemCaptureTime': datetime.datetime(2023, 1, 18, 3, 10, 30, 444000, tzinfo=tzlocal()),  
'configurationItemMD5Hash': '',  
'configurationItemStatus': 'OK',  
'configurationStateId': '1234',  
'relatedEvents': [],  
'relationships': [{'relationshipName': 'Is associated with ',  
'resourceName': 'aws-controltower-ForwardSnsNotificationRole',  
'resourceType': 'AWS::IAM::Role'}],  
'resourceId': 'aws-controltower-NotificationForwarder',  
'resourceName': 'aws-controltower-NotificationForwarder',  
'resourceType': 'AWS::Lambda::Function',  
'version': '1.3'
}

The configuration field holds a dictionary, but it is enclosed in quotes, which makes it a string. What is the best way to convert it to a dict as part of my data model?

I assume I would use a class to do that but I'm unclear as to how to do that. I tried to do a json

I have the classes below:

class LambdaConfig(BaseModel):
    functionName: str
    functionArn: str
    runtime: str
    role: str
    handler: str
    codeSize: int
    description: str
    memorySize: str
class Lambda(BaseModel):
    arn: str
    availabilityZone: str
    awsRegion: str
    configuration: LambdaConfig
    configurationItemCaptureTime: datetime.datetime

    relationships: List
    resourceId: str
    resourceName: str
    resourceType: str
    supplementaryConfiguration: Dict
    version: str
    tags: dict

I can't figure out how I can convert the str into dict.


Solution

  • You can use a validator with pre=True to parse the string before the field itself is validated:

    import datetime
    import json
    from typing import Dict, List
    
    from pydantic import BaseModel, validator
    
    class LambdaConfig(BaseModel):
        functionName: str
        functionArn: str
        runtime: str
        role: str
        handler: str
        codeSize: int
        description: str
        timeout: int
        memorySize: int
        lastModified: str
        codeSha256: str
        version: str
        tracingConfig: dict
        revisionId: str
        layers: list
        state: str
        lastUpdateStatus: str
        fileSystemConfigs: list
        packageType: str
        architectures: list
        ephemeralStorage: dict
        snapStart: dict
    
    class Lambda(BaseModel):
        arn: str
        availabilityZone: str
        awsRegion: str
        configuration: LambdaConfig
        configurationItemCaptureTime: datetime.datetime
        relationships: List[dict]
        resourceId: str
        resourceName: str
        resourceType: str
        supplementaryConfiguration: Dict
        version: str
        tags: dict
        
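        # pre=True runs this before pydantic validates the field against LambdaConfig.
        @validator('configuration', pre=True)
        def parse_configuration(cls, value):
            # AWS delivers the nested configuration as a JSON string; decode it first.
            if isinstance(value, str):
                return json.loads(value)
            # Anything that is already a dict (or a LambdaConfig) passes through unchanged.
            return value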
    

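    The validator checks whether the incoming 'configuration' value is a string and, if so, parses it into a dict with json.loads. Because of pre=True, it runs before pydantic validates the field, so the resulting dict is then validated against LambdaConfig. In pydantic v2, validator is deprecated in favor of field_validator('configuration', mode='before').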

    lambda_obj = Lambda(**data)
    print(lambda_obj.configuration)
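
As an alternative to a hand-written validator, pydantic also ships a Json type that decodes a JSON string and then validates the result against the type you give it. The sketch below is a minimal illustration, not the answer's original method: both models are trimmed to a few fields, and the `data` record is a hypothetical, cut-down version of the response in the question.

```python
import json

from pydantic import BaseModel, Json


class LambdaConfig(BaseModel):
    # Trimmed to a handful of fields for illustration; extra keys in the
    # JSON are ignored by pydantic's default configuration.
    functionName: str
    runtime: str
    codeSize: int
    memorySize: int


class Lambda(BaseModel):
    resourceId: str
    resourceType: str
    # Json[LambdaConfig] first decodes the incoming JSON string, then
    # validates the decoded value against LambdaConfig.
    configuration: Json[LambdaConfig]


# Hypothetical record: a shortened stand-in for the AWS response above.
data = {
    "resourceId": "aws-controltower-NotificationForwarder",
    "resourceType": "AWS::Lambda::Function",
    "configuration": json.dumps(
        {
            "functionName": "aws-controltower-NotificationForwarder",
            "runtime": "python3.6",
            "handler": "index.lambda_handler",
            "codeSize": 473,
            "memorySize": 128,
        }
    ),
}

lambda_obj = Lambda(**data)
print(lambda_obj.configuration.functionName)
print(lambda_obj.configuration.memorySize)
```

One design difference to keep in mind: a Json field expects a string (or bytes) as input, so passing an already-decoded dict should fail validation, whereas the validator above accepts both forms. If your data can arrive either way, the validator is likely the safer choice.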