How Can I Effortlessly Format an AWS CLI Command?


I'm doing a lot of work with AWS EMR. When you build an EMR cluster through the AWS Management Console, you can click a button to export the AWS CLI command that creates the cluster.

It then gives you a big CLI command that isn't formatted in any way, i.e., if you copy and paste the command, it's all on a single line.

I'm using these EMR CLI commands, which were created by other people, to create EMR clusters in Python with the AWS SDK (Boto3), i.e., I read the CLI command to get all the configuration details. Some of the configuration details are shown in the AWS Management Console UI, but not all of them, so it's easier for me to work from the exported CLI command.

However, the AWS CLI command is very hard to read since it's not formatted. Is there an AWS CLI command formatter available online similar to JSON formatters?

Another option would be to clone the EMR cluster and step through the cluster creation screens in the AWS Management Console to get all the configuration details, but I'm still curious whether I could format the CLI command instead. An added benefit of formatting the exported CLI command is that I could put it on a Confluence page for documentation.


Solution

  • Here is some quick Python code to do it:

    import shlex
    import json
    import re
    
    def format_command(command):
        tokens = shlex.split(command)
        formatted = ''
        for token in tokens:
            # Flags get a new line
            if token.startswith("--"):
                formatted += '\\\n    '
            # JSON data
            if token[0] in ('[', '{'):
                json_data = json.loads(token)
                data = json.dumps(json_data, indent=4).replace('\n', '\n    ')
                formatted += "'{}' ".format(data)
            # Quote token when it contains whitespace
            # (re.search scans the whole token; re.match only checks the start)
            elif re.search(r'\s', token):
                formatted += "'{}' ".format(token)
            # Simple print for remaining tokens
            else:
                formatted += token + ' '
        return formatted
    
    
    example = """aws emr create-cluster --applications Name=spark Name=ganglia Name=hadoop --tags 'Project=MyProj' --ec2-attributes '{"KeyName":"emr-key","AdditionalSlaveSecurityGroups":["sg-3822994c","sg-ccc76987"],"InstanceProfile":"EMR_EC2_DefaultRole","ServiceAccessSecurityGroup":"sg-60832c2b","SubnetId":"subnet-3c76ee33","EmrManagedSlaveSecurityGroup":"sg-dd832c96","EmrManagedMasterSecurityGroup":"sg-b4923dff","AdditionalMasterSecurityGroups":["sg-3822994c","sg-ccc76987"]}' --service-role EMR_DefaultRole --release-label emr-5.14.0 --name 'Test Cluster' --instance-groups '[{"InstanceCount":1,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":32,"VolumeType":"gp2"},"VolumesPerInstance":1}]},"InstanceGroupType":"MASTER","InstanceType":"m4.xlarge","Name":"Master"},{"InstanceCount":1,"EbsConfiguration":{"EbsBlockDeviceConfigs":[{"VolumeSpecification":{"SizeInGB":32,"VolumeType":"gp2"},"VolumesPerInstance":1}]},"InstanceGroupType":"CORE","InstanceType":"m4.xlarge","Name":"CORE"}]' --configurations '[{"Classification":"spark-defaults","Properties":{"spark.sql.avro.compression.codec":"snappy","spark.eventLog.enabled":"true","spark.dynamicAllocation.enabled":"false"},"Configurations":[]},{"Classification":"spark-env","Properties":{},"Configurations":[{"Classification":"export","Properties":{"SPARK_DAEMON_MEMORY":"4g"},"Configurations":[]}]}]' --scale-down-behavior TERMINATE_AT_TASK_COMPLETION --region us-east-1"""
    print(format_command(example))
    

    Output looks like this (the key order inside each JSON object depends on the Python version; Python 3.7+ keeps the input order):

    aws emr create-cluster \
        --applications Name=spark Name=ganglia Name=hadoop \
        --tags Project=MyProj \
        --ec2-attributes '{
            "ServiceAccessSecurityGroup": "sg-60832c2b",
            "InstanceProfile": "EMR_EC2_DefaultRole",
            "EmrManagedMasterSecurityGroup": "sg-b4923dff",
            "KeyName": "emr-key",
            "SubnetId": "subnet-3c76ee33",
            "AdditionalMasterSecurityGroups": [
                "sg-3822994c",
                "sg-ccc76987"
            ],
            "AdditionalSlaveSecurityGroups": [
                "sg-3822994c",
                "sg-ccc76987"
            ],
            "EmrManagedSlaveSecurityGroup": "sg-dd832c96"
        }' \
        --service-role EMR_DefaultRole \
        --release-label emr-5.14.0 \
        --name 'Test Cluster' \
        --instance-groups '[
            {
                "EbsConfiguration": {
                    "EbsBlockDeviceConfigs": [
                        {
                            "VolumeSpecification": {
                                "VolumeType": "gp2", 
                                "SizeInGB": 32
                            }, 
                            "VolumesPerInstance": 1
                        }
                    ]
                }, 
                "InstanceCount": 1, 
                "Name": "Master", 
                "InstanceType": "m4.xlarge", 
                "InstanceGroupType": "MASTER"
            }, 
            {
                "EbsConfiguration": {
                    "EbsBlockDeviceConfigs": [
                        {
                            "VolumeSpecification": {
                                "VolumeType": "gp2", 
                                "SizeInGB": 32
                            }, 
                            "VolumesPerInstance": 1
                        }
                    ]
                }, 
                "InstanceCount": 1, 
                "Name": "CORE", 
                "InstanceType": "m4.xlarge", 
                "InstanceGroupType": "CORE"
            }
        ]' \
        --configurations '[
            {
                "Properties": {
                    "spark.eventLog.enabled": "true", 
                    "spark.dynamicAllocation.enabled": "false", 
                    "spark.sql.avro.compression.codec": "snappy"
                }, 
                "Classification": "spark-defaults", 
                "Configurations": []
            }, 
            {
                "Properties": {}, 
                "Classification": "spark-env", 
                "Configurations": [
                    {
                        "Properties": {
                            "SPARK_DAEMON_MEMORY": "4g"
                        }, 
                        "Classification": "export", 
                        "Configurations": []
                    }
                ]
            }
        ]' \
        --scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
        --region us-east-1
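
    Since the goal in the question is to read the configuration details out of the exported command for use with Boto3, the same tokenizing idea can also produce a Python structure instead of formatted text. The following is only a minimal sketch under a few assumptions: `parse_cli_options` is a hypothetical helper name (not part of the AWS CLI or Boto3), every token starting with `--` is treated as a flag, and every value starting with `[` or `{` is assumed to be valid JSON. It does not map flag names to Boto3 parameters; that step is still manual.

```python
import json
import shlex


def parse_cli_options(command):
    """Group the tokens of an AWS CLI command by the flag that precedes them.

    Returns (positional, options): `positional` holds the tokens before the
    first flag, and `options` maps each flag name (without the leading
    dashes) to the list of values that followed it, with JSON decoded.
    """
    positional = []
    options = {}
    current = None  # value list of the most recent flag, None before any flag
    for token in shlex.split(command):
        if token.startswith("--"):
            # Assumption: anything starting with "--" is a flag, never a value
            current = options.setdefault(token[2:], [])
        elif current is None:
            positional.append(token)
        elif token[:1] in ("[", "{"):
            # Same heuristic as the formatter above: treat the value as JSON
            current.append(json.loads(token))
        else:
            current.append(token)
    return positional, options


# Shortened, made-up command in the same shape as the exported one
command = (
    "aws emr create-cluster --applications Name=spark Name=hadoop "
    "--name 'Test Cluster' --release-label emr-5.14.0 "
    """--instance-groups '[{"InstanceCount":1,"InstanceGroupType":"MASTER","InstanceType":"m4.xlarge","Name":"Master"}]' """
    "--region us-east-1"
)
positional, options = parse_cli_options(command)
print(positional)
print(options["name"])
print(options["instance-groups"][0][0]["InstanceType"])
```

    Each flag maps to a list because flags such as `--applications` take several values; a flag with no value (a boolean switch) ends up with an empty list.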