python json unicode

Converting elements in a single list to key/value pairs using Unicode characters as keys


I have a list (see below). I want every element whose first item is a Unicode character (e.g., '①', '②', '㉖') to become the key/value pair of a 'category' JSON element, and the elements that follow it, up to the next Unicode element, to become the key/value pairs of a nested 'codes' array.

The list I have:

['①', 'Type of Care']
['SA', 'Substance use treatment']
['DT', 'Detoxification']
['HH', 'Transitional housing, halfway house, or sober home']
['SUMH', 'Treatment for co-occurring serious mental health illness/serious emotional disturbance and substance use disorders']
['②', 'Telemedicine']
['TELE', 'Telemedicine/telehealth']
['③', 'Service Settings (e.g., Outpatient, Residential, Inpatient, etc.)']
['HI', 'Hospital inpatient']
['OP', 'Outpatient']
['RES', 'Residential']
['HID', 'Hospital inpatient detoxification']
['HIT', 'Hospital inpatient treatment']
['OD', 'Outpatient detoxification']
['ODT', 'Outpatient day treatment or partial hospitalization']
['OIT', 'Intensive outpatient treatment']
['ORT', 'Regular outpatient treatment']
['RD', 'Residential detoxification']
['RL', 'Long-term residential']
['RS', 'Short-term residential']
['⑰', 'Assessment/Pre-treatment']
['CMHA', 'Comprehensive mental health assessment']
['CSAA', 'Comprehensive substance use assessment']
['ISC', 'Interim services for clients']
['OPC', 'Outreach to persons in the community']
['㉖', 'Facility Smoking Policy']
['SMON', 'Smoking not permitted']
['SMOP', 'Smoking permitted without restriction']
['SMPD', 'Smoking permitted in designated area']

The key/value pair JSON I want to create:

{
    "codekey": [
        {
            "category": {
                "key": "①",
                "value": "Type of Care"
            },
            "codes": [
                {
                    "key": "SA",
                    "value": "Substance use treatment"
                },
                {
                    "key": "DT",
                    "value": "Detoxification"
                },
                {
                    "key": "HH",
                    "value": "Transitional housing, halfway house, or sober home"
                },
                {
                    "key": "SUMH",
                    "value": "Treatment for co-occurring serious mental health illness/serious emotional disturbance and substance use disorders"
                }
            ]
        },
        {
            "category": {
                "key": "②",
                "value": "Telemedicine"
            },
            "codes": [
                {
                    "key": "TELE",
                    "value": "Telemedicine/telehealth"
                }
            ]
        },
        {
            "category": {
                "key": "③",
                "value": "Service Settings (e.g., Outpatient, Residential, Inpatient, etc.)"
            },
            "codes": [
                {
                    "key": "HI",
                    "value": "Hospital inpatient"
                },
                {
                    "key": "OP",
                    "value": "Outpatient"
                },
                {
                    "key": "RES",
                    "value": "Residential"
                },
                {
                    "key": "HID",
                    "value": "Hospital inpatient detoxification"
                },
                {
                    "key": "HIT",
                    "value": "Hospital inpatient treatment"
                },
                {
                    "key": "OD",
                    "value": "Outpatient detoxification"
                },
                {
                    "key": "ODT",
                    "value": "Outpatient day treatment or partial hospitalization"
                },
                {
                    "key": "OIT",
                    "value": "Intensive outpatient treatment"
                },
                {
                    "key": "ORT",
                    "value": "Regular outpatient treatment"
                },
                {
                    "key": "RD",
                    "value": "Residential detoxification"
                },
                {
                    "key": "RL",
                    "value": "Long-term residential"
                },
                {
                    "key": "RS",
                    "value": "Short-term residential"
                }
            ]
        },
        {
            "category": {
                "key": "⑰",
                "value": "Assessment/Pre-treatment"
            },
            "codes": [
                {
                    "key": "CMHA",
                    "value": "Comprehensive mental health assessment"
                },
                {
                    "key": "CSAA",
                    "value": "Comprehensive substance use assessment"
                },
                {
                    "key": "ISC",
                    "value": "Interim services for clients"
                },
                {
                    "key": "OPC",
                    "value": "Outreach to persons in the community"
                }
            ]
        },
        {
            "category": {
                "key": "㉖",
                "value": "Facility Smoking Policy"
            },
            "codes": [
                {
                    "key": "SMON",
                    "value": "Smoking not permitted"
                },
                {
                    "key": "SMOP",
                    "value": "Smoking permitted without restriction"
                },
                {
                    "key": "SMPD",
                    "value": "Smoking permitted in designated area"
                }
            ]
        }
    ]
}

Solution

  • Hope the following helps. This code iterates through the pairs in a variable named src_list and builds a dictionary, which is then serialized to JSON in the shape you describe.

    import json
    
    # The "..." stands for the remaining [key, value] pairs from the question
    src_list = [['①', 'Type of Care'], ['SA', 'Substance use treatment'], ... ]
    output_dicts = {"codekey": []}
    current_dict = None
    
    for pair in src_list:
        # A non-empty key made up only of non-ASCII characters (code point > 127) defines a category
        if pair[0] and all(ord(c) > 127 for c in pair[0]):
            # If the current_dict is not None, we're onto a new category, so should add the last category to the output
            if (current_dict is not None):
                output_dicts["codekey"].append(current_dict)
            
            # Define new dict for this category
            current_dict = {
                "category": {
                    "key": pair[0],
                    "value": pair[1]
                },
                "codes": []
            }
        else:
            if (current_dict is not None):
                current_dict["codes"].append({
                    "key": pair[0],
                    "value": pair[1]
                })
    
    # Append the final category, but only if at least one category was seen
    if current_dict is not None:
        output_dicts["codekey"].append(current_dict)
    output_json = json.dumps(output_dicts, indent=4, ensure_ascii=False)
    
    print(output_json)
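
If you'd rather have this as a reusable function, here is a minimal sketch of the same grouping idea. The names is_category_key and group_pairs are my own (not from any library), the sample is a shortened copy of the list in the question, and it assumes that any non-empty key consisting only of non-ASCII characters marks a category.

```python
import json


def is_category_key(key):
    """Return True when key is non-empty and contains only non-ASCII characters."""
    return bool(key) and all(ord(c) > 127 for c in key)


def group_pairs(pairs):
    """Group [key, value] pairs under the most recent category marker."""
    groups = []
    for key, value in pairs:
        if is_category_key(key):
            # A category marker opens a new group.
            groups.append({"category": {"key": key, "value": value}, "codes": []})
        elif groups:
            # Ordinary codes attach to the group opened most recently.
            groups[-1]["codes"].append({"key": key, "value": value})
        # Pairs that appear before any category marker are skipped.
    return {"codekey": groups}


# Shortened sample taken from the question's list (assumption: same pair layout).
sample = [
    ["①", "Type of Care"],
    ["SA", "Substance use treatment"],
    ["DT", "Detoxification"],
    ["②", "Telemedicine"],
    ["TELE", "Telemedicine/telehealth"],
]

result = group_pairs(sample)
print(json.dumps(result, indent=4, ensure_ascii=False))
```

Appending each group as soon as its category marker is seen means there is no trailing "append the last one" step, and an empty input simply yields an empty "codekey" list.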