Search code examples
pythonjsonexcelmappingstring-comparison

How to map table references in a JSON file to corresponding values in an Excel file using Python?


I have an Excel file containing a list of requirements in 1 column and a JSON file that stores table values. The requirements in the Excel file include references to tables in the JSON file, formatted as "table x" where x is a number. Not every requirement has a corresponding table.

For example, a requirement might state: "The mass of the CubeSat shall not extend the values of table 3." The format of the table references in the requirements is always "table x" with x being any number.

This is the JSON-file

{
    "table 3": {
        "footprint_cs_2u": "100x100 mm",
        "height_cs_2u": "227 mm",
        "feet_cs_2u": "8.5x8.5 mm",
        "rails_edges_rounded_2u": "Rx1mm",
        "footprint_cs_3u": "100x100 mm",
        "height_cs_3u": "340.5 mm",
        "feet_cs_3u": "8.5x8.5 mm",
        "rails_edges_rounded_3u": "Rx1mm"
        }
}
        

What I managed to do was to read and save the 1 Excel column via the .iloc[:,0] -command to create a dataframe and read the JSON-file.

I saw that you might be able to use RegEx and the re.search()-command to find matches but I could not figure out an implementation and what to do afterwards. In this manner:

import re

#Check if the string starts with "The" and ends with "Spain":

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)

This is my full-code if it helps.

import pandas as pd
import numpy as np
import json
import re
import openpyxl


def translate_requirements(input_excel, tables_json, output_latex):
    # read Excel-file
    df = pd.read_excel(input_excel)

    # read JSON file
    with open(tables_json, 'r') as file:
        tables = json.load(file)


    # create dataframe for 1-column Excel content.
    requirements = df.iloc[:, 0]

    # Initialize translated requirements (empty array)
    translated_requirements = []

    # Output as LaTeX file | Here: example-doc
    with open(output_latex, 'w') as latex:
        latex.write(r"""
            \documentclass{article}
            .
            .
            .
            """
        )

translate_requirements('input.xlsx', 'tables.json', 'output_latex.tex')

print('Data read')

So far, I can read both the Excel and JSON files. However, I need help with adding the table contents from the JSON-file into the corresponding Excel column of the requirement that uses that table.

How can I achieve this mapping, and subsequently include this information in the LaTeX document?

Any help or pointers would be greatly appreciated!


Solution

  • You can search table \d+ in text

    import re
    
    text = 'The mass of the CubeSat shall not extend the values of table 3.'
    results = re.findall('table \d+', text)
    
    print(results)
    

    Result:

    ['table 3']
    

    And later you can check if list is not empty and use first results[0] (or all results) to search table in json.

    if results:
        table = tables[results[0]]
        # ... work with table ...
    

    And if you have more elements then use for-loop to find all values and put on list

    requirements  = df.iloc[:, 0].to_list()
    
    data = []
    
    for text in requirements:
        results = re.findall('table \d+', text)
        if results:
            table = tables[results[0]]
            data.append( (text, table) )
    

    and later use this list to generate latex
    but in this place you have to decide how to format it (and I skip it)

    with open(output_latex, 'w') as latex:
        latex.write(r"\documentclass{article}\n")
    
        for text, table in data:
            latex.write(f"{text} ....{table}\n")