Search code examples
pythontext-manipulation

Create multiple modified versions of a script for submission to cluster


I am trying create child scripts from one parent script where a few parameters are modified (48 child scripts, so automation would be preferred). My intention for this is to run different Modelica scripts as individual Slurm jobs. I would normally do this with the Modelica scripting language, but since I need to submit each script individually I will need to create multiple scripts. The pseudocode is as follows. Note: The search strings are unique

load('model.mo') # modelica file, parent script
# change the control strategy in the script 
for i in ['control_1', 'control_2', 'control_3', 'control 4']:
    # change the amount of electricity generation
    find and replace r'moduleName = "control"' with 'moduleName = control_' + str(i)
    for j in [3, 7]:
        find and replace '.CombiTimeTable solar_data(columns = {2}' with '.CombiTimeTable solar_data(columns = {' + str(j) + '}'
    # change the battery size
        for k in [2000000000, 4000000000, 6000000000]:            
            find and replace 'Storage.Battery BESS(EMax = 2000000000, SOC_start = 0.5, pf = 0.9)' with 'Storage.Battery BESS(EMax = ' + str(k) + ', SOC_start = 0.5, pf = 0.9)'
            for l in ['4', '8']:
               find and replace '.CombiTimeTable ev_data(columns = {2}' with '.CombiTimeTable ev_data(columns = {' + str(i) + '}'
               export('child_model_#.mo')

My goal is to change the actual text of each new script, not just the variables. I am not sure if I should use Python, Bash, or something else for this task, especially since I am modifying a non-.txt file.


Solution

  • If I am able to correctly guess what you want, your pseudocode was actually pretty close to valid Python code.

    I am assuming model.mo is actually a text file, and that "non-.txt" in your question just refers to the .mo extension. Python (or Slurm, or presumably Modelica) doesn't care what you call your files; the extension is purely a human convention on sane platforms. (I am consciously wording this so as to exclude Windows.)

    with open('model.mo') as modelica:
        script = modelica.read()
    # did you really want a space instead of an underscore in 'control 4'?
    for i in ['control_1', 'control_2', 'control_3', 'control 4']:
        ivar = script.replace(
            'moduleName = "control"',
            # guessing you wanted to keep the double quotes around the name here
            # this generates "control_control_1"; maybe you meant to omit the control_ prefix?
            f'moduleName = "control_{i}"')
        for j in [3, 7]:
            jvar = ivar.replace(
                '.CombiTimeTable solar_data(columns = {2}',
                f'.CombiTimeTable solar_data(columns = {{{j}}}')
            for k in [2000000000, 4000000000, 6000000000]:
                kvar = jvar.replace(
                    'Storage.Battery BESS(EMax = 2000000000, SOC_start = 0.5, pf = 0.9)',
                    f'Storage.Battery BESS(EMax = {k}, SOC_start = 0.5, pf = 0.9)')
                for l in ['4', '8']:
                    lvar = kvar.replace(
                        '.CombiTimeTable ev_data(columns = {2}',
                        # guessing you meant to keep the braces, and you wanted l, not i?
                        f'.CombiTimeTable ev_data(columns = {{{l}}}')
                    # Guessing as to what exactly the file name should be
                    with open(f'child_model_{i}_{j}_{k}_{l}.mo', 'w') as child:
                        child.write(lvar)
    

    In particular, perhaps notice how f'strings' allow you to interpolate variables with {variable}. The notation {{ inserts a literal opening brace, and }} a single literal closing brace. There are many options for how to format interpolated variables and expressions; for example {j:02} would produce the value of j with at least two digits and padding with leading zeros if necessary.

    As such, your + str(j) + attempts would work fine as well; it's just a lot more clunky.

    I added comments where I had to guess what you really wanted.

    Nominally, using round parentheses instead of square brackets around the enumerations would be very slightly more efficient, but that's hardly a major concern here. My main concern would be whether the names of the generated files are what you wanted. If you want to abbreviate, perhaps use a dictionary, like

    abbrev_k = {
        "2B": 2_000_000_000,
        "4B": 4_000_000_000,
        "6B": 6_000_000_000
    }
    ...
            for k in abbrev_k.keys():
                kvar = jvar.replace(
                    'Storage.Battery BESS(EMax = 2000000000, SOC_start = 0.5, pf = 0.9)',
                    f'Storage.Battery BESS(EMax = {abbrev_k[k]}, SOC_start = 0.5, pf = 0.9)'
    

    to use the abbreviation 2B in the file name, but have it translated to the numeric value 2,000,000,000 in the generated file.

    Demo: https://ideone.com/cP7KRx

    You'll notice that we use script to hold the contents of the original script, and then in the first substitution, we create a copy of it in ivar with the control string modified, then continue to create new variables jvar, kvar, lvar with a modified copy throughout the rest of the script.