Search code examples
pythonalgorithmdynamic-programming

Optimization of matrix search


I have a matrix of 6 columns and 375 rows, the user sets any number. The task is as follows, find all combinations in which a number is taken from each column (a total of 6 numbers) and their sum should be equal to the number of the user. I write in python, and have tried many ways, but they are all very slow, I need an hour or less to calculate UPD. the numbers are all positive UPD in each row there is an intermediate diesel power and fuel consumption corresponding to this power, so 6 numbers from each column should add up to the number specified by the user and ensure minimum consumption. Fuel consumption is a non-linear characteristic, usually it is like a car, if you are familiar with engines. n - kilowatts, i - diesel index, Av - diesel consumption grams per kilowatt hour

n;i;Av
80;5;330.467
80;6;330.467
85;5;324.551
85;6;324.551
90;5;319.292
90;6;319.292
95;5;314.587
95;6;314.587
100;1;301.271
100;2;301.271
100;5;310.352
100;6;310.352
105;1;296.857
105;2;296.857
105;5;306.52
105;6;306.52
110;1;292.844
110;2;292.844
110;5;303.037
110;6;303.037
115;1;289.18
115;2;289.18
115;5;299.857
115;6;299.857
120;1;285.821
120;2;285.821
120;5;296.942
120;6;296.942
125;1;282.731
125;2;282.731
125;5;294.26
125;6;294.26
130;1;279.879
130;2;279.879
130;3;326.651
130;4;326.651
130;5;291.784
130;6;291.784
135;1;277.238
135;2;277.238
135;3;322.929
135;4;322.929
135;5;289.491
135;6;289.491
140;1;274.785
140;2;274.785
140;3;319.472
140;4;319.472
140;5;287.363
140;6;287.363
145;1;272.502
145;2;272.502
145;3;316.254
145;4;316.254
145;5;285.381
145;6;285.381
150;1;270.371
150;2;270.371
150;3;313.25
150;4;313.25
150;5;283.531
150;6;283.531
155;1;268.378
155;2;268.378
155;3;310.44
155;4;310.44
155;5;281.801
155;6;281.801
160;1;266.509
160;2;266.509
160;3;307.806
160;4;307.806
160;5;280.179
160;6;280.179
165;1;264.753
165;2;264.753
165;3;305.331
165;4;305.331
165;5;278.655
165;6;278.655
170;1;263.101
170;2;263.101
170;3;303.002
170;4;303.002
170;5;277.221
170;6;277.221
175;1;261.543
175;2;261.543
175;3;300.806
175;4;300.806
175;5;275.868
175;6;275.868
180;1;260.071
180;2;260.071
180;3;298.732
180;4;298.732
180;5;274.591
180;6;274.591
185;1;258.68
185;2;258.68
185;3;296.77
185;4;296.77
185;5;273.383
185;6;273.383
190;1;257.361
190;2;257.361
190;3;294.912
190;4;294.912
190;5;272.238
190;6;272.238
195;1;256.11
195;2;256.11
195;3;293.148
195;4;293.148
195;5;271.153
195;6;271.153
200;1;254.921
200;2;254.921
200;3;291.473
200;4;291.473
200;5;270.121
200;6;270.121
205;1;253.791
205;2;253.791
205;3;289.88
205;4;289.88
205;5;269.14
205;6;269.14
210;1;252.714
210;2;252.714
210;3;288.362
210;4;288.362
210;5;268.205
210;6;268.205
215;1;251.688
215;2;251.688
215;3;286.915
215;4;286.915
215;5;267.314
215;6;267.314
220;1;250.708
220;2;250.708
220;3;285.534
220;4;285.534
220;5;266.464
220;6;266.464
225;1;249.772
225;2;249.772
225;3;284.214
225;4;284.214
225;5;265.651
225;6;265.651
230;1;248.876
230;2;248.876
230;3;282.952
230;4;282.952
230;5;264.874
230;6;264.874
235;1;248.018
235;2;248.018
235;3;281.743
235;4;281.743
235;5;264.129
235;6;264.129
240;1;247.197
240;2;247.197
240;3;280.585
240;4;280.585
240;5;263.416
240;6;263.416
245;1;246.408
245;2;246.408
245;3;279.474
245;4;279.474
245;5;262.732
245;6;262.732
250;1;245.652
250;2;245.652
250;3;278.407
250;4;278.407
250;5;262.075
250;6;262.075
255;1;244.925
255;2;244.925
255;3;277.383
255;4;277.383
255;5;261.444
255;6;261.444
260;1;244.225
260;2;244.225
260;3;276.397
260;4;276.397
260;5;260.837
260;6;260.837
265;1;243.553
265;2;243.553
265;3;275.449
265;4;275.449
265;5;260.253
265;6;260.253
270;1;242.905
270;2;242.905
270;3;274.536
270;4;274.536
270;5;259.691
270;6;259.691
275;1;242.281
275;2;242.281
275;3;273.656
275;4;273.656
275;5;259.149
275;6;259.149
280;1;241.679
280;2;241.679
280;3;272.808
280;4;272.808
280;5;258.627
280;6;258.627
285;1;241.098
285;2;241.098
285;3;271.989
285;4;271.989
285;5;258.122
285;6;258.122
290;1;240.537
290;2;240.537
290;3;271.198
290;4;271.198
290;5;257.636
290;6;257.636
295;1;239.995
295;2;239.995
295;3;270.435
295;4;270.435
295;5;257.165
295;6;257.165
300;1;239.472
300;2;239.472
300;3;269.697
300;4;269.697
300;5;256.711
300;6;256.711
305;1;238.965
305;2;238.965
305;3;268.983
305;4;268.983
305;5;256.271
305;6;256.271
310;1;238.475
310;2;238.475
310;3;268.292
310;4;268.292
310;5;255.846
310;6;255.846
315;1;238
315;2;238
315;3;267.623
315;4;267.623
315;5;255.43
315;6;255.43
320;1;237.54
320;2;237.54
320;3;266.975
320;4;266.975
325;1;237.095
325;2;237.095
325;3;266.346
325;4;266.346
330;1;236.663
330;2;236.663
330;3;265.737
330;4;265.737
335;1;236.243
335;2;236.243
335;3;265.146
335;4;265.146
340;1;235.836
340;2;235.836
340;3;264.573
340;4;264.573
345;1;235.441
345;2;235.441
345;3;264.016
345;4;264.016
350;1;235.057
350;2;235.057
350;3;263.475
350;4;263.475
355;1;234.684
355;2;234.684
355;3;262.949
355;4;262.949
360;1;234.322
360;2;234.322
360;3;262.438
360;4;262.438
365;1;233.969
365;2;233.969
365;3;261.941
365;4;261.941
370;1;233.626
370;2;233.626
370;3;261.457
370;4;261.457
375;1;233.292
375;2;233.292
375;3;260.986
375;4;260.986
380;1;232.966
380;2;232.966
380;3;260.527
380;4;260.527
385;1;232.65
385;2;232.65
385;3;260.081
385;4;260.081
390;1;232.341
390;2;232.341
390;3;259.646
390;4;259.646
395;1;232.04
395;2;232.04
395;3;259.222
395;4;259.222
400;1;231.747
400;2;231.747
400;3;258.808
400;4;258.808
405;3;258.405
405;4;258.405
410;3;258.012
410;4;258.012
415;3;257.628
415;4;257.628
420;3;257.253
420;4;257.253
425;3;256.887
425;4;256.887
430;3;256.529
430;4;256.529
435;3;256.18
435;4;256.18
440;3;255.839
440;4;255.839
445;3;255.505
445;4;255.505
450;3;255.179
450;4;255.179
455;3;254.86
455;4;254.86
460;3;254.548
460;4;254.548
465;3;254.242
465;4;254.242
470;3;253.943
470;4;253.943
475;3;253.651
475;4;253.651
480;3;253.364
480;4;253.364
485;3;253.083
485;4;253.083
490;3;252.809
490;4;252.809
495;3;252.539
495;4;252.539
500;3;252.275
500;4;252.275
505;3;252.016
505;4;252.016
510;3;251.763
510;4;251.763
515;3;251.514
515;4;251.514
520;3;251.27
520;4;251.27

UPD. They helped me in the comments, I tried this code, it works, but it needs improvement for me, because I can't find the functions for my expenses

import pulp
from numpy.polynomial import Polynomial
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit


list_df = []
df = pd.read_csv('ruchki.csv', sep=';')

def quadratic_func(x, a, b, c):
    return a * x**2 + b * x + c

for i in range(1, 7):
    condition = df['i'] == i
    df1 = df[condition]
    x = df1['n'].values
    y = df1['Av'].values
    params_quadratic, covariance_quadratic = curve_fit(quadratic_func, x, y)
    a, b, c = params_quadratic
    list_df.append([a, b, c])

engine_W = pd.DataFrame(
    data={
        'power_min': [100, 100, 130, 130, 80, 80],
        'power_max': [400, 400, 520, 520, 315, 315],
    },
    index=pd.RangeIndex(name='engine', start=0, stop=6),
)

output_W = pd.DataFrame(
    data=np.linspace(
        start=engine_W['power_min'],
        stop=engine_W['power_max'],
        num=50,
    ),
    columns=engine_W.index,
)

# quadratic diesel fuel model, L/J
fuel_models = [
    Polynomial((list_df[0][2], list_df[0][1], list_df[0][0])),
    Polynomial((list_df[1][2], list_df[1][1], list_df[1][0])),
    Polynomial((list_df[2][2], list_df[2][1], list_df[2][0])),
    Polynomial((list_df[3][2], list_df[3][1], list_df[3][0])),
    Polynomial((list_df[4][2], list_df[4][1], list_df[4][0])),
    Polynomial((list_df[5][2], list_df[5][1], list_df[5][0])),

]
num_points = 50
num_engines = len(engine_W)

input_L_J = np.zeros((num_points, num_engines))

for engine, model in enumerate(fuel_models):
    input_L_J[:, engine] = model(output_W.loc[:, engine])

print(input_L_J)

target_W = 1000
assignments = pulp.LpVariable.matrix(
    name='asn', cat=pulp.LpBinary,
    indices=(range(num_engines), range(num_points)),

)
prob = pulp.LpProblem(name='diesel_generation', sense=pulp.LpMinimize)
prob.objective = pulp.LpAffineExpression()
total_output = pulp.LpAffineExpression()

for engine, engine_group in enumerate(assignments):
    prob.addConstraint(name=f'engine_excl_{engine}', constraint=pulp.lpSum(engine_group) <= 1)
    prob.objective += pulp.lpDot(engine_group, input_L_J[:, engine])

    total_output += pulp.lpDot(engine_group, output_W.loc[:, engine])

prob.addConstraint(name='total_power', constraint=total_output == target_W)

print(prob)
prob.solve()
assert prob.status == pulp.LpStatusOptimal

cons_idx = [
    next((i for i, var in enumerate(engine_group) if var.value() > 0.5), None)
    for engine_group in assignments
]
for engine, idx in enumerate(cons_idx):
    if idx is not None:
        print(
            'Engine', engine, 'active at', output_W.loc[idx, engine], 'kW, '
            f'consuming {input_L_J[idx, engine]:.1f}'

        )

Solution

  • First cut: assignment

    If we assume that the fuel consumption is tabular and can't be represented by a continuous function, then this is a simple assignment problem:

    import numpy as np
    import pandas as pd
    import pulp
    from numpy.polynomial import Polynomial
    
    engine_W = pd.DataFrame(
        data={
            'power_min': [100, 125, 150, 200, 250, 300],
            'power_max': [300, 325, 350, 400, 450, 500],
        },
        index=pd.RangeIndex(name='engine', start=0, stop=6),
    )*1e3
    
    output_W = pd.DataFrame(
        data=np.linspace(
            start=engine_W['power_min'],
            stop=engine_W['power_max'],
            num=41,
        ),
        columns=engine_W.index,
    )
    
    # quadratic diesel fuel model, L/J
    quadfuel = Polynomial((6.722e-8, 3.380e-8, 1.166e-8))
    output_frac = output_W / engine_W['power_max']
    input_L_J = quadfuel(output_frac)
    input_L_s = input_L_J * engine_W['power_max']
    
    target_W = 1000e3
    assignments = pulp.LpVariable.matrix(
        name='asn', cat=pulp.LpBinary,
        indices=(engine_W.index, input_L_s.index),
    )
    prob = pulp.LpProblem(name='diesel_generation', sense=pulp.LpMinimize)
    prob.objective = pulp.LpAffineExpression()
    total_output = pulp.LpAffineExpression()
    
    for engine, engine_group in enumerate(assignments):
        prob.addConstraint(name=f'engine_excl_{engine}', constraint=pulp.lpSum(engine_group) <= 1)
        prob.objective += pulp.lpDot(engine_group, input_L_s.loc[:, engine])
        total_output += pulp.lpDot(engine_group, output_W.loc[:, engine])
    
    prob.addConstraint(name='total_power', constraint=total_output == target_W)
    
    print(prob)
    prob.solve()
    assert prob.status == pulp.LpStatusOptimal
    
    cons_idx = [
        next((i for i, var in enumerate(engine_group) if var.value() > 0.5), None)
        for engine_group in assignments
    ]
    for engine, idx in enumerate(cons_idx):
        if idx is not None:
            print(
                'Engine', engine, 'active at', output_W.loc[idx, engine]*1e-3, 'kW, '
                f'consuming {input_L_s.loc[idx, engine]*60*60:.1f} l/h'
            )
    
    Engine 0 active at 295.0 kW, consuming 120.7 l/h
    Engine 1 active at 315.0 kW, consuming 129.8 l/h
    Engine 3 active at 390.0 kW, consuming 160.2 l/h
    

    Notice that I used a continuous 2-degree model anyway, to produce synthetic data.

    Second cut: assignment and continuous

    The much more advisable approach is to produce a continuous model for fuel consumption. This is quite easy because your fuel consumption (when in units of mass per time) is perfectly linear.

    import numpy as np
    import pandas as pd
    import pulp
    from matplotlib import pyplot as plt
    
    
    def load(filename: str = '77000778.csv') -> pd.DataFrame:
        df = pd.read_csv(
            filename, sep=';',
        ).rename(columns={
            'n': 'power_kW',
            'i': 'engine',
            'Av': 'consume_g_kWh',
        })
        df['consume_g_h'] = df['consume_g_kWh'] * df['power_kW']
        return df
    
    
    def linregress(df: pd.DataFrame) -> pd.DataFrame:
        params = pd.DataFrame(
            index=pd.RangeIndex(name='engine', start=df['engine'].min(), stop=df['engine'].max()+1),
        )
        grouped = df.groupby('engine')
        params['pmin'] = grouped['power_kW'].min()
        params['pmax'] = grouped['power_kW'].max()
    
        for engine, group in grouped:
            fit, *_ = np.linalg.lstsq(
                a=np.stack((group['power_kW'], np.ones(len(group))), axis=1),
                b=group['consume_g_h'],
                rcond=None,
            )
            params.loc[engine, ['m', 'b']] = fit
    
        return params
    
    
    def make_vars(consume_params: pd.DataFrame) -> tuple[
        dict[int, pulp.LpVariable],  # engine assignment
        dict[int, pulp.LpVariable],  # engine usage
    ]:
        assign = {
            engine:
            pulp.LpVariable(
                name=f'assign_{engine}', cat=pulp.LpBinary,
            )
            for engine in consume_params.index
        }
        power = {
            engine:
            pulp.LpVariable(
                name=f'power_{engine}', cat=pulp.LpContinuous,
            )
            for engine in consume_params.index
        }
        return assign, power
    
    
    def make_objective(
        assign: dict[int, pulp.LpVariable],
        power: dict[int, pulp.LpVariable],
        consume_params: pd.DataFrame,
    ) -> pulp.LpAffineExpression:
        return pulp.lpDot(
            power.values(),
            consume_params['m'],
        ) + pulp.lpDot(
            assign.values(),
            consume_params['b'],
        )
    
    
    def add_constraints(
        prob: pulp.LpProblem,
        assign: dict[int, pulp.LpVariable],
        power: dict[int, pulp.LpVariable],
        consume_params: pd.DataFrame,
        target_power: float,
    ) -> None:
        prob.addConstraint(
            name='total_power',
            constraint=target_power == pulp.lpSum(power.values()),
        )
    
        for engine, row in consume_params.iterrows():
            prob.addConstraint(
                name=f'pmin_{engine}',
                constraint=power[engine] >= row['pmin']*assign[engine],
            )
            prob.addConstraint(
                name=f'pmax_{engine}',
                constraint=power[engine] <= row['pmax']*assign[engine],
            )
    
    
    def dump(
        assign: dict[int, pulp.LpVariable],
        power: dict[int, pulp.LpVariable],
    ) -> None:
        for engine, asn in assign.items():
            if asn.value() > 0.5:
                p = power[engine].value()
                print(f'Engine {engine} runs at {p:.1f} kW')
    
    
    def plot(df: pd.DataFrame, consume_params: pd.DataFrame) -> plt.Figure:
        fig, (ax_left, ax_right) = plt.subplots(ncols=2)
    
        ax_left.set_xlabel('Power (kW)')
        ax_left.set_ylabel('Consumption (g/kWh)')
        ax_right.set_xlabel('Power (kW)')
        ax_right.set_ylabel('Consumption (g/h)')
    
        for engine, group in df.groupby('engine'):
            ax_left.scatter(group['power_kW'], group['consume_g_kWh'], label=f'table ({engine})', s=8)
            ax_right.scatter(group['power_kW'], group['consume_g_h'], label=f'table ({engine})', s=8)
    
            params = consume_params.loc[engine]
            power = np.linspace(start=params['pmin'], stop=params['pmax'], num=201)
            g_h = params['m']*power + params['b']
            g_kwh = g_h / power
    
            ax_left.plot(power, g_kwh, label=f'fit ({engine})')
            ax_right.plot(power, g_h, label=f'fit ({engine})')
        ax_left.legend(title='Engine')
    
        return fig
    
    
    def main() -> None:
        df = load()
        consume_params = linregress(df)
    
        assign, power = make_vars(consume_params)
        prob = pulp.LpProblem(name='diesel_generation', sense=pulp.LpMinimize)
        prob.objective = make_objective(assign, power, consume_params)
        add_constraints(prob, assign, power, consume_params, target_power=970.3)
    
        print(prob)
        prob.solve()
        assert prob.status == pulp.LpStatusOptimal
        dump(assign, power)
    
        plot(df, consume_params)
        plt.show()
    
    
    if __name__ == '__main__':
        main()
    
    diesel_generation:
    MINIMIZE
    9269.900724484416*assign_1 + 9269.900724484416*assign_2 + 13066.01881329114*assign_3 + 13066.01881329114*assign_4 + 8046.24272579243*assign_5 + 8046.24272579243*assign_6 + 208.57198202009533*power_1 + 208.57198202009533*power_2 + 226.14324824732228*power_3 + 226.14324824732228*power_4 + 229.88975993269645*power_5 + 229.88975993269645*power_6 + 0.0
    SUBJECT TO
    total_power: power_1 + power_2 + power_3 + power_4 + power_5 + power_6 = 970.3
    
    pmin_1: - 100 assign_1 + power_1 >= 0
    
    pmax_1: - 400 assign_1 + power_1 <= 0
    
    pmin_2: - 100 assign_2 + power_2 >= 0
    
    pmax_2: - 400 assign_2 + power_2 <= 0
    
    pmin_3: - 130 assign_3 + power_3 >= 0
    
    pmax_3: - 520 assign_3 + power_3 <= 0
    
    pmin_4: - 130 assign_4 + power_4 >= 0
    
    pmax_4: - 520 assign_4 + power_4 <= 0
    
    pmin_5: - 80 assign_5 + power_5 >= 0
    
    pmax_5: - 315 assign_5 + power_5 <= 0
    
    pmin_6: - 80 assign_6 + power_6 >= 0
    
    pmax_6: - 315 assign_6 + power_6 <= 0
    
    VARIABLES
    0 <= assign_1 <= 1 Integer
    0 <= assign_2 <= 1 Integer
    0 <= assign_3 <= 1 Integer
    0 <= assign_4 <= 1 Integer
    0 <= assign_5 <= 1 Integer
    0 <= assign_6 <= 1 Integer
    power_1 free Continuous
    power_2 free Continuous
    power_3 free Continuous
    power_4 free Continuous
    power_5 free Continuous
    power_6 free Continuous
    ...
    
    Result - Optimal solution found
    
    Objective value:                232593.85590738
    Enumerated nodes:               0
    Total iterations:               1
    Time (CPU seconds):             0.01
    Time (Wallclock seconds):       0.01
    
    Option for printingOptions changed from normal to all
    Total time (CPU seconds):       0.01   (Wallclock seconds):       0.01
    
    Engine 1 runs at 400.0 kW
    Engine 2 runs at 400.0 kW
    Engine 5 runs at 170.3 kW
    

    fits