Search code examples
pythonpandascplex

Parsing CPLEX logfile to Pandas Dataframe


I'm trying to parse the Cplex Logfile for a MILP Problem to a Pandas Dataframe. The Logfile Part for the Branch and Bound looks like this:

['  Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Ga',
 '',
 '     0+    0                            0.0000 -6819503.0000              ---',
 '     0     0  -149597.6789  3202        0.0000  -149597.6789       37     ---',
 '     0+    0                       -83655.0000  -149597.6789            78.83',
 '     0+    0                      -125854.0000  -149597.6789            18.87',
 '     0+    0                      -126474.0000  -149597.6789            18.28',
 '     0+    0                      -130810.0000  -138528.6611             5.90',
 '     0+    0                      -133740.0000  -135305.2732             1.17',
 '     0     2  -135305.2732  3622  -133740.0000  -135305.2732     6307    1.17',
 '    79    49  -135300.3578  3627  -133740.0000  -135304.7364     6743    1.17',
 '   207   172  -135294.5414  3606  -133740.0000  -135304.7364     8033    1.17',
 '   337   280  -135292.2563  3569  -133740.0000  -135304.7364     9140    1.17',
 '   457   433  -135294.0213  3591  -133740.0000  -135304.7364    10532    1.17',
 '   569   500  -135285.6607  3573  -133740.0000  -135304.7364    11117    1.17',
 '   627   559  -135282.5594  3559  -133740.0000  -135304.7364    11710    1.17',
 '   735   667  -135283.6653  3582  -133740.0000  -135304.7364    12619    1.17',
 '   815   688  -135281.1259  3574  -133740.0000  -135304.7364    12786    1.17',
 '   912   844  -135282.2331  3629  -133740.0000  -135304.7364    14043    1.17',
 '  1244  1085  -135267.1649  3544  -133740.0000  -135304.7364    16446    1.17',
 '  1458  1352  -135241.3260  3490  -133740.0000  -135304.7364    18874    1.17',
 '  1569  1487  -135230.0718  3454  -133740.0000  -135304.7364    20190    1.17',
 '  1607  1525  -135214.0255  3400  -133740.0000  -135304.7364    20599    1.17',
 '  1707  1590  -135196.7871  3372  -133740.0000  -135304.7364    21195    1.17',
 '  1818+ 1327                      -133837.0000  -135304.7364             1.10',
...
 ' 11004 10458  -134792.2319  2747  -134481.0000  -135304.7364   106870    0.61',
 ' 11005+ 9743                      -134482.0000  -135304.7364             0.61',
 ' 11005 10377  -135051.5669  3342  -134482.0000  -135304.7364   106128    0.61',
 ' 11006+ 9743                      -134485.0000  -135304.7364             0.61',
 ' 11007  9697  -135085.5502  3356  -134485.0000  -135304.7364   100103    0.61']

My Problem is if I try to read it with Pandas the missing values are shifting the values in the wrong cells.

    Node    Left    Objective   IInf    BestInteger BestBound   ItCnt   Gap
0   0+  0   0.0000          -6.819503e+06   --- NaN         NaN NaN
1   0   0   -149597.6789    3.202000e+03    0.0000  -149597.6789    37.0    ---
2   0+  0   -83655.0000 -1.495977e+05   78.83   NaN         NaN NaN
3   0+  0   -125854.0000    -1.495977e+05   18.87   NaN         NaN NaN
4   0+  0   -126474.0000    -1.495977e+05   18.28   NaN         NaN NaN
5   0+  0   -130810.0000    -1.385287e+05   5.90    NaN         NaN NaN
6   0+  0   -133740.0000    -1.353053e+05   1.17    NaN         NaN NaN
7   0   2   -135305.2732    3.622000e+03    -133740.0000    -135305.2732    6307.0  1.17
8   79  49  -135300.3578    3.627000e+03    -133740.0000    -135304.7364    6743.0  1.17
9   207 172 -135294.5414    3.606000e+03    -133740.0000    -135304.7364    8033.0  1.17

I used the following line to parse my list to pandas:

test = pd.read_csv(io.StringIO("\n".join(milp_log)), delim_whitespace=True)

Solution

  • You should read your strings as a fixed-width file using pd.read_fwf:

    import io
    import pandas as pd
    
    df = pd.read_fwf(io.StringIO('\n'.join(milp_log)))
    

    Output:

         Node  Left    Objective    IInf  Best Integer    Best Bound     ItCnt     Ga
    0       0+    0          NaN     NaN           0.0 -6.819503e+06       NaN    ---
    1       0     0 -149597.6789  3202.0           0.0 -1.495977e+05      37.0    ---
    2       0+    0          NaN     NaN      -83655.0 -1.495977e+05       NaN  78.83
    3       0+    0          NaN     NaN     -125854.0 -1.495977e+05       NaN  18.87
    4       0+    0          NaN     NaN     -126474.0 -1.495977e+05       NaN  18.28
    5       0+    0          NaN     NaN     -130810.0 -1.385287e+05       NaN   5.90
    6       0+    0          NaN     NaN     -133740.0 -1.353053e+05       NaN   1.17
    7       0     2 -135305.2732  3622.0     -133740.0 -1.353053e+05    6307.0   1.17
    8      79    49 -135300.3578  3627.0     -133740.0 -1.353047e+05    6743.0   1.17
    9     207   172 -135294.5414  3606.0     -133740.0 -1.353047e+05    8033.0   1.17
    10    337   280 -135292.2563  3569.0     -133740.0 -1.353047e+05    9140.0   1.17
    11    457   433 -135294.0213  3591.0     -133740.0 -1.353047e+05   10532.0   1.17
    12    569   500 -135285.6607  3573.0     -133740.0 -1.353047e+05   11117.0   1.17
    13    627   559 -135282.5594  3559.0     -133740.0 -1.353047e+05   11710.0   1.17
    14    735   667 -135283.6653  3582.0     -133740.0 -1.353047e+05   12619.0   1.17
    15    815   688 -135281.1259  3574.0     -133740.0 -1.353047e+05   12786.0   1.17
    16    912   844 -135282.2331  3629.0     -133740.0 -1.353047e+05   14043.0   1.17
    17   1244  1085 -135267.1649  3544.0     -133740.0 -1.353047e+05   16446.0   1.17
    18   1458  1352 -135241.3260  3490.0     -133740.0 -1.353047e+05   18874.0   1.17
    19   1569  1487 -135230.0718  3454.0     -133740.0 -1.353047e+05   20190.0   1.17
    20   1607  1525 -135214.0255  3400.0     -133740.0 -1.353047e+05   20599.0   1.17
    21   1707  1590 -135196.7871  3372.0     -133740.0 -1.353047e+05   21195.0   1.17
    22   1818+ 1327          NaN     NaN     -133837.0 -1.353047e+05       NaN   1.10
    23  11004 10458 -134792.2319  2747.0     -134481.0 -1.353047e+05  106870.0   0.61
    24  11005+ 9743          NaN     NaN     -134482.0 -1.353047e+05       NaN   0.61
    25  11005 10377 -135051.5669  3342.0     -134482.0 -1.353047e+05  106128.0   0.61
    26  11006+ 9743          NaN     NaN     -134485.0 -1.353047e+05       NaN   0.61
    27  11007  9697 -135085.5502  3356.0     -134485.0 -1.353047e+05  100103.0   0.61