I am attempting to couple a pymoo optimization algorithm with a regression model. I found this fantastic example on GitHub and am attempting to follow its steps, starting from In[16].
The example problem is probably using an outdated method of loading the pickle file (it wasn't working for me), so I'm using Model = load_model('RegModel') for my first model output and, similarly, Model2 = load_model('Reg2Model') for my second output. I was able to follow the rest of the example up to In[69]. I'm also using two different pkl files to predict two different outcomes of my data, but when I use
result=pd.DataFrame(list(res.X))
result['Output1']=res.F
result['Output2']=Model2.predict(result)
I get the error in the question title, namely: ValueError: Length of values (1) does not match length of index (11)
(Note that my length is 11 instead of the example's 5 because I'm using a different dataset. The methodology, however, is the same.)
I also tried skipping the pd.DataFrame object and passing the data as a NumPy array instead, but that didn't work either. At this point I'm stumped. If I leave out Output 2 and define only Output 1 (Output = res.F), my code runs, but the result is obviously nonsensical.
Any help is appreciated, including the intuition behind In[69]. Appended is the full error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[14], line 4
1 result= pd.DataFrame(list(res.X))
2 result
----> 4 result['Output1']= res.F
5 result['Output2']=Reg2Model.predict(result)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\frame.py:3980, in DataFrame.__setitem__(self, key, value)
3977 self._setitem_array([key], value)
3978 else:
3979 # set column
-> 3980 self._set_item(key, value)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\frame.py:4174, in DataFrame._set_item(self, key, value)
4164 def _set_item(self, key, value) -> None:
4165 """
4166 Add series to DataFrame in specified column.
4167
(...)
4172 ensure homogeneity.
4173 """
-> 4174 value = self._sanitize_column(value)
4176 if (
4177 key in self.columns
4178 and value.ndim == 1
4179 and not is_extension_array_dtype(value)
4180 ):
4181 # broadcast across multiple columns if necessary
4182 if not self.columns.is_unique or isinstance(self.columns, MultiIndex):
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\frame.py:4915, in DataFrame._sanitize_column(self, value)
4912 return _reindex_for_setitem(Series(value), self.index)
4914 if is_list_like(value):
-> 4915 com.require_length_match(value, self.index)
4916 return sanitize_array(value, self.index, copy=True, allow_2d=True)
File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\common.py:571, in require_length_match(data, index)
567 """
568 Check the length of data matches the length of the index.
569 """
570 if len(data) != len(index):
--> 571 raise ValueError(
572 "Length of values "
573 f"({len(data)}) "
574 "does not match length of index "
575 f"({len(index)})"
576 )
ValueError: Length of values (1) does not match length of index (11)
Your error is due to trying to add an array of length 1 to a DataFrame that has 11 rows (the length of its index is 11).
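One way this mismatch can arise is sketched below. It assumes, without having seen the actual data, that the run was single-objective, so that res.X is a one-dimensional vector of 11 design variables and res.F holds a single value. The arrays X and F are hypothetical stand-ins for res.X and res.F, not output from pymoo.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins (assumption): a single solution with 11 design
# variables and one objective value, mimicking res.X and res.F.
X = np.zeros(11)        # shape (11,)
F = np.array([0.5])     # shape (1,)

# list(X) yields 11 scalars, so pandas builds 11 rows with 1 column
# instead of 1 row with 11 columns.
frame = pd.DataFrame(list(X))
print(frame.shape)

message = ""
try:
    frame["Output1"] = F    # 1 value for 11 rows
except ValueError as exc:
    message = str(exc)
print(message)
```

If the real res.X is two-dimensional (one row per solution), this explanation does not apply and the shapes need to be inspected directly.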
You should check what res.X and res.F look like. Are they consistent? And what does the DataFrame look like? Is there anything else in it?
Usually the number of solutions in the solution space res.F should equal the number of sets in the design space res.X.
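A minimal sketch of that check follows, again using hypothetical stand-in arrays X and F in place of res.X and res.F (assumed here to have shapes (11,) and (1,)). It prints the shapes and then uses np.atleast_2d so that both arrays have one row per solution before the DataFrame is built. Treat it as one possible approach, not as the method used in the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for res.X and res.F (assumed shapes).
X = np.linspace(0.0, 1.0, 11)   # shape (11,)
F = np.array([0.42])            # shape (1,)

# Step 1: inspect the shapes; the first dimensions should agree.
print("X:", X.shape, "F:", F.shape)

# Step 2: force a (n_solutions, n_columns) layout for both arrays.
X2d = np.atleast_2d(X)          # shape (1, 11)
F2d = np.atleast_2d(F)          # shape (1, 1)

# Step 3: one row per solution, one column per design variable.
features = pd.DataFrame(X2d)
result = features.copy()
result["Output1"] = F2d[:, 0]
print(result.shape)
```

Note also that in the question's snippet, predict is called on result after Output1 has already been added, so the second model would receive that extra column. Predicting from a frame that holds only the design variables (features above) is probably what is intended, assuming the model was trained on those columns alone.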