I'm using oct2py to call corrcoef.m on several large (10MM+) dataframes, returning [R,P] matrices that are used to generate training sets for an ML algorithm. Yesterday, I had this working with no problems. This morning I ran the script from the top, producing an identical test set to be passed to Octave through oct2py.
Now I get this error:
Oct2PyError: Octave evaluation error: error: isnan: not defined for cell error: called from: corrcoef at line 152, column 5 CorrCoefScript at line 1, column 7
First, there are no null/nan values in the set. In fact, there aren't even any zeros. No column is constant, so every standard deviation in the corrcoef calculation is nonzero. The input is mathematically sound.
Second, when I load the test set into Octave through the GUI and execute the same .m on the same data, no errors are returned, and the [R,P] matrices are identical to the saved outputs from last night. I tested whether the matrix var is being passed to Octave through oct2py correctly, and Octave is receiving an identical matrix. However, oct2py can no longer execute ANY .m with a nan check in the source code. The error above is returned for any Octave packaged .m script that calls isnan at any point.
For s&g, I modified my .m to receive the matrix var and write it to a flat file like so:
csvwrite ('filename', data);
This also fails with an fprintf error; if I run the same code on the same dataset inside the Octave GUI, it works fine.
I'm at a loss here. I updated conda, oct2py, and Octave with the same results. Again, the exact code with the exact data behaved as expected less than 24 hours earlier.
I'm using the code below in Jupyter Notebook to test:
%env OCTAVE_EXECUTABLE = F:\Octave\Octave-5.1.0.0\mingw32\bin\octave-cli-5.1.0.exe
import oct2py
from oct2py import octave
octave.addpath('F:\\FinanceServer\\Python\\Secondary Docs\\autotesting\\atOctave_Scripts');
data = x
octave.push('data',data)
octave.eval('CorrCoefScript')
cmat = octave.pull('R')
Side note - I am only having this issue inside of one specific .ipynb script. Through some stroke of luck, no other scripts using oct2py seem to be affected.
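One thing that may be worth ruling out, given the "isnan: not defined for cell" message: as far as I know, oct2py sends Python lists and object-dtype NumPy arrays to Octave as cell arrays rather than numeric matrices. Whether that is what happened here is only a guess, since the post does not confirm it. The sketch below forces the data into a plain 2-D float array before it is pushed, and raises if that is not possible. The helper name to_float_matrix is hypothetical.

```python
import numpy as np


def to_float_matrix(data):
    """Return data as a 2-D float64 NumPy array, or raise if it cannot be one.

    Hypothetical helper: accepts a DataFrame (anything with to_numpy()),
    a NumPy array, or a nested list.
    """
    # DataFrames expose to_numpy(); everything else goes through np.asarray.
    values = data.to_numpy() if hasattr(data, "to_numpy") else np.asarray(data)
    # Raises ValueError or TypeError if any entry is not numeric.
    matrix = np.asarray(values, dtype=np.float64)
    if matrix.ndim != 2:
        raise ValueError(f"expected a 2-D matrix, got {matrix.ndim} dimension(s)")
    return matrix
```

With this in place, the push from the notebook would become octave.push('data', to_float_matrix(x)), so a list or an object-dtype frame fails loudly on the Python side instead of inside corrcoef.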
Got it fixed, but it generates more questions than answers. I was using a list of dataframes to loop by type, such that for each iteration i, x was generated through x = dflst[i]. For reasons beyond my understanding, that failed with the passage of time. However, by writing my loop into a custom function and explicitly calling that function on each dataframe, as in oct_func(type1df), I am seeing the expected behavior and desired outcome. I still cannot use a loop to pass the dataframes to oct_func(), though. So it's a band-aid solution that fits my purposes, but it frustratingly does not scale.
Edit: The loop works fine if iterating through a dict of dataframes instead of a list.
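For anyone landing here, the dict version of the loop looks roughly like the sketch below. The body of oct_func is not shown in this post, so it is passed in as a parameter here; run_for_each and the key names in the usage note are made up for illustration.

```python
def run_for_each(frames, oct_func):
    """Call oct_func once per named dataframe and collect the results by name.

    frames:   dict mapping a label (e.g. a type name) to a dataframe
    oct_func: callable that pushes one dataframe to Octave and returns a result
    """
    results = {}
    # Iterate over the dict's items instead of indexing into a list.
    for name, frame in frames.items():
        results[name] = oct_func(frame)
    return results
```

Usage would be something like run_for_each({'type1': type1df, 'type2': type2df}, oct_func). Why this works where indexing a list did not is still unclear to me.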