I am using dask.distributed, and I have two dask DataFrames, A and B. Both have the same number of partitions, and each partition is a 2D DataFrame with the same columns and rows, holding float64 values. When I multiply the dask DataFrames as A*B and compute the result, I get a DataFrame of the same size that is full of NaN values.
I tried computing a single partition of each dataframe individually as in:
A.partitions[1].compute()
B.partitions[1].compute()
Neither of the two contains NaN values. I then multiplied the two computed partitions:
A.partitions[1].compute()*B.partitions[1].compute()
and I still get a DataFrame of the same size that is full of NaN values. What could the problem be, and why am I not getting the actual float64 results? Note that other multiplication operations seem to work fine. Could it be related to the different graph layers?
The issue was solved by simply making the columns of both dask DataFrames equal, assigning one set of column labels to the other:
A.columns = B.columns
Even though, upon inspection, the columns and rows appeared to have the same names, counts, and types, there must have been an unnoticed discrepancy in the column labels.