I'm using Hypothesis to test dataframes, and when they're "empty-ish" I'm getting some unexpected behavior.
In the example below, I have a dataframe of all NaNs, and it's being treated as a NoneType object rather than a dataframe (and thus it has no attribute notnull()):
Falsifying example: test_merge_csvs_properties(input_df_dict={'googletrend.csv':   file  week  trend
0   NaN   NaN    NaN
1   NaN   NaN    NaN
2   NaN   NaN    NaN
3   NaN   NaN    NaN
4   NaN   NaN    NaN
5   NaN   NaN    NaN}
<snip>
Traceback (most recent call last):
File "/home/chachi/Capstone-SalesForecasting/tests/test_make_dataset_with_composite.py", line 285, in test_merge_csvs_properties
input_dataframe, df_dict = make_dataset.merge_csvs(input_df_dict)
File "/home/chachi/Capstone-SalesForecasting/tests/../src/data/make_dataset.py", line 238, in merge_csvs
if dfs_dict['googletrend.csv'].notnull().any().any():
AttributeError: 'NoneType' object has no attribute 'notnull'
Compare to ipython session, where a dataframe of all nans is still a dataframe:
>>> import pandas as pd
>>> import numpy as np
>>> tester = pd.DataFrame({'test': [np.NaN]})
>>> tester
test
0 NaN
>>> tester.notnull().any().any()
False
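For what it's worth, a quick sketch (variable names here are illustrative) confirms the same thing: an all-NaN dataframe is still a DataFrame and handles notnull() fine, while the AttributeError in the traceback can only come from calling notnull() on None itself:

```python
import pandas as pd
import numpy as np

# An all-NaN dataframe is still a DataFrame, and notnull() works on it
all_nan = pd.DataFrame({'test': [np.nan]})
print(type(all_nan))                   # pandas DataFrame, not NoneType
print(all_nan.notnull().any().any())   # False, but no exception

# The traceback's error only reproduces when the value is literally None
try:
    None.notnull()
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'notnull'
```

So the "empty-ish" dataframe isn't the problem; the dict value must be None by the time merge_csvs sees it.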
I'm explicitly testing for notnull() to allow for all sorts of pathological examples. Any suggestions?
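In case it helps frame the question: the check I'd like to make safe is something like the sketch below, where has_data is a hypothetical helper (not in my actual code) that guards against a None value before calling any DataFrame methods:

```python
import pandas as pd

def has_data(df):
    # Hypothetical guard: only call DataFrame methods like notnull()
    # if the value is actually a dataframe, not None
    return df is not None and df.notnull().any().any()
```

Inside merge_csvs this would replace the bare dfs_dict['googletrend.csv'].notnull().any().any() call, but it papers over the real question of why None is appearing in the dict at all.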
It looks like you've somehow ended up with None instead of a dataframe as that value in input_df_dict. Can you post the full test you're using, or at least the function definition and strategy? The traceback alone doesn't have enough information to tell what's happening. Quick things to check: