
Python 'NoneType' object has no attribute '_value'


I am combining a list of data frames into one data frame using pd.concat. However, I am getting a "'NoneType' object has no attribute '_value'" error on AWS Lambda. The code works on Python 3.8 but fails on Python 3.9+.

My code is shown below. I call an API to get a task data frame for each project ID, then concatenate those task data frames into one data frame.

def get_tasks(api_client, projects):
    tasks_api_response_df_lst = []
    for i in project_ids:
        task_df = ...  # call API to get the data frame
        tasks_api_response_df_lst.append(task_df)

    tasks_api_response_df = pd.concat(tasks_api_response_df_lst)
    return tasks_api_response_df

{
"errorMessage": "'NoneType' object has no attribute '_value'",
"errorType": "AttributeError",
"stackTrace": [
"File \"/var/task/lambda_function.py\", line 137, in lambda_handler\n    tasks_df=get_tasks(api_client, projects_df)\n", 
"File \"/var/task/lambda_function.py\", line 116, in get_tasks\n    tasks_api_response_df = pd.concat(tasks_api_response_df_lst)\n",  
"File \"/opt/python/pandas/core/reshape/concat.py\", line 393, in concat\n    return op.get_result()\n",
"File \"/opt/python/pandas/core/reshape/concat.py\", line 680, in get_result\n    new_data = concatenate_managers(\n",
"File \"/opt/python/pandas/core/internals/concat.py\", line 189, in concatenate_managers\n    values = _concatenate_join_units(join_units, copy=copy)\n",
"File \"/opt/python/pandas/core/internals/concat.py\", line 466, in _concatenate_join_units\n    to_concat = [\n",
"File \"/opt/python/pandas/core/internals/concat.py\", line 467, in <listcomp>\n    ju.get_reindexed_values(empty_dtype=empty_dtype, upcasted_na=upcasted_na)\n",
"File \"/opt/python/pandas/core/internals/concat.py\", line 452, in get_reindexed_values\n    return make_na_array(empty_dtype, self.block.shape, fill_value)\n",
"File \"/opt/python/pandas/core/internals/managers.py\", line 2294, in make_na_array\n    i8values = np.full(shape, fill_value._value)\n"
  ]
}

I have also tried appending each data frame directly to a single data frame (instead of collecting them in a list), but collecting the data frames in a list and concatenating them once seems to be the best approach.

Any help is appreciated! Thanks!


Solution

  • This was originally a comment, but it seems to have answered the question, so I'm expanding it to an answer.

    This question is an excellent exercise in debugging and reasoning skills. All junior programmers should eagerly seek to practice debugging. It's an essential skill that will enable you to solve problems quickly and confidently. The current generation of AI assistants is unlikely to be much help in a nontrivial real-world debugging scenario.

    (I did actually check, and ChatGPT 3.5 did a decent job with this particular question. But that's partly because you had already done a good job of narrowing down the problem up front.)

    So, how do we debug this? The answer is almost always to start with the error message.

    The error message here says:

    'NoneType' object has no attribute '_value'
    

    We can only conclude that something, somewhere, tried to access ._value on a None object. Clearly, it expected something other than None. Therefore our instinct should be to find where a None might appear unexpectedly.
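    To see how this message arises, here is a minimal sketch that reproduces it outside pandas. The variable name fill_value is only borrowed from the last traceback frame for illustration; nothing here is the actual pandas code.

```python
# A None sitting where some other object was expected.
fill_value = None

try:
    # Any attribute access on None fails the same way.
    fill_value._value
except AttributeError as exc:
    message = str(exc)

print(message)
```

    The printed text matches the errorMessage field in the Lambda output, which confirms that the message is describing an attribute lookup on None.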

    Next, we should look at the traceback, to figure out where in the code the problem arose. You don't have to know what every item in this output means (in this case it's a lot of irrelevant internal Pandas stuff), but it should hopefully be obvious that this error originates from our pd.concat() call:

    File \"/var/task/lambda_function.py\", line 116, in get_tasks\n    tasks_api_response_df = pd.concat(tasks_api_response_df_lst)
    

    In a normal Python program, that output would be formatted like this:

    File "/var/task/lambda_function.py", line 116, in get_tasks
        tasks_api_response_df = pd.concat(tasks_api_response_df_lst)
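    If you only have the JSON form of the error, a short sketch like the following can turn it back into the familiar layout. It assumes the payload has the same "stackTrace" list of strings as the output in the question; the payload here is trimmed to a single frame for brevity.

```python
import json

# Raw string so the JSON escapes (\" and \n) are left for json.loads to decode.
payload = r'''{
  "errorType": "AttributeError",
  "stackTrace": [
    "File \"/var/task/lambda_function.py\", line 116, in get_tasks\n    tasks_api_response_df = pd.concat(tasks_api_response_df_lst)\n"
  ]
}'''

error = json.loads(payload)

# Each entry already ends with a newline, so joining them restores the layout.
readable = "".join(error["stackTrace"])
print(readable)
```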
    

    This narrows things down a lot. Assuming that this is not a bug in pd.concat, there's only one other possible cause: tasks_api_response_df_lst, the single input to pd.concat. We constructed this tasks_api_response_df_lst, so it must be the case that somehow we did something wrong to cause the error.
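    One quick way to inspect that input is to count what kinds of objects ended up in the list just before the pd.concat call. This is a sketch only: summarize_types is a hypothetical helper, and the sample list uses strings as stand-ins for the data frames the API would return.

```python
from collections import Counter


def summarize_types(items):
    """Count the items in a list by type name (hypothetical debugging helper)."""
    return Counter(type(item).__name__ for item in items)


# Stand-in for tasks_api_response_df_lst; real entries would be DataFrames.
sample = ["frame-1", None, "frame-3"]
summary = summarize_types(sample)
print(summary)
```

    Any type other than DataFrame in the summary, NoneType in particular, is worth investigating.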

    At this point we can be reasonably confident about the location and cause of the error: somewhere inside pd.concat, something was None when it wasn't supposed to be. With the problem diagnosed, the fix should hopefully be easy: check for items that are None, and either discard each bad item (ideally with a logged warning) or raise an exception.
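    A minimal sketch of that check might look like the following. The helper collect_valid_frames, its strict flag, and the dictionary of responses keyed by project ID are all assumptions made for illustration; the strings stand in for real data frames. Whether this resolves the error depends on what the API actually returns.

```python
import logging

logger = logging.getLogger(__name__)


def collect_valid_frames(frames_by_id, strict=False):
    """Return the non-None values of frames_by_id, in insertion order.

    frames_by_id maps a project ID to whatever the API call returned for it.
    With strict=True a missing response raises instead of being skipped.
    """
    valid = []
    for project_id, frame in frames_by_id.items():
        if frame is None:
            if strict:
                raise ValueError(f"No task data returned for project {project_id!r}")
            logger.warning("Skipping project %r: API returned None", project_id)
            continue
        valid.append(frame)
    return valid


# Stand-in values; in the real code these would be DataFrames from the API.
responses = {"p1": "frame-1", "p2": None, "p3": "frame-3"}
valid_frames = collect_valid_frames(responses)
print(valid_frames)
```

    The resulting list is what would then be passed to pd.concat. Logging the skipped ID keeps the gap visible, while strict mode suits pipelines where a missing project should stop the run.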

    The one wrinkle in this reasoning is that your code appears to have worked under Python 3.8 and not under Python 3.9. We know that something changed, and as a result your code started failing, but we can't know for sure that the Python version itself was the cause.

    Are you sure that the only change in environment was switching from Python 3.8 to 3.9? Did the Pandas version in the environment change? Is this a new gap in Pandas error handling (arguably a bug) that was not present in an older version? Or did something else change? Did the upstream API change its format? Did you adjust your implementation of get_tasks? Are the files actually still present in the upstream data source? Etc.
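    To answer the version questions, it can help to log the interpreter and package versions in both environments and compare them. This is a sketch using only the standard library; environment_report is a hypothetical helper, and the choice of pandas and numpy as the packages to check is an assumption based on the traceback.

```python
import sys
from importlib import metadata


def environment_report(packages=("pandas", "numpy")):
    """Return the Python version plus the installed version of each package.

    A package that is not installed is reported as None.
    """
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


report = environment_report()
print(report)
```

    Running this once in the working Python 3.8 environment and once in the failing one shows whether the package versions changed along with the interpreter.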