python pandas visual-studio-code intellisense pylance

Why won't IntelliSense work with pandas pipe()?


In VS Code, IntelliSense does not seem to infer the return type of calls to pandas.DataFrame.pipe. This is inconvenient because I cannot rely on autocompletion after using pipe. I haven't seen this issue mentioned anywhere, so I wonder whether it's just me or whether I am missing something.

This is what I do:

import pandas as pd
df = pd.DataFrame({'A': [1,2,3]})
df2 = df.pipe(lambda x: x + 1)

VS Code recognizes df as a DataFrame, but has no clue what df2 might be.

A first thought would be that this is due to the lack of type hinting in the lambda function. But if I try this instead:

def add_one(df: pd.DataFrame) -> pd.DataFrame:
  return df + 1
df3 = df.pipe(add_one)

IntelliSense still can't infer the type of df3.

Of course, as a last resort, I can add a hint to df3 itself:

df3: pd.DataFrame = df.pipe(add_one)

But it seems like this shouldn't be necessary. IntelliSense is very capable of inferring return types in other complex scenarios, such as those involving map.


UPDATE:

I experimented a bit more and found some interesting patterns which narrow down the range of possible causes.

I am not sufficiently familiar with Pylance to really understand why this is happening, but here is what I found:

Finding 1

The same thing happens with pandas.core.common.pipe if I import it directly. (I know pd.DataFrame.pipe calls pandas.core.generic.pipe, but that internally calls pandas.core.common.pipe, and I can reproduce the issue with pandas.core.common.pipe.)

Finding 2

If I copy the definition of that same function from pandas.core.common, together with the relevant imports of Callable and TypeVar, and declare T as TypeVar('T'), IntelliSense actually does its magic.

(Actually in pandas.core.common, T is not defined as TypeVar('T') but imported from pandas._typing, where it is defined as TypeVar('T'). If I import it instead of defining it myself, it still works fine.)

From this I am tempted to conclude that pandas does everything right, but that Pylance is failing to keep track of type information for some unknown reason...
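
As a rough illustration of Finding 2, a local copy along these lines is enough for a type checker to propagate the return type. This is a simplified sketch, not the pandas source: the real pandas.core.common.pipe also accepts a (callable, keyword) tuple, which is omitted here, T is declared locally as a stand-in for pandas._typing.T, and the add_one helper is purely illustrative.

```python
from typing import Any, Callable, TypeVar

# Assumption: a locally declared TypeVar, standing in for pandas._typing.T.
T = TypeVar("T")


def pipe(obj: Any, func: Callable[..., T], *args: Any, **kwargs: Any) -> T:
    """Call func(obj, *args, **kwargs) and return whatever func returns."""
    return func(obj, *args, **kwargs)


def add_one(x: int) -> int:
    # Hypothetical helper used only for this example.
    return x + 1


# Because func is typed as Callable[..., T], a type checker can bind T to int.
result = pipe(3, add_one)
```

With this annotated signature, hovering over result should show the return type of add_one rather than an unknown type, which matches what I observed with the copied function.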

Finding 3

If I just copy pandas.core.common into a local file pandascommon.py and import pipe from that, it works fine too!


Solution

  • I got it!

    It was due to the stubs shipped with Pylance, specifically those in ~/.vscode/extensions/ms-python.vscode-pylance-2022.3.2/dist/bundled/stubs/pandas/.

    For example in core/common.pyi I found this stub: def pipe(obj, func, *args, **kwargs): ...

    Pylance uses this unannotated stub instead of the annotations in pandas.core.common.pipe, so it has no return type to work with. That is what causes the issue.

    One heavy-handed solution is to delete (or rename) the pandas stubs in that folder. Then pipe works again. On the other hand, this breaks some other things: for example, read_csv is no longer correctly inferred to return a DataFrame. I think the better long-term solution would be for the Pylance maintainers to improve those stubs.

    A minimally invasive solution to the original pipe issue is to edit ~/.vscode/extensions/ms-python.vscode-pylance-2022.3.2/dist/bundled/stubs/pandas/core/frame.pyi in the following manner:

    • add from pandas._typing import T

    • replace the line starting with def pipe by:

      def pipe(self, func: Callable[..., T], *args, **kwargs) -> T: ...
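
The version number in the extension folder name changes with each Pylance release, so it may help to locate the bundled pandas stubs programmatically before editing them. The following is a minimal sketch that assumes the directory layout quoted above (ms-python.vscode-pylance-<version>/dist/bundled/stubs/pandas under ~/.vscode/extensions). The function name is hypothetical, and other setups (for example VS Code Insiders or a remote server) may keep extensions somewhere else.

```python
from pathlib import Path
from typing import List


def find_pylance_pandas_stubs(extensions_dir: Path) -> List[Path]:
    """Return the bundled pandas stub directories of every installed Pylance version.

    Assumes the layout described in the answer; returns an empty list
    if the extensions directory does not exist.
    """
    if not extensions_dir.is_dir():
        return []
    pattern = "ms-python.vscode-pylance-*/dist/bundled/stubs/pandas"
    return sorted(p for p in extensions_dir.glob(pattern) if p.is_dir())


if __name__ == "__main__":
    # Assumption: the default per-user extensions folder of a desktop VS Code install.
    for stub_dir in find_pylance_pandas_stubs(Path.home() / ".vscode" / "extensions"):
        print(stub_dir / "core" / "frame.pyi")
```

Since each Pylance version lives in its own folder, a manual edit to frame.pyi will probably need to be reapplied after the extension updates.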