I have a dataframe called topmentions. Here is some data about it:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 30 entries, 22 to 29
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   reference  30 non-null     object
 1   freq       30 non-null     int64
dtypes: int64(1), object(1)
memory usage: 720.0+ bytes
None
                    reference  freq
22         Giorgia Meloni|PER     4
16      Matteo Piantedosi|PER     3
10           Donald Trump|PER     3
28  Gianfranco Baruchello|PER     3
3        Tomaso Montanari|PER     3
Although it is a valid dataframe and the code below works as expected, Pylint flags the variable topmentions with the message: Value 'topmentions' is unsubscriptable.
Here is the code that gets flagged:
json_string = topmentions[
    topmentions["freq"].cumsum() < topmentions["freq"].sum() / 2
].to_json(orient="records")
All three occurrences of topmentions in the snippet are flagged as errors. What's wrong?
PS: I know I can suppress those errors by adding # pylint: disable=unsubscriptable-object, but I'd rather not resort to such a trick.
Nothing is wrong with your code. This seems to be an ongoing issue with Pylint, which wrongly infers that your dataframe is not subscriptable (i.e. doesn't define a __getitem__ method) or, in the case of assignments, that it "does not support item assignment (i.e. doesn't define __setitem__ method)".
Rather than disabling the warning, you can use the Pandas loc indexer, which is probably preferable anyway (see Note here and this post).
Consider the following (hopefully) reproducible example (as of the date of this answer, using Python 3.10.9, Pandas 1.5.2, Pylint 2.15.9):
import pandas as pd
df = pd.DataFrame({"col0": [1, 2], "col1": ["a", "b"]})
df = df.set_index("col0")
print(df[df["col1"] == "a"])
# Output
     col1
col0
1       a
Running Pylint on the script prints out:
script.py:4:6: E1136: Value 'df' is unsubscriptable (unsubscriptable-object)
script.py:4:9: E1136: Value 'df' is unsubscriptable (unsubscriptable-object)
------------------------------------------------------------------
Your code has been rated at 0.00/10 (previous run: 0.00/10, +0.00)
Now, if you replace df[df["col1"] == "a"] with df.loc[df.loc[:, "col1"] == "a", :] and run Pylint again, everything is fine:
--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
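For reference, here is a minimal sketch of the example script with the .loc replacement applied. The intermediate name filtered is only for illustration and is not part of the original script. The Pylint result quoted above was obtained with the versions listed earlier, so other versions may behave differently.

```python
import pandas as pd

df = pd.DataFrame({"col0": [1, 2], "col1": ["a", "b"]})
df = df.set_index("col0")

# Same filter as before, written with .loc for both the column lookup
# and the row selection ("filtered" is an illustrative name).
filtered = df.loc[df.loc[:, "col1"] == "a", :]
print(filtered)
```

The selected rows are the same as with the plain square-bracket form; only the indexing syntax changes.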
Similarly:
df["col2"] = ["c", "d"]
Pylint reports:
script.py:4:9: E1137: 'df' does not support item assignment (unsupported-assignment-operation)
But df.loc[:, "col2"] = ["c", "d"] is not flagged.
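Applied to the snippet from the question, the same idea might look like the sketch below. The dataframe here is hypothetical sample data rebuilt from the five rows shown in the question (the real one has 30 rows), and the names freq and mask are introduced only for readability.

```python
import pandas as pd

# Hypothetical sample data modelled on the rows printed in the question.
topmentions = pd.DataFrame(
    {
        "reference": [
            "Giorgia Meloni|PER",
            "Matteo Piantedosi|PER",
            "Donald Trump|PER",
            "Gianfranco Baruchello|PER",
            "Tomaso Montanari|PER",
        ],
        "freq": [4, 3, 3, 3, 3],
    },
    index=[22, 16, 10, 28, 3],
)

# Boolean mask: rows whose running total is still below half of the overall total.
freq = topmentions.loc[:, "freq"]
mask = freq.cumsum() < freq.sum() / 2

# Select the rows through .loc instead of plain square brackets.
json_string = topmentions.loc[mask, :].to_json(orient="records")
print(json_string)
```

Computing the mask once also avoids repeating the column lookup inside the selection, which keeps the expression shorter.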