I have a pandas Series called snow (the amount of snow in different months of the year).
These two lines of code seem to produce the same result.
What is the difference between them?
import pandas as pd
snow.loc[(snow.index.month==1) & (snow>0)]
snow.loc[lambda s: (s.index.month==1) & (s>0)]
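There is no difference between the two lines as long as you are not chaining commands. When you pass a function/lambda to loc,
pandas calls it with the object that loc is applied to, so the mask is always built from the current Series/DataFrame rather than from whatever the variable snow refers to.
With chained commands the two forms can give different results.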
Example:
snow = pd.Series([0, 1, 0, 1], index=pd.to_datetime(['2023-01-01', '2023-01-15', '2023-02-01', '2023-02-15']))
(snow
 .add(2)
 # the mask below is built from the original `snow`,
 # so it ignores the .add(2) applied earlier in the chain
 .loc[(snow.index.month==1) & (snow>0)]
)
# 2023-01-15    3
# dtype: int64
(snow
 .add(2)
 # the lambda receives the current state of the Series (after .add(2)) as `s`
 .loc[lambda s: (s.index.month==1) & (s>0)]
)
# 2023-01-01    2
# 2023-01-15    3
# dtype: int64
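One way to see what the lambda is doing is to break the chain and give the intermediate result a name. The sketch below reuses the example data from above; the variable names shifted, explicit, chained and from_original are only illustrative and do not come from the question.

```python
import pandas as pd

snow = pd.Series(
    [0, 1, 0, 1],
    index=pd.to_datetime(["2023-01-01", "2023-01-15", "2023-02-01", "2023-02-15"]),
)

# Name the intermediate result instead of chaining.
shifted = snow.add(2)

# Mask built from the intermediate Series held in a variable.
explicit = shifted.loc[(shifted.index.month == 1) & (shifted > 0)]

# Same selection in a chain: loc passes the result of .add(2) to the lambda as `s`.
chained = snow.add(2).loc[lambda s: (s.index.month == 1) & (s > 0)]

# Mask built from the original `snow`, so the .add(2) is not taken into account.
from_original = snow.add(2).loc[(snow.index.month == 1) & (snow > 0)]

print(explicit.equals(chained))  # the named and the lambda versions select the same rows
print(len(from_original), len(chained))
```

So the lambda mainly saves you from creating a temporary variable in the middle of a chain.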