
ValueError: Can only compare identically-labeled Series objects


Here is my code. No matter what I do, I keep getting this error, even after trying all the index-related solutions I could find. Can anyone help?

import pandas as pd

site = pd.read_csv('../data/survey_site.csv')
sampled = site.sample(n=1)

site = site.reset_index(drop=True)
sampled = sampled.reset_index(drop=True)

mask = site.mask(site['name'] == sampled['name'])

Solution

  • The problem is that site['name'] == sampled['name'] compares two pd.Series, and pandas only allows that when both have identical index labels. Here site has one label per row while sampled has only one, so the comparison raises the ValueError; resetting the indexes does not help because the lengths still differ. I suspect you expected sampled['name'] to be a scalar because you took a sample of size 1, but it is a length-one Series. Converting it to a scalar fixes the comparison, since a scalar is compared against every row.

    Option 1

    mask = site.mask(site['name'] == sampled['name'].squeeze())
    

    Option 2

    mask = site.mask(site['name'] == sampled.loc[0, 'name'])
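
To see both the error and the fix without the CSV file, here is a minimal sketch. The small DataFrame is a hypothetical stand-in for survey_site.csv (its real columns are not shown in the question), and random_state is set only to make the sample repeatable.

```python
import pandas as pd

# Hypothetical stand-in for survey_site.csv; the real columns are an assumption.
site = pd.DataFrame(
    {"name": ["DR-1", "DR-3", "MSK-4"], "lat": [-49.85, -47.15, -48.87]}
)
sampled = site.sample(n=1, random_state=0).reset_index(drop=True)

# Series-to-Series comparison needs identical labels, so 3 rows vs. 1 row raises.
try:
    site["name"] == sampled["name"]
    raised = False
except ValueError:
    raised = True

# Squeezing the length-one Series gives a scalar, which is compared to every row.
target = sampled["name"].squeeze()
is_match = site["name"] == target

# DataFrame.mask replaces the rows where the condition is True with NaN.
masked = site.mask(is_match)
```

Note that site.mask(...) returns a copy of site with the matching rows blanked out, not a boolean mask. If the goal is to select or drop the sampled row, the boolean Series is_match can be used directly, for example site[~is_match].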