python pandas dataframe pandas-loc

Adjust numeric scale for a set of rows in a more effective/Pythonic way?


My dataset is survey data, where each row holds a question and a single respondent's numeric response to that question. Unfortunately, the scale was backwards for some questions (i.e. 1s should be 4s and 4s should be 1s).

I came up with the most absurd way of resolving this via loc. The code first selects the rows for the question, then the rows holding a given number, and converts them. Numbers are first converted to a placeholder value (e.g. instead of immediately converting a 1 to a 4, I convert it to '4a') so that the original 4s don't get wrongly converted later on. After all of that, a second pass converts the placeholders to their final values.

df.loc[((df['question'].str.contains('Why did the chicken cross the road?')) & (df['numericValue'] == 1)),'numericValue'] = '4a'
df.loc[((df['question'].str.contains('Why did the chicken cross the road?')) & (df['numericValue'] == 2)),'numericValue'] = '3a'
df.loc[((df['question'].str.contains('Why did the chicken cross the road?')) & (df['numericValue'] == 3)),'numericValue'] = '2a'
df.loc[((df['question'].str.contains('Why did the chicken cross the road?')) & (df['numericValue'] == 4)),'numericValue'] = 1
df.loc[((df['question'].str.contains('Why did the chicken cross the road?')) & (df['numericValue'] == '2a')),'numericValue'] = 2
df.loc[((df['question'].str.contains('Why did the chicken cross the road?')) & (df['numericValue'] == '3a')),'numericValue'] = 3
df.loc[((df['question'].str.contains('Why did the chicken cross the road?')) & (df['numericValue'] == '4a')),'numericValue'] = 4

Ultimately, the 1s become 4s, the 2s become 3s, the 3s become 2s, and the 4s become 1s. But this isn't a very effective method, so I'm wondering if you have a better idea. Thanks so much!


Solution

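  • If your replacement cannot be expressed arithmetically, you can also map each value to its replacement:

    # Assumption: selector is a boolean mask for the rows of the reversed question.
    selector = df['question'].str.contains('Why did the chicken cross the road?', regex=False)
    replacement = {1: 4, 2: 3, 3: 2, 4: 1}
    df.loc[selector, 'numericValue'] = df.loc[selector, 'numericValue'].map(replacement)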
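
To see both options side by side, here is a minimal, self-contained sketch. The sample rows and the second question are made up for illustration, and the names selector, arithmetic and mapped are assumptions rather than part of the original code. It assumes the scale runs from 1 to 4, in which case the reversal can probably be written as 5 - value; the mapping covers cases where no such formula exists.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "question": [
        "Why did the chicken cross the road?",
        "Why did the chicken cross the road?",
        "Why did the chicken cross the road?",
        "Why did the chicken cross the road?",
        "What is your favorite color?",
        "What is your favorite color?",
    ],
    "numericValue": [1, 2, 3, 4, 1, 4],
})

# Boolean mask for the rows whose scale is reversed.
# regex=False makes the "?" a literal character instead of a regex quantifier.
selector = df["question"].str.contains(
    "Why did the chicken cross the road?", regex=False
)

# Option 1: arithmetic reversal. On a 1..4 scale, reversed = (1 + 4) - value.
arithmetic = df["numericValue"].copy()
arithmetic.loc[selector] = 5 - arithmetic.loc[selector]

# Option 2: explicit lookup table, for replacements with no simple formula.
replacement = {1: 4, 2: 3, 3: 2, 4: 1}
mapped = df["numericValue"].copy()
mapped.loc[selector] = mapped.loc[selector].map(replacement)

print(arithmetic.tolist())  # [4, 3, 2, 1, 1, 4]
print(mapped.tolist())      # [4, 3, 2, 1, 1, 4]
```

Both versions compute all new values from the original column in one step, so no placeholder values are needed. One thing to watch with map: any value that is missing from the dictionary becomes NaN, so the dictionary should cover every value that can occur in the selected rows.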