Search code examples
pythonplotlyanalysissurveylikert

How to transform Pandas dataframe for likert scale visualization?


I am in the process of analyzing a survey. This contains likert scales. To display them with Plotly (see link), I need an appropriate format. Therefore my question, how do I get from the following format :

enter image description here

to this target format:

enter image description here

Source Dataframe:

    {
        "ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
        "Q1": [
            "no answer",
            "Very satisfied",
            "no answer",
            "no answer",
            "Very satisfied",
            "no answer",
            "no answer",
            "Very satisfied",
            "Very satisfied",
            "no answer",
            "no answer",
            "Neither satisfied nor dissatisfied",
            "no answer",
            "Very satisfied",
            "Neither satisfied nor dissatisfied",
            "Somewhat satisfied",
        ],
        "Q2": [
            "no answer",
            "Very satisfied",
            "no answer",
            "no answer",
            "Very satisfied",
            "no answer",
            "no answer",
            "Very satisfied",
            "Very satisfied",
            "no answer",
            "no answer",
            "Somewhat satisfied",
            "no answer",
            "Very satisfied",
            "Somewhat dissatisfied",
            "Neither satisfied nor dissatisfied",
        ],
        "Q3": [
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "no answer",
            "Neither satisfied nor dissatisfied",
            "no answer",
            "Somewhat satisfied",
        ],
    }
) 

Solution

  • Assuming ID is index of df, you can do this:

    pd.concat([df[c].value_counts() for c in df.columns],axis=1).fillna(0)
    

    Another option is to melt the df long, and use pd.pivot_table()

    pd.pivot_table(pd.melt(df),index="value",columns="variable", aggfunc='size', fill_value=0)