How to (correctly) merge 2 Pandas DataFrames and scatter-plot

Thank you for your answers, in advance.

My end goal is to produce a scatter-plot - corruption as an explanatory variable (x axis, from a DataFrame 'corr') and inequality as a dependent variable (y axis, from a DataFrame 'inq'). A hint to produce an informative table (DataFrame) by joining these two Dataframes would be much appreciated I have a dataframe 'inq' for a country inequality (GINI index) and another one 'corr' for country corruption index.

pd.DataFrame(
    {
        "country": {0: "Angola", 1: "Albania", 2: "United Arab Emirates"},
        "1975": {0: nan, 1: nan, 2: nan},
        "1976": {0: nan, 1: nan, 2: nan},
        "2017": {0: nan, 1: 33.2, 2: nan},
        "2018": {0: 51.3, 1: nan, 2: nan},
    }
)

pd.DataFrame(
    {
        "country": {0: "Afghanistan", 1: "Angola", 2: "Albania"},
        "1975": {0: 44.8, 1: 48.1, 2: 75.1},
        "1976": {0: 44.8, 1: 48.1, 2: 75.1},
        "2018": {0: 24.2, 1: 40.4, 2: 28.4},
        "2019": {0: 40.5, 1: 37.6, 2: 35.9},
    }
)

I concatenate and manipulate and get

cm = pd.concat([inq, corr], axis=0, keys=["Inequality", "Corruption"]).reset_index(
    level=1, drop=True
)

a new Dataframe

pd.DataFrame(
    {
        "indicator": {0: "Inequality", 1: "Inequality", 2: "Inequality"},
        "country": {0: "Angola", 1: "Albania", 2: "United Arab Emirates"},
        "1967": {0: nan, 1: nan, 2: nan},
        "1969": {0: nan, 1: nan, 2: nan},
        "2018": {0: 51.3, 1: nan, 2: nan},
        "2019": {0: nan, 1: nan, 2: nan},
    }
)

Solution

You should concatenate your dataframe in a different way:

df = (pd.concat([inq.set_index('country'),
                 corr.set_index('country')],
                 axis=1,
                 keys=["Inequality", "Corruption"]
                )
        .stack(level=1)
     )

                  Inequality  Corruption
country                                 
Angola      1975         NaN        48.1
            1976         NaN        48.1
            2018        51.3        40.4
            2019         NaN        37.6
Albania     1975         NaN        75.1
            1976         NaN        75.1
            2017        33.2         NaN
            2018         NaN        28.4
            2019         NaN        35.9
Afghanistan 1975         NaN        44.8
            1976         NaN        44.8
            2018         NaN        24.2
            2019         NaN        40.5

Then to plot:

df.plot.scatter(x='Corruption', y='Inequality')

NB. there is only one point as most of your data is NaN