I have a dataframe with the following structure:
country strength index_name index
africa 0.75 5A 5
boston 0.65 5A 5
tga 0.89 5A 5
ollaw 0.45 5A 5
africa 0.69 80A 80
boston 0.35 80A 80
tga 0.81 80A 80
ollaw 0.33 80A 80
pica 0.29 80A 80
africa 0.70 150A 150
boston 0.47 150A 150
tga 0.40 150A 150
FSD 0.90 150A 150
We can see that the strength for africa changes from 0.75 (in 5A) to 0.69 (in 80A) to 0.70 (in 150A). The other cities/countries similarly increase or decrease across the different index_name values. Some countries may be present in one index_name and absent from another.
I am trying to draw a scatter plot with the country name next to each point, and with a line connecting each country's points across all index_names.
Could this be done with sns?
With the dataframe you provided:
import pandas as pd

df = pd.DataFrame(
    {
        "country": [
            "africa",
            "boston",
            "tga",
            "ollaw",
            "africa",
            "boston",
            "tga",
            "ollaw",
            "pica",
            "africa",
            "boston",
            "tga",
            "FSD",
        ],
        "strength": [
            0.75,
            0.65,
            0.89,
            0.45,
            0.69,
            0.35,
            0.81,
            0.33,
            0.29,
            0.7,
            0.47,
            0.4,
            0.9,
        ],
        "index_name": [
            "5A",
            "5A",
            "5A",
            "5A",
            "80A",
            "80A",
            "80A",
            "80A",
            "80A",
            "150A",
            "150A",
            "150A",
            "150A",
        ],
        "index": [5, 5, 5, 5, 80, 80, 80, 80, 80, 150, 150, 150, 150],
    }
)
Here is one way to do it (using a Jupyter notebook):
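from matplotlib import pyplot as plt

# Sort so that each country's points are connected in order of increasing index
df = df.sort_values(by=["country", "index", "strength"]).reset_index(drop=True)

fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(7, 4))

for country in df["country"].unique():
    # One line per country, with a dot marker on each point
    ax.plot(
        df.loc[df["country"] == country, "index"],
        df.loc[df["country"] == country, "strength"],
        ".",
        linestyle="-",
    )
    # Write the country name next to each of its points
    for x, y in zip(
        df.loc[df["country"] == country, "index"],
        df.loc[df["country"] == country, "strength"],
    ):
        ax.text(x, y, country)

# In a Jupyter notebook, a bare `fig` as the last expression displays the figure
fig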
Output: