I have a dataset for expected and current income:
   id currentsalary expectedsalary
1   1            NA           1500
2   2            NA           3000
3   3            NA             NA
4   4            NA             NA
5   5            NA           1500
6   6          1500           3000
7   7            NA           1500
8   8            NA           5000
9   9          1000           1500
10 10          3000           5000
I would like to show the distribution of the expected net income in relation to the current net income (charts + conclusions). I drew two overlaid histograms:
hist(df$expectedsalary, col = "pink", xlab = "salary")
hist(df$currentsalary, col = "blue", add = TRUE)
But this doesn't show the relation correctly. I would like to put the IDs on the x-axis and the current and expected salaries on the y-axis (perhaps one of them as a line over the bars) to emphasize the per-person differences between expected and current salary. How should I do that?
I'd use a dotchart to plot the differences:
ILLUSTRATIVE DATA:
set.seed(122)
df <- data.frame(
  id   = 1:10,
  exp  = sample(1000:5000, 10),
  curr = sample(800:4500, 10)
)
SOLUTION:
Calculate the difference:
df$diff <- df$curr - df$exp
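With the data from the question, the same subtraction returns NA wherever either salary is missing, so only a few people end up with a difference to plot. A minimal sketch, assuming the question's table is typed in by hand (the names salaries and salaries_complete are made up for this example):

```r
# Data copied from the question; NA marks a missing salary
salaries <- data.frame(
  id             = 1:10,
  currentsalary  = c(NA, NA, NA, NA, NA, 1500, NA, NA, 1000, 3000),
  expectedsalary = c(1500, 3000, NA, NA, 1500, 3000, 1500, 5000, 1500, 5000)
)

# Subtraction propagates NA, so only rows with both values get a difference
salaries$diff <- salaries$currentsalary - salaries$expectedsalary

# Keep the rows where a difference could be computed (ids 6, 9 and 10)
salaries_complete <- salaries[!is.na(salaries$diff), ]
salaries_complete$diff  # -1500  -500 -2000
```

Whether to drop the incomplete rows or show them as gaps is a judgment call; dropping them keeps the chart readable but hides how many people reported no current salary.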
Draw dotchart:
# density and angle are shading options for bars; dotchart() does not use them
dotchart(df$diff, labels = df$id, main = "Difference in current v expected income",
         color = ifelse(df$diff < 0, "red", "blue"))
abline(v = 0)
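RESULT: one point per ID, red where current income is below expected and blue otherwise, with a vertical reference line at zero.
(Obviously, this can be greatly embellished.)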
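As one possible embellishment, the points can be sorted by the size of the difference so the largest shortfalls stand out. This is only a sketch on the same illustrative data; sorted_df is a name introduced here, and pch, lty and the axis label are arbitrary choices:

```r
# Same illustrative data as above (random values, not real salaries)
set.seed(122)
df <- data.frame(
  id   = 1:10,
  exp  = sample(1000:5000, 10),
  curr = sample(800:4500, 10)
)
df$diff <- df$curr - df$exp

# Sort ascending; dotchart() draws the first row at the bottom
sorted_df <- df[order(df$diff), ]

dotchart(sorted_df$diff, labels = sorted_df$id, pch = 19,
         color = ifelse(sorted_df$diff < 0, "red", "blue"),
         main = "Difference in current v expected income",
         xlab = "Current - expected")
abline(v = 0, lty = 2)  # reference line: current equals expected
```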
EDIT:
How about using a barplot?
barplot(df$diff, names.arg = df$id, xlab = "ID", ylab = "Difference",
        main = "Difference in current v expected income",
        col = ifelse(df$diff < 0, "red", "blue"), density = 50, angle = 90)
legend("topright", c("Current > Expected income", "Current < Expected income"),
       fill = c("blue", "red"),
       cex = 0.8)
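Result: one bar per ID, pointing down (red) where current income is below expected and up (blue) where it is above.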
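If you would rather see both salaries side by side for each person, as the question describes, a grouped barplot is another option. A rough sketch on the same illustrative data, where the matrix name salary_matrix and the colours are my own choices:

```r
# Same illustrative data as above (random values, not real salaries)
set.seed(122)
df <- data.frame(
  id   = 1:10,
  exp  = sample(1000:5000, 10),
  curr = sample(800:4500, 10)
)

# barplot() expects one row per series and one column per person
salary_matrix <- rbind(Current = df$curr, Expected = df$exp)

barplot(salary_matrix, beside = TRUE, names.arg = df$id,
        col = c("blue", "pink"),
        xlab = "ID", ylab = "Income",
        main = "Current and expected income by person",
        legend.text = TRUE,  # legend labels come from the row names
        args.legend = list(x = "topright", cex = 0.8))
```

This shows the absolute levels as well as the gap, at the cost of a busier chart than the single difference bar per person.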