I have two dataframes:
farm_production <- data.frame (
year = c(seq(1980,2000)),
"n11" = c(seq(80,200,length.out=21)),
"n26" = c(seq(110,180,length.out=21)),
"n31" = c(seq(150,56,length.out=21)),
"n48" = c(seq(200,160,length.out=21)),
"n59" = c(seq(198,170,length.out=21)))
farm_info <- data.frame (
ID = c("n11", "n26", "n31", "n48", "n59"),
type = c("wheat", "wheat", "cereal", "hay", "hay"),
country = c("Spain", "Greece", "Italy", "Spain", "Portugal"))
The column names of farm_production (n11, n26, n31, n48, n59) match the values in the ID column of farm_info.
I plotted the production of these 5 farms over the years:
plot(farm_production$year, farm_production$n11, xlab = "Year", ylab = "Forage production (tons)", ylim = c(0, 200))
points(farm_production$year, farm_production$n26)
points(farm_production$year, farm_production$n31)
points(farm_production$year, farm_production$n48)
points(farm_production$year, farm_production$n59)
However, I want to color these points by "type" (3 levels: wheat, cereal, hay), but this information is in the "farm_info" dataframe. How can I relate the information in one dataframe to the other?
I am aware that I can probably do this manually, but this is just a small sample of a much larger dataframe with more than 100 rows and columns. I am therefore looking for a way to automate the process by relating the information in dataframe 1 (farm_production) to dataframe 2 (farm_info) so that the points are colored by "type".
Any suggestions on how I can do this? Any help is greatly appreciated.
Having the data in this "wide" format will make plotting difficult. I would start by transforming your farm_production dataframe to a tidy format and then join your farm_info data to create a single dataframe from which to plot. During the data preparation, I would convert your type variable to a factor so that R can assign colors automatically. Optionally, you might consider adding a legend.
farm_production <- data.frame (
year = c(seq(1980,2000)),
"n11" = c(seq(80,200,length.out=21)),
"n26" = c(seq(110,180,length.out=21)),
"n31" = c(seq(150,56,length.out=21)),
"n48" = c(seq(200,160,length.out=21)),
"n59" = c(seq(198,170,length.out=21)))
farm_info <- data.frame (
ID = c("n11", "n26", "n31", "n48", "n59"),
type = c("wheat", "wheat", "cereal", "hay", "hay"),
country = c("Spain", "Greece", "Italy", "Spain", "Portugal"))
data <- merge(
reshape(
farm_production,
varying = names(farm_production)[-1],
v.names = "production",
timevar = "farm",
times = names(farm_production)[-1],
direction = "long",
sep = ""
),
farm_info,
by.x = "farm",
by.y = "ID"
)
data$id <- NULL
data$type <- factor(data$type)
plot(
data$year,
data$production,
xlab = "Year",
ylab = "Forage production (tons)",
ylim = c(0, 200),
col = data$type # factor codes (1, 2, 3) are used as indices into the current palette()
)
legend(
x = "topleft",
legend = levels(data$type), # labels for factor levels
col = seq_along(levels(data$type)), # integer codes of the factor levels (1:3 here)
pch = 19, # point symbol (19 = filled circle); use pch = 1 to match the plot's default open circles
cex = .7 # optionally change overall size of legend
)
Created on 2024-02-27 with reprex v2.1.0
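If you would rather not reshape the data, a look-up with match() is another way to relate the two dataframes. The following is only a minimal sketch under the assumption that every farm column in farm_production has exactly one matching row in farm_info. The names farm_cols, idx, farm_type and point_col are mine, introduced for illustration, and matplot() is swapped in for the plot()/points() calls in the question.

```r
# Example data copied from the question
farm_production <- data.frame(
  year = 1980:2000,
  n11 = seq(80, 200, length.out = 21),
  n26 = seq(110, 180, length.out = 21),
  n31 = seq(150, 56, length.out = 21),
  n48 = seq(200, 160, length.out = 21),
  n59 = seq(198, 170, length.out = 21)
)
farm_info <- data.frame(
  ID = c("n11", "n26", "n31", "n48", "n59"),
  type = c("wheat", "wheat", "cereal", "hay", "hay"),
  country = c("Spain", "Greece", "Italy", "Spain", "Portugal")
)

# Every column except "year" is treated as a farm (an assumption about the layout)
farm_cols <- setdiff(names(farm_production), "year")

# For each farm column, find the row of farm_info that describes it
idx <- match(farm_cols, farm_info$ID)
stopifnot(!anyNA(idx)) # stop early if a farm column has no row in farm_info

# One type, and therefore one colour index, per farm column
farm_type <- factor(farm_info$type[idx])
point_col <- as.integer(farm_type)

# matplot() draws one series per matrix column and recycles col per column
matplot(
  farm_production$year,
  as.matrix(farm_production[farm_cols]),
  type = "p",
  pch = 19,
  col = point_col,
  xlab = "Year",
  ylab = "Forage production (tons)",
  ylim = c(0, 200)
)
legend(
  "topleft",
  legend = levels(farm_type),
  col = seq_along(levels(farm_type)),
  pch = 19,
  cex = 0.7
)
```

Because the look-up is driven by the column names, the same code should carry over to a wider farm_production as long as the ID values in farm_info stay in sync with those names. The reshape-and-merge approach above is still the better starting point if you later want to facet or summarise by type or country.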