I have the following sample dataframe with prices of toys in different shops:
dfData <- data.frame(article = c("Fix", "Foxi", "Stan", "Olli", "Barbie", "Ken", "Hulk"),
priceToys1 = c(10, NA, 10.5, NA, 10.7, 11.2, 12.0),
priceAllToys = c(NA, 11.4, NA, 11.9, 11.7, 11.1, NA),
price123Toys = c(12, 12.4, 12.7, NA, NA, 11.0, 12.1))
Additionally I generate a min price column by adding:
dfData$MinPrice <- apply(dfData[, grep("price", colnames(dfData))], 1, FUN=min, na.rm = TRUE)
So I have this dataframe now:
# article priceToys1 priceAllToys price123Toys MinPrice
#1 Fix 10.0 NA 12.0 10.0
#2 Foxi NA 11.4 12.4 11.4
#3 Stan 10.5 NA 12.7 10.5
#4 Olli NA 11.9 NA 11.9
#5 Barbie 10.7 11.7 NA 10.7
#6 Ken 11.2 11.1 11.0 11.0
#7 Hulk 12.0 NA 12.1 12.0
How do I get additional columns into the dataframe that tell me the factor of all prices relatively to the minimum price in percentage? The new column names should also include the shop name.
The result should look like this:
# article priceToys1 PercToys1 priceAllToys PercAllToys price123Toys Perc123Toys MinPrice
#1 Fix 10.0 100.0 NA NA 12.0 120.0 10.0
#2 Foxi NA NA 11.4 100.0 12.4 108.8 11.4
#3 Stan 10.5 100.0 NA NA 12.7 121.0 10.5
#4 Olli NA NA 11.9 100.0 NA NA 11.9
#5 Barbie 10.7 100.0 11.7 109.4 NA NA 10.7
#6 Ken 11.2 101.8 11.1 100.9 11.0 100.0 11.0
#7 Hulk 12.0 100.0 NA NA 12.1 100.8 12.0
Two possible solutions:
1) With the data.table
-package:
# load the 'data.table'-package
library(data.table)
# get the columnnames on which to operate
cols <- names(dfData)[2:4] # or: grep("price", names(dfData), value = TRUE)
# convert dfData to a 'data.table'
setDT(dfData)
# compute the 'fraction'-columns
dfData[, paste0('Perc', gsub('price','',cols)) := lapply(.SD, function(x) round(100 * x / MinPrice, 1))
, .SDcols = cols][]
which gives:
article priceToys1 priceAllToys price123Toys MinPrice PercToys1 PercAllToys Perc123Toys 1: Fix 10.0 NA 12.0 10.0 100.0 NA 120.0 2: Foxi NA 11.4 12.4 11.4 NA 100.0 108.8 3: Stan 10.5 NA 12.7 10.5 100.0 NA 121.0 4: Olli NA 11.9 NA 11.9 NA 100.0 NA 5: Barbie 10.7 11.7 NA 10.7 100.0 109.3 NA 6: Ken 11.2 11.1 11.0 11.0 101.8 100.9 100.0 7: Hulk 12.0 NA 12.1 12.0 100.0 NA 100.8
2) With base R:
cols <- names(dfData)[2:4] # or: grep("price", names(dfData), value = TRUE)
dfData[, paste0('Perc', gsub('price','',cols))] <- round(100 * dfData[, cols] / dfData$MinPrice, 1)
which will get you the same result.