I have a data frame that lists individuals and their monetary transactions in USD. The data are organised by district and hold the valid transactions made by either cash or credit card, like so:
X Dist transact.cash transact.card
a 1 USD USD
b 1 USD USD
Here X is an individual (with his/her transactions over a fixed period of time) and Dist is the district where he/she resides. There are over 4000 observations in total, roughly 80-100 rows per Dist. So far, sorting, slicing and everything else have been simple operations, with dat.cash and dat.card being subsets of the table by mode of transaction; but I'm having problems extracting information by rank. For this, I have written a function where I specify a rank, and the function should show the rows starting from that rank:
rankdat <- function(transact, numb) {
  # Truncated
  valid.nums = c('highest', 'lowest', 1:nrow(dat.cash)) # for cash subset
  if (transact == 'cash' && numb == 'highest') { # This is easy
    sort <- dat.cash[order(dat.cash[, 3], decreasing = T), ] # For sorting only cash data set
  } else if (transact == 'cash' && numb == 1:nrow(dat.cash)) {
    sort <- dat.cash[order(dat.cash[, 3], decreasing = T) == numb, ] # Not getting results here
  }
}
The last line returns NULL instead of the ranked transaction and all its rows. Replacing == with %in% still gives NULL, and using rank() doesn't change anything. For highest and lowest, it's not a big deal since that only involves simple sorting. If I specify rankdat('cash', 10), the function should return the rows starting from the 10th highest transaction and decreasing, irrespective of Dist, similar to:
X Dist transact.cash
b 1 10th highest
h 2 11th highest
p 1 12th highest
and so on
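A likely explanation for the NULL, offered as a sketch rather than a confirmed diagnosis: `numb == 1:nrow(dat.cash)` produces a logical vector with one element per row, not a single TRUE/FALSE. Older R versions used only its first element inside `&&` (FALSE unless `numb` is 1), so no branch ran and the function returned the NULL of an unmatched `if`; R 4.3.0 and later raise an error instead. The toy `dat.cash` below is invented for illustration and only stands in for the real data.

```r
# Toy stand-in for dat.cash (values are invented for illustration)
dat.cash <- data.frame(
  X = letters[1:6],
  Dist = c(1, 1, 2, 2, 1, 2),
  transact.cash = c(50, 300, 120, 75, 410, 20)
)

numb <- 3

# This comparison is a vector with one element per row, not a single value
cmp <- numb == 1:nrow(dat.cash)
length(cmp)

# A scalar "is numb a valid rank?" test puts numb on the left of %in%
is_valid_rank <- numb %in% seq_len(nrow(dat.cash))

# An if / else-if chain in which no branch matches evaluates to NULL
no_match <- if (FALSE) "a" else if (FALSE) "b"

# Order once, then index by position: rows from rank `numb` downwards
ord <- order(dat.cash$transact.cash, decreasing = TRUE)
from_rank <- dat.cash[ord[numb:nrow(dat.cash)], ]
from_rank
```

Indexing the ordering vector by position (`ord[numb:nrow(dat.cash)]`) avoids comparing the ordering against `numb` at all, which is where the original branch went wrong.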
The following function can extract rows by rank:
rankdat <- function(df,rank.by,num=10,method="top",decreasing=T){
# ------------------------------------------------------
# RANKDAT
# ------------------------------------------------------
# ARGUMENT
# ========
# df Input dataFrame [d.f]
# num Selected row [num]
# rank.by Name of column(s) used to rank dataFrame
# method Method used to extract rows
# top - to select top rank (e.g. 10 first rows)
# specific - to select specific row
# ------------------------------------------------------
eval(parse(text=paste("sort=df[with(df,order(",rank.by,", decreasing=",decreasing,")),]",sep=""))) # order dataFrame by rank.by; decreasing must go inside order(), since with() ignores extra arguments
if(method %in% "top"){
return(sort[1:num,])
}else if(method %in% "specific"){
return(sort[num,])
}else{
stop("Please select method used to extract data !!!")
}
}
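As a possible variation, the same idea can be written without `eval(parse())` and can return every row from a given rank onwards, which is what the question asks for. The helper name `rank_from`, its `from` argument, and the toy data frame are hypothetical and not part of the original answer; this is a minimal sketch, assuming `rank.by` is a character vector of column names.

```r
# Hypothetical helper (not from the original post): rank a data frame by one
# or more columns and return the rows from position `from` onwards.
rank_from <- function(df, rank.by, from = 1, decreasing = TRUE) {
  stopifnot(all(rank.by %in% names(df)),
            from >= 1, from <= nrow(df))
  # Pass the ranking columns to order() as unnamed positional arguments
  keys <- unname(as.list(df[rank.by]))
  ord <- do.call(order, c(keys, list(decreasing = decreasing)))
  # Keep rows from rank `from` to the end, in ranked order
  df[ord[from:nrow(df)], , drop = FALSE]
}

# Toy data, invented for illustration
dat <- data.frame(
  X = letters[1:5],
  Dist = c(1, 2, 1, 2, 1),
  transact.cash = c(10, 40, 30, 50, 20)
)

# Rows starting from the 3rd highest cash transaction, irrespective of Dist
res <- rank_from(dat, "transact.cash", from = 3)
res
```

Using `do.call(order, ...)` keeps support for several ranking columns while avoiding string-built code, which is harder to debug when an argument ends up in the wrong call.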