I have monthly returns in long format. For each month I want to assign the companies into 10 groups (deciles) of equal size according to their performance.
My data sample looks the following (here in ascending order of value, which is the monthly return):
print(R4, n=10)
# A tibble: 1,125 x 5
# Groups: year [3]
company year month value mktvalue
<chr> <dbl> <dbl> <dbl> <dbl>
1 STIFEL FINANCIAL 1997 7 0.821 50.3
2 RAYMOND JAMES FINL. 1997 10 0.833 1070.
3 CHARLES SCHWAB 1997 3 0.853 6250.
4 STATE STREET 1997 3 0.863 6178.
5 STIFEL FINANCIAL 1996 7 0.871 31.6
6 FRANKLIN RESOURCES 1997 3 0.872 7459.
7 BERKSHIRE HATHAWAY 'A' 1997 8 0.879 53857.
8 ALLIANCEBERNSTEIN HLDG. UNT. 1997 3 0.879 2257.
9 STATE STREET 1997 8 0.890 8504.
10 MORGAN STANLEY 1996 7 0.891 8764.
# ... with 1,115 more rows
Because I want to assign ranks to the companies for each month, I first filtered for one month:
R5 <- R4 %>%
filter(year == 1997, month == 12, !is.na(value)) %>%
arrange(value) %>%
mutate(rank = rank(value))
print(R5)
# A tibble: 13 x 6
# Groups: year [1]
company year month value mktvalue rank
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 FRANKLIN RESOURCES 1997 12 0.967 11505. 1
2 STATE STREET 1997 12 0.978 9250. 2
3 JEFFERIES FINANCIAL GROUP 1997 12 0.998 2175. 3
4 BERKSHIRE HATHAWAY 'A' 1997 12 1.02 55184. 4
5 BANK OF NEW YORK MELLON 1997 12 1.08 21152. 5
6 CHARLES SCHWAB 1997 12 1.09 10854. 6
7 MORGAN STANLEY 1997 12 1.09 32712. 7
8 RAYMOND JAMES FINL. 1997 12 1.11 1207. 8
9 MGIC INVESTMENT 1997 12 1.14 7030. 9
10 ALLIANCEBERNSTEIN HLDG. UNT. 1997 12 1.15 3107. 10
11 AFFILIATED MANAGERS 1997 12 1.16 373. 11
12 RADIAN GP. 1997 12 1.16 1284. 12
13 STIFEL FINANCIAL 1997 12 1.17 81.4 13
I tried the answer from this question to assign companies into groups: Grouped ranking in R
Please tell me if there is a smarter way to create the decile groups.
percent.rank <- function(x) trunc(rank(x)/length(x)*100)
R6 <- within(R5, pr <- percent.rank(rank))
R6$decile <- cut(R6$pr, breaks = c(0,10,20,30,40,50,60,70,80,90,100), labels = c(1:10))
print(R6)
# A tibble: 13 x 8
# Groups: year [1]
company year month value mktvalue rank pr decile
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
1 FRANKLIN RESOURCES 1997 12 0.967 11505. 1 7 1
2 STATE STREET 1997 12 0.978 9250. 2 15 2
3 JEFFERIES FINANCIAL GROUP 1997 12 0.998 2175. 3 23 3
4 BERKSHIRE HATHAWAY 'A' 1997 12 1.02 55184. 4 30 3
5 BANK OF NEW YORK MELLON 1997 12 1.08 21152. 5 38 4
6 CHARLES SCHWAB 1997 12 1.09 10854. 6 46 5
7 MORGAN STANLEY 1997 12 1.09 32712. 7 53 6
8 RAYMOND JAMES FINL. 1997 12 1.11 1207. 8 61 7
9 MGIC INVESTMENT 1997 12 1.14 7030. 9 69 7
10 ALLIANCEBERNSTEIN HLDG. UNT. 1997 12 1.15 3107. 10 76 8
11 AFFILIATED MANAGERS 1997 12 1.16 373. 11 84 9
12 RADIAN GP. 1997 12 1.16 1284. 12 92 10
13 STIFEL FINANCIAL 1997 12 1.17 81.4 13 100 10
So far I did this manually for one month. How can I apply this for every month and year of my data frame without manually filtering? My whole data would go from Jan 1996 to Dec 2020. The groups/deciles must be of equal size and ranked according to the monthly performance (value). Is there any way of looping or another smart way?
To reproduce my data:
R4 <- data.frame(company = c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L"),
year = c(1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996, 1996,1996, 1996, 1996,
1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997, 1997),
month = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
value = c(1.15999568000864, 0.983783783783784, 1.07692307692308, 1, 0.989795918367347,
0.989690721649484, 1, 1.04166666666667, 0.99, 1.05050505050505, 1.07211538461538,
1.01345291479821, 0.942477876106195, 1.06572769953052, 0.986784140969163,
0.879464285714286, 1.11167512690355, 0.922374429223744, 1.15841584158416,
1.18803418803419, 0.973018705035971, 1.10906058132519, 0.92, 1.00724637681159))
I'm not sure I follow exactly what you're trying to do, but given your data I was able to create a ranked grouping by value
for each month
/year
combination with this:
library(dplyr)
R4 %>%
group_by(year, month) %>%
mutate(rank = rank(value)) %>%
arrange(month, year, rank)
This yields:
# A tibble: 60 × 6
# Groups: year, month [24]
company year month value mktvalue rank
<fct> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ALLIANCEBERNSTEIN HLDG. UNT. 1996 1 0.984 1802. 1
2 BERKSHIRE HATHAWAY 'A' 1996 1 0.994 37088. 2
3 BANK OF NEW YORK MELLON 1996 1 1.05 9517. 3
4 ALLIANCEBERNSTEIN HLDG. UNT. 1997 1 1.07 2307. 1
5 BANK OF NEW YORK MELLON 1997 1 1.08 13652. 2
6 BANK OF NEW YORK MELLON 1996 2 1.01 10488. 1
7 ALLIANCEBERNSTEIN HLDG. UNT. 1996 2 1.08 1931. 2
8 BERKSHIRE HATHAWAY 'A' 1996 2 1.11 39584. 3
9 ALLIANCEBERNSTEIN HLDG. UNT. 1997 2 0.987 2441. 1
10 BANK OF NEW YORK MELLON 1997 2 1.06 15457. 2