I have a DT a follows (can someone please explain to me why the fread
here does not work properly?):
library(data.table)
DT <- fread("
pty_n pty pev1 vot1 vv1 ivv1 to1 pv1 ctr_n yr
A 236 6567389 5759215 5554662 204553 876941383 2743064 Argentina 1983
B 41 6567389 5759215 5554662 204553 876941383 56987 Argentina 1983
C 237 6567389 5759215 5554662 204553 876941383 62659 Argentina 1983
D 207 6567389 5759215 5554662 204553 876941383 587 Argentina 1983
E 125 6567389 5759215 5554662 204553 876941383 3728 Argentina 1983",
header=TRUE)
I would like to change the format of the DT
as follows:
1) For the two pty_n
with the highest pv1
I want the party names.
ctr_n yr First_Party Second_Party
Argentina 1983 A C
2) I would like to add their share of the votes as a proportion.
ctr_n yr First_Party Second_Party First_Party_Votes First_Party_Votes
Argentina 1983 A C pv1/vot1 = 2743064/5759215 = 0.476 0.011 (rounded from 0.0108)
I was thinking of doing something like the following:
# Selecting the row with the maximum amount of votes (But how do I do the second max?)
DT[, .SD[max(pv1)==prop_votes ], on=.(ctr_n, yr)]
# Calculating the proportion for the remaining rows
DT[, prop_votes := pv1/vot1]
# Recasting, here I get stuck as well
DT <- reshape2::dcast(DT, ctr_n + yr ~ pty_n + pty + prop_votes + ?
Could someone please help me continue?
My try based on @sindri_baldur's answer
library(data.table)
DT <- fread("
pty_n pty pev1 vot1 vv1 ivv1 to1 pv1 ctr_n yr
A 236 6567389 5759215 5554662 204553 876941383 2743064 Argentina 1983
B 41 6567389 5759215 5554662 204553 876941383 56987 Argentina 1983
C 237 6567389 5759215 5554662 204553 876941383 62659 Argentina 1983
D 207 6567389 5759215 5554662 204553 876941383 587 Argentina 1983
E 125 6567389 5759215 5554662 204553 876941383 3728 Argentina 1983
A 236 6567389 5759215 5554662 204553 876941383 56987 Argentina 1984
B 41 6567389 5759215 5554662 204553 876941383 56987 Argentina 1984
C 237 6567389 5759215 5554662 204553 876941383 62659 Argentina 1984
D 207 6567389 5759215 5554662 204553 876941383 2743064 Argentina 1984
E 125 6567389 5759215 5554662 204553 876941383 3728 Argentina 1984",
header=TRUE)
DT[order(ctr_n, yr, -pv1)
][, .(First_Party= pty_n[1],
Second_Party= pty_n[2],
First_Party_Votes = pv1[1] / vot1[1],
Second_Party_Votes = pv1[2] / vot1[1]
), .(ctr_n, yr)]
ctr_n yr First_Party Second_Party First_Party_Votes Second_Party_Votes
1: Argentina 1983 A C 0.4762913 0.01087978
2: Argentina 1984 D C 0.4762913 0.01087978