Search code examples
rdata.tablefreaddcast

Casting the row maxima from long to wide


I have a DT a follows (can someone please explain to me why the fread here does not work properly?):

library(data.table)
DT <- fread("
pty_n  pty  pev1    vot1    vv1     ivv1    to1         pv1       ctr_n     yr
A      236  6567389 5759215 5554662 204553  876941383   2743064   Argentina 1983
B      41   6567389 5759215 5554662 204553  876941383   56987     Argentina 1983
C      237  6567389 5759215 5554662 204553  876941383   62659     Argentina 1983
D      207  6567389 5759215 5554662 204553  876941383   587       Argentina 1983
E      125  6567389 5759215 5554662 204553  876941383   3728      Argentina 1983",
header=TRUE)

I would like to change the format of the DT as follows:

1) For the two pty_n with the highest pv1 I want the party names.

ctr_n     yr   First_Party  Second_Party
Argentina 1983 A            C

2) I would like to add their share of the votes as a proportion.

ctr_n     yr   First_Party  Second_Party  First_Party_Votes                   First_Party_Votes
Argentina 1983 A            C             pv1/vot1 = 2743064/5759215 = 0.476  0.011 (rounded from 0.0108)

I was thinking of doing something like the following:

# Selecting the row with the maximum amount of votes (But how do I do the second max?)
DT[, .SD[max(pv1)==prop_votes ], on=.(ctr_n, yr)]
# Calculating the proportion for the remaining rows
DT[, prop_votes := pv1/vot1]
# Recasting, here I get stuck as well
DT <- reshape2::dcast(DT, ctr_n + yr ~ pty_n + pty + prop_votes + ?

Could someone please help me continue?


Solution

  • My try based on @sindri_baldur's answer

    library(data.table)
    DT <- fread("
    pty_n  pty  pev1    vot1    vv1     ivv1    to1         pv1       ctr_n     yr
    A      236  6567389 5759215 5554662 204553  876941383   2743064   Argentina 1983
    B      41   6567389 5759215 5554662 204553  876941383   56987     Argentina 1983
    C      237  6567389 5759215 5554662 204553  876941383   62659     Argentina 1983
    D      207  6567389 5759215 5554662 204553  876941383   587       Argentina 1983
    E      125  6567389 5759215 5554662 204553  876941383   3728      Argentina 1983
    A      236  6567389 5759215 5554662 204553  876941383   56987  Argentina 1984
    B      41   6567389 5759215 5554662 204553  876941383   56987     Argentina 1984
    C      237  6567389 5759215 5554662 204553  876941383   62659     Argentina 1984
    D      207  6567389 5759215 5554662 204553  876941383   2743064       Argentina 1984
    E      125  6567389 5759215 5554662 204553  876941383   3728      Argentina 1984",
                header=TRUE)
    
    DT[order(ctr_n, yr, -pv1)
       ][, .(First_Party= pty_n[1], 
             Second_Party= pty_n[2], 
             First_Party_Votes = pv1[1] / vot1[1],
             Second_Party_Votes = pv1[2] / vot1[1]
       ), .(ctr_n, yr)]
    
           ctr_n   yr First_Party Second_Party First_Party_Votes Second_Party_Votes
    1: Argentina 1983           A            C         0.4762913         0.01087978
    2: Argentina 1984           D            C         0.4762913         0.01087978