I have a big table in which I have calculated the number of counts per subcategory (column countsperc; subcategory names not shown) for every category (id), the total number of observations per category (id) in column sumofcounts, and the proportion of each subcategory to the total (countsperc/sumofcounts) in column apppropor (approximate proportions), which has to be rounded to 3 decimal places.
The problem is that the sum of the approximate proportions (old_sum) for each category (id) has to be exactly 1.000 instead of 0.999, etc.
So I would like to ask for a method to add or subtract 0.001 on any sub-item of column apppropor so that the sum per id is always 1.000.
For example, in row 1 the number could be 0.334 instead of 0.333.
EDIT: The goal of the task is not solely to produce an exact sum of 1, which has no utility in itself, but to produce input for another program, which takes the column apppropor as is and requires that it sums to 1.000 per id (see the error message below).
text1<-"
id countsperc sumofcounts apppropor
item1 1 3 0.333
item1 1 3 0.333
item1 1 3 0.333
item2 1 121 0.008
item2 119 121 0.983
item2 1 121 0.008
item3 1 44 0.023
item3 1 44 0.023
item3 41 44 0.932
item3 1 44 0.023
item4 1 29 0.034
item4 3 29 0.103
item4 1 29 0.034
item4 24 29 0.828"
table1<-read.table(text=text1,header=T)
library(data.table)
sums<-as.data.frame(setDT(table1)[, sum(`apppropor`), by = .(id)][,.(id, old_sum = V1)])
table1<-merge(table1,sums)
table1
chromEvol Version: 2.0. Last updated December 2013
The count probabilities for taxa Ad_mic not sum to 1.0 chromEvol: errorMsg.cpp:41: static void errorMsg::reportError(const string&, int): Assertion `0' failed. Aborted (core dumped)
I found a way: add the difference 1 - old_sum to the last row of each id.
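# rounding error per id: how far the rounded proportions are from 1
table1$dif<-1-table1$old_sum
# sort by id so that the rows of each id are contiguous
table1<-table1[order(table1$id),]
# run length of each id; cumsum(len) is the index of the last row of each id
len<-rle(as.vector(table1$id))[[1]]
# absorb the rounding error in the last row of each id
table1$apppropor[cumsum(len)]<-table1$apppropor[cumsum(len)]+table1$dif[cumsum(len)]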
#verify
library(data.table)
sums<-as.data.frame(setDT(table1)[, sum(`apppropor`), by = .(id)][,.(id, new_sum = V1)])
table1<-merge(table1,sums)
table1
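
As a possible alternative to correcting the last row, here is a minimal sketch of the largest-remainder method, which recomputes the rounded proportions directly from the counts. It is not part of the approach above: the helper name round_to_one and the small demo data frame are hypothetical, and it assumes that the downstream program only needs 3-decimal values summing to 1 per id. The proportions are first rounded down to thousandths, and the missing thousandths are then given to the entries with the largest remainders.

```r
# Hypothetical helper (not from the original post): proportions rounded to
# `digits` decimals that sum to 1, using the largest-remainder method.
round_to_one <- function(counts, digits = 3) {
  scale <- 10^digits
  exact <- counts * scale / sum(counts)   # exact shares in units of 0.001
  base  <- floor(exact)                   # round every share down first
  short <- round(scale - sum(base))       # units still missing to reach 1000
  # hand the missing units to the entries with the largest remainders
  idx <- order(exact - base, decreasing = TRUE)[seq_len(short)]
  base[idx] <- base[idx] + 1
  base / scale
}

# Small demo table with the same layout as table1 (item1 and item2 only)
demo <- data.frame(
  id         = c("item1", "item1", "item1", "item2", "item2", "item2"),
  countsperc = c(1, 1, 1, 1, 119, 1)
)

# Apply the helper within each id
demo$apppropor <- ave(as.numeric(demo$countsperc), demo$id, FUN = round_to_one)

# verify: sum of the rounded proportions per id
tapply(demo$apppropor, demo$id, sum)
```

With this sketch the correction goes to the subcategory that lost the most in rounding (or to the first row when there is a tie, as in item1) rather than always to the last row. When writing the column to a file for another program, it may still be worth formatting it explicitly, e.g. with formatC(demo$apppropor, format = "f", digits = 3), so that exactly three decimals are printed.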