Search code examples
runiquedivide

Dividing a column value by unique number of multiple columns in R


I need to obtain the value in the Area and Prm columns by dividing the values by the unique number of columns (YEAR DIV POL ST CTY CR PL YID LID DATE).

Is there a function in R to achieve this?

Thanks in advance.

Table:

Tag  YEAR  DIV  POL  ST  CTY   CR  PL  YID  LID DATE    Area PRm  SEP1  SEP2
S25  2005  7    3068 15  205   11  44  4    2   9042004 799  4504 326.9 296.6 
S1   2005  7    4077 15  205   11  90  4    2   9202004 300  3000 316.1 309.2
S16  2005  7    4077 15  205   11  90  4    2   9202004 300  3000 391.2 201.5
S2   2005  7    4077 15  205   11  90  4    2   9202004 300  3000 271.2 311.5
S28  2005  7    3180 15  205   11  44  5    6   9202004 651  1747 251.2 382.5


Output:

Tag  YEAR  DIV  POL  ST  CTY   CR  PL  YID  LID DATE    Area PRm  SEP1  SEP2
S25  2005  7    3068 15  205   11  44  4    2   9042004 799  4504 326.9 296.6 
S1   2005  7    4077 15  205   11  90  4    2   9202004 100  1000 316.1 309.2
S16  2005  7    4077 15  205   11  90  4    2   9202004 100  1000 391.2 201.5
S2   2005  7    4077 15  205   11  90  4    2   9202004 100  1000 271.2 311.5
S28  2005  7    3180 15  205   11  44  5    6   9202004 651  1747 251.2 382.5

The difference in Table and Output is in the 2,3 and 4 rows in Area and PRm columns.

The values in the Area and Prm column (300 and 3000) was divided by 3 (because YEAR DIV POL ST CTY CR PL YID LID DATE columns are identical in rows 2,3 and 4). SO the values in the Area and PRm columns are divided by 3 (300 / 3 = 100 in Area column and 3000 / 3 = 1000 in PRm column in 2,3 and 4th rows in the output table.

     S1   2005  7    4077 15  205   11  90  4    2   9202004 100  1000 316.1 309.2
    S16  2005  7    4077 15  205   11  90  4    2   9202004 100  1000 391.2 201.5
    S2   2005  7    4077 15  205   11  90  4    2   9202004 100  1000 271.2 311.5

Solution

  • with dplyr :

    df1 %>% add_count(YEAR,DIV,POL,ST,CTY,CR,PL,YID,LID,DATE) %>%
      mutate(Area = Area /n) %>%
      select(-n)
    

    output

    # # A tibble: 5 x 15
    #     Tag  YEAR   DIV   POL    ST   CTY    CR    PL   YID   LID    DATE  Area   PRm  SEP1  SEP2
    #   <chr> <int> <int> <int> <int> <int> <int> <int> <int> <int>   <int> <dbl> <int> <dbl> <dbl>
    # 1   S25  2005     7  3068    15   205    11    44     4     2 9042004   799  4504 326.9 296.6
    # 2    S1  2005     7  4077    15   205    11    90     4     2 9202004   100  3000 316.1 309.2
    # 3   S16  2005     7  4077    15   205    11    90     4     2 9202004   100  3000 391.2 201.5
    # 4    S2  2005     7  4077    15   205    11    90     4     2 9202004   100  3000 271.2 311.5
    # 5   S28  2005     7  3180    15   205    11    44     5     6 9202004   651  1747 251.2 382.5
    

    data

    df1 <- read.table(text="Tag  YEAR  DIV  POL  ST  CTY   CR  PL  YID  LID DATE    Area PRm  SEP1  SEP2
    S25  2005  7    3068 15  205   11  44  4    2   9042004 799  4504 326.9 296.6 
                      S1   2005  7    4077 15  205   11  90  4    2   9202004 300  3000 316.1 309.2
                      S16  2005  7    4077 15  205   11  90  4    2   9202004 300  3000 391.2 201.5
                      S2   2005  7    4077 15  205   11  90  4    2   9202004 300  3000 271.2 311.5
                      S28  2005  7    3180 15  205   11  44  5    6   9202004 651  1747 251.2 382.5",header=T,stringsAsFactors=F)