Search code examples
rdataframeauc

Implementation of Cobb-Douglas Utility Function to calculate Receiver Operator Curve & AUC


this is not a hard problem from my understanding, however, I have minimal experience with different R functions and problem solving within it. The text is long to provide a thorough understanding rather than complexity. Any help would be greatly appreciated.

I am trying to utilize a Cobb-Douglas Utility Function [x^a * y^(1 - a)] to calculate a Receiver Operator Curve for every point of alpha for a final result of an AUC vs. Alpha graph.

If you are unfamiliar with ROC - I'd recommend GoogleDev & Wikipedia.

Here are my two initial tables: df, alpha

view(df)
ID X Y
A 1 4
B 2 5
C 3 6
Alpha <- seq(0.1,0.3,by = 0.1))
view(alpha)
row # Alpha value
1 0.1
2 0.2
3 0.3

Wanted function #1 here to combine Alpha to XY pairs w/

view(combined_wanted_result)
ID X Y Alpha
A 1 4 0.1
A 1 4 0.2
A 1 4 0.3
B 2 5 0.1
B 2 5 0.2
B 2 5 0.3
C 3 6 0.1
C 3 6 0.2
C 3 6 0.3

The combined result is wanted due to the fact that every XY pair (1/4, 2/5, 3/6) is unique&connected and I do not want to mix those vales (bad_ex: 1/5, 3/5, 2/6).

I'm trying to use the Cobb-Douglas Utility Function so that I can make a table for every iteration of alpha. The documentation in R does not seem to be appropriate for my use.

Z = (x^a * y^(1 - a)
Wanted cobb_douglas function #2 here

View(cobb_douglas)
ID X Y Alpha Z
A 1 4 0.1 3.482
A 1 4 0.2 3.031
A 1 4 0.3 2.649
B 2 5 0.1 4.562
B 2 5 0.2 4.163
B 2 5 0.3 3.798
C 3 6 0.1 5.598
C 3 6 0.2 5.223
C 3 6 0.3 4.876

This View(cobb_douglas) will then be grouped by alpha and an ROC curve created from those Z value in comparison to a given an absolute identification.

view(Abs)
ID Abs
A 1
B 0
C 1
#Grab all Z values of certain alpha, matched by common ID to get absolute
view(0.1_alpha)
view(0.2_alpha)
ID Alpha Z Abs
A 0.1 3.482 1
B 0.1 4.562 0
C 0.1 5.598 1
ID Alpha Z Abs
A 0.2 3.031 1
B 0.2 4.163 0
C 0.2 5.223 1

Wanted function #3 - creating an ROC table (w/ Z&Abs) of each alpha table to calculate AUC
Note, the AUC values are not given here - This table will be used to plot Alpha vs. AUC.

Alpha AUC
0.1 ?
0.2 ?
0.3 ?

What I've tried prompt:

My initial stick up which I got while trying to solve this problem with 'expand.grid' to plot every alpha to an XY pair. But the X&Y values kept getting mixed with each other leading to unwanted value pairs such as 1/5, 3/5, 2/6.

I have reviewed similar articles in Stackoverflow and some of them solve the further problem of calculating the AUC. However the initial data matching I am still lost on.

Similar issue links:
MLE for Cobb Douglas function
How to calculate a partial area under the curve (AUC)?
How to calculate area under the curve (AUC) in several data series?
How to calculate the AUC from a ROC plot without the underlying data?
How to calculate the AUC of a graph in R?
Applying a function for calculating AUC for each subject


Solution

  • While it is not entirely clear how all your desired outputs should look, here is a partial solution that addresses some of the things you are trying to achieve. If there are other elements in need of consideration, please comment below and I will update my answer.

    At its heart, your question relates to data manipulation and the dplyr package is probably the easiest way to achieve your goals. If you haven't already, install dplyr and MESS (for the second example) prior to running this code:

    library(dplyr)
    library(MESS)
    
    # Your data as described
    df <- data.frame(ID = LETTERS[1:3],
                     X = 1:3,
                     Y = 4:6)
    
    Alpha <- data.frame(row = 1:3,
                        Alpha = seq(0.1,0.3,by = 0.1))
    
    Abs <- data.frame(ID = LETTERS[1:3],
                      Abs = c(1,0,1))
    
    # Create full dataset
    result <- df %>% 
      slice(rep(1:n(), each = nrow(Alpha))) %>%
      group_by(ID) %>%
      mutate(row = 1:n()) %>%
      left_join(., Alpha, by = "row") %>%
      mutate(Z = (X^Alpha * Y^(1 - Alpha))) %>%
      left_join(., Abs, by = "ID") %>%
      select(-row) %>%
      ungroup()
    
    data.frame(result)
    
      ID X Y Alpha        Z Abs
    1  A 1 4   0.1 3.482202   1
    2  A 1 4   0.2 3.031433   1
    3  A 1 4   0.3 2.639016   1
    4  B 2 5   0.1 4.562218   0
    5  B 2 5   0.2 4.162766   0
    6  B 2 5   0.3 3.798289   0
    7  C 3 6   0.1 5.598198   1
    8  C 3 6   0.2 5.223303   1
    9  C 3 6   0.3 4.873514   1
    
    # Create table with AUC for each Alpha (NOTE: edit columns in auc() if incorrect)
    result1 <- df %>% 
      slice(rep(1:n(), each = nrow(Alpha))) %>%
      group_by(ID) %>%
      mutate(row = 1:n()) %>%
      left_join(., Alpha, by = "row") %>%
      mutate(Z = (X^Alpha * Y^(1 - Alpha))) %>%
      left_join(., Abs, by = "ID") %>%
      select(-row) %>%
      group_by(Alpha) %>%
      summarize(AUC = auc(Z, Abs, type = "spline"), .groups="keep")
    
    data.frame(result1)
    
      Alpha       AUC
    1   0.1 0.7930400
    2   0.2 0.8208068
    3   0.3 0.8363540
    

    Update based on OP's comments

    To use multiple sequences of alpha values dynamically in a loop, the following method works. NOTE: I have kept your column names consistent with your original question (rather than the ones stated in your comments) to avoid confusing others who may happen across this answer:

    # Create list of alpha sequences to iterate over
    alphaseq <- list(seq(0.1, 0.3, by = 0.1),
                     seq(0.1, 1, by = 0.1),
                     seq(0.01, 0.1, by = 0.01))
    
    # Create empty df to hold the results
    auc_results <- data.frame(matrix(NA, nrow = 0, ncol = 3))
    # iteration column added to identify which seq() was used
    colnames(auc_results) <- c("iteration",
                               "Alpha",
                               "AUC")
    
    # Iterate over alphaseq and save results to auc_results
    for(i in seq(alphaseq)) {
      
      # Create Alpha dynamically
      Alpha <- data.frame(Alpha = alphaseq[[i]],
                          row = 1:length(alphaseq[[i]]))
      
      tmp <- df %>% 
        slice(rep(1:n(), each = nrow(Alpha))) %>%
        group_by(ID) %>%
        mutate(row = 1:n()) %>%
        left_join(., Alpha, by = "row") %>%
        mutate(Z = (X^Alpha * Y^(1 - Alpha))) %>%
        left_join(., Abs, by = "ID") %>%
        select(-row) %>%
        group_by(Alpha) %>%
        summarize(AUC = auc(Z, Abs, type = "spline"), .groups="keep") %>%
        mutate(iteration = i) %>%
        relocate(iteration)
      
      auc_results <- rbind(auc_results, tmp)
      rm(tmp)
      
    }
    
    # Result
    data.frame(auc_results)
    
       iteration Alpha       AUC
    1          1  0.10 0.7930400
    2          1  0.20 0.8208068
    3          1  0.30 0.8363540
    4          2  0.10 0.7930400
    5          2  0.20 0.8208068
    6          2  0.30 0.8363540
    7          2  0.40 0.8419765
    8          2  0.50 0.8394922
    9          2  0.60 0.8303900
    10         2  0.70 0.8159135
    11         2  0.80 0.7971144
    12         2  0.90 0.7748883
    13         2  1.00 0.7500000
    14         3  0.01 0.7551021
    15         3  0.02 0.7600145
    16         3  0.03 0.7647423
    17         3  0.04 0.7692901
    18         3  0.05 0.7736625
    19         3  0.06 0.7778638
    20         3  0.07 0.7818983
    21         3  0.08 0.7857700
    22         3  0.09 0.7894826
    23         3  0.10 0.7930400