
How do I assign unique instances as multiple keys to a dictionary in R?


I have an R data frame, df, that looks like this:

course instance assignment
1 1 A
1 1 B
1 2 B
1 2 C
2 1 A
2 1 C
2 2 B
2 2 A

I need to create a superset (for lack of a better term) of all of the assignments in a course across instances.

For example: Course 1 was offered 2x, and in instance 1 it included assignments A and B, and in instance 2 it included assignments B and C. The superset of assignments in this class should include assignments A, B, and C each one time. In other words, every assignment that appears at least once across instances of a course should appear exactly one time in the superset.

UPDATE: I've tried the suggestion below.

library(tidyverse)

df %>%
    group_by(course) %>%
    summarise(
        all_assignments = toString(sort(unique(assignment))),
        .groups = "drop")

This returns the following:

all_assignments .groups
A drop

I've now tested this on the following sample data set:

df <- read.table(text = "course instance    assignment
1   1   A
1   1   B
1   2   B
1   2   C
2   1   A
2   1   C
2   2   B
2   2   A", header = T)

This returns a similar structure:

all_assignments .groups
A, B, C drop

Apparently this exact code has worked for others, so I'm wondering what I'm doing incorrectly?


Solution

  • I'm not entirely clear on your expected output (see my comment above), but please have a look at the following:

    library(dplyr)
    df %>% 
        group_by(course) %>% 
        summarise(
            all_assignments = toString(sort(unique(assignment))), 
            .groups = "drop")
    ## A tibble: 2 × 2
    #  course all_assignments
    #   <int> <chr>          
    #1      1 A, B, C        
    #2      2 A, B, C       
    

    Tested and verified on R 4.2.0 with dplyr 1.0.9.
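
If the same code instead returns a single row with an extra `.groups` column, as in the question, one possible cause (an assumption, since the question does not list the loaded packages) is that `summarise()` is resolving to another package's function of the same name, for example `plyr::summarise`, which ignores dplyr grouping and treats `.groups = "drop"` as a new column. A minimal sketch that sidesteps this by calling the dplyr version explicitly:

```r
library(dplyr)

# Sample data copied from the question
df <- read.table(text = "course instance assignment
1 1 A
1 1 B
1 2 B
1 2 C
2 1 A
2 1 C
2 2 B
2 2 A", header = TRUE)

# Namespacing the call guarantees dplyr's summarise() is used,
# even if another loaded package masks the bare name
result <- df %>%
    group_by(course) %>%
    dplyr::summarise(
        all_assignments = toString(sort(unique(assignment))),
        .groups = "drop")

print(result)
```

Running `environmentName(environment(summarise))` in the affected session should show which package the unqualified `summarise` currently comes from.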


    Sample data

    df <- read.table(text = "course instance    assignment
    1   1   A
    1   1   B
    1   2   B
    1   2   C
    2   1   A
    2   1   C
    2   2   B
    2   2   A", header = TRUE)
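
For comparison, the same per-course set of assignments can be built without any packages. This is only a sketch of a base R alternative using `aggregate()`; it is not part of the original answer, and the `result` name is just illustrative:

```r
# Sample data copied from above
df <- read.table(text = "course instance assignment
1 1 A
1 1 B
1 2 B
1 2 C
2 1 A
2 1 C
2 2 B
2 2 A", header = TRUE)

# For each course, collapse the sorted unique assignments into one string
result <- aggregate(
    assignment ~ course,
    data = df,
    FUN = function(x) toString(sort(unique(x))))

# Rename the aggregated column to match the dplyr version
names(result)[2] <- "all_assignments"

print(result)
```

Because this avoids `summarise()` altogether, it is unaffected by which packages happen to be loaded.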