I need to prepare queries made of character strings (DOIs, Digital Object Identifiers) stored in a data frame. All strings associated with the same case have to be joined to produce one query.
The df looks like this:
Case | DOI |
---|---|
1 | 1212313/dfsjk23 |
1 | 322332/jdkdsa12 |
2 | 21323/xsw.w3 |
2 | 311331313/q1231 |
2 | 1212121/1231312 |
The output should be a data frame looking like this:
Case | Query |
---|---|
1 | DO=(1212313/dfsjk23 OR 322332/jdkdsa12) |
2 | DO=(21323/xsw.w3 OR 311331313/q1231 OR 1212121/1231312) |
The prefix ("DO="), the suffix (")") and the "OR" separator are not critical, since I can add them later. But how do I aggregate character strings by case number?
In base R you could do:
aggregate(DOI~Case, df1, function(x) sprintf('DO=(%s)', paste0(x, collapse = ' OR ')))
Case DOI
1 1 DO=(1212313/dfsjk23 OR 322332/jdkdsa12)
2 2 DO=(21323/xsw.w3 OR 311331313/q1231 OR 1212121/1231312)
If you are using R 4.1.0 or later, the shorthand lambda syntax also works:
aggregate(DOI~Case, df1, \(x) sprintf('DO=(%s)', paste0(x, collapse = ' OR ')))
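For completeness, here is a self-contained sketch of the same approach. It assumes the data frame is called `df1` (as in the calls above) and rebuilds it from the example in the question. The final renaming step is my addition: `aggregate()` keeps the original column name `DOI`, while the desired output calls it `Query`.

```r
# Rebuild the example data from the question; the name df1 is an assumption
df1 <- data.frame(
  Case = c(1, 1, 2, 2, 2),
  DOI  = c("1212313/dfsjk23", "322332/jdkdsa12",
           "21323/xsw.w3", "311331313/q1231", "1212121/1231312"),
  stringsAsFactors = FALSE
)

# Collapse all DOIs belonging to one case into a single "DO=(... OR ...)" string
res <- aggregate(
  DOI ~ Case,
  data = df1,
  FUN  = function(x) sprintf("DO=(%s)", paste(x, collapse = " OR "))
)

# aggregate() keeps the column name DOI, so rename it to match the desired output
names(res)[names(res) == "DOI"] <- "Query"
res
```

If you would rather add the prefix and suffix later, as mentioned in the question, drop the `sprintf()` wrapper and keep only `paste(x, collapse = " OR ")`.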