I have a vector list.exp2 where each entry is one or more strings separated by commas. I would like to split each entry and take the first n strings, where n depends on the number of delimiters present in that entry.
I've tried the code below, but without success so far:
refined.final.list <- as.vector(sapply(list.exp2, function(n)
ifelse(count.fields(textConnection(list.exp2[n]), sep = ",") < 3,
unlist(strsplit(list.exp2[n], ","))[1],
count.fields(textConnection(list.exp2[n]), sep = ",") < 5,
unlist(strsplit(list.exp2[n], ","))[1:2],
unlist(strsplit(list.exp2[n], ","))[1:4])))
Basically, I used ifelse together with count.fields, which gives me the number of comma-separated fields in each entry, and unlist(strsplit(...)) is supposed to give me the corresponding split elements.
The list.exp2 vector looks like this:
lis.exp2 <- c("ISTITUTO PER LA SINTESI ORGANICA E LA FOTOREATTIVITÀ (ISOF-CNR), SEZIONE DI FERRARA, VIA L. BORSARI 46, 44100 FERRARA, ITALY",
"FLUXOME SCIENCES A/S, SØLTOFTS PLADS, BUILDING 223, DK-2800 KGS. LYNGBY, DENMARK",
"FERDINAND-BRAUN-INSTITUT FÜR HÖCHSTFREQUENZTECHNIK, GUSTAV-KIRCHHOFF-STR. 4, 12489 BERLIN, GERMANY")
Any insights into how to correct this code will be greatly appreciated.
One option is to use strsplit directly on your vector lis.exp2. It returns a list with one element for each item of the vector. Then use lapply to return the desired number of elements from each.
For example, to return the first 3 items:
n <- 3
lapply(strsplit(lis.exp2, split=","), function(x)x[1:n])
#OR Based on @thelatemail suggestion
lapply(strsplit(lis.exp2, split=","), head, n)
#Result
# [[1]]
# [1] "ISTITUTO PER LA SINTESI ORGANICA E LA FOTOREATTIVITÀ (ISOF-CNR)"
# [2] " SEZIONE DI FERRARA"
# [3] " VIA L. BORSARI 46"
#
# [[2]]
# [1] "FLUXOME SCIENCES A/S" " SØLTOFTS PLADS" " BUILDING 223"
#
# [[3]]
# [1] "FERDINAND-BRAUN-INSTITUT FÜR HÖCHSTFREQUENZTECHNIK"
# [2] " GUSTAV-KIRCHHOFF-STR. 4"
# [3] " 12489 BERLIN"
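The leading spaces in the result above come from splitting on a bare comma. Here is a minimal sketch, assuming trimmed pieces are wanted, that strips them with trimws. It also shows how the two variants differ on short entries: head returns fewer than n items, while x[1:n] pads with NA. The vector name addresses and its last entry are made up for illustration.

```r
# Two entries taken from the question, plus one hypothetical short entry
addresses <- c(
  "FLUXOME SCIENCES A/S, SØLTOFTS PLADS, BUILDING 223, DK-2800 KGS. LYNGBY, DENMARK",
  "FERDINAND-BRAUN-INSTITUT FÜR HÖCHSTFREQUENZTECHNIK, GUSTAV-KIRCHHOFF-STR. 4, 12489 BERLIN, GERMANY",
  "ACME LTD, UK"  # hypothetical entry with only two fields
)

n <- 3

# Split on a literal comma (fixed = TRUE avoids regex interpretation)
parts <- strsplit(addresses, split = ",", fixed = TRUE)

# head() keeps at most n pieces; trimws() removes surrounding whitespace
first_n <- lapply(parts, function(x) trimws(head(x, n)))

# Indexing with 1:n instead pads short entries with NA
padded <- lapply(parts, function(x) x[1:n])

print(first_n)
print(padded[[3]])
```

If the padding is not wanted, head is probably the safer of the two choices.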
**UPDATED:** Based on feedback from OP
A function can be written that checks the number of items: if there are 4 or fewer (say), it returns only the first item; otherwise it returns the first 3 items.
#Function to return top 1/3 items based on condition
getNItems <- function(x){
if(length(x) <= 4){
#only 1st
x[1]
}else{
#first 3
x[1:3]
}
}
lapply(strsplit(lis.exp2, split=","), getNItems)
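The same pattern can be extended to the three-way rule the question seems to be aiming for: fewer than 3 fields gives the first item, fewer than 5 gives the first 2, and anything else gives the first 4. The original attempt could not work as written because ifelse takes exactly three arguments (test, yes, no) and returns a result shaped like its test, so it cannot return vectors of different lengths. In addition, sapply passes each string itself to the function, not an index. The sketch below is one possible reading of that rule; the name pick_by_count and the last sample entry are made up for illustration.

```r
# Hypothetical helper: choose how many pieces to keep from the field count
pick_by_count <- function(x) {
  k <- length(x)
  if (k < 3) {
    x[1]      # fewer than 3 fields: first item only
  } else if (k < 5) {
    x[1:2]    # 3 or 4 fields: first two items
  } else {
    x[1:4]    # 5 or more fields: first four items
  }
}

# Two entries taken from the question, plus one hypothetical short entry
addresses <- c(
  "FLUXOME SCIENCES A/S, SØLTOFTS PLADS, BUILDING 223, DK-2800 KGS. LYNGBY, DENMARK",
  "FERDINAND-BRAUN-INSTITUT FÜR HÖCHSTFREQUENZTECHNIK, GUSTAV-KIRCHHOFF-STR. 4, 12489 BERLIN, GERMANY",
  "ACME LTD, UK"  # hypothetical entry with only two fields
)

parts <- strsplit(addresses, split = ",", fixed = TRUE)
result <- lapply(parts, function(x) trimws(pick_by_count(x)))

# Number of pieces kept per entry
print(lengths(result))
print(result)
```

Using plain if/else inside a function called by lapply keeps each result as its own vector, which is why the output is a list and not a flat vector.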