I have a character vector in which single elements contain multiple strings separated by commas. I obtained this vector by extracting it from a data frame, and it looks like this:
[1] "Acworth, Crescent Lake, East Acworth, Lynn, South Acworth"
[2] "Ferncroft, Passaconaway, Paugus Mill"
[3] "Alexandria, South Alexandria"
[4] "Allenstown, Blodgett, Kenison Corner, Suncook (part)"
[5] "Alstead, Alstead Center, East Alstead, Forristalls Corner, Mill Hollow"
[6] "Alton, Alton Bay, Brookhurst, East Alton, Loon Cove, Mount Major, South Alton, Spring Haven, Stockbridge Corners, West Alton, Woodlands"
[7] "Amherst, Baboosic Lake, Cricket Corner, Ponemah"
[8] "Andover, Cilleyville, East Andover, Halcyon Station, Potter Place, West Andover"
[9] "Antrim, Antrim Center, Clinton Village, Loverens Mill, North Branch"
[10] "Ashland"
I would like to obtain a new character vector in which every individual string is its own element, i.e.:
[1] "Acworth", "Crescent Lake", "East Acworth", "Lynn", "South Acworth"
[6] "Ferncroft", "Passaconaway", "Paugus Mill", "Alexandria", "South Alexandria"
I used the strsplit() function; however, this returns a list. When I try to turn it into a character vector, it reverts to the old state.
I'm sure this is a really simple problem. Any help would be greatly appreciated! Thanks!
Your post title suggests you want unique strings, so
unique(unlist(strsplit(myvec, split=",")))
or
unique(unlist(strsplit(myvec, split=", ")))
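if you always have a space following the comma. With split="," alone, every piece after the first in each element keeps its leading space (for example " Crescent Lake"), so the second form gives cleaner results on data like yours.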
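If the spacing after the commas is not guaranteed to be consistent, one possible variant is to split on the bare comma and trim whitespace afterwards. The following is a minimal sketch, not the only way to do it: the three-element myvec is a shortened stand-in for the vector in the question, and the names pieces and places are made up for illustration. It assumes R 3.2.0 or later, where trimws() is available.

```r
# Shortened, illustrative stand-in for the vector in the question
myvec <- c("Acworth, Crescent Lake, East Acworth, Lynn, South Acworth",
           "Ferncroft, Passaconaway, Paugus Mill",
           "Ashland")

# strsplit() returns a list with one character vector per input element
pieces <- strsplit(myvec, split = ",", fixed = TRUE)

# unlist() flattens that list into a single character vector;
# trimws() removes leading and trailing whitespace from each piece
places <- trimws(unlist(pieces))

# unique() drops repeated names, keeping the first occurrence of each
places <- unique(places)

print(places)
```

The step that turns the list into a plain character vector is unlist(); trimws() only matters when some commas are not followed by exactly one space.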