Splitting CamelCase in R


Is there a way to split camel case strings in R?

I have attempted:

string.to.split = "thisIsSomeCamelCase"
unlist(strsplit(string.to.split, split="[A-Z]") )
# [1] "this" "s"    "ome"  "amel" "ase" 

Solution

  • string.to.split = "thisIsSomeCamelCase"
    gsub("([A-Z]){1}", " \\1", string.to.split)
    # [1] "this Is Some Camel Case"
    # added an explicit {1} quantifier to address the situation mentioned in a comment
    strsplit(gsub("([A-Z]{1})", " \\1", string.to.split), " ")
    # [[1]]
    # [1] "this"  "Is"    "Some"  "Camel" "Case" 
    
    # another attempt to address the commenter's concern:
    # insert a space only where a lowercase letter is directly followed by an uppercase letter
    gsub("([[:lower:]])([[:upper:]]){1}", "\\1 \\2", string.to.split)
    # [1] "this Is Some Camel Case"
    

    Comparing Ramnath's answer with mine supports my initial impression that this was an underspecified question.

    Tommy and Ramnath deserve upvotes for pointing out [:upper:]:

    strsplit(gsub("([[:upper:]])", " \\1", string.to.split), " ")
    # [[1]]
    # [1] "this"  "Is"    "Some"  "Camel" "Case"
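
The variants above behave the same on the example string but can differ on other inputs. As a minimal sketch, the lowercase-then-uppercase pattern can be wrapped in a small helper. The name split_camel and the extra test strings are made up for illustration and do not come from the original answers. Because the space goes in only between a lowercase letter and the uppercase letter after it, a leading capital does not produce an empty first element, and a run of capitals stays in one piece, which may or may not be what a given use case wants.

```r
# Hypothetical helper (not part of the original answers): insert a space only
# where a lowercase letter is directly followed by an uppercase letter,
# then split on that space.
split_camel <- function(x) {
  spaced <- gsub("([[:lower:]])([[:upper:]])", "\\1 \\2", x)
  strsplit(spaced, " ", fixed = TRUE)
}

split_camel("thisIsSomeCamelCase")[[1]]
# [1] "this"  "Is"    "Some"  "Camel" "Case"

# A leading capital does not yield an empty first element
split_camel("CamelCase")[[1]]
# [1] "Camel" "Case"

# A run of capitals is kept together
split_camel("parseHTTPResponse")[[1]]
# [1] "parse"        "HTTPResponse"
```

By contrast, inserting a space before every capital, as in gsub("([[:upper:]])", " \\1", ...), puts a space at the very start of a string such as "CamelCase", so strsplit returns an empty string as the first element.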