How to split a camelCase string into an array in awk?


How can I split a camelCase string into an array in awk using the split function?

Input:

STRING="camelCasedExample"

Desired Result:

WORDS[1]="camel"
WORDS[2]="Cased"
WORDS[3]="Example"

Bad Attempt:

split(STRING, WORDS, /([a-z])([A-Z])/);

Bad Result:

WORDS[1]="came"
WORDS[2]="ase"
WORDS[3]="xample"

Solution

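  • You can't do it with split() alone: split() discards whatever text the separator regexp matches, and awk regexps have no zero-width lookaround that could match the boundary between two letters without consuming them. That is why GNU awk has patsplit(), which takes a regexp describing the fields rather than the separators: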

    $ awk 'BEGIN {
        n = patsplit("camelCasedExample",words,/(^|[[:upper:]])[[:lower:]]+/)
        for ( i = 1; i <= n; i++ ) print words[i]   # "for (i in words)" does not guarantee order
    }'
    camel
    Cased
    Example
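
If GNU awk is not available, one possible workaround is to insert a separator before each uppercase letter with gsub() and then call split() on that separator. The sketch below is only an illustration: split_camel is a hypothetical helper name, and it assumes the input contains ASCII letters only and no whitespace of its own.

```shell
# split_camel is a hypothetical helper name, not part of awk.
split_camel() {
    awk -v s="$1" 'BEGIN {
        # Put a space before each uppercase letter:
        # "camelCasedExample" -> "camel Cased Example"
        gsub(/[A-Z]/, " &", s)
        # With a single-space separator, split() treats runs of whitespace
        # as one delimiter and ignores leading whitespace.
        n = split(s, words, " ")
        for (i = 1; i <= n; i++) print words[i]
    }'
}

split_camel camelCasedExample
```

This prints camel, Cased and Example on separate lines. Unlike the patsplit() version, it starts a new word at every uppercase letter, so a run such as "HTTP" would come out as four one-letter words.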