Tags: php, regex, preg-replace, camelcasing, kebab-case

Using preg_replace() to convert alphanumeric strings from camelCase to kebab-case


I have a method that converts my camelCase strings to kebab-case, but it takes three separate calls to preg_replace():

public function camelToKebab($string, $us = "-")
{
    // insert hyphen between any letter and the beginning of a numeric chain
    $string = preg_replace('/([a-z]+)([0-9]+)/i', '$1'.$us.'$2', $string);
    // insert hyphen between any lower-to-upper-case letter chain
    $string = preg_replace('/([a-z]+)([A-Z]+)/', '$1'.$us.'$2', $string);
    // insert hyphen between the end of a numeric chain and the beginning of an alpha chain
    $string = preg_replace('/([0-9]+)([a-z]+)/i', '$1' . $us . '$2', $string);

    // Lowercase
    $string = strtolower($string);

    return $string;
}

I wrote tests to verify its accuracy, and it works properly with the following array of inputs (array('input' => 'output')):

$test_values = [
    'foo'       => 'foo',
    'fooBar'    => 'foo-bar',
    'foo123'    => 'foo-123',
    '123Foo'    => '123-foo',
    'fooBar123' => 'foo-bar-123',
    'foo123Bar' => 'foo-123-bar',
    '123FooBar' => '123-foo-bar',
];
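
For anyone who wants to reproduce the check, here is a minimal sketch of how the table above could be run. It assumes the method is copied into a plain function (camelToKebabThreeStep is a name made up for this sketch) and uses a hypothetical helper, failingCases(), that is not part of the original code:

```php
<?php
// Standalone copy of the three-step conversion from the question.
// The original is a class method; a plain function is assumed here.
function camelToKebabThreeStep(string $string, string $us = '-'): string
{
    // letter chain followed by a numeric chain
    $string = preg_replace('/([a-z]+)([0-9]+)/i', '$1' . $us . '$2', $string);
    // lowercase chain followed by an uppercase chain
    $string = preg_replace('/([a-z]+)([A-Z]+)/', '$1' . $us . '$2', $string);
    // numeric chain followed by a letter chain
    $string = preg_replace('/([0-9]+)([a-z]+)/i', '$1' . $us . '$2', $string);

    return strtolower($string);
}

// Hypothetical helper: collects every case where $convert disagrees with the table.
function failingCases(callable $convert, array $cases): array
{
    $failures = [];
    foreach ($cases as $input => $expected) {
        $actual = $convert((string) $input);
        if ($actual !== $expected) {
            $failures[$input] = ['expected' => $expected, 'actual' => $actual];
        }
    }
    return $failures;
}

$test_values = [
    'foo'       => 'foo',
    'fooBar'    => 'foo-bar',
    'foo123'    => 'foo-123',
    '123Foo'    => '123-foo',
    'fooBar123' => 'foo-bar-123',
    'foo123Bar' => 'foo-123-bar',
    '123FooBar' => '123-foo-bar',
];

$failures = failingCases('camelToKebabThreeStep', $test_values);
echo $failures === [] ? "all cases pass\n" : print_r($failures, true);
```

Because failingCases() accepts any callable, the same table can be reused to check a replacement implementation against the current one.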

I'm wondering if there's a way to reduce my preg_replace() calls to a single line which will give me the same result. Any ideas?

NOTE: Referring to this post, my research has shown me a preg_replace() regex that gets me almost the result I want, except it doesn't work on the example of foo123 to convert it to foo-123.


Solution

  • You can use lookarounds to do all this in a single regex:

    function camelToKebab($string, $us = "-") {
        return strtolower(preg_replace(
            '/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/', $us, $string));
    }
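
A quick usage sketch, in case it helps: the same pattern wrapped in a function named camelToKebab (the name and the type hints are assumptions added for this example), called on a few inputs from the question, including the foo123 case mentioned in the NOTE, plus one call with a different separator.

```php
<?php
// The single-regex approach; name and type hints are assumed for illustration.
function camelToKebab(string $string, string $us = '-'): string
{
    return strtolower(preg_replace(
        '/(?<=\d)(?=[A-Za-z])|(?<=[A-Za-z])(?=\d)|(?<=[a-z])(?=[A-Z])/',
        $us,
        $string
    ));
}

echo camelToKebab('foo123'), PHP_EOL;         // foo-123
echo camelToKebab('foo123Bar'), PHP_EOL;      // foo-123-bar
echo camelToKebab('123FooBar', '_'), PHP_EOL; // 123_foo_bar
```

The pattern only matches empty positions between characters, so preg_replace() inserts the separator without consuming any of the original text. That is why no capture groups or $1/$2 references are needed.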
    

    RegEx Description:

    (?<=\d)(?=[A-Za-z])  # if previous position has a digit and next has a letter
    |                    # OR
    (?<=[A-Za-z])(?=\d)  # if previous position has a letter and next has a digit
    |                    # OR
    (?<=[a-z])(?=[A-Z])  # if previous position has a lowercase letter and next has an uppercase letter
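
One way to see what these three alternatives match is to split on them instead of replacing. The sketch below is only an illustration, not part of the original answer: it rewrites the pattern with the x modifier so the comments can live inside the regex, switches the delimiter to ~, and uses a variable name ($boundary) invented for this example.

```php
<?php
// Same three alternatives, written with the x modifier so that
// whitespace and # comments inside the pattern are ignored.
$boundary = '~
      (?<=\d)(?=[A-Za-z])   # digit on the left, letter on the right
    | (?<=[A-Za-z])(?=\d)   # letter on the left, digit on the right
    | (?<=[a-z])(?=[A-Z])   # lowercase on the left, uppercase on the right
~x';

// Each match is zero-width, so splitting on it cuts the string
// exactly where the separator would otherwise be inserted.
$pieces = preg_split($boundary, 'foo123Bar');
// $pieces is ['foo', '123', 'Bar']

echo strtolower(implode('-', $pieces)), PHP_EOL; // foo-123-bar
```

Joining the pieces with the separator gives the same result as the preg_replace() call, which can be a handy way to debug where the boundaries fall.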