
Regex remove quotes around integers?


Let's say I have a string "\"Bob\",\"1\",\"Mary\",\"2\"". Is it possible to remove only the quotes around the numbers and not the ones around the letters? I've tried look-ahead and look-behind, but look-behind only accepts fixed-length patterns, which tripped me up, and I have no idea how to solve the problem. Thanks.


Solution

  • in php:

    <?php
    $in = "\"Bob\",\"1\",\"Mary\",\"2\"";
    $out = preg_replace('/"(\d)"/',"$1",$in);
    echo $out;
    ?>
    

    in javascript:

    var $in = "\"Bob\",\"1\",\"Mary\",\"2\"";
    var $out = $in.replace(/"(\d)"/g,"$1");
    alert($out);
    

    my best guess in R: (I am not an R programmer)

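    # "in" is a reserved word in R, so the variable is named "input"
    input <- "\"Bob\",\"1\",\"Mary\",\"2\""
    # gsub() replaces every match; sub() would only replace the first one
    out <- gsub("\"([[:digit:]])\"", "\\1", input)
    print(out)
    

    ... here \\1 is equivalent to $1 and [[:digit:]] (the [:digit:] class, which must sit inside a bracket expression) is equivalent to \d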

    to explain the regex

    PHP's preg_replace() function, as used here, takes the regular expression as its first parameter, the replacement as its second, and the source as its third, each in the form of a string. It returns the modified string after the replacements have taken place.

    JavaScript's .replace() method is called on the source string and takes a regular expression as the first parameter and a replacement string as the second. It returns a new string with the replacements applied; the source string itself is not changed.

    In this example, the regular expression is delimited by (starts and ends with) slashes (/.../). It matches a single digit (\d) that is captured by parentheses ((\d)) and enclosed in quotes ("(\d)"). In JavaScript the g flag makes the replacement global (repeated for all occurrences); PHP's preg_replace() replaces all occurrences by default. The captured digit (captured because it is enclosed in parentheses) is then referenced in the replacement with $1, meaning the first captured group. In PHP, $0 refers to the entire matched string; in JavaScript the equivalent is $&. $2 would refer to the second captured group, but there is none in this regex. Anything enclosed in plain parentheses in a regex is a captured group and can be referenced in the replacement via $n, where n is its index. So, to put it simply, the regex replaces every occurrence of a digit enclosed in quotes with just the digit. Because \d matches exactly one digit, an integer with several digits needs \d+ instead.
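
As a rough sketch of the capture-group replacement described above, extended to integers with more than one digit: the helper name unquoteIntegers and the sample value "22" are assumptions made for illustration and are not part of the original answer. It swaps \d for \d+ and also shows $&, which in JavaScript inserts the whole match.

```javascript
// Hypothetical helper (not from the original answer): strips the quotes
// around any quoted run of digits and leaves quoted words untouched.
function unquoteIntegers(text) {
  // "(\d+)" : a quote, one or more digits captured as group 1, a quote.
  // The g flag repeats the replacement for every match in the string.
  return text.replace(/"(\d+)"/g, "$1");
}

// Sample input is an assumption: the question's string with "2" changed to "22".
const input = '"Bob","1","Mary","22"';
const output = unquoteIntegers(input);
console.log(output); // "Bob",1,"Mary",22

// $& stands for the entire match, so this wraps each quoted number in brackets.
const marked = input.replace(/"(\d+)"/g, "[$&]");
console.log(marked); // "Bob",["1"],"Mary",["22"]
```

A quoted value such as "4a" is left alone, because the closing quote has to come straight after the digits.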