Tags: regex, perl, syntax, space, removing-whitespace

Insert missing commas in C source


I've got a Perl script (run with the -p flag) that performs some corrections on a corrupted C source file. Here's part of the script:

sub remove_sp {
    $_ = shift; 
    s/ /, /g; 
    return $_;
}

s/(\([^}]*\))/remove_sp($1)/eg;

This replaces each space inside parentheses with a comma followed by a space, e.g. foo(bar baz) becomes foo(bar, baz). However, it's not very smart: it also changes foo("bar baz") to foo("bar, baz"), which obviously isn't what I want.

I can't think of a way to rewrite the script so that it replaces a space with a comma-space only when the space is not between quotes. How can I do this?


Here's a simple table of what I need and what isn't working.

Search                       | Replace                        | Currently handled correctly?
--------------------------------------------------------------------------------------------
foo(bar baz)                 | foo(bar, baz)                  | Yes
foo("bar baz")               | foo("bar baz")                 | No
foo("bar baz" bak)           | foo("bar baz", bak)            | No
foo("bar baz" bak "123 abc") | foo("bar baz", bak, "123 abc") | No

Solution

  • You could use Text::ParseWords to split the text between the parentheses into quote-aware tokens, then join those tokens with a comma and a space.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::ParseWords;
    
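    # Sample inputs; $s is a copy so the string literals are not modified.
    for ('foo("bar baz")', 'print("foo bar" baz)', 'foo(bar baz)') {
        my $s = $_;
        # Rewrite each parenthesised group via remove_sp().
        $s =~ s/(\([^)]*\))/remove_sp($1)/eg;
        print $s, $/;
    }
    
    # Split on whitespace outside quotes (keep=1 retains the quote
    # characters), then rejoin the tokens with ", ".
    sub remove_sp {
        join ", ", quotewords('\s+', 1, shift);
    }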
    

    Output:

    foo("bar baz")
    print("foo bar", baz)
    foo(bar, baz)
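
If you would rather not load a module, a regex alternation can do a similar job. This is only a minimal sketch, not the approach from the answer above: it matches either a whole double-quoted string (copied back unchanged) or a run of whitespace (replaced with a comma and a space). The names add_commas and fix_line are hypothetical helpers made up for this example. It assumes the arguments are separated only by whitespace, that no commas are present yet, and that no quoted string contains a closing parenthesis.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: put ", " between whitespace-separated arguments,
# leaving double-quoted strings (including backslash escapes) untouched.
sub add_commas {
    my ($args) = @_;
    # First alternative captures a complete quoted string into $1;
    # second alternative matches a run of whitespace outside quotes.
    $args =~ s/("(?:[^"\\]|\\.)*")|\s+/defined $1 ? $1 : ", "/ge;
    return $args;
}

# Hypothetical helper: apply add_commas() to each parenthesised group.
# Uses the same simple group pattern as the answer, so a ")" inside a
# quoted string is not handled.
sub fix_line {
    my ($line) = @_;
    $line =~ s/(\([^)]*\))/add_commas($1)/ge;
    return $line;
}

# The four cases from the table in the question.
for my $src ('foo(bar baz)',
             'foo("bar baz")',
             'foo("bar baz" bak)',
             'foo("bar baz" bak "123 abc")') {
    print fix_line($src), "\n";
}
```

In a -p script, the same idea would amount to calling `$_ = fix_line($_);` for each line. Keep in mind that input which already contains commas, or has leading or trailing spaces inside the parentheses, would pick up extra commas with this sketch.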