
Odd behaviour in Perl substitution on an RTF file


I am trying to use the Perl module "RTF::Writer" to write strings of text that mix several formats. This is proving more complicated than I anticipated. At the moment I am just testing with:

$rtf->paragraph( \'\b', "Name: $name, le\cf1 ng\cf0 th $len" );

but this writes:

{\pard
\b
Name: my_name, le\'061 ng\'060 th 7
\par}

where \'061 should be \cf1 and \'060 should be \cf0.

I then tried to remedy this with a Perl one-liner:

perl -pi -e "s/\'06/\cf/g"

but this made things worse. I do not know what "\^F" represents in vi, but that is what it shows.

It did not matter if I escaped the backslashes or not.

Can anyone explain this behavior, and what to do about it?

Can anyone suggest how to get the RTF::Writer to create the file as desired from the start?

Thanks


Solution

  • \ is a special character in double-quoted string literals. If you want a string that contains \, you need to use \\ in the literal. To create the string \cf1, you need to use "\\cf1". ("\cf" means Ctrl-F, which is to say the byte 06.)

    Alternatively, \ is only special if followed by \ or a delimiter in single-quoted string literals. So the string \cf1 could also be created from '\cf1'.
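
    As a quick check of the quoting rules above, the following minimal sketch compares the three literals. The variable names are illustrative only; nothing here depends on RTF::Writer.

    ```perl
    use strict;
    use warnings;

    # Double-quoted: "\cf" is the escape for Ctrl-F (byte 0x06), so this
    # string is two characters long: chr(6) followed by "1".
    my $ctrl = "\cf1";

    # Either of these yields the four characters  \ c f 1.
    my $dq = "\\cf1";
    my $sq = '\cf1';

    # Show the bytes of the accidental Ctrl-F string in hex.
    printf "%s\n", join ' ', map { sprintf '%02X', ord } split //, $ctrl;  # 06 31

    print length($dq), "\n";                       # 4
    print $dq eq $sq ? "same\n" : "different\n";   # same
    ```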

    Both produce the string you want, but they don't produce the document you want. That's because there's a second problem.

    When you pass a string to RTF::Writer, it's expected to be text to render. But you are passing a string you wanted included as is in the final document. You need to pass a reference to a string if you want to provide raw RTF. \'...', \"..." and \$str all produce a reference to a string.
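
    To illustrate the distinction between text and raw RTF, here is a rough sketch of how a writer could treat the two kinds of argument. `emit_piece` is a hypothetical helper written for this example, not RTF::Writer's actual code, and it only escapes `\`, `{` and `}`, which is a simplification of what a real RTF writer does.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (NOT part of RTF::Writer): plain strings are
    # treated as text and escaped; scalar references are treated as raw RTF.
    sub emit_piece {
        my ($piece) = @_;
        return $$piece if ref($piece) eq 'SCALAR';   # raw RTF, copied as is
        (my $text = $piece) =~ s/([\\{}])/\\$1/g;    # text: escape \ { }
        return $text;
    }

    my $raw  = emit_piece(\'\cf1');   # reference to the string \cf1
    my $text = emit_piece('\cf1');    # the same characters passed as text

    print "$raw\n";    # \cf1
    print "$text\n";   # \\cf1
    ```

    The same characters therefore end up as a colour change in one case and as literal visible text in the other, depending only on whether a reference was passed.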

    Fixed:

    use strict;
    use warnings;
    use RTF::Writer qw( );
    
    my $name = "my_name";
    
    my $rtf = RTF::Writer->new_to_file("greetings.rtf");
    $rtf->prolog( 'title' => "Greetings, hyoomon" );
    
    # Plain strings are text to render; \'...' passes raw RTF through as is.
    $rtf->paragraph( \'\b', "Name: $name, le", \'\cf1', "ng", \'\cf0', "th".length($name));
    $rtf->close;
    

    Output from the call to paragraph:

    {\pard
    \b
    Name: my_name, le\cf1
    ng\cf0
    th7
    \par}
    

    Note that I didn't use the following because it would be a code injection bug:

    $rtf->paragraph(\("\\b Name: $name, le\\cf1 ng\\cf0 th".length($name)));
    

    Don't pass text such as the contents of $name using \...; use that for raw RTF only.
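
    The same escaping rule probably explains the one-liner from the question. The sketch below runs the substitution inside a script, on a sample line copied from the question's output; the variable names are illustrative. Generating the file correctly in the first place is still preferable to patching it afterwards.

    ```perl
    use strict;
    use warnings;

    # A line as written by the original (incorrect) paragraph() call.
    my $line = q{Name: my_name, le\'061 ng\'060 th 7};

    # The attempted substitution: \' in the pattern matches just ', and \cf
    # in the replacement is Ctrl-F, so the existing backslash stays in place
    # and '06 becomes chr(6). vi displays that as \^F.
    (my $broken = $line) =~ s/\'06/\cf/g;

    # Escaping the backslash on both sides matches and writes a literal \.
    (my $fixed = $line) =~ s/\\'06/\\cf/g;

    print "$fixed\n";   # Name: my_name, le\cf1 ng\cf0 th 7
    ```

    On the command line, the shell applies its own quoting before Perl sees the code, which may be why escaping the backslashes appeared to make no difference; putting the substitution in a script file avoids that extra layer.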