Tags: bash, quotes, tex

Replacing quotation marks with "``" and "''"


I have a document containing many " marks that I want to convert for use in TeX.

TeX uses two ` marks for the opening quotation mark and two ' marks for the closing one.

I only want to change a line when it contains an even number of " marks (e.g. 2, 4, or 6 of them). For example:

"This line has 2 quotation marks."
--> ``This line has 2 quotation marks.''

"This line," said the spider, "Has 4 quotation marks."
--> ``This line,'' said the spider, ``Has 4 quotation marks.''

"This line," said the spider, must have a problem, because there are 3 quotation marks."
--> (unchanged)

My sentences never break across lines, so there is no need to check on multiple lines.

Few of the quotations use single quotes, so I can change those manually.

How can I convert these?


Solution

  • Here's my one-liner using repeated sed calls:

    cat file.txt | sed -e 's/"\([^"]*\)"/`\1`/g' | sed '/"/s/`/\"/g' | sed -e 's/`\([^`]*\)`/``\1'\'''\''/g'
    

    (Note: it won't work correctly if the file already contains back-ticks (`), but otherwise it should do the trick.)

    EDIT:

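    Removed the back-tick bug by simplifying. This now handles all the cases in the question (the only catch: on a line with an odd number of " marks, any `` or '' already present is turned into " by the revert steps):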

    cat file.txt | sed -e 's/"\([^"]*\)"/``\1'\'\''/g' | sed '/"/s/``/"/g' | sed '/"/s/'\'\''/"/g'
    

    With comments:

    cat file.txt |                            # read file
      sed -e 's/"\([^"]*\)"/``\1'\'\''/g' |   # initial replace: each "..." pair becomes ``...''
      sed '/"/s/``/"/g' |                     # revert `` to " on lines with a leftover "
      sed '/"/s/'\'\''/"/g'                   # revert '' to " on lines with a leftover "
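
The three sed calls can also be folded into a single invocation, since sed applies each -e expression to the current line in order. Below is a minimal sketch, assuming a POSIX sed and grep; the function names tex_quotes and odd_quote_lines are made up for illustration and are not part of the answer above. The second helper lists the lines the conversion leaves alone (those with an odd number of " marks), so they can be fixed by hand:

```shell
# tex_quotes (hypothetical name): same three steps as the pipeline above,
# run in one sed process. Reads the named files, or stdin if none are given.
tex_quotes() {
  sed -e 's/"\([^"]*\)"/``\1'\'\''/g' \
      -e '/"/s/``/"/g' \
      -e '/"/s/'\'\''/"/g' "$@"
}

# odd_quote_lines (hypothetical name): print, with line numbers, every line
# holding an odd number of " marks, i.e. one " followed by zero or more pairs.
odd_quote_lines() {
  grep -n '^[^"]*"\([^"]*"[^"]*"\)*[^"]*$' "$@"
}
```

For example, `tex_quotes file.txt > file.tex` would write the converted text to a new file, and `odd_quote_lines file.txt` would show the numbered lines that still need attention.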