Tags: regex, uppercase, lookbehind, bibtex, bibliography

Regex to capitalize words


I'm using a text editor to capitalize the titles of papers in a references file.

My entries have a structure like this:

...
Title = {Direct synthesis of antimicrobial coatings based on tailored bi-elemental nanoparticles},
journal={APL Mat.},
...

and I want to capitalize only the words in the title section so that it becomes

...
Title = {Direct Synthesis of Antimicrobial Coatings Based on Tailored Bi-Elemental Nanoparticles},
journal={APL Mat.},
...

I tried using a lookbehind regex to match each word that follows the word "Title", in the following way

(?<=Title)(\b.+?\b)

and I want to substitute it with

\u\1

for each occurrence in the text. However, my regex only selects the characters between the "e" of Title and the "D" of Direct, and does not find any other occurrences after that.
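
The behaviour can be reproduced outside the editor. The following is a minimal sketch in Python (an assumption, since the editor and its regex flavour are not named here), using a shortened sample line. The lookbehind is only satisfied at the single position directly after "Title", so the pattern can match at most once per title line.

```python
import re

# Shortened sample line (hypothetical, based on the entry above).
line = "Title = {Direct synthesis of antimicrobial coatings},"

# The attempted pattern: a lazy match between two word boundaries,
# allowed to start only immediately after the literal text "Title".
attempt = re.compile(r"(?<=Title)(\b.+?\b)")

matches = attempt.findall(line)
print(matches)  # [' = {']
```

The only match is the run of characters between "Title" and "Direct"; no later word is preceded by "Title", so the search stops there.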

Can you help me? Thank you.


Solution

  • You may use

    (\G(?!^)(?:[^}\n\w]+(?:o[fn]|in|the|by|for|to|and*|a))*[^}\n\w]+|Title\s*=\s*\{)(\w+)
    

    and replace with $1\u$2: $1 puts back the text matched before the word, and \u uppercases the first character of $2. See the regex demo (it is slightly modified, since regex101 does not seem to support the \u operator).

    Details

    • (\G(?!^)(?:[^}\n\w]+(?:o[fn]|in|the|by|for|to|and*|a))*[^}\n\w]+|Title\s*=\s*\{) - Group 1: either of the two alternatives:
      • \G(?!^)(?:[^}\n\w]+(?:o[fn]|in|the|by|for|to|and*|a))*[^}\n\w]+:
        • \G(?!^) - the end of the previous match (but not the start of the string)
        • (?:[^}\n\w]+(?:o[fn]|in|the|by|for|to|and*|a))* - 0 or more repetitions of
          • [^}\n\w]+ - 1 or more chars other than }, LF and a word char
          • (?:o[fn]|in|the|by|for|to|and*|a) - of, on, in, the, by, for, to, an, and or a (add more words that should be excluded from capitalization here)
        • [^}\n\w]+ - 1 or more chars other than }, LF and a word char
      • | - or
      • Title\s*=\s*\{ - Title, then = surrounded by 0+ whitespace chars, then a {
    • (\w+) - Group 2: one or more word chars.
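
    Where \G or \u is not available, the same idea can be approximated in two steps. This is a minimal sketch in Python's standard re module (which supports neither), not the regex above: it first isolates the text inside Title = {...} and then capitalizes each word with a callback. The names MINOR_WORDS, TITLE_FIELD and capitalize_titles are hypothetical. Unlike the regex above, it leaves a minor word in lower case even when it is the last word of the title.

```python
import re

# Words kept in lower case; mirrors the alternation in the regex above
# (an assumption: extend this set as needed).
MINOR_WORDS = {"of", "on", "in", "the", "by", "for", "to", "an", "and", "a"}

# Group 1: the "Title = {" prefix; group 2: the title text up to "}" or end of line.
TITLE_FIELD = re.compile(r"(Title\s*=\s*\{)([^}\n]*)")


def _capitalize_body(match):
    """Capitalize every word in one title, except non-initial minor words."""
    prefix, body = match.group(1), match.group(2)
    first = True

    def fix_word(word_match):
        nonlocal first
        word = word_match.group(0)
        keep_lower = word in MINOR_WORDS and not first
        first = False
        # Uppercase only the first character, like \u does.
        return word if keep_lower else word[0].upper() + word[1:]

    return prefix + re.sub(r"\w+", fix_word, body)


def capitalize_titles(text):
    """Apply title capitalization to every Title field in the text."""
    return TITLE_FIELD.sub(_capitalize_body, text)


sample = (
    "Title = {Direct synthesis of antimicrobial coatings based on "
    "tailored bi-elemental nanoparticles},\n"
    "journal={APL Mat.},"
)
print(capitalize_titles(sample))
```

    Because \w+ stops at the hyphen, "bi-elemental" is treated as two words and becomes "Bi-Elemental", while the journal line is left untouched.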