
Using sed to erase field in bibtex entry


I have a text file containing multiple BibTeX entries like this one:

@article{Lindgren1989Resonant,
    abstract = {Using a simple model potential, a truncated image barrier, for the
Al(111) surface, one obtains a resonant bound surface state at an energy
that agrees surprisingly well with recent observations by inverse
photoemission.},
    author = {Lindgren and Walld\'{e}n, L.},
    citeulike-article-id = {9286612},
    citeulike-linkout-0 = {http://dx.doi.org/10.1103/PhysRevB.40.11546},
    citeulike-linkout-1 = {http://adsabs.harvard.edu/cgi-bin/nph-bib\_query?bibcode=1989PhRvB..4011546L},
    doi = {10.1103/PhysRevB.40.11546},
    journal = {Phys. Rev. B},
    keywords = {image-potential, surface-states},
    month = dec,
    pages = {11546--11548},
    posted-at = {2011-05-12 11:42:49},
    priority = {0},
    title = {Resonant bound states for simple metal surfaces},
    url = {http://dx.doi.org/10.1103/PhysRevB.40.11546},
    volume = {40},
    year = {1989}
}

I want to erase the abstract field, which can span one or more lines (as in the case above). I tried using sed in the following manner:

sed "/^\s*${field}.*=/,/},?$/{
    d
}" file

where file is a text file containing the above bibtex code. However, the output of this command is just

@article{Lindgren1989Resonant,

Obviously sed is matching up to the final }, but how do I get it to stop at the closing brace of the abstract value?


Solution

  • This might work for you:

    sed '1{h;d};H;${x;s/\s*abstract\s*=\s*{[^}]*}\+,//g;p};d' file
    

    This slurps the whole file into the hold space, then deletes every abstract field in a single substitution.

    Explanation:

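    On the first line, the hold space (HS) is overwritten with the current line (h) and the pattern space is deleted (d). Every subsequent line is appended to the HS (H). On the last line ($), the HS is swapped into the pattern space (x), all occurrences of the abstract field are removed by the substitution, and the result is printed (p). N.B. the trailing d deletes every line that would normally be printed automatically, so the only output comes from that final p.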
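
The substitution above relies on [^}]*, which stops at the first closing brace, so a value with braces nested in the middle (for example {Al}(111)) would likely be left untouched. It also uses \s and \+, which are GNU sed extensions. As a possible alternative, here is a minimal sketch that switches from sed to awk and counts braces line by line. The function name strip_field is hypothetical, and the sketch assumes that each field starts on its own line, that the field name contains no regex metacharacters, and that the value is delimited by balanced, unescaped braces.

```shell
# strip_field FIELD FILE  (hypothetical helper, not part of the original answer)
# Prints FILE without the given BibTeX field, tracking brace depth so that
# multi-line values and nested braces are skipped as a whole.
strip_field() {
    awk -v field="$1" '
    # Net change in brace depth contributed by one line: opens minus closes.
    function depth_change(s,    t, opens, closes) {
        t = s; opens  = gsub(/[{]/, "", t)
        t = s; closes = gsub(/[}]/, "", t)
        return opens - closes
    }
    # Start of the unwanted field: drop the line and remember the open depth.
    !skipping && $0 ~ ("^[ \t]*" field "[ \t]*=") {
        depth = depth_change($0)
        if (depth > 0) skipping = 1   # value continues on later lines
        next
    }
    # Inside a multi-line value: drop lines until the braces balance again.
    skipping {
        depth += depth_change($0)
        if (depth <= 0) skipping = 0
        next
    }
    # Everything else is passed through unchanged.
    { print }
    ' "$2"
}
```

Called as strip_field abstract file, it writes the entries without their abstract field to standard output and leaves file itself unchanged. Escaped braces or values delimited by double quotes would need extra handling that this sketch does not attempt.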