Using preg_replace to linkify DOI


I'm looping through some text with embedded literature references. Some of these are DOI numbers, and I need to linkify them.

Example text:

<div>Interesting article here:  doi:10.1203/00006450-199305000-00005</div>

What I've tried so far:

$html = preg_replace("\b(10[.][0-9]{4,}(?:[.][0-9]+)*/(?:(?![\"&\'<>])[[:graph:]])+)\b", "<a href='https://doi.org/\\0' target='_new'>doi:\\0</a>",$html);

This returns an empty string.

I'm expecting:

<div>Interesting article here:  <a href='https://doi.org/10.1203/00006450-199305000-00005' target='_new'>doi:10.1203/00006450-199305000-00005</a></div>

Where am I going wrong?

edit 2018-01-30: updated DOI resolver per Katrin's answer below.


Solution

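  • The original call failed because the pattern had no delimiters: preg_replace() then returns NULL, which prints as an empty string. Using a regular expression test tool, I found an expression that works for my example text: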

    // The outer ( ) serve as the pattern delimiters, so $0 is the full match.
    $pattern        = '(10[.][0-9]{4,}[^\s"/<>]*/[^\s"<>]+)';
    $replacement    = "<a href='https://doi.org/$0' target='_new'>doi:$0</a>";
    $html = preg_replace($pattern, $replacement, $html);
    

    hth
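
For reference, here is a minimal sketch of a variant that also swallows an existing "doi:" prefix, so the prefix is not left sitting in front of the generated link. The function name linkify_doi is hypothetical, and the choice of ~ as an explicit delimiter and of $1 as the captured DOI are assumptions of this sketch, not part of the posts above.

```php
<?php
// Hypothetical helper (name not from the original posts): wraps DOIs in links.
function linkify_doi(string $html): string
{
    // '~' is the delimiter, so the '/' inside a DOI needs no escaping.
    // The optional, case-insensitive "doi:" prefix is consumed by the match
    // but kept out of capture group 1, which holds the bare DOI.
    $pattern     = '~(?:\bdoi:\s*)?\b(10\.[0-9]{4,}(?:\.[0-9]+)*/[^\s"&\'<>]+)~i';
    $replacement = "<a href='https://doi.org/$1' target='_new'>doi:$1</a>";

    $result = preg_replace($pattern, $replacement, $html);

    // preg_replace() returns null on error; fall back to the untouched input.
    return $result ?? $html;
}
```

For the example text in the question this should produce the markup shown under "I'm expecting". Two limitations to be aware of: a DOI followed directly by sentence punctuation (for example a trailing period) will have that character included in the link, and running the function twice over the same text will re-match the DOI inside the href it already generated.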