regex, nuxt.js, markdown, markdown-it

Excluding URLs/Images in markdown from string Regular Expression operation


I'm building an application where users type words into a search bar, and the app highlights and scrolls to those words in an article. Articles come in Markdown format, and I'm using markdown-it to render the article body.

It works well unless the word they search for is part of an image URL. In that case the regular expression is applied to the URL as well, and the image breaks.

    applyHighlights() {
      let str = this.article.body
      let searchText = this.articleSearchAndLocate
      const regex = new RegExp(searchText, 'gi')
      let text = str
      if (this.articleSearchAndLocate == '') {
        return text
      } else {
        const newText = text.replace(
          regex,
          `<span id="searchResult" class="rounded-sm shadow-xl py-0.25 px-1 bg-accent font-semibold text-tint">$&</span>`
        )
        return newText
      }
    }
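To illustrate the problem, here is a minimal reproduction. The sample body, the search term, and the shortened `hl` span class are assumptions made for the example, not values from the real application:

```javascript
// Hypothetical sample data, not taken from the real application
const body = 'A cat photo: ![my cat](https://example.com/cat.png)'
const searchText = 'cat'

// Same approach as applyHighlights(), with a shortened span for readability
const naive = body.replace(
  new RegExp(searchText, 'gi'),
  '<span class="hl">$&</span>'
)

// The span is also inserted into the alt text and the image URL,
// so the Markdown image no longer points at a valid address
console.log(naive)
```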

Is there a way to skip the regex replacement when the match is inside an image URL?


Solution

  • You can use

    applyHighlights() {
      let str = this.article.body
      let searchText = this.articleSearchAndLocate
      // Group 1 captures a Markdown image, ![alt](url); the second alternative
      // is the search text with regex metacharacters escaped
      const regex = new RegExp('(!\\[[^\\][]*]\\([^()]*\\))|' + searchText.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), 'gi')
      let text = str
      if (this.articleSearchAndLocate == '') {
        return text
      } else {
        const newText = text.replace(regex, function(x, y) {
          // y is Group 1: return a matched image unchanged, otherwise wrap the match x
          return y ? y :
            '<span id="searchResult" class="rounded-sm shadow-xl py-0.25 px-1 bg-accent font-semibold text-tint">' + x + '</span>'
        })
        return newText
      }
    }
    

    Here,

    • new RegExp('(!\\[[^\\][]*]\\([^()]*\\))|' + searchText.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), 'gi') creates a regex like (!\[[^\][]*]\([^()]*\))|hello (if the searchText is hello). It either matches a string like ![desc](value) and captures it into Group 1, or it matches hello.
    • .replace(regex, function(x, y) { return y ? y : '<span id="searchResult" class="rounded-sm shadow-xl py-0.25 px-1 bg-accent font-semibold text-tint">' + x + '</span>'; }) means that if Group 1 (y) matched, the return value is y itself, as is (no replacement is performed); otherwise x (the whole match, i.e. the search text) is wrapped in a span tag.
    • .replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') is necessary to support a searchText that contains special regex metacharacters; see "Is there a RegExp.escape function in JavaScript?"
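The three points above can be tried outside the component with a small standalone sketch. The names `escapeRegExp` and `highlight`, the sample strings, and the shortened `hl` span class are hypothetical and only serve the illustration; the regex itself is the one from the answer:

```javascript
// Escape regex metacharacters so the search text is matched literally
function escapeRegExp(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')
}

// Hypothetical standalone version of applyHighlights(); span markup shortened
function highlight(text, searchText) {
  if (searchText === '') return text
  // Group 1: a Markdown image, ![alt](url). Otherwise: the escaped search text
  const regex = new RegExp(
    '(!\\[[^\\][]*]\\([^()]*\\))|' + escapeRegExp(searchText),
    'gi'
  )
  return text.replace(regex, (match, image) =>
    // If Group 1 matched, keep the image untouched; else wrap the match
    image ? image : '<span class="hl">' + match + '</span>'
  )
}

// Hypothetical sample data
const body = 'A cat photo: ![my cat](https://example.com/cat.png)'
console.log(highlight(body, 'cat'))

// Metacharacters in the search text are treated literally
console.log(highlight('Price: $5 (approx.)', '$5'))
```

Note that the image pattern uses [^()]* for the URL part, so an image whose URL contains parentheses would not be recognized as a whole, and only image syntax is protected, not ordinary links.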