javascript reactjs html-sanitizing

How can I collapse multiple consecutive <br> tags in the input string into a single <br> in the output with sanitize-html?


I have an issue, related to the sanitize-html library, here's my code:

import sanitizeHtml from 'sanitize-html';


const allowedTags = [
  'h1',
  'h2',
  'h3',
  'h4',
  'h5',
  'p',
  'a',
  'strong',
  'i',
  'ul',
  'ol',
  'li',
  'em',
  'br',
];
export const sanitizeEventDescription = (
  description: string,
  keepNewline: boolean
) =>
  sanitizeHtml(description, {
    allowedTags,
    transformTags: {
      a: function transform(tagName, attribs) {
        const href = attribs.href; // keep the link target unchanged

        return {
          tagName: 'a',
          attribs: {
            ...attribs,
            href,
            target: '_blank',
            rel: 'noopener noreferrer',
          },
        };
      },
    },
    exclusiveFilter(frame) {
      return !frame.text.trim() && frame.tag !== 'br';
    },
    textFilter(text) {
      return text.replace(/(\n|<br>)+/g, '<br>').replace(/<br><br>/g, '<br>');
    },
  });

When I provide an input string with consecutive \n characters, the textFilter function replaces them with a single <br> tag as expected. However, when I provide an input string with consecutive <br> tags, the textFilter function doesn't seem to collapse them as intended. Instead, I still get multiple <br> tags in a row in the output.

For example, if I pass in the input string "testing carriage <br><br>returns.", I still get multiple <br> tags in a row in the output. However, if I pass in the input string "testing carriage \n\nreturns.", I get a single <br> tag in the output.

I've tried adding <br> to the allowedTags constant and adding frame.tag !== 'br' to the exclusiveFilter function, but this only lets <br> tags through to the output and doesn't help with collapsing consecutive <br> tags.

Can anyone explain why the textFilter function isn't collapsing multiple <br> tags in a row and how I can modify my code to achieve the desired behavior?


Solution

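  • sanitize-html doesn't provide functionality to collapse consecutive
    <br> tags. textFilter only receives the text between tags, so the <br>
    elements themselves never reach its regex. A simple option is to collapse
    runs of <br> tags and \n in the input string before sanitizing it, like this: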

    import sanitizeHtml from 'sanitize-html';
    
    
    const allowedTags = [
      'h1',
      'h2',
      'h3',
      'h4',
      'h5',
      'p',
      'a',
      'strong',
      'i',
      'ul',
      'ol',
      'li',
      'em',
      'br',
    ];
    export const sanitizeEventDescription = (
      description: string,
      keepNewline: boolean
    ) =>
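      // Collapse runs of <br> tags and/or newlines (optionally separated by whitespace) before sanitizing.
      sanitizeHtml(description.replace(/(?:<br\s*\/?>|\n)(?:\s*(?:<br\s*\/?>|\n))+/gi, '<br>'), {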
        allowedTags,
        transformTags: {
          a: function transform(tagName, attribs) {
            const href = attribs.href; // keep the link target unchanged
    
            return {
              tagName: 'a',
              attribs: {
                ...attribs,
                href,
                target: '_blank',
                rel: 'noopener noreferrer',
              },
            };
          },
        },
        exclusiveFilter(frame) {
          return !frame.text.trim() && frame.tag !== 'br';
        },
        textFilter(text) {
          // Only text nodes arrive here, never <br> tags, so just convert newlines.
          return text.replace(/\n+/g, '<br>');
        },
      });
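
The pre-processing step above can also be pulled out into a small helper so it can be checked on its own, without sanitize-html. This is only a sketch: the names collapseLineBreaks and LINE_BREAK_RUN are hypothetical, not part of the library. It also assumes that runs separated by whitespace (for example "<br> <br> <br>") should collapse too, and that a lone \n or <br> should be left alone.

```typescript
// Hypothetical helper, not part of sanitize-html.
// Matches a <br> tag or newline followed by one or more further <br> tags
// or newlines, each optionally preceded by whitespace.
// Accepts <br>, <br/>, <br /> in any letter case.
const LINE_BREAK_RUN = /(?:<br\s*\/?>|\n)(?:\s*(?:<br\s*\/?>|\n))+/gi;

// Collapse every run of two or more line breaks into a single <br>.
// A lone <br> or a lone newline is left untouched.
function collapseLineBreaks(html: string): string {
  return html.replace(LINE_BREAK_RUN, '<br>');
}

console.log(collapseLineBreaks('testing carriage <br><br>returns.'));
// prints "testing carriage <br>returns."
```

With this in place, the call in the answer would become sanitizeHtml(collapseLineBreaks(description), { ... }), with the rest of the options unchanged.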