reactjs security dangerouslysetinnerhtml

Is it wise to render a server response using dangerouslySetInnerHTML to the page without sanitization?

I think it is very common to use React's dangerouslySetInnerHTML property to place markup acquired from a server on a page, e.g.

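import { useEffect, useState } from 'react'

const SomeComponent = () => {
  const [markup, setMarkup] = useState(null)

  useEffect(() => {
    // An effect callback cannot be async itself, so define an async helper and call it
    const load = async () => {
      const resp = await fetch(...)
      // A fetch Response exposes its body through text(), not a content property
      setMarkup(await resp.text())
    }
    load()
  }, []) // empty dependency list: fetch once on mount, not after every render

  return <div dangerouslySetInnerHTML={{ __html: markup }} />
}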

If this were a different scenario and the markup were coming from a form on the page, this would clearly pose a risk because you can't trust data entered on the form and we are not doing any sanitization here.

However, we are putting data returned from a server on the page, and so presumably there is some degree of trust. The call to the server occurs in the code and presumably we know the API we are calling.

But is it actually unwise to consider data coming from the server trusted, even when we trust the server? Can a bad actor intervene on the wire before the data comes back?

Solution

Problems with Completely Trusting dangerouslySetInnerHTML

There are a number of reasons to take at least minimal precautions with dangerouslySetInnerHTML. Because the markup the browser renders is produced elsewhere, that elsewhere becomes a point of failure:

Did an internal process for reviewing and revising how you build your HTML code logic fail? Did this allow for XSS attacks?
Did someone forget to renew the SSL cert? The domain registration? And someone already cyber-squatted it, and now your app uses an API from a hijacked domain?
Was a DNS nameserver hacked to point your API domain to a different server? What about a router, or any intermediate piece of networking equipment?
Were your own servers hacked? Least likely (wink), but also possible.

Safely Using dangerouslySetInnerHTML

Sometimes, though, you need dangerouslySetInnerHTML because it is the easiest solution there is. For instance, formatting such as bold and italic is extremely easy to preserve by storing and retrieving it as raw HTML.
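If the stored markup really is limited to simple formatting like bold and italic, one option is to escape everything and then restore only a small allowlist of attribute-free tags. The following is a minimal sketch under that assumption; the names ALLOWED_TAGS, escapeHtml, and allowBasicFormatting are hypothetical and not part of React or any library.

```javascript
// Hypothetical allowlist of attribute-free formatting tags (an assumption for this sketch).
const ALLOWED_TAGS = ["b", "i", "em", "strong", "p", "br"];

// Escape every character that has special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Escape the whole string, then restore only bare allowlisted tags such as <b> or </p>.
// A tag carrying any attribute (for example <b onclick="...">) stays escaped.
function allowBasicFormatting(untrustedHtml) {
  const pattern = new RegExp(`&lt;(/?)(${ALLOWED_TAGS.join("|")})&gt;`, "gi");
  return escapeHtml(untrustedHtml).replace(pattern, "<$1$2>");
}

const fromServer =
  '<p>Hi <b>there</b></p><img src=x onerror="alert(1)"><script>steal()</script>';

// <p> and <b> survive as tags; the <img> and <script> come out as inert, escaped text.
console.log(allowBasicFormatting(fromServer));
```

The result could then be passed as the __html value in place of the raw response. The trade-off is that anything outside the allowlist, including links and images, is shown as literal text rather than rendered.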

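At the very least, please strip any <script> tags from the markup before it reaches the page. You can do this by parsing your HTML into a detached element created with document.createElement(), and then removing any <script> nodes. Treat this as a minimum rather than a complete defense: markup can also run code through inline event-handler attributes such as onerror and through javascript: URLs, neither of which is affected by removing <script> tags.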

Fun fact: SO's demo does not like it when you create an element with a <script> tag! The snippet below will not run, but it does work at: Full Working Demo at JSBin.com.

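// Parse the markup into a detached element so it can be inspected before it is shown
var el = document.createElement( 'html' );
el.innerHTML = "<p>Valid paragraph.</p><p>Another valid paragraph.</p><script>Dangerous scripting!!!</script><p>Last final paragraph.</p>";

// querySelectorAll returns a static list. getElementsByTagName returns a live one that
// shrinks on every removal, so a forward loop over it skips every other <script>.
var scripts = el.querySelectorAll( 'script' );

for(var i = 0; i < scripts.length; i++) {
  scripts[i].remove();
}

// Logs the markup with the <script> element gone
console.log(el.innerHTML);

document.getElementById('main').append(el);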
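
Removing <script> elements leaves inline event handlers and javascript: URLs in place, so a slightly broader pass might look like the sketch below. The function names and the URL_ATTRIBUTES list are hypothetical and deliberately non-exhaustive. The sketch parses into a <template> element because its contents stay inert, whereas assigning to innerHTML on an ordinary detached element can already fire something like an image's onerror handler.

```javascript
// Attributes that carry URLs; an assumed, non-exhaustive list.
const URL_ATTRIBUTES = new Set(["href", "src", "action", "formaction", "xlink:href"]);

// True for inline event handlers (onclick, onerror, ...) and for javascript: URLs.
function isDangerousAttribute(name, value) {
  const lowerName = name.toLowerCase();
  if (lowerName.startsWith("on")) return true;
  if (!URL_ATTRIBUTES.has(lowerName)) return false;
  // Drop whitespace and control characters before checking the scheme,
  // because browsers tolerate them inside URLs.
  const compact = value.replace(/[\u0000-\u0020]/g, "").toLowerCase();
  return compact.startsWith("javascript:");
}

// Remove <script> elements and dangerous attributes from everything below `root`.
function stripScriptsAndHandlers(root) {
  for (const script of root.querySelectorAll("script")) {
    script.remove();
  }
  for (const el of root.querySelectorAll("*")) {
    // Copy the attribute list first, since removing attributes mutates it.
    for (const attr of Array.from(el.attributes)) {
      if (isDangerousAttribute(attr.name, attr.value)) {
        el.removeAttribute(attr.name);
      }
    }
  }
}

// Browser-only wrapper: parse into an inert <template>, clean it, serialize it back.
function cleanMarkup(html) {
  const template = document.createElement("template");
  template.innerHTML = html;
  stripScriptsAndHandlers(template.content);
  return template.innerHTML;
}
```

In a browser, cleanMarkup(markup) could be called before the string is handed to dangerouslySetInnerHTML. This still misses other vectors (for example srcdoc on iframes or data: URLs), so a maintained sanitizer such as DOMPurify is generally the safer choice for anything beyond a demo.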