Tags: pdf, csrf, content-security-policy, websecurity

JavaScript execution in PDFs inside browsers: What is the best practice to handle this securely?


We are currently working on a file-server-like implementation that serves user-uploaded content. To mitigate CSRF attacks, we serve all content with a CSP header that disallows any script execution in the content.

The CSP header we use looks something like this:

Content-Security-Policy: default-src 'none'; media-src 'self'; style-src 'unsafe-inline'; font-src data: ; connect-src 'self'; sandbox

This works fine for all sorts of media like HTML or SVG. But we have noticed that browsers do not apply the CSP rules to PDF files with embedded JavaScript. When you open such a PDF in a browser, the JavaScript is executed regardless of the value of the CSP header.

The context in which the PDF's JavaScript executes seems to be quite limited: there appears to be no way to access the site context, such as window or document, and no way to perform HTTP requests from inside the PDF's JavaScript context.

In my research, I found it quite difficult to find reliable sources on how browsers handle JavaScript in PDFs, or on whether it is a security risk. Most sources suggest sanitizing the PDF files by stripping the JavaScript from them, but that would be inconvenient if the user actually wants to serve interactive PDF files, which the spec seemingly allows.

Do you have any experience in this topic on how to securely serve PDFs via a web server? Are there any best practices?


Solution

  • JavaScript inside of PDFs that use the browser's native PDF viewer will indeed execute with a limited scope and in a sandboxed way.

    However, a future theoretical vulnerability could circumvent this. Depending on your security posture, you may want to use the CSP to mitigate this risk. You may also just not want "irritating" PDFs to be displayed with those dynamic features (they can trigger the browser alert() function, for example).

    This is a very narrow case we are discussing, so I stress again that whether you go out of your way to be defensive about this depends on your security posture.

    As you have discovered, scripts executed as part of browser "plugins" aren't subject to the CSP. They are outside of its control. To the CSP they are just opaque "objects" loaded via <embed>, <object>, <iframe> or otherwise. Therefore, object-src and also frame-ancestors can be used to control whether they are allowed to appear at all inside of your content.

    However, they are "all or nothing" switches and can't control script execution inside the PDF. You can only control whether they can be embedded or not. And I would guess that you attach a certain amount of product value to being able to embed the PDFs in your use case.
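
As a rough illustration of those two switches, here is a minimal Python sketch that appends object-src and frame-ancestors to the policy from the question. The helper name build_csp and its parameters are made up for this example and are not part of any framework. Note that object-src already falls back to default-src, so the explicit directive mainly documents intent, while frame-ancestors has no such fallback and must be set explicitly.

```python
# Base policy copied from the question; "sandbox" takes no values here.
BASE_DIRECTIVES = {
    "default-src": ["'none'"],
    "media-src": ["'self'"],
    "style-src": ["'unsafe-inline'"],
    "font-src": ["data:"],
    "connect-src": ["'self'"],
    "sandbox": [],
}


def build_csp(object_sources=("'none'",), frame_ancestors=("'none'",)):
    """Return a CSP header value with explicit embed controls.

    Hypothetical helper: object_sources controls what <object>/<embed>
    may load; frame_ancestors controls who may frame this response.
    """
    directives = dict(BASE_DIRECTIVES)
    directives["object-src"] = list(object_sources)
    directives["frame-ancestors"] = list(frame_ancestors)
    # A directive is its name followed by its space-separated values.
    return "; ".join(" ".join([name, *values]) for name, values in directives.items())


if __name__ == "__main__":
    # Block native-viewer embeds entirely.
    print("Content-Security-Policy:", build_csp())
    # Allow only same-origin embeds, e.g. from a "stripped" PDF location.
    print("Content-Security-Policy:", build_csp(object_sources=["'self'"]))
```

Either variant is still "all or nothing" per source: it decides whether a PDF may be embedded, not what the PDF's scripts may do.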

    If you want to press ahead with some solution, there are some potential options with different tradeoffs:

    • Use a JS based PDF viewer like pdf.js with enableScripting set to false.
      • This example is AngularJS specific, but I couldn't find anything else with a live demo. It shows that the JS in the PDF can be disabled.
      • You would then use object-src and/or frame-ancestors to block the loading of PDFs via their native viewer, forcing them to be viewed via the library in embedded scenarios.
      • This has some downsides in that you are putting a certain amount of trust in a third party library. Ensure enableScripting works as you expect.
      • It does mean that whether scripts in the PDF are enabled can easily be turned on/off and stays under the control of the application.
      • You could encourage users who actually do need scripting to download the PDF and open it locally. This would mean you are not exposed. It should be possible to detect, via pdf.js, whether a PDF contains scripts that were not executed.
    • Theoretically, you could strip the JS from the PDF server side.
      • Research would be needed into what libs could help you do this in your server technology.
      • You would need to use object-src and/or frame-ancestors to make it so that only PDFs from your own server (the "cached"/"stripped" location) are allowed.
      • Feels like it has a fair amount of overhead and a bit of a burden.
      • As you mentioned, it is bulky if you need some switch to enable scripting. You inevitably would need some UI still to embed the pdf in a frame, and notify the user to optionally reach out to some "raw" link if it does indeed contain JS.
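
For the server-side route, a cheap first step is to flag uploads that appear to contain scripts, so the UI can offer the "raw" link only for those. The following is a heuristic sketch in Python and not a sanitizer: the function name may_contain_javascript is invented for this example, and it only scans the raw bytes for the /JS and /JavaScript name tokens. It will miss scripts hidden in compressed object streams or encrypted files, and it can produce false positives on binary stream data, so a real PDF parser would be needed for anything stronger.

```python
import re

# PDF name tokens that indicate a JavaScript action (assumed to be
# sufficient for a first-pass check; extend as needed).
SUSPICIOUS_NAMES = {b"JS", b"JavaScript"}

# A PDF name is "/" followed by characters that are neither whitespace
# nor delimiters.
_NAME_RE = re.compile(rb"/([^\x00\t\n\x0c\r ()<>\[\]{}/%]+)")
# Names may escape any byte as "#" plus two hex digits, e.g. /J#61vaScript.
_HEX_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def _decode_name(raw: bytes) -> bytes:
    """Resolve #xx escapes so obfuscated names compare equal."""
    return _HEX_RE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def may_contain_javascript(pdf_bytes: bytes) -> bool:
    """Heuristically report whether the raw PDF bytes mention JavaScript."""
    for match in _NAME_RE.finditer(pdf_bytes):
        if _decode_name(match.group(1)) in SUSPICIOUS_NAMES:
            return True
    return False


if __name__ == "__main__":
    sample = b"<< /Type /Action /S /JavaScript /JS (app.alert(1)) >>"
    print(may_contain_javascript(sample))
```

Because of the blind spots above, a negative result should be treated as "nothing found", not as proof that the file is script-free.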

    Now, if you really want to avoid even the "switch", you might be tempted to reach for something like pdf.js with scripting enabled, as it would give one consistent sandbox behavior across all browsers.

    However, I would caution against this, because that sandbox is enforced in userland and is likely, therefore, more vulnerable than the default PDF viewer.

    There is no absolute answer. What you choose depends on the tradeoffs that make sense for you.