python configuration sandbox

Possible to sandbox Python configuration file?


I'm thinking of implementing a configuration file written in Python syntax, not unlike what Django does.

While I've seen one or two SO questions about the merits of using executable code in configuration files, I'm curious whether there is a way to execute the config file code in a "sandbox" to prevent mistakes in the code from locking up the host application.

Because the host application is a programmer's tool, I'm not concerned about teaching Python syntax or introducing security holes as mentioned in at least one other SO question. But I am worried about the configuration code branching to Fishkill and wedging the host app. I'd much rather that the host app trap those problems and display diagnostic error information.

Has anyone tried this sort of sandboxing for a Python configuration file? And, if so, what techniques proved useful, and what pitfalls cropped up that I should be aware of?


Solution

  • We do this for some of our internal tools.

    What we do protects us from exception issues and discourages users from getting overly creative in the config scripts. However, it doesn't protect us from infinite loops or actively malicious third parties.

    The core of the approach here is to run the script in a locked down exec.

    1. First we go through the __builtin__ module (named builtins in Python 3) and del everything we don't want the scripts to be able to touch, especially __import__. We do this in a context manager that backs up the original values and deletes them on the way in, then restores them on the way back out.

    2. Next we create an empty dictionary to be the config script's namespace.

    3. Then we exec the config with the namespace.

    4. The exec is of course wrapped in a try/except that will catch anything.

    5. And finally we inspect the namespace to extract the variables we are interested in.
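
    The five steps above can be sketched roughly as follows. This is a minimal Python 3 illustration, not our actual tool: the names BLOCKED, stripped_builtins and load_config are hypothetical, and the list of removed builtins is an assumption, since which names to delete depends on the host application.

```python
import builtins
from contextlib import contextmanager

# Hypothetical block list; choose the names that matter for your host app.
BLOCKED = ("__import__", "open", "eval", "exec", "compile", "input")

# Keep a private reference, because "exec" itself is removed from builtins
# while the config script runs.
_exec = exec


@contextmanager
def stripped_builtins(names=BLOCKED):
    """Delete selected builtins on entry and restore them on exit."""
    saved = {}
    try:
        for name in names:
            if hasattr(builtins, name):
                saved[name] = getattr(builtins, name)
                delattr(builtins, name)
        yield
    finally:
        # Always put the originals back, even if the script raised.
        for name, value in saved.items():
            setattr(builtins, name, value)


def load_config(source):
    """Run config source text; return (namespace, None) or (None, error)."""
    namespace = {}  # step 2: empty namespace for the script
    try:
        # Compile first, while "compile" is still available.
        code = compile(source, "<config>", "exec")
        with stripped_builtins():      # step 1: locked-down builtins
            _exec(code, namespace)     # step 3: run the script
    except Exception as exc:           # step 4: trap script errors
        # Note: SystemExit and KeyboardInterrupt are not Exception
        # subclasses; catch BaseException if those should be trapped too.
        return None, exc
    namespace.pop("__builtins__", None)  # added by exec, not by the script
    return namespace, None               # step 5: caller inspects this


namespace, error = load_config("port = 8080\nname = 'demo'")
print(namespace, error)
```

    Note that the builtins are removed process-wide while the script runs, so in this sketch other threads would see them missing too, and an infinite loop in the script still hangs the caller.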

    Points to note here:

    1. It might be tempting to prepopulate the namespace with things that might be useful to the config script, but be very careful doing that, because you quickly open up hooks back into the host program.

    2. The config scripts can still create functions and classes, so you might get back something that looks like a string, for example, but is actually an arbitrary blob of executable code.

    Because of these, we impose the restriction that our config scripts are expected to produce pure primitive data structures (generally just ints, strings, lists, tuples and None), which we then verify separately.
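
    A separate verification pass of that kind might look like the sketch below. The helper names is_plain_data and extract_settings are hypothetical, and the exact-type check is one possible policy: it rejects subclasses (a class derived from str, say) and, as written, also rejects bool, float and dict because they are not in the list above.

```python
def is_plain_data(value, depth=0, max_depth=20):
    """Return True if value is built only from int, str, list, tuple, None.

    type(...) is compared exactly, so subclasses that could carry
    arbitrary code are rejected. max_depth is an assumed guard against
    deeply nested or self-referencing lists.
    """
    if depth > max_depth:
        return False
    if value is None:
        return True
    if type(value) in (int, str):
        return True
    if type(value) in (list, tuple):
        return all(is_plain_data(item, depth + 1, max_depth) for item in value)
    return False


def extract_settings(namespace, wanted):
    """Pick the wanted names out of the script namespace and verify them."""
    settings = {}
    for name in wanted:
        if name not in namespace:
            continue  # the host app can fall back to its own default
        value = namespace[name]
        if not is_plain_data(value):
            raise TypeError(
                f"config value {name!r} is not plain data: "
                f"{type(value).__name__}"
            )
        settings[name] = value
    return settings


# Anything the host did not ask for (such as "helper") is simply ignored.
print(extract_settings({"port": 8080, "helper": len}, ["port", "name"]))
```

    Only names the host explicitly asks for are read, so helper functions or classes the script defined for its own use never reach the application.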