I am working on a project where users must be able to submit templates containing placeholders, which are later rendered to generate dynamic content.
For example, a user might submit a template like:
"${item.price} - {item.description} / {item.release_date}"
that would later be formatted with the real values.
Using a template engine (such as Django or Jinja) for this purpose would require a lot of validation and sanitizing to prevent SSTI and XSS, so I was wondering if I could use the Python str.format method to create a more limited and safer alternative, since this method cannot execute Python code directly (I believe).
My question is:
Is using str.format safe enough when dealing with user-submitted templates, or would it still be vulnerable to injection attacks?
If it is not, is there any alternative for implementing "safe template rendering" in Python?
No, it is not safe in general to use str.format with user-provided format strings.
Format strings are capable of executing a limited form of Python code. This limited code is just powerful enough to pose significant denial of service (DoS) and data breach risks.
The main factors that make using untrusted format strings risky are:
Format strings can access not only the objects passed directly to str.format, but also just about every object that is indirectly accessible from those objects via indexing and attribute lookups. It turns out that the space of indirectly accessible objects is very large.
Off the top of my head, I came up with this toy example of how str.format with a carefully constructed format string can easily leak sensitive information from your server. This attack relies only on the attacker knowing (or guessing) the types of the arguments being supplied to str.format.
Running this code will print all environment variables defined on the server:
    import os

    class Foo:
        def foo(self):
            pass

    user_format = "{f.foo.__globals__[os].environ}"
    print(user_format.format(f=Foo()))
Specific to Django, this format string will print your entire settings module, complete with your DB passwords, signing keys, and all sorts of other bits of exploitable information:
user_format = "{f.foo.__globals__[sys].modules[myapp.settings].__dict__}"
Simple format strings like "{:10000000000}" can lead to massive memory consumption. This can grind things to a halt due to constant page swapping, or even crash the server process with a MemoryError exception or an OOM kill.
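To make the memory point concrete, here is a minimal sketch using deliberately small widths. The helper name padded_length is hypothetical and only for illustration; it shows that the length of the output is dictated by the width in the format string, not by the size of the value being formatted.

```python
def padded_length(width: int) -> int:
    """Format a one-character string with the given field width
    and return the length of the result."""
    # Build a format string such as "{:10}", as an attacker could submit.
    template = "{:" + str(width) + "}"
    return len(template.format("x"))

# The output grows with the attacker-chosen width, not with the data.
print(padded_length(10))
print(padded_length(1_000_000))
```

Scaling the width up to something like 10000000000 asks the process to allocate a string of that many characters, which is where the swapping or MemoryError comes from.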
Additionally, due to the sheer number of different objects that a format string can interact with, and the variety of special methods that can be called on those objects, there's no telling what other weird and unexpected exceptions could be raised or what resource-intensive operations may be performed. You can imagine a scenario where an ORM manager class has a property that triggers a database transaction when accessed. Such a property could be repeatedly accessed in the format string to stall the server with database interactions.
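The property scenario can be sketched roughly as follows. FakeManager and its count property are made-up stand-ins, not part of Django or any real ORM; the counter simply records how many times the format string caused the "expensive" property to run.

```python
class FakeManager:
    """Hypothetical stand-in for an ORM manager (not a real Django class)."""

    def __init__(self):
        self.queries = 0

    @property
    def count(self):
        # Assumption for illustration: imagine a database query runs here.
        self.queries += 1
        return 42

manager = FakeManager()

# An attacker only needs to repeat the placeholder to repeat the work.
user_format = "{m.count}" * 1000
user_format.format(m=manager)

print(manager.queries)
```

Every occurrence of the placeholder performs a fresh attribute lookup, so the property body runs once per placeholder and the cost is controlled entirely by the submitted template.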