Tags: php, html, security, user-input

How should I clean user input if I am not writing to a database?


I have a simple web page with a textarea. When multiple lines of tab-delimited text are entered, a PHP script processes the text and outputs the resulting string as a text file for the user to download.

The page itself is nothing more than a textarea and a submit button that posts to itself. If POST data is received, the script creates an array of the lines of text in the textarea (using explode("\n", $text)), then loops through each line and creates an array of the sections of text in that line (using explode("\t", $line)). Finally, it takes each element in the array and constructs a string from it. The only other PHP functions used in the script are date() and strtotime(), along with a couple of foreach statements.
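The processing described above can be sketched roughly as follows. This is a minimal illustration, not the original script: the function name buildStudentExport and the three-column layout (first name, last name, year) are assumptions taken from the sample data in the question, and the date()/strtotime() handling is left out because the question does not say how it is used. The sketch also normalizes line endings first, since browsers submit textarea content with \r\n line breaks and a plain explode("\n", ...) would otherwise leave a trailing \r on each line.

```php
<?php
// Hypothetical helper; the name and the column layout are assumptions
// based on the sample input, not the asker's actual code.
function buildStudentExport(string $text): string
{
    // Textareas are submitted with CRLF line endings; reduce them to "\n".
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    $out = '';
    foreach (explode("\n", $text) as $line) {
        if (trim($line) === '') {
            continue; // skip blank lines
        }

        // Split the line into its tab-separated fields.
        $fields = explode("\t", $line);

        // Pad short lines so all three variables are always defined.
        [$first, $last, $year] = array_pad($fields, 3, '');

        $out .= "Student:\n"
              . "firstname:" . $first . "\n"
              . "lastname:" . $last . "\n"
              . "Year:" . $year . "\n"
              . "EndStudent\n";
    }

    return $out;
}
```

Calling buildStudentExport($_POST['text']) would then give the string that is passed to the header() and echo calls, assuming the textarea is named text.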

When the string is complete, it is sent to the browser as a text file using a few header() calls (Content-Disposition: attachment, Content-Type, etc.). So, assuming $str is the string, this is how it is sent to the browser:

header('Content-Disposition: attachment; filename="file.txt"');
header('Content-Type: text/plain');
header('Content-Length: ' . strlen($str));
echo $str;
exit();

For example, suppose the text supplied is this (pretend the spaces between the words are tabs):

John    Smith    Freshman
Jane    Doe      Sophomore

Then the script creates a string as follows:

Student:\nfirstname:John\nlastname:Smith\nYear:Freshman\nEndStudent\nStudent:\nfirstname:Jane\nlastname:Doe\nYear:Sophomore\nEndStudent\n

That string is then sent to the browser as a text/plain file.

The script works fine and the output file is correct. My question is: what kind of security holes am I leaving myself open to, and is it possible to close them (and if so, how)? I would like the page to be public.


Solution

  • Since you're never interpreting the text or executing any code in response to instructions found within it, there's no security concern here.

    The only thing you need to be careful about is special characters: your output uses specific characters with a certain meaning, namely \n. If your input contained a \n and you passed it through to the output as is, would the meaning of your output change? Perhaps you need to escape or remove any \n found in the input.
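One way to do the removal suggested above is to strip control characters from each field before it is written to the output. The sketch below is only an illustration under assumptions: cleanField is a hypothetical helper name, and dropping the characters outright (rather than escaping them) is one possible choice. Note that after explode("\n", ...) and explode("\t", ...) a field can no longer contain \n or \t itself, but a stray \r from the browser's \r\n line endings, or other control characters, can still survive.

```php
<?php
// Hypothetical helper: removes characters that could alter the structure
// of the line-based output format, then trims surrounding whitespace.
function cleanField(string $value): string
{
    // Drop CR, LF, tabs and all other ASCII control characters (0x00-0x1F, 0x7F).
    $value = preg_replace('/[\x00-\x1F\x7F]/', '', $value) ?? '';

    return trim($value);
}
```

Each field would be passed through cleanField() just before it is appended to the output string, for example "firstname:" . cleanField($first) . "\n".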