PHP Security (strip_tags, htmlentities)


I'm working on a personal project and, besides using prepared statements, I would like to treat every input as a threat. For that I made a simple function.

function clean($input){
    if (is_array($input)){
        foreach ($input as $key => $val){
            $output[$key] = clean($val);
        }
    }else{
        $output = (string) $input;
        if (get_magic_quotes_gpc()){
            $output = stripslashes($output);
        }
        $output = htmlentities($output, ENT_QUOTES, 'UTF-8');
    }
    return $output;
}

Is this enough, or should I also use the following code?

        $output = mysqli_real_escape_string($base, $input);
        $output = strip_tags($output);

Sorry, this could be a silly question, but I would like to avoid any problems with my code :) Thanks for your help.


Solution

  • I applaud your efforts. You must, friendly community member, consider decoupling your operations.

    1) Have one function/routine/class/method for filtering input (filter_input_array(), strip_tags(), str_ireplace(), trim(), etc ...). You may want to create functions that use loops to do the filtering. Tricks such as double encoding, one-time-strip-spoofing, and more can defeat a single pass of functions like strip_tags().

    Here is a strip_tags() wrapper method from my Sanitizer class. Notice how it compares the old value to the new value to see if they are equal. If they are not equal, it keeps on using strip_tags(). Note that quite a bit of preliminary INPUT_POST / $_POST checking is done before this method is executed. Another version of this using trim() is actually executed before this one.

    private function removeHtml(&$value)
    {
        if (is_scalar($value)) {
            // Re-run strip_tags() until a pass changes nothing.
            do {
                $old = $value;
                $value = strip_tags($value);
            } while ($value !== $old);
        } elseif (is_array($value) && !empty($value)) {
            // Same repeated stripping for each element of a flat array.
            foreach ($value as &$string) {
                do {
                    $old = $string;
                    $string = strip_tags($string);
                } while ($string !== $old);
            }
            unset($string); // Break the reference left behind by foreach.
        } else {
            throw new Exception('The data being HTML sanitized is neither scalar nor in an array.');
        }
    }
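
    The method above covers scalars and flat arrays. For nested input (an array of arrays), a recursive variant is one option. The following is only a minimal sketch and is not part of the Sanitizer class; the function names stripTagsUntilStable() and stripTagsRecursive() are hypothetical.

    ```php
    <?php
    // Hypothetical helper: repeat strip_tags() until a pass changes nothing.
    function stripTagsUntilStable(string $value): string
    {
        do {
            $old = $value;
            $value = strip_tags($value);
        } while ($value !== $old);

        return $value;
    }

    // Hypothetical helper: apply the repeated stripping to every string leaf
    // of a (possibly nested) array; non-string leaves are left untouched.
    function stripTagsRecursive(array $data): array
    {
        array_walk_recursive($data, function (&$leaf) {
            if (is_string($leaf)) {
                $leaf = stripTagsUntilStable($leaf);
            }
        });

        return $data;
    }

    $input = ['name' => '<b>Ann</b>', 'tags' => ['<i>php</i>', 42]];
    $clean = stripTagsRecursive($input);
    // $clean['name'] is 'Ann', $clean['tags'][0] is 'php', $clean['tags'][1] stays 42
    ```

    Returning a new array instead of modifying by reference is just a design choice for the sketch; it keeps the original input available for logging or comparison.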
    

    2) Have another one for validating input (filter_var_array(), preg_match(), mb_strlen(), etc...)
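
    As a rough illustration of the validation step, the sketch below checks a few fields with filter_var() and preg_match(). The field names, the username rule, and the age range are assumptions made up for the example, and validateSignup() is a hypothetical name.

    ```php
    <?php
    // Hypothetical validator: returns the names of the fields that failed
    // (an empty array means everything passed).
    function validateSignup(array $input): array
    {
        $errors = [];

        // Assumed rule: 3 to 20 ASCII letters, digits, or underscores.
        // \A and \z anchor the whole string, so a trailing newline is rejected.
        if (!isset($input['username']) || !is_string($input['username'])
            || preg_match('/\A\w{3,20}\z/', $input['username']) !== 1) {
            $errors[] = 'username';
        }

        if (!isset($input['email'])
            || filter_var($input['email'], FILTER_VALIDATE_EMAIL) === false) {
            $errors[] = 'email';
        }

        // Assumed rule: age is an integer between 13 and 120.
        $ageOptions = ['options' => ['min_range' => 13, 'max_range' => 120]];
        if (!isset($input['age'])
            || filter_var($input['age'], FILTER_VALIDATE_INT, $ageOptions) === false) {
            $errors[] = 'age';
        }

        return $errors;
    }

    $good = validateSignup(['username' => 'ann_01', 'email' => 'ann@example.com', 'age' => '30']);
    $bad  = validateSignup(['username' => '<script>', 'email' => 'not-an-email', 'age' => '7']);
    // $good is an empty array; $bad lists all three field names
    ```

    Validation here only accepts or rejects; it does not modify the data. That keeps it separate from the filtering step in 1).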

    Then, when your data needs to switch contexts ...

    A) For databases, use prepared statements (PDO, preferably).

    B) For returning / transmitting user input to the browser, escape the output with htmlentities() or htmlspecialchars() accordingly.
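
    To make the two context switches concrete, here is a minimal sketch. It assumes the pdo_sqlite extension and an in-memory SQLite database purely so the example is self-contained; the comments table and its body column are hypothetical, and a real project would connect to its own database.

    ```php
    <?php
    // Assumption: pdo_sqlite is available. Other PDO drivers work the same way.
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

    // Hostile-looking input, stored exactly as received.
    $body = "Robert'); DROP TABLE comments;-- <script>alert(1)</script>";

    // A) Database context: the value travels as a bound parameter, not as SQL text.
    $insert = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
    $insert->execute([':body' => $body]);

    $stored = $pdo->query('SELECT body FROM comments')->fetchColumn();

    // B) Browser context: escape at output time, not before storage.
    $safeHtml = htmlspecialchars($stored, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    echo '<p>' . $safeHtml . '</p>', PHP_EOL;
    ```

    The stored value is byte-for-byte what was submitted; only the copy sent to the browser is escaped.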

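    In terms of magic quotes, the best thing to do is just disable them in php.ini. The feature was removed entirely in PHP 5.4, and get_magic_quotes_gpc() itself was removed in PHP 8.0, so on current PHP versions that check should simply be dropped.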

    Now, with those various constructs having their own areas of responsibility, all you have to do is manage the flow of logic and data inside of your handler file. This includes providing error messages to the user (when necessary) and handling errors/exceptions.

    There is no need to use htmlentities() or htmlspecialchars() immediately if the data is going from the HTML form directly into the database. The point of escaping data is to prevent it from being interpreted as executable instructions inside a new context. Neither function protects against anything when data is passed to a SQL query engine; that is why you filter and validate the input and use (PDO) prepared statements.

    However, once the data is retrieved from database tables and is directly destined for the browser, ok, now use htmlentities() or htmlspecialchars(). Create a function that uses a for or foreach loop to handle that scenario.

    Here is a snippet from my Escaper class:

    public function superHtmlSpecialChars($html)
    {
        return htmlspecialchars($html, ENT_QUOTES | ENT_HTML5, 'UTF-8', false);
    }
    
    public function superHtmlEntities(&$html)
    {
        $html = htmlentities($html, ENT_QUOTES | ENT_HTML5, 'UTF-8', false);
    }
    
    public function htmlSpecialCharsArray(array &$html)
    {       
        foreach ($html as &$value) {
            $value = $this->superHtmlSpecialChars($value);
        }
    
        unset($value);
    }
    
    public function htmlEntitiesArray(array &$html)
    {       
        foreach ($html as &$value) {
            $this->superHtmlEntities($value);
        }
    
        unset($value);
    }
    

    You'll have to tailor your code to your own personal tastes and situation.

    Note, if you plan on processing the data before sending it to the browser, do the processing first, then escape with your handy-dandy htmlentities() or htmlspecialchars looping function.
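
    One reason the order matters: escaping changes the length and content of a string, so later processing can cut an entity in half. A small sketch, using substr() on ASCII text for simplicity (multibyte text would call for mb_substr()); the sample string and the 7-character limit are arbitrary.

    ```php
    <?php
    $raw = 'Tom & Jerry';

    // Wrong order: escape, then truncate. The entity &amp; is cut in half.
    $escapedFirst = substr(htmlspecialchars($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8'), 0, 7);

    // Right order: truncate the raw text, then escape the result.
    $processedFirst = htmlspecialchars(substr($raw, 0, 7), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    echo $escapedFirst, PHP_EOL;   // Tom &am
    echo $processedFirst, PHP_EOL; // Tom &amp; J
    ```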

    You can do it!