Best way to automatically remove comments from PHP code


What’s the best way to remove comments from a PHP file?

I want something similar to php_strip_whitespace(), except that it should remove only the comments and leave the line breaks in place.

For example,

I want this:

<?PHP
// something
if ($whatsit) {
    do_something(); # we do something here
    echo '<html>Some embedded HTML</html>';
}
/* another long
comment
*/
some_more_code();
?>

to become:

<?PHP
if ($whatsit) {
    do_something();
    echo '<html>Some embedded HTML</html>';
}
some_more_code();
?>

(Although if the empty lines remain where comments are removed, that wouldn't be OK.)

It may not be possible because of the requirement to preserve embedded HTML. That is what has tripped up the solutions I found on Google.


Solution

  • I'd use the tokenizer (token_get_all()). Here's my solution. It should work on both PHP 4 and 5:

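    $fileStr = file_get_contents('path/to/file');
    $newStr  = '';

    // Token types to drop. Which constants exist depends on the PHP version.
    $commentTokens = array(T_COMMENT);

    if (defined('T_DOC_COMMENT')) {
        $commentTokens[] = T_DOC_COMMENT; // PHP 5
    }

    if (defined('T_ML_COMMENT')) {
        $commentTokens[] = T_ML_COMMENT;  // PHP 4
    }

    $tokens = token_get_all($fileStr);

    foreach ($tokens as $token) {
        // Array tokens are array(id, text, line); plain strings are
        // single characters such as ';' or '{'.
        if (is_array($token)) {
            if (in_array($token[0], $commentTokens)) {
                continue; // skip the comment
            }

            $token = $token[1];
        }

        // Everything else, including strings and inline HTML, is copied verbatim.
        $newStr .= $token;
    }

    echo $newStr;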
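
The loop above drops the comment tokens but leaves the surrounding whitespace alone, so a comment that sat on its own line can leave an empty line behind. Also, before PHP 8.0 a `//` or `#` comment token includes its trailing line break, so dropping it can join two lines. Below is a rough sketch of one way to handle both cases. It is not part of the original answer: the function name strip_php_comments() is made up for illustration, it assumes PHP 7.1 or later, and it has not been checked against every edge case (for example, a comment wedged between two keywords with no whitespace).

```php
<?php
// Hypothetical helper: removes comments and the lines they occupied,
// leaving strings and inline HTML untouched.
function strip_php_comments(string $source): string
{
    $tokens = token_get_all($source);
    $count  = count($tokens);
    $out    = '';

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];

        if (!is_array($token)) {
            $out .= $token; // single-character token such as ';'
            continue;
        }

        [$id, $text] = $token;

        if ($id !== T_COMMENT && $id !== T_DOC_COMMENT) {
            $out .= $text; // code, strings, inline HTML: copy verbatim
            continue;
        }

        // Find the line break that ends the comment's line, if any.
        // PHP < 8 keeps it inside a single-line comment token; PHP 8
        // puts it at the start of the following T_WHITESPACE token.
        $inComment = preg_match('/\r?\n$/', $text, $m) === 1;
        $next      = $tokens[$i + 1] ?? null;
        $inNext    = is_array($next) && $next[0] === T_WHITESPACE
            && preg_match('/^\r?\n/', $next[1]) === 1;

        if (!$inComment && !$inNext) {
            continue; // comment in the middle of a line: just drop it
        }

        // Remove the spaces or tabs that led up to the comment.
        $out     = rtrim($out, " \t");
        $ownLine = $out === '' || substr($out, -1) === "\n";

        if ($ownLine) {
            // The comment had the line to itself: swallow its line break.
            if (!$inComment && $inNext) {
                $tokens[$i + 1][1] = preg_replace('/^\r?\n/', '', $next[1], 1);
            }
        } elseif ($inComment) {
            // Trailing comment on PHP < 8: put the line break back.
            $out .= $m[0];
        }
    }

    return $out;
}
```

Usage would be along the lines of `echo strip_php_comments(file_get_contents('path/to/file'));`. Because the clean-up works on tokens rather than on the finished string, blank lines inside string literals, heredocs, and embedded HTML are not touched.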