Unwrap / amalgamate PHP code from several .php files


For debugging purposes, when working on PHP projects with many files and many includes (example: WordPress code), I would sometimes like to see the "unwrapped" code, i.e. to amalgamate / flatten all files into one big PHP file ("flatten" is the term used in Photoshop-like tools for merging many layers into one).

How to do an amalgamation of multiple PHP files?

Example:

$ php index.php --amalgamation

would take these files as input:

  • vars.php

    <?php
    $color = 'green';
    $fruit = 'apple';
    ?>
    

  • index.php

    <?php
    include 'vars.php';
    echo "A $color $fruit";
    ?>
    

and produce this amalgamated output:

<?php
$color = 'green';
$fruit = 'apple';
echo "A $color $fruit";
?>

(It should also work with nested includes, e.g. if index.php includes vars.php, which itself includes abc.php.)


Solution

  • We can write an amalgamation/bundling script that reads a given file, matches every include/require statement in it, fetches the contents of each referenced file, and substitutes the include/require statement with the actual code.

    The following is a rudimentary implementation that works (based on very limited testing on files with nested references) with any number of files that include/require other files.

    <?php
    
    // Main file that references further files:
    $start = 'test/test.php';
    
    function bundle_files(string $filepath)
    {
        // Fetch current code
        $code = file_get_contents($filepath);
        
        // Set directory for referred files
        $dirname = pathinfo($filepath, PATHINFO_DIRNAME);
        
        // Match and substitute include/require(_once) with code:
        $rx = '~((include|require)(_once)?)\s+[\'"](?<path>[^\'"]+)[\'"];~';
    
        $code = preg_replace_callback($rx, function($m) use ($dirname) {
            // Ensure a valid filepath or abort:
            if($path = realpath($dirname . '/' . $m['path'])) {
                return bundle_files($path);         
            } else {
                die("Filepath Read Fail: {$dirname}/{$m['path']}");
            }
        }, $code);
        
        // Remove opening PHP tags, note source filepath
        $code = preg_replace('~^\s*<\?php\s*~i', "\n// ==== Source: {$filepath} ====\n\n", $code);
        
        // Remove closing PHP tags, if any
        $code = preg_replace('~\?>\s*$~', '', $code);   
        
        return $code;
    }
    
    $bundle = '<?php ' . "\n" . bundle_files($start);
    
    file_put_contents('bundle.php', $bundle);
    echo $bundle;
    

    Here we use preg_replace_callback() to match and substitute in order of appearance, with the callback calling the bundling function on each matched filepath and substituting include/require references with the actual code. The function also includes a comment line indicating the source of the included file, which may come in handy if/when you're debugging the compiled bundle file.
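
    To see what the pattern does and does not pick up, here is a minimal sketch that runs the same regex over a few hypothetical include/require lines (the sample lines and the $sample/$paths names are made up for illustration):

    ```php
    <?php

    // Same pattern as in bundle_files() above.
    $rx = '~((include|require)(_once)?)\s+[\'"](?<path>[^\'"]+)[\'"];~';

    // Hypothetical sample source, for illustration only.
    $sample = implode("\n", [
        "include 'vars.php';",              // matched
        'require_once "lib/abc.php";',      // matched
        "include('paren.php');",            // not matched: parenthesized form
        "require __DIR__ . '/config.php';", // not matched: concatenated path
    ]);

    // The named group "path" collects the quoted filepaths.
    preg_match_all($rx, $sample, $matches);
    $paths = $matches['path'];

    print_r($paths);
    // $paths is ['vars.php', 'lib/abc.php']
    ```

    Only a quoted string directly after the keyword (and whitespace) is matched, which is why the last two lines are left untouched.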

    Notes/Homework:

    • You may need to refine the base directory reference routine. (Expect trouble with "incomplete" filepaths that rely on PHP include_path.)
    • There is no handling of _once: code will be re-included. (Easy to remedy by recording included filepaths and skipping recurrences.)
    • Matching is only made on "path/file.php", i.e. unbroken strings inside single/double quotes directly after the keyword. Concatenated strings and the parenthesized form include('file.php'); are not matched.
    • Paths including variables or constants are not understood. Files would have to be evaluated (without side effects!) for that to be possible.
    • If you use declare(strict_types=1);, keep a single instance at the top of the bundle and remove all later instances.
    • There may be other side-effects from the bundling of files that are not addressed here.
    • The regex does no lookbehind/around to see if your include/require is commented out!
    • If your code jumps in and out of PHP mode and outputs raw HTML, all bets are off.
    • Managing the inclusion of autoloaded classes is beyond this snippet.
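
    As a possible starting point for the _once note above, here is a sketch of a variant that records every inlined file and skips repeated *_once references. The name bundle_files_once and the $seen parameter are hypothetical, and it throws an exception instead of calling die(); it has only been reasoned through on a trivial two-file case.

    ```php
    <?php

    // Hypothetical variant of bundle_files(): $seen records every file already
    // inlined, so that later *_once references to the same file are skipped.
    function bundle_files_once(string $filepath, array &$seen = []): string
    {
        $seen[realpath($filepath) ?: $filepath] = true;

        $code = file_get_contents($filepath);
        $dirname = pathinfo($filepath, PATHINFO_DIRNAME);

        // Same pattern as in bundle_files(); group 3 holds "_once" if present.
        $rx = '~((include|require)(_once)?)\s+[\'"](?<path>[^\'"]+)[\'"];~';

        $code = preg_replace_callback($rx, function ($m) use ($dirname, &$seen) {
            $path = realpath($dirname . '/' . $m['path']);
            if ($path === false) {
                throw new RuntimeException("Filepath Read Fail: {$dirname}/{$m['path']}");
            }
            // Skip only *_once references to files that were already inlined.
            if (!empty($m[3]) && isset($seen[$path])) {
                return "// (skipped, already included: {$path})";
            }
            return bundle_files_once($path, $seen);
        }, $code);

        // Remove opening PHP tags, note source filepath
        $code = preg_replace('~^\s*<\?php\s*~i', "\n// ==== Source: {$filepath} ====\n\n", $code);

        // Remove closing PHP tags, if any
        $code = preg_replace('~\?>\s*$~', '', $code);

        return $code;
    }
    ```

    Plain include/require references are still inlined every time, which mirrors what PHP itself does at runtime.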

    Please report any glitches and edge cases. Feel free to develop and (freely) share.
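
    One possible direction for the commented-out include/require problem noted above is PHP's tokenizer instead of a regex. The following is only a sketch: find_literal_includes is a hypothetical helper, it only recognizes a plain quoted string immediately followed by a semicolon, and it does not unescape the string contents.

    ```php
    <?php

    // Hypothetical helper: returns the literal paths of include/require
    // statements. Text inside comments or strings never yields these tokens.
    function find_literal_includes(string $source): array
    {
        $kinds = [T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE];
        $tokens = token_get_all($source);
        $paths = [];

        foreach ($tokens as $i => $token) {
            if (!is_array($token) || !in_array($token[0], $kinds, true)) {
                continue;
            }
            // Skip whitespace between the keyword and its argument.
            $j = $i + 1;
            while (isset($tokens[$j]) && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            $next = $tokens[$j] ?? null;
            $after = $tokens[$j + 1] ?? null;
            // Accept only a plain quoted string directly followed by ";".
            if (is_array($next) && $next[0] === T_CONSTANT_ENCAPSED_STRING && $after === ';') {
                $paths[] = substr($next[1], 1, -1); // strip the surrounding quotes
            }
        }

        return $paths;
    }

    // Hypothetical sample: only the second line is a live include.
    $source = "<?php\n// include 'old.php';\ninclude 'vars.php';\n/* require 'gone.php'; */\necho 'include \"x.php\";';\n";

    print_r(find_literal_includes($source));
    // The result is ['vars.php']
    ```

    The commented-out lines and the include-like text inside the echoed string are ignored because the tokenizer classifies them as comment and string tokens.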