Where's the php-src/PHP-Internals Main Entry Point


What function or bit of code serves as the main entry point for executing/interpreting a PHP program in the source of PHP itself? Based on things I've googled or read in books, I know that PHP is designed to work with a server of some kind (even the CLI command works by starting up "the command line SAPI", which acts as a mini-server designed to process a single request), and that the server will ask PHP to execute a program.

I know about the minit and rinit lifecycle functions, which serve as entry points for a PHP extension.

What I don't know is where the PHP source code has this conversation with itself:

"Hey look, there's a PHP program in this file/string. I ought to run it."

I'm not trying to accomplish any specific task here. I'm trying to understand how the internals of PHP do what they do, and to find a main entry point where I can start following the execution.


Solution

  • Where is the entry point of the code of some SAPI?

    The CLI is a standalone application. Like any other application written in C, its entry point is the function main() (file sapi/cli/php_cli.c, line 1200):

    int main(int argc, char *argv[])
    

    There are two versions of the CLI for Windows. One is a console application that starts with the main() function described above; the other is a Windows GUI application (it doesn't create a console when it starts and uses message boxes for output) that starts with the WinMain() function (file sapi/cli/php_cli.c, line 1198).
    main() and WinMain() share the same code. They differ only in name and in a few code fragments that are selected by checking whether the symbol PHP_CLI_WIN32_NO_CONSOLE is defined. That symbol is defined in sapi/cli/cli_win32.c, the file used to build the Windows GUI application.

    The CGI version is also a standalone console application. Its entry point is also the main() function in file sapi/cgi/cgi_main.c, line 1792.

    Similarly, the FPM version starts with main() in file sapi/fpm/fpm/fpm_main.c, line 1570.

    The Apache2 handler is a dynamically loadable module (.dll on Windows, .so on Unix-like systems). It registers some functions as event handlers for the events published by the web server (server start, pre/post configuration loaded, process request etc). These handlers are registered by the php_ap2_register_hook() function in file sapi/apache2handler/sapi_apache2.c, line 738.
    (You can find details about how a loadable module integrates with Apache in the Apache documentation.)

    The handler that is interesting to us is the function php_handler(), which is invoked to handle an HTTP request.
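The hook mechanism described above can be sketched as a callback registry. This is only a toy model: every name in it (toy_hook_handler, toy_server_dispatch, toy_php_handler, TOY_DECLINED) is made up for illustration, the status codes are simplified, and none of it is the real Apache or php-src API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler signature: return a status, or TOY_DECLINED
 * to let the server try the next registered handler. */
typedef int (*toy_request_handler)(const char *uri);

#define TOY_DECLINED  (-1)
#define TOY_MAX_HOOKS 8

static toy_request_handler toy_hooks[TOY_MAX_HOOKS];
static size_t toy_hook_count;

/* "Server" side: modules call this to subscribe to request events. */
void toy_hook_handler(toy_request_handler fn)
{
    assert(toy_hook_count < TOY_MAX_HOOKS);
    toy_hooks[toy_hook_count++] = fn;
}

/* "Server" side: offer the request to each handler until one accepts it. */
int toy_server_dispatch(const char *uri)
{
    for (size_t i = 0; i < toy_hook_count; i++) {
        int status = toy_hooks[i](uri);
        if (status != TOY_DECLINED) {
            return status;
        }
    }
    return 404; /* nobody wanted the request */
}

/* "Module" side: plays the role of php_handler() and only accepts *.php. */
static int toy_php_handler(const char *uri)
{
    size_t len = strlen(uri);
    if (len < 4 || strcmp(uri + len - 4, ".php") != 0) {
        return TOY_DECLINED;
    }
    return 200; /* a real handler would run the script here */
}

/* "Module" side: plays the role of the hook registration function. */
void toy_register_hooks(void)
{
    toy_hook_handler(toy_php_handler);
}
```

The point of the sketch is the inversion of control: the module has no main() of its own, it only hands function pointers to the server and waits to be called.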

    In a similar manner, every SAPI has an entry point (either main() or a function that is invoked by the web server).

    All these entry points do similar processing:

    • initialize themselves;
    • parse the command line arguments (only if it's CLI, CGI or other kind of standalone application);
    • read php.ini and/or other configuration they have (the Apache module configuration can be overridden in .htaccess);
    • create a stream using the input file and pass it to the function php_execute_script() defined in file main/main.c, line 2496;
    • cleanup and return an exit code to the calling process (the shell or the web server).
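The steps in this list can be sketched as a single function. This is a rough model under stated assumptions, not code from php-src: all the toy_* names are hypothetical, and toy_execute_script() merely stands in for php_execute_script().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical state of a standalone SAPI. */
typedef struct {
    const char *script_path; /* file to run, taken from argv */
    int ini_loaded;          /* set once the configuration was read */
    int started;             /* set by startup, cleared by shutdown */
} toy_sapi;

static int toy_startup(toy_sapi *s)
{
    memset(s, 0, sizeof *s);
    s->started = 1;
    return 0;
}

static int toy_parse_args(toy_sapi *s, int argc, char *argv[])
{
    if (argc < 2) {
        return -1; /* no script given */
    }
    s->script_path = argv[1];
    return 0;
}

static void toy_read_ini(toy_sapi *s)
{
    s->ini_loaded = 1; /* a real SAPI would parse php.ini here */
}

/* Stand-in for php_execute_script(): just reports what it would run. */
static int toy_execute_script(const toy_sapi *s)
{
    assert(s->started && s->ini_loaded);
    printf("executing %s\n", s->script_path);
    return 0;
}

static void toy_shutdown(toy_sapi *s)
{
    s->started = 0;
}

/* Plays the role of main(): initialize, parse, configure, execute, clean up. */
int toy_sapi_main(int argc, char *argv[])
{
    toy_sapi s;
    int exit_code = 1;

    if (toy_startup(&s) != 0) {
        return exit_code;
    }
    if (toy_parse_args(&s, argc, argv) == 0) {
        toy_read_ini(&s);
        exit_code = (toy_execute_script(&s) == 0) ? 0 : 255;
    }
    toy_shutdown(&s);
    return exit_code;
}
```

Cleanup runs on both the success and the failure path, which mirrors the last bullet: the calling process always gets an exit code back.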

  • Where is the code that actually executes a PHP script?

    The function php_execute_script() is a wrapper; it interprets the php.ini configuration entries auto_prepend_file and auto_append_file, prepares the list of files (auto-prepend file, main script, auto-append file) and passes the list to zend_execute_scripts() that processes them.
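The list-building step can be illustrated with a small helper. It is a simplified sketch, assuming that an unset or empty ini value means "no file"; toy_build_file_list() and TOY_MAX_FILES are hypothetical names, and the real function works on file handles rather than plain strings.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FILES 3

/* Collect the files to run, in order: auto-prepend file, main script,
 * auto-append file. NULL or empty ini values are skipped.
 * Returns the number of entries written to out[]. */
size_t toy_build_file_list(const char *auto_prepend, const char *primary,
                           const char *auto_append,
                           const char *out[TOY_MAX_FILES])
{
    size_t n = 0;

    assert(primary != NULL);
    if (auto_prepend != NULL && auto_prepend[0] != '\0') {
        out[n++] = auto_prepend;
    }
    out[n++] = primary;
    if (auto_append != NULL && auto_append[0] != '\0') {
        out[n++] = auto_append;
    }
    return n;
}
```

The resulting array is what a function in the role of zend_execute_scripts() would then walk through, one file after another.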

    php_execute_script() is not always invoked; some SAPIs and some command line arguments of the CLI result in a direct call to zend_execute_scripts().

    zend_execute_scripts() is where the interesting things happen.

    It compiles the PHP file and returns a list of OP codes in op_array; then, if the compilation succeeds (the returned op_array is not NULL), it executes the OP codes. There is also exception handling and cleanup; boring work, but as important as the parsing and execution nevertheless.

    The compilation is a tedious process. It is done by the function zendparse(), defined in the file Zend/zend_language_parser.c. Neither the definition of zendparse() nor the file Zend/zend_language_parser.c is to be seen in the Git repo; the parser and the scanner are generated by bison and re2c. bison reads the language syntax rules from Zend/zend_language_parser.y and generates the actual parser in Zend/zend_language_parser.c, while re2c reads the definition of the lexical tokens from Zend/zend_language_scanner.l and generates Zend/zend_language_scanner.c.
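The compile-then-execute flow can be sketched as follows. This is a toy, not the Zend implementation: the "compiler" only checks that parentheses balance and counts one "opcode" per character, and all toy_* names are hypothetical. What it does share with the description above is the shape of the control flow: compile, test the result against NULL, execute, clean up.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an op_array. */
typedef struct {
    size_t count; /* number of "opcodes" */
} toy_op_array;

/* "Compile": returns NULL when parentheses are unbalanced,
 * which plays the role of a parse error. */
static toy_op_array *toy_compile(const char *source)
{
    int depth = 0;
    size_t n = 0;

    for (const char *p = source; *p != '\0'; p++, n++) {
        if (*p == '(') {
            depth++;
        } else if (*p == ')' && --depth < 0) {
            return NULL;
        }
    }
    if (depth != 0) {
        return NULL;
    }

    toy_op_array *ops = malloc(sizeof *ops);
    if (ops != NULL) {
        ops->count = n;
    }
    return ops;
}

/* "Execute": pretends to run the opcodes and reports how many there were. */
static int toy_execute(const toy_op_array *ops)
{
    assert(ops != NULL);
    return (int)ops->count;
}

/* Returns the number of opcodes "executed", or -1 if compilation failed. */
int toy_execute_source(const char *source)
{
    toy_op_array *ops = toy_compile(source);
    if (ops == NULL) {
        return -1; /* nothing to execute */
    }
    int result = toy_execute(ops);
    free(ops); /* the cleanup step */
    return result;
}
```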

    However, even if the hard work is not visible in the repo, the interesting parts of the compilation process are visible in the files mentioned above.

    The execution of the compiled script (the list of OP codes) is done by the function zend_execute(), defined in the file Zend/zend_vm_execute.h. This is also a generated file, and the interesting part is that it is generated by a PHP script.

    The generator script (Zend/zend_vm_gen.php) uses zend_vm_def.h and zend_vm_execute.skl to generate zend_vm_execute.h and zend_vm_opcodes.h.

    zend_vm_def.h contains the actual interpreter code that is executed to handle each OP code.
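To give a feel for "one piece of code per OP code", here is a minimal stack machine with a handler table indexed by opcode. It is a sketch only: the opcodes, the toy_vm structure and the handlers are invented for this example, and the generated Zend VM is considerably more elaborate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes of a tiny stack machine. */
enum { TOY_PUSH, TOY_ADD, TOY_MUL, TOY_RETURN, TOY_OPCODE_COUNT };

typedef struct {
    int opcode;
    int operand; /* only used by TOY_PUSH */
} toy_op;

typedef struct {
    int stack[16];
    int sp;     /* index of the next free stack slot */
    int done;   /* set by TOY_RETURN */
    int result; /* value produced by TOY_RETURN */
} toy_vm;

typedef void (*toy_handler)(toy_vm *vm, const toy_op *op);

static void handle_push(toy_vm *vm, const toy_op *op)
{
    assert(vm->sp < 16);
    vm->stack[vm->sp++] = op->operand;
}

static void handle_add(toy_vm *vm, const toy_op *op)
{
    (void)op;
    assert(vm->sp >= 2);
    vm->sp--;
    vm->stack[vm->sp - 1] += vm->stack[vm->sp];
}

static void handle_mul(toy_vm *vm, const toy_op *op)
{
    (void)op;
    assert(vm->sp >= 2);
    vm->sp--;
    vm->stack[vm->sp - 1] *= vm->stack[vm->sp];
}

static void handle_return(toy_vm *vm, const toy_op *op)
{
    (void)op;
    assert(vm->sp >= 1);
    vm->result = vm->stack[--vm->sp];
    vm->done = 1;
}

/* One handler per opcode, looked up by the opcode number. */
static const toy_handler toy_handlers[TOY_OPCODE_COUNT] = {
    [TOY_PUSH]   = handle_push,
    [TOY_ADD]    = handle_add,
    [TOY_MUL]    = handle_mul,
    [TOY_RETURN] = handle_return,
};

/* The execute loop: fetch the next op and call its handler. */
int toy_run(const toy_op *ops, size_t count)
{
    toy_vm vm = {0};

    for (size_t i = 0; i < count && !vm.done; i++) {
        assert(ops[i].opcode >= 0 && ops[i].opcode < TOY_OPCODE_COUNT);
        toy_handlers[ops[i].opcode](&vm, &ops[i]);
    }
    return vm.result;
}
```

Running the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, RETURN through toy_run() computes (2 + 3) * 4.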

  • Where is the code of some function provided by the PHP core or one of its bundled extensions?

    The code of the PHP functions and of the functions provided by extensions is somewhat easier to follow. The functions included in the PHP core are located in files in the ext/standard directory; the functions provided by other extensions are located in files in the corresponding ext subdirectories.

    In these files, the C functions that implement PHP functions are declared using the PHP_FUNCTION() macro. For example, the implementation of the PHP function strpos() starts in file ext/standard/string.c, line 1948. The function strchr(), being an alias of strstr(), is declared using the PHP_FALIAS() macro in file ext/standard/basic_functions.c on line 2833.
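The idea behind such macros can be shown with a simplified imitation. The TOY_FUNCTION, TOY_FE and TOY_FALIAS macros below are hypothetical and much simpler than their php-src counterparts (the real implementations receive their arguments through the engine, not as plain C strings). The sketch only shows how a macro can derive a prefixed C function name from the PHP name, and how an alias is a second table entry pointing at the same C function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature shared by all toy implementations. */
typedef const char *(*toy_impl)(const char *haystack, const char *needle);

typedef struct {
    const char *php_name; /* name visible to scripts */
    toy_impl impl;        /* C function that implements it */
} toy_function_entry;

/* Declares the C function for a PHP-level name, with a prefix added. */
#define TOY_FUNCTION(name) \
    static const char *toy_fn_##name(const char *haystack, const char *needle)

/* Table entry for a function, and for an alias of an existing function. */
#define TOY_FE(name)            { #name,  toy_fn_##name }
#define TOY_FALIAS(alias, name) { #alias, toy_fn_##name }

TOY_FUNCTION(strstr)
{
    /* Delegates to the C library for the purpose of the example. */
    return strstr(haystack, needle);
}

static const toy_function_entry toy_functions[] = {
    TOY_FE(strstr),
    TOY_FALIAS(strchr, strstr), /* same C function, different PHP name */
    { NULL, NULL }              /* end marker */
};

/* Finds the implementation registered under a PHP-level name. */
toy_impl toy_lookup(const char *php_name)
{
    assert(php_name != NULL);
    for (const toy_function_entry *e = toy_functions; e->php_name != NULL; e++) {
        if (strcmp(e->php_name, php_name) == 0) {
            return e->impl;
        }
    }
    return NULL;
}
```

Looking up "strchr" and "strstr" in this table yields the same function pointer, which is all an alias is in this model.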

    And so on, and so forth.