Using PHP preg_match_all to get the list of virtual hosts from XAMPP's httpd-vhosts.conf file


This is what I tried. I also don't want the hosts and explanations that are commented out with # or ##.

$str = '# Virtual Hosts
#
# Required modules: mod_log_config

# If you want to maintain multiple domains/hostnames on your
# machine you can setup VirtualHost containers for them. Most configurations
# use only name-based virtual hosts so the server doesn\'t need to worry about
# IP addresses. This is indicated by the asterisks in the directives below.
#
# Please see the documentation at 
# <URL:http://httpd.apache.org/docs/2.4/vhosts/>
# for further details before you try to setup virtual hosts.
#
# You may use the command line option \'-S\' to verify your virtual host
# configuration.

#
# Use name-based virtual hosting.
#
##NameVirtualHost *:80
#
# VirtualHost example:
# Almost any Apache directive may go into a VirtualHost container.
# The first VirtualHost section is used for all requests that do not
# match a ##ServerName or ##ServerAlias in any <VirtualHost> block.
#
##<VirtualHost *:80>
    ##ServerAdmin webmaster@dummy-host.example.com
    ##DocumentRoot "D:/xampp/htdocs/dummy-host.example.com"
    ##ServerName dummy-host.example.com
    ##ServerAlias www.dummy-host.example.com
    ##ErrorLog "logs/dummy-host.example.com-error.log"
    ##CustomLog "logs/dummy-host.example.com-access.log" common
##</VirtualHost>

##<VirtualHost *:80>
    ##ServerAdmin webmaster@dummy-host2.example.com
    ##DocumentRoot "D:/xampp/htdocs/dummy-host2.example.com"
    ##ServerName dummy-host2.example.com
    ##ErrorLog "logs/dummy-host2.example.com-error.log"
    ##CustomLog "logs/dummy-host2.example.com-access.log" common
##</VirtualHost>



##<VirtualHost *:8080>
    ##ServerName CI1
    ##DocumentRoot D:\xampp\htdocs\codeigniter_1\public
##</VirtualHost>

<VirtualHost *:8080>
    ServerName CI2
    DocumentRoot D:\xampp\htdocs\codeigniter_2\public
</VirtualHost>

<VirtualHost *:8080>
    ServerName CI3
    DocumentRoot D:\xampp\htdocs\codeigniter_3\public
</VirtualHost>';

Pattern 1

$pattern1 = "#<\s*?$tagname\b[^>]*>(.*?)</$tagname\b[^>]*>#s";
preg_match_all($pattern1, $str, $match);

Pattern 2

$pattern2 = "/^(?<!#).*<$tagname.*>(.+?)<\/$tagname>/mis";
preg_match_all($pattern2, $str, $matches);

Function

function everything_in_tags($str, $tagname)
{
    $pattern1 = "#<\s*?$tagname\b[^>]*>(.*?)</$tagname\b[^>]*>#s";
    preg_match_all($pattern1, $str, $match);
    
    $pattern2 = "/^(?<!#).*<$tagname.*>(.+?)<\/$tagname>/s";
    preg_match_all($pattern2, $str, $matches);
    
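    // Pass true so print_r() returns the string instead of printing it before echo runs.
    echo '<pre>', print_r($match[1], true), '</pre>';

    echo '<pre>', print_r($matches[1], true), '</pre>';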
}

everything_in_tags($str, 'VirtualHost');

Output for Pattern 1

Array
(
    [0] =>  block.
#
##
    ##ServerAdmin webmaster@dummy-host.example.com
    ##DocumentRoot "D:/xampp/htdocs/dummy-host.example.com"
    ##ServerName dummy-host.example.com
    ##ServerAlias www.dummy-host.example.com
    ##ErrorLog "logs/dummy-host.example.com-error.log"
    ##CustomLog "logs/dummy-host.example.com-access.log" common
##
    [1] => 
    ##ServerAdmin webmaster@dummy-host2.example.com
    ##DocumentRoot "D:/xampp/htdocs/dummy-host2.example.com"
    ##ServerName dummy-host2.example.com
    ##ErrorLog "logs/dummy-host2.example.com-error.log"
    ##CustomLog "logs/dummy-host2.example.com-access.log" common
##
    [2] => 
    ##ServerName CI1
    ##DocumentRoot D:\xampp\htdocs\codeigniter_1\public
##
    [3] => 
    ServerName CI2
    DocumentRoot D:\xampp\htdocs\codeigniter_2\public

    [4] => 
    ServerName CI3
    DocumentRoot D:\xampp\htdocs\codeigniter_3\public

)

Output for Pattern 2

Array
(
    [0] => 
    ServerName CI3
    DocumentRoot D:\xampp\htdocs\codeigniter_3\public

)

Desired Output

Array
(
    [0] => array(
        [ServerName] : CI2
        [DocumentRoot] : D:\xampp\htdocs\codeigniter_2\public
        ),
    [1] => array(
        [ServerName] : CI3
        [DocumentRoot] : D:\xampp\htdocs\codeigniter_3\public
        );
)

Any help would be appreciated, as I'm new to regex. Again, I don't need any of the strings that are commented out with # or ##. Thanks in advance.


Solution

  • Use as a regex:

    '~^\s*<VirtualHost\s+[^>]*>\s*ServerName\s+(?P<ServerName>\S+)\s*DocumentRoot\s+(?P<DocumentRoot>\S+)\s*</VirtualHost>~m'
    

    The values you need will then be in capture group 1 (also available as 'ServerName') and capture group 2 (also available as 'DocumentRoot').

    Explanation:

    1. ^ - Match the start of a line (the m flag makes ^ match at the start of every line, not just at the start of the string).
    2. \s* - Match 0 or more whitespace characters.
    3. <VirtualHost - Match '<VirtualHost'.
    4. \s+ - Match 1 or more whitespace characters.
    5. [^>]* - Match 0 or more characters that are not '>'.
    6. > - Match '>'.
    7. \s* - Match 0 or more whitespace characters.
    8. ServerName - Match 'ServerName'.
    9. \s+ - Match 1 or more whitespace characters.
    10. (?P<ServerName>\S+) - Match 1 or more non-whitespace characters as capture group 'ServerName'.
    11. \s* - Match 0 or more whitespace characters.
    12. DocumentRoot - Match 'DocumentRoot'.
    13. \s+ - Match 1 or more whitespace characters.
    14. (?P<DocumentRoot>\S+) - Match 1 or more non-whitespace characters as capture group 'DocumentRoot'.
    15. \s* - Match 0 or more whitespace characters.
    16. </VirtualHost> - Match '</VirtualHost>'.
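
Steps 1 to 3 are what keep the commented-out blocks out of the result: only whitespace may sit between the start of a line and '<VirtualHost', so a line starting with '##' cannot match. The following is a minimal sketch of that behaviour; the shortened $sample string is an assumption made for illustration and is not the full file from the question.

```php
<?php
// Hypothetical, shortened sample: one commented-out block and one active block.
$sample = implode("\n", [
    '##<VirtualHost *:8080>',
    '    ##ServerName CI1',
    '    ##DocumentRoot D:\xampp\htdocs\codeigniter_1\public',
    '##</VirtualHost>',
    '',
    '<VirtualHost *:8080>',
    '    ServerName CI2',
    '    DocumentRoot D:\xampp\htdocs\codeigniter_2\public',
    '</VirtualHost>',
]);

// Anchored with ^ and the m flag: only whitespace may precede the opening tag on its line.
$anchored = preg_match_all('~^\s*<VirtualHost\b~m', $sample);

// Unanchored: the opening tag is also found after the leading "##".
$unanchored = preg_match_all('~<VirtualHost\b~', $sample);

echo "anchored: $anchored, unanchored: $unanchored\n";
```

With this sample the anchored pattern should count only the active block, while the unanchored one also counts the commented-out block.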

    The code:

    $regex = '~^\s*<VirtualHost\s+[^>]*>\s*ServerName\s+(?P<ServerName>\S+)\s*DocumentRoot\s+(?P<DocumentRoot>\S+)\s*</VirtualHost>~m';
    preg_match_all($regex, $str, $matches, PREG_SET_ORDER);
    print_r($matches);
    

    Prints:

    Array
    (
        [0] => Array
            (
                [0] =>
    <VirtualHost *:8080>
        ServerName CI2
        DocumentRoot D:\xampp\htdocs\codeigniter_2\public
    </VirtualHost>
                [ServerName] => CI2
                [1] => CI2
                [DocumentRoot] => D:\xampp\htdocs\codeigniter_2\public
                [2] => D:\xampp\htdocs\codeigniter_2\public
            )
    
        [1] => Array
            (
                [0] =>
    <VirtualHost *:8080>
        ServerName CI3
        DocumentRoot D:\xampp\htdocs\codeigniter_3\public
    </VirtualHost>
                [ServerName] => CI3
                [1] => CI3
                [DocumentRoot] => D:\xampp\htdocs\codeigniter_3\public
                [2] => D:\xampp\htdocs\codeigniter_3\public
            )
    
    )
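
The printed result still contains the full match and the numeric duplicates of each named group, whereas the desired output in the question has only the ServerName and DocumentRoot keys. One possible way to get that shape, sketched here under the assumption that every key other than the named groups can be dropped, is to filter each match by key type. The $hosts variable and the shortened input are illustrative names, not part of the original code.

```php
<?php
// Hypothetical sample input: two active blocks laid out as in the question.
$str = implode("\n", [
    '<VirtualHost *:8080>',
    '    ServerName CI2',
    '    DocumentRoot D:\xampp\htdocs\codeigniter_2\public',
    '</VirtualHost>',
    '',
    '<VirtualHost *:8080>',
    '    ServerName CI3',
    '    DocumentRoot D:\xampp\htdocs\codeigniter_3\public',
    '</VirtualHost>',
]);

$regex = '~^\s*<VirtualHost\s+[^>]*>\s*ServerName\s+(?P<ServerName>\S+)\s*DocumentRoot\s+(?P<DocumentRoot>\S+)\s*</VirtualHost>~m';
preg_match_all($regex, $str, $matches, PREG_SET_ORDER);

// Keep only the string keys (the named groups); this drops the full match [0]
// and the numeric duplicates [1] and [2].
$hosts = array_map(
    static function (array $m): array {
        return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
    },
    $matches
);

print_r($hosts);
```

Each element of $hosts should then hold just the two named values for one virtual host.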
    

    Update

    The regex above assumes that the DocumentRoot value contains no embedded spaces, which may not always be the case. (An earlier version also had an extraneous \s*, which I have removed.) The following regex is, I believe, an improvement. It replaces \S+, which matches 1 or more non-whitespace characters, with .+?, which non-greedily matches 1 or more of any character except the newline character. Because it is followed by \s*</VirtualHost>, the non-greedy match stops as soon as it reaches '</VirtualHost>', optionally preceded by whitespace.

    So the new regex would be (used with the m flag):

    ^\s*<VirtualHost\s+[^>]*>\s*ServerName\s+(?P<ServerName>\S+)\s*DocumentRoot\s+(?P<DocumentRoot>.+?)\s*</VirtualHost>
    

    $regex = '~^\s*<VirtualHost\s+[^>]*>\s*ServerName\s+(?P<ServerName>\S+)\s*DocumentRoot\s+(?P<DocumentRoot>.+?)\s*</VirtualHost>~m';
    preg_match_all($regex, $str, $matches, PREG_SET_ORDER);
    print_r($matches);
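
To illustrate the difference between the two versions, here is a small sketch that runs both against a block whose DocumentRoot contains a space. The block itself (the CI4 name and the "my site" path) is a made-up example and does not come from the original file.

```php
<?php
// Hypothetical block whose DocumentRoot contains a space.
$block = implode("\n", [
    '<VirtualHost *:8080>',
    '    ServerName CI4',
    '    DocumentRoot D:\xampp\htdocs\my site\public',
    '</VirtualHost>',
]);

// Shared parts of both regexes; only the DocumentRoot capture group differs.
$head = '~^\s*<VirtualHost\s+[^>]*>\s*ServerName\s+(?P<ServerName>\S+)\s*DocumentRoot\s+';
$tail = '\s*</VirtualHost>~m';

// First version: \S+ stops at the space, so the closing tag cannot follow and the match fails.
$old = preg_match($head . '(?P<DocumentRoot>\S+)' . $tail, $block, $m1);

// Updated version: .+? extends lazily up to the whitespace before </VirtualHost>.
$new = preg_match($head . '(?P<DocumentRoot>.+?)' . $tail, $block, $m2);

echo "old matched: $old, new matched: $new\n";
if ($new === 1) {
    echo 'DocumentRoot: ', $m2['DocumentRoot'], "\n";
}
```

Note that both versions still expect ServerName and DocumentRoot to be the only directives in the block, in that order.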