php regex

Regex for checking a URL with wildcards in both the subdomain and the top-level domain?


Is it possible to modify this PHP regex so that it also allows '*example.*'?

$wildcardUrlPattern = '/^(https?:\/\/)?(\*[a-z0-9-]*\.)?[a-z0-9-]+(\.[a-z0-9-]*)*(\.\*[a-z0-9-]*)?$/i';

// Test URLs
$urls = [
    'example.*',      // Domain with wildcard TLD
    '*example.com',   // Wildcard domain with fixed TLD
    '*.example.*',    // Wildcard subdomain with wildcard TLD
    '*.example.com',  // Wildcard subdomain with fixed TLD
    '*example.*',     // Wildcard domain with wildcard TLD
    'http://example.*', // Protocol with domain and wildcard TLD
    'http://*example.com' // Protocol with wildcard domain and fixed TLD
];

foreach ($urls as $url) {
    if (preg_match($wildcardUrlPattern, $url)) {
        echo "$url: Valid URL with wildcard\n";
    } else {
        echo "$url: Invalid URL\n";
    }
}

Result:

example.*: Valid URL with wildcard
*example.com: Valid URL with wildcard
*.example.*: Valid URL with wildcard
*.example.com: Valid URL with wildcard
*example.*: Invalid URL (should also be valid)
http://example.*: Valid URL with wildcard
http://*example.com: Valid URL with wildcard

Solution

  • How about this?

    /^(?=.+\..+)(?:https?:\/\/)?[a-z0-9-\.*]+$/i
    

    Here’s an explanation:

    • ^ : start of string
    • (?=.+\..+) : lookahead (?= ) to ensure the string contains at least one dot \. with at least one character .+ on either side of it
    • (?:https?:\/\/)? : optional scheme (not captured)
    • [a-z0-9-\.*]+ : an uninterrupted sequence of alphanumeric characters, hyphens, dots or asterisks
    • $ : end of string
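
As a quick check, the pattern can be run against the seven test URLs from the question. This is only a minimal sketch: the variable names `$pattern` and `$results` are chosen for illustration, and the pattern string is copied unchanged from above.

```php
<?php
// Pattern from the answer above, unchanged.
$pattern = '/^(?=.+\..+)(?:https?:\/\/)?[a-z0-9-\.*]+$/i';

// Test URLs copied from the question.
$urls = [
    'example.*',
    '*example.com',
    '*.example.*',
    '*.example.com',
    '*example.*',
    'http://example.*',
    'http://*example.com',
];

// Record and print whether each URL matches.
$results = [];
foreach ($urls as $url) {
    $results[$url] = preg_match($pattern, $url) === 1;
    echo $url, ': ', $results[$url] ? 'match' : 'no match', "\n";
}
```

All seven inputs should report a match, including '*example.*'. Note that the pattern is deliberately permissive: it checks which characters appear and that a dot has something on each side, not where an asterisk may sit.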

    See it working on regex101, although note that for testing purposes I have included the /m flag so that ^ and $ match the start and end of each line (rather than the start and end of the string), and the /g flag so that it won’t stop after finding the first match. I included a second group of test cases which I imagine should not match.
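
PHP has no /g flag; `preg_match_all` plays that role, while the `m` modifier works as it does on regex101. The sketch below shows roughly how the same multi-line test could look in PHP. The input lines are made up for illustration (they are not the regex101 test cases), and `example` is an assumed non-matching case because it contains no dot.

```php
<?php
// Same pattern as in the answer, with the m modifier added so that
// ^ and $ anchor at the start and end of each line.
$multilinePattern = '/^(?=.+\..+)(?:https?:\/\/)?[a-z0-9-\.*]+$/im';

// Hypothetical test input, one candidate per line.
$lines = "example.*\n*example.*\nexample\nhttp://*example.com";

// preg_match_all collects every match instead of stopping at the first.
preg_match_all($multilinePattern, $lines, $matches);
print_r($matches[0]);
```

Here `$matches[0]` should contain the three lines that have a dot with characters on both sides, and skip `example`.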