regex iis regex-lookarounds regex-group regex-greedy

RegEx for failing subdomains

Basically, I would like to validate a URL and make sure it does not have a subdomain. I can't seem to figure out the correct regex for it.

Examples of URLs that SHOULD match:

example.com
www.example.com
example.co.uk
example.com/page
example.com?key=value

Examples of URLs that SHOULD NOT match:

test.example.com
sub.test.example.com

Solution

Here, we start with an expression that is bounded on the right by .com, .co.uk, or any other TLDs we want to allow. Moving left from there, we collect all non-dot characters, then add an optional www. and an optional http:// or https://. Finally, we add the start anchor ^, which makes every subdomain fail, because the only dot allowed before the TLD is the one after www:

^(https?:\/\/)?(www\.)?([^.]+)(\.com|\.co\.uk)(.+|)$
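As a quick check, the expression above can be run against the URLs from the question. This is only a sketch: the variable names are made up for illustration, and the `stricter` variant at the end is an assumption of mine, not part of the original expression. It is there because the trailing `(.+|)` accepts any text after the TLD.

```javascript
// Pattern copied from the expression above.
const noSubdomain = /^(https?:\/\/)?(www\.)?([^.]+)(\.com|\.co\.uk)(.+|)$/;

const shouldMatch = [
  "example.com",
  "www.example.com",
  "example.co.uk",
  "example.com/page",
  "example.com?key=value",
];
const shouldNotMatch = ["test.example.com", "sub.test.example.com"];

for (const url of shouldMatch) {
  console.log(url, noSubdomain.test(url)); // true for each
}
for (const url of shouldNotMatch) {
  console.log(url, noSubdomain.test(url)); // false for each
}

// Caveat: the trailing (.+|) lets anything follow the TLD.
console.log(noSubdomain.test("example.com.evil.com")); // true

// Hypothetical stricter tail: only a path, query, or fragment may follow.
const stricter = /^(https?:\/\/)?(www\.)?([^.]+)(\.com|\.co\.uk)([\/?#].*)?$/;
console.log(stricter.test("example.com.evil.com")); // false
console.log(stricter.test("example.com/page"));     // true
```

Whether the stricter tail is wanted depends on what counts as valid input; the original expression is fine if the input is known to end right after the path or query.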

Other TLDs can be added to this capturing group:

(\.com|\.co\.uk|\.net|\.org|\.business|\.edu|\.careers|\.coffee|\.college)

And the expression can be modified to:

^(https?:\/\/)?(www\.)?([^.]+)(\.com|\.co\.uk|\.net|\.org|\.business|\.edu|\.careers|\.coffee|\.college)(.+|)$
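If the TLD list keeps growing, it may be easier to build the expression from an array than to edit the alternation by hand. A minimal sketch, assuming the same overall pattern as above; `tlds`, `escapeRegex`, and `buildNoSubdomainRegex` are hypothetical names introduced for this example:

```javascript
// Hypothetical list of allowed TLDs (same ones as in the expression above).
const tlds = ["com", "co.uk", "net", "org", "business", "edu", "careers", "coffee", "college"];

// Escape regex metacharacters, in particular the dot in "co.uk".
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Build the same pattern as above, with the TLD group generated from the list.
function buildNoSubdomainRegex(tldList) {
  const alternation = tldList.map((t) => "\\." + escapeRegex(t)).join("|");
  return new RegExp("^(https?:\\/\\/)?(www\\.)?([^.]+)(" + alternation + ")(.+|)$");
}

const tldRegex = buildNoSubdomainRegex(tlds);
console.log(tldRegex.test("example.org"));         // true
console.log(tldRegex.test("example.co.uk"));       // true
console.log(tldRegex.test("shop.example.coffee")); // false
```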

Flexibility

I can't think of a way to make the TLD part much more flexible, since this is a validation expression. For instance, if we simplified it to:

^(https?:\/\/)?(www\.)?([^.]+)(\.[a-z]+)(\.uk?)?[a-z?=\/]+$

it might work for the URLs listed in the question, but it would also pass:

example.example

which is not a valid domain. We can only use this expression:

^(https?:\/\/)?(www\.)?([^.]+)(\.[a-z]+)(\.uk?)?[a-z?=\/]+$

if we already know that the input is a URL.
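The trade-off can be seen by running the looser expression against a few inputs. This is only an illustrative sketch, and the variable name `loose` is made up for it:

```javascript
// The looser expression from above, with a generic [a-z]+ in place of the TLD list.
const loose = /^(https?:\/\/)?(www\.)?([^.]+)(\.[a-z]+)(\.uk?)?[a-z?=\/]+$/;

console.log(loose.test("example.com"));           // true
console.log(loose.test("example.co.uk"));         // true
console.log(loose.test("example.com?key=value")); // true
console.log(loose.test("test.example.com"));      // false
console.log(loose.test("example.example"));       // true, although "example" is not a TLD
```

Subdomains still fail, but anything shaped like name.letters passes, which is why the explicit TLD list is the safer choice for validation.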

Demo

This snippet just shows how the capturing groups work:

const regex = /^(https?:\/\/)?(www\.)?([^.]+)(\.com|\.co\.uk)(.+|)$/gm;
const str = `example.com
www.example.com
example.co.uk
example.com/page
example.com?key=value

test.example.com
sub.test.example.com`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}
