I'm working with URL data and I'm having trouble categorizing each URL as a domain or a subdomain using Python.
I'm trying to use a regex to extract the domain, but I don't know how to return True or False depending on whether the URL is a subdomain.
For example:
a = ['facebook.com', 'profile.facebook.com']
I expect the result to be:
[False, True]
You need to decide how loose or strict the restrictions on the domain name should be; the rest can look like this:
>>> import re
>>> pattern = re.compile(r'[0-9a-z.]*\.[0-9a-z]*\.com')
>>> bool(pattern.match('facebook.com'))
False
>>> bool(pattern.match('sub.facebook.com'))
True
Here I assumed the domain will end with .com, but you can easily change that too.
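If you don't want to hard-code .com, here is a minimal sketch of one way to generalize the idea and apply it to the list from the question. It assumes the registered domain is always exactly two labels (name plus a one-label TLD), so suffixes like .co.uk would be misclassified. The names LABEL, SUBDOMAIN_RE and is_subdomain are made up for this example.

```python
import re

# Assumption: a registered domain is exactly "name.tld" with a one-label TLD.
# Multi-label suffixes such as ".co.uk" are NOT handled by this sketch.
LABEL = r"[0-9a-z-]+"

# One or more "label." prefixes, followed by "name.tld".
SUBDOMAIN_RE = re.compile(rf"(?:{LABEL}\.)+{LABEL}\.{LABEL}")


def is_subdomain(host):
    """Return True if host has at least one label in front of "name.tld"."""
    return SUBDOMAIN_RE.fullmatch(host.lower()) is not None


a = ['facebook.com', 'profile.facebook.com']
print([is_subdomain(host) for host in a])  # [False, True]
```

fullmatch is used instead of match so that trailing characters after the TLD are rejected rather than silently ignored.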