Tags: python, regex, regex-lookarounds, regex-group

I am trying to find an email address in a given string and return the address that was found.


My requirements are:

name: the name is an alphanumeric string of at most 12 characters. The additional characters dash (-), period (.) and underscore (_) are allowed, but the name cannot start or end with them. The name must also be at least 1 character long. Example name values: a, ab, a_b, A__B..C--D, 1nt3r3st.1ng

domain: the domain is strictly numerical, and the number must be divisible by 5. The length of the domain is unrestricted. Example domain values: 984125, 0

ending: the email must end with a (.com) or (.ca) (case sensitive)
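As a reading aid, the three rules above can be restated as a plain-Python check with no regex. This is only a sketch under assumptions: the helper name `is_special_email` is hypothetical, and it assumes the candidate string is exactly `name@domain.ending` with nothing before or after it.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)
EXTRA = set("-._")


def is_special_email(candidate):
    """Return True if candidate follows the name/domain/ending rules above."""
    name, at_sign, rest = candidate.partition("@")
    domain, dot, ending = rest.rpartition(".")
    if not at_sign or not dot:
        return False
    # name: 1 to 12 characters, alphanumeric plus - . _
    if not 1 <= len(name) <= 12:
        return False
    if any(ch not in ALNUM | EXTRA for ch in name):
        return False
    # name: first and last character must be alphanumeric
    if name[0] not in ALNUM or name[-1] not in ALNUM:
        return False
    # domain: ASCII digits only, any length, divisible by 5
    if not domain or any(ch not in string.digits for ch in domain):
        return False
    if int(domain) % 5 != 0:
        return False
    # ending: exactly "com" or "ca", lowercase only
    return ending in ("com", "ca")
```

A check like this can serve as a reference when testing a regex against hand-written cases.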

Example:

find_special_email('[email protected]!')

'[email protected]'

What I have tried:

import re

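def find_special_email(text):
    # `text` avoids shadowing the built-in `str`
    match = re.search(r'[a-zA-Z0-9_\.-]{1,12}@[0-9]+\.(com|ca)(\.[a-z]{2,3})?', text)
    # re.search returns None when nothing matches
    return match.group(0) if match else None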


print(find_special_email('[email protected]!'))
print(find_special_email('[email protected]!'))

My Issues:

  1. I don't know how to stop the name from starting or ending with the additional characters: dash (-), period (.) and underscore (_)
  2. I don't know how to match the "domain" that is divisible by 5

Solution

  • This regex can help: https://regex101.com/r/wSS0ES/4

    Regex: [a-zA-Z0-9](?:[a-zA-Z0-9_.-]{0,10}[a-zA-Z0-9])?@[0-9]*[05]+\.(?:com|ca)(?:\.[a-z]{2,3})?

    Changes Made:

    1. Prefixed [a-zA-Z0-9] so that the name starts with an alphanumeric character.
    2. (?:[a-zA-Z0-9_.-]{0,10}[a-zA-Z0-9])? - after the first character, the name can contain 0 to 10 of any allowed characters in the middle, but must end with an alphanumeric character. That gives at most 1 + 10 + 1 = 12 characters. The whole group is optional so that a name of a single alphanumeric character still matches.
    3. [0-9]*[05]+ - the domain can contain any number of digits but must end with 0 or 5, which is exactly the condition for a decimal number to be divisible by 5. (A single [05] after [0-9]* matches the same strings.)
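The changes above can be sketched in Python as follows. This is not the exact regex from the link but a variant under stated assumptions: the lookbehind and lookahead are additions, included on the assumption that a match should not begin in the middle of a longer name (for example after a leading underscore) or end in the middle of a longer word; the optional `(?:\.[a-z]{2,3})?` suffix is dropped because the requirements say the email must end with .com or .ca; and `[05]+` is shortened to `[05]`. The constant name `SPECIAL_EMAIL` is hypothetical.

```python
import re

# Assumption: the two lookarounds are not part of the regex in the answer.
SPECIAL_EMAIL = re.compile(
    r"(?<![a-zA-Z0-9_.-])"          # no name character directly before the match
    r"[a-zA-Z0-9]"                  # name starts with an alphanumeric character
    r"(?:[a-zA-Z0-9_.-]{0,10}[a-zA-Z0-9])?"  # up to 11 more, ending alphanumeric
    r"@[0-9]*[05]"                  # domain: digits whose last digit is 0 or 5
    r"\.(?:com|ca)"                 # ending: .com or .ca, lowercase only
    r"(?![a-zA-Z0-9])"              # ending is not the prefix of a longer word
)


def find_special_email(text):
    """Return the first matching email in text, or None if there is none."""
    match = SPECIAL_EMAIL.search(text)
    return match.group(0) if match else None


print(find_special_email("Contact: a_b@984125.ca!"))  # a_b@984125.ca
print(find_special_email("_abc@5.com"))               # None
print(find_special_email("a@12.com"))                 # None
```

Without the lookbehind, `re.search` would skip the leading underscore in the second call and return `abc@5.com`; whether that is acceptable depends on how the "cannot start with" rule is meant to apply to text surrounding the email.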