python-3.x email imap imaplib

How to set multiple conditions (AND and OR) on imaplib.IMAP4_SSL.search()


I need to filter emails and label them based on some conditions.

This is my code:

def get_inbox():

    os.chdir("C:/Users/simeone/Desktop/FilterEmails")
    df = {}
    df = pd.read_excel("Filtri.xlsx", encoding='utf-8', sheet_name = ['FROM', 'TEXT', 'SUBJECT'])
    filters = []
    for key in df.keys():
        fil = [ '(OR ' + key + ' ' + '"' + name + '"'+ ' UNSEEN)' for name in list(df[key][df[key].columns[0]])]
        str1 = ' '.join(fil)
        filters.append(str1)
    filtro = ' '.join(filters)
    

    mail = imaplib.IMAP4_SSL(host)
    mail.login(username, password)
    mail.select("inbox")


    _, search_data = mail.search(None, filtro)  

The code is not complete, but that is not the point: the problem is the search condition.

I import the conditions from an Excel file, where they are split into FROM, TEXT and SUBJECT sheets, and then build the search criteria from them.

The problem is that the code selects every unseen email, whatever its sender, text and subject.

I have the logic clear in my mind but cannot translate it into code correctly. What mail.search must do is: UNSEEN AND (FROM "####" OR SUBJECT "####"), that is, take all the unseen emails and put the label on those that either have that subject or are from that person.

In other words, label all emails that are from xxx OR have subject xxx, but that are also (AND) UNSEEN.


Solution

  • In the IMAP search language, AND is the default operation, and OR is a two-operand prefix operation.

    For AND you just stick them together: "a and b" is A B.

    For OR, that means that if you want "a or b", you need to write OR (A) (B). Technically the parentheses aren't needed, but they may help if your conditions get complex.

    If you want more than two things, you need to chain the ORs together, because each one takes exactly two operands. You could write "x or y or z" as either OR (OR X Y) Z or OR X (OR Y Z). Again, the parentheses are optional, but may help some servers parse it better.

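    Putting all that together, "a and (x or y or z)" is A OR OR X Y Z. For the condition in the question, that is UNSEEN OR FROM "####" SUBJECT "####". By contrast, each (OR FROM "####" UNSEEN) group built by the code in the question matches any message that is from that sender or is unseen, so every unseen message passes.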

    A lot of server software doesn't handle complex queries very well. If your query gets too complex or the server's implementation is marginal, you may want to consider caching the metadata yourself (using UID FETCH with BODY.PEEK[HEADER], which unlike BODY[HEADER] does not set the \Seen flag) and doing your searches locally. This data is theoretically immutable, so you should only have to fetch it once.
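The rules above (AND by juxtaposition, OR as a chained two-operand prefix operator) can be sketched in Python as follows. This is only an illustration: the helper names quote, or_chain and build_criteria are made up for this example, and the filters are assumed to arrive as a dict mapping an IMAP search key (FROM, TEXT, SUBJECT) to a list of strings, roughly what the Excel sheets in the question would yield.

```python
def quote(value):
    """Wrap a value in IMAP double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def or_chain(terms):
    """Chain search keys with two-operand prefix ORs, e.g. OR OR X Y Z."""
    if not terms:
        raise ValueError("need at least one search term")
    result = terms[0]
    for term in terms[1:]:
        # Each OR takes exactly two operands: the chain so far and the next term.
        result = f"OR {result} {term}"
    return result


def build_criteria(filters):
    """Build 'UNSEEN AND (any of the filters)' as an IMAP search string.

    `filters` is a hypothetical structure: {"FROM": [...], "SUBJECT": [...]}.
    """
    terms = [
        f"{key} {quote(value)}"
        for key, values in filters.items()
        for value in values
    ]
    # AND is implicit: UNSEEN followed by the OR chain means both must hold.
    return f"UNSEEN {or_chain(terms)}"


criteria = build_criteria({
    "FROM": ["alice@example.com"],
    "SUBJECT": ["invoice", "report"],
})
print(criteria)
```

The resulting string could then be passed as mail.search(None, criteria). Values containing non-ASCII characters would probably need a charset argument and byte-encoded criteria, which this sketch does not handle.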