Search code examples
pythonms-accesssqlalchemysqlalchemy-access

Query filter using CDate() produces inconsistent results


i try to get this Query running but it does not work like i expected.

In the table is a Date Field Formatted as Text (can't change it) and i need to filter this Column.

Dates looks like

11/03/2022 (d/m/Y)

My Query looks like

session.query(DBGEUK)\
        .filter(DBGEUK.VALIDATOR == '58')\
        .filter(func.CDate(DBGEUK.DATE) <= datetime.now())\
        .all()

There is a total of 24 entry's. When i run the Query above and print out the DBGEUK.DATE + datetime.now, this 9 entry's are my result.

27/03/2022 2022-03-28 19:06:49.465406
27/03/2022 2022-03-28 19:06:49.480988
27/03/2022 2022-03-28 19:06:49.480988
27/03/2022 2022-03-28 19:06:49.480988
28/03/2022 2022-03-28 19:06:49.480988
28/03/2022 2022-03-28 19:06:49.480988
28/03/2022 2022-03-28 19:06:49.480988
28/03/2022 2022-03-28 19:06:49.481612
28/03/2022 2022-03-28 19:06:49.481727

If i change the Query to greater then >= i got the other 15 entry's

04/03/2022 2022-03-28 19:09:09.030659
04/03/2022 2022-03-28 19:09:09.031659
04/03/2022 2022-03-28 19:09:09.031659
04/03/2022 2022-03-28 19:09:09.031659
04/03/2022 2022-03-28 19:09:09.031659
05/03/2022 2022-03-28 19:09:09.031659
05/03/2022 2022-03-28 19:09:09.031659
05/03/2022 2022-03-28 19:09:09.031659
05/03/2022 2022-03-28 19:09:09.032657
11/03/2022 2022-03-28 19:09:09.032657
12/03/2022 2022-03-28 19:09:09.032657
11/03/2022 2022-03-28 19:09:09.032657
11/03/2022 2022-03-28 19:09:09.032657
09/03/2022 2022-03-28 19:09:09.033654
09/03/2022 2022-03-28 19:09:09.033654

Thanks for your help in advanced.


Solution

  • The CDate() function tries to interpret date string literals according to the date formatting settings in the Windows control panel. With a "short date" setting of MM/dd/yyyy, CDate("03/07/2022") evaluates to March 7, 2022. With a "short date" setting of dd/MM/yyyy, CDate("03/07/2022") evaluates to July 3, 2022.

    However, if the date string represents an invalid date then CDate() will be "helpful" and return a valid date using the other format. In both cases above, CDate("14/03/2022") will evaluate to March 14, 2022.

    Unfortunately, this means that

    1. How the dates are interpreted depends on the Windows date format, which can vary from machine to machine and even from user to user on the same machine.
    2. The interpretation of dates can be inconsistent between ambiguous and unambiguous date strings.

    Therefore in this case we need to avoid using CDate() and parse the date string ourselves:

    from datetime import datetime
    
    from sqlalchemy import create_engine, Column, String, select, func
    from sqlalchemy.engine import URL
    from sqlalchemy.orm import declarative_base, Session
    
    accdb_path = r"C:\Users\Public\test\sqlalchemy-access\gord_test.accdb"
    connection_string = (
        "DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};"
        f"DBQ={accdb_path};"
        "ExtendedAnsiSQL=1;"
    )
    connection_url = URL.create(
        "access+pyodbc", query={"odbc_connect": connection_string}
    )
    engine = create_engine(connection_url)
    
    Base = declarative_base()
    
    
    class DBGEUK(Base):
        __tablename__ = "so71651145"
        DATE = Column(String(10), primary_key=True)
    
        def __repr__(self):
            return f"<DBGEUK(DATE='{self.DATE}')>"
    
    
    # create test environment
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)
    
    with Session(engine) as session:
        # example data
        session.add_all(
            [
                #                           How CDate() interprets the string
                #                           based on Windows' short date format:
                #
                #                             dd/MM/yyyy    MM/dd/yyyy
                #                             ------------  ------------
                DBGEUK(DATE="07/03/2022"),  # Mar 7, 2022   Jul 3, 2022
                DBGEUK(DATE="14/03/2022"),  # Mar 14, 2022  Mar 14, 2022
                DBGEUK(DATE="03/07/2022"),  # Jul 3, 2022   Mar 7, 2022
                DBGEUK(DATE="31/12/2022"),  # Dec 31, 2022  Dec 31, 2022
            ]
        )
        session.commit()
    
        # (for future readers of this answer)
        print(datetime.now())  # 2022-03-29 08:44:00.512366
    
        # original query
        qry = select(DBGEUK).filter(func.CDate(DBGEUK.DATE) <= datetime.now())
        results = session.scalars(qry).all()
        print(results)
        # [<DBGEUK(DATE='14/03/2022')>, <DBGEUK(DATE='03/07/2022')>]
        #
        # with Windows' short date format set to MM/dd/yyyy, CDate() interprets
        # the second value as March 7, not July 3
    
        # corrected query
        qry = select(DBGEUK).filter(
            func.DateSerial(
                func.CInt(func.Mid(DBGEUK.DATE, 7, 4)),  # year
                func.CInt(func.Mid(DBGEUK.DATE, 4, 2)),  # month
                func.CInt(func.Mid(DBGEUK.DATE, 1, 2)),  # day
            )
            <= datetime.now()
        )
        results = session.scalars(qry).all()
        print(results)
        # [<DBGEUK(DATE='07/03/2022')>, <DBGEUK(DATE='14/03/2022')>]
        #
        # all good