I want to do this but in SQLAlchemy. The only difference is that rather than only being able to get the most recent record, I want to be able to get the most recent record before a given timestamp. As long as I ensure rows are never deleted, this allows me to view the database as it was on a particular timestamp.
Let's say my model looks like this:
from datetime import datetime
from sqlalchemy import Column, Integer, DateTime
from sqlalchemy.ext.declarative include declarative_base
Base = declarative_base()
class User(Base):
__tablename__ = "users"
id_ = Column("id", Integer, primary_key=True, index=True, nullable=False)
timestamp = Column(DateTime, primary_key=True, index=True, nullable=False, default=datetime.utcnow())
# other non-primary attributes would go here
And I have this users
table (timestamps simplified):
| id_ | timestamp |
-------------------
0 1
0 4
0 6
1 3
2 7
2 3
For example, if I request a snapshot at timestamp = 4
, I want to get:
| id_ | timestamp |
-------------------
0 4
1 3
2 3
The best I can come up with is doing it procedurally:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
db_engine = create_engine(...)
SessionLocal = sessionmaker(bind=db_engine, ...)
db_session = SessionLocal()
def get_snapshot(timestamp: datetime):
all_versions = db_session.query(User).filter(User.timestamp <= timestamp).order_by(desc(User.timestamp))
snapshot = []
for v in all_versions:
if v.id_ not in (i.id_ for i in snapshots):
snapshot.append(v)
return snapshot
However, this gives me a list of model objects rather than a sqlalchemy.orm.query.Query
, so I have to treat the result differently to standard queries in
other parts of my code. Can this be done all in the ORM?
Thanks in advance
Have you tried:
all_versions = db_session.query(User, func.max(User.timestamp)).\
filter(User.timestamp <= timestamp).\
group_by(User.id_)
You can read more about generic functions in SQLAlchemy here