I'm currently running different queries where I want to exclude those users that have not been active for the last 36 hours.
My users table has a last_connection_time
, which is a timestamp with the last time that the user connected.
I was thinking about adding a users.last_connection_time < $timestamp_36hours_ago
to the relevant queries.
I'm a bit concerned though about how much may this influence the performance.
Would it be better to run a script every 2 - 6 hours that updates a boolean column in the users table called inactive
and then simply ignore those inactive users in the query via a whereNull('users.inactive')
?
Indexing last_connection_time
and then performing greater / lower than comparisons will be fast enough and an index would be used, that will help MySQL immensely when looking for records.
However, if you create a boolean column and then perform queries based on whereNull(users.inactive)
-> index cannot help you there, it's a low-cardinality column and MySQL might ignore the index here because it won't help to reduce I/O, thus inspecting the whole dataset - and that's what you want to avoid.
Your data-set probably doesn't exceed terabyte limit (or even a gigabyte), so you probably won't see too many differences between both approaches currently, but for ever-growing dataset I'd simply keep the timestamp column indexed and do >
or <
types of queries like you wanted to do in the first scenario.