I have a web MVC application with domain objects and data mappers. The class methods of the data mappers contain all database querying logic. I'm trying to avoid mirroring any database structure and, therefore, to achieve the maximum flexibility in constructing the sql statements. So, in principle, I'm trying to not make use of any ORM or ActiveRecord structure/pattern AT ALL.
Let me give you an example:
Normally, I could have an abstract class AbstractDataMapper inherited by all specific data mapper classes - like the UserDataMapper class. And then I could define a findById() method in AbstractDataMapper, to fetch a record of a specific table - like users - by a given id value, e.g. user id. But this would imply that I'd always fetch a record from a single table, without the possibility to use any left joins to also fetch some other details from some other tables corresponding to the given id - user id.
So, my question is: under these conditions, which I have imposed on myself, should I implement an abstract data mapper class, or should each data mapper class contain its own completely "proprietary" implementation of the data-access layer?
I hope I have expressed my idea clearly. Please tell me if anything was unclear or if you have any questions.
Thank you very much for your time and patience.
If I understood your point ...
Having all your concrete mappers inheriting SQL from a common class has several issues that you have missed:
id as the name for all of your PRIMARY KEY columns
Now, I'm gonna try to unpack each of those.
To create a shared findById() method, the only pragmatic approach is to build it around something like this:
"SELECT * FROM {$this->tableName} WHERE id = :id"
The main issue actually is the wildcard * symbol.
There are two major approaches for populating an entity using a data mapper: use setters or use reflection. In both cases the "names" of the properties/setters are implied by the columns that you have selected.
In a normal query you can do something like SELECT name AS fullName FROM ..., which lets you use the query itself for renaming the fields. But with a "unified approach", there are no good options.
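To illustrate the renaming point, here is a minimal sketch of a mapper that owns its SQL and uses aliases so that the selected names line up with the entity's setters. The User and UserMapper classes, the Users and Profiles tables and all their columns are hypothetical, and an in-memory SQLite database through PDO (with PHP 8.0 or newer) is assumed only to keep the example self-contained.

```php
<?php
declare(strict_types=1);

// Hypothetical domain object; nothing in it mirrors the table layout.
final class User
{
    private ?int $id = null;
    private string $fullName = '';
    private ?string $city = null;

    public function setId(int $id): void { $this->id = $id; }
    public function setFullName(string $fullName): void { $this->fullName = $fullName; }
    public function setCity(?string $city): void { $this->city = $city; }

    public function getId(): ?int { return $this->id; }
    public function getFullName(): string { return $this->fullName; }
    public function getCity(): ?string { return $this->city; }
}

// Hypothetical mapper that writes its own SQL, including the join.
final class UserMapper
{
    public function __construct(private PDO $connection) {}

    public function fetchById(int $id): ?User
    {
        // Aliases rename columns to match the setters: fullName -> setFullName().
        $sql = 'SELECT u.user_id AS id, u.name AS fullName, p.city AS city
                FROM Users AS u
                LEFT JOIN Profiles AS p ON p.user_id = u.user_id
                WHERE u.user_id = :id';
        $statement = $this->connection->prepare($sql);
        $statement->execute([':id' => $id]);
        $row = $statement->fetch(PDO::FETCH_ASSOC);
        if ($row === false) {
            return null; // no such user
        }

        $user = new User();
        $user->setId((int) $row['id']);
        $user->setFullName($row['fullName']);
        $user->setCity($row['city']);
        return $user;
    }
}

// Usage with throwaway, made-up data.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE Users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE Profiles (user_id INTEGER PRIMARY KEY, city TEXT)');
$pdo->exec("INSERT INTO Users (user_id, name) VALUES (1, 'Jane Doe')");
$pdo->exec("INSERT INTO Profiles (user_id, city) VALUES (1, 'Riga')");

$user = (new UserMapper($pdo))->fetchById(1);
echo $user->getFullName(), ' / ', $user->getCity(), PHP_EOL;
```

Because the query belongs to this one mapper, renaming a column or adding another join only touches this class; a shared SELECT * in a superclass would not allow either.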
id?
So, the thing is, unless you have a mapper-per-table structure (in which case an active record starts to look like a pragmatic option), you will end up with a few (really common) "edge case" scenarios for your mappers:
Your original idea would work just fine in a small scale project (with one or two mappers being an "edge case"). But with a large project, the usage of findById() will be the exception not the norm.
To actually get this findById() method in the superclass, you will need a way to communicate the table name to it. That would mean that you have something like protected $tableName in your class definition.
You can mitigate it by having abstract function getTableName() in your abstract mapper class, which, when implemented, returns the value of a global constant.
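The arrangement described above could look roughly like the following sketch. The USERS_TABLE constant, the Users table and its columns are hypothetical, findById() returns a plain row here just to keep the example short, and an in-memory SQLite database through PDO (with PHP 8.0 or newer) is assumed so the code can run on its own.

```php
<?php
declare(strict_types=1);

// Hypothetical global constant holding the table name.
const USERS_TABLE = 'Users';

abstract class AbstractDataMapper
{
    public function __construct(protected PDO $connection) {}

    // Each concrete mapper has to name the single table it works with.
    abstract protected function getTableName(): string;

    public function findById(int $id): ?array
    {
        // The SQL lives here, but the table name comes from the subclass,
        // so a wrong name surfaces as an error in this method.
        $sql = "SELECT * FROM {$this->getTableName()} WHERE id = :id";
        $statement = $this->connection->prepare($sql);
        $statement->execute([':id' => $id]);
        $row = $statement->fetch(PDO::FETCH_ASSOC);
        return $row === false ? null : $row;
    }
}

final class UserDataMapper extends AbstractDataMapper
{
    protected function getTableName(): string
    {
        return USERS_TABLE;
    }
}

// Usage with throwaway, made-up data.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec("INSERT INTO Users (id, name) VALUES (1, 'Jane Doe')");

$row = (new UserDataMapper($pdo))->findById(1);
```

Note that the superclass can only ever describe one table per mapper and assumes that the key column is called id.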
But what happens when your mapper needs to work with multiple tables?
It seems like a code smell to me, because information actually crosses two boundaries (for lack of a better word). When this code breaks, the error will be reported for the SQL in the superclass, which isn't where the error originated (especially if you go with constants).
This is a slightly more controversial opinion :)
As far as I can tell, the practice of calling all primary key columns id comes from various ORMs. The penalty that this incurs applies only to readability (and code maintenance). Consider these two queries:
SELECT ar.id, ac.id
FROM Articles AS ar LEFT JOIN
Accounts AS ac ON ac.id = ar.account_id
WHERE ar.status = 'published'

SELECT ar.article_id, ac.account_id
FROM Articles AS ar LEFT JOIN
Accounts AS ac USING(account_id)
WHERE ar.status = 'published'
As the DB schema grows and the queries become more complex, it gets harder and harder to keep track of what the "id" stands for in each case.
My recommendation would be to use the same name for a column when it is a primary key as when it is a foreign key (when possible, because in some cases, like for "closure tables", it's not viable). Basically, all columns that store IDs of the same type should have the same name.
As a minor bonus, you get the USING() syntax sugar.
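For anyone who wants to try the naming convention out, here is a small runnable sketch. The Articles and Accounts tables follow the queries above, but the title and login columns and the sample rows are made up, and an in-memory SQLite database through PDO is assumed (other databases may differ in the details of USING).

```php
<?php
declare(strict_types=1);

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// account_id keeps the same name as a primary key and as a foreign key.
$pdo->exec('CREATE TABLE Accounts (account_id INTEGER PRIMARY KEY, login TEXT NOT NULL)');
$pdo->exec('CREATE TABLE Articles (
    article_id INTEGER PRIMARY KEY,
    account_id INTEGER REFERENCES Accounts (account_id),
    title TEXT NOT NULL,
    status TEXT NOT NULL
)');
$pdo->exec("INSERT INTO Accounts (account_id, login) VALUES (7, 'jane')");
$pdo->exec("INSERT INTO Articles (article_id, account_id, title, status)
            VALUES (1, 7, 'Data mappers', 'published'), (2, 7, 'Draft notes', 'draft')");

// The shared column name is what makes USING possible here.
$sql = "SELECT ar.article_id, ar.title, ac.login
        FROM Articles AS ar
        LEFT JOIN Accounts AS ac USING (account_id)
        WHERE ar.status = 'published'";
$rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    echo implode(' | ', $row), PHP_EOL;
}
```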
Bad idea. You are basically breaking LSP.