Search code examples
haskellhaskell-lenslenses

Reconciling lens usage with database access


I've been playing around with lenses recently, and finding them very pleasant for their intended usage - digging into complex data structures. But one of the areas that I'd most appreciate them is in database access (specifically sqlite, but I think my question generalizes to most DBs), and yet I can't see any way to write lenses that don't heavily sacrifice either performance or granularity.

If I write a lens (or I think probably a Prism, in light of NULLable fields?) from a DB to a table, from a table to a row, and from a row to a column, each step of that incurs a DB access, meaning that what ought to be one access is at minimum 4.

On the other hand, if I aim to map DB access 1:1 with uses of the lens/prism, I get big do-everything lenses that can't be broken up into smaller pieces when I do want to just see what columns are in a table, and so on.

Does it make sense at all to use lenses with a DB, and if so am I missing an obvious way to avoid having to duplicate work to avoid unnecessary DB access?


Solution

  • It sound to me like you want to use lens in a way similar to linq IQueryable in c#.

    Eg if you have the types:

    data Project = Project {
      _projectId :: Int
      , _projectPriority :: Int
      , _projectName :: String
      , _projectTasks :: [Task]
       } deriving (Show)
    
    data Task = Task {
      _taskId :: Int
      , _taskName :: String
      , _taskEstimate :: Int
      } deriving (Show)
    
    
    makeLenses ''Project
    makeLenses ''Task
    

    And a database:

    create table projects ( id, name, priority);
    create table tasks (id, name, estimate, projectId);
    
    insert into projects values (1, 'proj', 1), (2, 'another proj', 2);
    
    insert into tasks values (1, 'task1', 30, 1), (2, 'another', 40, 1),
                            (3, 'task3', 20, 2), (4, 'more', 80, 2);
    

    If you wanted to get a list of task names from projects with priority greater than 1, it would be nice if you could use:

    highPriorityTasks :: IO [String]
    highPriorityTasks = db ^.. projects . filtered (\p -> p ^. projectPriority > 1 )
                        . projectTasks . traverse . taskName
    

    And have that query the database using the query:

    select t.name from projects as p 
    inner join tasks as t on t.projectId = p.id 
    where p.priority > 1;
    

    Unfortunately, that isn't possible with the library. Basically, to be efficient database wise, you (normally) have to do everthing in one query. It wouldn't be acceptable to do this:

    select * from projects where priority > 1;
    for each project:
       select name from tasks where projectId = <project>.id    
    

    Unfortunately, it isn't possible to decompose functions to know what built them up. Apart from the type, you can't find anything out about a function without running it. So there would be no way to extract the data out of the filtered function to help build the query. Nor would it be possible to extract the sub lens out of the full expression. So this isn't possible using the lens library.

    The best you can get at the moment is to query the database using one set of functions, and query the resulting data using lens. See this blog post about yesod for an example of this.


    A related question is if this is possible at all. To do so, we would need to create a sublanguage for numeric and string operators, and composition that tracks what is done. This could be possible. For example, you can build up a Num type that records everything done to it:

    data TrackedNum = TrackedNum :-: TrackedNum
                    | TrackedNum :+: TrackedNum
                    | TrackedNum :*: TrackedNum
                    | Abs TrackedNum
                    | Signum TrackedNum
                    | Value Integer
      deriving (Show)
    
    instance Num TrackedNum where
      a + b = a :+: b
      a * b = a :*: b
      a - b = a :-: b
      abs a = Abs a
      signum a = Signum a
      fromInteger = Value
    
    t :: TrackedNum
    t = 3 + 4 * 2 - abs (-34)
    
    > t 
    (Value 3 :+: (Value 4 :*: Value 2)) :-: Abs (Value 0 :-: Value 34)
    

    Repeat the process for boolean operators (you will need a new type class for this), list operators, and function composition (ie the Category class), and you should be able to make a "white-box" function, which then could be used to create an efficient sql query. This isn't a trivial undertaking though!