
The "Haskell way" to extract/accumulate results inside a predefined visitor-pattern iterator


I'm getting started with Haskell (after many years of C and C++) and have decided to attempt a small database project. I'm using an existing binding to a C database library (Database.KyotoCabinet). I'm struggling to get my head round how to do anything with the iterator interface, because the library's predefined iteration function fixes which effects the visitor can have.

The toy demo that iterates over the database and prints each entry (which works fine) is

test7 = do
  db <- openTree "testdatabase/mydb.kct" defaultLoggingOptions (Writer [] [])
  
  let visitor = \k v -> putStr (show k) >> putStr ":" >> putStrLn (show v) >> 
                        return (Left NoOperation)
  iterate db visitor False
  
  close db

Here iterate and the visitor type come from the library bindings, and the relevant types are

iterate :: forall db. WithDB db => db -> VisitorFull -> Writable -> IO ()
visitor :: ByteString -> ByteString -> IO (Either VisitorAction b)

But I can't see how to extract information from inside the iterator rather than process each entry individually: for example, collect all the keys beginning with 'a' in a list, or even just count the number of entries.

Am I limited because iterate just has the type IO (), so that I can't build in side effects and would have to rebuild this, replacing the library versions? The state monad seems, on paper, to address this, but the visitor type doesn't seem to allow me to maintain the state across subsequent visitor calls.

What would be the Haskell way to solve this?

Matthew

Edit - many thanks for the clear answer below, which both said that this is not the Haskell way and provided a solution. The answer led me to "Mutable objects", which I found to be a clear explanation of the options.


Solution

  • The kyotocabinet library unfortunately does not seem to support your operation. Ideally, besides iterate, it would expose a similar operation that returns something richer than IO (), say IO a or IO [a], and takes a correspondingly richer visitor function.

    Still, since we work inside IO, there is a workaround: we can exploit IORefs to collect results. I want to stress, though, that this is not idiomatic code one would write in Haskell, but something one is forced to use because of the limitation of this library.

    Anyway, the code would look something like this (untested):

    import Data.IORef (newIORef, modifyIORef, readIORef)

    test7 = do
      db <- openTree "testdatabase/mydb.kct" defaultLoggingOptions (Writer [] [])
      w <- newIORef []   -- create mutable var, initialize to []
      let visitor = \k v -> do
             putStrLn (show k ++ ":" ++ show v)
             modifyIORef w ((k,v):)  -- prepend (k,v) to the list w
             return (Left NoOperation)
      iterate db visitor False
      result <- readIORef w   -- get the whole list (in reverse visiting order)
      print result
      close db
    

    Since you come from C++, you might want to compare the code above to the following pseudo-C++:

    std::vector<std::pair<int,int>> w;
    db.iterate([&](int k, int v) {
       std::cout << k << ", " << v << "\n";
       w.push_back({k,v});
    });
    // here we can read w, even if db.iterate returns void
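
    Applied to the two tasks from the question (counting the entries and collecting the keys that begin with 'a'), the same IORef pattern might look roughly like the sketch below. Since the real database is not available here, mockIterate is a hypothetical stand-in for the library's iterate: it walks a plain list, and its visitor returns IO () instead of IO (Either VisitorAction b). countAndCollect is likewise a name made up for this illustration.

```haskell
import Control.Monad (forM_, when)
import qualified Data.ByteString.Char8 as B
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the library's 'iterate' (an assumption, not the
-- real API): it calls the visitor on every record and returns IO (), so
-- nothing can leave through the result type.
mockIterate :: [(B.ByteString, B.ByteString)]
            -> (B.ByteString -> B.ByteString -> IO ())
            -> IO ()
mockIterate records visitor = forM_ records (uncurry visitor)

-- Count all records and collect the keys that start with 'a'.
countAndCollect :: [(B.ByteString, B.ByteString)] -> IO (Int, [B.ByteString])
countAndCollect records = do
  countRef <- newIORef (0 :: Int)   -- running number of entries
  keysRef  <- newIORef []           -- matching keys, most recent first
  let visitor k _v = do
        modifyIORef' countRef (+ 1)
        when (B.isPrefixOf (B.pack "a") k) $
          modifyIORef' keysRef (k :)
  mockIterate records visitor
  n  <- readIORef countRef
  ks <- readIORef keysRef
  return (n, reverse ks)            -- reverse to restore visiting order
```

    The strict modifyIORef' is used so that the counter does not build up a chain of unevaluated (+ 1) thunks over a large database.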
    

    Again, this is not something I would consider idiomatic Haskell.
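
    If the IORef has to exist, one option is to confine it to a single helper and write everything else as an ordinary pure fold. The sketch below is only an illustration under assumptions: foldVisit, countEntries and keysStartingWith are hypothetical names, and mockIterate again stands in for the library's iterate (here over String pairs, since foldVisit is polymorphic in the key and value types).

```haskell
import Control.Monad (forM_)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for an iterator that only offers IO ().
mockIterate :: [(String, String)] -> (String -> String -> IO ()) -> IO ()
mockIterate records visitor = forM_ records (uncurry visitor)

-- Turn any "call this visitor on every record" action into a left fold.
-- The IORef is created, updated and read here and nowhere else.
foldVisit :: ((k -> v -> IO ()) -> IO ())  -- the iteration, awaiting a visitor
          -> (acc -> k -> v -> acc)         -- pure step function
          -> acc                            -- initial accumulator
          -> IO acc
foldVisit runIteration step start = do
  ref <- newIORef start
  runIteration (\k v -> modifyIORef' ref (\acc -> step acc k v))
  readIORef ref

-- The two queries from the question, written as pure step functions.
countEntries :: [(String, String)] -> IO Int
countEntries records = foldVisit (mockIterate records) (\n _ _ -> n + 1) 0

keysStartingWith :: Char -> [(String, String)] -> IO [String]
keysStartingWith c records =
  reverse <$> foldVisit (mockIterate records) step []
  where
    -- Keys are prepended, so the result is reversed afterwards.
    step acc k _ = if take 1 k == [c] then k : acc else acc
```

    With the real bindings, the first argument would presumably be something like \vis -> iterate db (\k v -> vis k v >> return (Left NoOperation)) False, which is untested here. The mutable state is still there, but callers only ever see a step function and a result.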