Search code examples
macrosclojurecassandradestructuring

clojure-variable-names for database_column_names


This is a "what's most idiomatic in Clojure" question.

I'm using Cassandra for my DB, with Alia as my Clojure driver (both Cassandra and Alia work phenomenally well -- couldn't be happier).

The problem is this: Cassandra uses underscores (not dashes) in column names, and Clojure prefers dashes to underscores. So "user-key" in Clojure is "user_key" in Cassandra. How best to handle the mapping of Cassandra column names to Clojure variables?

Because I'm using prepared statements for my CQL queries, I think the fact that column names contain underscores and not dashes is more than an implementation detail to be abstracted away -- I'm frequently putting CQL queries as strings into my Clojure code, and I think it's important to represent the CQL as it actually is. I have considered approaches that auto-magically translate dashes to underscores in query strings, so that there's a Clojure version of the CQL that gets mapped to the Cassandra version of the CQL, but this seems like an inappropriate level of abstraction. Besides you'd still need to use underscores when you ran CQL queries directly in Cassandra for troubleshooting, so you'd need to keep two different representations of column names in your head. Seems like the wrong approach.

The approach I've ended up taking is to perform the mapping in a Clojure destructuring map, like this:

(let [{user-key :user_key, user-name :user_name} 
    (conn/exec-1-row-ps "select user_key,user_name from users limit 1")] )

("conn/exec-1-row-ps" is my convenience function that just looks the CQL string up in a map, and uses the previously-prepared statement if present, or else prepares the statement and stashes it in the map, and then executes the prepared statement and returns the first row of the result set, or throws an exception if more than one row is returned).

if I use the more concise {:keys []} destructuring method, then I'm stuck with underscores in my Clojure variable names:

(let [{:keys [user_key user_name]} ...

That was the first approach that I tried, but it gets ugly very fast, as variable names with underscores seep through the code, and come head-to-head with ones with dashes. Confusing.

Having been confronted with this issue for a long time, doing the conversion in the destructuring map, where Clojure "variable-name" and Cassandra "column_name" are side-by-side feels like the best solution. It also lets me expand out short_col_nms to more-descriptive-variable-names when I want to.

This bears some resemblance to the mapping that Clojure does of underscores in filenames to dashes in namespaces, so it feels like there is some precedent for doing the mapping like this. In the filename/namespace case, Clojure does the mapping automagically, and so perhaps the direct analog would be a version of {:keys []} destructuring that mapped dashes to underscores.

I'm a relative newbie to Clojure, so I realize there may be better ways to do this. Hence my question.

One improvement that I've considered is writing a macro that builds the destructuring map dynamically at compile-time. But I don't know how to write a macro that operates that early in the compilation process.


Solution

  • After an upgrade to my Clojure macro-fu, the answer I've found is to use a macro that does the destructuring, including the conversion from snake_case to kebab-case, for me.

    An ancillary advantage of using the macro is that I can also do some basic compile-time validation of my CQL column names and parameters. The validation is very basic, but it will catch 90% of the head-slapper errors that I typically make.

    Here's the macro. This macro only handles the single-row result cases (which for me in Cassandra is more than 50% of cases). I'm going to work on a separate set of macros to handle multiple-row results.

    (defmacro with-single-row-cql-selects 
    
    "given a vector of one or more maps of the form:
    
      {:bindings [title doc-key version]
          :cql \"SELECT * from dtl_blog_entries where blog_key=? and n=?\"
          :params [ blog-key (int n) ]}
    
    evaluates body with the symbols in :bindings bound to the results of the CQL in :cql executed with the params in :params
    
    the CQL should be 'single-row' CQL that returns only one row.  in any case, the macro will take only the first row of the results1
    
    notes:
    1) the macro handles the conversion from kebab-case (Clojure) to snake_case (Cassandra) automagically.  specify your bindings using camel-case
    2) to bind to a different symbol than the variable name, use the form symbol-name:column-name in the bindings vector, e.g.:
    
      {:bindings [blog-name:title]
          :cql \"select title from dtl_blogs where blog_key=? and comm_key=? and user_key=?\"
          :params [ blog-key comm-key user-key]}
    
    3) the macro will do very basic compile-time checking of your cql, including
    
    a) validating that you have the same number of '?'s in your cql as params
    b) validating that the column names corresponding to the bindings are present in the CQL (or that this is a 'select *' query)
    
    "
      [select-bindings & body]
      (let [let-bindings# 
            (into []
                  (letfn ((make-vec#
                            ;; puts a single element into a vector, passes a vector straight through, and complains if v is some other kind of collection
                            [v#]
                            (cond
                             ;; missing, just use an empty vector
                             (not v#) []
                             (vector? v#) v#
                             (coll? v#) (throw (IllegalArgumentException. (str v# " should be a vector")))
                             :else [v#])))
                    (apply concat
                           (for [{:keys [cql params bindings]} select-bindings]
                             (let [vec-bindings# (make-vec# bindings)
                                   vec-params# (make-vec# params)
                                   binding-names# (map #(-> % name (clojure.string/split #":" ) first symbol) vec-bindings#)
                                   col-names# (map #(-> (or (-> % name (clojure.string/split #":" ) second ) %)
                                                       (clojure.string/replace \- \_) ) vec-bindings#)
    
                                   destructuring-map# (zipmap binding-names# (map keyword col-names#))
                                   fn-call# `(first (prep-and-exec ~cql ~vec-params#))]
                               ;; do some *very basic* validation to catch the some common typos / head slappers
                               (when (empty? vec-bindings#)
                                 (throw (IllegalArgumentException. "you must provide at least one binding")))
                               ;; check that there are as many ?s as there are params
                               (let [cql-param-count (count (re-seq #"\?" cql))]
                                 (when (not= cql-param-count (count vec-params#))
                                   (throw (IllegalArgumentException. (str "you have " cql-param-count
                                                                          " param placeholders '?' in your cql, but " 
                                                                          (count vec-params#) " params defined; cql: " cql ", params:" vec-params#)))))
                               ;; validate that the col-names are present  
                               (when (empty? (re-seq #"(?i)\s*select\s+\*\s+from" cql)) ;; if a 'select *' query, no validation possible
                                 (doseq [c col-names#]
                                   (when  (empty? (re-seq (re-pattern (str "[\\s,]" c "[\\s,]")) cql))
                                     (throw (IllegalArgumentException. ( str "column " c " is not present in the CQL"))))))
                               [destructuring-map# fn-call#])))))]
    
        `(let ~let-bindings#
           ~@body)))
    

    and here's an example use of the macro:

    (conn/with-single-row-cql-selects
    [{:bindings [blog-title]
      :cql "select blog_title from dtl_blogs where blog_key=? and comm_key=? and user_key=?"
      :params [ blog-key comm-key user-key]}]
      (println "blog title is " blog-title))
    

    and the macroexpand-1 (minus the println):

    (clojure.core/let [{blog-title :blog_title} (clojure.core/first
                                                  (dreamtolearn.db.conn/prep-and-exec
                                                    "select blog_title from dtl_blogs where blog_key=? and comm_key=? and user_key=?"
                                                    [blog-key
                                                     comm-key
                                                     user-key]))])
    

    here's another example with the output from the REPL:

    dreamtolearn.db.conn> (with-conn
      (with-single-row-cql-selects 
        [{:cql "select * from dtl_users limit 1"
          :bindings [user-key name date-created]}
    
         {:cql "select badges,founder_user_key,has_p_img from dtl_communities where comm_key=?"
          :bindings [badges founder-user-key has-profile-image:has-p-img]
          :params "5LMO8372ZDKHF798RKGNA57O3"}]
    
        (println "user-key: " user-key "  name: " name "  date-created: " date-created "  badges: " badges
                 "  founder-user-key: " founder-user-key " has-profile-image: " has-profile-image)))
    
    user-key:  9MIGXXW2QJWPGL0WJL4X0NGWX   name:  Fred Frennant   date-created:  1385131440791   badges:  comm-0   founder-user-key:  F2V3YJKBEDGOLLG11KTMPJ02QD  has-profile-image:  true
    nil
    dreamtolearn.db.conn> 
    

    and the macroexpand-1:

    (clojure.core/let [{date-created :date_created,
                        name :name,
                        user-key :user_key} (clojure.core/first
                                              (dreamtolearn.db.conn/prep-and-exec
                                                "select * from dtl_users limit 1"
                                                []))
                       {has-profile-image :has_p_img,
                        founder-user-key :founder_user_key,
                        badges :badges} (clojure.core/first
                                          (dreamtolearn.db.conn/prep-and-exec
                                            "select badges,founder_user_key,has_p_img from dtl_communities where comm_key=?"
                                            ["5LMO8372ZDKHF798RKGNA57O3"]))])