
Reversible CSV parsing


Prolog newbie here. In SWI Prolog, I'm trying to figure out how to parse a simple line of CSV reversibly, but I'm stuck. Here's what I've got:

csvstring1(S, L) :-
  split_string(S, ',', ',', T),
  maplist(atom_number, T, L).
   
csvstring2(S, L) :-
  atomic_list_concat(T, ',', S),
  maplist(atom_number, T, L).

% This one is the same except that maplist comes first. 
csvstring3(S, L) :-
  maplist(atom_number, T, L),
  atomic_list_concat(T, ',', S).

Now csvstring1 and csvstring2 work in a "forward" manner:

?- csvstring1('1,2,3,4', L).
L = [1, 2, 3, 4].

?- csvstring2('1,2,3,4', L).
L = [1, 2, 3, 4].

But not csvstring3:

?- csvstring3('1,2,3,4', L).
ERROR: Arguments are not sufficiently instantiated

Conversely, csvstring3 works in reverse, but the other two predicates do not:

?- csvstring3(L, [1,2,3,4]).
L = '1,2,3,4'.

?- csvstring1(L, [1,2,3,4]).
ERROR: Arguments are not sufficiently instantiated

?- csvstring2(L, [1,2,3,4]).
ERROR: Arguments are not sufficiently instantiated

How can I combine these into a single predicate?


Solution

  • I don't know of a particularly newbie-friendly way to do it that doesn't compromise somewhere. This is the easiest:

    csvString_list(String, List) :-
        ground(String),
        atomic_list_concat(Temp, ',', String),
        maplist(atom_number, Temp, List).
    
    csvString_list(String, List) :-
        ground(List),
        maplist(atom_number, Temp, List),
        atomic_list_concat(Temp, ',', String).
    

    but it leaves a spurious choicepoint behind: when the first clause succeeds, Prolog still has the second clause left to try. That is mildly annoying.
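
    To observe the leftover choicepoint without relying on the toplevel's `;` prompt, here is a small sketch. It assumes SWI-Prolog's built-in call_cleanup/2, whose cleanup goal runs as soon as the main goal has no alternatives left. The names csv_v1/2 and exits_det/2 are hypothetical and only serve this illustration; csv_v1/2 is a copy of the two clauses above.

    ```prolog
    :- use_module(library(apply)).   % maplist/3

    % Copy of the two-clause version above, under a hypothetical name.
    csv_v1(String, List) :-
        ground(String),
        atomic_list_concat(Temp, ',', String),
        maplist(atom_number, Temp, List).
    csv_v1(String, List) :-
        ground(List),
        maplist(atom_number, Temp, List),
        atomic_list_concat(Temp, ',', String).

    % exits_det(:Goal, -Det)
    % Run Goal once. Det is true if Goal exited without leaving a
    % choicepoint, and false if alternatives were still pending.
    exits_det(Goal, Det) :-
        call_cleanup(Goal, Flag = true),   % Flag is bound only once Goal is finished for good
        (   Flag == true
        ->  Det = true
        ;   Det = false
        ),
        !.                                 % discard whatever choicepoint is left
    ```

    Calling exits_det(csv_v1('1,2,3', L), Det) should report Det = false, because the second clause is still waiting to be tried after the first one succeeds.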

    This cuts the choicepoint, which is nice when using it, but it is a poor practice to get into without being aware of what the cut means:

    csvString_list(String, List) :-
        ground(String),
        atomic_list_concat(Temp, ',', String),
        maplist(atom_number, Temp, List),
        !.
    
    csvString_list(String, List) :-
        ground(List),
        maplist(atom_number, Temp, List),
        atomic_list_concat(Temp, ',', String).
    

    This uses if-then-else (->/2 with ;/2), which is less code:

    csvString_list(String, List) :-
        (   ground(String)
        ->  atomic_list_concat(Temp, ',', String),
            maplist(atom_number, Temp, List)
        ;   maplist(atom_number, Temp, List),
            atomic_list_concat(Temp, ',', String)
        ).
    

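    but it is logically impure, because the branch taken depends on how instantiated the arguments are at call time rather than on their values. The declarative alternative is to reify the branching with if_ (from library(reif)), which isn't built into SWI-Prolog and is less simple to use.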
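
    If the impurity is acceptable, a pragmatic middle ground is to keep the if-then-else but make the mode check explicit and raise an error when neither argument is usable, instead of falling into the second branch by default. The following is a minimal sketch under that assumption, not the code from the answer above: the name csv_atom_numbers/2 is hypothetical, and it relies on instantiation_error/1 from SWI-Prolog's library(error).

    ```prolog
    :- use_module(library(error)).   % instantiation_error/1
    :- use_module(library(apply)).   % maplist/2, maplist/3

    % csv_atom_numbers(?Atom, ?Numbers)
    % Hypothetical name. Parses when Atom is bound, generates when Numbers
    % is a proper list of numbers, and raises an error otherwise.
    csv_atom_numbers(Atom, Numbers) :-
        (   nonvar(Atom)
        ->  atomic_list_concat(Parts, ',', Atom),     % split on commas
            maplist(atom_number, Parts, Numbers)      % '1' -> 1
        ;   is_list(Numbers),
            maplist(number, Numbers)
        ->  maplist(atom_number, Parts, Numbers),     % 1 -> '1'
            atomic_list_concat(Parts, ',', Atom)      % join with commas
        ;   instantiation_error(Atom-Numbers)
        ).
    ```

    With this, csv_atom_numbers('1,2,3,4', L) gives L = [1, 2, 3, 4] and csv_atom_numbers(A, [1,2,3,4]) gives A = '1,2,3,4', while calling it with both arguments unbound raises an instantiation error rather than guessing a direction.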

    Or you could write a grammar with a DCG, which is not newbie territory:

    
    :- set_prolog_flag(double_quotes, chars).
    :- use_module(library(dcg/basics)).
    
    csvTail([N|Ns]) --> [','], number(N), csvTail(Ns). 
    csvTail([])     --> [].
    
    csv([N|Ns]) --> number(N), csvTail(Ns).
    

    e.g.

    ?- phrase(csv(Ns), "11,22,33,44,55").
    Ns = [11, 22, 33, 44, 55]
    
    
    ?- phrase(csv([11, 22, 33, 44, 55]), String).
    String = [49, 49, ',', 50, 50, ',', 51, 51, ',', 52, 52, ',', 53, 53]
    

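    but now you're back to it leaving spurious choicepoints while parsing, and you have to deal with the historic split between strings, atoms, and character codes in SWI-Prolog. The generated list mixes character codes (49 is the code for the digit 1) with the character ',' that comes from the grammar, so it stands for 11,22,33,44,55 even though it doesn't look like it. It will not unify as-is with "11,22,33,44,55", which the double_quotes flag reads as a list of one-character atoms, because the code 49 and the atom '1' are different terms.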
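
    One way to sidestep the mixed list of codes and characters seen in the output above is to keep the grammar on character codes throughout and convert to and from an atom at the boundary with atom_codes/2. The sketch below is an assumption-based variant, not the grammar from the answer: it leaves the double_quotes flag at its SWI-Prolog 7+ default (so the "," literal in the DCG body is treated as codes), it wraps phrase/2 in once/1 to drop the leftover choicepoints, and the names csv_numbers//1, csv_rest//1 and csv_atom_numbers/2 are hypothetical.

    ```prolog
    :- use_module(library(dcg/basics)).   % number//1
    :- use_module(library(error)).        % instantiation_error/1

    % One number followed by zero or more ",Number" groups, over codes.
    csv_numbers([N|Ns]) --> number(N), csv_rest(Ns).

    csv_rest([N|Ns]) --> ",", number(N), csv_rest(Ns).
    csv_rest([])     --> [].

    % csv_atom_numbers(?Atom, ?Numbers)
    % Parse when Atom is bound, generate when Numbers is ground.
    csv_atom_numbers(Atom, Numbers) :-
        (   nonvar(Atom)
        ->  atom_codes(Atom, Codes),
            once(phrase(csv_numbers(Numbers), Codes))
        ;   ground(Numbers)
        ->  once(phrase(csv_numbers(Numbers), Codes)),
            atom_codes(Atom, Codes)
        ;   instantiation_error(Atom-Numbers)
        ).
    ```

    Both directions then work on atoms: csv_atom_numbers('11,22,33', Ns) parses, and csv_atom_numbers(A, [11,22,33]) generates the atom '11,22,33'. The mode test is still impure in the sense discussed earlier; the sketch only hides the representation issue.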