
How do you parse values from a txt file into a list of records in OCaml?


I'm trying to learn OCaml and am having difficulty with parsing a file into a list of records. Let's say I have a text file with the following format:

Jim Bob, red
Steve Black, blue

etc.

I would like to parse these comma-separated lines into a list of records, which I can later use for basic list operations such as sorting. The record type is:

type person_info =
{
  name : string;
  favorite_color : string;
}

I have the parse function:

let parse_csv =
  let regexp = Str.regexp (String.concat "\\|" [
                             "\"\\([^\"\\\\]*\\(\\\\.[^\"\\\\]*\\)*\\)\",?";
                             "\\([^,]+\\),?";
                             ",";
                           ]) in
  fun text ->
    let rec loop start result =
      if Str.string_match regexp text start then
        let result =
          (try Str.matched_group 1 text with Not_found ->
             try Str.matched_group 3 text with Not_found ->
               "") :: result in
        loop (Str.match_end ()) result
      else
        result in
    List.rev ((if
                 try String.rindex text ',' = String.length text - 1
                 with Not_found -> false
               then [""] else [])
              @ loop 0 [])

That splits each line into fields for me. However, I have no idea how to read the fields into a list of records, and I can't even get the result to parse properly into an array:

let () =
  let ic = open_in Sys.argv.(1) in
  let lines = ref [] in
  try
    while true do

    lines := Array.of_list (parse_csv (input_line ic))

    done
  with End_of_file ->
    close_in ic

This will work fine without calling parse_csv, but fails when I try to parse.


Solution

  • Note that there is a CSV library, which you can install with opam install csv. You can then easily read the file (shown here in the interactive toplevel):

    # #require "csv";;
    /home/chris/.opam/system/lib/csv: added to search path
    /home/chris/.opam/system/lib/csv/csv.cma: loaded
    # let c = Csv.load "data.csv";;
    val c : Csv.t = [["Jim Bob"; "red"]; ["Steve Black"; "blue"]]
    

    You can then easily convert it to your favorite format:

    # let read_people fname =
      Csv.load fname
      |> List.map (function [name; favorite_color] -> {name; favorite_color }
                          | _ -> failwith "read_people: incorrect file");;
    val read_people : string -> person_info list = <fun>
    # read_people "data.csv";;
    - : person_info list =
    [{name = "Jim Bob"; favorite_color = "red"};
     {name = "Steve Black"; favorite_color = "blue"}]
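
If installing a library is not an option, the same result can be sketched with the standard library alone. The version below is a minimal sketch, assuming every line holds exactly one comma and no quoted fields (so it is not a general CSV parser), and assuming OCaml 4.08 or later for Fun.protect. The helper names person_of_line and sort_by_name are illustrative and not part of the original answer. It also sidesteps the type error in the question's loop, where lines is a list ref but is assigned an array.

```ocaml
type person_info = {
  name : string;
  favorite_color : string;
}

(* Parse one "name, color" line. Returns None when the line does not
   contain exactly two comma-separated fields (e.g. a blank line). *)
let person_of_line line =
  match String.split_on_char ',' line with
  | [name; color] ->
    Some { name = String.trim name; favorite_color = String.trim color }
  | _ -> None

(* Read the whole file into a list of records, in file order.
   Lines that do not parse are skipped rather than raising. *)
let read_people fname =
  let ic = open_in fname in
  let rec loop acc =
    match input_line ic with
    | line ->
      (match person_of_line line with
       | Some p -> loop (p :: acc)
       | None -> loop acc)
    | exception End_of_file -> List.rev acc
  in
  (* Close the channel even if reading raises. *)
  Fun.protect ~finally:(fun () -> close_in ic) (fun () -> loop [])

(* Example of a basic list operation on the result: sort by name. *)
let sort_by_name people =
  List.sort (fun a b -> String.compare a.name b.name) people
```

Calling sort_by_name (read_people "data.csv") then gives the records ordered by name. Skipping malformed lines is a design choice made here for brevity; raising an exception, as the Csv-based read_people above does, may be preferable when bad input should not pass silently.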