SML splitting string on first space


Right now I read the entire input file with TextIO.inputAll and then use String.tokens to split it into words at every whitespace character.

val file = TextIO.openIn input
val contents = TextIO.inputAll file
val _ = TextIO.closeIn file
val words = String.tokens Char.isSpace contents

Ex) "red blue green" would look like this

["red", "blue", "green"]

However, now I would like to change it to only split the string at the first occurrence of a space char on each line.

Ex) "red blue green" should look like

["red", "blue green"]

I have a feeling I will need something other than inputAll to accomplish this. My main question is: how do I split each line only at its first space?


Solution

  • TextIO.inputAll is fine. String.tokens, however, is not the right tool for the job, because it splits at every delimiter and cannot stop after the first one. Personally I would just write my own function, using String.explode and String.implode to convert between a string and a char list.

    fun splitCharsFirstSpace cs =
      case cs of
        [] => ([], [])
      | c :: cs' =>
          if Char.isSpace c then ([], cs')
          else let val (l, r) = splitCharsFirstSpace cs'
               in (c :: l, r)
               end
    
    fun splitFirstSpace s =
      let
        val (l, r) = splitCharsFirstSpace (String.explode s)
      in
        (String.implode l, String.implode r)
      end
    

    In context, you could use this as follows.

    val file = TextIO.openIn input
    val contents = TextIO.inputAll file
    val _ = TextIO.closeIn file
    val lines = String.tokens (fn c => c = #"\n") contents
    val lines' = List.map splitFirstSpace lines
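
    If you would rather not read the whole file at once, a line-by-line variant is also possible. The following is only a sketch: readSplitLines and stripNewline are hypothetical names, and it assumes a Basis Library version in which TextIO.inputLine returns a string option and keeps the trailing newline on each line.

    ```sml
    (* Hypothetical helper: drop one trailing newline, if present. *)
    fun stripNewline line =
      if String.isSuffix "\n" line
      then String.substring (line, 0, String.size line - 1)
      else line

    (* Hypothetical helper: read the file at `path` one line at a time and
       apply `split` to each line, returning the results in file order. *)
    fun readSplitLines split path =
      let
        val file = TextIO.openIn path
        fun loop acc =
          case TextIO.inputLine file of
            NONE => List.rev acc
          | SOME line => loop (split (stripNewline line) :: acc)
        val result = loop []
        val _ = TextIO.closeIn file
      in
        result
      end
    ```

    With the function above, readSplitLines splitFirstSpace input should give the same pairs as lines', with one difference: String.tokens silently drops blank lines, whereas this version keeps them as ("", "").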
    

    For example, if your input file was this:

    red blue green
    yellow orange purple pink
    

    then lines' would look like this:

    [("red", "blue green"), ("yellow", "orange purple pink")]
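
    As a possible alternative to going through a char list, the Basis Library's Substring structure can do the same split directly. This is a sketch rather than the method used above; splitAtFirstSpace is a hypothetical name. Like the original, it splits on any character accepted by Char.isSpace (so tabs count too) and returns an empty second component when the line contains no whitespace.

    ```sml
    (* Sketch: split at the first whitespace character using Substring. *)
    fun splitAtFirstSpace s =
      let
        (* l is the longest prefix without whitespace; r is the remainder,
           which starts at the first whitespace character or is empty. *)
        val (l, r) = Substring.splitl (not o Char.isSpace) (Substring.full s)
      in
        (* triml 1 drops the separator; on an empty substring it is a no-op. *)
        (Substring.string l, Substring.string (Substring.triml 1 r))
      end

    val a = splitAtFirstSpace "red blue green"  (* ("red", "blue green") *)
    val b = splitAtFirstSpace "red"             (* ("red", "") *)
    val c = splitAtFirstSpace " red blue"       (* ("", "red blue") *)
    ```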