I am writing a function that takes a filename and a list of pairs of characters to substitute while reading the file. I am currently getting an error in one of my helper functions:
prac.sml:177.5-182.12 Error: right-hand-side of clause doesn't agree with function result type [tycon mismatch]
expression: string option -> string * string -> unit
result type: TextIO.elem option -> string * string -> unit
Here is the function that gives the error. I don't understand exactly what could be causing this to happen, can anybody help me see what's going wrong?
fun echoFile (infile) (c) (x,y) =
    if isSome c
    then (
        printChar (valOf c) (x,y);
        echoFile infile (TextIO.input1 infile) (x,y)
    ) else ()
Here is the printChar function:
fun printChar (c) (x,y) =
    if x = c
    then print y
    else print c
And here is the function that calls it.
fun fileSubst _ [] = ()
  | fileSubst inputFile ((x,y)::xs) =
    let
        val infile = TextIO.openIn inputFile
    in
        echoFile infile TextIO.input1(infile) (x,y);
        TextIO.closeIn(infile);
        fileSubst inputFile xs
    end
Here is some feedback on the code you've written:
The function TextIO.input1 has the type TextIO.instream → TextIO.elem option. When you inspect the TextIO structure (e.g. by writing open TextIO; in an sml prompt), you will find the definition type elem = char. So treat the output like a char and not a string. You could use the function str of type char → string. But consider using line buffering since reading files one character at a time is expensive in terms of system calls and allocation.
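As a minimal sketch of the str approach, assuming x and y stay strings as in the question (the error message shows string * string), the helpers could be adjusted as follows. The name substChar is a hypothetical helper introduced here only so the decision can be inspected without printing:

```sml
(* Hypothetical helper: pick the text to emit for one input character.
   x and y remain strings, as in the question; c is converted with str. *)
fun substChar (c : char) (x : string, y : string) : string =
  let val s = str c
  in if x = s then y else s end

(* Same role as the question's printChar, but c is now a char. *)
fun printChar c (x, y) = print (substChar c (x, y))

(* c has type char option, matching the result of TextIO.input1. *)
fun echoFile infile c (x, y) =
  case c of
      NONE => ()
    | SOME ch => ( printChar ch (x, y)
                 ; echoFile infile (TextIO.input1 infile) (x, y) )
```

With this change the second argument of echoFile has type TextIO.elem option, which is what the compiler expected in the error message.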
I've removed unnecessary semicolons: The ones after fun, val and other declarations are only needed in the REPL to get immediate results. The ; between expressions is an operator.
I've removed unnecessary parentheses. You do need parentheses when constructing tuples ((x,y)) and when declaring precedence. For example, echoFile infile (TextIO.input1 infile) (x,y) says that echoFile is a function with three arguments, and the second argument is TextIO.input1 infile, which is itself a function applied to an argument. But you don't need a second pair of parentheses to signify function application. That is, TextIO.input1 infile is just as good as TextIO.input1(infile), just like you don't bother to write (42) every time you have the number 42.
This means you still have a bug in fileSubst on this line:
echoFile infile TextIO.input1(infile) (x,y)
since this is interpreted as echoFile having four arguments: infile, TextIO.input1, (infile) and (x,y). It may seem that TextIO.input1 and (infile) stick together because there's no space gap, but function application is recognized as the positioning of a function in front of its argument, not the presence of parentheses. Also, function application associates to the left, so if we're adding explicit parentheses to the line above, it becomes:
(((echoFile infile) TextIO.input1) (infile)) (x,y)
To overcome the left-associativity, we write:
echoFile infile (TextIO.input1 infile) (x,y)
which gets interpreted as (the parentheses around TextIO.input1 infile are the explicit ones):
((echoFile infile) (TextIO.input1 infile)) (x,y)
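The same rule can be tried on a small, self-contained case. The functions add3 and double below are hypothetical stand-ins for echoFile and TextIO.input1, used only to show how application groups:

```sml
(* Hypothetical stand-ins: a curried three-argument function
   and a one-argument function. *)
fun add3 a b c = a + b + c
fun double n = 2 * n

(* Parentheses group "double 2" into a single argument. *)
val viaParens = add3 1 (double 2) 3

(* Extra parentheses around the argument change nothing. *)
val redundant = add3 1 (double (2)) 3

(* By contrast, "add3 1 double(2) 3" would be read as
   (((add3 1) double) (2)) 3 and rejected by the type checker,
   because double itself would be passed where an int is expected. *)
```

Both bindings evaluate to 8, since 1 + 4 + 3 = 8.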
It seems that your function fileSubst is supposed to replace every occurrence of the character x with the character y. I'd probably call this a "file map", since it resembles quite closely the library function String.map of type (char → char) → string → string. Whether you keep a list of (x,y) mappings or a char → char function is quite similar.
I'd probably write a function fileMap with the type (char → char) → instream → outstream → unit to resemble String.map:
fun fileMap f inFile outFile =
    let fun go () =
            case TextIO.inputLine inFile of
                NONE   => ()
              | SOME s => ( TextIO.output (outFile, String.map f s) ; go () )
    in go () end
And then use it e.g. like:
fun cat () = fileMap (fn c => c) TextIO.stdIn TextIO.stdOut
fun fileSubst pairs =
    fileMap (fn c => case List.find (fn (x,y) => x = c) pairs of
                         NONE       => c
                       | SOME (x,y) => y)
Some thoughts on these:
When arguments for similar functions can be either files or filenames, I'd like the distinction to be clearer in the variable name. E.g. inputFile vs. infile isn't doing it for me. I'd rather have e.g. inFile and filePath.
Whether a function should take a file path or an instream depends, I guess, on how you expect to compose it. So a very generic function like fileMap might take instream / outstream, but it might just as well take file paths. If you're making both types of functions, it's probably nice to either distinguish them by name or separate them into different modules.
You probably want to deal with arbitrary outstreams, not just TextIO.stdOut, since you're dealing with arbitrary instreams, too. You can always special-case standard input/output like in cat.
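To illustrate the path-versus-stream distinction, here is a rough sketch of a path-based wrapper around the stream-based fileMap. The name fileMapPath and its parameter names are assumptions made for this example, and the sketch does no exception handling, so a failure while opening or mapping would leave a stream open:

```sml
(* Stream-based fileMap: map f over every line of inFile into outFile. *)
fun fileMap f inFile outFile =
  case TextIO.inputLine inFile of
      NONE => ()
    | SOME s => ( TextIO.output (outFile, String.map f s)
                ; fileMap f inFile outFile )

(* Hypothetical path-based variant: opens both files, delegates to
   fileMap, then closes both streams. *)
fun fileMapPath f inPath outPath =
  let
    val inFile = TextIO.openIn inPath
    val outFile = TextIO.openOut outPath
  in
    fileMap f inFile outFile;
    TextIO.closeIn inFile;
    TextIO.closeOut outFile
  end
```

For example, fileMapPath Char.toUpper "in.txt" "out.txt" would write an upper-cased copy of in.txt to out.txt (both file names are placeholders).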
I made an auxiliary function, go, inside fileMap to deal with the recursion. In this case, I could just as well have done without and let fileMap call itself directly:
fun fileMap f inFile outFile =
    case TextIO.inputLine inFile of
        NONE   => ()
      | SOME s => ( TextIO.output (outFile, String.map f s)
                  ; fileMap f inFile outFile )
since fileMap doesn't accumulate any state in additional arguments. But it is often the case that recursive functions need extra arguments to keep their state, while at the same time, I don't want to pollute the function's type signature (like with your echoFile's c). This is a major use-case for auxiliary functions like go.
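As a small illustration of state kept out of the public signature, consider a hypothetical countLines function (not part of the question or of fileMap). The running count lives only in the inner helper's argument, so callers see just instream → int:

```sml
(* Hypothetical example: count the lines available on an instream.
   The accumulator n is an argument of the local helper go only. *)
fun countLines inFile =
  let
    fun go n =
      case TextIO.inputLine inFile of
          NONE => n
        | SOME _ => go (n + 1)
  in
    go 0
  end
```

TextIO.openString makes it easy to try this without a file on disk: countLines (TextIO.openString "a\nb\n") evaluates to 2.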
And instead of case-of on List.find, I could've used various library functions for dealing with NONE/SOME found in Option:
local
    val getOpt = Option.getOpt
    val mapOpt = Option.map
    val find = List.find
in
    fun fileSubst pairs =
        fileMap (fn c => getOpt (mapOpt #2 (find (fn (x,y) => x = c) pairs), c))
end
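To see how the Option combinators behave in isolation, the lookup can be pulled out into a pure function and tried on a string. The name subst is a hypothetical helper for this sketch; the logic is the same find, then map #2, then getOpt with the original character as the default:

```sml
(* Hypothetical pure version of the lookup used in fileSubst:
   return the replacement for c if a pair (c, y) exists, else c itself. *)
fun subst pairs c =
  Option.getOpt (Option.map #2 (List.find (fn (x, _) => x = c) pairs), c)

(* Applied to a string with String.map, the in-memory analogue of fileMap. *)
val replaced = String.map (subst [(#"a", #"b")]) "banana"
```

Here replaced is "bbnbnb": every #"a" is found in the pair list and replaced, while the other characters fall through to the getOpt default.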