I'm not sure how to word this question. Say I'm trying to pass the paths of tmpfiles around, and I want to capture the idea that there are different formats of tmpfile, and each function only works on one of them. This works:
data FileFormat
= Spreadsheet
| Picture
| Video
deriving Show
data TmpFile = TmpFile FileFormat FilePath
deriving Show
videoPath :: TmpFile -> FilePath
videoPath (TmpFile Video p) = p
videoPath _ = error "only works on videos!"
But there must be a better way to write it without runtime errors right? I thought of two alternatives, this:
type TmpSpreadsheet = TmpFile Spreadsheet
type TmpPicture = TmpFile Picture
type TmpVideo = TmpFile Video
videoPath :: TmpVideo -> FilePath
Or this:
data TmpFile a = TmpFile a FilePath
deriving Show
videoPath :: TmpFile Video -> FilePath
But obviously they don't compile. What's the proper way to do it? Some other ideas, none particularly appealing:
TmpFile
in the format instead of the other way around, so the values are Video (TmpFile "test.avi")
etc.VideoTmpFile
, PictureTmpFile
etc.TmpFile
typeclassI also considered learning the -XDataKinds
extension, but suspect I'm missing something much simpler that can be done without it.
EDIT: I'm learning a lot today! I tried both the approaches outlined below (DataKinds
and phantom types, which have dummy value constructors that can be removed with another extension), and they both work! Then I tried to go a little further. They both let you make a nested type TmpFile (ListOf a)
in addition to regular TmpFile a
, which is cool. But I've tentatively decided to go with plain phantom types (intact value constructors), because you can pattern match on them. For example, I was surprised that this actually works:
data Spreadsheet = Spreadsheet deriving Show
data Picture = Picture deriving Show
data Video = Video deriving Show
data ListOf a = ListOf a deriving Show
data TmpFile a = TmpFile a FilePath
deriving Show
videoPath :: TmpFile Video -> FilePath
videoPath (TmpFile Video p) = p
-- read a file that contains a list of filenames of type a,
-- and return them as individual typed tmpfiles
listFiles :: TmpFile (ListOf a) -> IO [TmpFile a]
listFiles (TmpFile (ListOf fmt) path) = do
txt <- readFile path
let paths = map (TmpFile fmt) (lines txt)
return paths
vidPath :: TmpFile Video
vidPath = TmpFile Video "video1.txt"
-- $ cat videos.txt
-- video1.avi
-- video2.avi
vidsList :: TmpFile (ListOf Video)
vidsList = TmpFile (ListOf Video) "videos.txt"
main :: IO [FilePath]
main = do
paths <- listFiles vidsList -- [TmpFile Video "video1.avi",TmpFile Video "video2.avi"]
return $ map videoPath paths -- ["video1.avi","video2.avi"]
As far as I can tell, the equivalent with DataKinds
is very similar, but can't access fmt
as a value:
{-# LANGUAGE DataKinds, KindSignatures #-}
data FileFormat
= Spreadsheet
| Picture
| Video
| ListOf FileFormat
deriving Show
data TmpFile (a :: FileFormat) = TmpFile FilePath
deriving Show
vidPath :: TmpFile Video
vidPath = TmpFile "video.avi"
vidsList :: TmpFile (ListOf Video)
vidsList = TmpFile "videos.txt"
videoPath :: TmpFile Video -> FilePath
videoPath (TmpFile p) = p
listFiles :: TmpFile (ListOf a) -> IO [TmpFile a]
listFiles (TmpFile path) = do
txt <- readFile path
let paths = map TmpFile (lines txt)
return paths
main :: IO [FilePath]
main = do
paths <- listFiles vidsList
return $ map videoPath paths
(It may seem like a weird thing to want, but my actual program is going to be an interpreter for a small language that compiles to Shake rules with a tmpfile corresponding to each variable, so typed lists of tmpfiles will be useful)
Does that seem right? I like the idea of DataKinds
better, so I would go with it instead if I could inspect them as values, or if it turns out that's never needed.
You're right: with -XDataKinds
, the TmpFile Video -> FilePath
approach would work. And indeed I think this may be a good application for that extension.
{-# LANGUAGE DataKinds #-}
data TmpFile (a :: FileFormat) = TmpFile FilePath
deriving Show
videoPath :: TmpFile Video -> FilePath
The reason you need this extension to write TmpFile Video
is that the constructors of FileFormat
are ab initio value-level (thus only exist at runtime), while TmpFile
is type-level / compile-time.
Of course there's another way to generate type-level entities: define types!
data Spreadsheet = Spreadsheet
data Picture = Picture
data Video = Video
data TmpFile a = TmpFile a FilePath
deriving Show
videoPath :: TmpFile Video -> FilePath
Such types are called phantom types. But really, they're a bit of a hack to work around the former lack of proper type-level values, which DataKinds has now given us. So, unless you need compatibility with old compilers, do use DataKinds!
An alternative would be to not enforce the file type at compile time, but simply make it explicit that the functions are partial.
data TmpFile = TmpFile FileFormat FilePath
deriving Show
videoPath :: TmpFile -> Maybe FilePath
videoPath (TmpFile Video p) = p
videoPath _ = Nothing
In fact, that approach might well be the more rational one, depending on what you're planning to do.