Search code examples
haskelltypesfunctional-programmingdata-kindskotlin-sealed

Constraint on function type using data kind


I'm confused about data kinds. Suppose we have

{-# LANGUAGE DataKinds #-}
...
data Format
  = Photo
      { bytes :: Int
      }
  | Video
      { bytes       :: Int
      , durationSec :: Int
      }

I want to make then function with promouted type:

createVideo :: Int -> Int -> 'Video
createVideo _ _ = error "Not implemented"

Compiller this will ask us for parameters, and with them it give msg 'Video Int Int has kind `Format'. i would like this compile-time behavior similar to kotlin:

sealed class Format {

  abstract val bytes: Int
  data class Photo(override val bytes: Int) : Format()
  data class Video(override val bytes: Int, val durationSec: Int) : Format()
}

private fun createVideo(bytes: Int, durationSec: Int) : Format.Video {
  return Format.Video(bytes, durationSec)
}

fun main() {
  val video: Format = createVideo(bytes = 0, durationSec = 0) // correct
  val video2: Format.Video = createVideo(bytes = 0, durationSec = 0) // correct
  val video3: Format.Photo = createVideo(bytes = 0, durationSec = 0) // compiler error 
}

In repl https://pl.kotl.in/2G9E1Cbgs


Solution

  • Data kinds do not provide a direct mechanism to check the value of a data structure at compile time. In other words, if you have a data type:

    data Format
      = Photo
          { bytes :: Int
          }
      | Video
          { bytes       :: Int
          , durationSec :: Int
          }
    

    that is suitable for representing photos and videos, using the DataKinds extension to promote it to the type level does not allow you to write functions that accept only 10-second videos:

    processTenSecondVideo :: Video bytes 10 -> IO ()    -- does not work
    

    or even generate a Format value that is guaranteed to be a video:

    createVideo1 :: Int -> Int -> Format                 -- works
    createVideo2 :: Int -> Int -> Video bytes duration   -- does not work
    

    From a technical standpoint, the new types Photo and Video that are promoted by the DataKinds extension are types of a kind (specifically, the new kind Format) that does not support values. So, while values of type Int exist (because Int is of kind *, the kind of types that have values), no values of type Photo 0 or Video 128000 10 exist. So, you can't use these types as return types (or argument types) of functions.

    So, what use are data kinds? Well, if you have a data structure that you want to somehow constrain at compile time, you don't promote that data structure. Instead you promote other data structures to use as tools to write a type-level program to constrain the target data structure.

    In your example, the constraints you want to put on your data structure are really quite modest: you want to check if the data structure is or isn't a Video. You don't need data kinds to do this. Regular Haskell types are adequate if you just rearrange your data structure a bit. Separating the formats into two types:

    data Photo = Photo {bytes :: Int}
    data Video = Video {bytes :: Int, durationSec :: Int}
    

    is enough to distinguish videos from non-videos at compile time. If you have parts of your program where you want to work with a value that can be either a video or a photo, you can introduce a sum type:

    data Format = PhotoF Photo | VideoF Video
    

    Data kinds become useful when the constraints become more complicated. For example, suppose you wanted to censor photos videos to make them safe for the whole family:

    censorPhoto :: Photo -> Photo
    censorVideo :: Video -> Video
    

    and allow users to generate screenshots:

    screenShot :: Video -> Photo
    

    You might like, at compile time, to ensure you don't accidentally censor a video twice, or show an uncensored video to a younger audience, or let someone bypass your censoring by taking screenshots of uncensored videos and passing them off as censored photos.

    You could accomplish this by introducing more types:

    data UncensoredPhoto = UncensoredPhoto {bytes :: Int}
    data UncensoredVideo = UncensoredVideo {bytes :: Int, durationSec :: Int}
    data UncensoredFormat = UncensoredPhotoF UncensoredPhoto | UncensoredVideoF UncensoredVideo
    data CensoredPhoto = CensoredPhoto {bytes :: Int}
    data CensoredVideo = CensoredVideo {bytes :: Int, durationSec :: Int}
    data CensoredFormat = CensoredPhotoF CensoredPhoto | CensoredVideoF CensoredVideo
    data AnyPhoto = UncensoredPhotoA UncensoredPhoto | CensoredPhotoA CensoredPhoto
    data AnyVideo = UncensoredVideoA UncensoredVideo | CensoredVideoA CensoredVideo
    data AnyFormat = AnyPhotoF AnyPhoto | AnyVideoF AnyVideo
    

    so you can write things like:

    censorFormat :: UncensoredFormat -> CensoredFormat
    censoredScreenshot :: CensoredVideo -> CensoredPhoto
    uncensoredScreenshot :: UncensoredVideo -> UncensoredPhoto
    showAdult :: AnyFormat -> IO ()
    showChild :: CensoredFormat -> IO ()
    

    That's pretty messy, though. And suppose you want to add some constraints on video length to prevent spammers from submitting lots of short videos, or to avoid tying up your servers censoring really long videos. How many types along the line of ShortUncensoredVideo do you want to define?

    In a situation like this, you can use data kinds to develop a type-level "language" for describing attributes of your data structures:

     {-# LANGUAGE DataKinds #-}
     {-# LANGUAGE DuplicateRecordFields #-}
     {-# LANGUAGE GADTs #-}
     {-# LANGUAGE KindSignatures #-}
     {-# LANGUAGE TypeFamilies #-}
    
    -- these types/constructors will be promoted...
    data Censoring = Censored | Uncensored
    data Length = Short | Medium | Long
    
    -- ...and used to tag our actual data structures
    data Photo (c :: Censoring) = Photo { bytes :: Int }
    data Video (c :: Censoring) (l :: Length) = Video {bytes :: Int, durationSec :: Int}
    data Format (c :: Censoring) where
      PhotoF :: Photo c -> Format c
      VideoF :: Video c l -> Format c
    

    Now we can write things like:

    -- preserve censoring at compile time
    screenShot :: Video c l -> Photo c
    screenShot (Video b _) = Photo (b `div` 10)
    
    -- only censor uncensored videos that aren't long
    type family NotLong l where
        NotLong Long = False
        NotLong l = True
    censorVideo :: (NotLong l ~ True) => Video Uncensored l -> Video Censored l
    censorVideo (Video b l) = Video (b `div` 2) (l `div` 2)
    
    -- show any format to adults
    showAdult :: Format c -> IO ()
    showAdult fmt = print fmt
    
    -- only censored content for kids
    showChild :: Format Censored -> IO ()
    showChild fmt = print fmt
    

    and catch problems at compile time:

    main = do
      -- we can show a screenshot from a censored version of an uncensored short video to a child
      showChild $ PhotoF . screenShot . censorVideo $ (Video 128000 1 :: Video 'Uncensored 'Short)
    
      -- but the following are compilation errors
      --   can't censor an already censored video
      showAdult $ VideoF . censorVideo $ (Video 128000 1 :: Video 'Censored 'Short)
      --   can't censor a long video
      showAdult $ VideoF . censorVideo $ (Video 12800000 100 :: Video 'Uncensored 'Long)
      --   can't show a child an uncensored screenshot
      showChild $ PhotoF . screenShot $ (Video 128000 1 :: Video 'Uncensored 'Short)
    

    Note that the promoted types (Censored, Uncensored, Short, Medium, Long) and kinds (Censoring and Length) are not directly related to the unpromoted types Photo, Video, and Format that they describe. As I say, this is how data kinds are typically used.

    Full code example:

    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE DuplicateRecordFields #-}
    {-# LANGUAGE GADTs #-}
    {-# LANGUAGE KindSignatures #-}
    {-# LANGUAGE TypeFamilies #-}
    
    -- these types/constructors will be promoted...
    data Censoring = Censored | Uncensored
    data Length = Short | Medium | Long
    
    -- ...and used to tag our actual data structures
    data Photo (c :: Censoring) = Photo { bytes :: Int }
    data Video (c :: Censoring) (l :: Length) = Video {bytes :: Int, durationSec :: Int}
    data Format (c :: Censoring) where
      PhotoF :: Photo c -> Format c
      VideoF :: Video c l -> Format c
    instance Show (Format c) where show _ = "<Format>"
    
    -- preserve censoring at compile time
    screenShot :: Video c l -> Photo c
    screenShot (Video b _) = Photo (b `div` 10)
    
    -- only censor uncensored videos that aren't long
    type family NotLong l where
        NotLong Long = False
        NotLong l = True
    censorVideo :: (NotLong l ~ True) => Video Uncensored l -> Video Censored l
    censorVideo (Video b l) = Video (b `div` 2) (l `div` 2)
    
    -- show any format to adults
    showAdult :: Format c -> IO ()
    showAdult fmt = print fmt
    
    -- only censored content for kids
    showChild :: Format Censored -> IO ()
    showChild fmt = print fmt
    
    main = do
      -- we can show a screenshot from a censored version of an uncensored short video to a child
      showChild $ PhotoF . screenShot . censorVideo $ (Video 128000 1 :: Video 'Uncensored 'Short)
    
      -- but the following are compilation errors
      --   can't censor an already censored video
      showAdult $ VideoF . censorVideo $ (Video 128000 1 :: Video 'Censored 'Short)
      --   can't censor a long video
      showAdult $ VideoF . censorVideo $ (Video 12800000 100 :: Video 'Uncensored 'Long)
      --   can't show a child an uncensored screenshot
      showChild $ PhotoF . screenShot $ (Video 128000 1 :: Video 'Uncensored 'Short)