
Resolving "No instance for ..." with conduit's mono-traversable


Given the following code:

import           Data.Attoparsec.Text
import qualified Conduit as C
import qualified Data.Conduit.Combinators as CC

f :: FilePath -> FilePath -> IO ()
f infile outfile =
  runResourceT $
    CC.sourceFile infile $$ C.encodeUtf8C =$= x

where x's type is ConduitM Text Void (ResourceT IO) ()

The following compile-time error occurs in my private GitHub repo:

• No instance for (mono-traversable-1.0.2:Data.Sequences.Utf8
                     ByteString Text)
    arising from a use of ‘C.encodeUtf8C’
• In the first argument of ‘(=$=)’, namely ‘C.encodeUtf8C’
  In the second argument of ‘($$)’, namely ‘C.encodeUtf8C =$= x’
  In the second argument of ‘($)’, namely
    ‘CC.sourceFile infile $$ C.encodeUtf8C =$= x’

How can I resolve this compile-time error?

EDIT

My understanding of the types:

> :t sourceFile
sourceFile
  :: MonadResource m =>
     FilePath
     -> ConduitM
          i bytestring-0.10.8.1:Data.ByteString.Internal.ByteString m ()

> :t ($$)
($$) :: Monad m => Source m a -> Sink a m b -> m b

> :i Conduit
type Conduit i (m :: * -> *) o = ConduitM i o m ()

> :i Source
type Source (m :: * -> *) o = ConduitM () o m ()

> :i Sink
type Sink i = ConduitM i Data.Void.Void :: (* -> *) -> * -> *

> :t (=$=)
(=$=)
  :: Monad m => Conduit a m b -> ConduitM b c m r -> ConduitM a c m r

C.encodeUtf8C =$= x boils down to, I think:

(mono-traversable-1.0.2:Data.Sequences.Utf8 text binary,
      Monad m) =>
     Conduit text m binary

=$=

ConduitM Text Void (ResourceT IO) ()

yielding a return type of

ConduitM text Void (ResourceT IO) ()

And I suppose that this type, i.e. the type of C.encodeUtf8C =$= x, does not unify with the second argument that ($$) expects after CC.sourceFile infile?


Solution

  • The sourceFile conduit produces ByteString chunks, which you need to decode into Text for x to consume. Encoding goes in the opposite direction: it serializes Text to ByteString, for example before writing it to a file.

    Use decodeUtf8 (exported from the Conduit module as decodeUtf8C) in place of encodeUtf8C.
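
As a minimal sketch of that fix, here is the question's pipeline with the decoder swapped in. The body of x is a hypothetical stand-in (the question only gives its type), and the unqualified import line is an assumption about what the original file brings into scope:

```haskell
import           Conduit (ConduitM, ResourceT, runResourceT, ($$), (=$=))
import qualified Conduit as C
import           Control.Monad.IO.Class (liftIO)
import qualified Data.Conduit.Combinators as CC
import           Data.Text (Text)
import qualified Data.Text.IO as TIO
import           Data.Void (Void)

-- Hypothetical stand-in for the question's `x`: it only has to consume Text.
-- Here it prints every decoded chunk to stdout.
x :: ConduitM Text Void (ResourceT IO) ()
x = C.mapM_C (liftIO . TIO.putStr)

-- Same shape as the question's `f`; `_outfile` stays unused, as in the question.
f :: FilePath -> FilePath -> IO ()
f infile _outfile =
  runResourceT $
    -- ByteString chunks from the file are decoded to Text before reaching x.
    CC.sourceFile infile $$ C.decodeUtf8C =$= x
```

This keeps the ($$) and (=$=) operators used in the question. Newer conduit releases deprecate them in favour of (.|) with runConduit or runConduitRes, so the same pipeline could also be written as runConduitRes (CC.sourceFile infile .| C.decodeUtf8C .| x).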


    Why the types don't match

    -- Ignoring the `Monad` constraint.
    
    (=$=)      :: Conduit a m b -> ConduitM b c m r -> ConduitM a c m r
    
    encodeUtf8 :: Utf8 text binary
               => Conduit text m binary
    
    x          :: ConduitM Text Void m ()
    

    To apply (=$=) to encodeUtf8, you must unify Conduit a m b and Conduit text m binary, so we get the following type equalities:

    a ~ text
    b ~ binary
    

    Then we apply the result to x, unifying ConduitM b c m r and ConduitM Text Void m ():

    b ~ Text
    c ~ Void
    

    Up to here the compiler doesn't complain, but we can already see a mismatch because of the two equalities involving b:

    b ~ binary
    b ~ Text
    

    In the conduit-combinators library, the type variable binary is used to refer to types that represent raw binary data, typically ByteString, as opposed to more structured data like Text.

    If we continue, the result has type ConduitM a c m r, and that is being passed as the second argument of ($$).

    -- Expanding Source and Sink definitions, renaming type variables.
    
    ($$) :: Monad m => ConduitM () d m () -> ConduitM d Void m e -> m e
    
    sourceFile infile
         :: _ => ConduitM i ByteString m ()
    

    Using sourceFile infile as the first argument, we unify ConduitM () d m () with ConduitM i ByteString m ().

    i ~ ()
    d ~ ByteString
    

    And with our previous encodeUtf8C =$= x as the second argument of ($$), we unify ConduitM d Void m e with ConduitM a c m r.

    a ~ d
    c ~ Void
    r ~ e
    

    Focusing on a and d, we have the following:

    a ~ text
    a ~ d
    d ~ ByteString
    

    Therefore text ~ ByteString and binary ~ Text. Now remember that encodeUtf8 requires a Utf8 text binary constraint, which here becomes Utf8 ByteString Text. That is the wrong way around: the instance that exists is Utf8 Text ByteString, hence the "No instance" error.
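
The direction of the Utf8 class can also be checked outside of any conduit. The sketch below assumes the Data.Sequences module from mono-traversable, where the first class parameter is the textual type and the second is the binary one; the names toBytes, fromBytes, example and roundTrips are made up for illustration:

```haskell
import           Data.ByteString (ByteString)
import           Data.Sequences  (decodeUtf8, encodeUtf8)
import           Data.Text       (Text)
import qualified Data.Text       as T

-- Text -> ByteString needs `Utf8 Text ByteString`, which mono-traversable provides.
toBytes :: Text -> ByteString
toBytes = encodeUtf8

-- ByteString -> Text uses the same instance, through the class's other method.
fromBytes :: ByteString -> Text
fromBytes = decodeUtf8

-- Uncommenting this should reproduce the question's error, because it asks for
-- the reversed constraint `Utf8 ByteString Text`:
-- wrongWay :: ByteString -> Text
-- wrongWay = encodeUtf8

example :: Text
example = T.pack "café"

-- Encoding and then decoding valid text gives the original text back.
roundTrips :: Text -> Bool
roundTrips t = fromBytes (toBytes t) == t
```

The same reasoning applies to the conduit versions: encodeUtf8C must sit where Text flows in and ByteString flows out, and decodeUtf8C where the flow is reversed.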