Given the following code:
import Data.Attoparsec.Text
import qualified Conduit as C
import qualified Data.Conduit.Combinators as CC
f :: FilePath -> FilePath -> IO ()
f infile outfile =
runResourceT $
CC.sourceFile infile $$ C.encodeUtf8C =$= x
where x's type is ConduitM Text Void (ResourceT IO) ()
The following compile-time error occurs in my private GitHub repo:
• No instance for (mono-traversable-1.0.2:Data.Sequences.Utf8
ByteString Text)
arising from a use of ‘C.encodeUtf8C’
• In the first argument of ‘(=$=)’, namely ‘C.encodeUtf8C’
In the second argument of ‘($$)’, namely ‘C.encodeUtf8C =$= x’
In the second argument of ‘($)’, namely
‘CC.sourceFile infile $$ C.encodeUtf8C =$= x’
How can I resolve this compile-time error?
EDIT
My understanding of the types:
> :t sourceFile
sourceFile
:: MonadResource m =>
FilePath
-> ConduitM
i bytestring-0.10.8.1:Data.ByteString.Internal.ByteString m ()
> :t ($$)
($$) :: Monad m => Source m a -> Sink a m b -> m b
> :t Conduit
type Conduit i (m :: * -> *) o = ConduitM i o m ()
> :i Source
type Source (m :: * -> *) o = ConduitM () o m ()
> :i Sink
type Sink i = ConduitM i Data.Void.Void :: (* -> *) -> * -> *
> :t (=$=)
(=$=)
:: Monad m => Conduit a m b -> ConduitM b c m r -> ConduitM a c m r
C.encodeUtf8C =$= x boils down to, I think:
(mono-traversable-1.0.2:Data.Sequences.Utf8 text binary,
Monad m) =>
Conduit text m binary ()
=$=
ConduitM Text Void binary ()
yielding a return type of
ConduitM text Void (ResourceT IO) ()
And I suppose that this type, i.e. the type of C.encodeUtf8C =$= x, does not unify with the second argument that ($$) expects after CC.sourceFile?
The sourceFile conduit produces ByteString, which you need to decode into Text for x to consume. Encoding is the opposite direction: you serialize Text to ByteString so that it can be written to a file.
Use decodeUtf8 (exported as decodeUtf8C from the Conduit module).
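As a minimal sketch of that fix, the pipeline from the question could look as follows. The question does not show x, so the x below is a hypothetical stand-in that re-encodes the Text and writes it to outfile; only the switch from encodeUtf8C to decodeUtf8C after sourceFile is the actual change. The operators are the ones used in the question; on newer conduit versions they are deprecated in favour of .| and runConduitRes.

```haskell
import Conduit (ConduitM, ResourceT, decodeUtf8C, encodeUtf8C, runResourceT, ($$), (=$=))
import qualified Data.Conduit.Combinators as CC
import Data.Text (Text)
import Data.Void (Void)

-- Hypothetical stand-in for the question's x (not shown in the question):
-- consumes Text, encodes it back to UTF-8 bytes and writes them to a file.
x :: FilePath -> ConduitM Text Void (ResourceT IO) ()
x outfile = encodeUtf8C =$= CC.sinkFile outfile

f :: FilePath -> FilePath -> IO ()
f infile outfile =
  runResourceT $
    -- sourceFile yields ByteString, so decode (not encode) before x.
    CC.sourceFile infile $$ decodeUtf8C =$= x outfile
```

Here encodeUtf8C is used at Text -> ByteString, which is the direction the Utf8 instance supports, while decodeUtf8C handles the ByteString -> Text step that x needs.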
-- Ignoring the `Monad` constraint.
(=$=) :: Conduit a m b -> ConduitM b c m r -> ConduitM a c m r
encodeUtf8 :: Utf8 text binary
=> Conduit text m binary
x :: ConduitM Text Void m ()
To apply (=$=) to encodeUtf8, you must unify Conduit a m b and Conduit text m binary, so we get the following type equalities:
a ~ text
b ~ binary
Then we apply the result to x, unifying ConduitM b c m r and ConduitM Text Void m ():
b ~ Text
c ~ Void
Up to here the compiler doesn't complain, but we can already see a mismatch because of the two equalities involving b:
b ~ binary
b ~ Text
In the conduit-combinators library, the type variable binary is used to refer to types that represent raw binary data, typically ByteString, as opposed to more structured data like Text.
If we continue, the result has type ConduitM a c m r, and that is being passed as the second argument of ($$).
-- Expanding Source and Sink definitions, renaming type variables.
($$) :: Monad m => ConduitM () d m () -> ConduitM d Void m e -> m e
sourceFile infile
:: _ => ConduitM i ByteString m ()
Using sourceFile infile as the first argument, we unify ConduitM () d m () with ConduitM i ByteString m ().
i ~ ()
d ~ ByteString
And with our previous encodeUtf8C =$= x as the second argument of ($$), we unify ConduitM d Void m e with ConduitM a c m r.
a ~ d
c ~ Void
r ~ e
Focusing on a and d, we have the following:
a ~ text
a ~ d
d ~ ByteString
Therefore text ~ ByteString and binary ~ Text. Now remember that to use encodeUtf8, we require a Utf8 text binary constraint, i.e., Utf8 ByteString Text, which is the wrong way around: the instance that exists is Utf8 Text ByteString.
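The direction of the instance can also be seen outside of conduit, using the Utf8 class from mono-traversable's Data.Sequences directly. This is only a sketch; the names toBytes, fromBytes and example are made up for illustration.

```haskell
import Data.ByteString (ByteString)
import Data.Sequences (Utf8 (..))
import Data.Text (Text)
import qualified Data.Text as T

-- Needs Utf8 Text ByteString, which exists.
toBytes :: Text -> ByteString
toBytes = encodeUtf8

-- Uses the same instance, in the other direction.
fromBytes :: ByteString -> Text
fromBytes = decodeUtf8

-- Uncommenting this reproduces the error from the question, because it
-- would need an instance Utf8 ByteString Text:
-- wrongWay :: ByteString -> Text
-- wrongWay = encodeUtf8

-- Hypothetical usage: encode, then decode again.
example :: Text
example = fromBytes (toBytes (T.pack "hello"))
```

Because the class fixes which type is the textual one and which is the binary one, swapping them never finds an instance, regardless of how the conduits are composed.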