How to parse Yahoo historical CSV with Attoparsec


I am a Haskell beginner. How can I parse this file with attoparsec into separate arrays (an open array, a high array, and so on)?

module CsvParser (
      Quote (..)
    , csvFile
    , quote
    ) where
import System.IO
import Data.Attoparsec.Text
import Data.Attoparsec.Combinator
import Data.Text (Text, unpack)
import Data.Time
import System.Locale
import Data.Maybe

data Quote = Quote {
        qTime       :: LocalTime,
        qAsk        :: Double,
        qBid        :: Double,
        qAskVolume  :: Double,
        qBidVolume  :: Double
    } deriving (Show, Eq)

csvFile :: Parser [Quote]
csvFile = do
    q <- many1 quote
    endOfInput
    return q

quote   :: Parser Quote
quote   = do
    time        <- qtime
    qcomma
    ask         <- double
    qcomma
    bid         <- double
    qcomma
    askVolume   <- double
    qcomma
    bidVolume   <- double
    endOfLine
    return $ Quote time ask bid askVolume bidVolume 

qcomma  :: Parser ()
qcomma  = do 
    char ','
    return ()

qtime   :: Parser LocalTime
qtime   = do
    tstring     <- takeTill (\x -> x == ',')
    let time    = parseTime defaultTimeLocale "%d.%m.%Y %H:%M:%S%Q" (unpack tstring)
    return $ fromMaybe (LocalTime (fromGregorian 0001 01 01) (TimeOfDay 00 00 00 )) time

--testString :: Text
--testString = "01.10.2012 00:00:00.741,1.28082,1.28077,1500000.00,1500000.00\n" 

quoteParser = parseOnly quote

main = do  
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode  
    contents <- hGetContents handle  
    let allLines = lines contents
    map (\line -> quoteParser line) allLines
    --putStr contents  
    hClose handle

Error message:

testhaskell.hs:89:5:
    Couldn't match type `[]' with `IO'
    Expected type: IO (Either String Quote)
      Actual type: [Either String Quote]
    In the return type of a call of `map'
    In a stmt of a 'do' block:
      map (\ line -> quoteParser line) allLines
    In the expression:
      do { handle <- openFile
                       "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode;

           contents <- hGetContents handle;
           let allLines = lines contents;
           map (\ line -> quoteParser line) allLines;
           .... }

testhaskell.hs:89:37:
    Couldn't match type `[Char]' with `Text'
    Expected type: [Text]
      Actual type: [String]
    In the second argument of `map', namely `allLines'
    In a stmt of a 'do' block:
      map (\ line -> quoteParser line) allLines
    In the expression:
      do { handle <- openFile
                       "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode;

           contents <- hGetContents handle;
           let allLines = lines contents;
           map (\ line -> quoteParser line) allLines;
           .... }

Solution
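
The two compiler errors can be read as follows: `map` used as a statement in an `IO` `do` block yields a plain list rather than an `IO` action, and `lines` applied to a `String` gives `[String]`, while `parseOnly` from `Data.Attoparsec.Text` expects `Text`. Below is a minimal sketch of one way around both, assuming the file is read with `Data.Text.IO` instead of `hGetContents`. The names `pair`, `parseLines` and `printFile` are hypothetical, and `pair` is only a small stand-in for the question's `quote` parser.

```haskell
import Data.Attoparsec.Text (Parser, parseOnly, double, char)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical stand-in for the question's `quote` parser:
-- two doubles separated by a comma.
pair :: Parser (Double, Double)
pair = (,) <$> double <* char ',' <*> double

-- Pure step: split the Text into lines and parse each one.
-- Every element is either Left <error message> or Right <parsed value>.
parseLines :: T.Text -> [Either String (Double, Double)]
parseLines = map (parseOnly pair) . T.lines

-- IO step: read the file as Text (not String) and print each result.
-- mapM_ turns the list of results into a single IO action, which is
-- what a statement in an IO do block has to be.
printFile :: FilePath -> IO ()
printFile path = do
    contents <- TIO.readFile path
    mapM_ print (parseLines contents)
```

With the question's own parser, `parseOnly quote` would take the place of `parseOnly pair`. Note that `quote` ends with `endOfLine`, which `T.lines` has already stripped, so that step would have to be dropped or made optional.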

  • You can use the attoparsec-csv package, or look at its source code to get an idea of how to write the parser yourself.

    The code would look something like this:

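    import qualified Data.Text    as T   -- provides T.Text
    import qualified Data.Text.IO as T   -- provides T.readFile
    import Text.ParseCSV
    import CsvParser (Quote)             -- assumes the question's module is in scope
    
    main :: IO ()
    main = do
      txt <- T.readFile "file.csv"
      -- parseCSV returns Left with an error message or Right with the rows
      case parseCSV txt of
        Left  err -> error err
        Right csv -> mapM_ (print . mkQuote) csv
    
    -- Convert one CSV row (a list of fields) into a Quote.
    mkQuote :: [T.Text] -> Quote
    mkQuote = error "Not implemented yet"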
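
The `mkQuote` stub above could be filled in roughly as follows. This is only a sketch: the `Bar` record and the helpers `field`, `mkBar`, `parseBars`, `opens` and `highs` are hypothetical names, and the column order `Date,Open,High,Low,Close,...` is an assumption about the Yahoo file that should be checked against its header line. It returns `Either` instead of calling `error`, so malformed rows (including the header) can simply be skipped.

```haskell
import qualified Data.Text as T
import Data.Text.Read (double)
import Data.Either (rights)

-- Hypothetical record for one row; the field order is an assumption.
data Bar = Bar
    { bDate  :: T.Text
    , bOpen  :: Double
    , bHigh  :: Double
    , bLow   :: Double
    , bClose :: Double
    } deriving (Show, Eq)

-- Read one numeric field, requiring that the whole field is consumed.
field :: T.Text -> Either String Double
field t = case double (T.strip t) of
    Right (x, rest) | T.null rest -> Right x
    Right _                       -> Left ("trailing input in " ++ show t)
    Left err                      -> Left err

-- Convert one CSV row into a Bar; extra columns are ignored.
mkBar :: [T.Text] -> Either String Bar
mkBar (d : o : h : l : c : _) =
    Bar d <$> field o <*> field h <*> field l <*> field c
mkBar row = Left ("too few fields: " ++ show row)

-- Keep only the rows that converted; the header row fails in `field`
-- and is dropped here.
parseBars :: [[T.Text]] -> [Bar]
parseBars = rights . map mkBar

-- The "open array" and "high array" asked for in the question.
opens, highs :: [Bar] -> [Double]
opens = map bOpen
highs = map bHigh
```

In the `Right csv` branch of `main`, something like `print (opens (parseBars csv))` would then print the list of opening prices.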