Search code examples
parsinghaskellagdadependent-type

Agda: parsing nested lists


I am trying to parse nested lists in Agda. I searched on google and the closest I have found is parsing addressed in Haskell, but usually libraries like "parsec" are used that are not available in Agda.

So I would like to parse "((1,2,3),(4,5,6))" with a result type of (List (List Nat)).

And further nested lists should be supported (up to depth 5), e.g., depth 3 would be (List (List (List Nat))).

My code is very long and cumbersome, and it only works for (List (List Nat)) but not for further nested lists. I didn't make any progress on my own.

If helpful, I would like to reuse splitBy from the first answer of one of my older posts.

NesList : ℕ → Set
NesList 0 = ℕ -- this case is easy
NesList 1 = List ℕ -- this case is easy
NesList 2 = List (List ℕ) 
NesList 3 = List (List (List ℕ))
NesList 4 = List (List (List (List ℕ)))
NesList 5 = List (List (List (List (List ℕ)))) -- I am only interested to list depth 5
NesList _ = ℕ -- this is a hack, but I think okay for now


-- My implementation is *not* shown here
--
--
-- (it's about 80 lines long and uses 3 different functions
parseList2 : List Char → Maybe (List (List ℕ))
parseList2 _ = nothing -- dummy result


parseList : (dept : ℕ) → String → Maybe (NesList dept)
parseList 2 s = parseList2 (toList s)
parseList _ _ = nothing



-- Test Cases that are working (in my version)

p1 : parseList 2 "((1,2,3),(4,5,6))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ [])
p1 = refl


p2 : parseList 2 "((1,2,3),(4,5,6),(7,8,9,10))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ (7 ∷ 8 ∷ 9 ∷ 10 ∷ []) ∷ [])
p2 = refl

p3 : parseList 2 "((1),(2))" ≡ just ((1 ∷ []) ∷ (2 ∷ []) ∷ [])
p3 = refl

p4 : parseList 2 "((1,2))" ≡ just ((1 ∷ 2 ∷ []) ∷ [])
p4 = refl

-- Test Cases that are not working 
-- i.e., List (List (List Nat))

lp5 : parseList 3 "(((1,2),(3,4)),((5,6),(7,8)))" ≡ just (  ((1 ∷ 2 ∷ []) ∷ (3 ∷ 4 ∷ []) ∷ []) ∷ ((5 ∷ 6 ∷ []) ∷ (7 ∷ 8 ∷ []) ∷ []) ∷ [])
lp5 = refl

EDIT1 **

Connor's talk at ICFP is online -- the title is "Agda-curious?".
It is from two days ago. Check it out!!
.
See the video:
http://www.youtube.com/watch?v=XGyJ519RY6Y

--
EDIT2:
I found a link that seems to be almost the code I need for my parsing.
There is a tokenize function provided:
https://github.com/fkettelhoit/agda-prelude/blob/master/Examples/PrefixCalculator.agda

--
EDIT3:
I finally found a simple combinator library that should be fast enough. There are no examples included in the library so I still have to look how to solve the problem.
Here is the link:

https://github.com/crypto-agda/agda-nplib/blob/master/lib/Text/Parser.agda

There is more agda-code from Nicolas Pouillard online:
https://github.com/crypto-agda


Solution

  • I post here my solution using parser combinators. It uses the agda-nplib library is on github. The code is far from optimal but it works.

    module NewParser where
    
    -- dummy
    open import Data.Maybe
    open import Data.Bool
    
    -- includes
    open import Data.List hiding (map)
    
    -- ***
    -- WAS PRELUDE IMPORTS
    open import StringHelpers using (charToℕ; stringToℕ)
    open import Data.String hiding (_==_; _++_)
    open import Data.Char
    open import Function
    open import Data.Nat
    open import Data.Unit
    open import Data.Maybe
    
    
    -- https://github.com/crypto-agda/agda-nplib/tree/master/lib/Text
    open import Text.Parser
    open import ParserHelpers
    
    
    
    --- ****
    --- Lessons Learned, this is the key:
    --- (was a basic error that tyeps where too specific, generalisation not possible)
    
    -- parseList : {A : Set} → Parser (Maybe A) → Parser (Maybe A) → ℕ → Parser (List (Maybe A))
    -- converted to
    -- parseList : {A : Set} → Parser A → Parser A → ℕ → Parser (List A)
    
    
    
    -- *****
    -- General ... Normal List (depth 1)
    
    parseList : {A : Set} → Parser A → Parser A → ℕ → Parser (List A)
    parseList oneMatcher manyMatcher n = ⟪ _++_ · (map toL oneMatcher) · (many n manyMatcher) ⟫
    
    parseBracketList : {A : Set} → Parser A → Parser A → ℕ → Parser (List A)
    parseBracketList oneMatcher manyMatcher n = bracket '(' (parseList oneMatcher manyMatcher n) ')'
    
    parseCommaListConvert : {A : Set} → (List Char → A) → (Parser (List Char)) → ℕ → Parser (List A)
    parseCommaListConvert convert parser = parseBracketList (⟪ convert · parser ⟫) (⟪ convert · parseOne "," *> parser ⟫) 
    
    
    -- For Numbers
    
    number : Parser (List Char)
    number = manyOneOf (toList "1234567890")
    
    parseNumList : ℕ → Parser (List (Maybe ℕ))
    parseNumList = parseCommaListConvert charsToℕ number
    
    
    -- Nested List (depth 2)
    --
    
    parseListListNum : ℕ → Parser (List (List (Maybe ℕ)))
    parseListListNum n = parseList (parseNumList n) ((parseOne ",") *> (parseNumList n)) n
    
    
    parseManyLists : ℕ → Parser (List (List (Maybe ℕ)))
    parseManyLists n = bracket '(' (parseListListNum n) ')'
    
    
    
    -- Run the Parsers
    --
    
    open import MaybeEliminatorHelper
    
    -- max number of terms is the number of characters in the string
    -- this is for the termination checker
    runParseList' : String → Maybe (List (Maybe ℕ))
    runParseList' s = runParser (parseNumList (strLength s)) (toList s)
    
    runParseList : String → Maybe (List ℕ)
    runParseList = maybe-list-maybe-eliminate ∘ runParseList'
    
    -- nested list
    runParseNesList' : String → Maybe (List (List( Maybe ℕ)))
    runParseNesList' s = runParser (parseManyLists (length (toList s))) (toList s)
    
    runParseNesList : String → Maybe (List (List ℕ))
    runParseNesList = maybe-list-list-maybe-eliminate ∘ runParseNesList' 
    

    Here is are my helper functions:

    module MaybeEliminatorHelper where
    
    open import Data.Maybe
    open import Category.Monad
    open import Function
    
    open import Data.List
    
    open import Category.Functor
    
    
    sequence-maybe : ∀ {a} {A : Set a} → List (Maybe A) → Maybe (List A)
    sequence-maybe = sequence Data.Maybe.monad
    
    
    join : {A : Set} → Maybe (Maybe A) → Maybe A
    join m = m >>= id
      where 
        open RawMonad Data.Maybe.monad
    
    
    maybe-list-elem : {A : Set} → Maybe (List (Maybe A)) → Maybe (List A)
    maybe-list-elem mlm = join (sequence-maybe <$> mlm)
      where open RawFunctor functor
    
    {-
    sequence-maybe : [Maybe a] -> Maybe [a]
    
    join :: Maybe (Maybe a) -> Maybe a
    
    
    Maybe (List (List (Maybe A))
      Maybe.fmap (List.fmap sequenc-maybe)
    Maybe (List (Maybe (List A))
      Maybe.fmap sequence-maybe
    Maybe (Maybe (List (List A)))
      join
    Maybe (List (List A))
    
    
    
    join . Maybe.fmap sequence-maybe . Maybe.fmap (List.fmap sequenc-maybe)
    
    join . Maybe.fmap (sequence-maybe . List.fmap sequenc-maybe)
    (short form)
    
    -}
    
    maybe-list-elem2 : {A : Set} → Maybe (List (List (Maybe A))) → Maybe (List (List A)) 
    maybe-list-elem2 = join ∘ Mfmap (sequence-maybe ∘ Lfmap sequence-maybe)
      where
        open RawMonad Data.Maybe.monad hiding (join) renaming (_<$>_ to Mfmap)
        open RawMonad Data.List.monad hiding (join) renaming (_<$>_ to Lfmap)
    
    
    maybe-list-maybe-eliminate = maybe-list-elem
    
    maybe-list-list-maybe-eliminate = maybe-list-elem2
    

    Further helper functions:

    -- ***
    -- WAS PRELUDE IMPORTS
    open import StringHelpers using (charToℕ; stringToℕ)
    
    open import Data.String hiding (_==_)
    open import Data.Char
    open import Function
    open import Data.Nat
    open import Data.Unit
    open import Data.Maybe
    
    
    
    open import Text.Parser
    
    open import Data.List
    
    
    -- mini helpers
    --
    
    parseOne : String → Parser Char
    parseOne = oneOf ∘ toList 
    
    strLength : String → ℕ
    strLength = length ∘ toList 
    
    
    -- misc helpers
    --
    
    charsToℕ : List Char → Maybe ℕ
    charsToℕ [] = nothing
    charsToℕ xs = stringToℕ (fromList xs)
    
    toL : ∀ {a} {A : Set a} → A → List A
    toL x = x ∷ []
    
    
    -- test
    l : List (Maybe ℕ)
    l = (just 3) ∷ (just 3) ∷ []
    
    
    -- Parser Helpers Nicolas
    --
    isSpace : Char → Bool
    isSpace = (_==_ ' ')
    
    spaces : Parser ⊤
    spaces = manySat isSpace *> pure _
    
    -- decide if seperator before after brackets is spaces
    someSpaces : Parser ⊤
    someSpaces = someSat isSpace *> pure _
    
    tok : Char → Parser ⊤
    tok c = spaces *> char c *> pure _
    
    bracket : ∀ {A} → Char → Parser A → Char → Parser A
    bracket start p stop = tok start *> p <* tok stop
    

    And some test cases:

    tn09 : pList "12,13,,14" ≡ nothing
    tn09 = refl
    
    tn08 : pList "" ≡ nothing
    tn08 = refl
    
    tn07 : pList "12,13,14" ≡ nothing
    tn07 = refl
    
    -- not working tn06 : pList "(12,13,14,17)," ≡ nothing
    -- not working tn06 = refl
    
    tn05 : pList "aa,bb,cc" ≡ nothing
    tn05 = refl
    
    tn04 : pList "11" ≡ nothing 
    tn04 = refl
    
    tn03 : pList "(11,12,13)" ≡ just (11 ∷ 12 ∷ 13 ∷ [])
    tn03 = refl
    
    
    -- new testcases
    tn11 : pList2 "((1,2,3),(4,5,6),(7,8,9))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ (7 ∷ 8 ∷ 9 ∷ []) ∷ [])
    tn11 = refl
    
    -- old testcases
    p1 : pList2 "((1,2,3),(4,5,6))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ [])
    p1 = refl
    
    
    p2 : pList2 "((1,2,3),(4,5,6),(7,8,9,10))" ≡ just ((1 ∷ 2 ∷ 3 ∷ []) ∷ (4 ∷ 5 ∷ 6 ∷ []) ∷ (7 ∷ 8 ∷ 9 ∷ 10 ∷ []) ∷ [])
    p2 = refl
    
    p3 : pList2 "((1),(2))" ≡ just ((1 ∷ []) ∷ (2 ∷ []) ∷ [])
    p3 = refl
    
    p4 : pList2 "((1,2))" ≡ just ((1 ∷ 2 ∷ []) ∷ [])
    p4 = refl
    

    I am open for suggestions to improve the code.