Search code examples
parsingvariableshaskellevalparsec

String to variable name Haskell


I find it hard to learn Parsec in Haskell so I'm trying to make my college project( a parser that parses files with the form

x=3
y=4
z=x+y
badluck=(x+sqrt(z)*7)

I managed to write a function that gets everything from the file and validates the file and I am stuck at making x be the variable name. I know in javascript it's eval but I can't find anything similar in Haskell. Please help!

This is what I have done until now:

ischarorscore :: Char -> Bool
ischarorscore a = if ((a>='A' && a<='Z') || (a>='a' && a<='z') || a=='_')
        then True 
        else False

ischarscoredigit :: Char -> Bool
ischarscoredigit a = if ((a>='A' && a<='Z') || 
                      (a>='a' && a<='z') || 
                       a=='_' ||
                       a>='0' && a<='9') then True else False

isvar :: String -> Bool
isvar [] = False
isvar (h:t) = if (ischarorscore h) then (isvarbody t) else False

isvarbody :: String -> Bool
isvarbody [] = True
isvarbody (h:t) = if (ischarscoredigit h) then (isvarbody t) else False

isoperator :: Char -> Bool
isoperator a = if (a=='+' || a=='-' || a=='*' || a=='/' || a=='^') then True else False

issqrt :: String -> Bool
issqrt [] = True
issqrt x = if (x=="sqrt(") then True else False

isnumber :: String -> Bool
isnumber (h:t) = if (h>='0' && h<='9') then isnumber t else False

charsetall :: String -> Bool
charsetall [] = True
charsetall (h:t) = if (h>='0' && h<='9' ||
                h>='a' && h<='z' ||
                h>='A' && h<='Z' ||
                h>='0' && h<='9' ||
                h=='+' || h=='-' || h=='*' || h=='/' || h=='^' ||           h=='(' || h==')' || h=='=')
                then charsetall t else False

charsetexpr :: String -> Bool
charsetexpr [] = True
charsetexpr (h:t) = if (h>='0' && h<='9' ||
                h>='a' && h<='z' ||
                h>='A' && h<='Z' ||
                h>='0' && h<='9' ||
                h=='+' || h=='-' || h=='*' || h=='/' || h=='^' || h=='(' || h==')')
                then charsetexpr t else False

paranthesis :: String -> Int -> Bool
paranthesis [] a = if (a==0) then True else False
paranthesis (h:t) a = if (h=='(') then (paranthesis t (a+1)) 
                    else (if (h==')') then
                      paranthesis t (a-1) else paranthesis t a)

paranthesis' :: String -> Bool
paranthesis' (h:t) = paranthesis (h:t) 0

obeyrules :: String -> Bool
obeyrules [] = True
obeyrules (h:t) = if (ischarorscore h) then (contvar t) 
              else if (h>='0' && h<='9') then (contnumber t)
              else if (h=='(' || h=='-') then obeyrules t
              else False

contnumber :: String -> Bool
contnumber [] = True
contnumber (h:t) = if (h>='0' && h<='9') then contnumber t
            else if (isoperator h) then contoperator t
            else if (h==')') then contoperator h
            else False

contvar :: String -> Bool
contvar [] = True
contvar (h:t) = if (ischarorscore h || h>='0' && h<='9') then contvar t
            else if (isoperator h) then contoperator t
            else if (h==')') then contoperator
            else False

contoperator :: String -> Bool
contoperator [] = True
contoperator (h:t) = if (ischarorscore h) then contvar t
                else if (h>='0' && h<='9') then contnumber t
                else if (h=='(') then obeyrules t
                else False


isexpression :: String -> Bool
isexpression [] = True
isexpression (h:t) = if (charsetexpr (h:t) && paranthesis' (h:t) && obeyrules (h:t)) then True else False

The isexpression function combines all the functions above to validate a line of the file. The project is like this : you have to read a random file that looks like the one above and contains variable name = expression and recognises (,),+,-,*,/,^ which is power, sqrt(k) where k is number or variable name, and mod, calculate the expressions and provide the resulting variable names and their values.


Solution

  • You most likely want to parse what's called an Abstract Syntax Tree. For the fragment given here, it might look like

    data AST
      = Var String
      | Lit Double
      | Plus AST AST
      | Mult AST AST
      | Sqrt AST
      | Assign String AST
      | Then AST AST
    

    With this choice of AST, the fragment of code you have given looks like

    fragment :: AST
    fragment = foldr1 Then [
        Assign "x" (Lit 3)
      , Assign "y" (Lit 4)
      , Assign "z" (Plus (Var "x") (Var "y"))
      , Assign "badluck" (Plus (Var "x") (Mult (Sqrt (Var "z")) (Lit 7)))
      ]
    

    We then pick an evaluator function which converts these AST values to their "canonical" values. In this case, the basic canonical value would be a Double, but since the AST could fail due to referring to an unknown variable, we'll have the canonical value be Maybe Double instead. Finally, since we expect the environment to change as expressions are evaluated, we return the updated environment.

    In order to operationalize this evaluator, we need a notion of the environment which gets assigned to. A good choice for this is Data.Map

    import           Control.Applicative
    import qualified Data.Map            as Map
    import           Data.Map            (Map)
    
    type Env = Map String Double
    
    eval :: Env -> AST -> Maybe (Double, Env)
    eval env (Var s) = fmap (\x -> (x, env)) (Map.lookup s env)
    eval env (Plus e1 e2) = do -- maybe monad
      (v1, env')  <- eval env  e1
      (v2, env'') <- eval env' e2
      return (v1 + v2, env'')
    eval env (Mult e1 e2) = do -- maybe monad
      (v1, env')  <- eval env  e1
      (v2, env'') <- eval env' e2
      return (v1 * v2, env'')
    eval env (Sqrt e)     = fmap (\(x, e) -> (sqrt x, e)) (eval env e)
    eval env (Assign s e) = do
      (v, env') <- eval env e
      return (v, Map.insert s v env')
    eval env (Then e1 e2) = do 
      (_, env')  <- eval env  e1
      (v, env'') <- eval env' e2
      return (v, env'')
    

    Technically, this could have been done using a monad transformer stack of State Env and Maybe, but as written the operation might be a little more clear if tedious.