compiler-construction ocaml ocamllex

Eof error in OCaml


I'm writing a compiler for a class in OCaml. It needs to read a file containing commands or expressions such as "1" and return the corresponding AST, here Int 1. The same code works for everyone in the class except me and a friend, even though we all use the same OCaml version on Ubuntu 13.04. The error is: Lexico.Eof

Does anyone have any idea what this could be? This is asa.ml:

type opB =
| Soma
| Sub
| Mul
| Div

type exp =
| Int of int
| Float of float
| String of string
| Char of char
| Identificador of string
| Bin of opB * exp * exp

This is Sintatico.mly:

%{
open Asa;;
%}

%token <int> INT
%token <float> FLOAT
%token <string> STRING
%token <char> CHAR
%token <string> IDENTIFICADOR
%token APAREN FPAREN PTVIRG
%token MAIS MENOS MUL DIV

%left MAIS MENOS
%left MUL DIV

%start main
%type <Asa.exp> main

%%

main: expr                 { $1 }
;

expr: IDENTIFICADOR         { Identificador($1) }
| INT                       { Int($1) }
| FLOAT                     { Float($1) }
| STRING                    { String($1) }
| CHAR                      { Char($1) }
| APAREN expr FPAREN        { $2 }
| expr MAIS expr            { Bin(Soma, $1, $3) }
| expr MENOS expr           { Bin(Sub, $1, $3) }
| expr MUL expr             { Bin(Mul, $1, $3) }
| expr DIV expr             { Bin(Div, $1, $3) }
;

The Lexico.mll:

{
open String
open Sintatico
exception Eof
}

let digito = ['0'-'9']
let caracter = [^ '\n' '\t' '\b' '\r' '\'' '\\']
let identificador = ['a'-'z' 'A'-'Z']['a'-'z' '0'-'9']*

rule token = parse
| [' ' '\t' '\n']   { token lexbuf } (* ignora os espacos *)
| digito+ as inum   { print_string " int ";  INT (int_of_string inum) }
| digito+'.'digito+ as fnum { print_string " float "; FLOAT (float_of_string fnum) }
| '\"' ([^ '"']* as s) '\"' { print_string " string "; STRING (s)}
| '\'' caracter '\'' as ch      { print_string " char "; CHAR (String.get ch 1) }

| identificador as id       { print_string " identificador "; IDENTIFICADOR (id) }

| '('               { print_string " abreparent "; APAREN }
| ')'               { print_string " fechaparent "; FPAREN }

| '+'               { print_string " + "; MAIS }
| '-'               { print_string " - "; MENOS }
| '*'               { print_string " * "; MUL }
| '/'               { print_string " / "; DIV }

| ';'                           { print_string " ptv "; PTVIRG }

| eof               { raise Eof }

The code that calls the parser, in a file named carregatudo.ml, is:

#load "asa.cmo"
#load "sintatico.cmo"
#load "lexico.cmo"

open Asa;;

let analisa_arquivo arquivo =
  let ic = open_in arquivo in
  let lexbuf = Lexing.from_channel ic in
  let asa = Sintatico.main Lexico.token lexbuf in
  close_in ic;
  asa

Sorry about the Portuguese:

arquivo means file

Lexico means Lexer

Sintatico means Parser

First I run this makefile using the command make interpretador:

CAMLC = ocamlc
CAMLLEX = ocamllex
CAMLYACC = ocamlyacc

interpretador: asa.cmo sintatico.cmi sintatico.cmo lexico.cmo

portugol: asa.cmo sintatico.cmi sintatico.cmo lexico.cmo principal.cmo

clean:
	rm *.cmo *.cmi

# regras genericas
.SUFFIXES: .mll .mly .mli .ml .cmi .cmo .cmx
.mll.mli:
	$(CAMLLEX) $<
.mll.ml:
	$(CAMLLEX) $<
.mly.mli:
	$(CAMLYACC) $<
.mly.ml:
	$(CAMLYACC) $<
.mli.cmi:
	$(CAMLC) -c $(FLAGS) $<
.ml.cmo:
	$(CAMLC) -c $(FLAGS) $<

Next I load carregatudo.ml in the toplevel: #use "carregatudo.ml";;

Then I call the function: analisa_arquivo("teste.pt");;

The input file teste.pt contains:

1

and the return should be

Int 1

But I keep getting the error Lexico.Eof

Thank you!


Solution

  • The parser needs a token of lookahead to decide whether one of the recursive rules (such as expr MAIS expr) continues, so after reading 1 it asks the lexer for another token. The lexer is then at the end of the file and raises Eof. Basically your parser is running off the end of the file because the grammar has no rule telling it where the expression stops.

    An easy fix is to replace the Eof exception with a token END_OF_INPUT (declare it with %token END_OF_INPUT and return it from the lexer's eof rule instead of raising), and match against that in the grammar:

    main: expr END_OF_INPUT { $1 }
    

    Alternatively, you could introduce an explicit terminator such as ; (the PTVIRG token that the grammar already declares) and require it after expr in the main rule.
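
    To see the lookahead problem without ocamllex or ocamlyacc, here is a rough, self-contained sketch. Everything in it is a hypothetical stand-in rather than your generated code: make_lexer, parse, the raise_at_end flag and the simplified exp and token types are made up for the illustration, and the parser is a small hand-written one rather than the LR automaton that ocamlyacc produces. It only assumes that the parser asks for the next token right after reading an integer, which is the behaviour described above.

```ocaml
(* Simplified AST: operators are kept as chars for brevity. *)
type exp =
  | Int of int
  | Bin of char * exp * exp

type token =
  | INT of int
  | OP of char
  | END_OF_INPUT

exception Eof

(* Token function over a string. [raise_at_end] selects the original
   behaviour (raise Eof) or the suggested fix (return END_OF_INPUT). *)
let make_lexer ~raise_at_end (s : string) : unit -> token =
  let pos = ref 0 in
  let len = String.length s in
  let rec next () =
    if !pos >= len then
      (if raise_at_end then raise Eof else END_OF_INPUT)
    else
      match s.[!pos] with
      | ' ' | '\t' | '\n' -> incr pos; next ()
      | '0' .. '9' ->
        let start = !pos in
        while !pos < len && s.[!pos] >= '0' && s.[!pos] <= '9' do
          incr pos
        done;
        INT (int_of_string (String.sub s start (!pos - start)))
      | ('+' | '-' | '*' | '/') as c -> incr pos; OP c
      | c -> failwith (Printf.sprintf "unexpected character %c" c)
  in
  next

(* '+' and '-' bind less tightly than '*' and '/'. *)
let prec = function '+' | '-' -> 1 | _ -> 2

(* Precedence-climbing parser with one token of lookahead in [tok]. *)
let parse (next : unit -> token) : exp =
  let tok = ref (next ()) in
  let advance () = tok := next () in
  let atom () =
    match !tok with
    | INT n -> advance (); Int n   (* reads the lookahead token here *)
    | _ -> failwith "expected an integer"
  in
  let rec expr min_prec =
    let lhs = ref (atom ()) in
    let rec loop () =
      match !tok with
      | OP c when prec c >= min_prec ->
        advance ();
        let rhs = expr (prec c + 1) in   (* left-associative *)
        lhs := Bin (c, !lhs, rhs);
        loop ()
      | _ -> ()
    in
    loop ();
    !lhs
  in
  let e = expr 1 in
  (* Counterpart of "main: expr END_OF_INPUT" in the grammar. *)
  match !tok with
  | END_OF_INPUT -> e
  | _ -> failwith "trailing input"
```

    With this sketch, parse (make_lexer ~raise_at_end:false "1") evaluates to Int 1, while parse (make_lexer ~raise_at_end:true "1") lets Eof escape, because the token requested after the 1 is already past the end of the input.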