Hi i am trying to run John code from lex and yacc book by R. Levine i have compiled the lex and yacc program in linux using the commands
lex example.l
yacc example.y
gcc -o example y.tab.c
./example
the program asks the user for input of verbs,nouns,prepositions e.t.c in the format
verb accept,admire,reject
noun jam,pillow,knee
and then runs the grammar in yacc to check if it's a simple or compound sentence
but when i type
jam reject knee
it shows noting on screen where it is supposed to show the line "Parsed a simple sentence." on parsing.The code is given below
yacc file
%{
#include <stdio.h>
/* we found the following required for some yacc implementations. */
/* #define YYSTYPE int */
%}
%token NOUN PRONOUN VERB ADVERB ADJECTIVE PREPOSITION CONJUNCTION
%%
sentence: simple_sentence { printf("Parsed a simple sentence.\n"); }
| compound_sentence { printf("Parsed a compound sentence.\n"); }
;
simple_sentence: subject verb object
| subject verb object prep_phrase
;
compound_sentence: simple_sentence CONJUNCTION simple_sentence
| compound_sentence CONJUNCTION simple_sentence
;
subject: NOUN
| PRONOUN
| ADJECTIVE subject
;
verb: VERB
| ADVERB VERB
| verb VERB
;
object: NOUN
| ADJECTIVE object
;
prep_phrase: PREPOSITION NOUN
;
%%
extern FILE *yyin;
main()
{
while(!feof(yyin)) {
yyparse();
}
}
yyerror(s)
char *s;
{
fprintf(stderr, "%s\n", s);
}
lex file
%{
/*
* We now build a lexical analyzer to be used by a higher-level parser.
*/
#include "ch1-06y.h" /* token codes from the parser */
#define LOOKUP 0 /* default - not a defined word type. */
int state;
%}
%%
\n { state = LOOKUP; }
\.\n { state = LOOKUP;
return 0; /* end of sentence */
}
^verb { state = VERB; }
^adj { state = ADJECTIVE; }
^adv { state = ADVERB; }
^noun { state = NOUN; }
^prep { state = PREPOSITION; }
^pron { state = PRONOUN; }
^conj { state = CONJUNCTION; }
[a-zA-Z]+ {
if(state != LOOKUP) {
add_word(state, yytext);
} else {
switch(lookup_word(yytext)) {
case VERB:
return(VERB);
case ADJECTIVE:
return(ADJECTIVE);
case ADVERB:
return(ADVERB);
case NOUN:
return(NOUN);
case PREPOSITION:
return(PREPOSITION);
case PRONOUN:
return(PRONOUN);
case CONJUNCTION:
return(CONJUNCTION);
default:
printf("%s: don't recognize\n", yytext);
/* don't return, just ignore it */
}
}
}
. ;
%%
/* define a linked list of words and types */
struct word {
char *word_name;
int word_type;
struct word *next;
};
struct word *word_list; /* first element in word list */
extern void *malloc();
int
add_word(int type, char *word)
{
struct word *wp;
if(lookup_word(word) != LOOKUP) {
printf("!!! warning: word %s already defined \n", word);
return 0;
}
/* word not there, allocate a new entry and link it on the list */
wp = (struct word *) malloc(sizeof(struct word));
wp->next = word_list;
/* have to copy the word itself as well */
wp->word_name = (char *) malloc(strlen(word)+1);
strcpy(wp->word_name, word);
wp->word_type = type;
word_list = wp;
return 1; /* it worked */
}
int
lookup_word(char *word)
{
struct word *wp = word_list;
/* search down the list looking for the word */
for(; wp; wp = wp->next) {
if(strcmp(wp->word_name, word) == 0)
return wp->word_type;
}
return LOOKUP; /* not found */
}
header file
# define NOUN 257
# define PRONOUN 258
# define VERB 259
# define ADVERB 260
# define ADJECTIVE 261
# define PREPOSITION 262
# define CONJUNCTION 263
The build details you describe do not follow the usual pattern, and in fact they do not work for the code you provide.
Having sorted out how to build your program, it does not work at all, instead segfaulting before reading any input.
Having solved that problem, your expectation of the program's behavior with the given input is incorrect in at least two ways.
yacc
builds C source for a parser and optionally a header file containing corresponding token definitions. It is usual to exercise the option to get the definitions, and to #include
their header in the lexer's source file (#include 'y.tab.h'
):
yacc -d example.y
lex
builds C source for a lexical analyzer. This can be done either before of after yacc
, as lex
does not depend directly on the token definitions:
lex example.l
The two generated C source files must be compiled and linked together, possibly with other sources as well, and possibly with libraries. In particular, it is often convenient to link in libl
(or libfl
if your lex
is really GNU flex
). I linked the latter to get the default yywrap()
:
gcc -o example lex.yy.c y.tab.c -lfl
Your generated program is built around this:
extern FILE *yyin;
main()
{
while(!feof(yyin)) {
yyparse();
}
}
In the first place, you should read Why is “while ( !feof (file) )” always wrong?. Having had that under consideration might have spared you from committing a much more fundamental mistake: evaluating yyin
before it has been set. Although it's true that yyin
will be set to stdin
if you don't set it to something else, that cannot happen at program initialization because stdin
is not a compile-time constant. Therefore, when control first reaches the loop control expression, yyin
's value is still NULL, and a segfault results.
It would be safe and make more sense to test for end of file after yyparse()
returns.
You complained that the input
verb accept,admire,reject
noun jam,pillow,knee
jam reject knee
does not elicit any output from the program, but that's not exactly true. That input does not elicit output from the program when it is entered interactively, without afterward sending an end-of-file signal (i.e. by typing control-D at the beginning of a line).
The parser not yet having detected end-of-file in that case (and not paying any attention at all to newlines, since your lexer notifies it about them only when they immediately follow a period), it has no reason to attempt to reduce its token stack to the start symbol. It could be that you will continue with an object to extend the simple sentence, and it cannot be sure that it won't see a CONJUNCTION next, eithger. It doesn't print anything because it's waiting for more input. If you end the sentence with a period or afterward send a control-D then it will in fact print "Parsed a simple sentence."