
Simple MATLAB lexer program


I wrote a simple lexer in MATLAB that categorizes the lexemes in a string typed by the user. However, when I enter a string in the Command Window, the identifiers are not displayed.

The code is as follows:

function determineLexemes()
   j = 0;
   prompt = 'Enter string : ';
   str = input(prompt);
   arr = char(str);
   strTwo = '';
   display('Symbol Table');
   fprintf('Lexeme \t\t Token \n');
   k = length(arr);
   for i = 1: k
     if(arr(i) == '+')
       fprintf('+ \t\t ADD_OP \n');
     end
    if(arr(i) == '-')
       fprintf('- \t\t SUB_OP \n');
    end
    if(arr(i) == '*')
       fprintf('* \t\t MULT_OP \n');
    end
    if(arr(i) == '/')
       fprintf('/ \t\t DIV_OP \n');
    end
   if(arr(i) == '(')
      fprintf('( \t\t LEFT_PAREN \n');
   end
   if(arr(i) == ')')
      fprintf(') \t\t RIGHT_PAREN \n');
   end
   if(arr(i) == '=')
      fprintf('= \t\t EQUAL_OP \n');
   end


   x = str2num(arr(i));
   y = isletter(arr(i));


   if(y || (isempty(x) ==0))
      strTwo = strcat(strTwo,arr(i));
   end


   if(~ischar(arr(i)) && ~isnumeric(arr(i)))
      if(~isspace(arr(i)) && ~isempty(strTwo))
           m(j) = strTwo;

           if(isNumeric(strTwo(1)) && regexp('.*[a-zA-]+.*'))
               disp(strcat('Error. Potential variable (', strTwo, ') whose name starts with digit found'));
               strTwo = '';
               j = j + 1;
           end
           if(~(isNumeric(strTwo(1) && regexp('.*[a-zA-]+.*'))))
               disp(strcat(m(j), ('\t\t IDENTIFIER')));
               strTwo = '';
               j = j + 1;   
           end 
       end
    end 
 end
end

The intended output, when '(2a + b)' is entered at the prompt, is as follows:

[Image: screenshot of the intended output]

However, the current output does not show the identifiers (2a and b in this example).

Any help on this problem is appreciated.


Solution

  • I tried to keep the changes to your code to a minimum, but there were quite a few mistakes (including isNumeric instead of isnumeric and a missing input argument in the regexp calls). I hope this works for you.

    function determineLexemes()
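       j = 1;
       prompt = 'Enter string : ';
       % input without the 's' option evaluates what is typed, so enter the
       % string with quotes, e.g. '(2a + b)'
       str = input(prompt);
       arr = char(str);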
       strTwo = '';
       display('Symbol Table');
       fprintf('Lexeme \t\t Token \n');
       k = length(arr);
       for i = 1: k
          % Single-character operators and parentheses are reported immediately
          if(arr(i) == '+')
             fprintf('+ \t\t ADD_OP \n');
          end
          if(arr(i) == '-')
             fprintf('- \t\t SUB_OP \n');
          end
          if(arr(i) == '*')
             fprintf('* \t\t MULT_OP \n');
          end
          if(arr(i) == '/')
             fprintf('/ \t\t DIV_OP \n');
          end
          if(arr(i) == '(')
             fprintf('( \t\t LEFT_PAREN \n');
          end
          if(arr(i) == ')')
             fprintf(') \t\t RIGHT_PAREN \n');
          end
          if(arr(i) == '=')
             fprintf('= \t\t EQUAL_OP \n');
          end
    
    
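          % isstrprop is used instead of str2num, which evaluates its input
          % and would treat the letters 'i' and 'j' as (imaginary) numbers
          x = isstrprop(arr(i), 'digit');
          y = isletter(arr(i));
    
    
          % Letters and digits are appended to the current lexeme buffer
          if(y || x)
             strTwo = strcat(strTwo,arr(i));
          end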
    
          % Classify the buffer whenever the current character is not whitespace
          if(~isspace(arr(i)) && ~isempty(strTwo))
             startsWithDigit = isstrprop(strTwo(1), 'digit');
             if(startsWithDigit && ~isempty(regexp(strTwo, '[a-zA-Z]', 'once')))
                % Digit followed by at least one letter: not a valid name
                fprintf('Error. Potential variable (%s) whose name starts with digit found \n', strTwo);
                strTwo = '';
                j = j + 1;
             elseif(~startsWithDigit)
                fprintf('%s\t\t IDENTIFIER \n', strTwo);
                strTwo = '';
                j = j + 1;
             end
          end
      end
    end
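
One limitation worth noting: the function above classifies the buffer after every character, so a multi-letter name such as ab is reported one letter at a time. A possible alternative, shown here only as a rough sketch, is to split the input into whole lexemes first with a single regexp call and classify each lexeme afterwards. The function name sketchLexemes and the token names INT_LIT and ERROR are hypothetical and do not appear in the question; identifiers are assumed to consist of letters and digits only.

```matlab
function tokens = sketchLexemes(str)
% sketchLexemes  Hypothetical helper, not part of the original answer.
% Splits str into whole lexemes first, then classifies each one.
% Returns an N-by-2 cell array: {lexeme, token name}.

    % A lexeme is either a run of letters/digits or a single operator/paren.
    lexemes = regexp(str, '[a-zA-Z0-9]+|[-+*/()=]', 'match');

    % Operator characters and their token names, in matching order.
    opChars = '+-*/()=';
    opNames = {'ADD_OP', 'SUB_OP', 'MULT_OP', 'DIV_OP', ...
               'LEFT_PAREN', 'RIGHT_PAREN', 'EQUAL_OP'};

    tokens = cell(numel(lexemes), 2);
    for n = 1:numel(lexemes)
        lex = lexemes{n};
        idx = find(opChars == lex(1), 1);
        if ~isempty(idx)
            name = opNames{idx};          % operator or parenthesis
        elseif all(isstrprop(lex, 'digit'))
            name = 'INT_LIT';             % assumed name for an all-digit lexeme
        elseif isstrprop(lex(1), 'digit')
            name = 'ERROR';               % starts with a digit but contains letters
        else
            name = 'IDENTIFIER';
        end
        tokens(n, :) = {lex, name};
        fprintf('%s\t\t %s\n', lex, name);
    end
end
```

Calling tokens = sketchLexemes('(2a + b)') should list (, 2a, +, b and ) in order, with 2a flagged as ERROR and b as IDENTIFIER. Because the string is passed as an argument, no quoting at an input prompt is involved.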