Location of the error token always starts at 0

I'm writing a parser with error handling. I would like to output to the user the exact location of the parts of the input that couldn't be parsed.

However, the location of the error token always starts at 0, even when parts of the input before it were parsed successfully.

Here's a heavily simplified example of what I did. (The problematic part is probably in parser.yy.)

Location.hh:

#pragma once
#include <string>

// The full version tracks position in bytes, line number and offset in the current line.
// Here however, I've shortened it to line number only.
struct Location
{
    int beginning, ending;
    operator std::string() const { return std::to_string(beginning) + '-' + std::to_string(ending); }
};

LexerClass.hh:

#pragma once
#include <istream>
#include <string>
#if ! defined(yyFlexLexerOnce)
    #include <FlexLexer.h>
#endif
#include "Location.hh"

class LexerClass : public yyFlexLexer
{
    int currentPosition = 0;
protected:
    std::string *yylval = nullptr;
    Location *yylloc = nullptr;
public:
    LexerClass(std::istream &in) : yyFlexLexer(&in) {}
    [[nodiscard]] int yylex(std::string *const lval, Location *const lloc);
    void onNewLine() { yylloc->beginning = yylloc->ending = ++currentPosition; }
};

lexer.ll:

%{
    #include "./parser.hh"
    #include "./LexerClass.hh"
    
    #undef  YY_DECL
    #define YY_DECL int LexerClass::yylex(std::string *const lval, Location *const lloc)
%}

%option c++ noyywrap
%option yyclass="LexerClass"

%%

%{
    yylval = lval;
    yylloc = lloc;
%}

[[:blank:]] ;
\n          { onNewLine(); }
[0-9]       { return yy::Parser::token::DIGIT; }
.           { return yytext[0]; }

parser.yy:

%language "c++"

%code requires {
    #include "LexerClass.hh"
    #include "Location.hh"
}

%define api.parser.class {Parser}
%define api.value.type {std::string}
%define api.location.type {Location}
%parse-param {LexerClass &lexer}
%defines

%code {
    template<typename RHS>
    void calcLocation(Location &current, const RHS &rhs, const int n);
    #define YYLLOC_DEFAULT(Cur, Rhs, N) calcLocation(Cur, Rhs, N)
    
    #define yylex lexer.yylex
}

%token DIGIT

%%

numbers:
      %empty
    | numbers number ';' { std::cout << std::string(@number) << "\tnumber" << std::endl; }
    | error ';' { yyerrok; std::cerr << std::string(@error) << "\terror context" << std::endl; }
    ;

number:
      DIGIT {}
    | number DIGIT {}
    ;

%%

#include <iostream>

template<typename RHS>
inline void calcLocation(Location &current, const RHS &rhs, const int n)
{
    current = (n <= 1)
        ? YYRHSLOC(rhs, n)
        : Location{YYRHSLOC(rhs, 1).beginning, YYRHSLOC(rhs, n).ending};
}

void yy::Parser::error(const Location &location, const std::string &message)
{
    std::cout << std::string(location) << "\terror: " << message << std::endl;
}

int main()
{
    LexerClass lexer(std::cin);
    yy::Parser parser(lexer);
    return parser();
}

For the input:

expected output:

0-2 number
3-3 number
5-5 error: syntax error
4-6 error context
7-8 number

actual output:

0-2 number
3-3 number
5-5 error: syntax error
0-6 error context
7-8 number

Solution

I'm building upon rici's answer, so read that one first.

Let's consider the rule:

numbers:
      %empty
    | numbers number ';'
    | error ';' { yyerrok; }
    ;

This means the nonterminal numbers can be one of these three things:

It may be empty.
It may be any valid numbers followed by a number and a ';'.
It may be an error followed by a ';'.

Do you see the problem yet? The whole numbers has to be an error, from the beginning; there is no rule saying that anything else is allowed before it. Of course, Bison obediently complies with your wishes and makes the error start at the very beginning of the nonterminal numbers. It can do that because error is a jack of all trades: no rule restricts what it may contain. To satisfy your rule, Bison has to extend the error over all the previous numbers.
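To make the mechanism concrete, here is a rough model of how the start of the error location follows from how far the parser stack is unwound. It is only a sketch of my understanding of the recovery step, not Bison's actual code: StackEntry, span, errorLocation and exampleStack are hypothetical names, and the locations in exampleStack are made up for illustration. The assumption is that the parser discards stack entries until it reaches a state that accepts error, and that the error token then spans from the last discarded entry to the lookahead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Same shape as the Location struct from the question.
struct Location
{
    int beginning, ending;
};

// Hypothetical stack entry: a symbol name plus its location.
struct StackEntry
{
    std::string symbol;
    Location location;
};

// Mirrors calcLocation for n == 2: from the start of the first to the end of the last.
Location span(const Location &first, const Location &last)
{
    return Location{first.beginning, last.ending};
}

// Simplified model of error recovery (an assumption, not Bison's real code):
// discard `pops` entries from the top of the stack, then build the error
// token's location from the last discarded entry and the lookahead.
Location errorLocation(std::vector<StackEntry> stack, std::size_t pops, const Location &lookahead)
{
    assert(pops <= stack.size());
    Location firstDiscarded = lookahead; // nothing popped: the error starts at the lookahead
    for (std::size_t i = 0; i < pops; ++i)
    {
        firstDiscarded = stack.back().location;
        stack.pop_back();
    }
    return span(firstDiscarded, lookahead);
}

// Made-up stack contents at the moment the unexpected token is seen.
std::vector<StackEntry> exampleStack()
{
    return {{"numbers", {0, 3}}, {"number", {4, 4}}};
}
```

With a lookahead at 5-5, errorLocation(exampleStack(), 2, {5, 5}) gives 0-5, because the already-reduced numbers is discarded too, whereas errorLocation(exampleStack(), 1, {5, 5}) gives 4-5. How many entries get discarded depends on which states accept error, and that is decided by where error appears in the grammar.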

Once you understand the problem, fixing it is rather easy. You just need to tell Bison that numbers are allowed before the error:

numbers:
      %empty
    | numbers number ';'
    | numbers error ';' { yyerrok; }
    ;

This is IMO the best solution. There is another approach, though.

You can move the error token into number:

numbers:
      %empty
    | numbers number ';' { yyerrok; }
    ;

number:
      DIGIT
    | number DIGIT
    | error
    ;

Notice that yyerrok needs to stay in numbers, because the parser would enter an infinite loop if you placed it in a rule that ends with the error token.

A disadvantage of this approach is that if you attach an action to this error, it will be triggered multiple times (more or less once per illegal terminal). Maybe in some situations this is preferable, but generally I suggest the first way of solving the issue.