How to parse non-ASCII / arbitrary chars in Ragel

I want to parse the C string char str[] = {0x1b, 'h', 'i'} using Ragel.

I produce this sequence on the bash command line using $'\x1bhi'.

However, I am unable to get Ragel to execute the CmdAction.


#include <string.h>
#include <stdio.h>

%%{
    machine foo;

    Space = ' ';

    Cmd = ('h'|'i') +;
    action CmdAction
    {
        fprintf(stderr, "cmd:%.*s\n", (int)(te - ts), ts);
    }

    main :=
    |*
        [\x1b] Cmd => CmdAction;
        "\x1b" Cmd => CmdAction;
        'E' Cmd => CmdAction;
    *|;
}%%

%% write data;

int main( int argc, char **argv ) {
    for (int i = 0; i < strlen(argv[1]); i++) {
        fprintf(stderr, "%d] 0x%02x\n", i, argv[1][i]);
    }
    int cs, res = 0;
    int top;
    char *ts;
    char *te;
    int act;
    char *eof = NULL;
    int stack[128];
    if ( argc > 1 ) {
        char *p = argv[1];
        char *pe = p + strlen(p) + 1;
        %% write init;
        %% write exec;
    }
    printf("result = %i\n", res );
    return 0;
}

// from bash use $'' to produce raw data strings as arg
//main $'\x1bhi'
ragel main.c -o main.c.c
gcc main.c.c -o main
./main $'\x1bhi'
0] 0x1b
1] 0x68
2] 0x69
result = 0
./main $'Ehi'
0] 0x45
1] 0x68
2] 0x69
result = 0

How to parse arbitrary chars in Ragel?

What input would the above Ragel code accept?


  • Quick fix:

    --- main.c.orig 2025-02-06 13:29:28.501665490 +0300
    +++ main.c      2025-02-06 13:29:58.933601263 +0300
    @@ -19,8 +19,7 @@
         main :=
    -        [\x1b] Cmd => CmdAction;
    -        "\x1b" Cmd => CmdAction;
    +        0x1b Cmd => CmdAction;
             'E' Cmd => CmdAction;

    The problem is the syntax you use to specify 0x1b. Ragel doesn't support (and doesn't need) the \x escape; \x is interpreted as a plain x, so:

    • [\x1b] matches any one of x, 1 or b (try ./main bhi, ./main xhi, ./main 1hi with the original code)
    • "\x1b" is the literal string x1b, which can be checked with ./main x1bhi

    A plain 0x1b, on the other hand, works as expected.