
Calling Ruby regex through the C API from C code is not working


I am trying to call Ruby's regex engine from C code:

#include <ruby.h>
#include "ruby/re.h"

int main(int argc, char** argv) {
    
    char string[] = "regex";
    ruby_setup();
    rb_reg_regcomp(string);
    return 0;
}

I compiled the newest version of Ruby myself (commit 0b303c683007598a31f2cda3d512d981b278f8bd) and linked my program against it. The program compiles with this warning:

fuzzer.c: In function ‘main’:
fuzzer.c:10:17: warning: passing argument 1 of ‘rb_reg_regcomp’ makes integer from pointer without a cast [-Wint-conversion]
   10 |  rb_reg_regcomp(string);
      |                 ^~~~~~
      |                 |
      |                 char *
In file included from fuzzer.c:4:
/home/cyberhacker/Asioita/Hakkerointi/Rubyregex/ruby/build/output/include/ruby-3.3.0+0/ruby/re.h:36:28: note: expected ‘VALUE’ {aka ‘long unsigned int’} but argument is of type ‘char *’
   36 | VALUE rb_reg_regcomp(VALUE str);

I think that is because the VALUE type in the Ruby source code is a generic reference to a Ruby object of any type. When I run the program, I get a segfault with this backtrace:

Program received signal SIGSEGV, Segmentation fault.
rb_enc_dummy_p (enc=enc@entry=0x0) at ../encoding.c:181
181     return ENC_DUMMY_P(enc) != 0;
(gdb) where
#0  rb_enc_dummy_p (enc=enc@entry=0x0) at ../encoding.c:181
#1  0x000055555569bd00 in rb_reg_initialize (obj=obj@entry=140737345038080, s=0xc62000007ffff78a <error: Cannot access memory at address 0xc62000007ffff78a>, len=-4574812796478291968, enc=enc@entry=0x0, options=options@entry=0, err=err@entry=0x7fffffffdb30 "", sourcefile=0x0, sourceline=0) at ../re.c:3198
#2  0x00005555556a11c8 in rb_reg_initialize_str (sourceline=0, sourcefile=0x0, err=0x7fffffffdb30 "", options=0, str=140737488346082, obj=140737345038080) at ../include/ruby/internal/core/rstring.h:516
#3  rb_reg_init_str (options=0, s=140737488346082, re=140737345038080) at ../re.c:3299
#4  rb_reg_new_str (options=0, s=140737488346082) at ../re.c:3291
#5  rb_reg_regcomp (str=140737488346082) at ../re.c:3373
#6  0x0000555555584648 in main () at ../include/ruby/internal/encoding/encoding.h:418

I tried fiddling with the type of the string that I pass to the function, but nothing really seemed to work. The expected behaviour is that it runs successfully.

Can someone help? Thanks in advance!


Solution

  • After a bit of digging, I figured out that you need to convert the C string to a Ruby string and then pass that to the function. I was confused because the documentation says: "Ruby’s String kinda corresponds to C’s char*."

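    #include <ruby.h>
    #include "ruby/re.h"

    int main(int argc, char** argv) {
        VALUE x;
        char string[] = "regex";
        ruby_setup();                 /* initialize the VM before creating any Ruby object */
        x = rb_str_new_cstr(string);  /* wrap the C string in a Ruby String */
        rb_reg_regcomp(x);            /* compile the String into a Regexp */
        return 0;
    }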