How to write a Ruby-regex pattern in Java (includes recursive named-grouping)?


Well, I have a file containing a tintin script. I have already managed to grab all actions and substitutions from it and show them properly ordered on a website using Ruby, which helps me keep an overview.

Example TINTIN-script

#substitution {You tell {([a-zA-Z,\-\ ]*)}, %*$}
              {<279>[<269> $sysdate[1]<279>, <269>$systime<279> |<219> Tell  <279>] <269>to   <219>%2<279> : <219>%3} 
              {4}
#substitution {{([a-zA-Z,\-\ ]*)} tells you, %*$}  
              {<279>[<269> $sysdate[1]<279>, <269>$systime<279> |<119> Tell  <279>] <269>from <119>%2<279> : <119>%3} 
              {2}

#action {Your muscles suddenly relax, and your nimbleness is gone.}
{
    #if {$sw_keepaon}
    {
        aon;
    };
} {5}

#action {xxxxx}
{
    #if {$sw_keepfamiliar}
    {
        familiar $familiar;
    };
} {5}

To grab them in my Ruby app, I read my script file into a variable 'input' and then scan 'input' with the following pattern:

pattern = /(?<braces>{([^{}]|\g<braces>)*}){0}^#(?<type>action|substitution)\s*(?<b1>\g<braces>)\s*(?<b2>\g<braces>)\s*(?<b3>\g<braces>)/im

input = ""

File.open("/home/igambin/lmud/lmud.tt") { |file| input = file.read }

input.scan(pattern) { |prio, type, pattern, code|
  ## here i usually create objects, but for simplicity only output now
  puts "Type    : #{type}"
  puts "Pattern : #{pattern}"
  puts "Priority: #{prio}"
  puts "Code    :\n#{code}"
  puts
}

Now my idea is to use the NetBeans Platform to write a module that not only keeps an overview but also assists in editing the tintin script file. So, with the file open in an editor window, I still need to parse it and have all 'actions' and 'substitutions' grabbed and displayed in an ETable, in which I could double-click an item to open a modification window.

I've set up the module and got everything ready so far; I just can't figure out how to translate the Ruby regex pattern I've written into a working Java regex pattern. It seems that named group capturing, and especially the recursive application of these groups, is not supported in Java. Without that I seem to be unable to find a working solution.

Here's the Ruby pattern again:

pattern = /(?<braces>{([^{}]|\g<braces>)*}){0}^#(?<type>action|substitution)\s*(?<b1>\g<braces>)\s*(?<b2>\g<braces>)\s*(?<b3>\g<braces>)/im

Can anyone help me create a Java pattern that matches the same input?

Many thanks in advance for tips, hints and ideas, and especially for solutions (or close-to-solution comments)!


Solution

  • Your text format seems pretty simple; it's possible you don't really need recursive matching. This Java-compatible regex matches your sample data correctly, as far as I can tell:

    (?s)#(substitution|action)\s*\{(.*?)\}\s*\{(.*?)\}\s*\{(\d+)\}
    

    Would that work for you? If you run Java 7 or later, you can even name the groups. ;)
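
A minimal sketch of how that expression could be used from Java with named groups, assuming Java 7 or later. The class name TintinScanner, the scan helper, and the group names type, pattern, code and prio are made up for illustration; they are not from the original post. Like the regex itself, the sketch assumes that the priority is purely numeric and that no code body contains a closing brace followed directly by a braced number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TintinScanner {

    // Same expression as in the answer, with the four groups named.
    // (?s) lets '.' match line breaks, so multi-line bodies are captured.
    static final Pattern ENTRY = Pattern.compile(
        "(?s)#(?<type>substitution|action)\\s*"
        + "\\{(?<pattern>.*?)\\}\\s*"
        + "\\{(?<code>.*?)\\}\\s*"
        + "\\{(?<prio>\\d+)\\}");

    // Returns one {type, pattern, code, priority} array per matched entry.
    static List<String[]> scan(String input) {
        List<String[]> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            entries.add(new String[] {
                m.group("type"), m.group("pattern"), m.group("code"), m.group("prio")
            });
        }
        return entries;
    }

    public static void main(String[] args) {
        // One of the sample actions from the question, inlined as a string.
        String input = "#action {xxxxx}\n{\n    #if {$sw_keepfamiliar}\n    {\n"
            + "        familiar $familiar;\n    };\n} {5}\n";
        for (String[] e : scan(input)) {
            System.out.println("Type    : " + e[0]);
            System.out.println("Pattern : " + e[1]);
            System.out.println("Priority: " + e[3]);
            System.out.println("Code    :\n" + e[2]);
            System.out.println();
        }
    }
}
```

Because the quantifiers are lazy, nested braces inside the pattern and code parts are tolerated without recursion: each group simply grows until the rest of the expression fits.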