java regex data-structures dfa

Data Structure to represent a DFA


I was wondering, what would be the best data structure to represent a DFA?

I am looking at converting a regular expression to a DFA and packaging this functionality as a Java library.

The main point is that each entity in the regex carries a set of values rather than a single string value like "car". In my case, each entity would carry many properties, like {car, Honda, 4x4, sedan, ... }. (I am not actually searching for cars; this is just an example.)

Any suggestions?


Solution

  • If I understand your question correctly, you want a matching/filtering library for arbitrary regular languages over an alphabet with dynamic types? Going with your car example, I'd imagine you want to be able to create an expression that matches a List in which every Car either (is red, has between 2 and 6 Passengers, and each Passenger is between 8 and 88 years of age) or (has 1 Passenger).

    Coincidentally, I've been looking for something like that myself (for document validation), and the closest I could get was Jing, a Java RELAX NG library. Unfortunately, the alphabet in Jing consists of XML nodes, so it didn't solve my problem. At the moment I'm attempting to write a library myself that does just this (matching against regular languages over an arbitrary type of alphabet), based on the pattern matching in Jing. If you'd like to help with this, please let me know ;).
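
To make the idea of "a regular language over an arbitrary alphabet" more concrete, here is a minimal sketch of one possible data structure: a DFA whose transitions are guarded by predicates over the symbol type instead of by literal characters. This is not Jing's API and not the library mentioned above; `PredicateDfa`, `Car`, and `CarExample` are hypothetical names made up for illustration. The sketch assumes that the guards leaving any one state are mutually exclusive (otherwise the automaton is not really deterministic and the first matching guard simply wins), and it leaves out the regex-to-DFA construction entirely.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch: a DFA over symbols of type T with predicate-guarded transitions. */
final class PredicateDfa<T> {
    /** One outgoing edge: taken when the guard accepts the current symbol. */
    private static final class Transition<S> {
        final Predicate<? super S> guard;
        final int target;

        Transition(Predicate<? super S> guard, int target) {
            this.guard = guard;
            this.target = target;
        }
    }

    // outgoing.get(s) holds the transitions leaving state s; state 0 is the start state.
    private final List<List<Transition<T>>> outgoing = new ArrayList<>();
    private final Set<Integer> accepting = new HashSet<>();

    /** Adds a state and returns its id; the first state added becomes the start state. */
    int addState(boolean isAccepting) {
        outgoing.add(new ArrayList<>());
        int id = outgoing.size() - 1;
        if (isAccepting) {
            accepting.add(id);
        }
        return id;
    }

    /** Assumption: guards leaving the same state never overlap. */
    void addTransition(int from, Predicate<? super T> guard, int to) {
        outgoing.get(from).add(new Transition<>(guard, to));
    }

    /** Runs the automaton over the input; rejects as soon as no guard matches. */
    boolean matches(Iterable<? extends T> input) {
        if (outgoing.isEmpty()) {
            return false; // no states, so nothing can be accepted
        }
        int state = 0;
        for (T symbol : input) {
            int next = -1;
            for (Transition<T> t : outgoing.get(state)) {
                if (t.guard.test(symbol)) {
                    next = t.target;
                    break;
                }
            }
            if (next < 0) {
                return false; // implicit dead state
            }
            state = next;
        }
        return accepting.contains(state);
    }
}

/** Hypothetical symbol type carrying several properties. */
final class Car {
    final String color;
    final int passengers;

    Car(String color, int passengers) {
        this.color = color;
        this.passengers = passengers;
    }
}

final class CarExample {
    /** Builds a DFA for "one or more red cars" (the analogue of the regex red+). */
    static PredicateDfa<Car> oneOrMoreRedCars() {
        PredicateDfa<Car> dfa = new PredicateDfa<>();
        int start = dfa.addState(false);
        int seenRed = dfa.addState(true);
        Predicate<Car> isRed = car -> "red".equals(car.color);
        dfa.addTransition(start, isRed, seenRed);
        dfa.addTransition(seenRed, isRed, seenRed);
        return dfa;
    }
}
```

With this layout, a symbol can carry any number of properties, and each transition decides for itself which of them it inspects. A table-driven DFA keyed on literal symbols would be faster, but it only works when the alphabet is finite and known up front, which is not the case for objects like `Car`.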