java string json parsing structured-data

Extract structured data from plain text


As input I have plain text (in my case it will typically be HTML) and a "grammar specification" (some way of describing how to extract structured data from the plain text). As output I need structured data (JSON is fine, but maybe something better exists?).

Are there any libraries for this task? What are good ways to write the "grammar spec"? What are the best approaches to solving this kind of problem?


Solution

  • Some tools for grammar-based transformations:

    Addition:
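
To make the question's setup concrete (text in, "grammar spec" in, structured data out), here is a minimal sketch using only the Java standard library. It is not one of the tools referred to above. The class name `RegexExtractor`, the spec format (field name mapped to a regex with one capturing group), and the sample HTML are all assumptions made up for illustration. Regular expressions are fragile on real-world HTML, so treat this as a picture of the data flow rather than a recommended parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; not part of any library.
public class RegexExtractor {

    // Assumed "grammar spec": field name -> regex with exactly one capturing group.
    // For each rule, the first match in the text (if any) becomes the field value.
    static Map<String, String> extract(String text, Map<String, Pattern> spec) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> rule : spec.entrySet()) {
            Matcher m = rule.getValue().matcher(text);
            if (m.find()) {
                result.put(rule.getKey(), m.group(1).trim());
            }
        }
        return result;
    }

    // Minimal JSON serialization of a flat string map.
    // Only backslashes and double quotes are escaped; control characters are not handled.
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append("}").toString();
    }

    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        // Made-up sample input.
        String html = "<html><head><title>Example page</title></head>"
                + "<body><span class=\"price\">42.50</span></body></html>";

        Map<String, Pattern> spec = new LinkedHashMap<>();
        spec.put("title", Pattern.compile("<title>(.*?)</title>", Pattern.DOTALL));
        spec.put("price", Pattern.compile("<span class=\"price\">(.*?)</span>", Pattern.DOTALL));

        // prints {"title":"Example page","price":"42.50"}
        System.out.println(toJson(extract(html, spec)));
    }
}
```

Keeping the spec as data (a map) rather than code is the point of the sketch: the same `extract` method can be reused with a different spec for a different page layout.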