
Searching for text in XML files


I have many saved Kettle transformation files (stored on my desktop). I need to create a search page where the user types in any text, and the result lists every saved transformation file in which that string appears.

Note that the transformation files are in XML format, and the search string can be anything an XML file can contain. I have no idea how to build this search. Please help me out.

[I have tried using XPath to read the XML files (with the Pentaho Data Integration tool), but I am missing how to connect that to the search.]


Solution

  • If I understood correctly what you want, try this:

    1. In the properties of your transformation, define a parameter "String4Search".
    2. Add a "Get Data from XML" step. To pick up all the XML files, set "File/Directory" to your desktop path and the wildcard to ".*\.xml" (the wildcard is a regular expression; use ".*\.ktr" instead if the files still carry Kettle's default .ktr extension). Then define a field (named, say, "found_string") that holds the matching text if the search string is present in the file (XPath: //*[contains(text(),"${String4Search}")]). Also tick the "Include filename in output" checkbox and define the name of the field that will contain the name of the input file.
    3. Connect a "Filter rows" step after the "Get Data from XML" step and use it to discard every record where the field "found_string" is empty, so that only the matching rows remain.
    4. Finally, add steps to select the filename field and to remove duplicates; what is left is your result.

    Every time you start the transformation, you are asked for the value of the "String4Search" parameter, so just set it to the string you want to find in the files.
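
    To check what the transformation should return, the same logic can be sketched outside Kettle. The Python below is only an illustration, not part of the Pentaho setup: the function names, the "*.xml" glob pattern and the directory argument are assumptions. Like the XPath above, it looks only at each element's leading text, not at attribute values.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def element_text_contains(root, needle):
    """Return True if the leading text of any element contains needle."""
    for elem in root.iter():
        if elem.text and needle in elem.text:
            return True
    return False


def find_files_containing(directory, needle, pattern="*.xml"):
    """Return the sorted names of XML files in directory whose text contains needle."""
    matches = []
    for path in sorted(Path(directory).glob(pattern)):
        try:
            root = ET.parse(str(path)).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        if element_text_contains(root, needle):
            matches.append(path.name)
    return matches
```

    Calling find_files_containing(some_directory, "Table input") would then return the list of matching file names, one entry per file, which corresponds to the de-duplicated filename field produced in step 4.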