Tags: jmeter, jmeter-plugins, jmeter-5.0, jsr223

Extract text from a PDF file using a JSR223 PreProcessor


How can I extract the text content of a PDF file using a JSR223 PreProcessor in JMeter?


Solution

  • You will need a library like PDFBox for this

    1. Add it and all its dependencies to JMeter Classpath

    2. Restart JMeter to pick the .jars up

    3. The simplest code to read text from PDF would be something like:

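      // PDFBox 2.x API; in PDFBox 3.x use org.apache.pdfbox.Loader.loadPDF(file) instead
      def doc = org.apache.pdfbox.pdmodel.PDDocument.load(new File('path-to-the-file.pdf'))
      def text
      try {
          text = new org.apache.pdfbox.text.PDFTextStripper().getText(doc)
      } finally {
          doc.close() // release the file handle even if extraction fails
      }

      // now do what you need with the text, e.g. save it into the ${text} JMeter variable
      vars.put('text', text)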
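
    For steps 1 and 2 above, the usual approach is to copy the PDFBox jar and its dependencies (for PDFBox 2.x, typically fontbox and commons-logging) into JMeter's lib folder. As an alternative sketch, JMeter's user.classpath property can point at a separate directory of jars. The directory name /opt/pdfbox-libs below is a hypothetical placeholder, not something from the original answer:

    ```properties
    # user.properties (in JMeter's bin directory)
    # /opt/pdfbox-libs is a hypothetical directory holding the PDFBox jar and its dependencies.
    # Every jar directly inside it is added to the classpath; subdirectories are ignored.
    user.classpath=/opt/pdfbox-libs
    ```

    Either way, JMeter only reads the classpath at startup, so a restart is still required before the script can see the PDFBox classes.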
      

    More information: