Tags: regex, string, str-replace, pdf-reader

Regular Expression to parse string using pdf-reader


I need to parse this line in Ruby on Rails:

TD(AQ-163W-1B2V) Tj0.00 -13.52 TD(AQ-180W-7BV) Tj0.00 -13.48 TD(AW-48HE-1AV) Tj0.00 -13.52 TD(AW-48HE-8AV) Tj0.00 -13.48 TD(AW-49H-7EV) Tj0.00 -13.52 TD(AW-80D-1AV) Tj0.00 -13.48 TD(AW-80D-2AV)

I need a regular expression that captures only the data inside each TD(...), without the parentheses. In this case the result should be:

["AQ-163W-1B2V", "AQ-180W-7BV", "AW-48HE-1AV", etc.]

Any idea? Thanks!


Solution

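  • You can use String#scan. With a capture group it returns an array of one-element arrays; with lookarounds instead of a group it returns a flat array of strings: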

    > s = "TD(AQ-163W-1B2V) Tj0.00 -13.52 TD(AQ-180W-7BV) Tj0.00 -13.48 TD(AW-48HE-1AV)"
    > s.scan(/\(([^()]+)\)/)
    => [["AQ-163W-1B2V"], ["AQ-180W-7BV"], ["AW-48HE-1AV"]]
    > s.scan(/(?<=\()[^()]+(?=\))/)
    => ["AQ-163W-1B2V", "AQ-180W-7BV", "AW-48HE-1AV"]
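Both patterns above match any parenthesised text, not only text that follows TD. If the line might contain other parenthesised operands, a slightly stricter variant can anchor on the TD prefix and flatten the result. This is only a sketch: the names line, TD_OPERAND and codes are made up for illustration, and it assumes the operands never contain nested or escaped parentheses.

```ruby
# Sample text taken from the question (shortened to three entries).
line = "TD(AQ-163W-1B2V) Tj0.00 -13.52 TD(AQ-180W-7BV) Tj0.00 -13.48 TD(AW-48HE-1AV)"

# Match only parenthesised text that directly follows "TD".
# Assumption: no nested or escaped parentheses inside the operand.
TD_OPERAND = /TD\(([^()]+)\)/

# With one capture group, scan returns one-element arrays,
# so flatten turns the result into a plain array of strings.
codes = line.scan(TD_OPERAND).flatten

p codes
# => ["AQ-163W-1B2V", "AQ-180W-7BV", "AW-48HE-1AV"]
```

Using flatten keeps the simpler capture-group pattern while still giving the flat array asked for in the question; the lookaround version gets the same result without the extra call.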