Alphanumeric range query


Is there an efficient way to handle alphanumeric ranges in Lucene? Example ranges:

  • 1 to 1 (includes 1A, 1B, ..., 1Z)
  • 10A12 to 10A22 (includes 10A12, 10A13, ..., 10A22)
  • 1 to 10 (includes 1A, 1B, ..., 2A, 2B, ..., 9Z, 10) [does not include 10A]

I have two approaches:

  1. Expand each range and index all possible values. I expect the number of unique values won't be huge.
  2. Index the low and high values, then use a range query. I'm not sure how effective a range query is on alphanumeric ranges.

Need expert advice on this, please.


Solution

  • I hope you agree that the rules you describe are very custom and not really suited to a generic framework such as Lucene. For example, why would the range [1..1] include 1A through 1Z when [1..10] excludes 10A?

    I don't know whether it is possible with your data set, but if you can come up with rules that convert each element (including elements containing letters) into a unique number using a formula of your choosing, you could apply the same formula both when indexing and when querying. As long as the formula preserves the ordering you want, this would even allow range matching.
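
As a rough illustration of such a formula, here is a minimal sketch in plain Java. It is only one possible encoding, and it rests on assumptions that are not stated in the question: every value is a run of up to 9 digits, optionally followed by a single letter A-Z and up to 3 more digits (e.g. "1", "1A", "10A12"). The class name `AlphaNumKey` and the methods `encode` and `inRange` are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlphaNumKey {
    // Assumed shape: digits, then optionally one letter A-Z followed by up to 3 digits.
    private static final Pattern SHAPE =
            Pattern.compile("(\\d{1,9})(?:([A-Z])(\\d{0,3}))?");
    private static final long LETTER_SLOTS = 27;   // 0 = no letter, 1..26 = A..Z
    private static final long SUFFIX_SLOTS = 1001; // 0 = no suffix, 1..1000 = suffix 0..999

    /** Maps a value to a number whose numeric order matches the intended value order. */
    static long encode(String value) {
        Matcher m = SHAPE.matcher(value);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported value: " + value);
        }
        long prefix = Long.parseLong(m.group(1));
        String letterPart = m.group(2); // null when the value has no letter
        String suffixPart = m.group(3); // null or empty when there is no suffix
        long letter = (letterPart == null) ? 0 : letterPart.charAt(0) - 'A' + 1;
        long suffix = (suffixPart == null || suffixPart.isEmpty())
                ? 0
                : Long.parseLong(suffixPart) + 1;
        return (prefix * LETTER_SLOTS + letter) * SUFFIX_SLOTS + suffix;
    }

    /** Inclusive range check on the encoded keys. */
    static boolean inRange(String value, String low, String high) {
        long key = encode(value);
        return encode(low) <= key && key <= encode(high);
    }
}
```

With this encoding, "1 to 10" covers 1A through 9Z and 10 but not 10A, and "10A12 to 10A22" covers exactly the values in between. The "1 to 1" rule from the question does not fall out automatically: the caller would have to widen the upper bound (for example to the key of "1Z999") before querying, which is exactly the kind of custom rule that has to live in your own code. In recent Lucene versions the resulting `long` could presumably be indexed as a `LongPoint` field and searched with `LongPoint.newRangeQuery`, but that wiring is not shown here.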