Search code examples
javaregex

Match text, also match text when enclosed with listed neighbours


I am working on multi-replacement tool and need to replace name of OBJECT_TYPE to different variants.

I have several instances of such name and need to replace it only when relevant, so the rule is to have it solo, or with _POST or _TEMP afterwards. Can have an extension of .ddl or .sql. Can have a prefix Stage_ or CORE_ as well.

I have come up with this regex, but I is not capturing all I need.

\b(?!Stage_|CORE_)(OBJECT_TYPE)(?=_POST|_TEMP|.ddl|.sql|\b)

Here is the regex demo.

The first 6 lines should be matched and should match the OBJECT_TYPE only, the rest should not be captured.

OBJECT_TYPE.sql
OBJECT_TYPE_POST.sql
Stage_OBJECT_TYPE.sql
AUX_DB.OBJECT_TYPE.ddl
CORE_DB.CORE_OBJECT_TYPE_TEMP.ddl
CORE_DB.CORE_OBJECT_TYPE_POST.ddl
----
OBJECT_TYPE_ID
CR_OBJECT_TYPE_POST.sql
REST_OBJECT_TYPE_POST.sql
FLOW_OBJECT_TYPE_POST.sql
Stage_CR_OBJECT_TYPE.sql
Stage_REST_OBJECT_TYPE.sql
Stage_FLOW_OBJECT_TYPE.sql
AUX_DB.CR_OBJECT_TYPE.ddl
AUX_DB.REST_OBJECT_TYPE.ddl
AUX_DB.FLOW_OBJECT_TYPE.ddl
CORE_DB.CORE_CR_OBJECT_TYPE_TEMP.ddl
CORE_DB.CORE_REST_OBJECT_TYPE_TEMP.ddl
CORE_DB.CORE_FLOW_OBJECT_TYPE_TEMP.ddl
CORE_DB.CORE_CR_OBJECT_TYPE_POST.ddl
CORE_DB.CORE_REST_OBJECT_TYPE_POST.ddl
CORE_DB.CORE_FLOW_OBJECT_TYPE_POST.ddl

Solution

  • I suggest using

    (?<=Stage_|CORE_|\b)OBJECT_TYPE(?=(?:_POST|_TEMP)?\.(?:dd|sq)l\b)
    

    See the regex demo.

    Details:

    • (?<=Stage_|CORE_|\b) - immediately to the left, there should be either Stage_, CORE_ or a word boundary
    • OBJECT_TYPE - a string that you need to find
    • (?=(?:_POST|_TEMP)?\.(?:dd|sq)l\b) - immediately to the right of the current location, there must be
      • (?:_POST|_TEMP)? - an optional occurrence of _POST or _TEMP
      • \. - a dot
      • (?:dd|sq)l - dd or sq followed with l
      • \b - a word boundary.