How to withdraw the substring that follow a pattern in PostgreSQL

I need to find a pattern which will give me a substring which follows logic to substring from the right side, after the third _.

What I want to achieve is to retrieve everything up until 3rd _ .

I tried this SQL, but it did not work

select 
    SUBSTRING('22D_XYZ_xy_100_xyz_123', 1, position('_' in REVERSE('22D_XYZ_xy_100_xyz_123')) - 1) as result;

It returns the 22D whereas I expect 22D_XYZ_xy

Solution

The phrase, "substring which follows logic to substring from the right side, after the third _," is difficult to understand. I infer that the intended behavior is to extract the subtring that includes all text before the third _ from the right and not all text before the third _ from the left. When searching from the left, there is also ambiguity as to expected behavior if there are less than three _ characters: should the entire string be returned or should the result be NULL? Unfortunately, the example input, 22D_XYZ_xy_100_xyz_123, and expected output, 22D_XYZ_xy, are consistent with all of these cases. The following query demonstrates all three interpretations:

SELECT
  string,
  regexp_substr(string, '^.*(?=(_[^_]*){3}$)') as from_right,
  regexp_substr(string, '^([^_]*_){0,2}[^_]*') as from_left,
  regexp_substr(string, '^([^_]*_){2}[^_]*(?=_)') as from_left_requires_3
FROM (
  VALUES ('22D_XYZ_xy_100_xyz_123'),
         ('1_2_3_4_5_6_7'),
         ('1_2_3_4_5_6'),
         ('1_2_3_4_5_'),
         ('_1_2_3_4_5'),
         ('1_2_3_4_5'),
         ('1_2_3_4'),
         ('1_2_3'),
         ('1_2_'),
         ('1_2'),
         ('1'),
         ('____'),
         ('___'),
         ('__'),
         ('_'),
         (''),
         (NULL)) t(string);

Executing the query gives the following:

string	from_right	from_left	from_left_requires_3
'22D_XYZ_xy_100_xyz_123'	'22D_XYZ_xy'	'22D_XYZ_xy'	'22D_XYZ_xy'
'1_2_3_4_5_6_7'	'1_2_3_4'	'1_2_3'	'1_2_3'
'1_2_3_4_5_6'	'1_2_3'	'1_2_3'	'1_2_3'
'1_2_3_4_5_'	'1_2_3'	'1_2_3'	'1_2_3'
'_1_2_3_4_5'	'_1_2'	'_1_2'	'_1_2'
'1_2_3_4_5'	'1_2'	'1_2_3'	'1_2_3'
'1_2_3_4'	'1'	'1_2_3'	'1_2_3'
'1_2_3'	NULL	'1_2_3'	NULL
'1_2_'	NULL	'1_2_'	NULL
'1_2'	NULL	'1_2'	NULL
'1'	NULL	'1'	NULL
'____'	'_'	'__'	'__'
'___'	''	'__'	'__'
'__'	NULL	'__'	NULL
'_'	NULL	'_'	NULL
''	NULL	''	NULL
NULL	NULL	NULL	NULL

To search from the right, a positive lookahead zero-length assertion, (?=(_[^_]*){3}$), is used to check that the unmatched portion of the string begins with _ and includes exactly three _ characters. If the assertion is false, then NULL is returned.

To find the left substring with no more than two _ characters, the regular expression matches up to two substings of zero or more non-_ characters followed by a single _ along with any non-_ characters prior to the next _ or the end of the string.

Finding the left substring preceding a required third _ is similar to finding the left substring with no more than two _ characters, except that the occurrence specifier for the group is {2} instead of {0,2} and a positive lookahead zero-lengh assertion, (?=_), ensures that _ is the first unmatched character. As with the search from the right, if the assertion is false, then NULL is returned.