I have observed what seems to me an odd behavior of Postgres' to_tsvector function.
SELECT to_tsvector('english', 'abc-xyz');
returns
'abc':2 'abc-xyz':1 'xyz':3
However,
SELECT to_tsvector('english', 'abc-001');
returns
'-001':2 'abc':1
Why not something like this?
'abc':2 'abc-001':1 '001':3
And what should I do to be able to search by the numeric portion alone, without the hyphen?
This can be circumvented with PG13's dict-int addon's absval
option. See the official documentation.
But in case you're stuck with an earlier PG version, here's the generalized version of a "number or negative number" workaround in a query.
select regexp_replace($$'test' & '1':* & '2'$$::tsquery::text,
'''([.\d]+''(:\*)?)', '(''\1 | ''-\1)', 'g')::tsquery;
This results in:
'test' & ( '1':* | '-1':* ) & ( '2' | '-2' )
It replaces lexemes that look like positive numbers with "number or negative number" kind of subqueries.
The double cast ::tsquery::text
is just there to show how you would pass a tsquery casted to text.
Note that it handles prefix matching numeric lexemes as well.