I try to select only lexemes from unnested ts_vector column:
select lexeme
from
(select unnest(to_tsvector('russian', description))
from cards) as roots;
But it doesn't work, because SQL doesn't know anything about lexeme
column. How can I select only lexemes from unnested ts_vectors?
What you found yourself:
SELECT (unnest(to_tsvector(description))).lexeme
FROM cards;
The equivalent standard SQL form with the set-returning function in the FROM
list is slightly more verbose, but easier to integrate in bigger queries:
SELECT d.lexeme
FROM cards c
LEFT JOIN LATERAL unnest(to_tsvector(c.description))) d ON true;
Related:
Since Postgres 9.6 there is a second "overloaded" variant of unnest()
. Quoting the release notes:
Add new functions for
tsvector
data (Stas Kelvich)The new functions are
ts_delete()
,ts_filter()
,unnest()
,tsvector_to_array()
,array_to_tsvector()
, and a variant ofsetweight()
that sets the weight only for specified lexeme(s).
Bold emphasis mine.
See:
SELECT proname, proargtypes::regtype[], prorettype::regtype
FROM pg_proc
where proname = 'unnest';
proname | proargtypes | prorettype
--------+------------------+-----------
unnest | [0:0]={anyarray} | anyelement
unnest | [0:0]={tsvector} | record
(2 rows)
db<>fiddle here
The function is documented in the manual among text search functions:
unnest
(tsvector
) →setof record
(lexeme
text
,positions
smallint[]
,weights
text
)
It returns setof record
with named output columns. Hence we can refer to the column lexeme
directly like we did.