Search code examples
sqlpostgresqlsupabase

SQL Query for Related Posts by Tags in Common


Model

tags
- name
- pid (short for post_id)
(fk = name + pid)

posts
- id
- title
- content
(all posts have minimum of 3 tags)
...

Function

create function get_related_posts(post_id uuid)
returns setof posts
language plpgsql
as $$
begin
    return query
    select 
    ... (what code goes here?)  
    posts.id = $1;
end;$$;

Looking to create a function get_related_posts, which lets me get the related posts by tag.

And then I could limit those results on the front end using Supabase.

So maybe something like this, but in SQL?

pseudocode

results = [];
Select all posts that have all 5 tags
  if any exist, add them to top of results array
loop through all possible sets of 4 tags
  select all posts that have these 4 tags
  if exists, add them to results array
loop through all possible sets of 3 tags
  select all posts that have these 3 tags
  if exists, add them to results array
loop through all possible results sets of 2 tags
  select all posts that have these 2 tags
  if exists, add them to results array
loop through all possible individual tags
  select all posts that have this one tag
  if exists, add them to the results array
return results

Of course, if I used paging and limit, no idea either how this would work.

Can't wrap my head around this...

J

EDIT - The results I am looking for are just a set of posts records...

ID Title Content
1 post1 title ...
... ... ...

J


Solution

  • Solve problems like this step by step

    The following gets all tags for a post ID

     select name 
     from tags
     where pid = $1
    

    The following gets the tags of all posts (there will be a row per post that has a tag.

     select posts.id, tags.name 
     from posts
     join tags on posts.id = tags.pid
    

    we can then join it to the list of tags in the original to get all items that have a matching tag

     select posts.id, tags.name 
     from posts
     join tags on posts.id = tags.pid
     join (
       select name 
       from tags
       where pid = $1
     ) as match on match.name = tags.name
    

    we want these posts ordered by the ones with the most tags -- if we group by we can get a count of matching tags and order them.

     select posts.id, count(*) as match_count 
     from posts
     join tags on posts.id = tags.pid
     join (
       select name 
       from tags
       where pid = $1
     ) as match on match.name = tags.name
     group by post.id
     order by match_count desc
    

    A quick note on thinking like sql -- sql works in sets -- this means you don't use if statements (like your code) but instead filters or joins. As you can see each step I performed was over an entire set of data -- not looking at row by row.