I am new to PostgreSQL and am using PGADMIN 4 on a Mac. I have one column of imported data that has some usernames, sometimes a last name and mostly a first and last name in the same column.
I care more to be able to query and count the most occurrences of a name in the column. I will be able to determine by results if it is a first or last for my need. Listing the first 50 should do it. Please assist with the specific code including addressing the table and column.
Have played with this, but need more:
select surname, count(*) from atreedata
group by surname
order by count(*) desc limit 40;
Works great with only one name! I need the most common names listed by name and count.
Common Column Example:
John Smith
jsmith3
Stacey123
Bob Smith
Jones
So, if I understand it correctly, you just need to find the most numerous words in surname column.
There's a built-in function regexp_split_to_table that can split strings to words and creates rows from those words. So:
select surname_word, count(*) as surname_word_count
from (
select regexp_split_to_table(surname, E'\\s+') as surname_word
from atreedata
) as surname_words
group by surname_word
order by surname_word_count desc
limit 40;