I have a table with a skills column having comma separated values. The skills can have duplicates per student. I need to use group by to list unique skills per student.
What should I be using so that I get list with no duplicates. Please help.
Input Table student_skills
:
STUDENT_ID | SEMESTER | SKILLS |
---|---|---|
101 | 1 | C, SQL |
101 | 2 | C, CPP |
102 | 1 | CPP, Java |
102 | 2 | Java, JavaScript |
Desired result:
STUDENT_ID | SKILLS |
---|---|
101 | C, SQL, CPP |
102 | CPP, Java, JavaScript |
SQL Queries to create table, insert data:
create table student_skills
(STUDENT_ID number(10),
SEMESTER varchar2(5),
SKILLS varchar2(50));
insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (101, '1', 'C, SQL');
insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (101, '2', 'C, CPP');
insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (102, '1', 'CPP, Java');
insert into student_skills (STUDENT_ID, SEMESTER, SKILLS)
values (102, '2', 'Java, JavaScript');
commit;
One option is to split skills into rows (the temp
CTE), and then aggregate them back (line #11):
SQL> with temp as
2 (select distinct
3 student_id,
4 trim(regexp_substr(skills, '[^,]+', 1, column_value)) skill
5 from student_skills cross join
6 table(cast(multiset(select level from dual
7 connect by level <= regexp_count(skills, ',') + 1
8 ) as sys.odcinumberlist))
9 )
10 select student_id,
11 listagg(skill, ', ') within group (order by skill) skills
12 from temp
13 group by student_id;
STUDENT_ID SKILLS
---------- ------------------------------
101 C, CPP, SQL
102 CPP, Java, JavaScript
SQL>