Search code examples
mysqlsqlsplit

SQL query that searches comma-delimited field


I have a student table which looks something like this:

 id  |  name  |  school_descriptors
-------------------------------------------------------
 1   |  Rob   |  Comp Sci,Undergraduate,2020  
 2   |  Tim   |  Business,MBA,2022
 3   |  Matt  |  Business,MBA,2022
 4   |  Jack  |  Law,Masters,2024
 5   |  Steph |  Comp Sci,Masters,2022  

The school_descriptors field is just one column, and stores information about the Course, Qualification and Graduation year as a comma-delimited string. (it's terribly designed and I wish it could be split up into its own fields, but it can't right now (I am not the database owner))

I want to provide an interface where teachers can quickly find students that match certain Course, Qualifications and Graduation years, and thus would like to create relevant queries.

Question 1: For example, I would like a teacher to be able to select from the UI: "Business", "MBA" and get returned students with ID 2 and 3. Specifically, an example question I have is: Find students who are in the Business Course and doing the MBA qualification:

SELECT * FROM student_table WHERE school_descriptors LIKE '%Business%' AND school_descriptors LIKE '%MBA%'

The query I have in mind is a basic LIKE query, but I can't help but think there is a more efficient query that can take advantage of the fact that the school_descriptor string is 1) always in a specific order (e.g. course, qualification, graduation), and 2) comma-delimited, and thus could be perhaps split. The table currently sits at ~5000 rows so relatively small but is expected to grow.

Related question 2: Find students who are in the Comp Sci Course and graduating after 2019:

Would it be possible to split the school_descriptors field and add a >2019 operand?

Many thanks!


Solution

  • In MySql you can use the function SUBSTRING_INDEX() to split the column school_descriptors.
    This will work only if the positions of Course, Qualification and Graduation year are fixed.

    select *,
      substring_index(school_descriptors, ',', 1) Course, 
      substring_index(substring_index(school_descriptors, ',', 2), ',', -1) Qualification, 
      substring_index(school_descriptors, ',', -1) Graduation
    from student_table 
    

    See the demo.
    Results:

    > id | name  | school_descriptors          | Course   | Qualification | Graduation
    > -: | :---- | :-------------------------- | :------- | :------------ | :---------
    >  1 | Rob   | Comp Sci,Undergraduate,2020 | Comp Sci | Undergraduate | 2020      
    >  2 | Tim   | Business,MBA,2022           | Business | MBA           | 2022      
    >  3 | Matt  | Business,MBA,2022           | Business | MBA           | 2022      
    >  4 | Jack  | Law,Masters,2024            | Law      | Masters       | 2024      
    >  5 | Steph | Comp Sci,Masters,2022       | Comp Sci | Masters       | 2022