Search code examples
mysqlgroup-bygroup-concat

MySQL - GROUP_CONCAT if value is not a substring


I have a column called "Permissions" in my table. The permissions are strings which can be: "r","w","x","rw","wx","rwx","xwr" etc. Please note the order of characters in the string is not fixed. I want to GROUP_CONCAT() on the "Permissions" column of my table. However this causes very large strings.

Example: "r","wr","wx" group concatenated is "r,wr,wx" but should be "r,w,x" or "rwx". Using distinct() clause doesn't seem to help much. I am thinking that if I could check if a permission value is a substring of the other column then I should not concatenate it, but I don't seem to find a way to accomplish that.

Any column based approach using solely string functions would also be appreicated.

EDIT: Here is some sample data:

+---------+
| perm    |
+---------+
| r,x,x,r |
| x       |
| w,rw    |
| rw      |
| rw      |
| x       |
| w       |
| x,x,r   |
| r,x     |
+---------+

The concatenated result should be:

 +---------+
 | perm    |
 +---------+
 | r,w,x   |
 +---------+

I don't have control over the source of data and would like not to create new tables ( because of restricted privileges and memory constraints). I am looking for a post-processing step that converts each column value to the desired format.


Solution

  • A good idea would be to first normalize your data.

    You could, for example try this way (I assume your source table is named Files):

    1. Create simple table called PermissionCodes with only column named Code (type of string).

    2. Put r, w, and x as values into PermissionCodes (three rows total).

    3. In a subquery join Files to PermissionCodes on a condition that Code exists as a substring in Permissions.

    4. Perform your GROUP_CONCAT aggregation on the result of the subquery.

    If it is a case here, that for the same logical entires in Files there exists multiple permission sets that overlaps (i.e. for some file there is a row with rw and another row with w) then you would limit your subquery to distinct combinations of Files' keys and Code.

    Here's a fiddle to demonstrate the idea:

    http://sqlfiddle.com/#!9/6685d6/4