Search code examples
mysqlindexingsubstringfind-in-set

mysql: finding the shortest and the longest value within an mysql-column


I had a question about finding the shortest und the longest value within an concated string in MySQL-Column "values" . The Problem, the values within the column are concated with "|" and may be diffrent long.

Table:

ID  |  values 
----------------------------------------
A   |  12.1|11.23|134.44
B   |  11.134|1.3|34.5|152.12|1.31313|134.331|12.31
C   |  34.11|1.34|343412.2|13......

The question is: Is there some simple possiblity to find both (shortest and the longest) value only by native mysql query, without using any language as Java or PHP.

Thank you


Solution

  • You can't get the result you need in a single query, at least not in the current version of MySQL. The reason is that you can't form a query to fetch individual elements from a delimited string of unknown length before you know the maximum length.

    First find out how many elements are in the longest list:

    select max(length(`values`)-length(replace(`values`, '|', ''))) as max from t;
    +------+
    | max  |
    +------+
    |    6 |
    +------+
    

    Now you know you'll need to test up to 7 "fields" within your delimited string. There's no way to form SQL with a variable number of unioned queries. The syntax must be fixed at prepare time, so you need to know how many.

    select id, substring_index(substring_index(`values`, '|', 1), '|', -1) as element from t
    union
    select id, substring_index(substring_index(`values`, '|', 2), '|', -1) from t
    union
    select id, substring_index(substring_index(`values`, '|', 3), '|', -1) from t
    union
    select id, substring_index(substring_index(`values`, '|', 4), '|', -1) from t
    union
    select id, substring_index(substring_index(`values`, '|', 5), '|', -1) from t
    union
    select id, substring_index(substring_index(`values`, '|', 6), '|', -1) from t
    union
    select id, substring_index(substring_index(`values`, '|', 7), '|', -1) from t;    
    
    +------+----------+
    | id   | element  |
    +------+----------+
    | A    | 12.1     |
    | A    | 11.23    |
    | A    | 134.44   |
    | B    | 11.134   |
    | B    | 1.3      |
    | B    | 34.5     |
    | B    | 152.12   |
    | B    | 1.31313  |
    | B    | 134.331  |
    | B    | 12.31    |
    | C    | 34.11    |
    | C    | 1.34     |
    | C    | 343412.2 |
    | C    | 13       |
    +------+----------+
    

    Now use the query above as a subquery, you can find the longest or shortest:

    (select id, element from (...subquery...) as t1 order by length(element) asc limit 1)
    union
    (select id, element from (...subquery...) as t2 order by length(element) desc limit 1)
    
    +------+----------+
    | id   | element  |
    +------+----------+
    | C    | 343412.2 |
    | C    | 13       |
    +------+----------+
    

    I agree with others who have commented that this is really the wrong way to use an RDBMS. I understand you said that you're committed to this structure, but you'll find in the long run, it makes more work for you than the work it would take to fix the schema.

    See also my answer to Is storing a delimited list in a database column really that bad?