Remove everything after a plus in MySQL


I have a query in MySQL

select various(functions(here(...))) foo, count(*) ct
from table
group by foo
having ct > 1;

which looks for duplicate data. I would like to change the query to remove plus signs and anything following them from foo, so that if various(functions(here(...))) yields foo+bar I get just foo. (If a plus sign does not occur, it stays unchanged.)

What's the best way to do this? I can use if, locate, and left:

select if(locate("+", various(functions(here(...))))>0, left(various(functions(here(...))), locate("+", various(functions(here(...)))) - 1), various(functions(here(...)))) foo, count(*) ct
from table
where conditions
group by foo
having ct > 1;

but this seems like 'obviously' the wrong thing. Regex would be nice, but as far as I know MySQL has no regex-based replacement. A subquery makes this slightly less unwieldy:

select if(locate("+", bar)>0, left(bar, locate("+", bar)-1), bar) foo, count(*) ct
from table
left join (
  select pkey, various(functions(here(...))) bar
  from table
  where conditions
) subtable using(pkey)
group by foo
having ct > 1
;

but as the table is large I'd like to know if there's a more efficient or more maintainable solution.


Solution

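  • Use substring_index(). With a count of 1 it returns everything before the first occurrence of the delimiter, and it returns the whole string unchanged when the delimiter does not occur: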

    select substring_index(various(functions(here(...))), '+', 1) as foo, count(*) ct
    from table
    group by foo
    having ct > 1;
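
To check the behaviour before running the full query, a minimal sketch like the following can be run on its own. The literal inputs and the column aliases (with_plus, without_plus, several_plus, leading_plus) are made up for illustration and are not part of the original table or query.

```sql
-- Standalone checks of substring_index(str, delim, count) with count = 1.
-- The literal inputs below are illustrative assumptions, not data from the question.
select
  substring_index('foo+bar', '+', 1)     as with_plus,     -- 'foo'
  substring_index('foo', '+', 1)         as without_plus,  -- 'foo' (no plus, unchanged)
  substring_index('foo+bar+baz', '+', 1) as several_plus,  -- 'foo' (cut at the first plus)
  substring_index('+bar', '+', 1)        as leading_plus;  -- ''   (empty string)
```

Because the long expression now appears only once, the query stays as short as the original. It is still evaluated for every row, so this mainly helps maintainability rather than changing how much work the grouping does.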