I have a MySQL (MySQL v8) table where one column has a long string of information, but when I select it, I need to retrieve only a specific part of the long string.
I need as part of the response the value that comes immediately before the " - " part, and after the <Br>
part (unless the beginning of the string, in which case there is no <Br>
)
create table codes (
id integer primary key
desc text not null
);
insert into codes values (1, '423.9 - This is the first item<Br>511.9 - This is the second item<Br>486 - This is the third item (486.0 actually)<Br>42.a.7.31 - <Br>785.0 - 785.? unsure if decimal is correct')
insert into codes values (2, '428.0M - CHF NOS')
insert into codes values (3, 'E.423.9 - Just testing!<Br>511.9 - Another test 12345')
I need the part of the description immediately before the ' - ' part, for example after "select ... from codes" I would have this:
423.9, 511.9, 486, 42.a.7.31, 785.0
428.0M
E.423.9, 511.9
Ideally you should do this cleanup operation in some scripting language (e.g. Python, Perl) outside of MySQL, before you import your data into the database. That being said, on MySQL 8+, we could try using REGEXP_REPLACE()
for a regex option to extract out the terms wanted:
SELECT id, `desc`,
REGEXP_REPLACE(REGEXP_REPLACE(`desc`, '([^ ]+).*?(<Br>|$)', '$1, '), ', $', '') AS output
FROM codes;
The regex pattern used here matches:
([^ ]+)
match AND capture in $1
the first non whitespace term.*?
match until reaching the nearest(<Br>|$)
<Br>
tag OR end of the stringThen, we replace with just $1
followed by commma and space, to build out the desired CSV string. Note that the outer call to REGEXP_REPLACE
is to strip off the final dangling comma.