Given a table 'updates' that contains records comprising two fields, both containing text only:
existing_value
and replacement_value
This table is used to update a 2nd table 'main' with a single field:
name
whereby the name
field is to be updated with a replacement_value
, replacing all instances of existing_value
with replacement_value
. However, what's needed is not a simple update of the 'main' table replacing on the basis of main.name = updates.repleacement_value
where main.name = updates.existing_value
.
main.name
, updates.existing_value
and updates.replacement_value
all contain strings that take the form of a simple value such as Joe
or they can (and mostly do) contain strings that include delimiters to separate multiple values held in a single field e.g. Joe Soap\\Joe Bloggs\\Joe
(yes I know field stuffing is a terrible idea, but the source data is what it is).
For the purposes of clarification let's assume an example where the following applies:
updates.existing_value = 'Joe'
updates.replacement = 'Jose'
Main has two records:
main.name = 'Joe Soap\\Joe Bloggs\\Joe'
main.name = 'Joe\\Joe Soap\\Joe\\Joe Little'
The problem I need to solve is to replace only the standalone 'Joe'
in each record to 'Jose'
without inadvertently affecting 'Joe Soap', 'Joe Bloggs' or 'Joe Little'.
One way to solve the problem is to load each record in 'main' into Python, turn each record into a list, process the list and recompile the string before writing it back to 'main' (a dataframe would be most efficient), but before going this route, I was wondering whether there's a way to achieve the same directly in SQLite?
I've experimented with LIKE
, INSTR()
and REPLACE()
, but there doesn't seem to be a way to curtail REPLACE()
's scope in that it'll replace all instances of 'Joe' with 'Jose'.
Is there some other SQL
method I've not considered or is the problem best tackled outside or SQLite
?
You can delimit the search term:
update main
set name =
trim(
replace(
replace(
replace(
'\'||name||'\',
'\',
'\\'
),
'\'||map.old||'\',
'\'||map.new||'\'
),
'\\',
'\'
),
'\'
)
from (
select distinct
main.rowid as id,
updates.existing_value as old,
updates.replacement as new
from main inner join updates
where instr(
'\'||main.name||'\',
'\'||updates.existing_value||'\'
)
) as map
where main.rowid = map.id;
\Joe\
Joe\Joe\Joe
)existing_value
Note: If multiple updates apply to the same row, the result is nondeterministic.