I would like to explain my task. I have been asked to modify many files with one command without alter the data. The core is to properly insert DATE_B values, since it would give us an error trying to insert this into MYTABLE, due to the fact we create DATE_B as DATE datatype.
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2017-08-01','YYYY-MM-DD'),'2019-09-09');
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2018-10-30','YYYY-MM-DD'),'2021-09-09');
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2022-08-11','YYYY-MM-DD'),'2022-08-06');
I have to modify all DATE_B values in order to insert DATE before the value. What I mean is:
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2017-08-01','YYYY-MM-DD'),DATE'2019-09-09');
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2018-10-30','YYYY-MM-DD'),DATE'2021-09-09');
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2022-08-11','YYYY-MM-DD'),DATE'2022-08-06');
My approach was to use REGEX in order to identify the single quote, year, dash, month, dash, day, dash, single quote... But if I do this I will substitute also my properly formatted to_timestamp values.
I found this command from GeekForGeeks useful since we catch the first letter and we enclose them with parenthesis.
$ echo "Welcome To The Geek Stuff" | sed 's/\(\b[A-Z]\)/\(\1\)/g'
Trying to approach my goal:
$ echo "Welcome To The Geek Stuff '2010-10-12' '1999-10-10'" | sed -E 's/(\b[A-Z])/DATE\1/g'
Using regex:
$ echo "Welcome To The Geek Stuff '2010-10-12' '1999-10-10'" | sed -E 's/'('\d{4}-)'/DATE\1/g'
$ echo "Welcome To The Geek Stuff" ,'1999-10-10' | sed 's/^(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])$/'DATE\1'/g'
I also read about h, H subcommand of sed, but it is not clear enough for me to make it works.
I would appreciate if someone could explain how to properly implement sed h subcommand and how to properly use regex with sed in order to identify COMMA, single quote, year, dash, month, dash, day, dash, single quote. Save this and insert word DATE
Thank you beforehand!
Assumptions/Understandings:
'YYYY-MM-DD'
values (...)
clause; this means ...values (...)
clause (ie, has a preceding (
) as well as strings elsewhere in the values(...)
clause (ie, has a preceding ,
) but ...DATE
prefix if the string is embedded within a function (eg, to_timestamp('YYYY-MM-DD')
)INSERT
statements are formatted as shown in the sample (eg, a space between values
and (
; no white space before a 'YYYY-MM-DD'
string)Adding a few scenarios to our input:
$ cat insert.sql
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2017-08-01','YYYY-MM-DD'),'2019-09-09');
Insert into MYTABLE (DATE_A, DATE_C) values ('2021-09-09',to_timestamp('2018-10-30','YYYY-MM-DD'));
Insert into MYTABLE (DATE_A, MY_INT, DATE_B) values ('2022-08-11','2344','2022-08-06');
One sed
idea:
sed -E "s/( \(|,)('[0-9]{4}-[0-9]{2}-[0-9]{2}')/\1DATE\2/g" insert.sql
Where:
-E
- enable extended regex support (eliminates need to escape parens and braces)"s/..../g"
- wrap sed
script in double quotes to eliminate need to escape single quote within the script( \(|,)
- (1st capture group) space + (
or a comma (in this case we need the escape to designate the (
as part of the data)('[0-9]{4}-[0-9]{2}-[0-9]{2}')
- (2nd capture group) our string 'YYYY-MM-DD'
\1DATE\2
- inserting the DATE
string between the two capture groupsThis generates:
Insert into MYTABLE (DATE_A, DATE_B) values (to_timestamp('2017-08-01','YYYY-MM-DD'),DATE'2019-09-09');
Insert into MYTABLE (DATE_A, DATE_C) values (DATE'2021-09-09',to_timestamp('2018-10-30','YYYY-MM-DD'));
Insert into MYTABLE (DATE_A, MY_INT, DATE_B) values (DATE'2022-08-11','2344',DATE'2022-08-06');
Once satisfied with the results OP can add the -i
flag to have sed
overwrite the input file.