UPDATE: To make this question easier. I have changed the string and now you don't have to worry about BEGIN END at all. Instead I now have only GO statements as terminators. I hope I will get some answers from other people now
I need to write a regular expression which can detect hidden DML statements like INSERT, UPDATE, DELETE in a DDL script (CREATE, ALTER, DROP).
Example. In the script below it should catch 2 delete table5 statements and the last insert into table3 statement. But it should ignore the insert statements which are actually in the body of the stored proc itself.
If you run this RegEx here : http://regexr.com?33rf3, you will see that I have accomplished 90% of it. The only part that is not working is that it is doing a greedy match until the last GO. I need it to stop at the first GO. I have tried using +? operator to make it non-greedy but it doesn't want to work.
delete table5 /*Regex should find this*/
GO
create proc abc
begin
insert into table1 /*Regex should IGNORE this*/
begin
fake nesting here
insert into table2 /*Regex should IGNORE this*/
end
end
GO
delete table5 /*Regex should find this*/
GO
alter proc abc2
begin
--no statements in this proc
end
GO
insert into table3 /*Regex should find this*/
GO statements aren't needed.
var regex1 = new Regex(@"
(?s)
(create|alter|drop).+?
begin
(?:
(?> (?! begin | end ) .
| begin (?<counter>)
| end (?<-counter>)
)*
)
(?(counter)(?!))
end
"
, RegexOptions.IgnorePatternWhitespace
);
var regex2 = new Regex(@"(?:insert|update|delete).+");
var result = regex2.Matches(regex1.Replace(your_input, ""))
.Cast<Match>()
.Select(m => m.Value);
This code deletes all create/alter/drop statements and than looks for delete/alter/drop. I think it could be done using only one regex, but it is all I can suggest now, sorry:)