Search code examples
c#sql-serverregexpostgresqlregex-greedy

Regexp convert sql LIMIT syntax to SqlServer TOP syntax


I have some query string written for PostgreSQL and I have to convert them to SqlServer. I'm using C# 6.

In order to convert from LIMIT syntax to TOP syntax I've written the following simple regex:

SELECT ([\S\s]*)LIMIT (\d+) -> SELECT TOP $2 $1

The complete line code is:

return Regex.Replace(query, @"SELECT ([\S\s]*)LIMIT (\d+)", "SELECT TOP $2 $1", RegexOptions.RightToLeft);

Now, eveything works fine for simple query but it does not work for subquery: the TOP string is put everytime in the super parent query.

Try for example the following query:

SELECT b3.aaa,
       b3.count
FROM
(
    SELECT CONCAT(b.foo / 100.0, '-', (b.foo + 10) / 100.0) AS aaa,
                      COUNT(b.foo) AS count
    FROM
    (
        SELECT b2.foo - (b2.foo % 10) AS foo
        FROM
        (
            SELECT CAST(b.foo * 100 AS INT) AS foo
            FROM eee b
                 INNER JOIN bar f ON b.bar_id = f.id
            WHERE b.foo < 1
                  AND f.nome = 'aaa'
        ) b2
    ) b
    GROUP BY b.foo
    ORDER BY b.foo
    LIMIT 10
)

it is converted to:

SELECT TOP 10 b3.aaa,
       b3.count
FROM
(
    SELECT CONCAT(b.foo / 100.0, '-', (b.foo + 10) / 100.0) AS aaa,
                      COUNT(b.foo) AS count
    FROM
    (
        SELECT b2.foo - (b2.foo % 10) AS foo
        FROM
        (
            SELECT CAST(b.foo * 100 AS INT) AS foo
            FROM eee b
                 INNER JOIN bar f ON b.bar_id = f.id
            WHERE b.foo < 1
                  AND f.nome = 'aaa'
        ) b2
    ) b
    GROUP BY b.foo
    ORDER BY b.foo
)

while I'd like to have

SELECT b3.aaa,
       b3.count
FROM
(
    SELECT TOP 10 CONCAT(b.foo / 100.0, '-', (b.foo + 10) / 100.0) AS aaa,
                      COUNT(b.foo) AS count
    FROM
    (
        SELECT b2.foo - (b2.foo % 10) AS foo
        FROM
        (
            SELECT CAST(b.foo * 100 AS INT) AS foo
            FROM eee b
                 INNER JOIN bar f ON b.bar_id = f.id
            WHERE b.foo < 1
                  AND f.nome = 'aaa'
        ) b2
    ) b
    GROUP BY b.foo
    ORDER BY b.foo
)

How should I write the regex I need?


Solution

  • This works okay with some changes I made to handle quoted parens:

    var pattern = @"SELECT ((?:[^'()]|(?>'[^']+')|(?<open>\()|(?<-open>\)))+(?(open)(?!)))LIMIT (\d+)";
    
    query = Regex.Replace(query, pattern, "SELECT TOP $2 $1");