Search code examples
powershellsqlitecgi

newlines containing text to SQLite in windows console


Running batch files powered CGI web site with SQLite database, storing text with newlines requirement newly arise.

I am able to store text with newlines into a database by:

sqlite3.exe "\work\test.db" "INSERT INTO test (text) VALUES ('Hello' || char(10) || 'world')"

I am able successfully insert such text into a web form by:

echo ^<form action="script.cmd" method="POST"^>
echo ^<textarea^>
sqlite3.exe "\work\test.db" "select text from test;" 2>NUL
echo ^</textarea^>
echo ^<input type="button" action="submit"^>
echo ^</form^>

I am also able to read and decode possibly edited text from STDIN in "script.cmd" by:

powershell.exe -noprofile "Add-Type -AssemblyName System.Web;[System.Web.HttpUtility]::UrlDecode($Input)"

BUT I need to convert newlines to:

|| char(10) ||

as described in very first command in this post. Think about to prepend HTTPDecoding powershell by something like:

$decodedString -replace \"`r`n\", '|| char(10)'"

but problem is, if two newlines one by another occurs, the number of doublepipes need to be odd:

|| char(10) || char(10) ||

Anyone have idea how to deal with this? I dont want to make it too complex.


Solution

  • I don't want to make it too complex.

    The following is complex, but should work robustly:[1]

    powershell.exe -noprofile "Add-Type -AssemblyName System.Web; [regex]::Replace(([System.Web.HttpUtility]::UrlDecode($Input).TrimEnd() -replace '''', '''''' -replace '(?m)^([^\r]+)(\r?)$', '''$1''$2'), '(\r?\n)+', { ' || ' + 'char(10) || ' * $args[0].Groups[1].Captures.Count })"
    

    Note:

    • The above trims (removes) trailing whitespace from the decoded input string, under the assumption that it isn't significant. (To also trim leading whitespace, use .Trim() instead of .TrimEnd().)
    • If a trailing newline had to be preserved, the solution would have to be amended in a nontrivial manner.

    The pieces of the solution are:

    • $Input is (an enumerator for) for every line of stdin input; the assumption is that there is only one line in your case, with URL-encoded newlines such as %0d%0a (CRLF, i.e. \r\n, Windows-format) or %0a (LF, i.e. \n, Unix-format), which [System.Web.HttpUtility]::UrlDecode() decodes into their literal forms.

    • -replace '''', '''''' escapes any line-internal ' as ''

    • -replace '(?m)^([^\r]+)(\r?)$', '''$1''$2'', '''' then encloses each non-empty line in '...'

    • [regex]::Replace() then operates on the resulting multiline string on any nonempty run of newlines ('(\r?\n)+') and uses a script block ({ ... }) as a match-evaluator delegate.

      • Each run of newlines is then replaced with ||, followed by as many instances of char(10) || as there are newlines in the run.

    [1] To test the solution interactively, from a cmd.exe session, place the following before the command above (from a batch file you'd have to double the % chars.): echo a'1%0d%0a%0d%0ab%0d%0ac| . The input is the URL-encoded form of the following multiline string: a'1\r\n\r\nb\r\nc, which should produce the following verbatim output:
    'a''1' || char(10) || char(10) || 'b' || char(10) || 'c'