In Filehelpers BeforeReadRecords Event, we need Regular Expression to manipulate all single "double quotes" with in a field with two double quotes.
CSV Content:
"Mine"s Minesweeper", "Yours"s Minesweeper", "Uncle Sam"s Minesweeper"
"Mine"s Minesweeper2", "Yours"s Minesweeper2", "Unknown Minesweeper3"
Need helps to create a vb.net Regular Expression to replace the all inner double quotes. Currently we are using below approrach?
Dim engine As New FileHelperEngine(cb.CreateRecordClass())
AddHandler engine.BeforeReadRecord, AddressOf BeforeReadRecordHandler
Event Code
Sub BeforeReadRecordHandler(engine As EngineBase, e As FileHelpers.Events.BeforeReadEventArgs(Of Object))
Try
Dim newLine As String = ""
Dim sep As String = ""
Dim arr() As String = e.RecordLine.Split(",")
arr.AsParallel().ForAll(Sub(x)
If x.Length > 1 Then
newLine = String.Format("{0}{1}{2}", newLine, sep, x.Substring(1, IIf(x.Length <= 2, 0, x.Length - 2)).Replace("""", """"""))
Else
newLine = String.Format("{0}{1}{2}", newLine, sep, x)
End If
sep = ","
End Sub)
e.RecordLine = newLine
Catch ex As Exception
End Try
End Sub
Trying to generate regular expression for ;
String.Format("{0}{1}{2}", newLine, sep, x.Substring(1, IIf(x.Length <= 2, 0, x.Length - 2)).Replace("""", """""")).
Output should be
CSV Content:
"Mine""s Minesweeper", "Yours""s Minesweeper", "Uncle Sam""s Minesweeper"
"Mine""s Minesweeper2", "Yours""s Minesweeper2", "Unknown Minesweeper3"
.Net supports arbitrary length lookbehind, so you can use the following;
(?<!(^|,)\s*)"(?!\s*($|,))
Use with
Regexp.replace(input,(?<!(^|,)\s*)"(?!\s*($|,)),"""""",RegexOptions.Multiline)
This matches any " not preceded by the start of line or a comma, and not suceeded by end of string or comma, both conditions ignoring an arbitrary amount of whitespace.
It will misbehave if an entry in the CSV is not bracketed by quotes, or if a comma occurs in the text of an entry.