Seemingly simple problem. I need to find those records where "long text" field1
in table notes
contains lines starting with -" (i.e. hyphen and double quote) using VBA regular expressions in SQL. (The field1
is using rich text formatting (table and form field). I could drop it, if necessary, but in my context it is useful to have rich text).
Double quote is stored as its html entity "
can be reverted by PLAINTEXT()
function
To make things easier I start by simply trying to match fields containing any quote characters: I can find fields containing single quote without a problem:
SELECT *
FROM notes
WHERE RegexMatch(field1, ".*'.*")
;
but it fails with double-quote
SELECT *
FROM notes
WHERE RegexMatch(field1, ".*"".*")
;
Where I try to escape the quote character by another quote. What am I doing wrong? I have also pasted my RegexMatch()
function in case this is relevant.
' ----------------------------------------------------------------------'
' Return True if the given string value matches the given Regex pattern '
' ----------------------------------------------------------------------'
Public Function RegexMatch(value As Variant, pattern As String) As Boolean
If IsNull(value) Then Exit Function
' Using a static, we avoid re-creating the same regex object for every call '
Static regex As Object
' Initialise the Regex object '
If regex Is Nothing Then
Set regex = CreateObject("vbscript.regexp")
With regex
.Global = True
.IgnoreCase = True
.Multiline = True
End With
End If
' Update the regex pattern if it has changed since last time we were called '
If regex.pattern <> pattern Then regex.pattern = pattern
' Test the value against the pattern '
RegexMatch = regex.Test(value)
End Function
Many thanks for your help
Query now works well:
SELECT *
FROM notes
WHERE RegexMatch(PLAINTEXT(notes.field1), "-""")
;
If the column uses Rich Text format, then "
is internally stored as "
.
Any query for that column will work with the HTML source text, not the formatted HTML.
You can see the source text by creating a form based on the table, with a textbox for the field that has TextBox.TextFormat = Plain Text.
Erik A makes a very good point in the comment:
You can use the PlainText function to query the plain text representation of the rich text.
Then you can use this to query for double quotes:
WHERE PlainText(fld) LIKE "*""*"
or
WHERE PlainText(fld) LIKE '*"*'