Search code examples
regexms-accessvba

Expressing basic Access query criteria as regular expressions


I'm familiar with Access's query and filter criteria, but I'm not sure how to express similar statements as regular expression patterns. I'm wondering if someone can help relate them to some easy examples that I understand.

If I were using regular expressions to match fields like Access, how would I express the following statements? Examples are similar to those found on this Access Query and Filter Criteria webpage. As in Access, case is insensitive.

  1. "London"

    Strings that match the word London exactly.

  2. "London" or "Paris"

    Strings that match either the words London or Paris exactly.

  3. Not "London"

    Any string but London.

  4. Like "S*"

    Any string beginning with the letter s.

  5. Like "*st"

    Any string ending with the letters st.

  6. Like "*the*dog*"

    Any strings that contain the words 'the' and 'dog' with any characters before, in between, or at the end.

  7. Like "[A-D]*"

    Any strings beginning with the letters A through D, followed by anything else.

  8. Not Like "*London*"

    Any strings that do not contain the word London anywhere.

  9. Not Like "L*"

    Any strings that don't begin with an L.

  10. Like "L*" And Not Like "London*"

    Any strings that begin with the letter L but not the word London.


Solution

  • Regex as much more powerful than any of the patterns you have been used to for creating criteria in Access SQL. If you limit yourself to these types of patterns, you will miss most of the really interesting features of regexes.

    For instance, you can't search for things like dates or extracting IP addresses, simple email or URL detection or validation, basic reference code validation (such as asking whether an Order Reference code follows a mandated coding structure, say something like PO123/C456 for instance), etc.

    As @Smandoli mentionned, you'd better forget your preconceptions about pattern matching and dive into the regex language.

    I found the book Mastering Regular Expressions to be invaluable, but tools are the best to experiment freely with regex patterns; I use RegexBuddy, but there are other tools available.

    Basic matches

    Now, regarding your list, and using fairly standardized regular expression syntax:

    1. "London"

      Strings that match the word London exactly.

      ^London$

    2. "London" or "Paris"

      Strings that match either the words London or Paris exactly.

      ^(London|Paris)$

    3. Not "London"

      Any string but London.

      You match for ^London$ and invert the result (NOT)

    4. Like "S*"

      Any string beginning with the letter s.

      ^s

    5. Like "*st"

      Any string ending with the letters st.

      st$

    6. Like "*the*dog*"

      Any strings that contain the words 'the' and 'dog' with any characters before, in between, or at the end.

      the.*dog

    7. Like "[A-D]*"

      Any strings beginning with the letters A through D, followed by anything else.

      ^[A-D]

    8. Not Like "*London*"

      Any strings that do not contain the word London anywhere.

      Reverse the matching result for London (you can use negative lookahead like:
      ^(.(?!London))*$, but I don't think it's available to the more basic Regex engine available to Access).

    9. Not Like "L*"

      Any strings that don't begin with an L.

      ^[^L] negative matching for single characters is easier than negative matching for a whole word as we've seen above.

    10. Like "L*" And Not Like "London*"

      Any strings that begin with the letter L but not the word London.

      ^L(?!ondon).*$

    Using Regexes in SQL Criteria

    In Access, creating a user-defined function that can be used directly in SQL queries is easy.
    To use regex matching in your queries, place this function in a module:

    ' ----------------------------------------------------------------------'
    ' Return True if the given string value matches the given Regex pattern '
    ' ----------------------------------------------------------------------'
    Public Function RegexMatch(value As Variant, pattern As String) As Boolean
        If IsNull(value) Then Exit Function
        ' Using a static, we avoid re-creating the same regex object for every call '
        Static regex As Object
        ' Initialise the Regex object '
        If regex Is Nothing Then
            Set regex = CreateObject("vbscript.regexp")
            With regex
                .Global = True
                .IgnoreCase = True
                .MultiLine = True
            End With
        End If
        ' Update the regex pattern if it has changed since last time we were called '
        If regex.pattern <> pattern Then regex.pattern = pattern
        ' Test the value against the pattern '
        RegexMatch = regex.test(value)
    End Function
    

    Then you can use it in your query criteria, for instance to find in a PartTable table, all parts that are matching variations of screw 18mm like Pan Head Screw length 18 mm or even SCREW18mm etc.

    SELECT PartNumber, Description
    FROM   PartTable
    WHERE  RegexMatch(Description, "screw.*?d+\s*mm")
    

    Caveat

    • Because the regex matching uses old scripting libraries, the flavour of Regex language is a bit more limited than the one found in .Net available to other programming languages.
      It's still fairly powerful as it is more or less the same as the one used by JavaScript.
      Read about the VBScript regex engine to check what you can and cannot do.

    • The worse though, is probably that the regex matching using this library is fairly slow and you should be very careful not to overuse it.

    That said, it can be very useful sometimes. For instance, I used regexes to sanitize data input from users and detect entries with similar patterns that should have been normalised.
    Well used, regexes can enhance data consistency, but use sparingly.