Search code examples
stringvbaexceldouble-quotes

How can I find quoted text in a string?


Example

Say I have a string:

"I say ""Hello world"" and she says ""Excuse me?"""

VBA will interpret this string as:

I say "Hello world" and she says "Excuse me?"

A more complex example:

I have a string:

"I say ""Did you know that she said """"Hi there!"""""""

VBA interprets this string as:

I say "Did you know that she said ""Hi there!"""

If we remove "I say "

"Did you know that she said ""Hi there!"""

we can continue parsing the string in vba:

Did you know that she said "Hi there!"

Problem

Ultimately I want some function, sBasicQuote(quotedStringHierarchy as string), which returns a string containing the next level up in the string hierarchy.

E.G.

dim s as string
s = "I say ""Did you know that she said """"Hi there!"""""""
s = sBasicQuote(s) ' returns 'I say "Did you know that she said ""Hi there!"""'
s = sBasicQuote(s) ' returns 'Did you know that she said "Hi there!"'
s = sBasicQuote(s) ' returns 'Hi there!'

I just can't figure out an algorithm that would work with this... You almost need to replace all double quotes, but when you've replaced the nth double quote you have to skip to the n+1th douple quote?

How does one implement this in VBA?


Solution

  • My Solution

    I spent some more time thinking and came up with this solution.

    Function sMineDoubleQuoteHierarchy(s As String) As String
        'Check the number of quotes in the string are even - sanity check
        If (Len(s) - Len(Replace(s, """", ""))) Mod 2 <> 0 Then sMineDoubleQuoteHierarchy = "Error - Odd number of quotes found in sMineDoubleQuoteHierarchy() function": Exit Function
    
        'First thing to do is find the first and last *single* quote in the string
        Dim lStart, lEnd, i As Long, fs As String
        lStart = InStr(1, s, """")
        lEnd = InStrRev(s, """")
    
        'After these have been found we need to remove them.
        s = Mid(s, lStart + 1, lEnd - lStart - 1)
    
        'Start at the first character
        i = 1
    
        Do While True
            'Find where the next double quote is
            i = InStr(1, s, """""")
    
            'if no double quote is found then concatenate with fs with the remainder of s
            If i = 0 Then Exit Do
    
            'Else add on the string up to the char before the ith quote
            fs = fs & Left(s, i - 1)
    
            'Replace the ith double quote with a single quote
            s = Left(s, i - 1) & Replace(s, """""", """", i, 1)
    
            'Increment by 1 (ensuring the recently converted double quote is no longer a single quote
            i = i + 1
        Loop
    
        'Return fs
        sMineDoubleQuoteHierarchy = s
    End Function
    

    What's going on in this solution?

    The first part of the process is removing the first and last single quote from the string and returning the text between them. Then we loop through the string replacing each instance of "" and replacing it with ". Each time we do this we skip to the next character to unsure strings like """" go to "" instead of ".

    Does anyone else have a better/more compact solution?


    Edit

    After all the suggestions in this forum I settled with this. It's got some extra error trapping to find validate nested strings.

    Public Function DoubleQuoteExtract(ByVal s As String, Optional ByRef ErrorLevel As Boolean) As String
        'This effectively parses the string like BASIC does by removing incidents of "" and replacing them with "
    
        'SANITY CHECK - Check even number of quotes
        Dim countQuote As Double
        countQuote = Len(s) - Len(Replace(s, """", ""))
    
        'Calculate whether or not quote hierarchy is correct:
        '"..."          - Is okay           - Count Quotes = 2      - Count Quotes / 2 = 1
        '""...""        - Is not okay       - Count Quotes = 4      - Count Quotes / 2 = 2
        '"""..."""      - Is okay           - Count Quotes = 6      - Count Quotes / 2 = 3
        '""""...""""    - Is not okay       - Count Quotes = 8      - Count Quotes / 2 = 4
        'etc.
        'Ultimately: IF CountQuotes/2 = Odd The string hierarchy is setup fine
        '            IF CountQuotes/2 = Even, The string Hierarchy is setup incorrectly.
    
        Dim X As Double: X = countQuote / 2
        Dim ceil As Long: ceil = Int(X) - (X - Int(X) > 0)
        If ceil Mod 2 <> 0 Then sDoubleQuoteExtract = "#Error - Incorrect number of double quotes forming an incomplete hierarchy.": GoTo ErrorOccurred
    
        'If an odd number of quotes are found then they cannot be paired correctly, thus throw error
        If countQuote Mod 2 <> 0 Then sDoubleQuoteExtract = "#Error - Odd number of quotes found in sMineDoubleQuoteHierarchy() function": GoTo ErrorOccurred
    
    
        'Find the next incident of single quote. Trim the string to this
        s = Mid(s, InStr(1, s, String(1, Chr(34))))
    
        'replace all instances of "" with "
        s = Replace(s, String(2, Chr(34)), String(1, Chr(34)))
    
        'Finally trim off the first and last quotes
        DoubleQuoteExtract = Mid(s, 2, Len(s) - 2)
        ErrorLevel = False
        Exit Function
    ErrorOccurred:
        ErrorLevel = True
    End Function