
How do I handle the MailChimp API response for the HTML variable in VBA?


A string variable, oldHTMLContent, holds the response text from a MailChimp API request; it represents the current content of an email campaign. Here is the string. It also contains a number of \r\n sequences that are not visible in the display below:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
  <head>
    <style type="text/css">
      @media only screen and (max-width: 480px) {
        table#canspamBar td {
          font-size:14px !important;
        }
        
        table#canspamBar td a {
          display:block !important;
          margin-top:10px !important;
        }
      }
    </style>
  </head>
  <body>
    <p> </p>
    <div class="userBot">
      <a href="http://dev.mydev.org/why-so-many-people-are-signing-up-for-cynthia-for-new-york-volunteer-events"><img src="http://dev.mydev.org/wp-content/uploads/2018/07/CynthiaNixon.jpg" width="1012" height="592" alt="CynthiaNixon.jpg"></a>
      <p>When we ask ourselves why so many people are signing up for Cynthia For New York volunteer events this weekend, this is what ... (click for more)</p>
    </div>            <center>
    <br>
    <br>
    <br>
    <br>
    <br>
    <br>
    <table border="0" cellpadding="0" cellspacing="0" width="100%" id="canspamBarWrapper" style="background-color:#FFFFFF;border-top:1px solid #E5E5E5;">
      <tr>
        <td align="center" valign="top" style="padding-top:20px;padding-bottom:20px;">
          <table border="0" cellpadding="0" cellspacing="0" id="canspamBar">
            <tr>
              <td align="center" valign="top" style="color:#606060;font-family:Helvetica, Arial, sans-serif;font-size:11px;line-height:150%;padding-right:20px;padding-bottom:5px;padding-left:20px;text-align:center;">
                This email was sent to <a href="mailto:*|EMAIL|*" target="_blank" style="color:#404040 !important;">*|EMAIL|*</a>
                <br><a href="*|ABOUT_LIST|*" target="_blank" style="color:#404040 !important;"><em>why did I get this?</em></a>    <a href="*|UNSUB|*" style="color:#404040 !important;">unsubscribe from this list</a>    <a href="*|UPDATE_PROFILE|*" style="color:#404040 !important;">update subscription preferences</a>
                <br>*|LIST:ADDRESSLINE|*
                <br>
                <br>
              </td>
            </tr>
          </table>
        </td>
      </tr>
    </table>
  </center>
</body>
</html>

I want to extract just the element with the "userBot" class, but I can't seem to access it with getElementsByClassName.

When this code executes, the printed length is always zero.

Dim oldHTMLContent As String
Dim oldHtmlDoc As MSHTML.HTMLDocument
Set oldHtmlDoc = New HTMLDocument
oldHtmlDoc.body.innerText = oldHTMLContent
Debug.Print oldHtmlDoc.getElementsByClassName("userBot").length

How do I define the right object and load it with the HTML string so I can work with the userBot class? I can see I'm loading the whole DOM, including


Solution

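  • Assign the string to .innerHTML of the new HTMLDocument rather than to .innerText. Setting .innerText inserts the string as literal text, so the markup is never parsed into elements and the class lookup finds nothing. Once the markup is loaded through .innerHTML, select the element with a CSS class selector (a leading "."), as shown below. Also, the naming is a little confusing. IMO it would be clearer if you were transferring oldInnerHTML to newHTMLDoc, or something like that.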

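    Option Explicit

    ' Requires a reference to Microsoft HTML Object Library (Tools > References).
    Public Sub test()
        Dim html As New HTMLDocument

        ' [A1] holds the response text here; substitute your oldHTMLContent.
        html.body.innerHTML = [A1]
        ' querySelector returns the first element that matches the class selector.
        Debug.Print html.querySelector(".userBot").innerText
    End Sub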
    

    This is the same as saying:

    Debug.Print html.getElementsByClassName("userBot")(0).innerText
    

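To make the difference between the two properties concrete, here is a minimal sketch that loads the same markup both ways and then guards against a missing match. The sub name and the sampleHtml constant are hypothetical stand-ins for illustration (in practice the string would be oldHTMLContent), and the sketch assumes a reference to Microsoft HTML Object Library is set.

```vba
Option Explicit

' Requires a reference to Microsoft HTML Object Library (Tools > References).
Public Sub CompareInnerTextAndInnerHTML()
    ' Hypothetical stand-in for the MailChimp response text (oldHTMLContent).
    Const sampleHtml As String = "<div class=""userBot""><p>Hello</p></div>"

    Dim doc As MSHTML.HTMLDocument
    Dim node As MSHTML.IHTMLElement

    ' innerText stores the markup as literal characters, so no userBot element
    ' is created. This mirrors the question, where the count was zero.
    Set doc = New MSHTML.HTMLDocument
    doc.body.innerText = sampleHtml
    Debug.Print doc.querySelectorAll(".userBot").Length

    ' innerHTML parses the markup into DOM nodes that can be selected.
    Set doc = New MSHTML.HTMLDocument
    doc.body.innerHTML = sampleHtml
    Set node = doc.querySelector(".userBot")

    ' querySelector yields Nothing when there is no match, so check first.
    If node Is Nothing Then
        Debug.Print "No element with class userBot was found."
    Else
        ' outerHTML includes the div itself; innerText gives only its text.
        Debug.Print node.outerHTML
    End If
End Sub
```

Checking for Nothing before reading a property avoids a run-time error if a later campaign's content does not contain a userBot element at all.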