Search code examples
excelhtml-tabletextrangevba

Formatting conflict for HTML table query in Excel


I would like to import some data from an online HTML table and place it in a spreadsheet that follows another formatting system. Let me explain. I am keeping track of a baseball team's results for a season, and more specifically hits, home runs and RBIs. The data is posted here and is organized according to the dates of the events. In my spreadsheet, I also organize the data according to the dates of the events, but with the exception that my column with dates (first column) contains dates for the entire season and not for specific events. Days off (when there are no events) are automatically marked as null by Excel. Additionnaly, the dates in my spreadsheet are formatted like this (DD/MM) whereas the dates online are formatted like this (Mnth DD). Also, I am interested in keeping track of only certain results (therefore certain columns of the HTML table) Is there a way for me to import specific portions (rows) of data, preserve my formatting (DD/MM + days off marked as null) and have the data place itself on the corresponding row? Ideally the data would update itself as the season progresses. Please have a look at the image in case all of this is unclear.enter image description here


Solution

  • Make a new worksheet. Go to Data - From Web, navigate to that site, select whole page (I couldn't see how to just get the table you want) and Import it.

    This new worksheet will be your data page and the other page (your existing sheet) will be your presentation page. On the data page, go to AG86 (or thereabouts) and put this formula in

    =IF(ISERR(FIND("(",C86)),C86,DATEVALUE(LEFT(C86,FIND("(",C86)-1)))
    

    Fill that down to AG1000. That will clean up problems with dates that come from that website.

    Now if you want to get home runs, go to your presentation page and, assuming you have dates in A1:A?, but this formula in B1

    =IF(ISNA(MATCH(A2,Sheet1!$AG$86:$AG$1000,FALSE)),"",INDEX(Sheet1!$M$86:$M$1000,MATCH(A2,Sheet1!$AG$86:$AG$1000,FALSE),1))
    

    Finally, every time you use the worksheet, go to the data sheet, right click on the web query and refresh it to get the latest data. You can event hide the data page if you don't want anyone else to see it.