Search code examples
c#quandl

Download Current WSJ.com Prime Rate


I need to automatically download the current Wall Street Journal Prime Rate and load the data into my database. What is the best method for downloading this data automatically?

I have come up with three possible solutions for doing this:

  1. Scrape a HTML web page from WSJ.
  2. Parse a RSS news feed from WSJ.
  3. Use some API that I haven't found from WSJ.

Regarding solution 1, although I don't like solution 1 since it could easily break, it's the only one that I have worked out from end to end. It appears I can scrape this page with a WebRequest / WebResponse and read the text in this code:

<tr>
<td style="text-align:left" class="colhead">&nbsp;</td>
<td class="colhead">Latest</td>
<td class="colhead">Wk ago</td>
<td class="colhead">High</td>
<td class="colhead">Low</td>
</tr>
<tr>
<td class="text">U.S.</td>
<td style="font-weight:bold;" class="num">3.25</td>
<td class="num">3.25</td>
<td class="num">3.25</td>
<td class="num" style="border-right:0px">3.25</td>
</tr>

Regarding solution 2, although I can implement a RSS reader solution, I don't see a way to reliably anticipate verbiage for changes in the Prime Rate. Therefore, I don't think this is as safe or reliable a way to get the data as solution 1.

Regarding solution 3, I haven't found any published API's for checking money rates like the Prime Rate. If anyone knows of a web service or other API for checking money rates, then please let me know.


Solution

  • I implemented the following code to scrape the HTML and process the results:

    SSIS Package C# Script Task

    public void Main()
        {
            // Post the web page.
            try
            {
                // Set variables.
                bool fireAgain = true;
                Uri WebPageURI = new Uri("http://online.wsj.com/mdc/public/page/2_3020-moneyrate.html");
    
                // Post the web page.
                WebRequest request = WebRequest.Create(WebPageURI);
                request.Timeout = (1000 * 60 * 60);
                request.Method = "POST";
                WebResponse response = request.GetResponse();
                StreamReader reader = new StreamReader(response.GetResponseStream());
                string Output = reader.ReadToEnd();
    
                Dts.Variables["wall_street_journal_str"].Value = Output;
                Dts.Events.FireInformation(0, "WebRequest:", WebPageURI.ToString(), "", 0, ref fireAgain);
                Dts.Events.FireInformation(0, "WebResponse:", Output, "", 0, ref fireAgain);
            }
            catch (WebException ex)
            {
                Dts.Events.FireError(0, "Error:", ex.Message, "", 0);
            }
    
    
            // Return success.
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    

    Stored Procedure Called by an Execute SQL Task

    SET @WALL_STREET_JOURNAL_HTML = SUBSTRING(@WALL_STREET_JOURNAL_HTML, CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1), CHARINDEX('Canada', @WALL_STREET_JOURNAL_HTML, CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1)) - CHARINDEX('Prime rates', @WALL_STREET_JOURNAL_HTML, 1))
        SELECT @RATE_CHANGE_DATE = CONVERT(DATE, RTRIM(LTRIM(SUBSTRING(@WALL_STREET_JOURNAL_HTML, CHARINDEX('[', @WALL_STREET_JOURNAL_HTML, 1) + 23, CHARINDEX(']', @WALL_STREET_JOURNAL_HTML, 1) - CHARINDEX('[', @WALL_STREET_JOURNAL_HTML, 1) - 23))))
            , @RATE = CONVERT(NUMERIC(8, 2), SUBSTRING(@WALL_STREET_JOURNAL_HTML, CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1)) + 12, CHARINDEX('<', @WALL_STREET_JOURNAL_HTML, CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1))) - CHARINDEX('class="num">', @WALL_STREET_JOURNAL_HTML, CHARINDEX('U.S.</td>', @WALL_STREET_JOURNAL_HTML, 1)) -12))
    

    Ugly code, but it serves the purpose. There are also several validation rules that run to verify the data is correct, but this is the core of what I used to scrape the value.