Tags: c#, httpwebrequest, webclient, ddos

How to: fetch data from a site without doing a Denial of Service attack / know when there is new data on a site


I'm trying to fetch data from a site using HttpWebRequest/WebClient, so what I'm doing is sending a request to get the site's HTML every 30 seconds.

What is happening is that the site is blocking me for a Denial of Service attack because I send too many requests from my computer.

How can I know when there is new data on a site without fetching data every 30 seconds?

OR

How can I fetch data from a site every 30 seconds without getting blocked for a Denial of Service attack?

OK, so I'm adding some code:

public void DownloadFile(String remoteFilename, String localFilename)
{
    Stream remoteStream = null;
    Stream localStream = null;
    HttpWebResponse gResponse = null;

    HttpWebRequest gRequest = (HttpWebRequest)WebRequest.Create(remoteFilename);
    gRequest.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.1.8) Gecko/20100202 Firefox/3.5.8 GTBDFff GTB7.0";
    gRequest.CookieContainer = new CookieContainer();
    gRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    gRequest.KeepAlive = true;
    gRequest.ContentType = @"application/x-www-form-urlencoded";

    #region CookieManagement
    if (gCookies != null && gCookies.Count > 0)
    {
        gRequest.CookieContainer.Add(gCookies);
    }
    #endregion

    try
    {
        gResponse = (HttpWebResponse)gRequest.GetResponse();

        // Check that the status code is HTTP 200 (OK)
        if (gResponse.StatusCode == HttpStatusCode.OK)
        {
            remoteStream = gResponse.GetResponseStream();
            localStream = File.Create(localFilename);
            byte[] buffer = new byte[1024];
            int bytesRead;

            do
            {
                // Read data (up to 1 KB) from the response stream
                bytesRead = remoteStream.Read(buffer, 0, buffer.Length);

                // Write the data to the local file
                localStream.Write(buffer, 0, bytesRead);
            } while (bytesRead > 0);
        }
        else
        {
            MessageBox.Show("Error!");
            Application.Exit();
        }
    }
    catch (Exception e)
    {
        MessageBox.Show(e.ToString());
        Application.Exit();
    }
    finally
    {
        // Close everything even when an exception is thrown,
        // otherwise the response and streams leak
        if (gResponse != null) gResponse.Close();
        if (remoteStream != null) remoteStream.Close();
        if (localStream != null) localStream.Close();
    }
}

and in the timer:

DownloadFile("http://www.fxp.co.il/forumdisplay.php?f=2709", @"C:\tmph.html");

This forum is a buy/sell forum, so what I'm trying to do is get the forum HTML every 30 seconds and check the HTML for the number of unread "buy" posts using HtmlAgilityPack.
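For context, the timer wiring could look like the sketch below. This assumes a WinForms `System.Windows.Forms.Timer` (the question doesn't say which timer is used); the interval is the 30 seconds from the question, although a longer interval is less likely to trip the site's rate limiting.

```csharp
using System.Windows.Forms;

// Sketch: poll the forum on a WinForms timer (assumed, not shown in the question).
Timer pollTimer = new Timer();
pollTimer.Interval = 30 * 1000;   // 30 seconds, in milliseconds
pollTimer.Tick += (sender, e) =>
{
    DownloadFile("http://www.fxp.co.il/forumdisplay.php?f=2709", @"C:\tmph.html");
    // ...then parse C:\tmph.html with HtmlAgilityPack for unread "buy" posts
};
pollTimer.Start();
```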


Solution

  • You can use a longer polling interval and issue HEAD requests, which return only the response headers instead of the entire document. Compare the returned headers (e.g. Last-Modified or ETag) with those from the previous poll, and only do the full GET when they differ.
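A minimal sketch of that idea, assuming the server actually sends a `Last-Modified` or `ETag` header (dynamic forum pages often don't, in which case comparing `Content-Length` or simply lengthening the interval is the fallback):

```csharp
using System;
using System.Net;

class ChangePoller
{
    private string lastModified;   // Last-Modified header from the previous poll
    private string lastETag;       // ETag header from the previous poll

    // Issues a HEAD request (headers only, no body) and returns true when
    // the validator headers differ from the previous poll, i.e. the page
    // has probably changed and a full GET is worthwhile.
    public bool HasChanged(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        request.UserAgent = "Mozilla/5.0 ...";   // use the same UA as the GET

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            string modified = response.Headers["Last-Modified"];
            string etag = response.Headers["ETag"];

            bool changed = modified != lastModified || etag != lastETag;
            lastModified = modified;
            lastETag = etag;
            return changed;
        }
    }
}
```

In the timer handler you would then call `HasChanged(...)` first and only call `DownloadFile(...)` when it returns true, so most polls cost a few hundred bytes of headers instead of the whole page.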