I get several marketing emails with url links that get redirected from site to site to site. I'd like to write a program to track each URL redirect using Delphi and Indy. I'd like to traverse each URL, record the full QueryString and any Cookies that may have been set during the process.
How do I do this using the Indy components that come with D2010?
First of all you need an HTTP client, which is TIdHTTP in Indy.
Now you will need a data structure that will hold your results:
    type
      TRedirection = record
        queryString: String;
        cookies: TStrings;
      end;
      TRedirectionArray = array of TRedirection;
Create a class that does the work (a class is required, because the event handlers are declared as procedure of object):
    // needs: uses Classes, IdHTTP, IdCookie, IdCookieManager;
    type
      TRedirectionTester = class
      private
        FRedirData: TRedirectionArray;
        procedure redirectEvent(Sender: TObject; var dest: string;
          var NumRedirect: Integer; var Handled: boolean; var VMethod: TIdHTTPMethod);
        procedure newCookie(ASender: TObject; ACookie: TIdCookie; var VAccept: Boolean);
      public
        function traverseURL(url: String): TRedirectionArray;
        property RedirData: TRedirectionArray read FRedirData;
      end;
This provides basic functionality: you can call traverseURL with a URL, and it will return a TRedirectionArray with the query strings and cookies involved.
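As a side note on why a class is needed: an event declared as "procedure of object" is a method pointer, so it can only be assigned a method of some object instance, not a standalone procedure. Below is a minimal sketch of that rule using TStringList.OnChange from the RTL instead of Indy; TLogger and HandleChange are made-up names for illustration.

```pascal
program MethodPointerDemo;

{$APPTYPE CONSOLE}

uses
  Classes;

type
  // Hypothetical helper class; its only job is to host the event handler.
  TLogger = class
    procedure HandleChange(Sender: TObject);
  end;

procedure TLogger.HandleChange(Sender: TObject);
begin
  Writeln('list changed');
end;

var
  Logger: TLogger;
  List: TStringList;
begin
  Logger := TLogger.Create;
  List := TStringList.Create;
  try
    // TNotifyEvent = procedure(Sender: TObject) of object,
    // so the handler must be a method of an instance.
    List.OnChange := Logger.HandleChange;
    List.Add('x'); // modifying the list fires OnChange
  finally
    List.Free;
    Logger.Free;
  end;
end.
```

The Indy events used here (OnRedirect, OnNewCookie) follow the same pattern, only with longer parameter lists.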
Then implement the OnRedirect event:
    procedure TRedirectionTester.redirectEvent(Sender: TObject; var dest: string;
      var NumRedirect: Integer; var Handled: boolean; var VMethod: TIdHTTPMethod);
    var
      redirDataLength: Integer;
    begin
      Handled := True;
      redirDataLength := Length(FRedirData);
      SetLength(FRedirData, redirDataLength + 1);
      FRedirData[redirDataLength].queryString := dest;
      FRedirData[redirDataLength].cookies := TStringList.Create;
    end;
This will add an entry in the array, and store the querystring of the redirection. As this redirection itself doesn't contain a cookie (cookies are set when requesting the redirected page), you can't add any cookies here yet.
That's why you will need an OnNewCookie handler:
    procedure TRedirectionTester.newCookie(ASender: TObject; ACookie: TIdCookie; var VAccept: Boolean);
    var
      lastIndex: Integer;
    begin
      VAccept := True;
      // Attach the cookie to the most recently recorded URL.
      lastIndex := High(FRedirData);
      if (lastIndex >= 0) and Assigned(FRedirData[lastIndex].cookies) then
        FRedirData[lastIndex].cookies.Add(ACookie.CookieText);
    end;
This does nothing but add the CookieText to the data set. That field contains a 'summary' of the cookie: it's the actual string data that is sent when requesting a page.
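One detail worth noting: the dest value stored in the OnRedirect handler is the whole target URL, not only the part after the '?'. If just the query string is wanted, Indy's TIdURI class can split the URL. The following is a small sketch under that assumption; ExtractQueryString is a helper name I made up, and the example.com URL is only an illustration.

```pascal
program QueryStringDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdURI;

// Hypothetical helper: returns the part of AUrl after the '?',
// or an empty string if the URL has no query string.
function ExtractQueryString(const AUrl: string): string;
var
  Uri: TIdURI;
begin
  Uri := TIdURI.Create(AUrl);
  try
    // TIdURI.Params holds the query string without the leading '?'.
    Result := Uri.Params;
  finally
    Uri.Free;
  end;
end;

begin
  // Made-up URL, used only to show the call.
  Writeln(ExtractQueryString('http://example.com/track?id=42&src=mail'));
end.
```

Calling such a helper on dest before storing it would keep only the query string in the queryString field, at the cost of losing the host and path of each hop.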
Finally, put it together by implementing the traverseURL function:
    function TRedirectionTester.traverseURL(url: String): TRedirectionArray;
    var
      traverser: TIdHTTP;
    begin
      traverser := TIdHTTP.Create();
      try
        traverser.HandleRedirects := True;
        traverser.OnRedirect := redirectEvent;
        // Owned by traverser, so it is freed together with it.
        traverser.CookieManager := TIdCookieManager.Create(traverser);
        traverser.CookieManager.OnNewCookie := newCookie;
        // Record the starting URL as entry 0.
        SetLength(FRedirData, 1);
        FRedirData[0].queryString := url;
        FRedirData[0].cookies := TStringList.Create;
        traverser.Get(url);
        // The caller is responsible for freeing the cookies list of each entry.
        Result := FRedirData;
      finally
        traverser.Free;
      end;
    end;
It doesn't do much: It creates the required objects, and assigns the event handlers. Then it adds the first url as the first redirection (even though it's not a real redirection, I added it for completeness).
The call to Get then sends the requests. It will return after the final page is located and returned by the webserver.
I tested it with http://bit.ly/Lb2Vho.
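For completeness, here is roughly how the class could be driven from a console program. This is only a sketch: it assumes the declarations above live in a unit I am calling RedirectionTester (a made-up name), and it catches the generic Exception class because Get raises an exception when a request fails.

```pascal
program TraceRedirects;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes,
  RedirectionTester; // hypothetical unit holding TRedirectionTester and its types

var
  Tester: TRedirectionTester;
  Hops: TRedirectionArray;
  I, J: Integer;
begin
  Tester := TRedirectionTester.Create;
  try
    try
      Hops := Tester.traverseURL('http://bit.ly/Lb2Vho');
    except
      on E: Exception do
      begin
        Writeln('Request failed: ', E.Message);
        // Keep whatever was recorded before the failure.
        Hops := Tester.RedirData;
      end;
    end;
    for I := 0 to High(Hops) do
    begin
      Writeln(I, ': ', Hops[I].queryString);
      for J := 0 to Hops[I].cookies.Count - 1 do
        Writeln('    cookie: ', Hops[I].cookies[J]);
      // Each entry owns a TStringList that nobody else frees.
      Hops[I].cookies.Free;
    end;
  finally
    Tester.Free;
  end;
end.
```

Entry 0 is the URL that was passed in, and every later entry is one redirect target, listed in the order the server chain produced them.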
However, this only handles redirects that are signalled by an HTTP redirect status code such as 301 or 302. As far as I know, it doesn't handle redirects that are done via <meta> tags or JavaScript.
To add that functionality, you have to check the result of the call to Get (the response body) and parse it to search for such redirects.
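As a starting point for the <meta> case, the response body can be scanned for a refresh tag. The sketch below is a naive string search, not a real HTML parser, and it ignores JavaScript redirects entirely; ExtractMetaRefreshUrl is a name I invented for this example.

```pascal
program MetaRefreshDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils;

// Hypothetical helper: returns the target of the first
// <meta http-equiv="refresh" content="...; url=..."> tag in Html,
// or an empty string if no such tag is found.
function ExtractMetaRefreshUrl(const Html: string): string;
var
  Lower, Tag: string;
  TagStart, TagEnd, UrlPos, Start, Stop: Integer;
begin
  Result := '';
  // LowerCase only touches ASCII letters, so indexes stay aligned with Html.
  Lower := LowerCase(Html);
  TagStart := Pos('<meta', Lower);
  while TagStart > 0 do
  begin
    TagEnd := PosEx('>', Lower, TagStart);
    if TagEnd = 0 then
      Exit; // unterminated tag, give up
    Tag := Copy(Lower, TagStart, TagEnd - TagStart + 1);
    UrlPos := Pos('url=', Tag);
    if (Pos('http-equiv', Tag) > 0) and (Pos('refresh', Tag) > 0) and (UrlPos > 0) then
    begin
      // First character after 'url=' in the original (case-preserved) text.
      Start := TagStart + UrlPos + 3;
      // Skip an optional opening quote around the URL itself.
      if CharInSet(Html[Start], ['''', '"']) then
        Inc(Start);
      Stop := Start;
      // The URL ends at a quote, a space, or the end of the tag.
      while (Stop <= TagEnd) and not CharInSet(Html[Stop], ['"', '''', '>', ' ']) do
        Inc(Stop);
      Result := Copy(Html, Start, Stop - Start);
      Exit;
    end;
    TagStart := PosEx('<meta', Lower, TagEnd);
  end;
end;

begin
  // Made-up HTML snippet, used only to show the call.
  Writeln(ExtractMetaRefreshUrl(
    '<META HTTP-EQUIV="Refresh" CONTENT="0; URL=http://example.com/Next?a=1">'));
end.
```

If the function returns a non-empty URL for the body returned by Get, that URL could be recorded as another entry and requested in turn; relative URLs would need to be resolved against the current page first.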