So I am currently trying to log into my account on a website using WebRequest
.
I have been reading about it to the point where I feel like I wanted to use an example to learn by trial and error.
This is the example I am using Login to website, via C#
So when I try to execute my code it returns an unhandled exception and its this one
System.Net.WebException: 'The remote server returned an error: (404) Not Found.'
I tried stepping through the code and I THINK it might be that it's trying to POST somewhere where it can't. I wanted to fix this before moving onto getting a confirmation that it successfully logged in. I changed the username and password to dummy text for the sake of this question.
What did I do wrong here and whats the most logical way of fixing this issue? Thanks in advance.
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
string formUrl = "https://secure.runescape.com/m=weblogin/login.ws"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML's form tag
string formParams = string.Format("login-username={0}&login-password={1}", "myUsername", "password");
string cookieHeader;
WebRequest req = WebRequest.Create(formUrl);
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
When you scrape a website, you have to make sure you mimic everything that happens. That includes any client-side state (Cookies) that is sent earlier before a form is POST-ed. As most sites don't like to be scraped or steered by bots they are often rather picky about what is the payload. Same is true for the site you're trying to control.
Three important things you have missed:
Fixing those omissions will result in the following code:
ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
string useragent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36";
// capture cookies, this is important!
var cookies = new CookieContainer();
// do a GET first, so you have the initial cookies neeeded
string loginUrl = "https://secure.runescape.com/m=weblogin/loginform.ws?mod=www&ssl=0&dest=community";
// HttpWebRequest
var reqLogin = (HttpWebRequest) WebRequest.Create(loginUrl);
// minimal needed settings
reqLogin.UserAgent = useragent;
reqLogin.CookieContainer = cookies;
reqLogin.Method = "GET";
var loginResp = reqLogin.GetResponse();
//loginResp.Dump(); // LinqPad testing
string formUrl = "https://secure.runescape.com/m=weblogin/login.ws"; // NOTE: This is the URL the form POSTs to, not the URL of the form (you can find this in the "action" attribute of the HTML's form tag
// in ther html the form has 3 more hidden fields, those are needed as well
string formParams = string.Format("username={0}&password={1}&mod=www&ssl=0&dest=community", "myUsername", "password");
string cookieHeader;
// notice the cast to HttpWebRequest
var req = (HttpWebRequest) WebRequest.Create(formUrl);
// put the earlier cookies back on the request
req.CookieContainer = cookies;
// the Referrer is mandatory, without it a timeout is raised
req.Headers["Referrer"] = "https://secure.runescape.com/m=weblogin/loginform.ws?mod=www&ssl=0&dest=community";
req.UserAgent = useragent;
req.ContentType = "application/x-www-form-urlencoded";
req.Method = "POST";
byte[] bytes = Encoding.ASCII.GetBytes(formParams);
req.ContentLength = bytes.Length;
using (Stream os = req.GetRequestStream())
{
os.Write(bytes, 0, bytes.Length);
}
WebResponse resp = req.GetResponse();
cookieHeader = resp.Headers["Set-cookie"];
This returns for me success. It is up to you parse the resulting HTML to plan your next steps.