I am trying to scrape a webpage using the urlread() function in MATLAB, but I've run into a problem that I haven't seen before. When I run the code:
X = urlread('http://espn.go.com/mens-college-basketball/schedule/_/date/20141114');
I get the error:
Error using urlreadwrite (line 92)
The server did not find a resource to match this request.
Error in urlread (line 36)
[s,status] = urlreadwrite(mfilename,catchErrors,url,varargin{:});
When I visit the link in my browser (http://espn.go.com/mens-college-basketball/schedule/_/date/20141114), the page loads without any problems. Does anyone have a solution to this problem?
It appears that the site is blocking the default MATLAB Rxxxxx user-agent string sent in the HTTP request.
Setting a browser-like user agent with the 'UserAgent' parameter seems to work around the limitation:
x = urlread('http://espn.go.com/mens-college-basketball/schedule/_/date/20141114', 'UserAgent', 'Mozilla/5.0');
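If you are on a release that ships webread (R2014b or later, which is an assumption about your setup), the same workaround can be sketched with weboptions. This is a minimal sketch rather than a verified fix: the 'Mozilla/5.0' string is the one used above, the 15-second timeout is an arbitrary choice, and the site may have changed how it responds since this was written.

```matlab
% Sketch: request the page with a browser-like User-Agent via webread.
% Assumes MATLAB R2014b or later, where webread and weboptions are available.
url = 'http://espn.go.com/mens-college-basketball/schedule/_/date/20141114';

% 'UserAgent' overrides the default MATLAB user-agent string.
% 'ContentType','text' asks webread to return the raw page as a char array.
% The timeout value is an arbitrary choice for illustration.
opts = weboptions('UserAgent', 'Mozilla/5.0', ...
                  'ContentType', 'text', ...
                  'Timeout', 15);

try
    html = webread(url, opts);
    fprintf('Fetched %d characters\n', numel(html));
catch err
    % Report the failure instead of stopping the script.
    fprintf('Request failed: %s\n', err.message);
end
```

Wrapping the call in try/catch lets a scraping loop over many dates continue when a single request is rejected.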