When using R curl to download a Twitter page, the page downloaded is "This browser is no longer supported"


So, I have a script that uses curl_download to download a Twitter page, and then uses read_html to get some data off of it. It used to work fine, but now, instead of downloading the proper Twitter page, it downloads this page instead:

A picture of the Twitter page with a popup stating "This browser is no longer supported. Please switch to a supported browser or disable the extension which masks your browser to continue using twitter.com. You can see a list of supported browsers in our Help Center."

I'm not sure how curl would have the wrong browser, or how to change it if it does, but this is a very new problem. The reason I am doing this is so the script can grab the number of followers from the .html file (and do a bunch of other irrelevant things with it), so if anyone happens to know a significantly easier way to do that, I am open to it. Otherwise, I'm hoping someone has seen this curl issue.

Here is my code:

library(curl)

twitter_file <- "location the file is meant to be saved"

curl_download("https://twitter.com/SelectFulton", twitter_file, quiet = TRUE)

Thank you!


Solution

  • @r2evans was correct: changing the user agent worked! This was the code I ended up using:

    withr::with_options(
      list(HTTPUserAgent = "Googlebot/2.1 (+http://www.google.com/bot.html)"),
      curl_download("https://twitter.com/SelectFulton", twitter_file, quiet = TRUE)
    )

    and there are no longer any issues. Thanks for the help!
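
For anyone who would rather not touch global options: as far as I can tell, the fix works because curl falls back to the HTTPUserAgent option when no user agent is set on the handle. A rough alternative sketch is to set the user agent on a curl handle directly, which also avoids the withr dependency. The helper name download_with_agent is made up for illustration and is not part of the curl package; new_handle(), its useragent option, and the handle argument of curl_download() are the only library pieces used.

```r
library(curl)

# Hypothetical helper (not from the original post): download `url` to
# `destfile` while sending an explicit User-Agent header.
download_with_agent <- function(url, destfile, user_agent) {
  # new_handle() accepts libcurl options by name; `useragent` sets the
  # User-Agent header for requests made with this handle only.
  h <- new_handle(useragent = user_agent)
  curl_download(url, destfile, quiet = TRUE, handle = h)
}

# Example call, using the same URL and user agent as the solution above
# (commented out because it needs network access and a real file path):
# download_with_agent(
#   "https://twitter.com/SelectFulton",
#   twitter_file,
#   "Googlebot/2.1 (+http://www.google.com/bot.html)"
# )
```

Because the user agent lives on the handle, other downloads in the same session keep the default. Whether Twitter keeps serving the full page to any particular user agent is outside your control, so treat this as a workaround and not a guarantee.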