Extracting full article text via the newsanchor package [in R]


I am using the newsanchor package in R to try to extract the full text of articles via NewsAPI. So far I have done the following:

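library(newsanchor)
# Assumes a NewsAPI key has already been configured for newsanchor
results <- get_everything(query = "Trump +Trade", language = "en")
# Data frame of article metadata, one row per article
test <- results$results_df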

This gives me a data frame with information on (at most) 100 articles. However, it does not contain the full article text. Instead, the entries look something like the following:

[1] "Tensions between China and the U.S. ratcheted up several notches over the weekend as Washington sent a warship into the disputed waters of the South China Sea. Meanwhile, Google dealt Huaweis smartphone business a crippling blow and an escalating trade war co… [+5173 chars]"

Is there a way to extract the remaining 5173 chars? I have tried reading the documentation, but I am not sure.


Solution

  • I don't think that is possible, at least not on the free plan. The Response object section of the documentation at https://newsapi.org/docs/endpoints/everything says:

    content - string

    The unformatted content of the article, where available. This is truncated to 260 chars for Developer plan users.

    So the content is restricted to 260 characters on that plan. However, test$url holds the link to each source article, which you can use to scrape the full text. Since the articles are aggregated from many different sources, though, I don't think there is a single automated way to do this that works for every site.
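
If you want to try scraping anyway, here is a minimal sketch using the rvest package (my choice; it is not part of newsanchor or NewsAPI). The helpers extract_paragraphs and scrape_article are hypothetical names, and grabbing every <p> element is only a rough heuristic: it will pick up navigation and footer text on some sites and miss the article body on others, so expect to pass a site-specific CSS selector in practice.

```r
library(rvest)

# Hypothetical helper: join the text of all elements matching `css`
# (by default every <p>) in an already parsed page.
extract_paragraphs <- function(page, css = "p") {
  paragraphs <- html_text2(html_elements(page, css))
  paragraphs <- paragraphs[nzchar(paragraphs)]  # drop empty paragraphs
  paste(paragraphs, collapse = "\n\n")
}

# Hypothetical helper: download one URL and return its paragraph text,
# or NA if the download or parsing fails.
scrape_article <- function(url, css = "p") {
  tryCatch(
    extract_paragraphs(read_html(url), css),
    error = function(e) NA_character_
  )
}

# Offline demonstration on an inline HTML snippet.
demo_page <- read_html(
  "<html><body><h1>Title</h1><p>First paragraph.</p><p>Second paragraph.</p></body></html>"
)
demo_text <- extract_paragraphs(demo_page)

# With the data frame from the question (needs network access):
# test$full_text <- vapply(test$url, scrape_article, character(1), USE.NAMES = FALSE)
```

Rows where the download fails come back as NA, so you can see which sources need special handling. Also check each site's terms of use before scraping it.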