I am an R user working on a project that involves gaining insights from Twitter data: more specifically, scraping Twitter data using the rtweet package and conducting a set of analyses on it. In addition, I have built a Shiny app based on this data for visualisation purposes.
Where I Need Further Inputs
Today, the Twitter data that I scrape is stored locally on my laptop. However, I'd like to do this differently. Ideally, I'd like to be able to achieve the following:
1) The data is scraped from Twitter using the rtweet package and stored directly on a cloud platform (AWS or Microsoft Azure, for example).
2) I'd like to define a periodicity for this scraping process (for example, once every two days), and I'd like to achieve this through some scheduling tool.
3) Eventually, I'd like my Shiny app (hosted on shinyapps.io) to be able to communicate with this cloud platform and retrieve the tweets stored in it for analysis. (A rough sketch of what I'm picturing follows this list.)
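For concreteness, something along these lines is roughly what I have in mind for steps 1 and 3. I haven't tried it yet, the bucket name is a placeholder, and I'm assuming the aws.s3 package can be used to talk to S3 from both the scraping script and the Shiny app:

```r
library(rtweet)
library(aws.s3)

# Step 1: scrape tweets and write them straight to an S3 bucket.
# "my-tweet-bucket" is a placeholder; aws.s3 reads credentials from
# environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY,
# AWS_DEFAULT_REGION).
tweets <- search_tweets("#rstats", n = 1000, include_rts = FALSE)
s3saveRDS(tweets, object = "tweets/latest.rds", bucket = "my-tweet-bucket")

# Step 3: inside the Shiny app, read the same object back for analysis.
tweets <- s3readRDS(object = "tweets/latest.rds", bucket = "my-tweet-bucket")
```

For step 2, packages such as cronR (Linux/macOS) or taskscheduleR (Windows) appear to be able to schedule an R script like the one above, but I haven't used them, and the machine running the schedule would need to stay on (or be a cloud VM).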
I have searched the Internet for solutions, but haven't found anything straightforward yet.
If anyone has experience doing this, your inputs would be highly appreciated.
You create an account at AWS. Then you create an S3 bucket. On the virtual server or machine from where you want to do the copy, you install the AWS CLI (the client for interacting with AWS resources).
Then you run the copy command and the files are copied to the cloud.
The same way back: you use the CLI to retrieve the files.
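For example, from R both directions of the copy could be done by calling the CLI (the bucket name and file name below are placeholders, and the CLI must first be set up with `aws configure`):

```r
# Copy a locally saved tweets file up to an S3 bucket, then pull it back down.
# "my-tweet-bucket" and "tweets.rds" are placeholder names.
system("aws s3 cp tweets.rds s3://my-tweet-bucket/tweets.rds")  # upload
system("aws s3 cp s3://my-tweet-bucket/tweets.rds tweets.rds")  # download
```

Note that the Shiny app on shinyapps.io probably can't rely on the AWS CLI being installed, so for the retrieval step an R package such as aws.s3 (e.g. s3readRDS()) is likely the easier way to pull the data into the app.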