Split dataset file in parts of a specific size

I want to analyze this dataset on a system that limits imports to 100 MB at a time.

How can I split a dataset by rows into parts of at most 100 MB each?

Solution

Read the dataset.
Split the dataset into 14 chunks (with 13 chunks, one file was still over 100 MB).
Save each chunk as a CSV file using purrr.

Here is the script I used:

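# Read the full dataset.
trade <- read.csv("commodity_trade_statistics_data.csv")

# Number of parts; 13 still left one file over 100 MB.
no_of_chunks <- 14

# Assign each row a chunk index from 1 to no_of_chunks.
f <- ceiling(seq_len(nrow(trade)) / nrow(trade) * no_of_chunks)

# Split the data frame by chunk index.
res <- split(trade, f)

library(purrr)
# Write each chunk to its own CSV file (walk2 is used for the side effect only).
walk2(res, paste0("chunk_", names(res), ".csv"), write.csv)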
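The chunk count above was found by trial and error. As a rough sketch, a starting value could instead be estimated from the size of the CSV on disk. The helper name chunks_needed and its max_mb and safety parameters are my own assumptions, not part of the original script, and the estimate is only approximate because rows differ in length and write.csv adds a header and row names to every file.

```r
# Hypothetical helper (not from the original answer): estimate the number of
# chunks from the CSV's size on disk.
# max_mb : size limit per part, in MB (assumes 1 MB = 1024^2 bytes)
# safety : fraction of the limit to aim for, leaving headroom because rows
#          differ in length and each written file gets its own header
chunks_needed <- function(path, max_mb = 100, safety = 0.9) {
  size_mb <- file.size(path) / 1024^2
  max(1L, as.integer(ceiling(size_mb / (max_mb * safety))))
}

# Small self-contained demonstration on a temporary file.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:50000, value = rnorm(50000)), tmp, row.names = FALSE)
n_demo <- chunks_needed(tmp, max_mb = 0.5)
```

With the real file, no_of_chunks <- chunks_needed("commodity_trade_statistics_data.csv") would give a first guess. It is still worth checking the written files with file.size() and raising the count if any part exceeds the limit.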
