I am loading some large .csv files on an RStudio server; together they require circa 73 GB of RAM. The server runs Ubuntu 18.04.5 LTS with 125 GB of RAM, so in theory I should be able to load the data into memory and run computations that need even more. However, RStudio Server refuses with the following error message:
"Error: cannot allocate vector of size 3.9 Gb".
Looking at the output of free -mh, there appear to be 42 GB of available memory, but RStudio does not utilise it.
~:$ free -mh
              total        used        free      shared  buff/cache   available
Mem:           125G         81G        1.2G         28M         43G         42G
Swap:          979M        4.0M        975M
Why is this happening, and how can I utilise these 42 GB for computations?
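As a quick sanity check from inside the R session, one can compare R's own memory accounting with what the OS reports and rule out a ulimit cap (a minimal diagnostic sketch, not specific to my setup):

gc()                 # R's own bookkeeping: Ncells/Vcells currently in use and the maximum used so far
system("free -mh")   # OS-level view, same as the shell output above, but run from inside the rsession process
system("ulimit -v")  # virtual-memory limit inherited from rsession; "unlimited" rules out a ulimit cap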
top output:
top - 11:05:58 up 21:19, 1 user, load average: 0.24, 0.22, 0.22
Tasks: 307 total, 1 running, 201 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.6 us, 0.2 sy, 0.1 ni, 99.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13170917+total, 1216824 free, 85389312 used, 45103040 buff/cache
KiB Swap: 1003516 total, 999420 free, 4096 used. 45033504 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4292 user1 30 10 39956 5052 4272 S 2.6 0.0 0:06.94 galaxy
1171 root 20 0 488892 44216 7540 S 1.3 0.0 12:24.39 cbdaemon
5998 user1 20 0 44428 3936 3168 R 1.0 0.0 0:00.07 top
6153 user1 20 0 275740 101112 29396 S 0.7 0.1 25:55.97 nxagent
1207 root 20 0 377116 163944 68132 S 0.3 0.1 5:37.13 evl_manager
1247 root 20 0 1262092 63308 14956 S 0.3 0.0 0:35.77 mcollectived
6623 user1 20 0 126640 5852 3212 S 0.3 0.0 0:02.48 sshd
7413 user1 20 0 80.623g 0.078t 53712 S 0.3 63.8 20:14.46 rsession
1 root 20 0 225756 8928 6320 S 0.0 0.0 0:23.05 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
...
Surprisingly, the problem was not with the OS or with R/RStudio Server, but with data.table, which I was using to read the .csv files. Apparently it prevented the session from using all of the available memory. Once I moved to dplyr, I had no problem utilising all the memory the system has.
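For illustration, a minimal sketch of the change (the file path is a placeholder, and readr::read_csv is assumed as the reader in a dplyr/tidyverse workflow):

# library(data.table)
# dt <- fread(path)            # original approach that ran into the allocation error

library(readr)                 # read_csv() returns a tibble that works directly with dplyr verbs
path <- "data/big_file.csv"    # placeholder path; the real files are not named in the post
df <- read_csv(path)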