I am using a Lenovo laptop (CPU @ 2.20GHz, 7.86 GB of usable memory, 64-bit Windows 8). I work in RStudio with datasets that usually have over 250,000 rows. My function reads a table (called ppt), goes through all of its rows, and makes decisions through the statements in the body of the while loop:
while (i < (length(ppt[,1]) - 192)) {
print(i)
.
.
.
.
i = i+1
}
After the code ran for some hours without finishing, I inserted print(i) in the function to trace it. For a table with 294991 rows (size = 6.17MB), i goes from 20 to 270781 in about 14 seconds; then the output stops and no more values of i are printed, so I assume the code is no longer advancing through the rows but is still running. In fact, I have to hit STOP in order to continue working in RStudio.
Then I deleted some rows from the dataset, leaving 147635 rows. The same thing happens, but now i goes from 20 to 147400 (in about 8 seconds), and the code still seems to be running while printing no more values of i.
I shortened the data further, to 37000 rows. Now it goes all the way to the last row and finishes running.
Sample data:
> ppt<- read.csv("Flow_pptJoint - Copy - Copy.csv")
> ppt[60:70,]
date precip flow NA.
60 12/1/2003 14:45 NA 85 NA
61 12/1/2003 15:00 NA 85 NA
62 12/1/2003 15:15 NA 85 NA
63 12/1/2003 15:30 NA 85 NA
64 12/1/2003 15:45 NA 85 NA
65 12/1/2003 16:00 NA 83 NA
66 12/1/2003 16:15 NA 83 NA
67 12/1/2003 16:30 NA 83 NA
68 12/1/2003 16:45 NA 83 NA
69 12/1/2003 17:00 NA 83 NA
70 12/1/2003 17:15 NA 83 NA
I was wondering whether this could be a memory problem and, if so, how I could approach the issue.
Given your hardware, it seems unlikely that you are facing a memory issue (by the way, it is generally expected that you give the number of columns as well as rows, to give a more accurate idea of the size of the data). Also, memory issues generally end with an explicit error such as "Error: cannot allocate vector of size ..." or "bad alloc", or something of the sort.
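As a quick way to report the size of the data, something like the following could be run. It is only a sketch: the data frame built here is a hypothetical stand-in with the row count from the question and made-up constant values, since the real ppt comes from read.csv().

```r
# Hypothetical stand-in for ppt (assumption: three columns like the sample data;
# the real table is read from the CSV file in the question).
n <- 294991
ppt <- data.frame(
  date   = rep("12/1/2003 14:45", n),
  precip = NA_real_,
  flow   = rep(85L, n),
  stringsAsFactors = FALSE
)

dims <- dim(ppt)                                   # number of rows and columns
size_mb <- as.numeric(object.size(ppt)) / 1024^2   # in-memory size in MB

print(dims)
print(format(object.size(ppt), units = "Mb"))
```

A table of this shape takes only a few megabytes in memory, which is tiny compared with several gigabytes of RAM.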
This seems rather like an endless loop. Check your while statements and the specific rows of data that they get stuck on.
One way to do this is to put a browser() statement in the iteration of the loop that gets stuck.
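A minimal sketch of that idea, under stated assumptions: trace_loop, stuck_at, and max_steps are hypothetical names, the loop body is only a placeholder for the elided decision logic, and the step counter is an extra safeguard that is not part of the original code. The loop pauses in the debugger at the last index that was printed and aborts with an error if it runs for more steps than expected, which would point to i not advancing.

```r
# Hypothetical skeleton of the loop from the question; the real decision
# logic is not shown there, so a placeholder comment stands in for it.
trace_loop <- function(ppt, stuck_at = 270781L, max_steps = 1e7) {
  i <- 20L                 # the question reports that i starts at 20
  steps <- 0L
  n <- nrow(ppt)
  while (i < (n - 192)) {
    steps <- steps + 1L
    # Pause in the debugger at the suspicious row (interactive sessions only).
    if (i == stuck_at && interactive()) browser()
    # Safeguard (an addition, not in the original): fail loudly instead of
    # spinning forever if the loop takes more steps than expected.
    if (steps > max_steps) {
      stop("loop exceeded ", max_steps, " steps at i = ", i)
    }
    # ... decision logic would go here ...
    i <- i + 1L
  }
  i
}
```

Inside the browser prompt, typing n steps through the body one statement at a time, so any inner loop or branch that never lets i advance becomes visible.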
Also, explicit loops are generally quite inefficient in R. When possible, consider other approaches (maybe ddply with a custom function that computes the statements?).
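To illustrate what replacing a loop can look like, here is a small sketch that uses plain base R vectorization rather than ddply. The rule it computes (flag each row whose flow differs from the previous row) is purely hypothetical, since the actual decision logic is not shown; the flow values are the ones from rows 60 to 70 of the sample data.

```r
# Flow values from rows 60-70 of the sample data in the question.
flow <- c(85, 85, 85, 85, 85, 83, 83, 83, 83, 83, 83)

# Loop version of a hypothetical rule: did flow change from the previous row?
changed_loop <- logical(length(flow))
for (i in 2:length(flow)) {
  changed_loop[i] <- flow[i] != flow[i - 1]
}

# Vectorized version of the same rule: diff() compares all neighbouring
# rows at once; the first row has no predecessor, so it is set to FALSE.
changed_vec <- c(FALSE, diff(flow) != 0)

print(which(changed_vec))
```

Both versions give the same result, but the vectorized one avoids the per-row interpreter overhead. Columns that contain NA values, such as precip in the sample, would need explicit handling in either version.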