I have a huge data frame with dimensions (58556185 x 2):
user page like
1 A 1
1 B 1
1 C 1
2 A 1
2 C 1
3 B 1
. . .
There are 100,000 unique users and 50,000 unique pages. I want to spread it into a user-by-page matrix:
user/page
A B C ...
1 1 1 0 ...
2 1 0 1 ...
3 0 1 0 ...
.
.
I have used this code, and it works for a small dataset:
library(dplyr)
library(tidyr)

data <- data %>%
  group_by(user) %>%
  spread(page, like, fill = 0, drop = TRUE)
But when I apply it to the huge data frame, it fails with: Error: cannot allocate vector of size 21626.2 Gb
Any suggestions? Thanks
I used a sparse matrix to solve this problem. It stores only the non-zero entries, so the full user-by-page table never has to be allocated.
library(Matrix)
mat <- sparseMatrix(i = as.integer(factor(data.fbpage$uid)), j = as.integer(factor(data.fbpage$pageId)), x = data.fbpage$like)
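
For anyone who wants to try this on something small first, here is a minimal sketch of the same idea on the toy data from the question. The data frame and its column names (user, page, like) are assumptions taken from the example table, not the real data.fbpage columns, and keeping the factors in separate variables is just one way to carry the original ids over as row and column names.

```r
library(Matrix)

# Toy data mirroring the example in the question (column names are assumed).
data <- data.frame(
  user = c(1, 1, 1, 2, 2, 3),
  page = c("A", "B", "C", "A", "C", "B"),
  like = 1
)

# Build the factors once so the integer codes and the labels stay in sync.
user_f <- factor(data$user)
page_f <- factor(data$page)

# Rows are users, columns are pages; only the observed likes are stored.
mat <- sparseMatrix(
  i = as.integer(user_f),
  j = as.integer(page_f),
  x = data$like,
  dims = c(nlevels(user_f), nlevels(page_f)),
  dimnames = list(levels(user_f), levels(page_f))
)

print(mat)
```

Note that sparseMatrix adds up the x values of duplicated (i, j) pairs by default, so if the same user/page pair can appear more than once, it may be worth de-duplicating the data frame first.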