Error in running a spread because of unique 'key combinations'; combining rows of data

I'm looking to do a group_by on Google Analytics data where I have a unique user identifier, URL the person went to, and then the number of times the user went to that page.

The data come from Google Analytics like this:

ID          Page                  Pageviews
abc123      example.com/pagea     2 
qwer123     example.com/pageb     3 
abc123      example.com/pageb     4
qwer123     example.com/pagec     5 
uiop123     example.com/pagea     6

I'm trying to flip it into

ID        example.com/pagea    example.com/pageb    example.com/pagec
abc123    2                    4                    0
qwer123   0                    3                    5
uiop123   6                    0                    0

However, when I use spread, I get an error: Error: Each row of output must be identified by a unique combination of keys.

The command I'm running is: df <- data %>% spread(Page, Pageviews, fill = 0)

Here's where I think I'm causing the issue: Before I do the spread, I am removing some data from the URLs to normalize the URLs (basically removing query strings). So before I do the spread, I think I need to consolidate where I have the same ID and Page and then add the combined Pageviews, so rather than a two rows, I now have 1.

Basically, I think I need to go to the first part of the data and turn instances of :

ID          Page                  Pageviews
abc123      example.com/pagea     2 
abc123      example.com/pagea     3

into

ID          Page                  Pageviews
abc123      example.com/pagea     5

What's the least painful way to do that?

Solution

Use dplyr first :

library(dplyr)
library(tidyr)
df <- data %>% group_by(ID,Page) %>%
summarise(Pageviews = sum(Pageviews,na.rm=T) %>%
spread(Page, Pageviews, fill = 0)