Create rows in a data frame based on other rows and column combination in R


I have a problem with a data frame in R. My data has two dimensions and one metric, but some combinations of categories have no rows. The data looks like this:

      interestAffinityCategory userGender users
1                 Music Lovers       male   198
2                 Music Lovers     female   190
3  News Junkies & Avid Readers       male   134
4  News Junkies & Avid Readers     female   115
5                  Sports Fans       male   109
6                 Movie Lovers       male   108
7                 Technophiles       male    93
8                    TV Lovers       male    88
9                    TV Lovers     female    79
10                Technophiles     female    70

For example, Sports Fans only has data for the male gender. I need every category and gender combination to be present, with 0 in the users column where there is no data, e.g. Sports Fans, female, 0. This is how my data needs to look (see rows 6 and 8):

      interestAffinityCategory userGender users
1                 Music Lovers       male   198
2                 Music Lovers     female   190
3  News Junkies & Avid Readers       male   134
4  News Junkies & Avid Readers     female   115
5                  Sports Fans       male   109
6                  Sports Fans     female     0
7                 Movie Lovers       male   108
8                 Movie Lovers     female     0
9                 Technophiles       male    93
10                   TV Lovers       male    88
11                   TV Lovers     female    79
12                Technophiles     female    70

I tried to find a solution, but I only found similar cases with a single dimension, and those approaches didn't work for me.

P.S.: This data comes from the Google Analytics API. I want to take the top 10 categories and plot visits by gender, but for that I need a row for every combination of category and gender, even those with 0 visits.


Solution

  • You should use the complete function from tidyr. The first argument is your data; the second and third are the columns whose possible combinations you want to generate (if you have more, just list them one by one); and fill is a named list giving the default value to use for each column in the newly created rows.

    library(tidyr)
    complete(data, interestAffinityCategory, userGender, fill = list(users = 0))
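
To try the call above end to end, here is a minimal, self-contained sketch. It rebuilds the sample data from the question by hand; the names `data` and `result` are only illustrative, and users is stored as a plain numeric column here, which may differ from what the Google Analytics API actually returns.

```r
library(tidyr)

# Sample data copied from the question (users stored as numeric; an assumption)
data <- data.frame(
  interestAffinityCategory = c(
    "Music Lovers", "Music Lovers",
    "News Junkies & Avid Readers", "News Junkies & Avid Readers",
    "Sports Fans", "Movie Lovers", "Technophiles",
    "TV Lovers", "TV Lovers", "Technophiles"
  ),
  userGender = c("male", "female", "male", "female", "male",
                 "male", "male", "male", "female", "female"),
  users = c(198, 190, 134, 115, 109, 108, 93, 88, 79, 70),
  stringsAsFactors = FALSE
)

# Add one row per missing category/gender pair, with users set to 0
result <- complete(data, interestAffinityCategory, userGender,
                   fill = list(users = 0))

print(result)
```

The result should have one row for each of the 6 categories crossed with the 2 genders, with the Sports Fans/female and Movie Lovers/female rows filled with 0. Note that complete may return the rows in a different order than the input, so re-sort afterwards if the original ranking by users matters for the chart.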