
Error While Fetching Data From Google Analytics API from R


I am able to retrieve data from the MCF API using the RGA library. Here is the query:

temp_data <- get_mcf(profileId = "xxxxxxxxx", start.date = "2017-01-09",
     end.date = "2017-01-31", metrics = "mcf:totalConversions",
     dimensions = "mcf:sourceMediumPath", sort = NULL,
     filters = "mcf:conversionType==Transaction",
     samplingLevel = NULL, start.index = 1, max.results = 100000)

The above query fetches 14,836 rows of data. When I try to increase the date range, I get this error:

Error: Server error: (500) Internal Server Error Response too large: Internal Error

Is there any workaround?


Solution

  • If you check the documentation for the MCF API, you will find that max-results defaults to 1,000 rows and that the API returns at most 10,000 rows per request.

    max-results

    max-results=100 Optional. Maximum number of rows to include in this response. You can use this in combination with start-index to retrieve a subset of elements, or use it alone to restrict the number of returned elements, starting with the first. If max-results is not supplied, the query returns the default maximum of 1000 rows.

    The Multi-Channel Funnels Reporting API returns a maximum of 10,000 rows per request, no matter how many you ask for. It can also return fewer rows than requested, if there aren't as many dimension segments as you expect. For instance, there are fewer than 300 possible values for mcf:medium, so when segmenting only by medium, you can't get more than 300 rows, even if you set max-results to a higher value.

    You should use nextLink to retrieve the next set of data if your response contains more than 10,000 rows (see the R sketch at the end of this answer).

    Update: Out of curiosity I contacted the Google Analytics API team, since I thought it strange that you were getting more rows back than you should based upon the documentation. This is the response I got back:

    To me it sounds like the developer needs to just shorten the date range to not get a 500 server timeout. I don't know how he knows how many rows a query will return when he is getting a 500 response, so I think there is a bit of confusion in his question still. As far as I know we have not changed the number of rows allowed in the response, but we still need to construct the full response on our side and sort, so if the number of rows is large and the CPU usage on the server is heavy during his request he will easily get a 500 timeout error.

    That being said, I have asked the Backend team if anything has changed about the 10k limit recently.

    - google dev who shall not be named -
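
    As a workaround, the query can be paged with start.index in chunks of at
    most 10,000 rows and, if the full date range is still too large for the
    server, split into shorter date ranges. The sketch below is a minimal
    example assuming the same RGA package and get_mcf() arguments shown in the
    question; the fetch_mcf_paged() helper is hypothetical, not part of RGA.

        library(RGA)

        # Hypothetical helper: page through MCF results page_size rows at a
        # time using start.index/max.results, the same arguments used in the
        # question. 10,000 is the per-request cap quoted from the MCF docs.
        fetch_mcf_paged <- function(profile_id, start_date, end_date,
                                    page_size = 10000) {
          start_index <- 1
          pages <- list()
          repeat {
            page <- get_mcf(profileId = profile_id,
                            start.date = start_date, end.date = end_date,
                            metrics = "mcf:totalConversions",
                            dimensions = "mcf:sourceMediumPath",
                            filters = "mcf:conversionType==Transaction",
                            start.index = start_index, max.results = page_size)
            # Assumes get_mcf() returns a data frame; stop when no rows come back.
            if (is.null(page) || nrow(page) == 0) break
            pages[[length(pages) + 1]] <- page
            if (nrow(page) < page_size) break  # last (partial) page reached
            start_index <- start_index + page_size
          }
          do.call(rbind, pages)
        }

        # Splitting the report into month-sized date ranges keeps each
        # response small enough to avoid the "Response too large" 500 error.
        months <- list(c("2017-01-01", "2017-01-31"),
                       c("2017-02-01", "2017-02-28"))
        temp_data <- do.call(rbind, lapply(months, function(d)
          fetch_mcf_paged("xxxxxxxxx", d[1], d[2])))

    If get_mcf() errors rather than returning zero rows once start.index runs
    past the last row, wrap the call in tryCatch() and treat that error as the
    end of the data.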