unix, google-analytics, universal-analytics

Which utility or UNIX command can submit event data in bulk to Google Universal Analytics?


https://developers.google.com/analytics/devguides/collection/protocol/v1/reference

The link above documents the Measurement Protocol. Suppose I have a CSV file with columns such as EventName, ClientID, etc., and I'd like to submit it to Universal Analytics. Is there a UNIX command, utility, or third-party tool that lets me submit that data from the command line or through a friendlier UI?


Solution

  • I'm not a bash wizard myself, so there are surely many ways to improve this (I adapted an example I found on the web), but here is a bare-bones example.

    To group the hits into sessions (if applicable) you need a client id. Client id is a mandatory parameter, but if you want to log each row from your file as a new session you can use a random number for the cid parameter.
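    As a minimal sketch of the "random number for the cid" idea, assuming bash (which provides the $RANDOM variable): the random_cid helper below is hypothetical and only illustrates one way to get a fresh value per row.

```shell
#!/bin/bash
# Hypothetical helper: store a random 10-digit client id in the variable cid.
# Each $RANDOM is an integer in 0..32767, so collisions are unlikely for a
# small import but not impossible.
random_cid() {
    printf -v cid '%05d%05d' "$RANDOM" "$RANDOM"
}

random_cid          # call once per row to start a new session
echo "cid=$cid"
```

    Any string is accepted as a cid, so a UUID generator could be swapped in if one is available on the system.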

    However, the example assumes that the first column of your csv file contains a value that can be used as the cid. Be aware that a session can hold at most 500 hits (after that you need to switch the cid), and that there is a limit of 20 hits per session that is replenished at 2 hits per second, so you will probably want to build a delay into your script.
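    The two limits mentioned above could be handled roughly as follows. This is only a sketch under assumptions: all rows share one base id, the cid_for_hit helper and the "visitor42" id are made up for illustration, and a sleep that accepts fractional seconds (as GNU sleep does) is available.

```shell
#!/bin/bash
MAX_HITS=500   # hits allowed per session, per the text above
DELAY=0.5      # seconds between hits, i.e. 2 hits per second

# Hypothetical helper: derive the cid for the Nth hit (counting from 0)
# of a base id, so that the cid changes after every MAX_HITS hits.
cid_for_hit() {
    local base=$1 n=$2
    echo "${base}-$(( n / MAX_HITS ))"
}

n=0
for row in first second third; do    # stand-in for the rows of the csv file
    echo "hit $n uses cid $(cid_for_hit visitor42 "$n")"
    n=$(( n + 1 ))
    sleep "$DELAY"                    # throttle before sending the next hit
done
```

    With this scheme hits 0 to 499 go to visitor42-0, hits 500 to 999 to visitor42-1, and so on.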

    The example assumes a csv file with a semicolon as the delimiter (this can be adjusted in the IFS variable). It also assumes three columns: one for the cid, one for the page path, and one for the document title. If you have more than three columns, the last variable (pagetitle) will consume all remaining columns, so in that case append the extra column names to the line that starts with "while".
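    To see how the field splitting behaves, here is a small stand-alone sketch. The sample row is invented for illustration; it has four fields so that the "last variable takes the rest" effect is visible.

```shell
#!/bin/bash
# A made-up row with four fields, one more than the three variables below
row='12345;/home;Home page;extra'

# IFS is set only for this one read, so the shell's default stays untouched
IFS=';' read -r cid page pagetitle <<< "$row"

echo "cid=$cid"               # cid=12345
echo "page=$page"             # page=/home
echo "pagetitle=$pagetitle"   # pagetitle=Home page;extra
```

    Adding a fourth variable name to the read would put "extra" into its own variable instead.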

    Then the script simply builds a URL (the variables from the line that starts with "while" are identified by a dollar sign in front of the name) and uses wget to call the Google tracking server (the server returns a gif image, which wget saves to disk by default; the -O /dev/null option tells wget to discard it).

    #!/bin/bash
    UAID="UA-XXXXX-XX" # Google Analytics Account ID
    INPUT=data.csv # Input file name
    OLDIFS=$IFS # store default field separator
    IFS=';' # set csv delimiter (quoted, because a bare ; would end the command)
    [ ! -f "$INPUT" ] && { echo "$INPUT file not found"; exit 99; } # nice error message if input file is missing
    while read -r cid page pagetitle # while there are rows in the csv read fields
    do
        wget "www.google-analytics.com/collect?v=1&tid=$UAID&cid=$cid&t=pageview&dp=$page&dt=$pagetitle" # call Google Tracking server
    done < "$INPUT" # no more rows
    IFS=$OLDIFS # restore default field separator
    

    Obviously you'd have to make this script executable. I tested this (recent Debian/bash), so I'm fairly sure it will work. It might not be very efficient, though.
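    One thing the script above does not handle: the csv values go into the URL unchanged, so a space or an ampersand in a page title would produce a malformed request. Here is a minimal sketch of percent-encoding in plain bash, assuming ASCII-only input. The urlencode helper is hypothetical and not part of the example I adapted.

```shell
#!/bin/bash
# Hypothetical helper: percent-encode a string for use as a URL parameter value.
# Assumes ASCII input; multi-byte characters are not handled in this sketch.
urlencode() {
    local s=$1 out='' c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        case $c in
            [a-zA-Z0-9.~_-]) out+=$c ;;                 # unreserved characters pass through
            *) printf -v c '%%%02X' "'$c"; out+=$c ;;   # everything else becomes %XX
        esac
    done
    printf '%s\n' "$out"
}

urlencode "/my page"      # prints %2Fmy%20page
urlencode "Home & News"   # prints Home%20%26%20News
```

    In the loop, the wget call would then use $(urlencode "$page") and $(urlencode "$pagetitle") in place of $page and $pagetitle.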