Monday 6 July 2015

What happened next?

Google Analytics, for good reason, likes to deal with session level data. It wants you to ask 'Show me sessions where x occurred'. This is because session level data is an order of magnitude smaller than 'hit' level data. Processing a lot of hit level data is a big ask, especially in a service that half the internet uses without paying a cent.

But still, sometimes you want to know hit level data - so what's an analyst to do? Custom reports go some of the way, as do custom funnel reports.  However, I was recently asked - how long does it take people on average to go from page A to page B?

Enter big query.

With Google Analytics Premium you get a daily export of your hit level data that you can query to your hearts content. I came up with the query below which will give you a summary of:
  • How many times page A was hit
  • How many times any page was hit after page A
  • How many people exit from page A
  • How many times page B was hit directly after page A
  • What was the average time in seconds between page A and page B for the same user / session
If you've got big query access, I'd encourage you to give it a go. And feel free to steal this query! Brutal critique also welcome. 

SELECT [date], COUNT(hits.page.pagePath) AS pointA, COUNT(nextHitPagePath) AS pointB, AVG(timeToNextPage) AS averageTimeToPointB, COUNT(CASE WHEN nextHitPagePath LIKE '%POINT B%' THEN 1 ELSE NULL END) AS pointBViews, COUNT(CASE WHEN nextHitPagePath IS NULL THEN 1 ELSE NULL END) AS exits, FROM ( SELECT hits.page.pagePath, hits.hitNumber, fullVisitorId, visitId, nextHitPagePath, [date], ((nextHitTime - hits.time) / 1000) AS timeToNextPage FROM ( SELECT [date], hits.page.pagePath, hits.time, hits.hitNumber, fullVisitorId, visitId, LAG (hits.hitNumber, 1) OVER (PARTITION BY fullVisitorId, visitId ORDER BY hits.time DESC) AS nextHitNumber, LAG (hits.page.pagePath, 1) OVER (PARTITION BY fullVisitorId, visitId ORDER BY hits.time DESC) AS nextHitPagePath, LAG (hits.time, 1) OVER (PARTITION BY fullVisitorId, visitId ORDER BY hits.time DESC) AS nextHitTime FROM (TABLE_DATE_RANGE([93475458.ga_sessions_], TIMESTAMP('2015-05-16'), TIMESTAMP('2015-05-31'))) WHERE hits.type = 'PAGE') WHERE hits.page.pagePath LIKE '%POINT A%') GROUP EACH BY [date]