Google Analytics Data Analysis

The Context

The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website.

The Challenge

Our goal is to analyze the Google Analytics data to answer several key questions that can provide insights into user behavior, website performance, and marketing effectiveness. The questions we aim to answer are:

The Approach

We will use SQL queries to extract and analyze the relevant data from the Google Analytics dataset. The dataset contains information about user interactions, device details, traffic sources, and transaction data. Each query will be designed to answer one of the specific questions listed above.


1. Total Number of Transactions Generated per Device Browser

SELECT
    device.browser AS browser,
    COUNT(totals.transactions) AS total_transactions
FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
    AND totals.transactions IS NOT NULL
GROUP BY
    device.browser
ORDER BY
    total_transactions DESC;
Browser Total Transactions
Chrome 944
Safari 65
Firefox 12
Internet Explorer 7
Edge 2
Opera 1

Insights

Actions


2. Real Bounce Rate per Traffic Source

SELECT
    trafficSource.source AS traffic_source,
    COUNTIF(totals.pageviews = 1) / COUNT(*) AS real_bounce_rate
FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
GROUP BY
    traffic_source
ORDER BY
    real_bounce_rate DESC;
Traffic Source Real Bounce Rate
es.search.yahoo.com 1
github.com 1
mx.search.yahoo.com 1
... ...

Insights

Actions


3. Average Number of Product Pageviews for Users Who Made a Purchase

SELECT
    AVG(totals.pageviews) AS avg_product_pageviews
FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
    AND totals.transactions IS NOT NULL;
Average Product Pageviews
25.92

Insights

Actions


4. Average Number of Product Pageviews for Users Who Did Not Make a Purchase

SELECT
    AVG(totals.pageviews) AS avg_product_pageviews
FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
    AND totals.transactions IS NULL;
Average Product Pageviews
3.45

Insights

Actions


5. Average Number of Product Pageviews for Users That Made a Purchase

SELECT
    AVG(totals.transactions) AS avg_transactions_per_user
FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
    AND totals.transactions IS NOT NULL;
Average Transactions per User
1.04

Insights

Actions


6. Average Amount of Money Spent per Session

SELECT
    AVG(totals.transactionRevenue) / 1000000 AS avg_revenue_per_session
FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
    AND totals.transactionRevenue IS NOT NULL;
Average Revenue per Session
120.76

Insights

Actions


7. Sequence of Pages Viewed


A. Create a Temporary Table

-- Create a temporary table with the initial data
CREATE TEMP TABLE temp_ga_sessions AS
SELECT
    fullVisitorId,
    visitId,
    visitStartTime,
    hits.page.pagePath AS page_path,
    hits.hitNumber AS hit_number
FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`,
    UNNEST(hits) AS hits
WHERE
    _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
ORDER BY
    fullVisitorId,
    visitStartTime,
    hit_number;

B. Top Landing Pages

-- Find the top landing pages
SELECT
    page_path,
    COUNT(*) AS session_count
FROM
    temp_ga_sessions
WHERE
    hit_number = 1
GROUP BY
    page_path
ORDER BY
    session_count DESC
LIMIT 5;
Top Landing Pages Session Count
/home 34,883
/google+redesign/shop+by+brand/youtube 17,300
/google+redesign/apparel/mens/mens+t+shirts 2,967
/signin.html 1,473
/google+redesign/apparel 932
... ...
Insights

C. Top Exit Pages

-- Find the top exit pages
WITH last_hits AS (
    SELECT
        fullVisitorId,
        visitId,
        MAX(hit_number) AS last_hit_number
    FROM
        temp_ga_sessions
    GROUP BY
        fullVisitorId,
        visitId
)
SELECT
    s.page_path,
    COUNT(*) AS session_count
FROM
    temp_ga_sessions s
JOIN
    last_hits lh
ON
    s.fullVisitorId = lh.fullVisitorId
    AND s.visitId = lh.visitId
    AND s.hit_number = lh.last_hit_number
GROUP BY
    s.page_path
ORDER BY
    session_count DESC
LIMIT 5;
Top Exit Pages Session Count
/home 23,210
/google+redesign/shop+by+brand/youtube 12,747
/google+redesign/apparel/mens/mens+t+shirts 2,760
/asearch.html 2,044
/basket.html 1,905
... ...
Insights:

D. Common Navigation Paths

-- Find the most common navigation paths
WITH paths AS (
    SELECT
        fullVisitorId,
        visitId,
        STRING_AGG(page_path, ' -> ' ORDER BY hit_number) AS navigation_path
    FROM
        temp_ga_sessions
    GROUP BY
        fullVisitorId,
        visitId
)
SELECT
    navigation_path,
    COUNT(*) AS path_count
FROM
    paths
GROUP BY
    navigation_path
ORDER BY
    path_count DESC
LIMIT 5;
Common Navigation Paths Path Count
/home 15,761
/google+redesign/shop+by+brand/youtube 10,701
/home -> /home 2,763
/google+redesign/apparel/mens/mens+t+shirts 1,389
/google+redesign/shop+by+brand/youtube -> /home 609
... ...
Insights:

E. Average Hits per Session

-- Calculate the average number of hits per session
SELECT
    AVG(hit_count) AS average_hits_per_session
FROM (
    SELECT
        fullVisitorId,
        visitId,
        COUNT(*) AS hit_count
    FROM
        temp_ga_sessions
    GROUP BY
        fullVisitorId,
        visitId);
Average Hits per Session
4.53
Insights:

Insights and Recommendations

The analysis provides several key insights: