Google Analytics Data Analysis
The Context
The Google Merchandise Store sells Google branded merchandise. The data is typical of what you would see for an ecommerce website.
The Challenge
Our goal is to analyze the Google Analytics data to answer several key questions that can provide insights into user behavior, website performance, and marketing effectiveness. The questions we aim to answer are:
- What is the total number of transactions generated per device browser in July 2017?
- What is the real bounce rate per traffic source?
- What was the average number of product pageviews for users who made a purchase in July 2017?
- What was the average number of product pageviews for users who did not make a purchase in July 2017?
- What was the average total transactions per user that made a purchase in July 2017?
- What is the average amount of money spent per session in July 2017?
- What is the sequence of pages viewed?
The Approach
We will use SQL queries to extract and analyze the relevant data from the Google Analytics dataset. The dataset contains information about user interactions, device details, traffic sources, and transaction data. Each query will be designed to answer one of the specific questions listed above.
1. Total Number of Transactions Generated per Device Browser
SELECT
device.browser AS browser,
COUNT(totals.transactions) AS total_transactions
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
AND totals.transactions IS NOT NULL
GROUP BY
device.browser
ORDER BY
total_transactions DESC;
| Browser |
Total Transactions |
| Chrome |
944 |
| Safari |
65 |
| Firefox |
12 |
| Internet Explorer |
7 |
| Edge |
2 |
| Opera |
1 |
Insights
- Chrome is the dominant browser, generating the majority of transactions (944), followed by Safari with 65 transactions.
- Browsers like Firefox, Internet Explorer, Edge, and Opera contribute minimally to transactions.
Actions
- Optimize Website for Chrome and Safari: Ensure the website performs optimally on these browsers to maintain and potentially increase transaction volume.
- Cross-Browser Testing: Regularly test the website on other browsers like Firefox, Internet Explorer, Edge, and Opera to identify and fix any issues that might be hindering transactions.
2. Real Bounce Rate per Traffic Source
SELECT
trafficSource.source AS traffic_source,
COUNTIF(totals.pageviews = 1) / COUNT(*) AS real_bounce_rate
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
GROUP BY
traffic_source
ORDER BY
real_bounce_rate DESC;
| Traffic Source |
Real Bounce Rate |
| es.search.yahoo.com |
1 |
| github.com |
1 |
| mx.search.yahoo.com |
1 |
... |
... |
Insights
- There are many traffic sources with a 100% bounce rate, meaning all visitors from these sources leave the site after viewing only one page.
- A high bounce rate indicates a mismatch between visitor expectations and the content they land on, leading to immediate exits.
Actions
- Landing Page Optimization: Review and improve landing pages for the traffic sources with high bounce rates to make them more engaging and relevant.
- Targeted Content Creation: Develop content that matches the interests of visitors coming from specific high-bounce-rate sources to reduce the bounce rate and increase engagement.
7. Sequence of Pages Viewed
A. Create a Temporary Table
-- Create a temporary table with the initial data
CREATE TEMP TABLE temp_ga_sessions AS
SELECT
fullVisitorId,
visitId,
visitStartTime,
hits.page.pagePath AS page_path,
hits.hitNumber AS hit_number
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST(hits) AS hits
WHERE
_TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
ORDER BY
fullVisitorId,
visitStartTime,
hit_number;
B. Top Landing Pages
-- Find the top landing pages
SELECT
page_path,
COUNT(*) AS session_count
FROM
temp_ga_sessions
WHERE
hit_number = 1
GROUP BY
page_path
ORDER BY
session_count DESC
LIMIT 5;
| Top Landing Pages |
Session Count |
| /home |
34,883 |
| /google+redesign/shop+by+brand/youtube |
17,300 |
| /google+redesign/apparel/mens/mens+t+shirts |
2,967 |
| /signin.html |
1,473 |
| /google+redesign/apparel |
932 |
| ... |
... |
Insights
- The home page and YouTube brand page are the most frequent landing pages, indicating their importance in attracting visitors.
- Pages like the men's apparel section and the sign-in page also see significant initial traffic, highlighting their relevance to users.
- Optimizing these landing pages for speed, clarity, and engaging content can enhance the user experience and increase conversions.
C. Top Exit Pages
-- Find the top exit pages
WITH last_hits AS (
SELECT
fullVisitorId,
visitId,
MAX(hit_number) AS last_hit_number
FROM
temp_ga_sessions
GROUP BY
fullVisitorId,
visitId
)
SELECT
s.page_path,
COUNT(*) AS session_count
FROM
temp_ga_sessions s
JOIN
last_hits lh
ON
s.fullVisitorId = lh.fullVisitorId
AND s.visitId = lh.visitId
AND s.hit_number = lh.last_hit_number
GROUP BY
s.page_path
ORDER BY
session_count DESC
LIMIT 5;
| Top Exit Pages |
Session Count |
| /home |
23,210 |
| /google+redesign/shop+by+brand/youtube |
12,747 |
| /google+redesign/apparel/mens/mens+t+shirts |
2,760 |
| /asearch.html |
2,044 |
| /basket.html |
1,905 |
| ... |
... |
Insights:
- The home page is the most common exit page, suggesting users might not find what they are looking for or complete their journey successfully.
- High exits from the YouTube brand page and the men's apparel section indicate potential areas for improvement in user experience or product offerings.
- Addressing issues on high exit pages, such as improving content relevance and navigation, can reduce bounce rates and retain users longer.
D. Common Navigation Paths
-- Find the most common navigation paths
WITH paths AS (
SELECT
fullVisitorId,
visitId,
STRING_AGG(page_path, ' -> ' ORDER BY hit_number) AS navigation_path
FROM
temp_ga_sessions
GROUP BY
fullVisitorId,
visitId
)
SELECT
navigation_path,
COUNT(*) AS path_count
FROM
paths
GROUP BY
navigation_path
ORDER BY
path_count DESC
LIMIT 5;
| Common Navigation Paths |
Path Count |
| /home |
15,761 |
| /google+redesign/shop+by+brand/youtube |
10,701 |
| /home -> /home |
2,763 |
| /google+redesign/apparel/mens/mens+t+shirts |
1,389 |
| /google+redesign/shop+by+brand/youtube -> /home |
609 |
| ... |
... |
Insights:
- The most common navigation paths indicate that users frequently revisit the home page, suggesting potential navigation issues or lack of clear direction.
- The YouTube brand page appears frequently in navigation paths, highlighting its popularity and the need to optimize this page for conversions.
- Simplifying navigation and providing clear pathways to key actions can enhance user experience and reduce redundant page visits.
E. Average Hits per Session
-- Calculate the average number of hits per session
SELECT
AVG(hit_count) AS average_hits_per_session
FROM (
SELECT
fullVisitorId,
visitId,
COUNT(*) AS hit_count
FROM
temp_ga_sessions
GROUP BY
fullVisitorId,
visitId);
| Average Hits per Session |
| 4.53 |
Insights:
- With an average of 4.53 hits per session, users show moderate engagement with the website content.
- Increasing the number of hits per session can be achieved by adding engaging content, interactive elements, and personalized recommendations.
- Monitoring changes in this metric over time can help evaluate the effectiveness of engagement strategies and website improvements.
Insights and Recommendations
The analysis provides several key insights:
- Optimize the home and YouTube brand pages for better user engagement and conversion.
- Investigate and improve high exit pages, particularly the home page and basket page.
- Streamline user navigation to reduce revisits to the home page.
- Enhance user engagement by adding interactive elements and improving content quality.
- Implement data-driven strategies to personalize user experiences and improve conversion rates.