Google Analytics (GA) has long been the leading website analytics software available, and with good reason. The program provides detailed information on your site visitors and their interactions, offers a range of interactive features that enables you to analyse and quickly identify traffic patterns, ensures that the data is accurate and complete, and even allows you to measure how well your marketing campaigns are performing!
Google Analytics can also be used to track conversions. This means that website owners can see not only how many people are visiting their site, but also how many of those visitors are taking the desired action (such as making a purchase or signing up for a newsletter). This information is indispensable for understanding whether or not a website is successful in achieving its goals.
Five Key Benefits of Google Analytics 4
Google Analytics 4 (GA4) is the latest version of Google Analytics, and it offers a number of benefits over the previous Universal Analytics version that can be extremely helpful for businesses.
- Google Analytics 4 allows cross-domain tracking that does not rely on code adjustments.
- GA4 allows you to track conversions more effectively.
- It makes it easier to segment your audience.
- It provides more options for customising reports.
- Built-in GA4 and BigQuery integration
BigQuery and Google Analytics 4 integration
When it comes to Google and data, it’s no mystery that the company has been at the forefront of cutting-edge technologies for over 20 years. One of those technologies that have taken off in recent years is Google BigQuery, a service built on their big data infrastructure that allows users to extract, store and explore large datasets with just a few clicks.
In the past BigQuery was only available to paying users but with the introduction of Google Analytics 4, it is now accessible for free. Even though there is a ceiling on the amount of data you can store and query via the free BigQuery sandbox, in my opinion, your business can do a lot with the monthly free data allowance – 10GB of free storage and free processing of up to 1TB of query data.
Connecting Google Analytics 4 to BigQuery – a step-by-step guide
Connecting your Google Analytics 4 property to BigQuery is easy and it allows you to gain a better understanding of your digital information and data. To connect GA4 to BigQuery you need to follow the 4 easy steps:
- Step 1: Set up a Project in Google BigQuery
- Step 2: Enable Google Analytics 4 BigQuery Linking
- Step 3: Enable Google Cloud API
- Step 4: Create a Service account
Step 1: Create a BigQuery Project
Start by going to https://console.cloud.google.com/bigquery and creating a new project. Click on ‘Select a project” and then ‘New project’.
If you are not a paid user you are allowed to have up to 25 projects. Your project name will automatically create a project ID so choose something descriptive as a project ID cannot be changed once it has been set.
After choosing your project ID click on ‘Create’ and your project will be created. Well done, you have now created your first Google BigQuery project and you should be able to see it in the top left window corner.
Step 2: Link Google Analytics 4 to BigQuery
Login to your Google Analytics 4 property and navigate and click on the GA 4 Admin option button in the bottom-left corner of the window.
Clicking on the BigQuery button will open a screen which allows you to select your BigQuery project.
To find your BigQuery project click on ‘Link’ and then ‘Choose a BigQuery’ project.
Select the desired property ID and click Confirm to continue.
Now, select the Google Cloud region for your data from the drop-down menu and press ‘Next’.
On the next screen, you have to configure which data streams and events to export. I wouldn’t touch ‘Include advertising identifiers for mobile app streams’ unless you want to export advertiser identifiers to BigQuery.
Normally I would choose both Daily (once a day) or Streaming (continuous export) frequency of data import as I can always change these options at a later date.
Clicking on the ‘Next’ button will allow you to review the settings of your link to a BigQuery project and if you’re happy with everything you see click on ‘Submit to complete the process.
Good job! You have now successfully linked your Google Analytics 4 property to a BigQuery project. This should be confirmed on the screen below.
Quick review: you have created a BigQuery project and linked a GA4 property to it but they are still not connected. To accomplish that you need to create an API.
Step 3: Enable Google Cloud API
To start the process go to the Google Cloud Console and select ‘APIs & Services’ followed by ‘Library’.
Make sure that the project that you want to link is selected in the top navigation.
Search for BiqQuery API and click on it. Make sure the BIgQuery API is enabled and click ‘Manage’.
Now you gonna add the Service account by first clicking on ‘Credentials’ and then clicking on ‘Create Credentials.
Select ‘Service Account’ and you gonna see a window like that.
In the ‘Service account name’ field, type this special email address [email protected]. The Service account will then be automatically generated for you.
After describing your Service Account click on ‘Create and Continue’. On the new window grant the editor access to the Service account and select ‘Done’.
Congratulations, now the whole process has been completed – you’ve created a project in Google BigQuery Project, linked your GA4 property to BigQuery, enabled the Google Cloud API and added a Service account.
All you have to do now is wait (it may take up to 24 hours) for the first of your GA4 data to be exported to BigQuery.
Note: If you want to know the format and schema of the GA4 property data and the Google Analytics for Firebase data that is exported to BigQuery read [GA4] BigQuery Export schema.
Accessing your data in BigQuery
Once you have left your data to populate, come back to your BigQuery account using this link https://console.cloud.google.com/bigquery to check the data.
Click on your project name to reveal a list of datasets in your project. Here you can only see one dataset (in the example above analytics_251301421) as we are only sending the data from Google Analytics 4 into this project.
A dataset is a top-level container that’s used to organise and control access to your data. In simple terms, it’s a kind of folder in which your information is stored in the form of tables and views.
Within your dataset, you can see a list of tables:
- Events_(number of days) – GA4 data from the previous day gets automatically exported from the property to BigQuery every day. To see the individual dates you have to click on the date filter next to ‘Events’.
- Events_intraday_<current date> – Data from the GA4 property is automatically imported throughout the day and this will correspond with the ‘streaming’ frequency setting in Google Analytics 4.
Clicking on any of the tables will show you the structure of that table.
Clicking on the ‘Details’ tab will show you the size of the table, the number of rows and when the table was created. To analyse a table you can either run a query or by simply clicking on the ‘Preview’ tab you can inspect your data without the need to run a query. Needless to say, this is a good practice as it allows you to have a quick view of the data you have imported and the structure of your tables.
If you do want to query a table click on the Query button at the top of the data table.
The system will automatically write a sequel statement (SQL SELECT Statement) for you but you have to add ‘*’ between SELECT and FROM in the query before clicking on ‘Run’.
SELECT
*
FROM
-- Replace with your table name.
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
LIMIT
10
Note: SQL keywords are NOT case sensitive: select is the same as SELECT. MySQL supports the LIMIT clause to select a limited number of records which can save you a lot of money when querying big websites.
I know that some of you may be instantly taken aback by the structure of the BigQuery tables so I feel like I have to try to explain what ‘nested fields’ mean.
BigQuery Nested Fields
Nested fields in BiQuery are fields that are linked together as a single entity, like a struct or an object.
In the previous version of Google Analytics, every row in the data set represented a single session. As you probably already know Google Analytics 4 is event-based so every row in our data set represents an event and every event can contain multiple event parameters and corresponding values.
Going back to the previous example after querying the table only rows 1,2, and 3 are ‘real’ and the other ‘rows’ are in fact nested fields (most of the time with NULL values).
To extract the nested values and essentially flatten the tables we must use the UNNEST function.
SELECT
DISTINCT (
SELECT
value.string_value
FROM
UNNEST(event_params)
WHERE
key = 'page_title' ) AS Page_Title
FROM
-- Replace with your table name.
`ga4-omi-sido.analytics_251301421.events_*`
ORDER BY
1
LIMIT
10
This code will give us the following result.
To truly understand the UNNEST concept in detail visit this article by Todd Kerpelman.
BigQuery SQL Examples
Before wrapping up this article let me give you a few simple examples so you can get some practice using BigQuery with your Google Analytics 4 data. Copy and paste any of the SQL queries on this page into the BigQuery web console at https://console.cloud.google.com/bigquery but don’t forget to replace the example table name with your table name.
Total user count v New users count
WITH
UserInfo AS (
SELECT
user_pseudo_id,
MAX(
IF
(event_name IN ('first_visit',
'first_open'),
1,
0)) AS is_new_user
-- Replace with your table name.
FROM
`ga4-omi-sido.analytics_251301421.events_*`
-- Replace with your date range.
WHERE
_TABLE_SUFFIX BETWEEN '20220519'
AND '20220524'
GROUP BY
1 )
SELECT
COUNT(*) AS user_count,
SUM(is_new_user) AS new_user_count
FROM
UserInfo;
This code will give you the following result:
Most Viewed Pages by Page Title
SELECT
value.string_value as Page_Title,
COUNT(*) as Page_Views
FROM
`ga4-omi-sido.analytics_251301421.events_*`,
UNNEST(event_params)
WHERE
key = "page_title"
AND event_name = "page_view"
GROUP BY
1
ORDER BY
2 DESC
LIMIT 10
This code will give you the following result:
Top items added to shopping cart
SELECT
item_name AS Top_Products,
COUNT(DISTINCT user_pseudo_id) AS user_count
FROM
-- Replace with your table name.
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`,
UNNEST(items)
WHERE
-- Replace with your date range.
_TABLE_SUFFIX BETWEEN '20201101'
AND '20220131'
AND event_name IN ('add_to_cart')
GROUP BY
1
ORDER BY
user_count DESC
LIMIT
15;
This code will give you the following result:
Notice that in the latter example I am using the [GA4] Google Analytics 4-property sample dataset for BigQuery which you can access using this link.
Another way to find and use the Google BigQuery public dataset is to search for public into your BigQuery console.
Then click on ‘Narrow search to pinned projects.”
When you see ‘bigquery-public-data’ pin it to your project so it’s always there for you to explore.
BigQuery E-commerce Product Analysis Report
Find out what products are being purchased, added to the cart, and viewed by customers through your website or app.
WITH
productData AS (
SELECT
item_name AS itemName,
event_name,
user_pseudo_id,
event_timestamp,
items.quantity,
item_revenue,
ecommerce.transaction_id
FROM
-- Replace with your table name.
`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`,
UNNEST(items) AS items
WHERE
_table_suffix BETWEEN '20210101'
AND '20210131' ),
ecommerceMetrics AS (
SELECT
itemName,
COUNT(CASE
WHEN event_name = 'view_item' THEN user_pseudo_id
ELSE
NULL
END
) AS itemViews,
COUNT(CASE
WHEN event_name = 'add_to_cart' THEN user_pseudo_id
ELSE
NULL
END
) AS addToCarts,
COUNT(DISTINCT
CASE
WHEN event_name = 'purchase' THEN transaction_id
ELSE
NULL
END
) AS ecommercePurchases,
SUM(CASE
WHEN event_name = 'purchase' THEN quantity
ELSE
NULL
END
) AS itemPurchaseQuantity,
SUM(item_revenue) AS itemRevenue
FROM
productData
GROUP BY
itemName )
SELECT
itemName,
itemViews,
addToCarts,
(CASE
WHEN itemViews = 0 THEN 0
ELSE
addToCarts / itemViews * 100
END
) AS cartToViewRate,
ecommercePurchases,
(CASE
WHEN itemViews = 0 THEN 0
ELSE
ecommercePurchases / itemViews * 100
END
) AS purchaseToViewRate,
itemPurchaseQuantity,
itemRevenue
FROM
ecommerceMetrics
WHERE
itemViews > 0
OR itemRevenue > 0
ORDER BY
itemViews DESC
This SQL query will give you the following result:
The result of this analysis will provide online businesses with a comprehensive understanding of the products being viewed, added to baskets, and purchased. Additionally, it will highlight products that are simply viewed or added to baskets without being purchased, thus enabling companies to recognise potential areas for improvement in their website or app. By examining the data, companies can acquire valuable insights and make necessary modifications to enhance the customer experience and increase sales.
This post has gotten rather lengthy so I think it’s time to wrap it up here, but as you can see, there is a lot to cover in the new Google Analytics 4.
I hope you’ve enjoyed this article and feel a bit more confident to utilise your own Google Analytics 4 data in Google BigQuery.
Leave a Reply