Online Shop vs Petabytes
Shop: I need to understand how users navigate the website.
Consultant: Best practice is to capture every user action in BigQuery/<insert certification here>.
Consultant: Best practice for BigQuery is append-only, so we keep adding user actions to the raw table; deduplication and aggregation are for later.
Consultant: Best practice is to partition by day and cluster by user_id/anonymous_id.
Consultant: Best practice is to build materialized views.
Consultant: Best practice is to build data cubes.
Consultant: Best practice is to build data marts.
Consultant: Best practice is real-time reporting so we run hourly.
Shop: Spends a fortune on petabytes of data scanned for URLs.
Consultant on LinkedIn: We have built a Petabyte‑scale Data Product!
Last modified on 2026-03-04