Online Shop vs Petabytes

Shop: I need to understand how users navigate the website.

 

Consultant: Best practice is to capture every user action in BigQuery/<insert certification here>.

Consultant: Best practice for BigQuery is append-only, so we keep adding user actions to the raw table; deduplication and aggregation are for later.

Consultant: Best practice is to partition by day and cluster by user_id/anonymous_id.

Consultant: Best practice is to build materialized views.

Consultant: Best practice is to build data cubes.

Consultant: Best practice is to build data marts.

Consultant: Best practice is real-time reporting, so we run hourly.

 

Shop: Spends a fortune on petabytes of data scanned for URLs.

Consultant on LinkedIn: We have built a Petabyte‑scale Data Product!


Last modified on 2026-03-04