Should use WRITE_TRUNCATE mode or bq query --replace to replace
partitions atomically to prevent duplicate data
Will have tooling to generate an optimized mostly materialized view that
only calculates the most recent partition
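As a rough sketch of the partition replacement described above, the snippet below builds a `bq query --replace` command line that targets a single date partition through the `table$YYYYMMDD` partition decorator. The helper name `bq_replace_partition_command` and the dataset and table names are hypothetical, chosen only for illustration; the command is built and printed, not executed.

```python
import shlex
from datetime import date


def bq_replace_partition_command(dataset, table, submission_date):
    """Build a `bq query` argv that overwrites exactly one date partition."""
    # `table$YYYYMMDD` is BigQuery's partition decorator: it targets one partition.
    partition = f"{dataset}.{table}${submission_date:%Y%m%d}"
    return [
        "bq",
        "query",
        "--use_legacy_sql=false",
        "--replace",  # overwrite the destination instead of appending to it
        f"--destination_table={partition}",
        f"--parameter=submission_date:DATE:{submission_date.isoformat()}",
    ]


# Hypothetical dataset and table names, for illustration only.
command = bq_replace_partition_command(
    "telemetry_derived", "example_v1", date(2019, 1, 1)
)
# shlex.join quotes the `$` so the shell does not expand it.
print(shlex.join(command) + " < query.sql")
```

Because the destination is a single partition and `--replace` overwrites it, rerunning the same command for the same date should leave one copy of that day's rows, not two.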
Properties
Must accept a date via @submission_date query parameter
Must output a column named submission_date matching the query parameter
Must produce similar results when run multiple times
Should produce identical results when run multiple times
May depend on the previous partition
Should be impacted only by values from a finite number of preceding partitions
This allows backfilling in chunks instead of serially for all time,
and limits backfills to a certain number of days following updated data
For example, sql/moz-fx-data-shared-prod/clients_last_seen_v1.sql can be run serially on any 28-day
period, and the last day will be the same whether or not the partition
preceding the first day was missing, because values are only impacted by
the 27 preceding days
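The finite-window property can be illustrated with a toy model. This is a sketch under assumptions, not the actual clients_last_seen query: it assumes each partition stores a 28-bit activity window per client, derived from the previous partition by shifting in one bit for the current day. The names `WINDOW_DAYS`, `next_partition`, and `run_serially` and the activity pattern are made up for the example.

```python
WINDOW_DAYS = 28
MASK = (1 << WINDOW_DAYS) - 1  # keep only the 28 most recent days


def next_partition(previous_bits, seen_today):
    """Shift yesterday's window by one day and record today's activity."""
    return ((previous_bits << 1) & MASK) | int(seen_today)


def run_serially(initial_bits, activity):
    """Apply the daily update to each day of `activity`, in order."""
    bits = initial_bits
    for seen_today in activity:
        bits = next_partition(bits, seen_today)
    return bits


# Made-up activity for one client over a 28-day period.
activity = [day % 3 == 0 for day in range(WINDOW_DAYS)]

with_history = run_serially(MASK, activity)  # preceding partition present
without_history = run_serially(0, activity)  # preceding partition missing
assert with_history == without_history

# One day short, the preceding partition still leaks into the result.
assert run_serially(MASK, activity[1:]) != run_serially(0, activity[1:])
```

In this model, each daily run shifts one day of older history out of the window, so after 28 serial runs nothing from the partition preceding the first day remains. That is why a backfill can start from any day, as long as it covers a full window before the first day whose output is kept.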