mozetl.sync package

Submodules

mozetl.sync.bookmark_validation module

# Sync Bookmark Validation Dataset

This notebook is adapted from a gist that transforms sync_summary into a flat table, which avoids straining resources on the Presto cluster.[1] The bookmark totals table reports statistics relative to the server clock.

See bugs 1349065, 1374831, and 1410963.

[1] https://gist.github.com/kitcambridge/364f56182f3e96fb3131bf38ff648609
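To illustrate what "transforming into a flat table" means here, the following is a minimal sketch in plain Python, not the module's actual code. The nested layout and every field name (`syncs`, `engines`, `validation`, `problems`, `uid`, `when`, `name`, `count`) are assumptions chosen for the example; the real sync_summary schema may differ.

```python
def flatten_sync_summary(ping):
    """Yield one flat row per validation problem in a nested record.

    All field names are hypothetical and only serve the illustration.
    """
    for sync in ping.get("syncs", []):
        for engine in sync.get("engines", []):
            # Engines without validation data contribute no rows.
            validation = engine.get("validation") or {}
            for problem in validation.get("problems", []):
                yield {
                    "uid": sync.get("uid"),
                    "when": sync.get("when"),
                    "engine": engine.get("name"),
                    "problem": problem.get("name"),
                    "count": problem.get("count"),
                }


# Hypothetical nested record with two problems on one engine.
sample_ping = {
    "syncs": [
        {
            "uid": "u1",
            "when": 1500000000000,
            "engines": [
                {
                    "name": "bookmarks",
                    "validation": {
                        "problems": [
                            {"name": "orphans", "count": 2},
                            {"name": "duplicates", "count": 1},
                        ]
                    },
                },
                {"name": "history"},
            ],
        }
    ]
}

rows = list(flatten_sync_summary(sample_ping))
```

Each row carries its parent's identifiers, so later queries can aggregate over a single flat table instead of unnesting arrays at query time.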

mozetl.sync.bookmark_validation.extract(spark, path, start_date)

Register a temporary sync_summary view for the given start date.

mozetl.sync.bookmark_validation.load(spark, bucket, prefix, version, start_date)

Save tables to disk.

mozetl.sync.bookmark_validation.transform(spark)

Create the bookmark problem and summary tables.
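The three functions above read as the stages of an extract, transform, load job. The sketch below chains them in that order using only the signatures listed on this page. The call order and the wrapper `run_bookmark_validation` are assumptions for illustration, not something this page documents, and the expected formats of `path`, `version`, and `start_date` are not specified here, so they are passed through unchanged.

```python
def run_bookmark_validation(spark, path, bucket, prefix, version, start_date):
    """Hypothetical driver that runs the three stages in ETL order."""
    # Imported lazily so this sketch can be defined without mozetl installed.
    from mozetl.sync import bookmark_validation as bv

    # Stage 1: register the temporary sync_summary view.
    bv.extract(spark, path, start_date)
    # Stage 2: create the bookmark problem and summary tables.
    bv.transform(spark)
    # Stage 3: save the tables to disk.
    bv.load(spark, bucket, prefix, version, start_date)
```

A caller would pass an existing SparkSession as `spark`. Keeping the stages as separate calls makes it possible to run `extract` and `transform` interactively and inspect the resulting tables before anything is written.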

Module contents