mozetl.addon_aggregates package

Submodules

mozetl.addon_aggregates.addon_aggregates module

ETL code for the addon_aggregates dataset

mozetl.addon_aggregates.addon_aggregates.add_addon_columns(df)

Constructs additional indicator columns describing the add-on/theme present in a given record. The columns are:

is_self_install
is_shield_addon
is_foreign_install
is_system
is_web_extension

Each column maps True -> 1 and False -> 0.

Parameters: df – SparkDF, exploded on active_addons, each record maps to a single add-on

Returns: df with the above columns added
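
The True -> 1 / False -> 0 mapping can be illustrated without Spark. The following is a minimal pure-Python sketch, not the module's implementation: `to_indicators` and its boolean inputs are hypothetical, and how each flag is actually derived from an add-on record is not covered here.

```python
# Illustrative only: the real function operates on a Spark DataFrame.
INDICATOR_COLUMNS = [
    "is_self_install",
    "is_shield_addon",
    "is_foreign_install",
    "is_system",
    "is_web_extension",
]


def to_indicators(flags):
    """Map a dict of boolean flags to 0/1 indicator values.

    `flags` is a hypothetical per-add-on dict; missing flags count as False.
    """
    return {name: int(bool(flags.get(name, False))) for name in INDICATOR_COLUMNS}


# A system add-on that is also a web extension.
record = to_indicators({"is_system": True, "is_web_extension": True})
```

Storing the indicators as integers rather than booleans means they can be summed directly when the records are later aggregated.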

mozetl.addon_aggregates.addon_aggregates.aggregate_addons(df)

Aggregates add-on indicators by client, channel, version and locale. The result is a DataFrame with the additional aggregate columns:

n_self_installed_addons (int)
n_shield_addons (int)
n_foreign_installed_addons (int)
n_system_addons (int)
n_web_extensions (int)
first_addon_install_date (str %Y%m%d)
profile_creation_date (str %Y%m%d)

for each of the above facets.

Parameters: df – an exploded instance of main_summary by active_addons with various additional indicator columns

Returns: SparkDF, an aggregated dataset with each of the above columns
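
The counting part of this aggregation can be sketched in plain Python as a group-by-and-sum over the 0/1 indicators. This is an illustration under assumptions, not the Spark code: the facet field names in `FACETS` and the helper `aggregate` are hypothetical, and the two date columns are left out because the source does not say how they are computed.

```python
from collections import defaultdict

# Hypothetical facet names; the docs only say "client, channel, version and locale".
FACETS = ("client_id", "channel", "version", "locale")

# Indicator column -> aggregate column, following the names documented above.
SUMS = {
    "is_self_install": "n_self_installed_addons",
    "is_shield_addon": "n_shield_addons",
    "is_foreign_install": "n_foreign_installed_addons",
    "is_system": "n_system_addons",
    "is_web_extension": "n_web_extensions",
}


def aggregate(rows):
    """Sum the 0/1 indicators of per-add-on rows within each facet group."""
    totals = defaultdict(lambda: dict.fromkeys(SUMS.values(), 0))
    for row in rows:
        key = tuple(row[f] for f in FACETS)
        for indicator, out_col in SUMS.items():
            # A missing indicator is treated as 0 in this sketch.
            totals[key][out_col] += row.get(indicator, 0)
    return dict(totals)


rows = [
    {"client_id": "a", "channel": "release", "version": "57", "locale": "en-US",
     "is_system": 1, "is_web_extension": 1},
    {"client_id": "a", "channel": "release", "version": "57", "locale": "en-US",
     "is_self_install": 1, "is_web_extension": 1},
]
result = aggregate(rows)
```

Here both rows share the same facet values, so `result` holds a single group whose counts are the sums of the two rows' indicators.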

mozetl.addon_aggregates.addon_aggregates.get_dest(output_bucket, output_prefix, output_version, date=None, sample_id=None)

Stitches together an s3 destination.

Parameters

output_bucket – s3 output_bucket
output_prefix – s3 output_prefix (within output_bucket)
output_version – dataset output_version

Returns: str -> s3://output_bucket/output_prefix/output_version/submission_date_s3=[date]/sample_id=[sid]
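
The path layout in the return description can be sketched as simple string assembly. `build_dest` below is a hypothetical stand-in written for illustration, not the library's `get_dest`. In particular, it assumes that the `date` and `sample_id` segments are simply omitted when left at their default of `None`, which the documentation does not state. The bucket and prefix values are made up.

```python
def build_dest(output_bucket, output_prefix, output_version, date=None, sample_id=None):
    """Sketch of assembling the documented s3 path (not the library's code)."""
    dest = "s3://" + "/".join([output_bucket, output_prefix, output_version])
    # Assumption: a partition segment is appended only when its value is given.
    if date is not None:
        dest += "/submission_date_s3={}".format(date)
    if sample_id is not None:
        dest += "/sample_id={}".format(sample_id)
    return dest


# Made-up bucket, prefix and version, for illustration only.
full = build_dest("my-bucket", "addons/agg", "v2", date="20180101", sample_id=42)
base = build_dest("my-bucket", "addons/agg", "v2")
```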

mozetl.addon_aggregates.addon_aggregates.load_main_summary(spark, input_bucket, input_prefix, input_version)

Loads main_summary from the bucket constructed from input_bucket, input_prefix, input_version

Parameters

spark – SparkSession object
input_bucket – s3 bucket (telemetry-parquet)
input_prefix – s3 prefix (main_summary)
input_version – dataset version (v4)

Returns: SparkDF

mozetl.addon_aggregates.addon_aggregates.ms_explode_addons(ms)

Explodes the active_addons object in the ms DataFrame and selects relevant fields

Parameters: ms – a subset of main_summary

Returns: SparkDF
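
To make "exploding" concrete, here is a small pure-Python sketch of the idea: each record carrying a list of add-ons becomes one flat row per add-on. The helper `explode_addons` and the `client_id` and `addon_id` fields are hypothetical. The choice to drop records with no add-ons mirrors the behavior of Spark's `explode` function, which this function is assumed (from its name) to rely on.

```python
def explode_addons(records):
    """Yield one flat row per add-on in each record's `active_addons` list."""
    for record in records:
        base = {k: v for k, v in record.items() if k != "active_addons"}
        # Records with an empty or missing list produce no rows,
        # as with Spark's `explode` on empty or null arrays.
        for addon in record.get("active_addons") or []:
            row = dict(base)
            row.update(addon)
            yield row


ms = [
    {"client_id": "a", "active_addons": [{"addon_id": "x"}, {"addon_id": "y"}]},
    {"client_id": "b", "active_addons": []},
    {"client_id": "c", "active_addons": None},
]
rows = list(explode_addons(ms))
```

After this step every row describes a single add-on, which is the shape `add_addon_columns` expects for its input.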


Module contents
