`mozanalysis.segments`

class mozanalysis.segments.SegmentDataSource(name, from_expr, window_start: int = 0, window_end: int = 0, client_id_column: str = 'client_id', submission_date_column: str = 'submission_date', default_dataset: str | None = None, app_name: str | None = None, group_id_column: str = 'profile_group_id', glean_client_id_column: str | None = None, legacy_client_id_column: str | None = None)

Represents a table or view, from which segments may be defined.

window_start and window_end define the window of data used to determine whether each client fits a segment. Ideally this window ends at or before the moment of enrollment, so that a user's branch assignment can't bias the segment assignment. window_start and window_end are integers representing the number of days before or after enrollment.
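The window arithmetic can be sketched in plain Python. This is an illustration only, not mozanalysis code: the helper name `segment_window` is hypothetical, and it assumes that negative offsets fall before enrollment and that both ends of the window are inclusive.

```python
from datetime import date, timedelta


def segment_window(enrollment_date, window_start=0, window_end=0):
    """Return the first and last dates of the segment window.

    Assumption: offsets are whole days relative to the enrollment date,
    negative values fall before enrollment, and both ends are inclusive.
    """
    if window_start > window_end:
        raise ValueError("window_start must not be after window_end")
    first_day = enrollment_date + timedelta(days=window_start)
    last_day = enrollment_date + timedelta(days=window_end)
    return first_day, last_day


# A 28-day window that ends the day before enrollment, so that nothing
# observed after branch assignment can influence the segment.
first_day, last_day = segment_window(date(2024, 3, 15), window_start=-28, window_end=-1)
print(first_day, last_day)
```

Under these assumptions, the default of 0 for both values would cover only the day of enrollment itself.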

Parameters:

name (str) – Name for the Data Source. Should be unique to avoid confusion.
from_expr (str) – FROM expression: often just a fully qualified table name, sometimes a subquery. May contain the string {dataset}, which will be replaced with an app-specific dataset for Glean apps. If the expression is templated on dataset, default_dataset is mandatory.
window_start (int, optional) – See above.
window_end (int, optional) – See above.
client_id_column (str, optional) – Name of the column that contains the client_id (join key). Defaults to ‘client_id’.
submission_date_column (str, optional) – Name of the column that contains the submission date (as a date, not timestamp). Defaults to ‘submission_date’.
default_dataset (str, optional) – The value to use for {dataset} in from_expr if a value is not provided at runtime. Mandatory if from_expr contains a {dataset} parameter.
app_name (str, optional) – The app_name used with metric-hub; used for validation.
group_id_column (str, optional) – Name of the column that contains the group_id (join key). Defaults to ‘profile_group_id’.
glean_client_id_column (str, optional) – Name of the column that contains the Glean telemetry client_id (join key). Setting this also indicates that the data source supports Glean.
legacy_client_id_column (str, optional) – Name of the column that contains the legacy telemetry client_id (join key). Setting this also indicates that the data source supports legacy telemetry.

from_expr_for(dataset: str | None) → str

Expands the from_expr template for the given dataset. If from_expr is not a template, returns from_expr.

Parameters: dataset (str or None) – Dataset name to substitute into the from expression.
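The expansion behaviour described above can be approximated as follows. This is a minimal sketch, not the library's implementation: the function `expand_from_expr` and the table names are hypothetical, and raising `ValueError` when no dataset is available is an assumption.

```python
def expand_from_expr(from_expr, dataset=None, default_dataset=None):
    """Substitute a dataset name into a templated FROM expression.

    Hypothetical stand-in for the documented behaviour: a non-templated
    expression is returned unchanged, and default_dataset is the fallback
    when no dataset is supplied at runtime.
    """
    if "{dataset}" not in from_expr:
        return from_expr
    effective = dataset or default_dataset
    if effective is None:
        # Assumed error handling; the real library may behave differently.
        raise ValueError("from_expr is templated, but no dataset was provided")
    # str.replace leaves any other braces in a subquery untouched.
    return from_expr.replace("{dataset}", effective)


templated = "my_project.{dataset}.my_table"  # hypothetical table name
print(expand_from_expr(templated, dataset="runtime_dataset"))
print(expand_from_expr(templated, default_dataset="fallback_dataset"))
print(expand_from_expr("my_project.fixed_dataset.my_table", dataset="ignored"))
```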

build_query(segment_list, time_limits, experiment_slug, from_expr_dataset=None, analysis_unit: AnalysisUnit = AnalysisUnit.CLIENT, use_glean_ids: bool | None = None)

Return a nearly self-contained SQL query.

The query takes a list of {analysis_id}s from raw_enrollments, and adds one non-NULL boolean column per segment: True if the client is in the segment, False otherwise.

build_query_target(target, time_limits, from_expr_dataset=None, analysis_unit: AnalysisUnit = AnalysisUnit.CLIENT)

Return a nearly self-contained SQL query, for use with mozanalysis.sizing.HistoricalTarget.

This query returns all distinct client IDs that satisfy the criteria for inclusion in a historical analysis using this datasource. Separate sub-queries are constructed for each additional Segment in the analysis.

class mozanalysis.segments.Segment(name: str, data_source, select_expr: str, friendly_name: str | None = None, description: str | None = None, app_name: str | None = None)

Represents an experiment Segment.

Parameters:

name (str) – The segment’s name; will be a column name.
data_source (SegmentDataSource) – Data source that provides the columns referenced in select_expr.
select_expr (str) – A SQL select expression that includes an aggregation function (we GROUP BY {analysis_unit}). Returns a non-NULL BOOL: True if the user is in the segment, False otherwise.
friendly_name (str, optional) – A human-readable dashboard title for this segment.
description (str, optional) – A paragraph of Markdown-formatted text describing the segment in more detail, to be shown on dashboards.
app_name (str, optional) – The app_name used with metric-hub; used for validation.
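To illustrate what "an aggregation that returns a non-NULL BOOL" means for select_expr, here is a small analogy using Python's built-in sqlite3 module. It is not how mozanalysis builds or runs its queries: the `daily` table, the `active_hours` column, and the `heavy_user` segment are hypothetical, and SQLite represents booleans as 0 and 1, unlike BigQuery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE raw_enrollments (client_id TEXT);
    CREATE TABLE daily (client_id TEXT, active_hours REAL);
    INSERT INTO raw_enrollments VALUES ('a'), ('b'), ('c');
    -- Client 'c' has no rows in the data source at all.
    INSERT INTO daily VALUES ('a', 0.5), ('a', 2.0), ('b', 0.1);
    """
)

# Hypothetical select_expr: an aggregate over each client's rows. COALESCE
# turns the NULL produced for clients with no rows into a definite False.
select_expr = "COALESCE(SUM(d.active_hours) > 1, 0)"

query = f"""
    SELECT e.client_id, {select_expr} AS heavy_user
    FROM raw_enrollments e
    LEFT JOIN daily d ON d.client_id = e.client_id
    GROUP BY e.client_id
    ORDER BY e.client_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Without the COALESCE, client 'c' would receive NULL rather than False, which is the case the non-NULL requirement guards against.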
