`mozanalysis.metrics`

class mozanalysis.metrics.AnalysisBasis(value)[source]: Determines what the population used for the analysis will be based on.

class mozanalysis.metrics.DataSource(name, from_expr, experiments_column_type: str | None = 'simple', client_id_column: str | None = 'client_id', submission_date_column: str | None = 'submission_date', default_dataset: str | None = None, app_name: str | None = None, group_id_column: str | None = 'profile_group_id', glean_client_id_column: str | None = None, legacy_client_id_column: str | None = None)[source]

Represents a table or view, from which Metrics may be defined.

Parameters:

name (str) – Name for the Data Source. Used in sanity metric column names.
from_expr (str) – FROM expression - often just a fully-qualified table name. Sometimes a subquery. May contain the string {dataset} which will be replaced with an app-specific dataset for Glean apps. If the expression is templated on dataset, default_dataset is mandatory.
experiments_column_type (str or None) –
Info about the schema of the table or view:
- 'simple': There is an experiments column, which is an (experiment_slug:str -> branch_name:str) map.
- 'native': There is an experiments column, which is an (experiment_slug:str -> struct) map, where the struct contains a branch field, which is the branch as a string.
- 'glean': There is an experiments column inside ping_info, which is an (experiment_slug:str -> struct) map, where the struct contains a branch field, which is the branch as a string.
- 'events_stream': The experiment is recorded within a JSON column, event_extra. The branch is in the same column.
- None: There is no experiments column, so skip the sanity checks that rely on it. We’ll also be unable to filter out pre-enrollment data from day 0 in the experiment.
client_id_column (str, optional) – Name of the column that contains the client_id (join key). Defaults to ‘client_id’.
submission_date_column (str, optional) – Name of the column that contains the submission date (as a date, not timestamp). Defaults to ‘submission_date’.
default_dataset (str, optional) – The value to use for {dataset} in from_expr if a value is not provided at runtime. Mandatory if from_expr contains a {dataset} parameter.
app_name (str, optional) – The app_name used with metric-hub; used for validation.
group_id_column (str, optional) – Name of the column that contains the profile_group_id (join key). Defaults to ‘profile_group_id’.
glean_client_id_column (str, optional) – Name of the column that contains the glean telemetry client_id (join key). This is also used to specify that the data source supports glean.
legacy_client_id_column (str, optional) – Name of the column that contains the legacy telemetry client_id (join key). This is also used to specify that the data source supports legacy.
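The four experiments_column_type schemas described above can be illustrated with plain Python dictionaries standing in for table rows. This is only a sketch of the row shapes, not how mozanalysis reads them: the helper name branch_for is hypothetical, and the JSON keys "experiment" and "branch" inside event_extra are assumptions, since the description does not name them.

```python
import json
from typing import Optional


def branch_for(row: dict, slug: str, column_type: Optional[str]) -> Optional[str]:
    """Return the branch recorded for `slug` in a row shaped per `column_type`.

    Hypothetical helper; rows are plain dicts standing in for table rows.
    """
    if column_type is None:
        # No experiments column at all, so nothing can be looked up.
        return None
    if column_type == "simple":
        # experiment_slug -> branch_name
        return row["experiments"].get(slug)
    if column_type == "native":
        # experiment_slug -> struct with a `branch` field
        entry = row["experiments"].get(slug)
        return entry["branch"] if entry else None
    if column_type == "glean":
        # Same struct shape as 'native', but nested inside ping_info
        entry = row["ping_info"]["experiments"].get(slug)
        return entry["branch"] if entry else None
    if column_type == "events_stream":
        # Assumed key names: "experiment" and "branch" inside the JSON column
        extra = json.loads(row["event_extra"])
        return extra.get("branch") if extra.get("experiment") == slug else None
    raise ValueError(f"Unknown experiments_column_type: {column_type!r}")


simple_row = {"experiments": {"exp-a": "control"}}
native_row = {"experiments": {"exp-a": {"branch": "treatment"}}}
glean_row = {"ping_info": {"experiments": {"exp-a": {"branch": "control"}}}}
events_row = {"event_extra": '{"experiment": "exp-a", "branch": "treatment"}'}
```

With None there is nothing to look up, which is why the sanity checks that depend on the experiments column are skipped in that case.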

from_expr_for(dataset: str | None) → str[source]

Expands the from_expr template for the given dataset. If from_expr is not a template, returns from_expr.

Parameters:

dataset (str or None) – Dataset name to substitute into the from expression.
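The templating behaviour described for from_expr, default_dataset and from_expr_for can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the library's implementation: the function name expand_from_expr and the table name are hypothetical, and the choice to raise ValueError when no dataset is available is an assumption.

```python
from typing import Optional


def expand_from_expr(
    from_expr: str,
    dataset: Optional[str],
    default_dataset: Optional[str] = None,
) -> str:
    """Substitute a dataset into a `{dataset}` template (hypothetical helper)."""
    if "{dataset}" not in from_expr:
        # Not a template: return the expression unchanged.
        return from_expr
    effective = dataset or default_dataset
    if effective is None:
        # Assumed behaviour: a templated expression needs some dataset.
        raise ValueError("from_expr contains {dataset} but no dataset was given")
    return from_expr.replace("{dataset}", effective)


# Hypothetical table names, for illustration only.
templated = "my_project.{dataset}.baseline"
print(expand_from_expr(templated, "org_mozilla_firefox"))
print(expand_from_expr(templated, None, default_dataset="org_mozilla_fenix"))
print(expand_from_expr("my_project.telemetry.main", None))
```

The sketch uses str.replace rather than str.format so that a subquery containing other braces is left untouched.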

build_query(metric_list: list[Metric], time_limits: TimeLimits, experiment_slug: str, from_expr_dataset: str | None = None, analysis_basis: AnalysisBasis = AnalysisBasis.ENROLLMENTS, analysis_unit: AnalysisUnit = AnalysisUnit.CLIENT, exposure_signal=None, use_glean_ids: bool | None = None) → str[source]

Return a nearly self-contained SQL query.

This query does not define enrollments but otherwise could be executed to query all metrics from this data source.

build_query_union_metric_rows(metric_list: list[Metric], experiment_slug: str, metrics_query_table: str = 'metrics') → str[source]

Return a query that produces a unioned and pivoted row-per-metric version of the build_query results.

The metrics_query_table parameter specifies where the results to be pivoted exist. This is expected to be of the format that is returned by the query in build_query, and can either be a CTE or table/view.
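To make the row-per-metric idea concrete, the sketch below generates a UNION ALL query that turns one column per metric into one row per metric. It is an illustration only: the function name union_metric_rows_sql, the identifier columns analysis_id and branch, and the output column names metric_slug and metric_value are all assumptions, and the query actually produced by build_query_union_metric_rows may differ.

```python
from typing import Sequence


def union_metric_rows_sql(
    metric_names: Sequence[str],
    id_columns: Sequence[str] = ("analysis_id", "branch"),  # assumed columns
    metrics_query_table: str = "metrics",
) -> str:
    """Build a wide-to-long UNION ALL query (hypothetical helper)."""
    ids = ", ".join(id_columns)
    selects = [
        # One SELECT per metric: the metric's column becomes a labelled value.
        f"SELECT {ids}, '{name}' AS metric_slug, {name} AS metric_value "
        f"FROM {metrics_query_table}"
        for name in metric_names
    ]
    return "\nUNION ALL\n".join(selects)


# Hypothetical metric names, for illustration only.
print(union_metric_rows_sql(["active_hours", "uri_count"]))
```

Because the source is referenced only by name, metrics_query_table can point at either a CTE defined earlier in the same statement or a table or view.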


build_query_targets(metric_list: list[Metric], time_limits: TimeLimits, experiment_name: str, analysis_length: int, from_expr_dataset: str | None = None, continuous_enrollment: bool = False, analysis_unit: AnalysisUnit = AnalysisUnit.CLIENT) → str[source]

Return a nearly self-contained SQL query that constructs the metrics query for targeting historical data without an associated experiment slug.

This query does not define targets but otherwise could be executed to query all metrics from this data source.

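classmethod from_mcp_data_source(parser_data_source: ParserDataSource, app_name: str | None = None) → DataSource[source]: Create a DataSource from a metric-config-parser DataSource. metric-config-parser DataSource objects do not have an app_name, so it is supplied separately through the app_name argument.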

class mozanalysis.metrics.Metric(name: str, data_source: DataSource, select_expr: str, friendly_name: str | None = None, description: str | None = None, bigger_is_better: bool = True, app_name: str | None = None)[source]

Represents an experiment metric.

Needs to be combined with an analysis window to be measurable!

Parameters:

name (str) – A slug; uniquely identifies this metric in tables
data_source (DataSource) – where to find the metric
select_expr (str) – a SQL snippet representing a clause of a SELECT expression describing how to compute the metric; must include an aggregation function since it will be GROUPed BY the analysis unit and branch
friendly_name (str) – A human-readable dashboard title for this metric
description (str) – A paragraph of Markdown-formatted text describing what the metric measures, to be shown on dashboards
app_name (str, optional) – The app_name used with metric-hub; used for validation.
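The requirement that select_expr include an aggregation function follows from where it ends up: as one clause of a SELECT that is grouped by the analysis unit and branch. The sketch below shows that shape with a hypothetical helper. The function name grouped_select, the table name, and the assumption that a branch column is already available are illustrative only and do not reflect the query mozanalysis actually builds.

```python
def grouped_select(
    metric_name: str,
    select_expr: str,
    from_expr: str,
    unit_column: str = "client_id",  # assumed analysis-unit column
) -> str:
    """Place a metric's select_expr into a grouped SELECT (hypothetical helper)."""
    return (
        f"SELECT {unit_column}, branch, {select_expr} AS {metric_name}\n"
        f"FROM {from_expr}\n"
        f"GROUP BY {unit_column}, branch"
    )


# Hypothetical metric and table, for illustration only.
print(grouped_select("active_hours", "SUM(active_hours_sum)", "my_dataset.daily_activity"))
```

A bare column such as active_hours_sum in place of SUM(active_hours_sum) would be rejected by the SQL engine here, because it is neither aggregated nor listed in the GROUP BY.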

mozanalysis.metrics.agg_sum(select_expr: str) → str[source]: Return a SQL fragment for the sum over the data, with 0-filled nulls.

mozanalysis.metrics.agg_any(select_expr: str) → str[source]: Return the logical OR, with FALSE-filled nulls.
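As a rough picture of what agg_sum and agg_any return, the two descriptions above can be rendered as string-building functions. This is a sketch assuming BigQuery SQL, where LOGICAL_OR is the boolean aggregate. The sketch_ names are hypothetical, and the exact text emitted by the library's own helpers may differ.

```python
def sketch_agg_sum(select_expr: str) -> str:
    """Sum over the data, with nulls filled by 0 (assumed rendering)."""
    return f"COALESCE(SUM({select_expr}), 0)"


def sketch_agg_any(select_expr: str) -> str:
    """Logical OR over the data, with nulls filled by FALSE (assumed rendering)."""
    return f"COALESCE(LOGICAL_OR({select_expr}), FALSE)"


# Hypothetical column names, for illustration only.
print(sketch_agg_sum("active_hours_sum"))
print(sketch_agg_any("is_default_browser"))
```

Helpers like these return SQL fragments that already contain an aggregation function, so their output is suitable as a Metric select_expr.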

mozanalysis.metrics.agg_histogram_mean(select_expr: str) → str[source]: Produces an expression for the mean of an unparsed histogram.
