mozanalysis.frequentist_stats.bootstrap

mozanalysis.frequentist_stats.bootstrap.compare_branches(df, col_label, ref_branch_label='control', stat_fn=<function mean>, num_samples=10000, threshold_quantile=None, individual_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995), comparative_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995))

Jointly sample bootstrapped statistics then compare them.

Performs a percentile bootstrap, which, according to Efron, is not significantly more distasteful than a basic bootstrap, regardless of what you may read on Stack Overflow.
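As a rough illustration of the percentile method for a comparison between two branches, the following is a minimal standard-library sketch, not the mozanalysis implementation. The helper names (bootstrap_stats, empirical_quantile, percentile_compare), the synthetic data, and the choice of a plain difference as the comparison are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_stats(data, stat_fn, num_samples, rng):
    """Evaluate stat_fn on num_samples resamples (with replacement) of data."""
    n = len(data)
    return [stat_fn(rng.choices(data, k=n)) for _ in range(num_samples)]


def empirical_quantile(sorted_values, q):
    """Simple rank-based quantile of an already sorted list."""
    idx = min(int(q * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


def percentile_compare(control, treatment, stat_fn=statistics.fmean,
                       num_samples=2000, quantiles=(0.025, 0.5, 0.975), seed=0):
    """Summarise the bootstrapped difference treatment - control."""
    rng = random.Random(seed)
    ctl = bootstrap_stats(control, stat_fn, num_samples, rng)
    trt = bootstrap_stats(treatment, stat_fn, num_samples, rng)
    # Compare the bootstrapped statistics pairwise, then read the interval
    # straight off the empirical quantiles (the percentile method).
    diffs = sorted(t - c for t, c in zip(trt, ctl))
    return {q: empirical_quantile(diffs, q) for q in quantiles}


# Synthetic data: the treatment mean is shifted up by 1.0 (hypothetical).
data_rng = random.Random(1)
control = [data_rng.gauss(10.0, 2.0) for _ in range(400)]
treatment = [data_rng.gauss(11.0, 2.0) for _ in range(400)]
summary = percentile_compare(control, treatment)
print(summary)
```

The interval bounds are simply quantiles of the bootstrapped differences, which is what distinguishes the percentile method from a basic bootstrap that reflects those quantiles around the point estimate.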

Parameters:
  • df – a pandas DataFrame of queried experiment data in the standard format (see mozanalysis.experiment).

  • col_label (str or list) – Label for the df column containing the metric to be analyzed. If a list, labels for the multiple metrics to be analyzed.

  • ref_branch_label (str, optional) – String in df['branch'] that identifies the branch with respect to which we want to calculate uplifts - usually the control branch.

  • stat_fn (func, optional) – A function that either:

    • aggregates each resampled population to a scalar (e.g. the default, np.mean), or

    • aggregates each resampled population to a dict of scalars.

    In both cases, this function must accept a one-dimensional ndarray or pandas Series as its input.

  • num_samples (int, optional) – The number of bootstrap iterations to perform.

  • threshold_quantile (float, optional) – An optional threshold quantile, above which to discard outliers. E.g. 0.9999.

  • individual_summary_quantiles (list, optional) – Quantiles to determine the confidence bands on individual branch statistics. Change these when making Bonferroni corrections.

  • comparative_summary_quantiles (list, optional) – Quantiles to determine the confidence bands on comparative branch statistics (i.e. the change relative to the reference branch, probably the control). Change these when making Bonferroni corrections.
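To make the Bonferroni remark concrete: the default quantiles bound a 95% interval (0.025, 0.975) and a 99% interval (0.005, 0.995) around the median. A minimal sketch of deriving corrected quantiles follows. The helper bonferroni_quantiles is hypothetical and not part of mozanalysis; it assumes a two-sided interval with the error rate split evenly across comparisons.

```python
def bonferroni_quantiles(alpha=0.05, num_comparisons=1):
    """Quantiles for a two-sided interval at a Bonferroni-corrected level.

    The family-wise error rate alpha is divided evenly across the
    comparisons, and each corrected rate is split between the two tails.
    """
    corrected = alpha / num_comparisons
    return (corrected / 2, 0.5, 1 - corrected / 2)


# One comparison at alpha = 0.05 gives the usual 95% interval bounds.
single = bonferroni_quantiles(0.05, 1)
# Five comparisons at alpha = 0.05 require a 99% interval for each one.
five = bonferroni_quantiles(0.05, 5)
print(single, five)
```

The resulting tuple could then be passed as individual_summary_quantiles or comparative_summary_quantiles.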

Returns a dictionary:

If stat_fn returns a scalar (this is the default), then this function returns a dictionary with the following keys and values:

  • ‘individual’: dictionary mapping each branch name to a pandas Series that holds the expected value for the bootstrapped stat_fn, and confidence intervals.

  • ‘comparative’: dictionary mapping each branch name to a pandas Series of summary statistics for the possible uplifts of the bootstrapped stat_fn relative to the reference branch.

Otherwise, when stat_fn returns a dict, then this function returns a similar dictionary, except the Series are replaced with DataFrames. Each row in each DataFrame corresponds to one output of stat_fn, and is the Series that would be returned if stat_fn computed only this statistic.
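The dict case can be pictured with a small standard-library sketch. This is an illustration under stated assumptions, not the library's code: mean_and_median and bootstrap_dict_stat are hypothetical names, and a nested dict stands in for the DataFrame, with one entry per output of stat_fn.

```python
import random
import statistics


def mean_and_median(values):
    """A stat_fn that aggregates one resample to a dict of scalars."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
    }


def bootstrap_dict_stat(data, stat_fn, num_samples=1000,
                        quantiles=(0.025, 0.5, 0.975), seed=0):
    """Summarise each key of stat_fn's output separately."""
    rng = random.Random(seed)
    n = len(data)
    per_key = {}
    for _ in range(num_samples):
        # Each resample yields one value per key; collect them per key.
        for key, value in stat_fn(rng.choices(data, k=n)).items():
            per_key.setdefault(key, []).append(value)
    summary = {}
    for key, values in per_key.items():
        values.sort()
        summary[key] = {
            q: values[min(int(q * num_samples), num_samples - 1)]
            for q in quantiles
        }
    return summary


data_rng = random.Random(2)
data = [data_rng.expovariate(0.1) for _ in range(300)]
summary = bootstrap_dict_stat(data, mean_and_median)
print(summary["mean"], summary["median"])
```

Each entry of summary is what a scalar stat_fn computing only that statistic would have produced, which mirrors the row-per-statistic layout described above.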

mozanalysis.frequentist_stats.bootstrap.bootstrap_one_branch(data, stat_fn=<function mean>, num_samples=10000, seed_start=None, threshold_quantile=None, summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995))

Run a bootstrap for one branch on its own.

Resamples the data num_samples times, computes stat_fn for each sample, then returns summary statistics for the distribution of the outputs of stat_fn.

Parameters:
  • data – The data as a 1D numpy array, pandas Series, or pandas DataFrame.

  • stat_fn – Either a function that aggregates each resampled population to a scalar (e.g. the default value np.mean lets you bootstrap means), or a function that aggregates each resampled population to a dict of scalars. In both cases, this function must accept a one-dimensional ndarray as its input.

  • num_samples – The number of bootstrap iterations to perform.

  • seed_start – An int with which to seed numpy’s RNG. It must be unique within this set of calculations.

  • threshold_quantile (float, optional) – An optional threshold quantile, above which to discard outliers. E.g. 0.9999.

  • summary_quantiles (list, optional) – Quantiles to determine the confidence bands on the branch statistics. Change these when making Bonferroni corrections.
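One plausible reading of threshold_quantile is that values above the empirical quantile are dropped before resampling. The sketch below illustrates that reading only. The function discard_above_quantile is hypothetical, and the nearest-rank quantile it uses is an assumption, since the interpolation rule used by the library is not stated here.

```python
import math


def discard_above_quantile(values, threshold_quantile):
    """Drop values strictly above the nearest-rank empirical quantile."""
    ordered = sorted(values)
    # Nearest-rank definition: the smallest value with at least
    # threshold_quantile of the data at or below it.
    rank = max(math.ceil(threshold_quantile * len(ordered)), 1)
    threshold = ordered[rank - 1]
    return [v for v in values if v <= threshold]


# 100 ordinary observations plus one extreme outlier (made-up data).
values = list(range(1, 101)) + [10_000]
filtered = discard_above_quantile(values, 0.99)
print(len(values), len(filtered), max(filtered))
```

Trimming the upper tail this way keeps a single extreme observation from dominating a bootstrapped mean.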

mozanalysis.frequentist_stats.bootstrap.get_bootstrap_samples(data, stat_fn=<function mean>, num_samples=10000, seed_start=None, threshold_quantile=None)

Return stat_fn evaluated on resampled and original data.

Do the resampling in parallel over the cluster.

Parameters:
  • data – The data as a 1D numpy array, pandas Series, or pandas DataFrame.

  • stat_fn – Either a function that aggregates each resampled population to a scalar (e.g. the default value np.mean lets you bootstrap means), or a function that aggregates each resampled population to a dict of scalars. In both cases, this function must accept a one-dimensional ndarray as its input.

  • num_samples – The number of samples to return.

  • seed_start – A seed for the random number generator. This function will use seeds in the range [seed_start, seed_start + num_samples), and these particular seeds must not be used elsewhere in this calculation. By default, use a random seed.

  • threshold_quantile (float, optional) – An optional threshold quantile, above which to discard outliers. E.g. 0.9999.

Returns:

stat_fn evaluated over num_samples samples.

  • By default, a pandas Series of sampled means.

  • If stat_fn returns a scalar, a pandas Series.

  • If stat_fn returns a dict, a pandas DataFrame with columns set to the dict keys.
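The seed range described for seed_start suggests one seed per resample. The following sketch shows that idea under assumptions: get_samples and resample_stat are hypothetical names, Python's random module stands in for numpy's RNG, and the resamples are computed serially rather than in parallel.

```python
import random
import statistics


def resample_stat(data, stat_fn, seed):
    """Evaluate stat_fn on one resample drawn with its own seeded RNG."""
    rng = random.Random(seed)
    return stat_fn(rng.choices(data, k=len(data)))


def get_samples(data, stat_fn=statistics.fmean, num_samples=100, seed_start=0):
    """Use seeds in the half-open range [seed_start, seed_start + num_samples)."""
    seeds = range(seed_start, seed_start + num_samples)
    # Each resample depends only on its seed, so the work could be
    # distributed in any order and still give the same results.
    return [resample_stat(data, stat_fn, seed) for seed in seeds]


data = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
first = get_samples(data, num_samples=10, seed_start=0)
second = get_samples(data, num_samples=5, seed_start=5)
# The seed ranges overlap on 5..9, so those resamples are identical.
print(first[5:] == second)
```

The overlap at the end shows why the seeds must not be reused within one calculation: two bootstraps that share seeds produce correlated, here identical, resamples.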

mozanalysis.frequentist_stats.bootstrap.compare_branches_quantiles(df, col_label, ref_branch_label='control', quantiles_of_interest=None, num_samples=10000, threshold_quantile=None, individual_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995), comparative_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995))

Performs inference on metric quantiles, inspired by Spotify’s “Resampling-free bootstrap inference for quantiles” approach (https://arxiv.org/pdf/2202.10992.pdf).

Parameters are similar to compare_branches except for:

Parameters:
  • quantiles_of_interest (List[float]) – A list of quantiles upon which inferences are desired. E.g. 0.2 is the 20th percentile, 0.5 is the median, etc.
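The central idea of the cited paper, as summarised there, is that under a Poisson bootstrap the index of the order statistic that ends up as the sample quantile is approximately Binomial(n + 1, q), so bootstrap quantiles can be read from the sorted data without resampling it. The sketch below illustrates that idea only. Whether mozanalysis implements it this way is not stated here, and binomial_draw and quantile_bootstrap_samples are hypothetical helpers written with the standard library for self-containment.

```python
import random


def binomial_draw(rng, n, p):
    """Draw from Binomial(n, p) as a sum of n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))


def quantile_bootstrap_samples(data, q, num_samples=500, seed=0):
    """Approximate bootstrap draws of the q-th quantile without resampling."""
    rng = random.Random(seed)
    ordered = sorted(data)  # sort once; no per-iteration resampling
    n = len(ordered)
    samples = []
    for _ in range(num_samples):
        # 1-based index of the order statistic, clamped to the valid range.
        idx = min(max(binomial_draw(rng, n + 1, q), 1), n)
        samples.append(ordered[idx - 1])
    return samples


data = list(range(1, 201))  # made-up data whose median is near 100
samples = quantile_bootstrap_samples(data, 0.5)
print(min(samples), max(samples))
```

Summary quantiles of samples then play the same role as the quantiles of bootstrapped statistics in compare_branches. In practice a vectorised binomial sampler would replace the Bernoulli loop.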

mozanalysis.frequentist_stats.bootstrap.get_quantile_bootstrap_samples(data, quantiles_of_interest, num_samples=10000, threshold_quantile=None)

Parameters are similar to get_bootstrap_samples.