mozanalysis.frequentist_stats.bootstrap
- mozanalysis.frequentist_stats.bootstrap.compare_branches(df, col_label, ref_branch_label='control', stat_fn=<function mean>, num_samples=10000, threshold_quantile=None, individual_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995), comparative_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995))
Jointly sample bootstrapped statistics then compare them.
Performs a percentile bootstrap, which, according to Efron, is not significantly more distasteful than a basic bootstrap, regardless of what you may read on Stack Overflow.
- Parameters:
df – A pandas DataFrame of queried experiment data in the standard format (see mozanalysis.experiment).
col_label (str or list) – Label for the df column containing the metric to be analyzed. If a list, labels for the multiple metrics to be analyzed.
ref_branch_label (str, optional) – String in df['branch'] that identifies the branch with respect to which we want to calculate uplifts, usually the control branch.
stat_fn (func, optional) – A function that either:
- Aggregates each resampled population to a scalar (e.g. the default, np.mean), or
- Aggregates each resampled population to a dict of scalars.
In both cases, this function must accept a one-dimensional ndarray or pandas Series as its input.
num_samples (int, optional) – The number of bootstrap iterations to perform.
threshold_quantile (float, optional) – An optional threshold quantile, above which to discard outliers. E.g. 0.9999.
individual_summary_quantiles (list, optional) – Quantiles to determine the confidence bands on individual branch statistics. Change these when making Bonferroni corrections.
comparative_summary_quantiles (list, optional) – Quantiles to determine the confidence bands on comparative branch statistics (i.e. the change relative to the reference branch, probably the control). Change these when making Bonferroni corrections.
- Returns a dictionary:
If stat_fn returns a scalar (this is the default), then this function returns a dictionary that has the following keys and values:
- ‘individual’: dictionary mapping each branch name to a pandas Series that holds the expected value for the bootstrapped stat_fn, and confidence intervals.
- ‘comparative’: dictionary mapping each branch name to a pandas Series of summary statistics for the possible uplifts of the bootstrapped stat_fn relative to the reference branch.
Otherwise, when stat_fn returns a dict, this function returns a similar dictionary, except the Series are replaced with DataFrames. Each row in each DataFrame corresponds to one output of stat_fn, and is the Series that would be returned if stat_fn computed only this statistic.
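As a rough illustration of what a joint percentile bootstrap comparison involves, the sketch below resamples each branch with NumPy, computes the statistic on every resample, and summarizes both the per-branch statistic and its relative uplift against the reference branch. It is not the mozanalysis implementation. The helper names sketch_compare_branches and summarize, the pairing of resamples by iteration index, the definition of uplift as branch / reference - 1, and the string-valued index labels are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def summarize(samples, quantiles):
    """Hypothetical helper: quantiles plus the mean of bootstrapped values."""
    q = list(quantiles)
    out = pd.Series(np.quantile(samples, q), index=[str(x) for x in q])
    out["mean"] = samples.mean()
    return out


def sketch_compare_branches(df, col_label, ref_branch_label="control",
                            stat_fn=np.mean, num_samples=1000,
                            quantiles=(0.005, 0.025, 0.5, 0.975, 0.995),
                            seed=0):
    """Hypothetical helper: percentile bootstrap per branch, then uplifts."""
    rng = np.random.default_rng(seed)

    # Bootstrap each branch: resample with replacement, apply stat_fn.
    boot = {}
    for branch, group in df.groupby("branch"):
        values = group[col_label].to_numpy()
        idx = rng.integers(0, len(values), size=(num_samples, len(values)))
        boot[branch] = np.array([stat_fn(values[row]) for row in idx])

    individual = {b: summarize(s, quantiles) for b, s in boot.items()}

    # Assumed uplift definition: relative change versus the reference branch,
    # computed iteration by iteration so the two draws are compared jointly.
    ref = boot[ref_branch_label]
    comparative = {
        b: summarize(s / ref - 1, quantiles)
        for b, s in boot.items()
        if b != ref_branch_label
    }
    return {"individual": individual, "comparative": comparative}


# Synthetic data with a 'branch' column, as the parameter docs describe.
data_rng = np.random.default_rng(42)
df = pd.DataFrame({
    "branch": ["control"] * 500 + ["treatment"] * 500,
    "metric": np.concatenate([
        data_rng.normal(10, 2, 500),
        data_rng.normal(11, 2, 500),
    ]),
})
res = sketch_compare_branches(df, "metric")
```

The percentile intervals are read directly off the quantiles of the bootstrapped values, which is what distinguishes this from a basic bootstrap that reflects them around the point estimate.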
- mozanalysis.frequentist_stats.bootstrap.bootstrap_one_branch(data, stat_fn=<function mean>, num_samples=10000, seed_start=None, threshold_quantile=None, summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995))
Run a bootstrap for one branch on its own.
Resamples the data num_samples times, computes stat_fn for each sample, then returns summary statistics for the distribution of the outputs of stat_fn.
- Parameters:
data – The data as a 1D numpy array, pandas series, or pandas dataframe.
stat_fn – Either a function that aggregates each resampled population to a scalar (e.g. the default value np.mean lets you bootstrap means), or a function that aggregates each resampled population to a dict of scalars. In both cases, this function must accept a one-dimensional ndarray as its input.
num_samples – The number of bootstrap iterations to perform.
seed_start – An int with which to seed numpy’s RNG. It must be unique within this set of calculations.
threshold_quantile (float, optional) – An optional threshold quantile, above which to discard outliers. E.g. 0.9999.
summary_quantiles (list, optional) – Quantiles to determine the confidence bands on the branch statistics. Change these when making Bonferroni corrections.
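The resample, compute, summarize loop and the Bonferroni adjustment of the summary quantiles can be sketched with NumPy and pandas alone. This is a minimal illustration under stated assumptions, not the library's code: sketch_bootstrap_one_branch and bonferroni_quantiles are hypothetical names, values above the threshold quantile are simply dropped (mozanalysis may treat them differently), and the string-valued index labels are a choice made for this example.

```python
import numpy as np
import pandas as pd


def bonferroni_quantiles(alpha, num_comparisons):
    """Hypothetical helper: two-sided quantiles after dividing alpha
    by the number of comparisons."""
    adjusted = alpha / num_comparisons
    return (adjusted / 2, 0.5, 1 - adjusted / 2)


def sketch_bootstrap_one_branch(data, stat_fn=np.mean, num_samples=1000,
                                seed_start=0, threshold_quantile=None,
                                summary_quantiles=(0.005, 0.025, 0.5,
                                                   0.975, 0.995)):
    """Hypothetical helper: bootstrap one branch and summarize the result."""
    data = np.asarray(data, dtype=float)

    # Assumed outlier handling: drop values above the threshold quantile.
    if threshold_quantile is not None:
        data = data[data <= np.quantile(data, threshold_quantile)]

    rng = np.random.default_rng(seed_start)
    stats = np.empty(num_samples)
    for i in range(num_samples):
        resampled = rng.choice(data, size=len(data), replace=True)
        stats[i] = stat_fn(resampled)

    # Summary statistics of the distribution of stat_fn outputs.
    q = list(summary_quantiles)
    summary = pd.Series(np.quantile(stats, q), index=[str(x) for x in q])
    summary["mean"] = stats.mean()
    return summary


data = np.random.default_rng(1).exponential(scale=3.0, size=2000)

# Five comparisons at a family-wise alpha of 0.05 call for 99% bands.
quantiles = bonferroni_quantiles(alpha=0.05, num_comparisons=5)
summary = sketch_bootstrap_one_branch(
    data, threshold_quantile=0.9999, summary_quantiles=quantiles
)
```

Widening the bands this way keeps the chance of at least one false positive across all comparisons near the chosen alpha, at the cost of wider intervals for each individual comparison.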
- mozanalysis.frequentist_stats.bootstrap.get_bootstrap_samples(data, stat_fn=<function mean>, num_samples=10000, seed_start=None, threshold_quantile=None)
Return stat_fn evaluated on resampled and original data.
Do the resampling in parallel over the cluster.
- Parameters:
data – The data as a 1D numpy array, pandas series, or pandas dataframe.
stat_fn – Either a function that aggregates each resampled population to a scalar (e.g. the default value np.mean lets you bootstrap means), or a function that aggregates each resampled population to a dict of scalars. In both cases, this function must accept a one-dimensional ndarray as its input.
num_samples – The number of samples to return.
seed_start – A seed for the random number generator. This function will use seeds in the range [seed_start, seed_start + num_samples), and these particular seeds must not be used elsewhere in this calculation. By default, use a random seed.
threshold_quantile (float, optional) – An optional threshold quantile, above which to discard outliers. E.g. 0.9999.
- Returns:
stat_fn evaluated over num_samples samples.
- By default, a pandas Series of sampled means.
- If stat_fn returns a scalar, a pandas Series.
- If stat_fn returns a dict, a pandas DataFrame with columns set to the dict keys.
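To make the seed range and the two return shapes concrete, here is a small sequential sketch. It assumes one NumPy generator per iteration, seeded with consecutive integers starting at seed_start. The helper names sketch_get_bootstrap_samples and mean_and_median are hypothetical, the choice of np.random.default_rng is an assumption, and nothing here runs in parallel over a cluster.

```python
import numpy as np
import pandas as pd


def sketch_get_bootstrap_samples(data, stat_fn=np.mean, num_samples=1000,
                                 seed_start=0):
    """Hypothetical helper: one resample per seed in
    [seed_start, seed_start + num_samples)."""
    data = np.asarray(data)
    results = []
    for seed in range(seed_start, seed_start + num_samples):
        rng = np.random.default_rng(seed)
        resampled = rng.choice(data, size=len(data), replace=True)
        results.append(stat_fn(resampled))

    # Dict outputs become DataFrame columns; scalar outputs become a Series.
    if isinstance(results[0], dict):
        return pd.DataFrame(results)
    return pd.Series(results)


def mean_and_median(x):
    """Example stat_fn that returns a dict of scalars."""
    return {"mean": np.mean(x), "median": np.median(x)}


data = np.random.default_rng(7).normal(loc=5.0, scale=1.0, size=300)

# The first call consumes seeds 0..199, so the second starts at 200
# to keep the two seed ranges disjoint.
means = sketch_get_bootstrap_samples(data, num_samples=200, seed_start=0)
both = sketch_get_bootstrap_samples(
    data, stat_fn=mean_and_median, num_samples=200, seed_start=200
)
```

Reusing a seed range in a second call would reproduce the same resampling indices, which is why the ranges must not overlap within one calculation.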
- mozanalysis.frequentist_stats.bootstrap.compare_branches_quantiles(df, col_label, ref_branch_label='control', quantiles_of_interest=None, num_samples=10000, threshold_quantile=None, individual_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995), comparative_summary_quantiles=(0.005, 0.025, 0.5, 0.975, 0.995))
Performs inferences on the metric quantiles inspired by Spotify’s “Resampling-free bootstrap inference for quantiles” approach https://arxiv.org/pdf/2202.10992.pdf.
Parameters are similar to compare_branches except for:
- Parameters:
quantiles_of_interest (List[float]) – A list of quantiles upon which inferences are desired. For example, 0.2 is the 20th percentile and 0.5 is the median.
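For intuition about how a quantile can be bootstrapped without resampling the data, the sketch below follows a simplified, single-sample reading of the idea in the linked paper: the position of the sample quantile within a bootstrap sample is drawn from a Binomial(n + 1, q) distribution, and the bootstrapped quantile is looked up in the sorted original data. Whether compare_branches_quantiles works exactly this way is not stated here. The helper name sketch_quantile_ci, the alpha parameter, and the index clipping are assumptions for this example.

```python
import numpy as np


def sketch_quantile_ci(data, q, num_samples=10000, alpha=0.05, seed=0):
    """Hypothetical helper: interval for the q-th quantile of one sample."""
    rng = np.random.default_rng(seed)
    sorted_data = np.sort(np.asarray(data, dtype=float))
    n = len(sorted_data)

    # Draw the 1-based order-statistic index of the quantile for each
    # bootstrap iteration instead of resampling the data itself.
    idx = rng.binomial(n + 1, q, size=num_samples)

    # Assumed edge handling: clip to valid positions, then go 0-based.
    idx = np.clip(idx, 1, n) - 1

    boot_quantiles = sorted_data[idx]
    lower, upper = np.quantile(boot_quantiles, [alpha / 2, 1 - alpha / 2])
    return lower, upper


data = np.random.default_rng(3).normal(size=5000)
low, high = sketch_quantile_ci(data, q=0.5)
```

The data are sorted once and each iteration costs a single random draw and an array lookup, which is the appeal of this approach for large samples.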