Submodules
presc.configuration module
Config management for PRESC, handled using confuse.Configuration.
- class presc.configuration.LocalConfig(from_config)[source]
Bases:
confuse.core.Configuration
Confuse config view that overrides but doesn’t modify another view.
This is useful for temporarily overriding options, e.g. with feature-specific settings, while still taking advantage of the confuse resolution and templating functionality.
The override is dynamic, so it will always pull in the most recent values of the underlying configuration.
from_config: a confuse Configuration (RootView) instance to override.
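The dynamic-override behaviour described above can be illustrated without confuse. The sketch below uses the standard library's collections.ChainMap as a stand-in. It is an analogy only, not how LocalConfig is implemented, and the names base, overrides and local are hypothetical.

```python
from collections import ChainMap

# Stand-in for the underlying (global) configuration.
base = {"title": "Default Report", "author": "Anonymous"}

# Stand-in for the local overrides; lookups try this mapping first.
overrides = {"title": "Feature Report"}
local = ChainMap(overrides, base)

title_before = local["title"]    # taken from overrides
author_before = local["author"]  # falls through to base

# Changing the base afterwards is visible through the overlay...
base["author"] = "Me"
author_after = local["author"]

# ...and writing to the overlay leaves the base untouched.
local["title"] = "Another Title"
base_title = base["title"]
```

The point of the analogy is that the overlay holds a reference to the underlying settings rather than a copy, so later changes to the underlying settings remain visible.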
- class presc.configuration.PrescConfig(from_config=None)[source]
Bases:
object
Wrapper around a confuse Configuration object.
This is used for managing config options in PRESC, including the global config.
- from_config
A PrescConfig instance to override. If None, the config is initialized to the default settings.
- Type
PrescConfig
- set(settings)[source]
Update one or more config options.
These should be specified in a dict, either mirroring the nested structure of the configuration file, or as flat key-value pairs using dots to indicate nested namespaces.
Examples
config.set({"report": {"title": "My Report", "author": "Me"}})
config.set({"report.title": "My Report", "report.author": "Me"})
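The two calls above describe the same nested structure. As an illustration, the hypothetical helper expand_dotted below (not part of PRESC) shows how flat dotted keys map onto nested dicts. How PRESC and confuse handle this internally is not covered by this reference, so treat it as a sketch of the equivalence only.

```python
def expand_dotted(flat):
    """Expand {"a.b": 1} into {"a": {"b": 1}} (hypothetical helper)."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate namespaces as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


flat = {"report.title": "My Report", "report.author": "Me"}
nested = expand_dotted(flat)
```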
- property settings
Access the underlying confuse object.
presc.dataset module
- class presc.dataset.Dataset(df, label_col, feature_cols=None)[source]
Bases:
object
Convenience API for a dataset used with a classification model.
Wraps a pandas DataFrame and provides shortcuts to access feature and label columns. It also allows for other columns, e.g. computed columns, to be included or added later.
- df
- Type
Pandas DataFrame
- label_col
The name of the column containing the labels
- Type
str
- feature_cols
An array-like of column names corresponding to model features. If not specified, all columns aside from the label column will be assumed to be features.
- Type
Array of str
- property column_names
Returns feature and other column names.
- property df
Returns the underlying DataFrame.
- property feature_names
Returns the feature names as a list.
- property features
Returns the dataset feature columns.
- property labels
Returns the dataset label column.
- property other_cols
Returns the dataset columns other than features and label.
- property size
- subset(subset_rows, by_position=False)[source]
Returns a Dataset corresponding to a subset of this one.
- Parameters
subset_rows – Selector for the rows to include in the subset (that can be passed to .loc or .iloc).
by_position (bool) – If True, subset_rows is interpreted as row numbers (used with .iloc). Otherwise, subset_rows is used with .loc.
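The difference between the two selection modes is easiest to see on a plain pandas DataFrame whose index labels differ from its row positions. The sketch below uses only pandas itself. The frame and its column names are made up for illustration, and it does not call Dataset.subset.

```python
import pandas as pd

# A small frame with a non-default index so labels and positions differ.
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 4.0], "label": [0, 1, 0, 1]},
    index=[10, 20, 30, 40],
)

# Label-based selection, the kind of selector used when by_position=False.
by_label = df.loc[[10, 30]]

# Position-based selection, the kind of selector used when by_position=True.
by_pos = df.iloc[[0, 2]]

# A boolean mask is another selector that .loc accepts.
masked = df.loc[df["label"] == 1]
```

Here by_label and by_pos pick the same two rows, but only because index labels 10 and 30 sit at positions 0 and 2.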
presc.model module
- class presc.model.ClassificationModel(classifier, train_dataset=None, retrain_now=False)[source]
Bases:
object
Represents a classification problem.
Instances wrap an ML model together with its associated training dataset.
- Parameters
classifier (sklearn Classifier) – the classifier to wrap
train_dataset (Dataset) – optionally include the associated training dataset
retrain_now (bool) – should the classifier first be (re-)trained on the given dataset?
- property classifier
Returns the underlying classifier.
- predict_labels(test_dataset)[source]
Predict labels for the given test dataset.
- Parameters
test_dataset (presc.dataset.Dataset) –
- Returns
A like-indexed Series.
- Return type
Series
- predict_probs(test_dataset)[source]
Compute predicted probabilities for the given test dataset.
This must be supported by the underlying classifier, otherwise an error will be raised.
- Parameters
test_dataset (presc.dataset.Dataset) –
- Returns
A like-indexed DataFrame of probabilities for each class.
- Return type
DataFrame
- train(train_dataset=None)[source]
Train the underlying classification model.
- Parameters
train_dataset (presc.dataset.Dataset) – A Dataset to train on. Defaults to the pre-specified training dataset, if any.
presc.utils module
- exception presc.utils.PrescError[source]
Bases:
ValueError, AttributeError
General exception class for errors related to PRESC computations.
- presc.utils.include_exclude_list(all_vals, included='*', excluded=None)[source]
Find values remaining after inclusions and exclusions are applied.
Values are first restricted to explicit inclusions, and then exclusions are applied.
The special values “*” and None are interpreted as “all” and “none” respectively for included and excluded.
- Parameters
all_vals (list) – The full list of possible values
included (list) – The list of values to include. Those not listed here are dropped.
excluded (list) – The list of values to drop (after restricting to included).
- Returns
The list of values out of all_vals that should be included.
- Return type
list
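The include-then-exclude order described above can be sketched in a few lines. The function include_exclude_sketch below is a hypothetical approximation written from this description, not the PRESC source. In particular, preserving the order of all_vals is an assumption of the sketch.

```python
def include_exclude_sketch(all_vals, included="*", excluded=None):
    """Approximate the documented behaviour (not the PRESC source)."""

    def resolve(selector):
        # "*" selects everything, None selects nothing.
        if selector is None:
            return []
        if isinstance(selector, str) and selector == "*":
            return list(all_vals)
        return list(selector)

    keep = resolve(included)
    drop = resolve(excluded)
    # Restrict to inclusions first, then apply exclusions.
    # Keeping the order of all_vals is an assumption of this sketch.
    return [v for v in all_vals if v in keep and v not in drop]


cols = ["a", "b", "c", "d"]
everything = include_exclude_sketch(cols)
some = include_exclude_sketch(cols, included=["a", "b", "c"], excluded=["b"])
nothing = include_exclude_sketch(cols, excluded="*")
```

With the defaults every value is kept. Listing "b" under both included and excluded drops it, because exclusions are applied last.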