ssbc.utils

Utility functions for conformal prediction.

Functions

`build_conditional_prediction_sets`(probs, ...)	Build prediction sets using a SINGLE threshold for conditional analysis.
`build_mondrian_prediction_sets`(probs, ...[, ...])	Build prediction sets using Mondrian conformal prediction thresholds.
`compute_operational_rate`(prediction_sets, ...)	Compute operational rate indicators for prediction sets.
`evaluate_test_dataset`(test_labels, ...)	Evaluate a test dataset and compute empirical operational rates.

ssbc.utils.build_mondrian_prediction_sets(probs, threshold_0, threshold_1, return_lists=False)

Build prediction sets using Mondrian conformal prediction thresholds.

This function implements the standard Mondrian conformal prediction approach, where score_k = 1 - P(class=k):
- For each sample, include class 0 if score_0 <= threshold_0
- For each sample, include class 1 if score_1 <= threshold_1

Parameters:

probs (np.ndarray, shape (n, 2)) – Probability predictions for each sample. probs[i, 0] = P(class=0), probs[i, 1] = P(class=1)
threshold_0 (float) – Conformal prediction threshold for class 0
threshold_1 (float) – Conformal prediction threshold for class 1
return_lists (bool, default=False) – If True, returns lists instead of sets

Returns:

List of prediction sets, where each set/list contains the classes included in the prediction set for that sample.

Return type:

list[set[int]] or list[list[int]]

Examples

>>> import numpy as np
>>> from ssbc.utils import build_mondrian_prediction_sets
>>>
>>> probs = np.array([
...     [0.8, 0.2],  # High confidence class 0
...     [0.5, 0.5],  # Uncertain: score_0 = score_1 = 0.5, above both thresholds
...     [0.2, 0.8],  # High confidence class 1
... ])
>>> threshold_0, threshold_1 = 0.3, 0.3
>>> pred_sets = build_mondrian_prediction_sets(probs, threshold_0, threshold_1)
>>> print(pred_sets)  # [{0}, set(), {1}]

Notes

This function is used throughout the codebase for building Mondrian conformal prediction sets. It centralizes the logic to ensure consistency across all modules that perform conformal prediction evaluation.
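The inclusion rule described above can be sketched in a few lines of NumPy. This is an illustrative re-implementation under the documented rule only, not the library's source; the name `mondrian_sets_sketch` is hypothetical, and the `return_lists` option is omitted.

```python
import numpy as np


def mondrian_sets_sketch(probs, threshold_0, threshold_1):
    """Hypothetical sketch of the documented rule (not the ssbc source)."""
    probs = np.asarray(probs, dtype=float)
    scores = 1.0 - probs                          # score_k = 1 - P(class=k)
    thresholds = np.array([threshold_0, threshold_1])
    include = scores <= thresholds                # broadcasts to shape (n, 2)
    # One set per sample, holding every class whose score passed its threshold
    return [{k for k in (0, 1) if row[k]} for row in include]


probs = np.array([
    [0.9, 0.1],  # confident class 0
    [0.5, 0.5],  # uncertain
    [0.1, 0.9],  # confident class 1
])

tight = mondrian_sets_sketch(probs, 0.25, 0.25)
loose = mondrian_sets_sketch(probs, 0.6, 0.6)
print(tight)  # [{0}, set(), {1}]
print(loose)  # [{0}, {0, 1}, {1}]
```

With normalized probabilities, an uncertain sample becomes an abstention when both thresholds are small and a doublet when they are large, which is why the same input yields different sets in the two calls.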

ssbc.utils.build_conditional_prediction_sets(probs, threshold, return_lists=False)

Build prediction sets using a SINGLE threshold for conditional analysis.

Unlike Mondrian CP, which uses a separate threshold per class, this function applies ONE threshold to BOTH classes, as in standard (non-Mondrian) conformal prediction.

This is used for conditional analysis where we want to evaluate predictions conditioned on the true class label, using the threshold calibrated for that class.

Parameters:

probs (np.ndarray, shape (n, 2)) – Probability predictions [P(class=0), P(class=1)]. Note: the data should be filtered by true class label before calling this function. For conditional analysis, only samples with the same true label should be included.
threshold (float) – Single conformal prediction threshold for both classes. This should be the threshold calibrated for the class of the samples in probs.
return_lists (bool, default=False) – If True, returns lists instead of sets

Returns:

Prediction sets where:
- {0, 1} if P(0) >= 1-threshold AND P(1) >= 1-threshold (doublet)
- {0} if P(0) >= 1-threshold AND P(1) < 1-threshold (singleton)
- {1} if P(1) >= 1-threshold AND P(0) < 1-threshold (singleton)
- set() (the empty set) if P(0) < 1-threshold AND P(1) < 1-threshold (abstention)

Return type:

list[set[int]] or list[list[int]]

Examples

>>> import numpy as np
>>> from ssbc.utils import build_conditional_prediction_sets
>>>
>>> probs = np.array([
...     [0.8, 0.2],  # High confidence class 0: score_0=0.2, score_1=0.8
...     [0.75, 0.75],  # Both classes pass (illustrative row, does not sum to 1): score_0=0.25, score_1=0.25
...     [0.2, 0.8],  # High confidence class 1: score_0=0.8, score_1=0.2
... ])
>>> threshold = 0.3
>>> pred_sets = build_conditional_prediction_sets(probs, threshold)
>>> print(pred_sets)  # [{0}, {0, 1}, {1}]

Notes

This function is used for conditional analysis in Mondrian conformal prediction, where we evaluate prediction sets conditioned on the true class label. For each class, we use the threshold calibrated for that class and apply it to BOTH classes in the prediction set, providing conditional coverage guarantees.

The data is filtered by true class label BEFORE calling this function (e.g., via split_by_class). This ensures that when evaluating conditional coverage P(Y ∈ C(X) | Y = y), we only analyze samples where the true label Y equals the class y for which the threshold was calibrated.
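The workflow in these notes (filter by true label, apply that class's threshold to both classes, then measure coverage) might look like the sketch below. The helper `conditional_sets_sketch`, the per-class thresholds, and the data are all hypothetical, and a boolean mask stands in for `split_by_class`, whose signature is not shown here.

```python
import numpy as np


def conditional_sets_sketch(probs, threshold):
    """Hypothetical sketch: one threshold applied to both classes."""
    probs = np.asarray(probs, dtype=float)
    include = probs >= 1.0 - threshold            # P(k) >= 1 - threshold
    return [{k for k in (0, 1) if row[k]} for row in include]


labels = np.array([0, 1, 0, 1, 0])
probs = np.array([
    [0.90, 0.10],
    [0.20, 0.80],
    [0.40, 0.60],
    [0.45, 0.55],
    [0.70, 0.30],
])

# Assumed per-class thresholds, e.g. obtained from a calibration step
thresholds = {0: 0.35, 1: 0.5}

coverage = {}
for y, thr in thresholds.items():
    mask = labels == y                            # keep samples whose true label is y
    sets_y = conditional_sets_sketch(probs[mask], thr)
    # Empirical estimate of P(Y in C(X) | Y = y)
    coverage[y] = float(np.mean([y in s for s in sets_y]))

print(coverage)  # class 0: 2 of 3 covered, class 1: 2 of 2 covered
```

Because each group contains only samples with true label y, the mean of the membership indicator is a direct estimate of the conditional coverage for that class.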

ssbc.utils.compute_operational_rate(prediction_sets, true_labels, rate_type)

Compute operational rate indicators for prediction sets.

For each prediction set, compute a binary indicator showing whether a specific operational event occurred (singleton, doublet, abstention, error in singleton, or correct in singleton).

Parameters:

prediction_sets (list[set | list]) – Prediction sets for each sample. Each set contains predicted labels.
true_labels (np.ndarray) – True labels for each sample
rate_type ({"singleton", "doublet", "abstention", "error_in_singleton", "correct_in_singleton"}) – Type of operational rate to compute:
- "singleton": prediction set contains exactly one label
- "doublet": prediction set contains exactly two labels
- "abstention": prediction set is empty
- "error_in_singleton": singleton prediction that doesn't contain the true label
- "correct_in_singleton": singleton prediction that contains the true label

Returns:

Binary indicators (0 or 1) for whether the event holds for each sample

Return type:

np.ndarray

Examples

>>> import numpy as np
>>> from ssbc.utils import compute_operational_rate
>>>
>>> pred_sets = [{0}, {0, 1}, set(), {1}]
>>> true_labels = np.array([0, 0, 1, 0])
>>> indicators = compute_operational_rate(pred_sets, true_labels, "singleton")
>>> print(indicators)  # [1, 0, 0, 1]
>>> indicators = compute_operational_rate(pred_sets, true_labels, "correct_in_singleton")
>>> print(indicators)  # [1, 0, 0, 0] - first and last are singletons, only the first is correct

Notes

This function is useful for computing operational statistics on conformal prediction sets, such as singleton rates, escalation rates, and error rates.
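To make the five indicator definitions concrete, the following sketch computes them from set sizes and label membership, then averages each indicator to obtain a rate. It is an assumption-based illustration, not the library's code; `operational_indicators_sketch` is a hypothetical name.

```python
import numpy as np


def operational_indicators_sketch(prediction_sets, true_labels, rate_type):
    """Hypothetical sketch of the documented indicator definitions."""
    sizes = np.array([len(s) for s in prediction_sets], dtype=int)
    # True where the prediction set contains the sample's true label
    hits = np.array(
        [int(y) in s for s, y in zip(prediction_sets, true_labels)], dtype=bool
    )
    if rate_type == "singleton":
        flags = sizes == 1
    elif rate_type == "doublet":
        flags = sizes == 2
    elif rate_type == "abstention":
        flags = sizes == 0
    elif rate_type == "error_in_singleton":
        flags = (sizes == 1) & ~hits
    elif rate_type == "correct_in_singleton":
        flags = (sizes == 1) & hits
    else:
        raise ValueError(f"unknown rate_type: {rate_type!r}")
    return flags.astype(int)


pred_sets = [{0}, {0, 1}, set(), {1}]
true_labels = np.array([0, 0, 1, 0])

rate_types = ["singleton", "doublet", "abstention",
              "error_in_singleton", "correct_in_singleton"]
rates = {
    rt: float(operational_indicators_sketch(pred_sets, true_labels, rt).mean())
    for rt in rate_types
}
print(rates)
```

Averaging a 0/1 indicator over all samples gives the marginal rate of that event. Note that the singleton error indicator averaged this way is a fraction of all samples, not of singletons only.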

ssbc.utils.evaluate_test_dataset(test_labels, test_probs, threshold_0, threshold_1)

Evaluate a test dataset and compute empirical operational rates.

This function takes a test dataset with true labels and probability predictions, applies Mondrian conformal prediction thresholds, and returns comprehensive empirical rates for both marginal and per-class statistics.

Parameters:

test_labels (np.ndarray) – True labels for test samples (0 or 1)
test_probs (np.ndarray) – Probability predictions for test samples, shape (n_samples, 2). test_probs[i, 0] = P(class=0), test_probs[i, 1] = P(class=1)
threshold_0 (float) – Conformal prediction threshold for class 0
threshold_1 (float) – Conformal prediction threshold for class 1

Returns:

Dictionary containing empirical rates with structure:
- 'marginal': Marginal rates across all samples
- 'class_0': Rates for class 0 samples only
- 'class_1': Rates for class 1 samples only

Each containing:
- 'singleton_rate': Fraction of samples with singleton predictions
- 'doublet_rate': Fraction of samples with doublet predictions
- 'abstention_rate': Fraction of samples with abstention (empty set)
- 'singleton_error_rate': Fraction of singleton predictions that are incorrect
- 'n_samples': Number of samples in this group
- 'n_singletons': Number of singleton predictions
- 'n_doublets': Number of doublet predictions
- 'n_abstentions': Number of abstentions

Return type:

dict

Examples

>>> import numpy as np
>>> from ssbc.utils import evaluate_test_dataset
>>>
>>> # Generate test data
>>> test_labels = np.array([0, 0, 1, 1, 0])
>>> test_probs = np.array([
...     [0.8, 0.2],  # High confidence class 0
...     [0.6, 0.4],  # Medium confidence class 0
...     [0.3, 0.7],  # High confidence class 1
...     [0.4, 0.6],  # Medium confidence class 1
...     [0.5, 0.5],  # Uncertain
... ])
>>>
>>> # Evaluate with thresholds
>>> results = evaluate_test_dataset(test_labels, test_probs, 0.3, 0.3)
>>> print(f"Marginal singleton rate: {results['marginal']['singleton_rate']:.3f}")
>>> print(f"Class 0 singleton rate: {results['class_0']['singleton_rate']:.3f}")

Notes

This function is useful for:
- Evaluating conformal prediction performance on test data
- Comparing empirical rates to theoretical bounds
- Computing operational statistics for reporting
- Validating that thresholds work as expected

The function builds prediction sets using the Mondrian approach, where score_k = 1 - P(class=k):
- For each sample, include class 0 if score_0 <= threshold_0
- For each sample, include class 1 if score_1 <= threshold_1
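The sketch below shows one way the documented result structure could be assembled from the rule above. The names `summarize_group_sketch` and `evaluate_sketch` are hypothetical; the empty-group handling (returning NaN) and the choice of singletons as the denominator of 'singleton_error_rate' are assumptions drawn from the field descriptions, not from the library source.

```python
import numpy as np


def summarize_group_sketch(pred_sets, labels):
    """Hypothetical helper: rates and counts for one group of samples."""
    n = len(pred_sets)
    sizes = np.array([len(s) for s in pred_sets], dtype=int)
    n_singletons = int(np.sum(sizes == 1))
    n_doublets = int(np.sum(sizes == 2))
    n_abstentions = int(np.sum(sizes == 0))
    # Singletons that miss the true label
    n_wrong = sum(len(s) == 1 and int(y) not in s for s, y in zip(pred_sets, labels))
    nan = float("nan")
    return {
        "singleton_rate": n_singletons / n if n else nan,
        "doublet_rate": n_doublets / n if n else nan,
        "abstention_rate": n_abstentions / n if n else nan,
        # Assumption: errors are counted relative to singletons, per the docs
        "singleton_error_rate": n_wrong / n_singletons if n_singletons else nan,
        "n_samples": n,
        "n_singletons": n_singletons,
        "n_doublets": n_doublets,
        "n_abstentions": n_abstentions,
    }


def evaluate_sketch(test_labels, test_probs, threshold_0, threshold_1):
    """Hypothetical sketch: Mondrian sets, then marginal and per-class summaries."""
    labels = np.asarray(test_labels)
    scores = 1.0 - np.asarray(test_probs, dtype=float)
    include = scores <= np.array([threshold_0, threshold_1])
    pred_sets = [{k for k in (0, 1) if row[k]} for row in include]
    results = {"marginal": summarize_group_sketch(pred_sets, labels)}
    for k in (0, 1):
        idx = np.flatnonzero(labels == k)
        results[f"class_{k}"] = summarize_group_sketch(
            [pred_sets[i] for i in idx], labels[idx]
        )
    return results


test_labels = np.array([0, 0, 1, 1, 0])
test_probs = np.array([
    [0.90, 0.10],
    [0.30, 0.70],
    [0.20, 0.80],
    [0.25, 0.75],
    [0.58, 0.42],
])
results = evaluate_sketch(test_labels, test_probs, 0.45, 0.6)
print(results["marginal"]["singleton_rate"])        # 0.8
print(results["class_0"]["singleton_error_rate"])   # 0.5
```

The example thresholds are chosen away from any score so that floating-point rounding in 1 - P(class=k) cannot flip an inclusion decision; scores that land exactly on a threshold are sensitive to that rounding.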