ssbc.hyperparameter

Hyperparameter sweep and optimization for Mondrian conformal prediction.

Functions

`sweep_and_plot_parallel_plotly`(class_data, ...)	Convenience wrapper: run sweep + show plotly parallel coordinates figure.
`sweep_hyperparams_and_collect`(class_data, ...)	Sweep (a0,d0,a1,d1), run mondrian_conformal_calibrate + report_prediction_stats, and return a tidy DataFrame with hyperparams + selected metrics.

ssbc.hyperparameter.sweep_hyperparams_and_collect(class_data, alpha_0, delta_0, alpha_1, delta_1, mode='beta', extra_metrics=None, quiet=True)

Sweep (a0,d0,a1,d1), run mondrian_conformal_calibrate + report_prediction_stats, and return a tidy DataFrame with hyperparams + selected metrics.

This function performs a grid search over hyperparameter combinations and evaluates the resulting conformal prediction performance.

Parameters:

class_data (dict) – Output from split_by_class()
alpha_0 (array-like) – Grid of alpha values for class 0
delta_0 (array-like) – Grid of delta values for class 0
alpha_1 (array-like) – Grid of alpha values for class 1
delta_1 (array-like) – Grid of delta values for class 1
mode (str, default="beta") – “beta” or “beta-binomial” mode for SSBC
extra_metrics (dict of {name: function}, optional) – Additional metrics to compute. Each function takes the summary dict and returns a scalar value.
quiet (bool, default=True) – If True, suppress progress output
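
As a rough sketch of the extra_metrics argument: it maps a column name to a callable that receives the summary dict and returns a scalar. The layout of the summary dict is not documented in this section, so the keys "n_total" and "m_abst" below, the helper abstention_fraction, and the stand-in fake_summary are all hypothetical; inspect a real summary before relying on any key name.

```python
# Hypothetical key names: the structure of the summary dict is an assumption.
def abstention_fraction(summary):
    """Return abstentions as a fraction of all samples (a scalar)."""
    n_total = summary.get("n_total", 0)
    if n_total == 0:
        return float("nan")
    return summary.get("m_abst", 0) / n_total


# Column name -> callable, in the shape the extra_metrics parameter describes.
extra_metrics = {"abst_frac": abstention_fraction}

# Stand-in summary with made-up numbers, used only to exercise the callable.
fake_summary = {"n_total": 200, "m_abst": 10}
row_extras = {name: fn(fake_summary) for name, fn in extra_metrics.items()}
print(row_extras)
```

Guarding against a zero total keeps the metric from raising partway through a sweep, which matters because one failing callable would otherwise interrupt the whole grid.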

Returns:

Tidy DataFrame with one row per hyperparameter combination. Columns include:

- a0, d0, a1, d1: hyperparameters
- cov: overall coverage rate
- sing_rate: singleton prediction rate
- err_all: overall singleton error rate
- err_pred0, err_pred1: errors by predicted class
- err_y0, err_y1: errors by true class
- esc_rate: escalation rate (doublets + abstentions)
- n_total, sing_count, m_abst, m_doublets: counts
- Any additional metrics from extra_metrics

Return type:

pd.DataFrame
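
Because the result is tidy (one row per combination), ordinary pandas operations apply directly. The sketch below averages metrics over the delta grids to isolate the effect of a0. The frame is a hand-written placeholder with made-up values, not output from a real sweep; only the column names come from the list above.

```python
import pandas as pd

# Placeholder rows with made-up values; a real frame would come from
# sweep_hyperparams_and_collect and hold one row per (a0, d0, a1, d1).
df = pd.DataFrame(
    {
        "a0": [0.05, 0.05, 0.10, 0.10],
        "d0": [0.05, 0.10, 0.05, 0.10],
        "a1": [0.05, 0.05, 0.05, 0.05],
        "d1": [0.05, 0.05, 0.05, 0.05],
        "cov": [0.96, 0.95, 0.92, 0.90],
        "sing_rate": [0.70, 0.72, 0.80, 0.84],
    }
)

# Average each metric over the remaining hyperparameters to see
# how coverage and singleton rate move with a0 alone.
by_a0 = df.groupby("a0")[["cov", "sing_rate"]].mean()
print(by_a0)
```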

Examples

>>> import numpy as np
>>> from ssbc import BinaryClassifierSimulator, split_by_class
>>> from ssbc.hyperparameter import sweep_hyperparams_and_collect
>>>
>>> # Generate data
>>> sim = BinaryClassifierSimulator(0.1, (2, 8), (8, 2), seed=42)
>>> labels, probs = sim.generate(1000)
>>> class_data = split_by_class(labels, probs)
>>>
>>> # Define grid
>>> alpha_grid = np.arange(0.05, 0.20, 0.05)
>>> delta_grid = np.arange(0.05, 0.20, 0.05)
>>>
>>> # Run sweep
>>> df = sweep_hyperparams_and_collect(
...     class_data,
...     alpha_0=alpha_grid, delta_0=delta_grid,
...     alpha_1=alpha_grid, delta_1=delta_grid,
... )
>>>
>>> # Analyze results
>>> print(df[['a0', 'a1', 'cov', 'sing_rate', 'err_all']].head())

Notes

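The function performs a complete grid search, so the total number of evaluations is len(alpha_0) × len(delta_0) × len(alpha_1) × len(delta_1). The cost grows multiplicatively with each grid: four grids of 10 values each already require 10,000 evaluations, so keep grids coarse or narrow when exploring.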
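
The evaluation count can be checked before launching a sweep. This is a minimal sketch using only the standard library; the grid values are illustrative, and the order in which the function itself visits combinations is not specified here.

```python
from itertools import product
from math import prod

# Illustrative grids; plain lists stand in for array-like inputs.
alpha_0 = [0.05, 0.10, 0.15]
delta_0 = [0.05, 0.10]
alpha_1 = [0.05, 0.10, 0.15]
delta_1 = [0.05, 0.10]

grids = (alpha_0, delta_0, alpha_1, delta_1)

# Total evaluations = product of the four grid lengths.
n_evals = prod(len(g) for g in grids)

# Enumerating the Cartesian product visits the same number of combinations.
combos = list(product(*grids))
print(n_evals, len(combos))
```

When grids are built with np.arange and a fractional step, the number of points can differ by one because of floating-point rounding; np.linspace with an explicit point count avoids that ambiguity.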

ssbc.hyperparameter.sweep_and_plot_parallel_plotly(class_data, delta_0, delta_1, alpha_0, alpha_1, mode='beta', extra_metrics=None, color='err_all', color_continuous_scale=None, title=None, height=600)

Convenience wrapper: run sweep + show plotly parallel coordinates figure.

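This function combines hyperparameter sweep and visualization in one call. Its positional order is (delta_0, delta_1, alpha_0, alpha_1), which differs from sweep_hyperparams_and_collect, so passing the grids by keyword avoids mix-ups.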

Parameters:

class_data (dict) – Output from split_by_class()
delta_0 (array-like) – Grid of delta values for class 0
delta_1 (array-like) – Grid of delta values for class 1
alpha_0 (array-like) – Grid of alpha values for class 0
alpha_1 (array-like) – Grid of alpha values for class 1
mode (str, default="beta") – “beta” or “beta-binomial” mode for SSBC
extra_metrics (dict of {name: function}, optional) – Additional metrics to compute
color (str, default='err_all') – Column to use for coloring the parallel coordinates
color_continuous_scale (plotly colorscale, optional) – Color scale for the plot
title (str, optional) – Plot title (defaults to auto-generated title)
height (int, default=600) – Plot height in pixels

Returns:

df (pd.DataFrame) – Results dataframe
fig (plotly.graph_objects.Figure) – Interactive parallel coordinates plot

Examples

>>> import numpy as np
>>> from ssbc import BinaryClassifierSimulator, split_by_class
>>> from ssbc.hyperparameter import sweep_and_plot_parallel_plotly
>>>
>>> # Generate data
>>> sim = BinaryClassifierSimulator(0.1, (2, 8), (8, 2), seed=42)
>>> labels, probs = sim.generate(1000)
>>> class_data = split_by_class(labels, probs)
>>>
>>> # Run sweep and plot
>>> df, fig = sweep_and_plot_parallel_plotly(
...     class_data,
...     delta_0=np.arange(0.05, 0.20, 0.05),
...     delta_1=np.arange(0.05, 0.20, 0.05),
...     alpha_0=np.arange(0.05, 0.20, 0.05),
...     alpha_1=np.arange(0.05, 0.20, 0.05),
...     color='err_all'
... )
>>> fig.show()  # Display in notebook
>>> # Or save: fig.write_html("sweep_results.html")

Notes

The parallel coordinates plot allows interactive exploration of the hyperparameter space. You can brush (select) ranges on any axis to filter configurations and see their impact on other metrics.
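
Brushing in the figure has a simple tabular counterpart: each brush is a range on one column, and the selected configurations are the rows that fall inside every range. The sketch below reproduces that selection with pandas on the results frame. The rows are made-up placeholder values and the brushes mapping is a hypothetical helper structure, not part of the library.

```python
import pandas as pd

# Placeholder sweep results with made-up values.
df = pd.DataFrame(
    {
        "a0": [0.05, 0.10, 0.15, 0.10],
        "a1": [0.05, 0.05, 0.10, 0.15],
        "cov": [0.97, 0.94, 0.91, 0.90],
        "err_all": [0.02, 0.04, 0.07, 0.08],
    }
)

# Each "brush" is an inclusive (low, high) range on one axis.
brushes = {"cov": (0.93, 1.00), "err_all": (0.00, 0.05)}

# A row is selected only if it lies inside every brushed range.
mask = pd.Series(True, index=df.index)
for column, (low, high) in brushes.items():
    mask &= df[column].between(low, high)

selected = df[mask]
print(selected[["a0", "a1"]])
```

Doing the selection in code makes it reproducible, which helps once an interesting region has been found interactively.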