OBD

class ordinal_xai.models.obd.OBD(base_classifier='logistic', decomposition_type='one-vs-following', **kwargs)

Bases: BaseEstimator, BaseOrdinalModel

Ordinal Binary Decomposition (OBD) Model.

This model implements ordinal classification by decomposing the problem into a series of binary classification tasks. It supports two decomposition strategies and various base classifiers.

Parameters:

base_classifier (str, default='logistic') – The base classifier to use for each binary classification task. Options are:
- ‘logistic’: LogisticRegression with default parameters
- ‘svm’: SVC with probability=True and balanced class weights
- ‘rf’: RandomForestClassifier with conservative defaults
- ‘xgb’: XGBClassifier with conservative defaults
decomposition_type (str, default='one-vs-following') – The type of binary decomposition to use. Options are:
- ‘one-vs-following’: Each class is compared against all following classes
- ‘one-vs-next’: Each class is compared only against the next class
**kwargs (dict) – Additional parameters to pass to the base classifier. These will override the default parameters set for each classifier type.
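
As a rough illustration of the two decomposition strategies, the sketch below builds the binary training targets for a small label vector. It is not the library's code: the helper name binary_tasks is hypothetical, and reading ‘one-vs-following’ as a cumulative "label above rank i" target over all samples is an assumption, chosen because it matches the P(class > i) notation used under predict_proba.

```python
def binary_tasks(y, decomposition_type="one-vs-following"):
    """Return one (sample_indices, binary_targets) pair per adjacent rank pair.

    Hypothetical helper for illustration only; not part of ordinal_xai.
    """
    ranks = sorted(set(y))
    tasks = []
    for i in range(len(ranks) - 1):
        if decomposition_type == "one-vs-following":
            # Assumption: all samples are kept; target is 1 when the label
            # lies above ranks[i], 0 otherwise.
            idx = list(range(len(y)))
            target = [int(label > ranks[i]) for label in y]
        elif decomposition_type == "one-vs-next":
            # Assumption: only samples from ranks[i] and ranks[i + 1] are
            # kept; target is 1 for the higher of the two ranks.
            idx = [j for j, label in enumerate(y)
                   if label in (ranks[i], ranks[i + 1])]
            target = [int(y[j] == ranks[i + 1]) for j in idx]
        else:
            raise ValueError(
                f"Unknown decomposition_type: {decomposition_type!r}"
            )
        tasks.append((idx, target))
    return tasks


labels = [0, 1, 2, 1]
following = binary_tasks(labels, "one-vs-following")
nxt = binary_tasks(labels, "one-vs-next")
```

With K distinct ranks, both strategies produce K-1 binary tasks under these assumptions; they differ only in which samples each task sees.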

feature_names_

Names of the features used during training

Type: list

n_features_in_

Number of features seen during training

Type: int

ranks_

Unique ordinal class labels in ascending order

Type: ndarray

_models

List of trained binary classifiers

Type: list

_encoder

Feature encoder used for categorical variables

Type: object

_scaler

Feature scaler used for numerical variables

Type: object

is_fitted_

Flag indicating whether the model has been fitted

Type: bool

Notes

- The model automatically handles categorical and numerical features through the transform_features utility.
- For each binary classification task, if only one class is present in the training data, a DummyClassifier is used instead of the specified base classifier.
- The predict_proba method returns class probabilities that sum to 1 for each sample.
- The model is compatible with scikit-learn’s cross-validation and grid search.
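
The single-class fallback mentioned above can be pictured with a small, dependency-free sketch. Everything here is hypothetical: ConstantClassifier is a stand-in for scikit-learn's DummyClassifier, and choose_estimator is an illustrative name, not a function in ordinal_xai.

```python
from collections import Counter


class ConstantClassifier:
    """Stand-in for a dummy classifier: always predicts the label seen in fit."""

    def fit(self, X, y):
        # Remember the most frequent (here: the only) label.
        self.constant_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.constant_ for _ in X]


def choose_estimator(binary_y, make_base):
    """Use a constant predictor when the binary task contains one class only."""
    if len(set(binary_y)) < 2:
        return ConstantClassifier()
    return make_base()


# A degenerate task (all targets equal) falls back to the constant predictor.
fallback = choose_estimator([1, 1, 1], make_base=lambda: "base estimator")
fallback.fit([[0.1], [0.2], [0.3]], [1, 1, 1])
```

The point of such a fallback is that most classifiers refuse to fit on a single class, so a constant predictor keeps the list of binary models complete.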

Examples

>>> from ordinal_xai.models.obd import OBD
>>> import pandas as pd
>>> import numpy as np
>>>
>>> # Create sample data
>>> X = pd.DataFrame(np.random.randn(100, 5))
>>> y = pd.Series(np.random.randint(0, 3, 100))
>>>
>>> # Initialize model with SVM base classifier
>>> model = OBD(base_classifier='svm', decomposition_type='one-vs-next')
>>>
>>> # Train the model
>>> model.fit(X, y)
>>>
>>> # Make predictions
>>> predictions = model.predict(X)
>>> probabilities = model.predict_proba(X)

__init__(base_classifier='logistic', decomposition_type='one-vs-following', **kwargs)

Initialize the OBD model.

Parameters:

base_classifier (str, default='logistic') – The base classifier to use. Options are:
- ‘logistic’: LogisticRegression
- ‘svm’: SVC with probability=True
- ‘rf’: RandomForestClassifier
- ‘xgb’: XGBClassifier
decomposition_type (str, default='one-vs-following') – The type of binary decomposition to use. Options are:
- ‘one-vs-following’: Each class is compared against all following classes
- ‘one-vs-next’: Each class is compared only against the next class
**kwargs (dict) – Additional parameters to pass to the base classifier.

_get_base_classifier()

Get the appropriate base classifier instance with sensible defaults.

Returns: estimator – An instance of the specified base classifier with appropriate default parameters.
Return type: object
Raises: ValueError – If an unknown base classifier is specified.
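
The way user-supplied **kwargs override per-classifier defaults can be sketched with a plain dictionary merge. This is an assumption about the mechanism, not the library's code: DEFAULTS and resolve_params are hypothetical names, and only the ‘svm’ defaults are taken from this page (the values of the "conservative defaults" for ‘rf’ and ‘xgb’ are not documented here, so they are left empty).

```python
# Hypothetical default table. Only the 'svm' entry reflects what this page
# states; the other entries are intentionally empty placeholders.
DEFAULTS = {
    "logistic": {},
    "svm": {"probability": True, "class_weight": "balanced"},
    "rf": {},
    "xgb": {},
}


def resolve_params(base_classifier, **kwargs):
    """Merge defaults with overrides; later keys win, so kwargs take priority."""
    if base_classifier not in DEFAULTS:
        raise ValueError(f"Unknown base classifier: {base_classifier!r}")
    return {**DEFAULTS[base_classifier], **kwargs}


svm_params = resolve_params("svm", C=0.5, class_weight=None)
```

The resolved dictionary would then be passed to the chosen estimator's constructor.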

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

set_params(**params)

Set the parameters of this estimator.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: object

fit(X: DataFrame, y: Series) → OBD

Fit the ordinal binary decomposition model.

Parameters:

X (pd.DataFrame) – Training data of shape (n_samples, n_features)
y (pd.Series) – Target values of shape (n_samples,)

Returns:

self – Returns self.

Return type:

object

Raises:

TypeError – If X is not a DataFrame or y is not a Series
ValueError – Raised in any of these cases:
- If X and y have a different number of samples
- If X or y contains missing values
- If decomposition_type is not ‘one-vs-following’ or ‘one-vs-next’

predict(X: DataFrame) → ndarray

Predict ordinal class labels.

Parameters: X (pd.DataFrame) – Samples of shape (n_samples, n_features)
Returns: y_pred – Predicted class labels of shape (n_samples,)
Return type: ndarray
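
This page does not say how predict derives labels from probabilities. One common choice, shown here purely as an assumption, is to take the most probable column of predict_proba and map it back to the sorted labels in ranks_. The function name predict_from_proba is hypothetical.

```python
def predict_from_proba(proba, ranks):
    """Map each probability row to the rank with the highest probability.

    Assumption: column j of `proba` corresponds to ranks[j], with ranks
    sorted in ascending order (as ranks_ is documented to be).
    """
    predictions = []
    for row in proba:
        best_column = max(range(len(row)), key=row.__getitem__)
        predictions.append(ranks[best_column])
    return predictions


labels = predict_from_proba(
    [[0.3, 0.5, 0.2], [0.1, 0.2, 0.7]],
    ranks=[1, 2, 3],
)
```

On ties, max returns the first maximal column, so this sketch favors the lower rank.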

predict_proba(X: DataFrame) → ndarray

Predict class probabilities.

Parameters:

X (pd.DataFrame) – Samples of shape (n_samples, n_features)

Returns:

proba – Class probabilities of shape (n_samples, n_classes)

Return type:

ndarray

Raises:

TypeError – If X is not a DataFrame
ValueError – Raised in any of these cases:
- If X contains missing values
- If X has a different number of features than the training data

Notes

The probabilities are computed differently based on the decomposition type:

For ‘one-vs-following’:
- P(class 0) = 1 - P(class > 0)
- P(class i) = P(class > i-1) - P(class > i)
- P(class K-1) = P(class > K-2)

For ‘one-vs-next’:
- P(class 0) = 1 - P(class > 0)
- P(class i) = P(class > i-1) * (1 - P(class > i))
- P(class K-1) = P(class > K-2)
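
The two combination rules above can be worked through for a single sample with plain Python. The function names are hypothetical, and the final clip-and-renormalize step is an assumption: this page only states that the returned probabilities sum to 1, not how that is enforced.

```python
def combine_one_vs_following(p_gt):
    """p_gt[i] is a binary model's estimate of P(class > i), i = 0..K-2."""
    k = len(p_gt) + 1
    probs = [1.0 - p_gt[0]]
    for i in range(1, k - 1):
        probs.append(p_gt[i - 1] - p_gt[i])
    probs.append(p_gt[-1])
    return probs


def combine_one_vs_next(p_gt):
    """Same inputs, combined with the product rule listed for 'one-vs-next'."""
    k = len(p_gt) + 1
    probs = [1.0 - p_gt[0]]
    for i in range(1, k - 1):
        probs.append(p_gt[i - 1] * (1.0 - p_gt[i]))
    probs.append(p_gt[-1])
    return probs


def normalize(probs):
    """Assumed post-processing: clip negatives to 0 and rescale to sum to 1."""
    clipped = [max(p, 0.0) for p in probs]
    total = sum(clipped)
    if total == 0.0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(clipped)] * len(clipped)
    return [p / total for p in clipped]


p_gt = [0.7, 0.2]  # P(class > 0), P(class > 1) for K = 3 classes
following = normalize(combine_one_vs_following(p_gt))
nxt_raw = combine_one_vs_next(p_gt)
nxt = normalize(nxt_raw)
```

For ‘one-vs-following’ the raw values already sum to 1, but a difference can turn negative when the binary estimates are not monotonically decreasing in i. For ‘one-vs-next’ the raw values generally do not sum to 1, which is why some form of normalization is assumed here.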

transform(X: DataFrame, fit=False, no_scaling=False) → DataFrame

Transform input data into the format expected by the model.

Parameters:

X (pd.DataFrame) – Input data of shape (n_samples, n_features)
fit (bool, default=False) – Whether this is being called during fit or predict
no_scaling (bool, default=False) – Whether to skip feature scaling

Returns:

X_transformed – Transformed data ready for model input

Return type:

pd.DataFrame

set_transform_request(*, fit: bool | None | str = '$UNCHANGED$', no_scaling: bool | None | str = '$UNCHANGED$') → OBD

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

False: metadata is not requested and the meta-estimator will not pass it to transform.

None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

fit (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for fit parameter in transform.
no_scaling (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for no_scaling parameter in transform.

Returns:

self – The updated object.

Return type:

object