OGBoost
- class ordinal_xai.models.ogboost.OGBoost(base_learner=None, n_estimators: int = 100, learning_rate: float = 0.1, learning_rate_thresh: float = 0.001, validation_fraction: float = 0.1, n_iter_no_change: int | None = None, tol: float = 0.0001, link_function: str = 'probit', subsample: float = 1.0, verbose: int = 0, random_state: int | None = None, cv_early_stopping_splits: int | None = None)
Bases: BaseEstimator, BaseOrdinalModel
Ordinal Gradient Boosting Model for ordinal regression.
This class wraps the GradientBoostingOrdinal model from the ogboost package. The model uses gradient boosting to learn ordinal relationships and is intended for ordinal data with complex non-linear patterns.
- Parameters:
base_learner (estimator, default=None) – The base learner used to update the latent function. If None, uses DecisionTreeRegressor(max_depth=3)
n_estimators (int, default=100) – Maximum number of boosting iterations
learning_rate (float, default=0.1) – Learning rate for the latent function updates
learning_rate_thresh (float, default=0.001) – Learning rate for the threshold updates
validation_fraction (float, default=0.1) – Fraction of data to use as a holdout set for early stopping
n_iter_no_change (int or None, default=None) – Number of iterations with no improvement to wait before stopping early
tol (float, default=1e-4) – Tolerance for measuring improvement in early stopping
link_function ({'probit', 'logit', 'loglog', 'cloglog', 'cauchit'}, default='probit') – Link function used to transform latent scores to probabilities
subsample (float, default=1.0) – Fraction of samples used to fit each base learner
verbose (int, default=0) – Verbosity level
random_state (int, RandomState instance or None, default=None) – Seed or random state for reproducibility
cv_early_stopping_splits (int or None, default=None) – If an integer > 1, uses K-fold cross-validation for early stopping
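How n_estimators, n_iter_no_change and tol interact can be pictured with a small patience-based stopping rule. The sketch below is an illustration only: it assumes that an iteration counts as an improvement when the validation loss drops below the best value so far by more than tol, which is a common convention but may not match the check used inside the ogboost package. The function name and the loss values are hypothetical.

```python
def stopping_iteration(val_losses, n_iter_no_change, tol=1e-4):
    """Return how many boosting iterations would run under a patience rule.

    val_losses: validation loss after each iteration (hypothetical values).
    n_iter_no_change: patience; None disables early stopping.
    tol: minimum decrease in loss that counts as an improvement.
    """
    if n_iter_no_change is None:
        return len(val_losses)
    best = float("inf")
    stale = 0  # consecutive iterations without sufficient improvement
    for i, loss in enumerate(val_losses, start=1):
        if loss < best - tol:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= n_iter_no_change:
                return i
    return len(val_losses)


losses = [1.00, 0.80, 0.70, 0.69995, 0.69999, 0.70, 0.65]
print(stopping_iteration(losses, n_iter_no_change=3))     # 6
print(stopping_iteration(losses, n_iter_no_change=None))  # 7
```

With a patience of 3 the loop stops at iteration 6 and never sees the later improvement to 0.65, which is the usual trade-off when choosing n_iter_no_change.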
- feature_names_
Names of features used during training
- Type:
list
- n_features_in_
Number of features seen during training
- Type:
int
- ranks_
Unique ordinal class labels
- Type:
ndarray
- _encoder
Encoder for categorical features
- Type:
OneHotEncoder
- _scaler
Scaler for numerical features
- Type:
StandardScaler
- _model
The fitted ogboost GradientBoostingOrdinal model
- Type:
GradientBoostingOrdinal
- is_fitted_
Whether the model has been fitted
- Type:
bool
Notes
The model handles both categorical and numerical features automatically
Categorical features are one-hot encoded
Numerical features are standardized
The model assumes ordinal classes are consecutive integers starting from 0
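Because the classes are assumed to be coded 0, 1, 2, and so on, labels stored in another form need to be mapped before fitting. A minimal sketch in plain Python, using a made-up three-level scale; the helper encode_ordinal is illustrative and not part of ordinal_xai:

```python
def encode_ordinal(labels, ordered_levels):
    """Map labels to consecutive integers 0..K-1 following ordered_levels."""
    level_to_rank = {level: rank for rank, level in enumerate(ordered_levels)}
    unknown = set(labels) - set(level_to_rank)
    if unknown:
        raise ValueError(f"labels not in ordered_levels: {sorted(unknown)}")
    return [level_to_rank[label] for label in labels]


levels = ["low", "medium", "high"]            # lowest to highest
y_raw = ["medium", "low", "high", "medium"]   # hypothetical raw targets
y = encode_ordinal(y_raw, levels)
print(y)  # [1, 0, 2, 1]
```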
- __init__(base_learner=None, n_estimators: int = 100, learning_rate: float = 0.1, learning_rate_thresh: float = 0.001, validation_fraction: float = 0.1, n_iter_no_change: int | None = None, tol: float = 0.0001, link_function: str = 'probit', subsample: float = 1.0, verbose: int = 0, random_state: int | None = None, cv_early_stopping_splits: int | None = None)
Initialize the Ordinal Gradient Boosting Model.
- Parameters:
base_learner (estimator, default=None) – The base learner used to update the latent function. If None, uses DecisionTreeRegressor(max_depth=3)
n_estimators (int, default=100) – Maximum number of boosting iterations
learning_rate (float, default=0.1) – Learning rate for the latent function updates
learning_rate_thresh (float, default=0.001) – Learning rate for the threshold updates
validation_fraction (float, default=0.1) – Fraction of data to use as a holdout set for early stopping
n_iter_no_change (int or None, default=None) – Number of iterations with no improvement to wait before stopping early
tol (float, default=1e-4) – Tolerance for measuring improvement in early stopping
link_function (str, default='probit') – Link function used to transform latent scores to probabilities
subsample (float, default=1.0) – Fraction of samples used to fit each base learner
verbose (int, default=0) – Verbosity level
random_state (int, RandomState instance or None, default=None) – Seed or random state for reproducibility
cv_early_stopping_splits (int or None, default=None) – If an integer > 1, uses K-fold cross-validation for early stopping
- get_params(deep: bool = True) → Dict[str, any]
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
Parameter names mapped to their values
- Return type:
dict
- set_params(**params: any) → OGBoost
Set the parameters of this estimator.
- Parameters:
**params (dict) – Estimator parameters
- Returns:
self – The estimator instance
- Return type:
OGBoost
- fit(X: DataFrame, y: Series) → OGBoost
Fit the Ordinal Gradient Boosting Model.
This method fits the model to the training data, handling both categorical and numerical features appropriately.
- Parameters:
X (pd.DataFrame of shape (n_samples, n_features)) – Training data
y (pd.Series of shape (n_samples,)) – Target values
- Returns:
self – The fitted model
- Return type:
OGBoost
- Raises:
ValueError – If the input data contains invalid values
- predict(X: DataFrame) → ndarray
Predict ordinal class labels.
- Parameters:
X (pd.DataFrame of shape (n_samples, n_features)) – Samples to predict
- Returns:
Predicted ordinal class labels
- Return type:
ndarray of shape (n_samples,)
- Raises:
NotFittedError – If the model has not been fitted
- predict_proba(X: DataFrame) → ndarray
Predict class probabilities.
- Parameters:
X (pd.DataFrame of shape (n_samples, n_features)) – Samples to predict probabilities for
- Returns:
Predicted class probabilities
- Return type:
ndarray of shape (n_samples, n_classes)
- Raises:
NotFittedError – If the model has not been fitted
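How one latent value becomes a row of class probabilities can be sketched with a cumulative link model. The example assumes the common parameterization P(y <= k | x) = Φ(θ_k − f(x)) with increasing thresholds θ_k and the probit link; the thresholds and the latent value are made-up numbers, and the parameterization used internally by ogboost may differ.

```python
import math


def probit_cdf(z):
    """Standard normal CDF, i.e. the inverse of the probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def class_probabilities(latent, thresholds):
    """Class probabilities for one sample under a cumulative probit model.

    thresholds must be sorted in increasing order; K-1 thresholds give K classes.
    """
    # Cumulative probabilities P(y <= k); the last class always reaches 1.
    cumulative = [probit_cdf(t - latent) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(y = k) = P(y <= k) - P(y <= k-1)
        previous = c
    return probs


probs = class_probabilities(latent=0.3, thresholds=[-1.0, 0.5, 1.5])
# Taking the most probable class is one possible decision rule (an assumption here).
predicted = max(range(len(probs)), key=probs.__getitem__)
print([round(p, 3) for p in probs], predicted)
```

Three thresholds produce four probabilities that sum to one, matching the (n_samples, n_classes) shape described above for a single row.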
- transform(X: DataFrame, fit: bool = False, no_scaling: bool = False) → DataFrame
Transform input data into the format expected by the model.
This method handles both categorical and numerical features:
- Categorical features are one-hot encoded
- Numerical features are standardized (unless no_scaling=True)
- Parameters:
X (pd.DataFrame of shape (n_samples, n_features)) – Input data to transform
fit (bool, default=False) – Whether to fit new encoder/scaler or use existing ones
no_scaling (bool, default=False) – Whether to skip scaling of numerical features
- Returns:
Transformed data
- Return type:
pd.DataFrame
- Raises:
ValueError – If the input data has different features than training data
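The fit flag separates learning the encoding from applying it. The sketch below imitates that behaviour in plain Python on a list of dict rows. It is an illustration under stated assumptions, not the ordinal_xai implementation: the class name, the row format, and the use of the population standard deviation are all hypothetical choices, whereas the real method works on DataFrames with OneHotEncoder and StandardScaler.

```python
import math


class TinyPreprocessor:
    """Toy one-hot encoder and standardizer (hypothetical helper)."""

    def transform(self, rows, fit=False, no_scaling=False):
        columns = sorted(rows[0])
        if fit:
            # Learn categories and scaling statistics from these rows.
            self.columns_ = columns
            self.categories_ = {}
            self.stats_ = {}
            for col in columns:
                values = [row[col] for row in rows]
                if isinstance(values[0], str):
                    self.categories_[col] = sorted(set(values))
                else:
                    mean = sum(values) / len(values)
                    var = sum((v - mean) ** 2 for v in values) / len(values)
                    # Fall back to 1.0 for constant columns to avoid dividing by zero.
                    self.stats_[col] = (mean, math.sqrt(var) or 1.0)
        elif columns != self.columns_:
            raise ValueError("input has different features than training data")

        out = []
        for row in rows:
            new_row = {}
            for col in self.columns_:
                if col in self.categories_:
                    for cat in self.categories_[col]:
                        new_row[f"{col}_{cat}"] = 1.0 if row[col] == cat else 0.0
                elif no_scaling:
                    new_row[col] = float(row[col])
                else:
                    mean, std = self.stats_[col]
                    new_row[col] = (row[col] - mean) / std
            out.append(new_row)
        return out


train = [{"age": 20, "city": "A"}, {"age": 40, "city": "B"}]
prep = TinyPreprocessor()
encoded = prep.transform(train, fit=True)            # learns and applies
new = prep.transform([{"age": 30, "city": "B"}])     # reuses learned statistics
print(encoded[0])  # {'age': -1.0, 'city_A': 1.0, 'city_B': 0.0}
print(new[0])      # {'age': 0.0, 'city_A': 0.0, 'city_B': 1.0}
```

Calling the toy transform with fit=False on rows that lack a training column raises ValueError, mirroring the error documented above.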
- decision_function(X: DataFrame) → ndarray
Compute the latent function values for input samples.
This method returns the scalar value of the latent function for each observation, which can be used as a high-resolution alternative to class labels for comparing and ranking observations.
- Parameters:
X (pd.DataFrame of shape (n_samples, n_features)) – Samples to compute decision function for
- Returns:
Latent function values
- Return type:
ndarray of shape (n_samples,)
- Raises:
NotFittedError – If the model has not been fitted
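Taken together, fit, predict, predict_proba and decision_function suggest the workflow sketched below. It uses only the constructor arguments and methods listed on this page; the helper name fit_and_predict is hypothetical, and the import sits inside the function so that the snippet can be loaded where ordinal_xai is not installed. Treat it as an outline rather than a verified recipe.

```python
def fit_and_predict(X_train, y_train, X_new):
    """Fit OGBoost and return the three kinds of predictions for X_new.

    X_train, X_new: pandas DataFrames with the same columns.
    y_train: pandas Series of integer ranks 0, 1, 2, ...
    """
    # Deferred import: defining this function does not require the package.
    from ordinal_xai.models.ogboost import OGBoost

    model = OGBoost(n_estimators=100, learning_rate=0.1, link_function="probit")
    model.fit(X_train, y_train)
    return {
        "labels": model.predict(X_new),                 # shape (n_samples,)
        "probabilities": model.predict_proba(X_new),    # shape (n_samples, n_classes)
        "latent": model.decision_function(X_new),       # shape (n_samples,)
    }
```

The latent values give a finer ordering than the labels: two rows predicted to be in the same class can still be ranked against each other by their latent score.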
- feature_importances_() → ndarray
Get feature importances from the fitted model.
Note: This method may not be available for all base learners.
- Returns:
Feature importances if available
- Return type:
ndarray of shape (n_features,)
- Raises:
NotFittedError – If the model has not been fitted
AttributeError – If the base learner doesn’t support feature importances
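Since feature_importances_ is documented here as a method that may raise AttributeError, callers may want to guard the call. A minimal sketch follows; the helper and the two stand-in classes are hypothetical and only imitate the documented behaviour.

```python
def importances_or_none(model):
    """Return model.feature_importances_() or None if unsupported."""
    try:
        # Note the parentheses: on this class it is a method, not an attribute.
        return model.feature_importances_()
    except AttributeError:
        return None


class WithImportances:
    """Stand-in for a model whose base learner exposes importances."""

    def feature_importances_(self):
        return [0.7, 0.3]


class WithoutImportances:
    """Stand-in for a model whose base learner does not."""

    def feature_importances_(self):
        raise AttributeError("base learner has no feature importances")


print(importances_or_none(WithImportances()))     # [0.7, 0.3]
print(importances_or_none(WithoutImportances()))  # None
```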
- get_booster_params() → Dict[str, any]
Get parameters of the underlying boosting model.
- Returns:
Parameters of the underlying GradientBoostingOrdinal model
- Return type:
dict
- Raises:
NotFittedError – If the model has not been fitted
- _abc_impl = <_abc._abc_data object>
- classmethod _build_request_for_signature(router, method)
Build the MethodMetadataRequest for a method using its signature.
This method takes all arguments from the method signature and uses None as their default request value, except X, y, Y, Xt, yt, *args, and **kwargs.
- Parameters:
router (MetadataRequest) – The parent object for the created MethodMetadataRequest.
method (str) – The name of the method.
- Returns:
method_request – The prepared request using the method’s signature.
- Return type:
MethodMetadataRequest
- _doc_link_module = 'sklearn'
- property _doc_link_template
- _doc_link_url_param_generator = None
- classmethod _get_default_requests()
Collect default request values.
This method combines the information present in __metadata_request__* class attributes, as well as determining request keys from method signatures.
- _get_doc_link()
Generates a link to the API documentation for a given estimator.
This method generates the link to the estimator’s documentation page by using the template defined by the attribute _doc_link_template.
- Returns:
url – The URL to the API documentation for this estimator. If the estimator does not belong to module _doc_link_module, the empty string (i.e. “”) is returned.
- Return type:
str
- _get_metadata_request()
Get requested metadata for the instance.
Please check User Guide on how the routing mechanism works.
- Returns:
request – A MetadataRequest instance.
- Return type:
MetadataRequest
- classmethod _get_param_names()
Get parameter names for the estimator
- _get_params_html(deep=True)
Get parameters for this estimator with a specific HTML representation.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values. We return a ParamsDict dictionary, which renders a specific HTML representation in table form.
- Return type:
ParamsDict
- _html_repr()
Build a HTML representation of an estimator.
Read more in the User Guide.
- Parameters:
estimator (estimator object) – The estimator to visualize.
- Returns:
html – HTML representation of estimator.
- Return type:
str
Examples
>>> from sklearn.utils._repr_html.estimator import estimator_html_repr
>>> from sklearn.linear_model import LogisticRegression
>>> estimator_html_repr(LogisticRegression())
'<style>#sk-container-id...'
- property _repr_html_
HTML representation of estimator. This is redundant with the logic of _repr_mimebundle_. The latter should be favored in the long term; _repr_html_ is only implemented for consumers who do not interpret _repr_mimebundle_.
- _repr_html_inner()
This function is returned by the @property _repr_html_ to make hasattr(estimator, "_repr_html_") return True or False depending on get_config()["display"].
- _repr_mimebundle_(**kwargs)
Mime bundle used by jupyter kernels to display estimator
- _validate_params()
Validate types and values of constructor parameters
The expected type and values must be defined in the _parameter_constraints class attribute, which is a dictionary param_name: list of constraints. See the docstring of validate_parameter_constraints for a description of the accepted constraints.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- set_transform_request(*, fit: bool | None | str = '$UNCHANGED$', no_scaling: bool | None | str = '$UNCHANGED$') → OGBoost
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
fit (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for fit parameter in transform.
no_scaling (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for no_scaling parameter in transform.
- Returns:
self – The updated object.
- Return type:
object