
Unsupported objective 'poisson' for LGBMRegressor #462

Closed
janjagusch opened this issue May 10, 2021 · 5 comments · Fixed by #463

Comments

@janjagusch
Contributor

janjagusch commented May 10, 2021

Issue description

When trying to run onnxmltools.convert_lightgbm on a lightgbm.sklearn.LGBMRegressor model with objective='poisson', the conversion fails with a RuntimeError.

First diagnosis

The only supported objectives are 'binary', 'multiclass' and 'regression'.

Possible solution

Implement the 'poisson' objective similarly to the 'regression' objective (same n_classes, different post_transform). The post_transform should be f(y) = e^y (I don't know whether such a post-transform function already exists). I'd like to give this a try, if that's okay with you.
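For illustration, the proposed post-transform can be sketched in a few lines of NumPy. The raw scores below are made-up values standing in for a model's output, not the output of a real LightGBM booster:

```python
import numpy as np

# Raw scores stand in for the summed leaf values a tree ensemble
# produces before any post-transform is applied.
raw_scores = np.array([-1.5, 0.0, 0.7])

# The proposed 'poisson' post-transform: f(y) = e^y.
# It maps any raw score to a strictly positive prediction, matching
# the non-negativity of a Poisson mean.
poisson_predictions = np.exp(raw_scores)

# A raw score of 0.0 maps to a prediction of exactly 1.0.
assert (poisson_predictions > 0).all()
```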

Reproducible example

import numpy as np
from lightgbm.sklearn import LGBMRegressor
from onnxmltools import convert_lightgbm
from skl2onnx.common.data_types import FloatTensorType

N_ROWS = 1000
N_COLS = 4

X = np.random.randn(N_ROWS, N_COLS)
# For 'poisson' objective, all target values need to be non-negative
y = abs(np.random.randn(N_ROWS))

estimator = LGBMRegressor(objective="poisson")
estimator.fit(X, y)

initial_types = [("float_input", FloatTensorType([None, N_COLS]))]
convert_lightgbm(model=estimator, name="Poisson Estimator", initial_types=initial_types)

Results in the following error:

Traceback (most recent call last):
  File "/Users/janjagusch/Projects/risk-modelling-pipeline-sandbox/notebooks/example-upstream.py", line 33, in <module>
    convert_lightgbm(model=estimator, name="Poisson Estimator", initial_types=initial_types)
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxmltools/convert/main.py", line 60, in convert_lightgbm
    return convert(model, name, initial_types, doc_string, target_opset, targeted_onnx,
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxmltools/convert/lightgbm/convert.py", line 55, in convert
    onnx_ml_model = convert_topology(topology, name, doc_string, target_opset, targeted_onnx)
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxconverter_common/topology.py", line 776, in convert_topology
    get_converter(operator.type)(scope, operator, container)
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxmltools/convert/lightgbm/operator_converters/LightGbm.py", line 235, in convert_lightgbm
    raise RuntimeError(
RuntimeError: LightGBM objective should be cleaned already not 'poisson'.

Versions

numpy                     1.20.2
lightgbm                  3.2.1
onnxmltools               1.8.0
skl2onnx                  1.8.0.1
@janjagusch
Contributor Author

Ok, this seems more complicated than I first thought.

@janjagusch
Contributor Author

I found a workaround for this issue:

  • Fitting the model with objective='poisson'
  • Changing the model objective post fit to 'regression' (this requires a monkey patch)
  • Serialising the model
  • Wrapping the exponential transform function around the predictions, using Python

The ONNX predictions are equivalent to the LightGBM predictions up to 6 decimal places.

import numpy as np
import onnxruntime as rt
from lightgbm.sklearn import LGBMRegressor
from onnxmltools import convert_lightgbm
from skl2onnx.common.data_types import FloatTensorType

N_ROWS = 1000
N_COLS = 4

X = np.random.randn(N_ROWS, N_COLS)
# For 'poisson' objective, all target values need to be non-negative
y = abs(np.random.randn(N_ROWS))

estimator = LGBMRegressor(objective="poisson")
estimator.fit(X, y)

pred_sklearn = estimator.predict(X)


# Monkey patching lightgbm.basic.Booster.dump_model
# to overwrite the objective with 'regression' if the
# actual objective is 'poisson'.
import lightgbm.basic
from lightgbm.basic import Booster as Booster

lightgbm.basic.Booster._dump_model = lightgbm.basic.Booster.dump_model


def dump_model(
    self, num_iteration=None, start_iteration=0, importance_type="split"
):
    result = Booster._dump_model(
        self,
        num_iteration=num_iteration,
        start_iteration=start_iteration,
        importance_type=importance_type,
    )
    if result["objective"] == "poisson":
        result["objective"] = "regression"
    return result


lightgbm.basic.Booster.dump_model = dump_model

initial_types = [("float_input", FloatTensorType([None, N_COLS]))]
onnx_model = convert_lightgbm(
    model=estimator, name="Poisson Estimator", initial_types=initial_types
)

sess = rt.InferenceSession(onnx_model.SerializeToString())
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

pred_onnx = sess.run([label_name], {input_name: X.astype(np.float32)})[0][:, 0]

np.testing.assert_almost_equal(np.exp(pred_onnx), pred_sklearn, decimal=6)

@xhochy
Contributor

xhochy commented May 10, 2021

@janjagusch I guess it is much more reliable in your workaround case if you add an Exp operation at the end of your generated onnx_model instead of doing a manual post-process. That way you would have a portable ONNX model export again.

@janjagusch
Contributor Author

Nice idea, @xhochy! I have added an exponential post transform node to the model graph and have changed the model output to point to the output of the post transform node:

import lightgbm.basic
import numpy as np
import onnx
import onnxruntime as rt
from lightgbm.basic import Booster as Booster
from lightgbm.sklearn import LGBMRegressor
from onnxmltools import convert_lightgbm
from skl2onnx.common.data_types import FloatTensorType

N_ROWS = 1000
N_COLS = 4

X = np.random.randn(N_ROWS, N_COLS)
# For 'poisson' objective, all target values need to be non-negative
y = abs(np.random.randn(N_ROWS))

estimator = LGBMRegressor(objective="poisson")
estimator.fit(X, y)

pred_sklearn = estimator.predict(X)


# Monkey patching lightgbm.basic.Booster.dump_model
# to overwrite the objective with 'regression' if the
# actual objective is 'poisson'.
lightgbm.basic.Booster._dump_model = lightgbm.basic.Booster.dump_model


def dump_model(
    self, num_iteration=None, start_iteration=0, importance_type="split"
):  # noqa: D103
    result = Booster._dump_model(
        self,
        num_iteration=num_iteration,
        start_iteration=start_iteration,
        importance_type=importance_type,
    )
    if result["objective"] == "poisson":
        result["objective"] = "regression"
    return result


lightgbm.basic.Booster.dump_model = dump_model

initial_types = [("float_input", FloatTensorType([None, N_COLS]))]
onnx_model = convert_lightgbm(
    model=estimator, name="Poisson Estimator", initial_types=initial_types
)

# Adding a node to the graph that applies the
# exponential post transform function.
# Changing the graph output to the output of
# the new post transform node.
output_name = onnx_model.graph.output[0].name
new_output_name = f"exp_{output_name}"

exp_node = onnx.helper.make_node("Exp", inputs=[output_name], outputs=[new_output_name])
onnx_model.graph.node.append(exp_node)
onnx_model.graph.output[0].name = new_output_name

sess = rt.InferenceSession(onnx_model.SerializeToString())
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

pred_onnx = sess.run([label_name], {input_name: X.astype(np.float32)})[0][:, 0]

np.testing.assert_almost_equal(pred_onnx, pred_sklearn, decimal=6)

@ogencoglu

Facing the same issue with a custom loss function. Why does ONNX need the objective for inference anyway?
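For context, the converter needs the objective because it determines which post-transform gets baked into the exported graph. A rough sketch of that idea (the mapping and names below are illustrative, not the converter's actual code):

```python
import numpy as np

# Hypothetical objective-to-post-transform mapping: the exported graph
# must apply the same function the training objective implies, so the
# converter has to know the objective at conversion time.
POST_TRANSFORMS = {
    "regression": lambda y: y,                      # identity
    "binary": lambda y: 1.0 / (1.0 + np.exp(-y)),   # sigmoid
    "poisson": lambda y: np.exp(y),                 # exp (the unsupported case here)
}

raw = np.array([0.0])
assert POST_TRANSFORMS["binary"](raw)[0] == 0.5
assert POST_TRANSFORMS["poisson"](raw)[0] == 1.0
```

An unknown objective (such as a custom loss) leaves the converter with no way to pick an entry from such a mapping, which is why conversion fails rather than silently assuming the identity transform.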
