
Unsupported objective 'poisson' for LGBMRegressor #462

Closed
janjagusch opened this issue May 10, 2021 · 5 comments · Fixed by #463

Comments

@janjagusch
Contributor

janjagusch commented May 10, 2021

Issue description

When trying to run onnxmltools.convert_lightgbm on a lightgbm.sklearn.LGBMRegressor model with objective='poisson', the conversion fails with a RuntimeError.

First diagnosis

The only supported objectives are 'binary', 'multiclass' and 'regression'.

Possible solution

Implement the 'poisson' objective similarly to the 'regression' objective (same n_classes, different post_transform). The post_transform should be f(y) = e^y (I don't know whether such a post-transform function already exists). I'd like to give this a try, if that's okay with you.
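For illustration, the proposed post-transform can be sketched in a few lines of NumPy. The raw scores below are made-up values standing in for a model's output, not the output of a real LightGBM booster:

```python
import numpy as np

# Raw scores stand in for the summed leaf values a tree ensemble
# produces before any post-transform is applied.
raw_scores = np.array([-1.5, 0.0, 0.7])

# The proposed 'poisson' post-transform: f(y) = e^y.
# It maps any raw score to a strictly positive prediction, matching
# the non-negativity of a Poisson mean.
poisson_predictions = np.exp(raw_scores)

# A raw score of 0.0 maps to a prediction of exactly 1.0.
assert (poisson_predictions > 0).all()
```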

Reproducible example

import numpy as np
from lightgbm.sklearn import LGBMRegressor
from onnxmltools import convert_lightgbm
from skl2onnx.common.data_types import FloatTensorType

N_ROWS = 1000
N_COLS = 4

X = np.random.randn(N_ROWS, N_COLS)
# For 'poisson' objective, all target values need to be non-negative
y = abs(np.random.randn(N_ROWS))

estimator = LGBMRegressor(objective="poisson")
estimator.fit(X, y)

initial_types = [("float_input", FloatTensorType([None, N_COLS]))]
convert_lightgbm(model=estimator, name="Poisson Estimator", initial_types=initial_types)

Results in the following error:

Traceback (most recent call last):
  File "/Users/janjagusch/Projects/risk-modelling-pipeline-sandbox/notebooks/example-upstream.py", line 33, in <module>
    convert_lightgbm(model=estimator, name="Poisson Estimator", initial_types=initial_types)
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxmltools/convert/main.py", line 60, in convert_lightgbm
    return convert(model, name, initial_types, doc_string, target_opset, targeted_onnx,
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxmltools/convert/lightgbm/convert.py", line 55, in convert
    onnx_ml_model = convert_topology(topology, name, doc_string, target_opset, targeted_onnx)
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxconverter_common/topology.py", line 776, in convert_topology
    get_converter(operator.type)(scope, operator, container)
  File "/usr/local/Caskroom/miniforge/base/envs/risk-modelling-pipeline-sandbox/lib/python3.9/site-packages/onnxmltools/convert/lightgbm/operator_converters/LightGbm.py", line 235, in convert_lightgbm
    raise RuntimeError(
RuntimeError: LightGBM objective should be cleaned already not 'poisson'.

Versions

numpy                     1.20.2
lightgbm                  3.2.1
onnxmltools               1.8.0
skl2onnx                  1.8.0.1
@janjagusch
Contributor Author

Ok, this seems more complicated than I first thought.

@janjagusch
Contributor Author

I found a workaround for this issue:

  • Fitting the model with objective='poisson'
  • Changing the model objective post fit to 'regression' (this requires a monkey patch)
  • Serialising the model
  • Wrapping the exponential transform function around the predictions, using Python

The ONNX predictions are equivalent to the LightGBM predictions up to 6 decimal places.

import numpy as np
import onnxruntime as rt
from lightgbm.sklearn import LGBMRegressor
from onnxmltools import convert_lightgbm
from skl2onnx.common.data_types import FloatTensorType

N_ROWS = 1000
N_COLS = 4

X = np.random.randn(N_ROWS, N_COLS)
# For 'poisson' objective, all target values need to be non-negative
y = abs(np.random.randn(N_ROWS))

estimator = LGBMRegressor(objective="poisson")
estimator.fit(X, y)

pred_sklearn = estimator.predict(X)


# Monkey patching lightgbm.basic.Booster.dump_model
# to overwrite the objective with 'regression' if the
# actual objective is 'poisson'.
import lightgbm.basic
from lightgbm.basic import Booster as Booster

lightgbm.basic.Booster._dump_model = lightgbm.basic.Booster.dump_model


def dump_model(
    self, num_iteration=None, start_iteration=0, importance_type="split"
):
    result = Booster._dump_model(
        self,
        num_iteration=num_iteration,
        start_iteration=start_iteration,
        importance_type=importance_type,
    )
    if result["objective"] == "poisson":
        result["objective"] = "regression"
    return result


lightgbm.basic.Booster.dump_model = dump_model

initial_types = [("float_input", FloatTensorType([None, N_COLS]))]
onnx_model = convert_lightgbm(
    model=estimator, name="Poisson Estimator", initial_types=initial_types
)

sess = rt.InferenceSession(onnx_model.SerializeToString())
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

pred_onnx = sess.run([label_name], {input_name: X.astype(np.float32)})[0][:, 0]

np.testing.assert_almost_equal(np.exp(pred_onnx), pred_sklearn, decimal=6)

@xhochy
Contributor

xhochy commented May 10, 2021

@janjagusch I guess it is much more reliable in your workaround case if you add an Exp operation at the end of your generated onnx_model instead of doing a manual post-process. That way you would have a portable ONNX model export again.

@janjagusch
Contributor Author

Nice idea, @xhochy! I have added an exponential post transform node to the model graph and have changed the model output to point to the output of the post transform node:

import lightgbm.basic
import numpy as np
import onnx
import onnxruntime as rt
from lightgbm.basic import Booster as Booster
from lightgbm.sklearn import LGBMRegressor
from onnxmltools import convert_lightgbm
from skl2onnx.common.data_types import FloatTensorType

N_ROWS = 1000
N_COLS = 4

X = np.random.randn(N_ROWS, N_COLS)
# For 'poisson' objective, all target values need to be non-negative
y = abs(np.random.randn(N_ROWS))

estimator = LGBMRegressor(objective="poisson")
estimator.fit(X, y)

pred_sklearn = estimator.predict(X)


# Monkey patching lightgbm.basic.Booster.dump_model
# to overwrite the objective with 'regression' if the
# actual objective is 'poisson'.
lightgbm.basic.Booster._dump_model = lightgbm.basic.Booster.dump_model


def dump_model(
    self, num_iteration=None, start_iteration=0, importance_type="split"
):  # noqa: D103
    result = Booster._dump_model(
        self,
        num_iteration=num_iteration,
        start_iteration=start_iteration,
        importance_type=importance_type,
    )
    if result["objective"] == "poisson":
        result["objective"] = "regression"
    return result


lightgbm.basic.Booster.dump_model = dump_model

initial_types = [("float_input", FloatTensorType([None, N_COLS]))]
onnx_model = convert_lightgbm(
    model=estimator, name="Poisson Estimator", initial_types=initial_types
)

# Adding a node to the graph that applies the
# exponential post transform function.
# Changing the graph output to the output of
# the new post transform node.
output_name = onnx_model.graph.output[0].name
new_output_name = f"exp_{output_name}"

exp_node = onnx.helper.make_node("Exp", inputs=[output_name], outputs=[new_output_name])
onnx_model.graph.node.append(exp_node)
onnx_model.graph.output[0].name = new_output_name

sess = rt.InferenceSession(onnx_model.SerializeToString())
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

pred_onnx = sess.run([label_name], {input_name: X.astype(np.float32)})[0][:, 0]

np.testing.assert_almost_equal(pred_onnx, pred_sklearn, decimal=6)

@ogencoglu

Facing the same issue with a custom loss function. Why does ONNX need the objective for inference anyway?
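For context, the converter needs the objective because it determines which post-transform gets baked into the exported graph. A rough sketch of that idea (the mapping and names below are illustrative, not the converter's actual code):

```python
import numpy as np

# Hypothetical objective-to-post-transform mapping: the exported graph
# must apply the same function the training objective implies, so the
# converter has to know the objective at conversion time.
POST_TRANSFORMS = {
    "regression": lambda y: y,                      # identity
    "binary": lambda y: 1.0 / (1.0 + np.exp(-y)),   # sigmoid
    "poisson": lambda y: np.exp(y),                 # exp (the unsupported case here)
}

raw = np.array([0.0])
assert POST_TRANSFORMS["binary"](raw)[0] == 0.5
assert POST_TRANSFORMS["poisson"](raw)[0] == 1.0
```

An unknown objective (such as a custom loss) leaves the converter with no way to pick an entry from such a mapping, which is why conversion fails rather than silently assuming the identity transform.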
