LightGBM: can't access data with the Dataset get_field method


I have a simple LightGBM dataset:

import lightgbm as lgbm

dataset = lgbm.Dataset(data=X, label=y, feature_name=X.columns.tolist())

Here X is a pandas DataFrame and y is a pandas Series. I want to access a specific column of X inside my custom objective function, but when I try:

data = dataset.get_field('data')

I get this error message:

Traceback (most recent call last):

  File "<ipython-input-71-34d27860b9e3>", line 1, in <module>
    data = dataset.get_field('data')

  File "/Users/***/anaconda3/envs/py3k/lib/python3.6/site-packages/lightgbm/basic.py", line 1007, in get_field
    ctypes.byref(out_type)))

  File "/Users/***/anaconda3/envs/py3k/lib/python3.6/site-packages/lightgbm/basic.py", line 48, in _safe_call
    raise LightGBMError(_LIB.LGBM_GetLastError())

LightGBMError: b'Field not found'

Whereas this works well:

y = dataset.get_field('label')

Thank you!


Solution

  • It doesn't seem to be possible.

    The data seems to be the core of a Dataset, whereas the other lgb.Dataset constructor arguments are handled as additional fields. All of them except the data end up in the lgb.Dataset.set_field function, as you can trace through the _lazy_init function. In the C back end, field setting is handled by the SetXXXField functions, which are called from the LGBM_DatasetSetField function. Those calls do not appear anywhere else in c_api.cpp, which suggests the data itself is never registered as a retrievable field and would explain the 'Field not found' error from get_field('data').
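
    Since the column cannot be read back from the Dataset, one possible workaround is to keep your own reference to it and let the objective function close over it. The sketch below is only an illustration, not part of LightGBM: make_weighted_l2_objective and column_values are hypothetical names, and the weighted squared-error loss is an arbitrary example of an objective that needs a feature column. It assumes the column is in the same row order as the Dataset, and that the objective receives (preds, train_data) and returns (grad, hess).

```python
import numpy as np


def make_weighted_l2_objective(column_values):
    """Build a custom objective that closes over one feature column.

    `column_values` is a hypothetical per-row array, for example
    X["some_column"].to_numpy(), in the same row order as the Dataset.
    """
    weights = np.asarray(column_values, dtype=float)

    def objective(preds, train_data):
        # The label is a real Dataset field, so get_label() works.
        labels = np.asarray(train_data.get_label(), dtype=float)
        residual = np.asarray(preds, dtype=float) - labels
        # Loss per row: 0.5 * w * (pred - y)^2
        grad = weights * residual  # first derivative w.r.t. pred
        hess = weights.copy()      # second derivative w.r.t. pred
        return grad, hess

    return objective
```

    You would build the objective once, for example objective = make_weighted_l2_objective(X["some_column"]), and hand it to training. How the callable is passed depends on the LightGBM version (older releases take it as an fobj argument to train, newer ones as the "objective" entry in params), so check the documentation for the version you have installed.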