python feature-store mlrun

Feature types are overridden after ingesting values into a FeatureSet


I see that some feature types change back and forth after I ingest values into the feature set. The log shows:

> 2023-07-08 17:29:42,018 [warning] Overriding type of entity 'fn0' from 'int' to 'int32'. This may result in errors or unusable data.
> 2023-07-08 17:29:42,018 [warning] Overriding type of entity 'fn1' from 'int' to 'int32'. This may result in errors or unusable data.
> 2023-07-08 17:29:42,018 [warning] Overriding type of entity 'fn2' from 'int' to 'int32'. This may result in errors or unusable data.
> 2023-07-08 17:29:46,792 [warning] Overriding type of entity 'fn0' from 'int32' to 'int'. This may result in errors or unusable data.
> 2023-07-08 17:29:46,792 [warning] Overriding type of entity 'fn1' from 'int32' to 'int'. This may result in errors or unusable data.
> 2023-07-08 17:29:46,792 [warning] Overriding type of entity 'fn2' from 'int32' to 'int'. This may result in errors or unusable data.
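
To see at a glance which entities are affected, the warnings can be summarized with a small helper. This is only a sketch: the regular expression is derived from the log lines quoted above and may not match other MLRun versions, and `find_type_overrides` and `find_flip_flops` are hypothetical names, not part of MLRun.

```python
import re

# Pattern for the warning lines quoted above. The format is taken from this
# log excerpt only and is an assumption for any other MLRun version.
OVERRIDE_RE = re.compile(
    r"Overriding type of (?P<kind>\w+) '(?P<name>[^']+)' "
    r"from '(?P<old>[^']+)' to '(?P<new>[^']+)'"
)


def find_type_overrides(log_text):
    """Return (name, old_type, new_type) tuples in the order they appear."""
    return [
        (m.group("name"), m.group("old"), m.group("new"))
        for m in OVERRIDE_RE.finditer(log_text)
    ]


def find_flip_flops(overrides):
    """Return names whose type was changed and later changed back."""
    seen = set()
    flipped = []
    for name, old, new in overrides:
        # A flip-flop is an override that reverses an earlier one.
        if (name, new, old) in seen and name not in flipped:
            flipped.append(name)
        seen.add((name, old, new))
    return flipped


sample_log = """\
2023-07-08 17:29:42,018 [warning] Overriding type of entity 'fn0' from 'int' to 'int32'. This may result in errors or unusable data.
2023-07-08 17:29:46,792 [warning] Overriding type of entity 'fn0' from 'int32' to 'int'. This may result in errors or unusable data.
"""

overrides = find_type_overrides(sample_log)
print(overrides)
print(find_flip_flops(overrides))
```

For the excerpt above, every entity (fn0, fn1, fn2) goes from 'int' to 'int32' and back again about four seconds later.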

I used this sample code:

import mlrun.feature_store as fstore
from mlrun.datastore.targets import RedisNoSqlTarget

# Entities are declared without an explicit value_type
feature_set = fstore.FeatureSet(
    feature_name,
    entities=[fstore.Entity("fn0"), fstore.Entity("fn1"), fstore.Entity("fn2")],
    engine="storey",
)
feature_set.set_targets(
    targets=[RedisNoSqlTarget(path="redis://redis-fs-test.eu.infra:6379")],
    with_defaults=False,
)
feature_set.save()
...
fstore.ingest(feature_set, dataFrm, overwrite=False)

Do you know how I can avoid the type overriding?


Solution

  • I got it. There are two steps:

    1. Define all entities and features with their explicit types when you define the FeatureSet.
    2. Then call ingest with infer_options=data_types.InferOptions.Null, which switches off type discovery and therefore type overriding.

    See the new code:

    import mlrun.feature_store as fstore
    from mlrun import data_types
    from mlrun.datastore.targets import RedisNoSqlTarget

    # Step 1: declare every entity and feature with an explicit type
    feature_set = fstore.FeatureSet(
        feature_name,
        entities=[
            fstore.Entity("fn0", value_type=data_types.ValueType.INT32),
            fstore.Entity("fn1", value_type=data_types.ValueType.INT32),
            fstore.Entity("fn2", value_type=data_types.ValueType.INT32),
        ],
        engine="storey",
    )
    feature_set.add_feature(fstore.Feature(data_types.ValueType.INT32, name="fn3"))
    feature_set.add_feature(fstore.Feature(data_types.ValueType.INT32, name="fn4"))
    ...
    feature_set.set_targets(
        targets=[RedisNoSqlTarget(path="redis://redis-fs-test.eu.infra:6379")],
        with_defaults=False,
    )
    feature_set.save()
    ...
    # Step 2: switch off type inference during ingest
    fstore.ingest(feature_set, dataFrm, overwrite=False, infer_options=data_types.InferOptions.Null)
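
Since step 2 switches off type discovery, it may be worth checking before ingest that the data really has the declared types. The sketch below is one possible pre-check in plain Python. `DECLARED` and `find_type_mismatches` are hypothetical names, and the type names are plain strings chosen to match pandas dtype names, not MLRun `ValueType` members.

```python
# Declared types, mirroring the FeatureSet definition above. The plain
# string names are an assumption made for this sketch.
DECLARED = {
    "fn0": "int32",
    "fn1": "int32",
    "fn2": "int32",
    "fn3": "int32",
    "fn4": "int32",
}


def find_type_mismatches(declared, observed):
    """Compare declared type names against observed column dtype names.

    Returns {column: (declared_type, observed_type)}. A column missing from
    the data is reported with an observed type of None.
    """
    mismatches = {}
    for column, declared_type in declared.items():
        observed_type = observed.get(column)
        if observed_type != declared_type:
            mismatches[column] = (declared_type, observed_type)
    return mismatches


# With a pandas DataFrame, the observed mapping could be built as:
#     observed = {col: str(dtype) for col, dtype in dataFrm.dtypes.items()}
# Here it is written out by hand so the sketch runs without pandas.
observed = {"fn0": "int32", "fn1": "int64", "fn2": "int32", "fn3": "int32"}

print(find_type_mismatches(DECLARED, observed))
```

In this made-up example, fn1 is reported because it is int64 rather than int32, and fn4 is reported because the column is missing. An empty result means the data matches the declared schema.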