python, databricks, feature-store

Databricks Feature Store - Can I use native Python (instead of PySpark) to create features?


I would like to create a feature table with some common time series features, using the out-of-the-box feature transformations provided by popular Python packages such as ta-lib or pandas-ta. These packages work on NumPy arrays and pandas DataFrames, not on Spark DataFrames.

Can this be done with Databricks Feature Store?

In the documentation, I could only find feature creation examples that use Spark DataFrames.


Solution

  • When it comes to creation: yes, you can compute the features with pandas. You just need to convert the pandas DataFrame into a Spark DataFrame before creating the feature table or writing new data into it. The simplest way to do that is to pass the pandas DataFrame to spark.createDataFrame.
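
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not an official recipe. The column names (symbol, date, close), the two features, and the helper names build_features and write_features are all hypothetical. Plain pandas stands in for ta-lib or pandas-ta, whose indicators would be computed at the same point. The write step assumes a Databricks cluster where a spark session and the databricks.feature_store client are available.

```python
import pandas as pd


def build_features(prices: pd.DataFrame) -> pd.DataFrame:
    """Compute two simple time series features with plain pandas (no Spark).

    Assumes hypothetical columns: symbol, date, close.
    """
    out = prices.sort_values(["symbol", "date"]).reset_index(drop=True)
    grouped = out.groupby("symbol")["close"]
    # 3-period simple moving average per symbol (NaN until 3 rows are available).
    out["sma_3"] = grouped.transform(lambda s: s.rolling(3).mean())
    # One-period return per symbol: close / previous close - 1.
    out["ret_1"] = out["close"] / grouped.shift(1) - 1
    return out[["symbol", "date", "sma_3", "ret_1"]]


def write_features(spark, features_pdf: pd.DataFrame, table_name: str) -> None:
    """Convert the pandas result to Spark and create a feature table from it.

    Intended to run on Databricks only; the import is local so that the
    pandas part above can be used and tested anywhere.
    """
    from databricks.feature_store import FeatureStoreClient

    # The key step from the answer: pandas DataFrame -> Spark DataFrame.
    features_sdf = spark.createDataFrame(features_pdf)

    fs = FeatureStoreClient()
    fs.create_table(
        name=table_name,                  # hypothetical, e.g. "my_db.price_features"
        primary_keys=["symbol", "date"],  # assumption: one row per symbol and date
        df=features_sdf,
        description="Time series features computed with pandas",
    )
```

In a notebook this would be called as write_features(spark, build_features(prices_pdf), "my_db.price_features"), where prices_pdf and the table name are placeholders. For later updates to an existing table, the same converted Spark DataFrame can be passed to the client's write_table method instead of create_table.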