azure-databricks azure-cognitive-services

Databricks Cognitive Services: name 'TextSentiment' is not defined

I am trying to use Cognitive Services on Databricks by following this guide: https://learn.microsoft.com/en-us/azure/cognitive-services/big-data/getting-started

I have implemented the sample code:

from mmlspark.cognitive import *
from pyspark.sql.functions import col

# Add your subscription key from the Language service (or a general Cognitive Service key)
service_key = "ADD-SUBSCRIPTION-KEY-HERE"

df = spark.createDataFrame([
  ("I am so happy today, its sunny!", "en-US"),
  ("I am frustrated by this rush hour traffic", "en-US"),
  ("The cognitive services on spark aint bad", "en-US"),
], ["text", "language"])

sentiment = (TextSentiment()
    .setTextCol("text")
    .setLocation("eastus")
    .setSubscriptionKey(service_key)
    .setOutputCol("sentiment")
    .setErrorCol("error")
    .setLanguageCol("language"))

results = sentiment.transform(df)

# Show the results in a table
display(results.select("text", col("sentiment")[0].getItem("score").alias("sentiment")))

I have installed the library Azure:mmlspark:0.17

But I keep getting the error:

name 'TextSentiment' is not defined

Any thoughts?

Solution

I found that, at the moment, only one cluster type makes this tutorial work: Databricks Runtime Version 6.4 Extended Support (includes Apache Spark 2.4.5, Scala 2.11), because the library supports only Runtime Version 7.0 or below. Note that the _2.11 suffix in the coordinate below is the Scala version the artifact was built for, which matches that runtime. It also seems that you need to install the latest version of the library on the cluster, as follows:

Coordinate: com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc3
Repository: https://mmlspark.azureedge.net/maven

See the screenshot.
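
Once the library is installed and the cluster has restarted, a quick sanity check in a notebook cell can confirm that the runtime and the import are what you expect before running the full sample. This is only a minimal sketch: the helper names symbol_available and is_spark_2_4 are made up for illustration and are not part of mmlspark or Databricks.

```python
import importlib


def symbol_available(module_name, symbol):
    """Return True if `module_name` imports cleanly and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or its parent package) is not installed on this cluster.
        return False
    return hasattr(module, symbol)


def is_spark_2_4(version):
    """Return True for Spark version strings such as '2.4.5'."""
    return version.split(".")[:2] == ["2", "4"]


# In a Databricks notebook `spark` is predefined, so one could run:
#   print(is_spark_2_4(spark.version))
#   print(symbol_available("mmlspark.cognitive", "TextSentiment"))
```

If the second check prints False, the cluster is most likely still picking up the old Azure:mmlspark:0.17 package, or the new coordinate was not attached to the cluster.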