
Spark Scala: How to use wild card as literal in a LIKE statement


I have a simple use case: I need to use a wildcard character as a literal value in a LIKE condition.

I am trying to filter records in a string column that contain _A_. It's a simple LIKE use case, but since _ in _A_ is a wildcard, LIKE matches any character in those positions and returns wrong results.

In SQL we can use ESCAPE to achieve this. How can I achieve this in Spark?

I have not tried regular expressions; I wanted to know if there is a simpler workaround.

I am using Spark 1.5 with Scala.

Thanks in advance!


Solution

  • You can use the contains, like, or rlike functions for this case; with like, escape each _ with \\.

    // requires: import spark.implicits._ (sqlContext.implicits._ on Spark 1.x)
    // and import org.apache.spark.sql.functions.col
    val df = Seq("apo_A_", "asda", "aAc").toDF("str")
    
    // using like: escape each _ with a backslash (doubled inside a Scala string)
    df.filter(col("str").like("%\\_A\\_%")).show()
    
    // using rlike: _ has no special meaning in a regex, so it matches literally
    df.filter(col("str").rlike(".*_A_.*")).show()
    
    // using contains: a plain substring match, with no pattern syntax at all
    df.filter(col("str").contains("_A_")).show()
    
    //+------+
    //|   str|
    //+------+
    //|apo_A_|
    //+------+
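
    For completeness, the same escaping also works through the SQL API, since backslash is the default escape character in Spark SQL's LIKE. A minimal sketch, assuming a SparkSession named spark (on Spark 1.x, use sqlContext and registerTempTable instead):
    
    ```scala
    // Register the DataFrame from above as a temporary view so it can
    // be queried with raw SQL. (Spark 1.x: df.registerTempTable("t"))
    df.createOrReplaceTempView("t")
    
    // Four backslashes: the Scala string yields \\_, and the SQL string
    // literal parser reduces that to \_, which LIKE reads as a literal _.
    spark.sql("SELECT str FROM t WHERE str LIKE '%\\\\_A\\\\_%'").show()
    ```
    
    The extra level of escaping is the usual gotcha here: one level is consumed by the Scala string literal and another by the SQL parser before the pattern ever reaches LIKE.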