apache-spark, pyspark, redis, spark-redis

Read specific key from redis using pyspark


I am trying to read a specific key from Redis using PySpark. I haven't found anything in the documentation for reading one particular key. With the code below I can read all the data from Redis:

testid = spark.read.format("org.apache.spark.sql.redis") \
    .option("table", "testing123") \
    .option("key.column", "id") \
    .load()

How can I read just one key? Any suggestions are welcome.


Solution

  • You can try keys.pattern. From the docs:

    To read Redis Hashes you have to provide a keys pattern with .option("keys.pattern", keysPattern) option. The DataFrame schema should be explicitly specified or can be inferred from a random row.

    [...] Spark-Redis tries to extract the key based on the key pattern:

    • if the pattern ends with * and it's the only wildcard, the trailing substring will be extracted
    • otherwise there is no extraction - the key is kept as is.

    For example (replace keyPattern with your own key prefix):

    testid = spark.read.format("org.apache.spark.sql.redis") \
        .option("keys.pattern", "keyPattern:*") \
        .option("key.column", "id") \
        .option("infer.schema", "true") \
        .load()
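
To narrow this down to a single key, one option is to pass a pattern with no wildcard at all. The sketch below is a minimal illustration, assuming the hashes were written by spark-redis with the table name testing123 from the question, so that keys have the form testing123:<id>. The id value 42 and the helper names exact_key_options, read_single_key and extracted_key are hypothetical. extracted_key only models the extraction rule quoted above (counting * as the only wildcard); it is not the library's code.

```python
# Illustrative sketch. The helper names are hypothetical, not part of spark-redis.
REDIS_FORMAT = "org.apache.spark.sql.redis"


def exact_key_options(table, key_id, key_column="id"):
    """Build reader options whose keys.pattern contains no wildcard.

    Assumes keys are stored as '<table>:<key_id>'.
    """
    return {
        "keys.pattern": f"{table}:{key_id}",
        "key.column": key_column,
        "infer.schema": "true",
    }


def read_single_key(spark, table, key_id, key_column="id"):
    """Load the hash matching '<table>:<key_id>' (needs a live Redis and spark-redis)."""
    options = exact_key_options(table, key_id, key_column)
    return spark.read.format(REDIS_FORMAT).options(**options).load()


def extracted_key(pattern, redis_key):
    """Model of the quoted rule: strip the prefix only for a single trailing '*'."""
    if pattern.endswith("*") and pattern.count("*") == 1:
        return redis_key[len(pattern) - 1:]
    return redis_key  # no extraction: the key is kept as is
```

With an existing SparkSession, a call might look like testid = read_single_key(spark, "testing123", "42"). Going by the quoted rule, a pattern without a wildcard means no extraction, so the id column would likely hold the full key (testing123:42) rather than just 42.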