csv, apache-spark, spark-streaming, metrics, codahale-metrics

Spark metrics: Is there a way to add JSON content at a configurable interval to Spark metrics?


I am using the Spark metrics feature in my Spark Streaming application. I am already adding two custom metrics to the Spark metrics system (see the sketch after the list):

  1. Incoming events per second: using a Spark Meter
  2. Number of events successfully processed: using a Spark Counter

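For context, here is a minimal sketch of how the two metrics above might be registered, assuming the groupon spark-metrics library that the answer below relies on; the metric names are placeholders:

    // Assumed: UserMetricsSystem.initialize(...) has already been called on the driver (see the answer below)
    org.apache.spark.groupon.metrics.SparkMeter incomingEventsMeter =
        org.apache.spark.groupon.metrics.UserMetricsSystem.meter("incomingEventsPerSecond");
    org.apache.spark.groupon.metrics.SparkCounter processedEventsCounter =
        org.apache.spark.groupon.metrics.UserMetricsSystem.counter("processedEvents");

    // Inside the streaming job, for each event that is handled:
    incomingEventsMeter.mark();        // one incoming event observed
    processedEventsCounter.inc(1L);    // one event successfully processed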
The above metrics are written to a CSV file by the Spark metrics system as per the configuration in metrics.properties.
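For reference, the CSV output described above is typically enabled by routing all sources to Spark's built-in CSV sink in metrics.properties; the directory below is only an example path:

    # metrics.properties: write all metrics to CSV files every 10 seconds
    *.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
    *.sink.csv.period=10
    *.sink.csv.unit=seconds
    *.sink.csv.directory=/tmp/spark-metrics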

Now my requirement is to add a JSON string at a specified interval to the Spark metrics system.

The output I am expecting is a CSV file containing data something like the below:

    1,jsonString1
    2,jsonString2

or

    jsonString1
    jsonString2

Please suggest a way to do this. I have searched a lot but could not find the answer I am expecting.

Thanks in advance!


Solution

  • The above requirement can be achieved using a Spark Gauge.

    Initialize a SparkGauge object:

    org.apache.spark.groupon.metrics.SparkGauge sparkGauge = org.apache.spark.groupon.metrics.UserMetricsSystem.gauge("samplejsonStringGauge");
    
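    Note that the groupon spark-metrics library needs to be initialized once on the driver before any gauge, meter, or counter is created. A minimal sketch, assuming a JavaSparkContext named javaSparkContext and an example namespace string:

    // One-time setup on the driver, before creating any metric
    org.apache.spark.groupon.metrics.UserMetricsSystem.initialize(javaSparkContext.sc(), "MyStreamingAppMetrics");
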

    You can execute the code below each time you need to record a JSON string to the Spark metrics system:

    sparkGauge.set("{\"id\":1,\"name\":\"A green door\",\"price\":12.5,\"tags\":[\"home\",\"green\"]}");
    

    The rest is taken care of by the Spark metrics system, based on the configuration in metrics.properties.
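
    Putting it together, here is a minimal sketch of recording a JSON string once per streaming batch; javaDStream and the JSON payload built here are placeholders, and UserMetricsSystem.initialize(...) is assumed to have been called already:

    org.apache.spark.groupon.metrics.SparkGauge jsonGauge =
        org.apache.spark.groupon.metrics.UserMetricsSystem.gauge("samplejsonStringGauge");

    javaDStream.foreachRDD(rdd -> {
        // Build whatever JSON payload should be exposed for this interval (placeholder content)
        String jsonStringForThisBatch = "{\"batchRecords\":" + rdd.count() + "}";
        // Record the latest value; the configured sink flushes it on its own schedule
        jsonGauge.set(jsonStringForThisBatch);
    });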