
Stackdriver-agent didn't collect monitoring data (HitRate for KeyCache)


I'm configuring the Stackdriver agent on a GCE VM to monitor Cassandra metrics, based on the GCP guide: https://cloud.google.com/monitoring/agent/plugins/cassandra

I used the default settings from the link above, and they work fine. However, one metric I added doesn't work and produces the error below.

I tried both gauge and counter for Type, and both Value and Count for Attribute, but none of the combinations works.

Any suggestions, please?

  • Error

    Feb 19 23:14:08 pgxxxxxxx1 collectd[16917]: write_gcm: Server response (CollectdTimeseriesRequest) contains errors: { "payloadErrors": [ { "index": 161, "valueErrors": [ { "error": { "code": 3, "message": "Unsupported collectd id: plugin: \"cassandra\" type: \"gauge\" type_instance: \"cache_key_cache-hitrate\"" } } ] } ] }
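To see exactly which collectd identifier the server rejected, the JSON part of the log line can be decoded. This is a minimal sketch, assuming the log line always has the same shape as the one above; the helper name `rejected_ids` is made up for illustration and is not part of the agent.

```python
import json
import re

# Log line copied from the error above (raw string keeps the \" escapes intact).
LOG_LINE = r'''Feb 19 23:14:08 pgxxxxxxx1 collectd[16917]: write_gcm: Server response (CollectdTimeseriesRequest) contains errors: { "payloadErrors": [ { "index": 161, "valueErrors": [ { "error": { "code": 3, "message": "Unsupported collectd id: plugin: \"cassandra\" type: \"gauge\" type_instance: \"cache_key_cache-hitrate\"" } } ] } ] }'''


def rejected_ids(log_line):
    """Return one dict per rejected value with the plugin/type/type_instance fields."""
    # Assumption: everything from the first "{" to the end of the line is JSON.
    payload = json.loads(log_line[log_line.index("{"):])
    found = []
    for payload_error in payload.get("payloadErrors", []):
        for value_error in payload_error.get("valueErrors", []):
            message = value_error["error"]["message"]
            # Pick up every  name: "value"  pair inside the message text.
            found.append(dict(re.findall(r'(\w+): "([^"]*)"', message)))
    return found


print(rejected_ids(LOG_LINE))
```

For the line above this reports plugin `cassandra`, type `gauge`, and type_instance `cache_key_cache-hitrate`, which is the combination the server does not recognize.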

Config (added KeyCache-Hitrate metrics to the original config in the guide)

  • Connection part:

    <Connection>
    # When using non-standard Cassandra configurations, replace the below with
    #ServiceURL "service:jmx:rmi:///jndi/rmi://CASSANDRA_HOST:CASSANDRA_PORT/jmxrmi"
    ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi"
    InstancePrefix "cassandra"
    User "cassandra"
    Password "xxxxxxxx"
    Collect "cassandra_storageservice-load"
    Collect "cassandra_Cache_KeyCache-Hits"
    Collect "cassandra_Cache_KeyCache-HitRate"   <===== Added line
    ...
    Collect "cassandra_DroppedMessage_MUTATION-Dropped"
    Collect "cassandra_DroppedMessage_READ-Dropped"
    

  • MBean part:

    <MBean "cassandra_Cache_KeyCache-HitRate">
        ObjectName "org.apache.cassandra.metrics:type=Cache,scope=KeyCache,name=HitRate"
        <Value>
            Type "gauge"
            InstancePrefix "cache_key_cache-hitrate"
            Table false
            Attribute "Value"
        </Value>
    </MBean>
    

My environment:

  • stackdriver-agent 5.5.2-379.sdl.stretch
  • cassandra 3.11.1


Solution

  • By following the custom metrics guides below, I was able to solve my issue.

    1. Create the custom metric. Follow the guide here: https://cloud.google.com/monitoring/custom-metrics/creating-metrics#monitoring-create-metric-python (from "Custom metric name" up to calling the create method; the time series part is not required).

    You also need to be authorized to access Monitoring (follow the IAM guide).

    2. Configure the cassandra plugin (.conf file). Guide here: https://cloud.google.com/monitoring/agent/custom-metrics-agent (from the top up to "Load the new configuration").

    My sample code:

    1. Code to create the custom metric (client_request_read-latency-1minrate.py)

      from google.cloud import monitoring

      client = monitoring.Client()

      # Describe a DOUBLE gauge with one label, "operation".
      descriptor = client.metric_descriptor(
          'custom.googleapis.com/cassandra/client_request/latency/1minrate',
          metric_kind=monitoring.MetricKind.GAUGE,
          value_type=monitoring.ValueType.DOUBLE,
          labels=[monitoring.label.LabelDescriptor(
              "operation", description="The storage operation name.")],
          description='Cassandra read latency rate for 1 minute',
          display_name='Read latency 1 minutes rate')

      # Register the descriptor with Stackdriver Monitoring.
      descriptor.create()
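The snippet above relies on the `monitoring.Client` interface of the client library version I used, which may not be present in other versions. As a library-independent sketch, the same descriptor can be written as the JSON body for the Monitoring API v3 REST call `projects.metricDescriptors.create`. The project ID below is a hypothetical placeholder, the helper name is made up, and authentication and the actual HTTP request are deliberately left out.

```python
import json

# Hypothetical placeholder; replace with your own GCP project ID.
PROJECT_ID = "my-project-id"


def build_descriptor_request(project_id):
    """Return (url, body) for a metricDescriptors.create REST call."""
    url = (
        "https://monitoring.googleapis.com/v3/"
        f"projects/{project_id}/metricDescriptors"
    )
    body = {
        "type": "custom.googleapis.com/cassandra/client_request/latency/1minrate",
        "metricKind": "GAUGE",
        "valueType": "DOUBLE",
        "labels": [
            {
                "key": "operation",
                "valueType": "STRING",
                "description": "The storage operation name.",
            }
        ],
        "description": "Cassandra read latency rate for 1 minute",
        "displayName": "Read latency 1 minutes rate",
    }
    return url, body


url, body = build_descriptor_request(PROJECT_ID)
print(url)
print(json.dumps(body, indent=2))
```

The body would be sent as an authenticated POST to the printed URL. The `type` string must match the `stackdriver_metric_type` used later in the agent configuration.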

    2. cassandra plugin example (put 2-1 and 2-2 below in the same config file)

    2-1. cassandra plugin example part 1

       <MBean "cassandra_custom_ClientRequest_Read-Latency">
           ObjectName "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency"
           <Value>
               Type "gauge"
               InstancePrefix "client_request_read-latency-1minrate"
               Table false
               Attribute "OneMinuteRate"
           </Value>
       </MBean>
      
      <Connection>
          # When using non-standard Cassandra configurations, replace the below with
          #ServiceURL "service:jmx:rmi:///jndi/rmi://CASSANDRA_HOST:CASSANDRA_PORT/jmxrmi"
          ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi"
          InstancePrefix "cassandra_custom"
          User "cassandra user name"
          Password "your password"
      
          Collect "cassandra_custom_ClientRequest_Read-Latency"
      </Connection>
      

    2-2. cassandra plugin example part 2

    <Chain "GenericJMX_cassandra_custom">
        <Rule "rewrite_genericjmx_to_cassandra_custom">
            <Match regex>
                Plugin "^GenericJMX$"
                PluginInstance "^cassandra_custom.*$"
            </Match>
            <Target "set">
                MetaData "stackdriver_metric_type" "custom.googleapis.com/cassandra/client_request/latency/1minrate"
                MetaData "label:operation" "%{plugin_instance}"
            </Target>
            <Target "replace">
                MetaData "label:operation" "cassandra_custom_" ""
            </Target>
        </Rule>
        <Rule "go_back">
            Target "return"
        </Rule>
    </Chain>
    
    <Chain "PreCache">
        <Rule "jump_to_GenericJMX_cassandra_custom">
            <Target "jump">
                Chain "GenericJMX_cassandra_custom"
            </Target>
        </Rule>
    </Chain>
    PreCacheChain "PreCache"
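To make the rewrite rule easier to follow, here is a rough Python imitation of what it does to one value: match on plugin and plugin instance, set the metric type and the `operation` label, then strip the `cassandra_custom_` prefix from the label. This is only a sketch that assumes the "replace" target behaves like a regex substitution; the function name and the plugin instance values in the test calls are hypothetical, not taken from a real agent run.

```python
import re

METRIC_TYPE = "custom.googleapis.com/cassandra/client_request/latency/1minrate"


def apply_rule(plugin, plugin_instance):
    """Imitate the rewrite rule; return the metadata it sets, or None if no match."""
    # <Match regex>: both patterns must match for the targets to run.
    if not re.search(r"^GenericJMX$", plugin):
        return None
    if not re.search(r"^cassandra_custom.*$", plugin_instance):
        return None

    # <Target "set">: attach the metric type and copy plugin_instance into the label.
    meta = {
        "stackdriver_metric_type": METRIC_TYPE,
        "label:operation": plugin_instance,
    }

    # <Target "replace">: remove the "cassandra_custom_" prefix from the label.
    meta["label:operation"] = re.sub(r"cassandra_custom_", "", meta["label:operation"])
    return meta


# Hypothetical plugin instance values, for illustration only.
print(apply_rule("GenericJMX", "cassandra_custom_read"))
print(apply_rule("GenericJMX", "cassandra"))
```

In this imitation, a plugin instance without the trailing underscore (plain `cassandra_custom`) passes the match but keeps its value unchanged as the label.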
    

    The official Stackdriver Monitoring guide is not easy to read and understand. I hope this helps.