Search code examples
javaapache-flinkflink-streaminggraphite

Flink not sending metrics to Graphite


I have two Apache Flink clusters: 1.1.3 in production and 1.3.2 in staging.

I'm interested in sending metrics to a Graphite server, so I set it up as explained in https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html.

I got it to work in my 1.1.3 cluster, but not in 1.3.2. The jar files I added to the Flink lib directory are:

In 1.1.3:

In 1.3.2:

The settings I added is the same on both (except they send to a different Graphite server):

metrics.reporters: grph
metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter
metrics.reporter.grph.host: 10.x.x.x
metrics.reporter.grph.port: 2003
metrics.reporter.grph.prefix: flink
metrics.reporter.grph.protocol: TCP

The error message I'm seeing on the staging cluster (1.3.2) is:

java.lang.NoClassDefFoundError: com/codahale/metrics/Reporter
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.flink.runtime.metrics.MetricRegistry.<init>(MetricRegistry.java:123)
    at org.apache.flink.runtime.taskexecutor.TaskManagerServices.fromConfiguration(TaskManagerServices.java:188)
    at org.apache.flink.runtime.taskmanager.TaskManager$.startTaskManagerComponentsAndActor(TaskManager.scala:1921)
    at org.apache.flink.runtime.taskmanager.TaskManager$.runTaskManager(TaskManager.scala:1819)
    at org.apache.flink.runtime.taskmanager.TaskManager$.selectNetworkInterfaceAndRunTaskManager(TaskManager.scala:1673)
    at org.apache.flink.runtime.taskmanager.TaskManager$$anon$2.call(TaskManager.scala:1574)
    at org.apache.flink.runtime.taskmanager.TaskManager$$anon$2.call(TaskManager.scala:1572)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.runtime.taskmanager.TaskManager$.main(TaskManager.scala:1572)
    at org.apache.flink.runtime.taskmanager.TaskManager.main(TaskManager.scala)
Caused by: java.lang.ClassNotFoundException: com.codahale.metrics.Reporter
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 40 common frames omitted

Any help would be greatly appreciated!


Solution

  • You have to add io.dropwizard.metrics:metrics-core 3.1.0 to the /lib folder as well. The reason is that in 1.1 the Flink runtime itself was using metrics-core, which is no longer the case in 1.3.

    Alternatively, you can also use the flink-metrics-graphite jar-with-dependencies which should contain everything you need.