
OpenTelemetry: How to debug connections


I'm trying to get my OpenTelemetry Collector container to pass spans along to my Jaeger container, but I haven't quite figured it out, and I can't tell what's wrong either.

I have confirmed that:

  • my app is generating and passing along spans to OTel
  • OTel is receiving the spans

Beyond that, I see nothing in the logs to indicate that errors occur during export to Jaeger, yet no spans ever appear there. It is also hard to debug because a large amount of text is output every ten seconds, which makes it hard to scroll through and find the important bits.

Running Otel with:

/usr/bin/docker run \
  --name oqm_otel \
  -p 1888:1888 \
  -p 8888:8888 \
  -p 8889:8889 \
  -p 13133:13133 \
  -p 4317:4317 \
  -p 4318:4318 \
  -p 55679:55679 \
  -v /etc/oqm/infra/otel/otel-collector-config.yaml:/etc/otel-collector-config.yaml \
  --add-host host.docker.internal:host-gateway \
  otel/opentelemetry-collector:0.72.0

Running Jaeger with:

docker run --name oqm_jaeger -p 8090:16686 -p 8091:14268 -p 8096:4317 -e COLLECTOR_OTLP_ENABLED=true -d jaegertracing/all-in-one:1.42

/etc/oqm/infra/otel/otel-collector-config.yaml:

# Configuration for OpenTelemetry Collector within the OQM system.
receivers:
  otlp:
    protocols:
      grpc:
      http:
        cors:
          allowed_origins:
            - "http://*"
            - "https://*"

exporters:
  jaeger:
    endpoint: "host.docker.internal:8096"
    tls:
      insecure: true
  logging:
    verbosity: detailed

processors:
  batch:

extensions:
  health_check:

service:
  telemetry:
    logs:
      level: "debug"
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [jaeger, logging]

Logs in Otel about my request:

:      -> http.method: Str(GET)
Mar 02 17:55:42 oqm-dev bash[38184]:      -> net.host.port: Int(8080)
Mar 02 17:55:42 oqm-dev bash[38184]:      -> http.response_content_length: Int(5)
Mar 02 17:55:42 oqm-dev bash[38184]: Attributes:
Mar 02 17:55:42 oqm-dev bash[38184]:     Status message :
Mar 02 17:55:42 oqm-dev bash[38184]:     Status code    : Unset
Mar 02 17:55:42 oqm-dev bash[38184]:     End time       : 2023-03-02 22:55:39.869314467 +0000 UTC
Mar 02 17:55:42 oqm-dev bash[38184]:     Start time     : 2023-03-02 22:55:39.810170975 +0000 UTC
Mar 02 17:55:42 oqm-dev bash[38184]:     Kind           : Server
Mar 02 17:55:42 oqm-dev bash[38184]:     Name           : /api/v1/info/currency
Mar 02 17:55:42 oqm-dev bash[38184]:     ID             : 871e3a199b593b47
Mar 02 17:55:42 oqm-dev bash[38184]:     Parent ID      :
Mar 02 17:55:42 oqm-dev bash[38184]:     Trace ID       : 341d0d9ebb77fbc41c90dece6f725571
Mar 02 17:55:42 oqm-dev bash[38184]: Span #0
Mar 02 17:55:42 oqm-dev bash[38184]: InstrumentationScope io.quarkus.opentelemetry
Mar 02 17:55:42 oqm-dev bash[38184]: ScopeSpans SchemaURL:
Mar 02 17:55:42 oqm-dev bash[38184]: ScopeSpans #0

The only mention of the string jaeger in the logs:

Mar 02 21:15:08 oqm-dev bash[8359]: 2023-03-03T02:15:08.325Z        warn        internal/warning.go:51        Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks        {"kind": "receiver", "name": "jaeger", "data_type": "traces", "docum ...

Any ideas?


Solution

  • First off, a very special thanks to @Michael Hausenblas for getting on with me to help sort this out.

    Issue #1: The config file I was passing in wasn't getting picked up by OpenTelemetry. The file the image was looking for was /etc/otelcol/config.yaml. Once I changed my docker run command to mount the file there, things started making much more sense.
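
    As a rough sketch of the Issue #1 fix, here is the run command from the question with only the mount target changed to the path the image reads. The otel_run_cmd helper and the two variable names are made up for illustration; the helper only prints the command so it can be checked before running it.

    ```shell
    # Host path is the one from the question; the container path is the one
    # the image was looking for.
    CONFIG_SRC=/etc/oqm/infra/otel/otel-collector-config.yaml
    CONFIG_DST=/etc/otelcol/config.yaml

    # Hypothetical helper: prints the docker command instead of running it.
    otel_run_cmd() {
      echo docker run \
        --name oqm_otel \
        -p 1888:1888 \
        -p 8888:8888 \
        -p 8889:8889 \
        -p 13133:13133 \
        -p 4317:4317 \
        -p 4318:4318 \
        -p 55679:55679 \
        -v "${CONFIG_SRC}:${CONFIG_DST}" \
        --add-host host.docker.internal:host-gateway \
        otel/opentelemetry-collector:0.72.0
    }

    # Print the command for review.
    otel_run_cmd
    ```

    Another option that should work, though I did not go that route, is to keep the original mount and append --config=/etc/otel-collector-config.yaml after the image name so the collector is told where the file is.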

    Issue #2: Rather than using the jaeger exporter, I switched to the otlp exporter and pointed its endpoint at my Jaeger server's OTLP endpoint (<host>:4317). For good measure, I also added the COLLECTOR_OTLP_ENABLED=true environment variable to my Jaeger run command, though it's unclear whether that is still needed with newer versions of Jaeger.

    # Configuration for OpenTelemetry Collector within the OQM system.
    receivers:
      otlp:
        protocols:
          grpc:
    
    exporters:
      logging:
        loglevel: debug
      otlp:
        endpoint: host.docker.internal:8096
        tls:
          insecure: true
    
    processors:
      batch:
    
    extensions:
      health_check:
    
    service:
      telemetry:
        logs:
          level: debug
      extensions: [health_check]
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp, logging]
    
    

    (8096 is the host port I mapped to Jaeger's OTLP port, 4317.)
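
    On the log noise mentioned in the question: a small filter can make export problems easier to spot. This is only a sketch. The filter_otel_logs name is made up, and the keyword list is a guess at what a failure might look like, not an official set of collector messages. The sample lines fed to it are illustrative stand-ins, not real collector output.

    ```shell
    # Hypothetical helper: keep only lines that look like problems.
    # The keywords are assumptions; adjust them to what your logs contain.
    filter_otel_logs() {
      grep -iE 'error|warn|fail|refused|unavailable'
    }

    # Intended use with the container from the question:
    #   docker logs oqm_otel 2>&1 | filter_otel_logs

    # Demo on made-up sample lines; only the warn and error lines pass.
    printf '%s\n' \
      'info  Span #0' \
      'warn  Using the 0.0.0.0 address exposes this server' \
      'error connection refused' | filter_otel_logs
    ```

    Dropping the logging exporter from the pipeline once spans are confirmed to arrive should also cut most of the per-span output.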