I'm using Prometheus and Grafana, and I'm trying to track a web server app.
I want to graph the average duration in ms of a particular query. I think I can get there from the data below, but I'm struggling.
My two sets of values:
rate(http_server_request_duration_seconds_sum[5m])
Element Value
{instance="dbserver:5000",job="control-tower",method="get",path="/api/control/v1/node/config.json"} 0.0010491088980113385
{instance="dbserver:5000",job="control-tower",method="get",path="/api/schedule/v1/programs/:id.json"} 0
{instance="dbserver:5000",job="control-tower",method="get",path="/api/schedule/v1/users.json"} 0
{instance="dbserver:5000",job="control-tower",method="get",path="/metrics"} 0.00009133616130826839
{instance="dbserver:5000",job="control-tower",method="post",path="/api/caption/v1/messages.json"} 0
{instance="dbserver:5000",job="control-tower",method="post",path="/api/caption/v1/sessions.json"} 0
{instance="dbserver:5000",job="control-tower",method="post",path="/api/schedule/v1/programs.json"} 0
{instance="dbserver:5000",job="control-tower",method="put",path="/api/caption/v1/sessions/captioners.json"} 0
{instance="dbserver:5000",job="control-tower",method="put",path="/api/control/v1/agents/:id.json"}
rate(http_server_requests_total[5m])
Element Value
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="get",path="/api/control/v1/node/config.json"} 0.03511075688258612
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="get",path="/api/schedule/v1/programs/:id.json"} 0
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="get",path="/api/schedule/v1/users.json"} 0
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="get",path="/metrics"} 0.06671043807691363
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="post",path="/api/caption/v1/sessions.json"} 0
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="post",path="/api/schedule/v1/programs.json"} 0
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="put",path="/api/caption/v1/sessions/captioners.json"} 0
{code="200",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="put",path="/api/control/v1/agents/:id.json"} 0
{code="422",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="post",path="/api/schedule/v1/programs.json"} 0
{code="502",host="dbserver:5000",instance="dbserver:5000",job="control-tower",method="post",path="/api/caption/v1/messages.json"}
They have different labels. For this, I only care where path="/api/caption/v1/messages.json".
I think I need to use a combination of rate, sum, and "on" or "ignoring", but I haven't been able to get on or ignoring to work at all.
I can get the numerator (in seconds) with:
rate(http_server_request_duration_seconds_sum{path="/api/caption/v1/messages.json"}[5m])
And that returns:
{instance="dbserver:5000", job="control-tower", method="post", path="/api/caption/v1/messages.json"}
But the denominator has a separate series for each return code, so I have to sum those, and I think I need on or ignoring to make the labels line up. I haven't found an example that helps me out, and I'm really new at this.
Anyone?
Okay, I continued to play. Because I only have one path I worry about, I figured out I could sum the rates. I think this works:
sum(rate(http_server_request_duration_seconds_sum{path="/api/caption/v1/messages.json"}[2h])) / sum(rate(http_server_requests_total{path="/api/caption/v1/messages.json"}[2h]))
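I changed the range from 5m to 2h because my sample data had fallen out of the 5-minute window and I was getting zeros.
I THINK what this is doing is summing the rates, which gets rid of all the labels, so the two sides can be divided without any label matching. And I THINK it's also using 2 hours of data: as I understand it, rate gives the average per-second increase of the counter over that 2-hour window, so dividing the two rates gives the average duration per request. That result is in seconds, not yet ms.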
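Going by its name, http_server_request_duration_seconds_sum is in seconds, so the ratio above is an average in seconds per request. A minimal sketch of the same query scaled to milliseconds, assuming the same metric names and path as above (untested against my server):

```promql
# Average request duration in ms over the last 2h for one path.
# sum() with no by/without clause collapses all series into one and drops
# every label, so the two sides divide without any on/ignoring.
1000 * sum(rate(http_server_request_duration_seconds_sum{path="/api/caption/v1/messages.json"}[2h]))
  /
sum(rate(http_server_requests_total{path="/api/caption/v1/messages.json"}[2h]))
```

If there were no requests in the window, both rates are 0 and the division gives NaN rather than 0.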
I would love comments.
This solution won't work if I want one chart to include other paths, because the plain sum collapses everything into a single series. I'm still not sure what to do about that, so this solves my current problem but still doesn't help me figure out how to do something similar with ignoring or on.
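From reading about vector matching, something like the following looks like it should cover the multi-path case. This is an untested sketch that assumes the label sets shown in the output above, where the request counter carries extra code and host labels that the duration sum does not. The two options are separate queries, not one expression.

```promql
# Option 1: aggregate both sides down to the same labels.
# After "sum by", both sides carry only path and method, so no on/ignoring is needed.
1000 * sum by (path, method) (rate(http_server_request_duration_seconds_sum[5m]))
  /
sum by (path, method) (rate(http_server_requests_total[5m]))

# Option 2: keep the left side as-is and use ignoring.
# "sum without (code)" merges the per-status-code series on the right,
# and "ignoring (host)" tells the division to skip the one label that still differs.
1000 * rate(http_server_request_duration_seconds_sum[5m])
  / ignoring (host)
sum without (code) (rate(http_server_requests_total[5m]))
```

Either way there should be one series per path and method. Summing over code first matters because the division is one-to-one by default, and a path with both 200 and 422 responses would otherwise have two series on the right for one on the left. Paths with no traffic in the window come out as NaN (0 divided by 0).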