I'm using the following code to create 100 data points in TSDB, with values from 0 to 99:
package main

import (
    "context"
    "fmt"
    "github.com/prometheus/prometheus/model/labels"
    "github.com/prometheus/prometheus/storage"
    "github.com/prometheus/prometheus/tsdb"
    "os"
    "time"
)

func main() {
    // Create a new TSDB instance
    db, err := tsdb.Open(
        "./data",              // directory where the data will be stored
        nil,                   // a logger (can be nil for no logging)
        nil,                   // an optional prometheus.Registerer
        tsdb.DefaultOptions(), // default TSDB options
        nil,                   // optional stats (can be nil)
    )
    if err != nil {
        fmt.Println("Error opening TSDB:", err)
        os.Exit(1)
    }
    defer db.Close()

    // Create a new appender
    app := db.Appender(context.Background())

    // Create labels for the gauge time series
    lbls := labels.FromStrings("__name__", "example_gauge", "type", "gauge")

    // Initialize a SeriesRef; the first Append (ref == 0) looks the series up
    // by labels and returns a ref that is reused on later calls
    var ref storage.SeriesRef

    startTimestamp := time.Now().Add(-1 * time.Hour).Unix()

    // Add some data points: one sample per second, values 0..99,
    // timestamps in milliseconds
    for i := 0; i < 100; i++ {
        var err error
        ref, err = app.Append(ref, lbls, (startTimestamp+int64(i))*1000, float64(i))
        if err != nil {
            fmt.Println("Error appending:", err)
            os.Exit(1)
        }
    }

    // Commit the data
    err = app.Commit()
    if err != nil {
        fmt.Println("Error committing:", err)
        os.Exit(1)
    }
}
It works fine except for one thing: when I issue the following PromQL query: example_gauge{type="gauge"}
I get 300 points in the response. The first 100 go from 0 to 99 as expected, but the last 200 points all have the same value, 99, and my chart looks like this:
Why does this happen?
Big thanks to @markalex for his suggestions. He pointed me to the explanation of staleness here: https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness and I was able to update my example so that it now generates the data I need:
Here I add a hole without data, i.e. I mark the time series as stale and then unmark it:
And here is the code that does it:
// Package and imports are the same as in the first example, plus "math".
func main() {
    // Create a new TSDB instance
    db, err := tsdb.Open(
        "./data",              // directory where the data will be stored
        nil,                   // a logger (can be nil for no logging)
        nil,                   // an optional prometheus.Registerer
        tsdb.DefaultOptions(), // default TSDB options
        nil,                   // optional stats (can be nil)
    )
    if err != nil {
        fmt.Println("Error opening TSDB:", err)
        os.Exit(1)
    }
    defer db.Close()

    // Create a new appender
    app := db.Appender(context.Background())

    // Create labels for the gauge time series
    lbls := labels.FromStrings("__name__", "example_gauge", "type", "gauge")

    // Initialize a SeriesRef
    var ref storage.SeriesRef

    startTimestamp := time.Now().Add(-1 * time.Hour).Unix()

    // Add some data points
    for i := 0; i < 200; i++ {
        var err error
        if i < 100 || i > 120 {
            ref, err = app.Append(ref, lbls, (startTimestamp+int64(i))*1000, float64(i))
        } else {
            // Mark the time series as stale by appending a NaN
            ref, err = app.Append(ref, lbls, (startTimestamp+100)*1000, math.NaN())
        }
        if err != nil {
            fmt.Println("Error appending:", err)
            os.Exit(1)
        }
    }

    // Commit the data
    err = app.Commit()
    if err != nil {
        fmt.Println("Error committing:", err)
        os.Exit(1)
    }
}
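As a side note: Prometheus itself marks a vanished series with a special NaN bit pattern, exposed as value.StaleNaN in the github.com/prometheus/prometheus/model/value package (value.IsStaleNaN detects it). A plain math.NaN() has different bits and is stored as an ordinary NaN sample. If you want to write the same explicit staleness marker that Prometheus would write, a minimal sketch of the loop above (assuming "github.com/prometheus/prometheus/model/value" is added to the imports) could look like this; treat it as an alternative to try, not the only correct way:

// Same setup as above; the loop appends value.StaleNaN instead of math.NaN().
for i := 0; i < 200; i++ {
    var err error
    if i < 100 || i > 120 {
        ref, err = app.Append(ref, lbls, (startTimestamp+int64(i))*1000, float64(i))
    } else {
        // value.StaleNaN is the NaN bit pattern Prometheus uses as its
        // explicit staleness marker; value.IsStaleNaN(v) recognizes it.
        ref, err = app.Append(ref, lbls, (startTimestamp+int64(i))*1000, value.StaleNaN)
    }
    if err != nil {
        fmt.Println("Error appending:", err)
        os.Exit(1)
    }
}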
I know that AI-generated answers are not allowed, but nevertheless I would like to attach ChatGPT's explanation of staleness, since the explanation in the Prometheus documentation sounds a bit vague to me:
Sure, I can break this down for you. Let's use a weather monitoring example to help clarify the concept of "staleness" in Prometheus.
Timestamps and Data Sampling: Imagine you have sensors recording temperature, humidity, and wind speed. Prometheus scrapes data from these sensors at different time intervals. Let's say temperature data comes every 1 minute, humidity every 2 minutes, and wind speed every 5 minutes. When you want to query an average temperature for the last 10 minutes, Prometheus picks a set of timestamps at which to sample the data, regardless of when the actual data came in. This helps Prometheus compare or aggregate multiple time series that may not align exactly in time.
Marking as Stale: If one of your sensors (let's say the wind speed sensor) suddenly stops sending data, Prometheus marks that time series as "stale". This is because it hasn't received any new sample for that specific series.
Effect on Query: Now, if you run a query to get the average wind speed after this time series has been marked as stale, Prometheus won't return a value for it. It's like saying, "Hey, I can't trust this data because it hasn't been updated recently."
Returning to Normal: If the wind speed sensor starts working again and sends a new sample, the time series is no longer stale, and new queries will return values as usual.
5-Minute Threshold: If Prometheus doesn't find any sample within 5 minutes before a query's sampling timestamp, it treats it as if it's stale. This means even if the time series isn't marked as stale, you'll still get no value if the latest sample is older than 5 minutes.
Timestamps in Scrapes: If your sensor sends data with timestamps, then only the 5-minute rule applies, ignoring the staleness flag.
So, in essence, staleness in Prometheus helps to keep your metrics accurate by ignoring time series that haven't been updated recently. Hope that clears things up!
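To make the 5-minute lookback and the staleness marker more concrete, here is a small self-contained toy, not Prometheus internals: the sample type, valueAt and the timestamps are made up for illustration, and only value.StaleNaN / value.IsStaleNaN come from the real github.com/prometheus/prometheus/model/value package. It mimics how an instant query picks a value for one series:

package main

import (
    "fmt"

    "github.com/prometheus/prometheus/model/value"
)

// sample is a single (timestamp, value) pair; timestamps are in milliseconds.
type sample struct {
    t int64
    v float64
}

// lookbackMillis mirrors Prometheus' default 5-minute lookback delta.
const lookbackMillis = 5 * 60 * 1000

// valueAt emulates how an instant query picks a value for one series at time ts.
func valueAt(samples []sample, ts int64) (float64, bool) {
    for i := len(samples) - 1; i >= 0; i-- {
        s := samples[i]
        if s.t > ts {
            continue // sample is newer than the query time, skip it
        }
        if ts-s.t > lookbackMillis {
            return 0, false // newest usable sample is older than 5 minutes: no value
        }
        if value.IsStaleNaN(s.v) {
            return 0, false // explicit staleness marker: no value
        }
        return s.v, true
    }
    return 0, false // no samples at or before the query time
}

func main() {
    series := []sample{
        {t: 0, v: 1},
        {t: 60_000, v: 2},
        {t: 120_000, v: value.StaleNaN}, // the series disappeared at t=120s
    }
    fmt.Println(valueAt(series, 90_000))  // 2 true: last sample (t=60s) is within the lookback window
    fmt.Println(valueAt(series, 150_000)) // 0 false: the staleness marker hides the series
    fmt.Println(valueAt(series, 500_000)) // 0 false: newest sample is more than 5 minutes old
}

Real PromQL evaluation is of course more involved, but this is the essence of the rules described above.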