Can I group values only when multiple values exist within a time window?

I'm working with a poller that polls every minute, and I query aggregate data from it by the hour. The 1-minute data looks something like this:

my_metric{system="sys1", subsystem="ss1", group="A"} 1
my_metric{system="sys1", subsystem="ss2", group="A"} 1
my_metric{system="sys1", subsystem="ss3", group="B"} 1

my_metric{system="sys2", subsystem="ss4", group="A"} 1
my_metric{system="sys2", subsystem="ss5", group="B"} 1
my_metric{system="sys2", subsystem="ss6", group="A"} 1

I want to count the number of systems in each group each hour. However, some systems change from A to B within the 1-hour window, and count by (system, group) or similar queries count these systems twice. Is there a way to use label_replace, group, or count distinct to do something like this: if A and B both exist within the 1-hour window, then label_replace with "Updated"?

Solution

Without being able to test the queries, it's hard to say whether they will work as intended, especially for non-trivial queries like these.

The first operator we need is the unless operator. It works like a set difference:

metricA unless metricB

returns only those series of metricA that have no series with a matching label set in metricB; nothing from metricB is ever returned. In combination with avg_over_time we can do the following:

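avg_over_time(my_metric{group="A"}[1h])
unless ignoring (group)
avg_over_time(my_metric{group="B"}[1h])

gives us all the series that existed only in group A within the last hour. The ignoring (group) modifier is needed because the set operators match on the full label set, and the group label always differs between the two sides. Swap A and B to get the series that existed only in group B.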

Now we need to handle the cases where a system switched groups. In that case you need to decide whether you want it counted for A or B. Here we can use the and operator:

metricA and metricB

returns the series of metricA for which a series with a matching label set also exists in metricB.

avg_over_time(my_metric{group="A"}[1h])
and ignoring (group)
avg_over_time(my_metric{group="B"}[1h])

returns the group A series for everything that existed in both groups within the hour. (If you need it the other way around, just swap A and B.)
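
If you would rather flag the switched series than count them under A or B, as the question suggests, one option is to rewrite their group label. A minimal, untested sketch, assuming the series on both sides differ only in the group label:

label_replace(
   avg_over_time(my_metric{group="A"}[1h])
   and ignoring (group)
   avg_over_time(my_metric{group="B"}[1h]),
   "group", "Updated", "group", ".*"
)

Here label_replace overwrites group with the fixed value "Updated" for every series that appeared in both groups within the hour.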

The next operator is or:

metricA or metricB

returns all series of metricA, plus those series of metricB that have no matching label set in metricA.

(
   avg_over_time(my_metric{group="A"}[1h])
   unless ignoring (group)
   avg_over_time(my_metric{group="B"}[1h])
)
or
(
   avg_over_time(my_metric{group="B"}[1h])
   unless ignoring (group)
   avg_over_time(my_metric{group="A"}[1h])
)
or
(
   avg_over_time(my_metric{group="A"}[1h])
   and ignoring (group)
   avg_over_time(my_metric{group="B"}[1h])
)

should now return each series that existed in only one group, plus the group A series for those that existed in both groups. All that is left is to put a count by (group) around it, and it should give you the expected results.
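
Wrapped in the count, the whole thing can be sketched more compactly. This untested sketch relies on the fact that the "only in A" and "in both" branches together are simply all group A series, and it assumes the two sides differ only in the group label, hence ignoring (group):

count by (group) (
   avg_over_time(my_metric{group="A"}[1h])
   or ignoring (group)
   avg_over_time(my_metric{group="B"}[1h])
)

The or keeps every group A series and adds only those group B series that have no group A counterpart, so a series that switched groups is counted once, under A.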

(In general it's good practice to build up non-trivial queries step by step and test them each time, so you know which series are counted and whether they are exactly what you are looking for.)