I have an alert with this query:
sum by (pod) (kube_pod_status_ready{condition="false"} == 1) > 0
My issue: suppose the first pod has been not running for 4m and a second pod has been not running for 3m (the second pod went down 1m after the first). Two minutes later the first pod returns to the running state. The alert still triggers, counting its time from when the first pod stopped running, and the Slack message shows both pods. The same query works as expected with Prometheus alerts.
How can I solve this? I want the alert to fire only when a pod has been not running for 10m, and it should report only that pod to Slack.
- alert: pod_not_running
  expr: sum by (pod) (kube_pod_status_ready{condition="false"} == 1) > 0
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Pod - {{ $labels.pod }} not running"
I configured the alert to evaluate at 1m intervals with a 10m pending period.
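
To make the behaviour I expect concrete, here is a minimal sketch of the timeline above. It assumes each pod is tracked as its own series with its own 10m timer that resets when the pod recovers, which is how I understand the Prometheus `for` clause to work. The pod names, the `simulate` helper, and the minute values are hypothetical and only model the scenario; this is not how any alerting engine is actually implemented.

```python
FOR_MINUTES = 10  # mirrors `for: 10m` in the rule


def simulate(not_ready, minutes, hold=FOR_MINUTES):
    """Return {pod: minute at which it would start firing}.

    not_ready maps a (hypothetical) pod name to a (start, end) interval in
    minutes during which the pod is not ready; end is exclusive, and
    end=None means the pod never recovers. One evaluation runs per minute.
    """
    active_since = {}  # pod -> minute its current not-ready streak began
    fired_at = {}      # pod -> minute it first crossed the hold duration
    for t in range(minutes + 1):
        for pod, (start, end) in not_ready.items():
            is_not_ready = start <= t and (end is None or t < end)
            if not is_not_ready:
                # The pending timer is per pod and resets on recovery.
                active_since.pop(pod, None)
                continue
            active_since.setdefault(pod, t)
            if t - active_since[pod] >= hold and pod not in fired_at:
                fired_at[pod] = t
    return fired_at


# pod-1 is not ready from minute 0 and recovers at minute 6;
# pod-2 goes not ready at minute 1 and stays that way.
timeline = {"pod-1": (0, 6), "pod-2": (1, None)}
print(simulate(timeline, 15))  # {'pod-2': 11}
```

In this model only pod-2 fires, at minute 11, and pod-1 never fires because it recovered before reaching 10m. That is the result I want in Slack, instead of both pods being reported with the timer starting from pod-1.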