Grafana 4.2.0
Prometheus 1.6.1
Custom recording rule:
kube_alive_pods = kube_pod_status_phase{pod=~"frontend-.*",phase="Running"} == 1
Querying it from the Prometheus web interface returns the list of currently running pods in the Kubernetes cluster.
But querying it from a templating variable like:
label_values(kube_alive_pods, pod)
returns surplus values if the time range contains already deleted pods.
How can I show only currently running pods, regardless of the time range I've selected?
torkel
April 26, 2017, 12:28pm
2
Not sure. You can investigate the query Grafana sends to Prometheus using the Chrome Dev Tools network tab.
For some reason the kube_alive_pods recording rule returns deleted pods?
If I request it from the Prometheus web interface, the recording rule returns the right number of pods, and only those that are alive.
But when using it in a Grafana templating variable via label_values(kube_alive_pods, pod), I get surplus results.
How can I investigate the query Grafana sends to Prometheus if I can only see my own requests?
Thanks in advance.
torkel
April 27, 2017, 6:37am
4
I found the request in which Grafana calls kube_alive_pods, but it's a GET request, so there's no form data. Also, there's no request called "render". Any suggestions, though?
Thanks in advance.
torkel
April 27, 2017, 7:29am
6
so there's no form data. Also, there's no request called "render". Any suggestions, though?
Yes, all requests to Prometheus are GET requests. What was the response and query? Did you see anything wrong?
https://domain.com:3000/api/datasources/proxy/3/api/v1/series?match[]=kube_alive_pods&start=1493277947&end=1493278247
That’s a complete request.
In the response, there's an array of two elements, one of which is the surplus one. Nothing else looks strange.
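For reference, the response body looked roughly like this (pod names and the exact label set are just an illustration; the second entry is the already-deleted pod):
{"status":"success","data":[
{"__name__":"kube_alive_pods","phase":"Running","pod":"frontend-3936042910-kl3kj"},
{"__name__":"kube_alive_pods","phase":"Running","pod":"frontend-3936042910-a1b2c"}
]}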
torkel
April 27, 2017, 7:42am
8
OK, so it's a problem in Prometheus then? You said its result was different; what is the query in that case?
Also, query_result(kube_alive_pods) prints the expected result.
torkel
April 27, 2017, 7:45am
10
That is a different API than the series API.
And what should this mean to me? :) I'm following this article; there's nothing in it about the API or anything like that.
torkel
April 27, 2017, 7:55am
12
Sorry, I'm just saying that template variable queries use the series name lookup API, not time series queries. So you cannot compare them and expect them to always match.
Instead of using label_values in your variable query, use query_result(kube_alive_pods) in your template variable query. That uses the Prometheus query API and takes time into account (the other query just checks the Prometheus metadata).
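Roughly, the two variable query types translate into different Prometheus requests (parameter values here are just placeholders):
label_values(kube_alive_pods, pod) -> GET /api/v1/series?match[]=kube_alive_pods&start=<from>&end=<to> (metadata lookup, returns every series seen anywhere in the range)
query_result(kube_alive_pods) -> GET /api/v1/query?query=kube_alive_pods&time=<timestamp> (instant query, evaluated at a single point in time)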
Thanks for the clarification.
I'll use query_result() instead; however, is there any way to pull the "pod" label values from its result (without using a regex)?
torkel
April 27, 2017, 8:02am
14
I'll use query_result() instead; however, is there any way to pull the "pod" label values from its result (without using a regex)?
No idea; it depends on the response from Prometheus, I guess.
Got this working by using this Prometheus recording rule:
kube_alive_pods = sum(kube_pod_status_phase{pod=~"frontend-.*",phase="Running"} == 1) without (app,instance,job,kubernetes_namespace,kubernetes_pod_name,namespace,phase,pod_template_hash)
and Grafana template variable query_result(kube_alive_pods)
with regex /"(.*)"/
This shows the names of alive pods within the selected time range, in a format like frontend-3936042910-kl3kj.
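For reference, the raw query_result() options look roughly like this (only the pod label is left after the sum ... without):
kube_alive_pods{pod="frontend-3936042910-kl3kj"} 1 1493278247000
and the /"(.*)"/ regex captures whatever is between the quotes, so the drop-down ends up showing just the pod name.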
Note that Prometheus keeps stale metrics around, so deleted pods will still appear in the resulting Grafana drop-down for about 5 minutes or so.