Is the alerting code smart enough to detect a high CPU load (> 85%) over, say, 1 month, where the load peaks, say, more than 5 times? Can anyone share this code with me?
Welcome @arifhansen to the Grafana forum.
The conditions you describe would certainly be possible in InfluxDB (using Flux) and likely others (Prometheus, etc.) by writing a query (or several queries) to capture all the parameters you are after.
What is your datasource and what have you created thus far?
Thanks grant for your quick response. The datasource will be a MongoAtlas cluster. I need to set up an alert that tells me whether the MongoAtlas cluster has been under- or over-sized, for cost savings. CPU load or disk IOPS that are hammered or remain idle multiple times over, say, 1 month would give a good indication of how these resources are utilized.
So far, I have a yml alert, just to work out CPU high percentage:
# Recorded metric names cannot contain hyphens, so underscores are used here.
- record: cpu_usage_percentage_5m
  expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
- alert: CPUOver80%
  expr: cpu_usage_percentage_5m > 80
  for: 5m
  annotations:
    summary: "CPU Usage over 80%"
I'm not sure whether this will work in Grafana.
Sorry grant, my bad: the datasource is “prometheus”, which is collecting metrics from MongoAtlas clusters.
It appears you have already written a working Prometheus query to capture the condition you want to alert on. Did you try entering this into Grafana Unified Alerting? Video and walk-through (for Prometheus) here: Grafana Alerting: Explore our latest updates in Grafana 9 | Grafana Labs
No, this is just an example snippet. I actually need a count of how many times the CPU hit max load in, say, 1 month, and an alert when that count is, say, > 5. irate, as I understand it, is the rate of change, which is not what I need.
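A minimal sketch of how that count could be expressed as a Prometheus rule file, assuming node_exporter-style CPU metrics as in the snippet above. The group name, the recorded metric name, the alert name and the severity label are hypothetical, and rate is swapped in for irate because it is the usual choice for alerting. The subquery `[30d:5m]` samples the recorded series every 5 minutes over 30 days, and count_over_time counts only the samples that passed the `> 85` filter, so this counts 5-minute samples above the threshold rather than distinct peaks.

```yaml
groups:
  - name: cpu_sizing                      # hypothetical group name
    rules:
      # CPU busy percentage per instance, averaged over all cores.
      - record: instance:cpu_usage_percent:rate5m   # hypothetical metric name
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

      # Fires when more than 5 of the 5-minute samples in the last 30 days were above 85%.
      - alert: CPUFrequentlyOver85        # hypothetical alert name
        expr: count_over_time((instance:cpu_usage_percent:rate5m > 85)[30d:5m]) > 5
        labels:
          severity: warning               # assumed label
        annotations:
          summary: "CPU above 85% in more than 5 samples over the last 30 days"
```

Two caveats: the query only sees as much history as Prometheus keeps (the default retention is 15 days, so a 30-day window needs a longer retention setting), and the recorded series needs time to accumulate before the count is meaningful. The alert expression, without the trailing `> 5`, should also be usable as the query of a Grafana alert rule with the threshold set in Grafana instead.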
Do you have an existing query that captures CPU utilization (either high or low)?
Since that is basically a Prometheus query question, maybe try posting in the Prometheus forum?
Seems like a lot of hits in the search:
No, this alert needs to be set up in Grafana!
Do you have an existing query in this community that could provide me with the CPU/disk IOPS utilization? I would have thought it requires a mixture of the count, avg, max, etc. functions.
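As a rough sketch of how avg, max and quantile functions might be combined for sizing, the rules below summarise 30 days of CPU and disk IOPS per instance. Everything named here is an assumption: the metrics are node_exporter ones (node_cpu_seconds_total, node_disk_reads_completed_total, node_disk_writes_completed_total), which may not match what an Atlas integration exposes, and the group name, recorded metric names, alert name and the 20% idle threshold are hypothetical.

```yaml
groups:
  - name: sizing_summary                  # hypothetical group name
    rules:
      # Highest 5-minute CPU busy percentage seen in the last 30 days.
      - record: instance:cpu_usage_percent:max30d
        expr: max_over_time((100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)[30d:5m])

      # Average CPU busy percentage over the last 30 days.
      - record: instance:cpu_usage_percent:avg30d
        expr: avg_over_time((100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)[30d:5m])

      # Combined read and write operations per second, summed over all disks.
      - record: instance:disk_iops:rate5m
        expr: sum by (instance) (rate(node_disk_reads_completed_total[5m]) + rate(node_disk_writes_completed_total[5m]))

      # Possible over-sizing: CPU stayed under 20% for 95% of the last 30 days.
      - alert: CPUMostlyIdle              # hypothetical alert name
        expr: quantile_over_time(0.95, (100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)[30d:5m]) < 20
        annotations:
          summary: "95th percentile CPU usage below 20% over the last 30 days"
```

Long subqueries like these are relatively expensive to evaluate, so a longer rule evaluation interval may be preferable, and Prometheus retention must cover the full 30-day window.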
Can anyone from the community please help me?
Have you already gone through these examples that Grafana staff built to demonstrate Prometheus queries?
Thanks, let me take a look.
Also, I am trying to get a support contract going between Hansen Technology (my company) and Grafana. I emailed Thibaud Duprat from Grafana and held a meeting with him last week about emailing me the different options on your licensing contract, but I have had no response from him since. Can you please ask your team to email this to me? Also, while we consider and review your support contracts, can you or your technical team help me with my technical questions?
@arifhansen I am a community volunteer and do not work for Grafana.