We are using the default Prometheus Grafana dashboard to monitor Azure Kubernetes Service (AKS), and we have observed that one of the graphs shows system load above 100%. The graph uses the following query:
avg(node_load5{instance="x.x.x.x:9100",job="node-exporter"}) / count(count(node_cpu_seconds_total{instance="x.x.x.x:9100",job="node-exporter"}) by (cpu)) * 100
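To make the arithmetic of this expression concrete, here is a minimal Python sketch of what it appears to compute: the 5-minute load average divided by the number of CPUs, times 100. The function name and the sample values are hypothetical and not taken from our cluster. On Linux the load average counts runnable tasks plus tasks in uninterruptible sleep, so it is not capped at the number of CPUs, and this ratio can go above 100%.

```python
def load_percent(load5_samples, cpu_count):
    """Mirror the dashboard expression: avg(node_load5) / CPU count * 100.

    load5_samples: list of 5-minute load average values (hypothetical input)
    cpu_count: number of distinct CPUs, as counted by the inner count(...) by (cpu)
    """
    if not load5_samples:
        raise ValueError("need at least one load sample")
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    avg_load = sum(load5_samples) / len(load5_samples)
    return avg_load / cpu_count * 100


# Hypothetical 4-CPU node whose 5-minute load average is 6.0:
# more tasks are runnable or waiting than there are CPUs.
print(load_percent([6.0], 4))  # 150.0

# Same node with a load average of 2.0 stays under 100%.
print(load_percent([2.0], 4))  # 50.0
```

Under this reading, a value above 100% would mean that, on average over five minutes, more tasks were running or waiting than the node has CPUs. It is not a CPU utilization percentage.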
We contacted Azure support about this, and according to them everything looks fine on the Azure side. We would like to understand why the graph reports a load above 100%.
Looking forward to hearing from you.
Please let me know if you need any details.