Hosted Prometheus - how to view ingest errors

Hi,

So I’ve set up Grafana Cloud and am using a /metrics endpoint on my application to collect time series metrics and send them to the hosted Prometheus time series database in Grafana Cloud.

However, for one server, the metrics aren’t visible in the Grafana metrics explorer. So for some reason, Grafana is not parsing the metrics and pushing them to Prometheus; I only see the default metrics for that server:

  • scrape_samples_scraped: shows 18 (which means it does see my metrics)
  • scrape_series_added: shows 0 (so for some reason, it won’t add the metrics to a series)

I don’t see any errors in my Grafana Cloud agent logs, and when I point the scrape URL at another site that exposes the exact same metrics, it works (so my agent config looks all right). I have no clue why the metrics aren’t transmitted for that one server.

Is there a place where I can view the Prometheus ingest errors to see what is going on there?

Thanks!

Hi @flannoo! There are a few common pitfalls that can happen in Explore that we should rule out:

  • Ensure that the correct data source is selected at the top. For metrics, that should be grafanacloud-$instancename-prom.
  • Confirm that the appropriate duration is selected at the top right. For quickly checking metrics that should be recently received, try changing that to the last 15 minutes.
  • Verify the correct type of metric query is being used. Try the “validate selector” option or the other settings in the query options to inspect different results.

Regarding where to check for ingest errors, the Billing and Usage dashboard has a panel that displays metric series, ingestion rate (data points per minute), and discarded metric samples. I recommend checking there first to see if anything is being discarded, and using the queries from that panel in Explore for further review.
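
For the follow-up review in Explore, one rough way to compare the per-target scrape metrics is to query them side by side. The sketch below is only an illustration and not an official Grafana Cloud tool: it builds instant queries against the standard Prometheus HTTP API path (/api/v1/query) and reads the values out of a response. The base URL, the instance label value, and the helper names build_query_url and latest_values are all assumptions made for the example, and authentication is left out. Two things may help when reading the numbers: scrape_samples_post_metric_relabeling shows how many samples remain after metric relabeling, and scrape_series_added only counts (approximately) series that are new in a given scrape, so a value of 0 is normal once a target's series already exist.

```python
import json
from urllib.parse import urlencode

# Per-target metrics that Prometheus-compatible scrapers record automatically.
SCRAPE_METRICS = (
    "up",
    "scrape_samples_scraped",
    "scrape_samples_post_metric_relabeling",
    "scrape_series_added",
)


def build_query_url(base_url, metric, instance):
    """Return an instant-query URL for one metric on one scrape target."""
    promql = f'{metric}{{instance="{instance}"}}'
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def latest_values(response_text):
    """Map each instance label to its value in an instant-query JSON response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        series["metric"].get("instance", ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


if __name__ == "__main__":
    # Hypothetical endpoint and target; substitute your own values.
    base = "https://prometheus.example.net/api/prom"
    for metric in SCRAPE_METRICS:
        print(build_query_url(base, metric, "myserver:8080"))
```

If scrape_samples_scraped is 18 but scrape_samples_post_metric_relabeling is 0, that would point at a relabeling rule dropping the samples before they are sent, rather than at an ingest error on the hosted side.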