Filtering data by value

homonto · July 10, 2023, 4:46pm

Hi,

is it possible to filter by value?
Example:

from(bucket: "homeassistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "ms")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["domain"] == "sensor")
  |> filter(fn: (r) => r["entity_id"] =~ /working_time_ms/)
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> yield(name: "mean")

but to add something like this:
|> filter(fn: (r) => r["entity_id" =~/sensor_type/ = "sleeper")
meaning: I want to see ONLY values from sensor “working_time_ms” from sensors that are “sleeper” - I am not interested to see “working_time_ms” for non-sleepers

thank you

grant2 · July 10, 2023, 8:52pm

Just to better understand your data, what are all the total possible choices for entity_id?

homonto · July 10, 2023, 9:20pm

hi @grant2 , thank you for your answer.
all entities (for 1 device: esp32033) are on the picture below
and i.e. I would like to see let it be RSSI but only of these sensors, that have as *_comm_type LoRa or ESPNow - or any other case where I want to see the value of entity1 while filtering by the value of entity 2

grant2 · July 11, 2023, 9:51am

Still not 100% clear on this question, but is this a correct re-phrasing?

You have six sensors whose tag names in InfluxDB are:

sensor.aaa333_working_time_ms
sensor.ddd555_working_time_ms
sensor.bbb444_working_time_ms
sensor.aaa333_sleep_time_ms
sensor.ddd555_sleep_time_ms
sensor.bbb444_sleep_time_ms

You need a Regex expression (for either Flux or Grafana) to filter out only the tags whose name starts with sensor.ddd555 AND whose name ends in _working_time_ms

Correct?

homonto · July 11, 2023, 10:28am

Hi @grant2.

Let us focus on the outcome of only 1 sensor that is: _working_time_ms
But this entity_id is in every device that I have lets say from aaaa to zzzz (as you showed above).
But let us not mix the _working_time_ms with _sleep_time_ms - these 2 represent different metrics.

So I have multiple sensors (kind of 50 to be exact) and some of them are “LoRa” and some of them are “ESPnow” (if you don’t get this thing - different wireless communication methods)
So I have today 1 graph in Grafana, that has very simple filter:
|> filter(fn: (r) => r["entity_id"] =~ /working_time_ms/)

But on such graph, I will have all sensors, irrespective if they have:

“sensor.xxx_comm_type” = “LoRa” or
“sensor.xxx_comm_type” = “ESPnow”

(of course device xxx will have EITHER “LoRa” OR “ESPnow” value of the corresponding from this device:
“sensor.xxx_comm_type”

So my goal is to create 2 different graphs: 1 for LoRa and 1 for ESPnow.
Reason being: LoRa communication (so working_time_ms) is way longer than ESPnow (LoRa is about 1000ms while ESPnow is about 200ms) so having both types o sensors in 1 graph makes the graph a bit invisible (as the graph gets extended vertically due to large values coming from LoRa sensors)

I hope that clarifies.
See on the screen shot the difference between majority being ESPnow (low values) vs the few ones that are LoRa - much higher values. This way the lower part of the graph is not really readable:

THANK YOU FOR YOUR HELP!

grant2 · July 11, 2023, 10:11pm

Are “LoRa” and “ESPnow” tag names in your data, or are these descriptions only part of the entity_id sensor name? Put another way, do you have the ability to use a tag filter and specify “LoRa” or “ESPnow”? If not, wouldn’t it make sense to have these as tag names for your data (call it “wireless_comm_type”). Or perhaps you have years of data and cannot add this as a new tag?

yosiasz · July 11, 2023, 10:13pm

please provide sample data as csv or line protocol otherwise it would be trying to thread a needle in the dark.

homonto · July 12, 2023, 4:11am

guys, I am THINKING now about your answers
this db is influxdb_v2 and this is my db on my server
data in it is already for long time, but I don’t mind dropping entire bucket and recreating it - I am not sure I care about the history

I think however, that my configuration in Home Assistant is wrong (I am not influxdb power user): it seems (to me) that I have in db ONLY the measurements that have unit, right?
See the config from Home Assistant (it that helps):

influxdb:
  api_version: 2
  ssl: true
  verify_ssl: false
  host: !secret influxdb2_ip
  port: 8086
  token: !secret influxdb2_token
  organization: !secret influxdb2_org
  bucket: homeassistant
  default_measurement: units

I think I am missing few other options, i.e. tags or so
I must redesign the way HA is sending data to the db I believe.
I realised that the sensor.esp32094_com_type does not land in the influxdb - it does not have unit, the value is text type so probably this is the reason why I am not able even to see the values in the db so obviously in Grafana as well.

See above: what is value of “32021”? that should be “ESPnow” rather, as the sensor really is in Home Assistant:

but there is no “unit” for this sensor, so I think this is the problem.

So I need to probably rethink how to configure HA to send data to influxdb, which fields, which tags, which measurements, right?

As for request from @yosiasz : I have no idea how to provide this. But I understand what you have asked for - just I don’t know how to export it from influxdb. HA sends data directly to influxdb, Grafana is configured to use this influxdb - how to export the data and where from (probably influx, right?) - no idea

homonto · July 12, 2023, 4:47am

OK, I have created new bucket in influxdb, and modified a bit config on Home Assistant:

influxdb:
  api_version: 2
  ssl: true
  verify_ssl: false
  host: !secret influxdb2_ip
  port: 8086
  token: !secret influxdb2_token2
  organization: !secret influxdb2_org
  bucket: papiobucket
  default_measurement: state
  include:
    domains:
      - sensor
  tags:
    instance: prod
    source: HA
  tags_attributes:
    - friendly_name
    - device_class

the outcome is promising - I can see in influxdb now the values of _comm_type:
LoRa for device: esp32094

and ESPnow for esp32100:

Now, to the first question:
build the query that will show i.e. rssi (or anything else) but ONLY for these devices that fulfill the filter: "when entity_id =~ /comm_type/ AND value = “ESPnow”) - or “LoRa”
the above “filter” I made up using English rather than flux of course

And this way I could have separate graph for “ESPnow” devices and another one for “LoRa” devices (rather then all in one graph).

Let me know please if I am not asking for too much or maybe still the configuration for influxdb on Home Assistant is missing something else

thank you

yosiasz · July 12, 2023, 5:53am

as far as design dont jump into it until you have thought it through well like any database design

What are you capturing, please describe in simple terms not in db design terms. Biggest mistake is to think in terms of columns and rows. describe in plain old simple English

Then we can work from that forward.

homonto · July 12, 2023, 6:10am

I am using Home Assistant and I am building my own devices - mainly ESP32 based.
These devices are for different purpose but they communicate with HA using 3 different ways:
1- direct access over WiFi
2- indirect access over ESPnow (from sensor device to gateway device - then gateway device sends data to HA over WiFi)
3- indirect access over LoRa (from sensor device to gateway device - then gateway device sends data to HA over WiFi)

Devices type 1 are usually always ON, mains powered etc. Devices type 2 and 3 are sleeping so battery powered.
Apart from the “useful” data (temperature, light, humidity, motion etc.) there are some other sensors that are for “admin purposes” such as:

battery status: Volts, % etc.
RSSI
time the sensor was online “work_time_ms” (in ms)

As you can imagine both RSSI as well as work_time_ms are completely different for ESPnow devices (very fast communication, worktime in range of 100ms to 300ms depending on the number of sensors in it) while LoRa devices have worktime few times longer.
Similarly for RSSI: ESPnow devices will have range of -25dBm to -90dBm (below that they die), while LoRa devices will go up to -135dBm before they die.

To distinguish the type of communication, every sensor device sends also one more information: communication type. I am using the sensor called: sensor.xxxx_comm_type
It is text value type and can be: “WiFi”, “ESPnow” or “LoRa”.
Due to the differences between these devices types, I would like to see in Grafana separate graphs i.e. for RSSI or for worktime depending on the VALUE of sensor.xxxx_comm_type, instead of having all RSSI values on the same graph (or worktime)

I think I put here everything so that can draw the full picture

yosiasz · July 12, 2023, 6:21am

way too technical. you need to keep it plain old English, like you describing it to me a pedestrian that know nothing about these things.

homonto · July 12, 2023, 6:21am

and to finish it: today, to realise what I want I have to create 2 graphs with MANUALLY filtered “entity_id” but if I add more sensor device I need to reconfigure filters manually:
example of all devices but LoRa:

from(bucket: "homeassistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "dBm")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["entity_id"] =~ /rssi/)
  |> filter(fn: (r) => r["entity_id"] !~ /esp32094/)
  |> aggregateWindow(every: v.windowPeriod, fn: last, createEmpty: false)
  |> yield(name: "last")

But the above filters out by entity_id rather than: “give me WiFi only” or “give me ESPnow only”

Example for LoRa only (I filter it knowing which sensor device I need to include):

from(bucket: "homeassistant")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "dBm")
  |> filter(fn: (r) => r["_field"] == "value")
  |> filter(fn: (r) => r["entity_id"] =~ /rssi/)
  |> filter(fn: (r) => r["entity_id"] =~ /esp32094/)
  |> aggregateWindow(every: v.windowPeriod, fn: last, createEmpty: false)
  |> yield(name: "last")

that is because I know, the sensor device: esp32094 is LoRa device
but again: that is manual work every time I add new device - I would like this to be automatic - I am sure it is doable - just I have no clue how

yosiasz · July 12, 2023, 6:22am

you are again going way off the ranch. too much technical details which might explain why your previous design did not fit just right

so far I gather you

build devices
there are different device types

what are these used for? to measure something?

homonto · July 12, 2023, 6:25am

let me try again (because why not):

how to add to the query something like this (that is NOT query but my English mixed with query):
"give me ONLY the ABOVE results (i.e. RSSI) IF value of sensor: =~ /_comm_type/ equals "LoRa"

homonto · July 12, 2023, 6:26am

yes, environmental sensors: temperature, humidity, light, motion detection - whatever is needed - rain, wind, swimming pool temperature - anything

yosiasz · July 12, 2023, 6:28am

Now we cooking!

temperature: Celsius or F
humidity: measured as?
light: measured as? lumens?
motion detection: measured as?
battery status: Volts, % etc.
RSSI: measured as?

Which device types measure which measurements

homonto · July 12, 2023, 6:31am

@grant2:
if Home Assistant was able to tag the data this way, I think this would solve my mystery problem in 100%, right? maybe this is the way: let HA tag it.
but then, I need to see if HA is clever enough, especially that NOT EVERYTHING in HA even has “communication type” sensor - apparently this will only apply to the devices, where such sensor is implemented

homonto · July 12, 2023, 6:34am

cooking:
temperature in C
humidity in %
light in lux
motion detection has NO UNIT - it only comes to Home Assistant as 1 or 0 /true or false
battery in Volts, battery % in %
RSSI in dBm

let say out of 50 devices: 20 devices measure temperature, 5 have motion detection 50 (all) have battery measurement
and out of these 50 lets say: 45 are ESPnow, 3 are WiFi and 2 are LoRa
and the sensor that shows which device uses which communication method is TEXT type sensor, it has 3 possible values: “LoRa”, “ESPnow” or “WiFi”
and its entity_id is: sensor.xxx_comm_type

grant2 · July 12, 2023, 8:55am

Yes.

Keep us posted if you can add this value as a tag and then correct the graph problem by utilizing this tag in a Flux filter statement.

Topic		Replies	Views
Filter ._value with a variable InfluxDB query-help , flux	2	441	May 12, 2022
Value of _measurement from result of flux query is not available InfluxDB	0	192	December 1, 2022
Last() value within a time range InfluxDB influxdb , query-help , dashboard	5	1749	February 3, 2023
Query and subquery filter other values (other entity_id) InfluxDB influxdb , query-help	1	486	February 3, 2022
How to filter a query using a variable with value = All Configuration flux	1	3683	September 20, 2022

Filtering data by value

Related topics