In this post, I’d like to share my experience with InfluxDB as a long-term storage and visualization facility for Home Assistant data. The series also covers centralized logging, monitoring, and automatic error notifications. It consists of three parts:
- Part 1 (this post): installing InfluxDB as a Hass.io add-on, using Grafana for data visualization, and embedding graphs into a Lovelace card.
- Part 2: using Telegraf to collect various system metrics (e.g. CPU load and memory usage for all Docker containers, both those created by Hass.io and your own, RAID disk temperatures, etc.). This part will also cover space-efficient storage in InfluxDB of data that you want to keep forever.
- Part 3: centralized logging for any Docker container in your setup, as well as for any device that uses the Syslog protocol (including ESP** microcontrollers with Tasmota and ESPEasy firmware). This part will also include anomaly detection and severe error notifications (e.g. a notification can be sent if the same error occurs more than 5 times in an hour).
InfluxDB Advantages
- Very small memory footprint and CPU requirements compared to Elasticsearch. It is worth noting that InfluxDB runs very well on a Raspberry Pi.
- It is possible to keep important data as long as you want without a performance impact, using so-called Continuous Queries (they will be described in part 2). If you’re using SQLite with Home Assistant, increasing the number of days of history you keep may slow down your system, especially if you run it on an ARM board.
- It works well with Grafana, a very good-looking dashboard tool for time-series data. The Hass.io add-on also includes a similar tool named Chronograf.
- InfluxDB can work with its native agent Telegraf (a tiny and fast command-line utility), which can send a huge number of metrics from your systems to InfluxDB using an extensive list of plugins.
- A monitoring and anomaly detection service named Kapacitor is also included in the InfluxDB add-on. It can notify the user when bad things happen using various communication channels, including MQTT.
InfluxDB is a mature time-series database first released in 2013. Time-series means that it is optimized for data points that each carry a timestamp. Examples of such data are CPU load, network speed, disk temperature, or memory consumption. InfluxDB is written in Go, a modern programming language with static linking, which produces one compact executable file with no dependencies. This helps it handle a huge amount of data with very small memory and CPU requirements. I have been using InfluxDB for a long time on a 512 MB droplet along with additional services without any problems. When I decided to give Elasticsearch a try, I found myself struggling with 100% memory usage along with 100% CPU usage. Yes, you can try to tune Elasticsearch through the JVM settings, but either way, it really loves your RAM.
InfluxDB Basic Terminology
If you are ready to rely on InfluxDB as your data storage, it makes sense to learn at least the basic terminology and concepts behind it.
The full list of definitions can be found on the InfluxDB site; we will cover only those which may be necessary while playing with data stored by Home Assistant. Below is the sample data for my temperature sensor named `sensor.bedroom_temperature`.
Timestamp | domain | entity_id | friendly_name_str | value |
---|---|---|---|---|
2019-01-09T21:19:00.314567936Z | sensor | bedroom_temperature | Bedroom Temperature | 20.5 |
2019-01-09T21:20:00.332456192Z | sensor | bedroom_temperature | Bedroom Temperature | 20.4 |
2019-01-09T21:23:00.390132992Z | sensor | bedroom_temperature | Bedroom Temperature | 20.3 |
2019-01-09T21:25:00.430054912Z | sensor | bedroom_temperature | Bedroom Temperature | 20.4 |
2019-01-09T21:28:00.485129984Z | sensor | bedroom_temperature | Bedroom Temperature | 20.3 |
Retention Policy
A retention policy defines how long the data will be stored on disk (e.g. 7 days or infinitely). It also defines some cluster parameters which are irrelevant for this article. Each database in InfluxDB is associated with an automatically generated retention policy named `autogen`. Most of the time, when you pick a database in the Kapacitor UI or issue an InfluxQL query, the database name is used together with the retention policy name, separated by a dot, e.g. `home_assistant.autogen`.
Database
A database contains data from Home Assistant sensors and other entities. How long the data remains available is defined by a retention policy. By default, the Home Assistant InfluxDB component stores its data in a database named `home_assistant`, but in general InfluxDB can hold as many databases as you need.
Timestamp
A timestamp is the date and time when a sensor value was captured. By design, every data point stored in InfluxDB has a timestamp. Timestamps are always stored in UTC and can be converted to local time using the client's time zone settings.
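As a minimal sketch of that conversion (not part of the original setup), the Python snippet below parses one of the timestamps from the sample table and shifts it to a local zone. The helper name `parse_influx_timestamp` and the fixed UTC+1 offset are assumptions chosen for illustration. Python's `datetime` only holds microseconds, so the nanosecond digits are truncated.

```python
from datetime import datetime, timedelta, timezone


def parse_influx_timestamp(ts: str) -> datetime:
    """Parse a UTC timestamp in the format shown in the sample table."""
    if not ts.endswith("Z"):
        raise ValueError("expected a UTC timestamp ending in 'Z'")
    base, _, fraction = ts[:-1].partition(".")
    # datetime keeps microseconds only: pad or truncate the fraction to 6 digits.
    micros = int((fraction + "000000")[:6])
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


# Hypothetical local zone (UTC+1), used only for this example.
LOCAL_ZONE = timezone(timedelta(hours=1))

utc_time = parse_influx_timestamp("2019-01-09T21:19:00.314567936Z")
local_time = utc_time.astimezone(LOCAL_ZONE)
print(utc_time.isoformat())
print(local_time.isoformat())
```

In a real client you would normally use your actual time zone instead of a fixed offset, so that daylight saving time is handled for you.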
Field
A field is any key-value pair of your sensor data. In the above table, the fields are `value` and `friendly_name_str`. As said above, each field is associated with a timestamp.
Tag
In the above table, the tags are `domain` and `entity_id`. Tags are string key-value pairs that usually take a limited set of values. Tags are indexed internally, so queries that filter on them are very fast.
Measurement
A measurement is a way to keep different sets of tags, fields, and timestamps separated. It sounds complicated, but it is a simple concept which allows having multiple named datasets within a database. By default, the InfluxDB HA component creates a measurement for temperature sensors, a measurement for humidity sensors, a measurement for data without a unit of measure, etc. (This behavior can be changed with the `override_measurement` parameter of the component configuration.) Each measurement can be associated with one or many retention policies.
Let’s check which measurements are in my database `home_assistant`:
There are three measurements: some of them are named after the units of measure reported by sensors, and one is `state`, which contains values of entities without units, e.g. switches. Storing all unitless values in the `state` measurement is controlled by the `default_measurement` parameter of the InfluxDB HA component configuration.
One useful application of measurements is to store data with different lifetime requirements in different measurements. Data downsampled to a one-hour interval takes roughly 60 times less disk space than data stored every minute. This approach will be discussed in the second part of this article.
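The factor of 60 is simple arithmetic: 60 per-minute points in each hour collapse into one. The sketch below illustrates it in plain Python with made-up readings. It is not how InfluxDB does downsampling internally, and the function name `downsample_hourly` and the choice of the mean as the aggregate are my assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def downsample_hourly(points):
    """Average (timestamp, value) pairs into one point per hour."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Truncate the timestamp to the start of its hour.
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return [(hour, mean(values)) for hour, values in sorted(buckets.items())]


# One day of synthetic per-minute readings (invented values).
start = datetime(2019, 1, 9, 0, 0)
minutely = [(start + timedelta(minutes=i), 20.0 + (i % 60) / 100) for i in range(24 * 60)]

hourly = downsample_hourly(minutely)
print(len(minutely), "->", len(hourly))
```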
Installation and Configuration
The easiest way to install InfluxDB is the community add-on available for Hass.io. Advanced users can try to install InfluxDB manually using the official guide or use one of the ready-to-use containers from Docker Hub. One of the advantages of Hass.io is that the system will automatically select the proper add-on architecture for you based on your hardware setup (ARM or x86).
Installing InfluxDB Hass.io add-on
In order to install the add-on, open the Hass.io->ADD-ON STORE tab and make sure that Community Hass.io Add-ons is present.
If it is not present, put https://github.com/hassio-addons/repository into the Add new repository by URL field. When the URL is added, a new section named Community Hass.io Add-ons should appear in the list of add-ons. Select and install the InfluxDB add-on:
When the addon is installed, modify its configuration as given below and restart the addon:
Thanks to the Ingress technology, introduced in Home Assistant 0.91.3, we don’t need to mess around with ports and configuration changes for iframe embedding; all of this is done by a single switch named Show in sidebar. Ingress also makes it unnecessary to enter a login and password for InfluxDB authentication.
Create a User and the Database
We need to create a user and a database in InfluxDB to make data flow. Open the InfluxDB web UI (named Chronograf) and select InfluxDB Admin->Users->Create user from the left menu. Assign “ALL” permissions to the user.
The next step is the database. By default, the InfluxDB component for Home Assistant uses a database unsurprisingly named `home_assistant`; this can be changed in `configuration.yaml`. Open the Databases tab and create our DB:
For Duration I usually pick the 7-day option, which is enough for troubleshooting. Long-term data requires another approach that reduces its size and frequency; we will see how to keep such data forever in part 2.
Home Assistant Configuration
Add the following section to `configuration.yaml`:
The `include` section lists the HA entities we would like to persist in our database. A detailed description of the component configuration is given here: https://www.home-assistant.io/components/influxdb/
The `default_measurement` option defines the name of the measurement used for data from entities without a unit of measurement, e.g. switch states.
The `override_measurement` option allows keeping all Home Assistant data in one measurement with the given name. When you have to write queries by hand, it may be more convenient to use a name like `temperature` instead of `°C`.
Proper Names for Measurements
When Does It Matter
This step is optional and may be skipped, especially if you have no plans to create Continuous Queries and will use visual query builders like those in Grafana or Kapacitor. But if you are going to try CQs in the future, it’s worth configuring the InfluxDB component properly.
Problem Description
By default, measurements in InfluxDB are named after the units of measurement from Home Assistant. For example, humidity data will fall into the `%` measurement, temperature data will be in `°C` or `°F`, etc. This is inconvenient for several reasons:
- when writing an InfluxDB query by hand, you will need to copy special characters like `°`, as you cannot easily enter them from the keyboard, and names containing characters like `%` have to be escaped or quoted to avoid query parser errors
- data with different meanings but the same unit of measurement may fall into one measurement, e.g. humidity values and battery status
- data without units (switches, counters, etc.) falls either into the default measurement or into a measurement named after the entity
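To show what the first point means in practice, here is a hedged sketch that builds an InfluxQL query against a measurement named `°C`. In InfluxQL, identifiers containing special characters go in double quotes and string values in single quotes. The helper names are hypothetical, and the escaping is deliberately minimal (quotes only), so treat it as an illustration and not as a safe query builder.

```python
def quote_identifier(name: str) -> str:
    """Wrap a database, retention policy, measurement or key name in double quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def quote_string(value: str) -> str:
    """Wrap a string literal (e.g. a tag value) in single quotes."""
    return "'" + value.replace("'", "\\'") + "'"


def build_select(database, retention_policy, measurement, entity_id):
    """Build a SELECT for one entity using the fully qualified measurement name."""
    source = ".".join(
        quote_identifier(part) for part in (database, retention_policy, measurement)
    )
    return f'SELECT "value" FROM {source} WHERE "entity_id" = {quote_string(entity_id)}'


query = build_select("home_assistant", "autogen", "°C", "bedroom_temperature")
print(query)
```

With a measurement renamed to something like `temperature`, the same query can be typed from any keyboard without copying the degree sign.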
Fortunately, measurement names can be customized in a few ways: for a single entity, such as a specific sensor with its own ID; for a domain like `sensor`; and globally for all HA entities. The InfluxDB component documentation briefly mentions this but contains no extra details (at the time of writing). Luckily, I found a PR on GitHub which contains enough details to sort it out.
In the example below we assign the `lt` measurement to a specific entity, `counter.hotw`. This is defined in the `component_config` section. Additionally, we specify that data from all entities with `temperature` at the end of their names should fall into the `temperature` measurement. Such pattern-based (glob) customizations are defined in the `component_config_glob` section:
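To illustrate how such rules pick a measurement, here is a rough Python sketch using the standard `fnmatch` module. The rule tables, the `sensor.*_temperature` pattern, the fallback to `state`, and the precedence (exact entity first, then glob patterns) are my assumptions for the example. Home Assistant's own matching code may differ in detail, and this sketch ignores the unit-based naming entirely.

```python
from fnmatch import fnmatchcase

# Hypothetical rules mirroring the example described in the text.
ENTITY_RULES = {"counter.hotw": "lt"}
GLOB_RULES = {"sensor.*_temperature": "temperature"}
DEFAULT_MEASUREMENT = "state"


def measurement_for(entity_id: str) -> str:
    """Pick a measurement name for an entity (simplified, assumed precedence)."""
    # An exact per-entity rule wins.
    if entity_id in ENTITY_RULES:
        return ENTITY_RULES[entity_id]
    # Otherwise the first matching glob pattern is used.
    for pattern, measurement in GLOB_RULES.items():
        if fnmatchcase(entity_id, pattern):
            return measurement
    return DEFAULT_MEASUREMENT


for entity in ("counter.hotw", "sensor.bedroom_temperature", "switch.kitchen_light"):
    print(entity, "->", measurement_for(entity))
```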
Verify That Data Flows
Let’s check that InfluxDB keeps our data. Open its web UI and go to the Explore tab. Select `home_assistant.autogen` from the list of databases in the bottom left. As mentioned before, `autogen` is the name of the default retention policy associated with the DB.
We should see some values (like `%`, `°C`, `state`, etc.) in the list named Measurements & Tags. Or, if customization rules were defined, the measurement names will be like `temperature`, `humidity`, `lt`. If we click `temperature`, for instance, the nested list will show all tags associated with this measurement. You want `entity_id`, as it refers to the sensor identifier from Home Assistant. Click on `entity_id` and you should see your sensor IDs (mine is `bedroom_temperature`).
Click the Submit Query button. You should see your data now:
Install Grafana
With all due respect to the talented InfluxData team, their Chronograf still lacks some essential features. I haven’t found a way to use two Y axes (useful for combining temperature and humidity in one diagram). I also missed an embed feature that would allow using these graphs in Lovelace. The latter was announced, but I did not manage to find any information on how to use it. So we have to install Grafana.
Grafana Hass.io Addon Installation
Since the community add-ons repository has already been added, just pick the Grafana add-on from the list and install it. The add-on configuration is pretty simple:
Hit Save and restart the add-on; now it is ready to use. But first, we have to do some initial configuration:
- Log in to InfluxDB and create a Grafana user, or stick with an existing user
- Log in to Grafana using the `admin/hassio` credentials. Then create a new data source (Configuration->Data Sources->Add Data Source)
- Choose InfluxDB as the data source type
- The data source creation dialog should appear. Fill in an arbitrary data source name in the Name field
- Copy and paste http://a0d7b954-influxdb:8086 into the URL field (this is the internal name of the add-on's Docker container within the Hass.io Docker network)
- In the InfluxDB Details section, enter the Home Assistant database name, username, and password.
The rest of the checkboxes should remain unchecked. Hit Save to save your configuration. Now it is time to create our first graph.
Creating a Graph
- Select + -> Create Dashboard -> Graph from the sidebar menu
- In order to create a query, open the dropdown menu under Panel Title and select Edit:
- The query builder is at the bottom. Click select measurement and choose the corresponding measurement, e.g. `temperature`. Select `entity_id` in the WHERE clause and choose your sensor ID:
- Select `fill(none)` in the GROUP BY field. You should see your temperature graph now.
- Let’s add another type of data, say humidity (this is an optional step). To do that, click Add Query and repeat the previous two steps with the `humidity` measurement.
- To tell Grafana that the values on the graph use different units of measure, click on the thin colored line in the graph legend; a popup window should appear. Select Y Axis: `Right`
- After the changes are applied you should see something like this:
Let’s add this nice graph to a Lovelace card. We want a card of the `iframe` type:
In the `URL` property, we should specify the Grafana panel URL. It can be grabbed from the Share dialog (available from the same popup menu where Edit is). When the Share Panel popup appears, go to the Embed tab. Uncheck the unneeded Current Time Range checkbox and copy the URL from the edit box to your clipboard.
Let’s open Lovelace and enjoy the result:
You are awesome!
Light Theme for Grafana
What is that? The dark theme is not a perfect match for the default Lovelace color scheme. Fortunately, this can be fixed with ease. Add the `&theme=light` parameter to the Grafana URL so that the final URL looks like:
Now it looks very good!
There is one more thing to improve: all sensor values fall within a very narrow range. For example, the temperature delta is less than a degree Celsius. To make the graph more useful, let’s set minimum and maximum values for both Y axes using common sense:
Don’t forget to save the changes with this small button showing a 3.5” floppy disk. Have no idea what a floppy disk looks like? You’re welcome:
When the changes are saved, our diagram becomes less weird:
Refresh Interval for Embedded Diagrams
Another useful parameter named `refresh` allows specifying the refresh interval for our embedded panel. Add the following to refresh it every minute:
Specify Relative Time for Embedded Diagrams
There is a special control in Grafana UI to specify relative time for graph data:
An important point here is that this parameter can be set to any value from the Grafana UI and saved for a panel, but the embedded diagram will always ignore it. There are two ways to specify relative time for an embedded graph:
- Go to the Time Range tab in the Graph Editor, fill in Override relative time, e.g. `8h`, and save the changes. Refresh the embedded graph and it should use the specified time period.
- Use special URL parameters as described in the ticket. Below is an example of a URL which specifies an 8-hour time range (pay attention to the `from` and `to` parameters):
The second approach lets us have several Lovelace cards with different time ranges based on the same panel. One could even consider building a custom Lovelace card with a time range switcher.
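If you end up generating several such URLs, the parameters discussed above (`theme`, `refresh`, `from`, and `to`) can be appended programmatically. Below is a small sketch using only the Python standard library. The base URL, host name, and panel identifiers are invented placeholders (use the URL copied from your own Share dialog), and the helper name `with_embed_options` is my own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_embed_options(panel_url, theme="light", refresh="1m", hours=8):
    """Return panel_url with theme, refresh and relative time range parameters set."""
    parts = urlsplit(panel_url)
    # Keep the existing query parameters and add or overwrite ours.
    query = dict(parse_qsl(parts.query))
    query.update(
        {"theme": theme, "refresh": refresh, "from": f"now-{hours}h", "to": "now"}
    )
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical embed URL; replace it with the one from Grafana's Share dialog.
BASE_URL = "http://hassio.local:3000/d-solo/abc123/home?orgId=1&panelId=2"

eight_hours = with_embed_options(BASE_URL)
one_day = with_embed_options(BASE_URL, hours=24)
print(eight_hours)
print(one_day)
```

Each resulting URL can go into its own iframe card, giving an 8-hour and a 24-hour view of the same panel.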
Conclusion
We have learned how to store Home Assistant data in InfluxDB, create awesome diagrams with Grafana, and embed them into nice-looking Lovelace cards. Part 2 will cover InfluxDB Continuous Queries, which make it possible to keep historical data for an unlimited period of time with no significant performance impact. It will also discuss the Telegraf agent, which allows collecting various system metrics in InfluxDB. That’s it for part 1, stay tuned!