EMEA 2021
tiny Talks
tinyML design for environmental sensing applications
Jianyu ZHAO - Algorithm and Modeling Engineer,
Infineon Technologies AG
The deployment of large numbers of sensors to monitor various environmental parameters (such as temperature, pressure, noise, and pollutants) and the resulting availability of large amounts of data are motivating the use of machine learning (ML) algorithms, including neural networks, on small devices as well, with the goal of making the sensors “smarter” and thus enabling “intelligence at the edge”.
ML techniques allow for a more accurate analysis of complex sensor behaviors and interdependencies and can help quickly identify dangerous situations, such as the presence of poisonous gases in an indoor or outdoor environment. As more complex algorithms come into use, the scientific community is showing growing interest in the joint optimization of algorithms, software, and dedicated hardware for on-sensor data analysis (inference) on battery-operated, low-power devices.
For the specific gas sensing application addressed in this contribution, a small Gated Recurrent Unit (GRU) network is used to estimate gas concentrations in the air. It exploits the temporal properties of the sensor signals while keeping the memory footprint within budget. The model is first designed and trained on a computer cluster and then deployed on a Cypress PSoC® interface board, which is subsequently used for signal measurement, heater control, real-time concentration estimation, and communication.
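As a rough illustration, a GRU regressor with the dimensions mentioned above (25 time steps, 20 hidden units) could be defined as in the following Keras sketch; the number of input features, the single-output head, and the training configuration are assumptions for illustration only, not the actual model from the talk.

```python
# Minimal sketch of a GRU regressor for gas concentration estimation.
# TIME_STEPS and HIDDEN_UNITS come from the talk; NUM_FEATURES and the
# training setup are assumed for illustration.
import tensorflow as tf

TIME_STEPS = 25      # length of the sensor signal window
HIDDEN_UNITS = 20    # GRU hidden state size
NUM_FEATURES = 4     # number of preprocessed sensor channels (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIME_STEPS, NUM_FEATURES)),
    tf.keras.layers.GRU(HIDDEN_UNITS),   # exploits the temporal structure of the signal
    tf.keras.layers.Dense(1)             # regression output: estimated concentration
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```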
The hardware platform is equipped with an Arm Cortex-M0+ processor with 32 kB of Flash and 4 kB of SRAM. Its limited memory and computational resources hinder the use of ready-made deployment toolchains, such as TensorFlow Lite, which are normally image-oriented and require at least several hundred kB of memory. To circumvent these issues, we developed our own Python and C library dedicated to extremely small and low-cost smart sensor applications. The deployment workflow is guided by a Jupyter Notebook and can be divided into four steps: network quantization, C code generation, performance evaluation, and verification on the embedded target. With the dedicated test bench, it is possible to visualize the simulated output and to flexibly adjust the quantization setup (such as the position of the binary point), and thus to find the best trade-off between algorithm performance and memory footprint with little effort. As a result, we managed to migrate the best-performing algorithm, consisting of signal processing functions and a GRU regressor (25 time steps and 20 hidden units), from the computer cluster to the target hardware without significant loss of accuracy.
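To give a flavor of the quantization step, the sketch below shows generic fixed-point (Q-format) weight quantization with an adjustable binary point, assuming signed 16-bit words; the actual library API, word lengths, and function names used in our toolchain may differ.

```python
# Minimal sketch of fixed-point (Q-format) quantization with a movable
# binary point. Word length and function names are assumptions for
# illustration; they do not reflect the actual deployment library.
import numpy as np

def quantize_q(weights, frac_bits, word_bits=16):
    """Round float weights to signed fixed-point with `frac_bits` fractional bits."""
    scale = 1 << frac_bits
    q_min = -(1 << (word_bits - 1))
    q_max = (1 << (word_bits - 1)) - 1
    return np.clip(np.round(weights * scale), q_min, q_max).astype(np.int16)

def dequantize_q(q, frac_bits):
    """Convert fixed-point values back to float for simulated evaluation."""
    return q.astype(np.float32) / (1 << frac_bits)

# Sweeping the binary point position trades numeric range against resolution,
# which is the kind of adjustment the test bench lets you explore.
w = np.array([0.73, -1.2, 0.05], dtype=np.float32)
for frac_bits in (8, 12, 14):
    err = np.abs(w - dequantize_q(quantize_q(w, frac_bits), frac_bits)).max()
    print(f"Q{15 - frac_bits}.{frac_bits}: max quantization error {err:.5f}")
```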
The deployment library can be applied to similar sensor applications that rely on small neural networks with dense and GRU layers. In the future, we plan to extend support to convolutional layers as well.