This blog post reviews the three core elements of implementing artificial intelligence through tiny machine learning on microcontrollers: data, models and hardware platforms.
The ability to deploy machine learning (ML) models on small, low-power microcontrollers (MCUs) has made possible the use of artificial intelligence (AI) to process and analyze real-world data in a wide variety of sensor-based applications “at the edge” (a.k.a. endpoint).
Such deployments are possible thanks to the praiseworthy efforts of the tinyML Foundation, the driving force behind the TinyML movement, and of other bodies committed to ML at the edge.
The tinyML Foundation describes tiny machine learning as...
“…a fast-growing field of machine learning technologies and applications including hardware, algorithms and software capable of performing on-device sensor data analytics at extremely low power, typically in the mW range and below, and hence enabling a variety of always-on use-cases and targeting battery operated devices.”
As a means of implementing ML at the edge, TinyML delivers some amazing bigger-picture benefits that include low latency and less reliance on cloud connectivity (which in turn means reduced communication bandwidth, improved security and lower overall system cost). What’s not to like?
Core to implementing TinyML are data (captured and prepared) and models (generated and continually enhanced by algorithms), plus, of course, a hardware platform. In this respect, ML tasks can now be performed on many modern 32-bit MCUs.
Data
Accurate data capture and preparation are essential if meaningful data (a data set) is to be made available throughout the ML process flow (see Figure 1).
Figure 1 – The machine learning process flow
The raw (real-time) input could be something as basic as an analog signal or something more complex such as a video stream. For the analog signal, data preparation might involve filtering out noise (with discrete components) and/or conditioning using software, for example comparing consecutive velocity readings at fixed intervals to establish acceleration. For a video feed, digital filtering can be applied and algorithms used to detect motion; at this stage the algorithm need not recognize what is in frame, just that a certain percentage of pixels have changed in light intensity or color over a number of frames.
There is a trade-off though. While extracting only what is considered to be relevant data optimizes the flow, if one of the machine’s tasks is anomaly detection, this might be compromised.
Smart sensors typically combine data gathering and preparation, and algorithms can also be used to categorize the data.
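The two preparation steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the flat list of grayscale pixel intensities used as a frame, and the default thresholds are all hypothetical, and for brevity it compares two consecutive frames rather than a longer run of frames.

```python
def acceleration_from_velocity(velocities, interval_s):
    """Estimate acceleration from consecutive velocity readings
    taken at a fixed interval (simple finite difference)."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return [(v1 - v0) / interval_s
            for v0, v1 in zip(velocities, velocities[1:])]


def changed_fraction(prev_frame, curr_frame, intensity_threshold):
    """Fraction of pixels whose intensity changed by more than
    intensity_threshold between two frames of equal size."""
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same number of pixels")
    if not prev_frame:
        return 0.0
    changed = sum(1 for p, c in zip(prev_frame, curr_frame)
                  if abs(p - c) > intensity_threshold)
    return changed / len(prev_frame)


def motion_detected(prev_frame, curr_frame,
                    intensity_threshold=16, pixel_fraction=0.05):
    """Flag motion when enough pixels have changed. Both thresholds
    are hypothetical defaults, not recommended values."""
    fraction = changed_fraction(prev_frame, curr_frame, intensity_threshold)
    return fraction > pixel_fraction
```

Note that the motion check never looks at what is in frame. It only counts changed pixels, which is what keeps this stage cheap enough for a small device.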
Training data is also needed. For example, in the case of an embedded vision application, live feeds can be compared against libraries of either still images or short video clips that have already been labelled. Test sets of data should also be applied if the machine is to learn (supervised learning). Naturally, live and training data should be as compact as possible—fundamentally, TinyML needs tiny data!
Models
For ML at the edge, models must be small, fast and energy- and resource-efficient. Accordingly, TinyML is something of a balancing act, in which model compression helps. Two popular compression methods are pruning and quantization. One way of implementing the former is “weight pruning,” in which the weights of the connections between some neurons within the model are set to zero, meaning the machine need not include them when making inferences from the model. Neurons can be pruned too.
As for quantization, it reduces the precision of a model’s weights, biases and activations by converting data in a high-precision format, such as 32-bit floating point (FP32), to a lower-precision format, say an 8-bit integer (INT8).
Clearly, a lot less memory is required for the model (an INT8 value occupies a quarter of the space of an FP32 value), and subsequent processing of the data within the model will be faster. The use of quantized models needs to be factored into machine training in one of two ways:
Post-training quantization: the model is trained in, for example, FP32 format and, once training is considered complete, is quantized for deployment.
Quantization-aware training: inference-time quantization is emulated during training, creating a model that downstream tools will use to produce quantized models.
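As a rough illustration of weight pruning, the sketch below zeroes the weights with the smallest absolute values and then skips them at inference time. Choosing weights by magnitude is an assumption made for this example (the text above does not say how weights are selected), and the function names are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of weights with the smallest-magnitude fraction
    (given by sparsity, between 0 and 1) set to zero."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending absolute value; the first n_prune go.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def sparse_dot(weights, inputs):
    """Dot product that skips zeroed weights, as inference could
    after pruning."""
    return sum(w * x for w, x in zip(weights, inputs) if w != 0.0)
```

For example, pruning `[0.8, -0.05, 0.3, 0.01, -0.6, 0.02]` at a sparsity of 0.5 leaves only the three largest weights, so `sparse_dot` performs three multiplications instead of six.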
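To make the FP32-to-INT8 conversion more concrete, here is a minimal sketch of post-training quantization for a list of already-trained values. It assumes an affine mapping with a scale and a zero point, which is one common scheme and not necessarily what any particular toolchain does. All function names are hypothetical.

```python
def quantization_params(values, qmin=-128, qmax=127):
    """Derive a scale and zero point that map the value range
    onto the integer range [qmin, qmax]."""
    # Include 0.0 in the range so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Convert floats to clamped 8-bit integers."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point))
            for v in values]


def dequantize(q_values, scale, zero_point):
    """Map the integers back to approximate float values."""
    return [(q - zero_point) * scale for q in q_values]
```

Each quantized value fits in one byte rather than four, and the round trip through `quantize` and `dequantize` recovers each in-range value to within about half of `scale`.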
Granted, some accuracy is sacrificed during quantization but at one level it is no different from reducing the resolution of a digital image. It would have to become very grainy for it to be of little use to us as humans, and the same principle is true for a machine. Similarly, while improving processing speed, too much model pruning might lead to erroneous inferences. Again, it’s all a balancing act.
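The resolution analogy can be checked numerically. The sketch below is an illustration only: it assumes a simple symmetric quantizer, uses hypothetical function names, and measures how far values drift after being rounded to grids of different bit widths.

```python
def quantize_roundtrip(values, bits):
    """Round values to a signed integer grid of the given bit width,
    then map them back to floats."""
    if bits < 2:
        raise ValueError("bits must be at least 2")
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    scale = peak / qmax
    return [max(-qmax, min(qmax, round(v / scale))) * scale
            for v in values]


def mean_abs_error(values, bits):
    """Average absolute difference between the original values and
    their quantized approximations."""
    restored = quantize_roundtrip(values, bits)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)
```

Running `mean_abs_error` on values spread evenly between -1 and 1 gives a small error at 8 bits, a larger one at 4 bits and a much larger one at 2 bits, much as an image gets grainier as its resolution drops.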
MCUs
Once a suitably compact model exists it is then a case of deploying it on an MCU, and device manufacturers have not been slow in releasing devices, development platforms and software toolkits for TinyML (and AI in general). Also, ML frameworks exist to make things easier.
For example, TensorFlow’s framework/flow is: pick a new TensorFlow model or retrain an existing one, compress it into a flat buffer using the TensorFlow Lite Converter, optionally quantizing it during conversion (this is post-training quantization as described above), and load the resulting file onto the target.
As mentioned, TinyML is aimed at MCUs. Table 1 compares the hardware requirements of traditional ML against TinyML.
Table 1 – Comparison of traditional ML and TinyML resource requirements
Not surprisingly, many modern MCUs meet the requirements for TinyML; in an article on embedded.com entitled “How to quickly deploy TinyML on MCUs” the authors list seven advantages TinyML has over AI services in the cloud: lower cost, lower energy, integration, rapid prototyping, privacy/security, autonomy and real-time. The article also discusses the issues of data preparation, model training, conversion and deployment—using a rock-scissors-paper example.
TinyML’s Big Potential
TinyML is already in use in many applications, including speech and image recognition. It is also being used in a variety of industries for predictive maintenance. Moreover, ABI Research has predicted that TinyML device installs will increase from nearly 2 billion in 2022 to over 11 billion in 2027.
The potential for widespread adoption of ML and AI has been discussed for decades, but now TinyML, with its associated ecosystem of data, tools, software and hardware for delivering ML at the edge, is turning that potential into reality across a wide variety of industries.
For more information on AI and ML, be sure to check out our web page.
Yann LeFaou, Oct 12, 2023