The goal of the Internet of Things (IoT) is to acquire data from various embedded systems and impart analytical processes on that data to improve performance, efficiency, and business outcomes. In part one of a three-part series on designing analytics-driven embedded systems, best practices for the acquisition and pre-processing of IoT data are reviewed.
Analytics-driven embedded systems are here. The ability to create analytics that process massive amounts of business and engineering data is enabling designers in many industries to develop intelligent products and services. Designers can use analytics to describe and predict a system’s behavior, and further combine analytics with embedded control systems to automate actions and decisions.
In some implementations, the analytics are performed in the cloud to improve embedded system performance. Borislav Savkovic, a control systems engineer by training, led a team at BuildingIQ to design a building climate control system that uses analytics to reduce energy consumption. The system starts with gigabytes of engineering and business data. The engineering data comes from power meters, thermometers, pressure sensors, and other HVAC sensors. The business data comes from weather forecasts, real-time energy prices, and demand response data. In the analytics-driven system, the team uses signal processing to remove noise, machine learning to detect spikes, control theory to account for heating and cooling dynamics, and multi-objective optimization with hundreds of parameters. The analytics running in BuildingIQ’s cloud service tune the building’s HVAC embedded systems. The result: an analytics-driven system that reduces energy consumption by up to 25 percent in commercial buildings.
In other cases, the analytics run directly in the embedded systems themselves. The design team at Scania, the Swedish truck manufacturer, embeds analytics into its emergency braking systems to provide real-time crash avoidance, reducing accidents and meeting stringent EU regulations. Engineering data from cameras and radar is processed in real time for object detection and road marking detection, and subsequently fused to signal collision warning alerts and automatic brake requests. System safety and reliability are ensured with exhaustive testing and verification, including test scenario creation, system modeling with simulated and recorded data, and hardware-in-the-loop (HIL) testing.
These two examples highlight the steps designers use in developing analytics-driven systems:
- Pre-processing massive amounts of data
- Developing analytic algorithms
- Running analytics and controls in real time
- Integrating analytics with sensors and embedded devices, and possibly other non-embedded resources such as IT systems and the cloud
We’ll cover pre-processing in this installment, and the remaining steps in parts two and three.
The first step in developing analytics is to access the wealth of available data to explore patterns and develop deeper insights. The datasets are not only large in size, but can also come from many different sources and represent many different attributes. Therefore, the software tools you use for exploratory analysis and analytics development should be capable of accessing all the data sources and formats you plan to use. File types might include text, spreadsheet, image, audio, video, geospatial, web, and XML. You may also need application-specific data formats such as the Common Data Format (CDF) or Hierarchical Data Format (HDF) for scientific data, and CAN for automotive data. You also should be able to access data from the storage and generation points, such as:
- Stored data: Databases, data warehouses, distributed file systems, and Hadoop big data systems
- Equipment data: Such as live and historical industrial plant data stored in distributed control systems (DCSs), supervisory control and data acquisition systems (SCADAs), and programmable logic controllers (PLCs)
- Internet of Things devices: Including sensors, local hub, or cloud data aggregators
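To make this concrete, here is a minimal sketch of combining engineering and business data from two different formats. The file contents, column names, and values below are all illustrative, assuming the engineering data arrives as CSV sensor logs and the business data as JSON from a web API; in practice the sources might be a SCADA historian export or a database query instead.

```python
import io
import pandas as pd

# Hypothetical sensor log in CSV form (e.g. exported from a plant historian).
csv_data = io.StringIO(
    "timestamp,sensor_id,temp_c\n"
    "2024-01-01T00:00:00,hvac_01,21.5\n"
    "2024-01-01T00:05:00,hvac_01,21.7\n"
)

# Hypothetical business data in JSON form (e.g. energy prices from a web API).
json_data = io.StringIO(
    '[{"timestamp": "2024-01-01T00:00:00", "price_mwh": 42.0},'
    ' {"timestamp": "2024-01-01T00:05:00", "price_mwh": 47.5}]'
)

engineering = pd.read_csv(csv_data, parse_dates=["timestamp"])
business = pd.read_json(json_data, convert_dates=["timestamp"])

# Join the two sources on their shared timestamp for downstream analysis.
combined = engineering.merge(business, on="timestamp")
print(combined)
```

The key point is that both sources end up in one tabular structure keyed on time, which is the shape most analytics development tools expect.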
A key step is data cleaning and preparation before developing predictive models. For example, data might have missing or erroneous values, or it might use different timestamp formats. Predictions from erroneous data can be difficult to debug, or worse, can lead to inaccurate or misleading predictions that impact system performance and reliability. Common pre-processing tasks include:
- Cleaning data that has errors, outliers, or duplicates
- Handling missing data with discarding, filtering, or imputation
- Removing noise from sensor data with advanced signal processing techniques
- Merging and time-aligning data with different sample rates
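The four tasks above can be sketched in a few lines. This is an illustrative example on synthetic data, assuming a 1 Hz temperature stream with a gap, a duplicated reading, and sensor noise; the window sizes and target rate are arbitrary choices, not recommendations.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical 1-second temperature readings with noise, a gap, and a duplicate.
idx = pd.date_range("2024-01-01", periods=60, freq="s")
temp = pd.Series(21.0 + rng.normal(0, 0.2, 60), index=idx)
temp.iloc[10:13] = np.nan                     # missing samples
temp = pd.concat([temp, temp.iloc[[5]]])      # a duplicated reading

# 1. Drop duplicate timestamps, keeping the first occurrence.
temp = temp[~temp.index.duplicated(keep="first")]

# 2. Impute missing values by linear interpolation in time.
temp = temp.interpolate(method="time")

# 3. Smooth high-frequency noise with a rolling median filter.
temp = temp.rolling(window=5, center=True, min_periods=1).median()

# 4. Resample to a 10-second rate to time-align with slower sensors.
temp_10s = temp.resample("10s").mean()

print(temp.isna().sum(), len(temp_10s))  # 0 NaNs remain, 6 aligned samples
```

A rolling median is shown here as a simple denoiser; real deployments often use more advanced signal processing, such as the filtering BuildingIQ applies to its HVAC data.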
Another important part of pre-processing is data transformation and reduction. The goal here is to find the most predictive features of the data and filter data that will not enhance the predictive power of the analytics model. Some common techniques include:
- Feature selection to reduce high-dimension data
- Feature extraction and transformation for dimensionality reduction
- Domain analysis such as signal, image, and video processing
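As one example of feature extraction for dimensionality reduction, principal component analysis (PCA) projects many correlated sensor channels onto a few uncorrelated components. The sketch below uses synthetic data in which 10 channels are driven by 3 underlying factors, an assumption made purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Hypothetical dataset: 200 observations of 10 correlated sensor channels,
# actually driven by only 3 underlying factors plus a little sensor noise.
base = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = base @ mixing + rng.normal(0, 0.05, (200, 10))

# Standardize the channels, then keep the principal components that
# together explain 95% of the variance.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)

print(X.shape, "->", X_reduced.shape)
```

The reduced matrix carries most of the predictive information in far fewer columns, which shrinks both model size and the amount of data that must move through the system.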
There is an increasing need to push more of the data pre-processing and reduction onto the sensor or embedded device itself. There are many reasons for this, but the two most prominent are power and speed. Many smart devices need to run for extended periods without charging, and one of the largest power draws is the wireless communication used to send data to a server or cloud analytics engine. Since streaming all raw sensor data can be prohibitively costly, a good system design pre-processes data locally and uploads only the useful information, or the predictive signal itself. Speed matters because the value of a real-time system lies in its timely response; you may need to embed intelligence in the sensor or embedded device to decide when and how often to communicate with other devices in the network to achieve that responsiveness.
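One common pattern for this kind of on-device reduction is send-on-delta reporting: the device smooths readings locally and transmits only when the smoothed value moves by more than a threshold, rather than streaming every raw sample. The function name, window size, and threshold below are illustrative choices, not part of any particular device's API.

```python
def send_on_delta(samples, window=5, delta=1.0):
    """Yield (index, value) pairs worth transmitting upstream."""
    last_sent = None
    buf = []
    for i, x in enumerate(samples):
        buf.append(x)
        if len(buf) > window:
            buf.pop(0)
        smoothed = sum(buf) / len(buf)   # cheap local moving average
        # Transmit only the first sample and any significant change.
        if last_sent is None or abs(smoothed - last_sent) >= delta:
            last_sent = smoothed
            yield i, round(smoothed, 2)

# Hypothetical temperature readings: steady, then a genuine rise.
readings = [20.0, 20.1, 20.0, 20.2, 20.1, 21.5, 22.8, 23.0, 23.1, 23.0]
transmitted = list(send_on_delta(readings))
print(f"{len(transmitted)} of {len(readings)} samples transmitted")
# -> 3 of 10 samples transmitted
```

Even in this tiny example the radio fires for only 3 of 10 samples while still capturing the rise; over millions of samples, that kind of reduction is what makes battery-powered operation feasible.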
BuildingIQ and Scania are good examples of the challenges of capturing and pre-processing engineering data, and of successful approaches to combining it with traditional business data to develop analytics-driven systems.
In part two of this three-part series, we’ll cover advanced analytics algorithms such as machine learning and deep learning, and show how software tools enable domain experts to develop and run analytics and prescriptive controls in real time.