Extending Microsoft’s Azure Digital Twins for real-time analytics

Dr William L. Bain is the founder and CEO of ScaleOut Software, which develops software products for in-memory computing and stream-processing designed to enhance operational intelligence within live systems. Bill earned a Ph.D. in electrical engineering from Rice University. Over a 40-year career focused on parallel computing, he has contributed to advancements at Bell Labs Research, Intel, and Microsoft, and holds several patents in computer architecture and distributed computing.

As countless applications need to track live systems, developers face the challenge of implementing real-time analytics that can react to incoming telemetry and quickly identify problems or opportunities. Examples include telematics software that tracks vehicles in a fleet, security software monitoring physical points of entry or network endpoints in a cyber infrastructure, health-tracking systems that analyze telemetry from wearable devices, and many others. These applications must all simultaneously digest messages from numerous data sources, find patterns of interest, and act quickly when necessary.

Using Digital Twins for Real-Time Analytics

The digital twin model has evolved over the last twenty years as a compelling, object-oriented approach to modeling the state and behavior of devices, and it has been widely adopted for product lifecycle management (PLM). Using the power of in-memory computing, new capabilities for real-time analytics can now be added to the digital twin model, enabling it to track telemetry from large numbers of data sources. This approach has the potential to simplify application design and streamline the development process while enabling consistently fast analysis even for large workloads. It can also be easily integrated into popular digital twin platforms, such as Microsoft’s Azure Digital Twins, to expand their range of applications from PLM to real-time analytics for live systems with many data sources.

Consider a software health-tracking application that analyzes telemetry from a large population of wearable devices reporting activity level, heart rate, blood oxygen, and other biometrics. A digital twin for each participant can dynamically analyze these parameters and integrate them with knowledge of the wearer’s age, physical condition, medical history, and medications to detect potential medical issues and signal alerts. Using the digital twin model, developers can implement analytics code that combines this information as telemetry flows in for each user and possibly make use of machine learning techniques to look for anomalies. The in-memory computing platform can then simultaneously run thousands of digital twins to track all users and provide immediate, relevant feedback.
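To make the idea concrete, here’s a minimal sketch in Python of what a per-participant digital twin might look like. It is not ScaleOut’s API; the class name, field names, and thresholds are all assumptions chosen for illustration (the "220 minus age" figure is only a rough rule of thumb, not medical guidance). It shows how stored context about the wearer can be combined with each incoming reading:

```python
from dataclasses import dataclass


@dataclass
class HealthTwin:
    """In-memory state for one wearer (all names here are hypothetical)."""
    user_id: str
    age: int
    on_beta_blockers: bool = False
    elevated_streak: int = 0  # consecutive elevated resting readings

    def max_expected_heart_rate(self) -> float:
        # Rough rule of thumb (220 - age); an illustrative ceiling only.
        limit = 220 - self.age
        # Assumed adjustment: this medication lowers the expected ceiling.
        return limit * 0.85 if self.on_beta_blockers else limit

    def process_message(self, heart_rate: int, activity: str) -> list:
        """Combine new telemetry with stored context; return any alerts."""
        threshold = 0.6 * self.max_expected_heart_rate()
        elevated = activity == "resting" and heart_rate > threshold
        # State carried between messages lets the twin ignore single spikes.
        self.elevated_streak = self.elevated_streak + 1 if elevated else 0
        if self.elevated_streak >= 3:
            return [f"{self.user_id}: sustained high resting heart rate"]
        return []
```

The same reading of 110 bpm at rest would eventually raise an alert for a 60-year-old wearer in this sketch but not for a 20-year-old, which is the kind of per-user context the digital twin model is meant to capture.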

Integrating Real-Time Analytics into Azure Digital Twins

Azure Digital Twins provides a compelling platform for tracking properties that describe data sources, with a rich set of features for implementing digital twins, including components, inheritance, relationships, and more. The Azure Digital Twins Explorer GUI tool lets users view digital twin models and instances, as well as their relationships. How can we integrate real-time analytics with Azure Digital Twins to ensure high performance combined with straightforward application development while leveraging the platform’s full capabilities?

When implementing logic for Azure Digital Twins, users typically create serverless functions using Azure Functions. These functions are used to ingest messages generated by data sources for delivery to digital twins via Azure IoT Hub (or other message hubs). These functions also update the properties of Azure digital twins using APIs provided for this purpose. For example, here’s a redrawn tutorial sample that shows how Azure functions can process messages from a thermostat and update both its digital twin and a parent digital twin that models the room in which the thermostat is located. Note that the first Azure function’s update triggers the Azure Event Grid to run a second function that updates the room’s property:

The challenge in using serverless functions to process messages, implement real-time analytics, and update digital twin properties is that they add overhead and complexity, and they may not provide adequate throughput at scale. By their nature, serverless functions are stateless and must obtain their state from external services; this adds latency. In addition, they are subject to scheduling and authentication overheads on each invocation, and this adds delays that limit scalability. The use of multiple serverless functions also adds complexity when developing analytics code.

Integrating an in-memory computing platform with the Azure Digital Twins infrastructure addresses both of these challenges. This technology runs on a cluster of virtual servers and hosts application-defined software objects holding digital twin properties in memory for fast access along with a software-based compute engine that can run application-defined methods with extremely low latency. By storing each Azure digital twin instance’s properties in memory and routing incoming messages to an in-memory method for processing, both latency and complexity can be dramatically reduced, and real-time analytics can be scaled to handle thousands or even millions of data sources.
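The core idea of keeping each twin’s state in memory and routing messages to it can be sketched in a few lines of Python. This is a single-process toy under stated assumptions, not the ScaleOut platform: `TwinHost`, `dispatch`, and the handler are hypothetical names, and a real deployment would distribute the objects across a cluster.

```python
from typing import Callable, Dict


class TwinHost:
    """Keeps one state dict per data source in memory and routes messages to it."""

    def __init__(self, handler: Callable[[dict, dict], None]):
        self._handler = handler
        self._twins: Dict[str, dict] = {}

    def dispatch(self, source_id: str, message: dict) -> dict:
        # Look up (or create) the twin's state without calling an external service.
        state = self._twins.setdefault(source_id, {"message_count": 0})
        self._handler(state, message)
        return state


def track_peak_temperature(state: dict, message: dict) -> None:
    """Example handler: count messages and remember the highest reading."""
    state["message_count"] += 1
    reading = message["temperature"]
    state["peak"] = max(state.get("peak", reading), reading)
```

Because the handler receives its state directly, no per-message fetch from an external store is needed, which is where a stateless function would typically pay its latency cost.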

ScaleOut Software’s newly announced Azure Digital Twins Integration does just this. It integrates the ScaleOut Digital Twin Streaming Service™, an in-memory computing platform running on Microsoft Azure (or on-premises), with the Azure Digital Twins service to provide real-time streaming analytics. It accelerates message processing using in-memory computing to ensure fast, scalable performance while simultaneously streamlining the programming model.

The ScaleOut Azure Digital Twins Integration creates a component within an Azure Digital Twin model in which it hosts “real-time” properties for each digital twin instance of the model. These properties track dynamic changes to the instance’s physical data source and provide context for real-time analytics.

To implement real-time analytics code, application developers create a message-processing method for an Azure digital twin model. This method can be written in C# or Java, expressed in an intuitive rules-based language, or built by configuring machine learning (ML) algorithms implemented by Microsoft’s ML.NET library. The method makes use of each instance’s real-time properties, which are held in a memory-based object called a real-time digital twin, and the in-memory compute engine automatically persists these properties in the Azure digital twin instance.

Here’s a diagram that illustrates how real-time digital twins integrate with Azure digital twins to provide real-time streaming analytics:

This diagram shows how each real-time digital twin instance maintains in-memory properties, which it retrieves when deployed and automatically persists in its corresponding Azure digital twin instance. The real-time digital twin connects to Azure IoT Hub or another message source to receive and then analyze incoming messages from its corresponding data source. Fast, in-memory processing provides sub-millisecond access to real-time properties and completes message processing with minimal latency. By authenticating once with the Azure Digital Twins service at startup, it also avoids an authentication delay on every message.
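The load-once, persist-on-change pattern described here can be sketched as follows. This is a hedged illustration only: the plain dictionary named `store` stands in for the Azure Digital Twins service, and `RealTimeTwin`, `set_property`, and `persist` are hypothetical names rather than part of any real SDK.

```python
class RealTimeTwin:
    """Holds one instance's real-time properties in memory (hypothetical sketch)."""

    def __init__(self, twin_id: str, store: dict):
        self.twin_id = twin_id
        self._store = store  # stand-in for the backing digital twin service
        # Retrieve the persisted properties once, at deployment.
        self.properties = dict(store.get(twin_id, {}))
        self._dirty = set()

    def set_property(self, name: str, value) -> None:
        # Only record a change when the value actually differs.
        if self.properties.get(name) != value:
            self.properties[name] = value
            self._dirty.add(name)

    def persist(self) -> dict:
        """Write only the changed properties back; return what was written."""
        changes = {name: self.properties[name] for name in sorted(self._dirty)}
        self._store.setdefault(self.twin_id, {}).update(changes)
        self._dirty.clear()
        return changes
```

Tracking which properties changed keeps the traffic to the backing service proportional to actual updates rather than to the message rate, which is one plausible way to keep persistence off the critical path.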

All real-time analytics performed during message processing run within a single in-memory method that has full access to the digital twin instance’s properties. This code also can access and update properties in other Azure digital twin instances. These features simplify the design by avoiding the need to split functionality across multiple serverless functions and by providing a straightforward, object-oriented design framework with advanced, built-in capabilities, such as ML.

To further accelerate development, ScaleOut provides tools that automatically generate Azure digital twin model definitions for real-time properties. These model definitions can be used either to create new digital twin models or to add a real-time component to an existing model. Users just need to upload the model definitions to the Azure Digital Twins service.
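For readers unfamiliar with Azure digital twin model definitions, they are JSON documents written in the Digital Twins Definition Language (DTDL). The Python sketch below builds a pair of DTDL v2 interfaces by hand: one holding real-time properties and one twin model that includes it as a component. It does not reproduce the output of ScaleOut’s tools; the `dtmi:example:` identifiers, the component name `realTime`, and the function name are assumptions made for illustration.

```python
import json


def real_time_component_models(model_name: str, properties: dict) -> list:
    """Return two DTDL v2 interfaces: a real-time component and a twin model using it."""
    component_id = f"dtmi:example:{model_name}RealTime;1"
    component = {
        "@context": "dtmi:dtdl:context;2",
        "@id": component_id,
        "@type": "Interface",
        "displayName": f"{model_name} real-time properties",
        "contents": [
            {"@type": "Property", "name": name, "schema": schema, "writable": True}
            for name, schema in properties.items()
        ],
    }
    twin = {
        "@context": "dtmi:dtdl:context;2",
        "@id": f"dtmi:example:{model_name};1",
        "@type": "Interface",
        "displayName": model_name,
        # The component entry points at the real-time interface by its id.
        "contents": [{"@type": "Component", "name": "realTime", "schema": component_id}],
    }
    return [component, twin]


models = real_time_component_models("Thermostat", {"temperature": "double"})
models_json = json.dumps(models, indent=2)  # ready to save and upload
```

Adding a real-time component to an existing model would amount to appending a similar `Component` entry to that model’s `contents` list.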

Here’s how the above tutorial example for the thermostat would be implemented using ScaleOut’s Azure Digital Twins Integration instead of serverless functions:

Note that the ScaleOut Digital Twin Streaming Service takes responsibility for ingesting messages from Azure IoT Hub and for invoking analytics code for the data source’s incoming messages. Multiple pipelined connections with Azure IoT Hub ensure high throughput. Also, the two serverless functions have been eliminated since the in-memory method handles both message processing and updates to the parent object (Room 21).
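In code, collapsing the two functions into one method might look roughly like the Python sketch below. It is an assumption-laden illustration rather than ScaleOut’s programming model: the `Thermostat` and `Room` classes, the `thermostat-1` identifier, and the message shape are hypothetical, and the parent is held as a direct object reference where a real system would update the parent Azure digital twin instance.

```python
from dataclasses import dataclass


@dataclass
class Room:
    """Parent twin that models the room containing the thermostat."""
    name: str
    temperature: float = 0.0


@dataclass
class Thermostat:
    """Device twin holding a reference to its parent room."""
    device_id: str
    room: Room
    temperature: float = 0.0

    def process_message(self, message: dict) -> None:
        # Step 1: update this twin's own property from the telemetry.
        self.temperature = message["temperature"]
        # Step 2: update the parent in the same method call, replacing the
        # second, event-triggered function of the serverless design.
        self.room.temperature = self.temperature


room = Room("Room 21")
thermostat = Thermostat("thermostat-1", room)
thermostat.process_message({"temperature": 22.5})
```

Both updates happen in one place, so there is no event hop between them to reason about or to debug.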

Combining the ScaleOut Digital Twin Streaming Service with Azure Digital Twins gives users the power of in-memory computing for real-time analytics while leveraging the full spectrum of Azure services and tools, as illustrated below for the thermostat example:

Users can view real-time properties with the Azure Digital Twins Explorer tool and track changes due to message processing. They also can take advantage of Azure’s ecosystem of big data analytics tools like Spark to perform batch processing. ScaleOut’s real-time data aggregation, continuous query, and visualization tools for real-time properties enable second-by-second tracking of live systems that boosts situational awareness for users.

Example of Real-Time Analytics with Azure Digital Twins

Incorporating real-time analytics using ScaleOut’s Azure Digital Twins Integration unlocks a wide array of use cases for Azure Digital Twins. For example, here’s how a telematics software application tracking a fleet of trucks could be implemented:

Each truck has a corresponding Azure digital twin that tracks its properties, including a subset of real-time properties held in a component of each instance. When telemetry messages flow into Azure IoT Hub, they are processed and analyzed by ScaleOut’s in-memory computing platform using a real-time digital twin that holds a truck’s real-time properties in memory for fast access and a message-processing method that analyzes telemetry changes, updates properties, and signals alerts when needed. Digital twin analytics, combined with data aggregation and visualization powered by the in-memory platform, enables dispatchers to quickly spot emerging issues and take corrective action in a timely manner.
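Here’s a small Python sketch of the two pieces described above: a per-truck message-processing method and a fleet-wide aggregation that a dispatcher’s dashboard might draw on. All names, message fields, and limits are hypothetical, and the aggregation is a simple in-process loop rather than the platform’s own aggregation feature.

```python
from collections import Counter
from dataclasses import dataclass

SPEED_LIMIT_KPH = 100       # assumed threshold for illustration
ENGINE_TEMP_LIMIT_C = 110   # assumed threshold for illustration


@dataclass
class TruckTwin:
    """Real-time properties for one truck (hypothetical field names)."""
    truck_id: str
    region: str
    speed_kph: float = 0.0
    engine_temp_c: float = 0.0
    alert: str = ""

    def process_message(self, message: dict) -> None:
        # Telemetry may be partial, so keep the last known value otherwise.
        self.speed_kph = message.get("speed_kph", self.speed_kph)
        self.engine_temp_c = message.get("engine_temp_c", self.engine_temp_c)
        if self.engine_temp_c > ENGINE_TEMP_LIMIT_C:
            self.alert = "engine overheating"
        elif self.speed_kph > SPEED_LIMIT_KPH:
            self.alert = "speeding"
        else:
            self.alert = ""


def alerts_by_region(fleet) -> Counter:
    """Aggregate across all twins: number of trucks with an active alert per region."""
    return Counter(truck.region for truck in fleet if truck.alert)
```

Grouping alerts by a property such as region is the sort of summary that can help a dispatcher spot a localized problem, for example several trucks overheating along the same route.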

Digital Twins: A Breakthrough for Real-Time Analytics

Although initially designed for use in PLM, digital twins offer developers a powerful new software model for tracking and visualizing a population of physical devices. Adding real-time analytics to digital twin platforms, such as Azure Digital Twins, extends their reach into live, production systems and unlocks a wide range of important use cases. It leverages the digital twin model to create a compelling new software architecture for analyzing incoming telemetry with low latency, high scalability, and a straightforward development model. This powerful new approach to real-time analytics enables managers to continuously examine telemetry from thousands or even millions of data sources and immediately identify emerging issues, thereby avoiding costly problems and capturing elusive opportunities.

Editor’s note: This article is in association with ScaleOut Software

(Photo by Vladimir Anikeev on Unsplash)
