A reference architecture for manufacturers to push innovation, adaptability, and continuous improvement.

By Dunith Dhanushka on February 13, 2024
Blueprint for the future: architecting for industrial IoT workloads

The manufacturing landscape has evolved beyond recognition since the Industrial Revolution. Back in the day, those with the most manpower or the largest fleet of machines owned the lion’s share of the market. Fast forward to today, and the center of attention in manufacturing is no longer the machines but the data they produce.

The emergence of the Internet of Things (IoT), Big Data analytics, and cloud computing has shifted the manufacturing paradigm from labor-intensive processes to data-driven automated factories. Businesses that leverage data to understand, operationalize, and control machines can gain a competitive advantage in the market with continuous innovation, efficiency, and cost reduction.

As a helpful nudge in that direction, this post presents a reference architecture for Industrial Internet of Things (IIoT) workloads that can be efficiently implemented in manufacturing plants. But first, let’s get on the same page on what IIoT is all about.

Industry 4.0 and IIoT

Industrial Internet of Things (IIoT) refers to the use of Internet of Things (IoT) technologies that specifically focus on integrating smart devices and advanced analytics in the industrial sector to improve efficiency, productivity, and overall operations. For example, in the manufacturing industry, IIoT sensors are commonly used to capture data on machine performance to predict equipment failures and automate quality control in real time.

The impact of IIoT cannot be overstated, and the evidence only continues to strengthen in its favor. One McKinsey study, for instance, revealed that IIoT-powered predictive maintenance can reduce downtime by 45% and cut costs by up to 30%. Consequently, IIoT continues to rapidly transform manufacturing from a machine-focused industry to a data-focused one. This shift rests largely on three pillars:

  1. Connectivity: IIoT involves connecting industrial devices, equipment, and systems to a network infrastructure, allowing them to communicate with each other. This connectivity enables the collection and sharing of data in real time.

  2. Data collection and analysis: IIoT devices generate and collect vast amounts of data. This data can be analyzed using advanced analytics tools to extract valuable insights, optimize processes, and make data-driven decisions.

  3. Automation and control: IIoT enables automation by connecting sensors, actuators, and other devices to control systems. This can lead to more efficient and precise control of industrial processes.

A Reference Architecture for Industrial IoT

IIoT is a broad domain comprising several sub-areas, so manufacturers implementing an IIoT architecture need a clear overview of the landscape to keep the long-term cost and complexity of their IIoT projects under control.

To that end, we present the following reference architecture for IIoT applications that leverage streaming data for swift, data-driven decision-making.

Diagram of a proposed reference architecture for IIoT

The following table summarizes the key components of this solution and each one's responsibilities.

| Component | Responsibility |
| --- | --- |
| PLC devices and IoT sensors on machines | Emit telemetry data |
| Redpanda cluster | Telemetry data ingestion and event-driven workflows |
| Apache Flink® cluster | Stateful stream processing and streaming ETL |
| Machine learning models | Predictive analysis of telemetry data |
| Time series database | Equipment monitoring and running diagnostics |
| Workflow engines | Trigger automated business workflows |
| Data lake and warehouses | Keep cold industrial data that can be leveraged for experimentation and process optimization |
| Line of Business (LoB) applications | Internal business systems (e.g., inventory, supply chain, etc.) |

Before I explore each component in detail, allow me to introduce Redpanda, the centerpiece of the architecture that serves as a hub connecting the data flow across different components.

Redpanda as the central data hub

Redpanda is a simple, powerful, and cost-efficient streaming data platform that’s fully compatible with Apache Kafka® APIs while eliminating the usual Kafka complexity. Designed to be an “easy button for streaming data,” Redpanda is free from external dependencies (like the JVM or ZooKeeper) and comes with a human-friendly CLI and a rich web UI that greatly simplify working with real-time data.

So why use Redpanda in an IIoT architecture? Machines on a factory floor emit continuous streams of telemetry data. Collecting that data in a central location lets downstream applications consume it from a single place, without point-to-point integration channels.

Redpanda servers act as the central data hub for these data streams. This enables scalable, real-time data ingestion from machines and provides durable data retention until downstream applications consume it. Plus, having Redpanda as the centerpiece decouples data producers from consumers, allowing them to scale and evolve independently.
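To make the decoupling idea concrete, here is a toy, in-memory sketch of an append-only log with independent consumer offsets. It is not how Redpanda is implemented; the TopicLog class and the group names ("dashboard", "archiver") are hypothetical and exist only to show that producers and consumers never need to know about each other.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log with per-consumer-group offsets (illustration only)."""

    def __init__(self):
        self._records = []                 # retained records, in arrival order
        self._offsets = defaultdict(int)   # next offset to read, per consumer group

    def produce(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return the next unread records for a group and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for rpm in (1200, 1250, 1300):
    log.produce({"machine_id": "press-01", "rpm": rpm})

# Two independent consumers read the same retained data at their own pace.
dashboard_batch = log.poll("dashboard", max_records=2)
archiver_batch = log.poll("archiver", max_records=10)
```

Because each group tracks its own offset, a slow consumer never blocks a fast one, and a new consumer can be added later without touching the machines that produce the data.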

As the icing on the cake, Redpanda’s lean, cost-efficient design consumes a third of the resources of JVM-based alternatives, such as Kafka. This lean infrastructure footprint is particularly useful for manufacturing plants that need to deploy real-time streaming data solutions within resource-constrained environments, like edge devices. In addition, Redpanda’s Tiered Storage offloads older data to cloud object stores, like Amazon S3, significantly lowering telemetry data retention costs.

Now that we understand the heart of our architecture, let’s move on to how the surrounding components contribute to the three pillars of IIoT.

Connectivity and communication

The first step in an IIoT-enabled environment is to establish communication interfaces with the machinery. In this step, there are two primary goals: read data from machines (telemetry), and write data to machines (control and automation).

Machines in a manufacturing plant can have legacy or proprietary communication interfaces as well as modern IoT sensors. Most industrial machines today are operated by programmable logic controllers (PLCs). A PLC is an industrial computer ruggedized and adapted to control manufacturing processes, such as assembly lines, machines, and robotic devices, or any activity requiring high reliability, ease of programming, and process fault diagnosis.

However, PLCs provide limited connectivity interfaces with the external world over protocols like HTTP and MQTT, restricting external data reads (for telemetry) and writes (for control and automation). Apache PLC4X bridges this gap by providing a set of API abstractions over legacy and proprietary PLC protocols.

PLC4X is an open-source universal protocol adapter for IIoT appliances that enables communication over protocols including, but not limited to, Siemens S7, Modbus, Allen Bradley, Beckhoff ADS, OPC-UA, Emerson, Profinet, BACnet, and EtherNet/IP. The biggest advantage of PLC4X is that it provides a Kafka Connect connector. This allows applications to read from and write to PLC devices much as they would access databases over JDBC.

Aside from PLCs, modern machines are also equipped with IoT sensors that communicate via the MQTT protocol, making it possible to use MQTT sink and source connectors for data exchange.
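As a rough illustration of what an MQTT source bridge does conceptually, the sketch below maps one MQTT message to a keyed record for a Kafka-compatible producer. The MQTT topic layout (plant/<line>/<machine_id>/telemetry), the target topic name, and the mqtt_to_record function are all assumptions made for this example; a real connector is configured rather than hand-coded.

```python
import json


def mqtt_to_record(mqtt_topic, payload, target_topic="machine-telemetry"):
    """Map one MQTT message to a (topic, key, value) record.

    Assumes a hypothetical MQTT topic layout: plant/<line>/<machine_id>/telemetry
    """
    parts = mqtt_topic.split("/")
    if len(parts) != 4 or parts[0] != "plant" or parts[3] != "telemetry":
        raise ValueError(f"unexpected MQTT topic: {mqtt_topic!r}")
    _, line, machine_id, _ = parts

    reading = json.loads(payload)      # sensors are assumed to publish JSON
    reading["line"] = line             # enrich the reading with its origin
    reading["machine_id"] = machine_id

    # Keying by machine ID sends each machine's readings to the same partition,
    # which preserves their relative order.
    key = machine_id.encode("utf-8")
    value = json.dumps(reading, sort_keys=True).encode("utf-8")
    return target_topic, key, value


topic, key, value = mqtt_to_record(
    "plant/line-2/press-01/telemetry", b'{"temperature_c": 71.5}'
)
```

The returned tuple can be handed to whichever Kafka-compatible client the plant already uses.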

Connectivity from machines to Redpanda

Data collection and analysis

Regardless of the communication mechanism used above, data collected from machines are ingested into Redpanda in real time for downstream consumption. This telemetry data carries a rich set of information, such as:

  • Operational metrics - Runtime duration (the total time the machine has been in operation), start and stop times

  • Performance metrics - Speed and RPM, throughput, and efficiency

  • Health metrics - Temperature, pressure, vibration, and noise levels

  • Resource utilization - Energy consumption, material usage

  • Faults and alarms - Error codes and warning messages
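One possible way to represent such a telemetry event is sketched below. The field names and units are assumptions chosen for illustration, not a schema prescribed by Redpanda or by any machine vendor.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class TelemetryEvent:
    """Hypothetical telemetry event covering the metric groups listed above."""

    machine_id: str
    timestamp_ms: int
    runtime_s: Optional[float] = None         # operational metric
    rpm: Optional[float] = None               # performance metric
    temperature_c: Optional[float] = None     # health metric
    vibration_mm_s: Optional[float] = None    # health metric
    energy_kwh: Optional[float] = None        # resource utilization
    error_codes: list = field(default_factory=list)  # faults and alarms

    def to_json(self):
        # Drop unset metrics so that sparse readings stay small on the wire.
        body = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(body, sort_keys=True)


event = TelemetryEvent(
    "press-01", 1707811200000, rpm=1250.0, temperature_c=71.5, error_codes=["E042"]
)
encoded = event.to_json()
```

Making every metric optional reflects the fact that different machines report different subsets of these measurements.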

After ingestion into Redpanda, telemetry data feeds must be cleansed and normalized to produce the output formats that downstream applications expect. This involves event filtering, protocol transformation, enrichment, and aggregation for analytics. A stateful stream processor, like Flink, can be employed for this purpose as it provides native integration with Redpanda as a data source.
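The plain-Python sketch below shows the kind of logic such a streaming job performs: filtering out unusable readings, normalizing units, and aggregating per tumbling window. In production this would run inside a stream processor such as Flink; the field names, the one-minute window, and the Fahrenheit handling are assumptions for illustration.

```python
from collections import defaultdict

WINDOW_MS = 60_000  # one-minute tumbling windows (illustrative choice)


def normalize(raw):
    """Filter unusable readings and convert units; returns None to drop an event."""
    if raw.get("temperature") is None:
        return None
    temp = float(raw["temperature"])
    if raw.get("unit", "C") == "F":            # normalize Fahrenheit to Celsius
        temp = (temp - 32.0) * 5.0 / 9.0
    return {
        "machine_id": raw["machine_id"],
        "timestamp_ms": int(raw["timestamp_ms"]),
        "temperature_c": round(temp, 2),
    }


def average_per_window(events):
    """Average temperature per machine per tumbling window."""
    buckets = defaultdict(list)
    for event in events:
        # Align each event to the start of the window it falls into.
        window_start = event["timestamp_ms"] - event["timestamp_ms"] % WINDOW_MS
        buckets[(event["machine_id"], window_start)].append(event["temperature_c"])
    return {k: round(sum(v) / len(v), 2) for k, v in buckets.items()}


raw_feed = [
    {"machine_id": "press-01", "timestamp_ms": 5_000, "temperature": 70.0},
    {"machine_id": "press-01", "timestamp_ms": 20_000, "temperature": 162.5, "unit": "F"},
    {"machine_id": "press-01", "timestamp_ms": 30_000, "temperature": None},
    {"machine_id": "press-01", "timestamp_ms": 65_000, "temperature": 74.0},
]
cleaned = [e for e in map(normalize, raw_feed) if e is not None]
averages = average_per_window(cleaned)
```

A real stream processor adds what this sketch leaves out: fault-tolerant state, handling of late or out-of-order events, and continuous emission of results as windows close.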

Processed data can then be ingested into a time series database, such as InfluxDB or Prometheus, for time series processing, visualization, and interactive analysis. That includes real-time analytics use cases like:

  • Remote equipment monitoring: Monitoring telemetry data to detect and respond to faults or abnormalities in real time. This minimizes the impact of failures by triggering alerts and allowing rapid intervention.

  • Predictive maintenance: Analyzing real-time telemetry data to detect anomalies or patterns indicative of potential equipment failures. This enables proactive maintenance, reduces downtime, and extends the lifespan of machinery.

  • Energy optimization: Continuous monitoring of energy consumption based on real-time telemetry data to identify opportunities for energy savings and optimize resource utilization.
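As a minimal sketch of the anomaly detection behind remote monitoring and predictive maintenance, the example below flags vibration readings that deviate strongly from a rolling baseline using a z-score. This is one simple statistical technique, swapped in for illustration; the architecture above would typically use trained machine learning models. The VibrationMonitor class, window size, and threshold are hypothetical.

```python
from collections import deque
from statistics import mean, stdev


class VibrationMonitor:
    """Flags readings more than z_threshold standard deviations from a rolling mean."""

    def __init__(self, window_size=20, z_threshold=3.0, min_samples=5):
        self.window = deque(maxlen=window_size)
        self.z_threshold = z_threshold
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if the value looks anomalous against recent readings."""
        is_anomaly = False
        if len(self.window) >= self.min_samples:
            mu = mean(self.window)
            sigma = stdev(self.window)
            if sigma > 0 and abs(value - mu) / sigma > self.z_threshold:
                is_anomaly = True
        if not is_anomaly:
            self.window.append(value)   # keep outliers out of the baseline
        return is_anomaly


monitor = VibrationMonitor()
readings = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 1.8, 2.0, 9.5, 2.1]
flags = [monitor.observe(r) for r in readings]
```

In this run only the 9.5 reading is flagged. Excluding flagged values from the window keeps a single spike from inflating the baseline and masking the next one.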

At the same time, telemetry data feeds can be routed to destinations like data warehouses and data lakes for offline use cases, like regulatory reporting, ad-hoc exploration, and machine learning workloads. These use cases include but are not limited to:

  • Training ML models: Using historical telemetry data to train machine learning models for predictive maintenance or anomaly detection.

  • Root cause analysis: Investigating past incidents by analyzing telemetry data to determine the root causes of failure.

  • Historical performance analysis: Analyzing historical telemetry data to identify trends, patterns, and performance benchmarks.

  • Regulatory reporting: Using historical telemetry data to generate reports for compliance with industry regulations.

Appropriate sink connectors deployed in Kafka Connect can route the telemetry data ingested into Redpanda to these destinations. Redpanda Cloud provides built-in sink connectors to destinations like Amazon S3, GCS, Google BigQuery, Snowflake, and many more.

Real-time and offline analytics on telemetry data

Automation and control

Automation and remote control of machine operations increase the efficiency of a factory floor by eliminating operations that require manual human intervention.

Redpanda’s event-driven architecture supports the implementation of automation and control systems. IIoT devices can publish events to Redpanda topics, triggering automated responses and control actions in real time. Events published to a control topic in Redpanda can trigger business processes deployed in stateful workflow engines, such as Camunda, jBPM, and Activiti.

This enables seamless flow of data between IIoT devices and LoB applications to implement use cases, such as:

  • Remote control of machines: Machines deployed in hazardous work environments can be remotely controlled and monitored through connected IIoT devices. That way, humans can be entirely spared from potential harm.

  • Factory floor automation: Production processes can be scheduled, controlled, and monitored to reduce manual intervention. For example, when the raw material usage reaches a certain threshold, purchase orders can be automatically sent for inventory replenishment.
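The inventory replenishment example can be sketched as a small event handler that accumulates material usage and emits a purchase-order event once a threshold is crossed. The event shapes, the "purchase_order.requested" type, and the ReplenishmentTrigger class are hypothetical; in the architecture above, the resulting event would be published to a control topic and picked up by a workflow engine.

```python
from collections import defaultdict


class ReplenishmentTrigger:
    """Emits a purchase-order event when cumulative material usage crosses a threshold."""

    def __init__(self, threshold_kg, reorder_qty_kg):
        self.threshold_kg = threshold_kg
        self.reorder_qty_kg = reorder_qty_kg
        self.used_kg = defaultdict(float)   # running usage per material

    def on_usage(self, event):
        """Return a purchase-order event if the threshold is crossed, else None."""
        material = event["material"]
        self.used_kg[material] += event["used_kg"]
        if self.used_kg[material] < self.threshold_kg:
            return None
        self.used_kg[material] = 0.0        # reset so each crossing orders once
        return {
            "type": "purchase_order.requested",
            "material": material,
            "quantity_kg": self.reorder_qty_kg,
        }


trigger = ReplenishmentTrigger(threshold_kg=100.0, reorder_qty_kg=500.0)
usage_events = [
    {"material": "steel-coil", "used_kg": 40.0},
    {"material": "steel-coil", "used_kg": 35.0},
    {"material": "steel-coil", "used_kg": 30.0},
]
orders = [trigger.on_usage(e) for e in usage_events]
```

Only the third event produces an order, since it pushes cumulative usage past the threshold. Keeping this decision in an event handler rather than in the LoB application means the threshold logic can change without touching either the machines or the purchasing system.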

From reference architecture to robust industrial system

Manufacturers that have IIoT systems in place, or are planning to adopt IIoT in the future, can use this reference architecture as a blueprint to push innovation, adaptability, and continuous improvement in the industrial landscape. However, taking it from paper to practice depends on your streaming data setup and the resources available to run it.

Redpanda provides a powerful and scalable foundation for handling real-time telemetry data generated by IIoT devices—and is available as a fully-managed cloud service or self-hosted platform. This reliable foundation helps manufacturers achieve seamless connectivity, efficient data collection and analysis, and responsive automation and control of industrial processes.

To get your IIoT data moving, try Redpanda for free or grab the Community Edition from GitHub. You can also join the Redpanda Community on Slack to discuss your specific project with our team.

Originally published on The New Stack
