- Agrippina Mwangi (Lead Developer and Researcher) - LinkedIn
- León Navarro-Hilfiker (OT Security Engineer) - LinkedIn
- Executive Summary
- Pre-requisites
- Data Plane Design
- Temperature Module
- Control Plane Design
- Knowledge Plane Design
- Reach Us
- References
Flash events of benign traffic flows and switch thermal instability are key stochastic disruptions causing intermittent network service interruptions in software-defined Industrial Internet of Things (IIoT-Edge) networks for offshore wind power plants (WPPs).
These disruptions violate the service level agreements and quality of service requirements that are enforced to ensure high availability and high performance for the reliable transmission of critical, time-sensitive, and best-effort data traffic.
To mitigate them, this study implements a threshold-triggered Deep Q-Network (DQN) self-healing framework that detects and analyzes these disruptions, then repairs and adapts network behavior and resources in response.
On a Microsoft Azure (Ms-Azure) proof-of-concept testbed, the DQN self-healing framework was trained and tested in software-defined IIoT-Edge networks designed for a triple WPP cluster application scenario.
The testbed comprised Mininet-emulated super spine-leaf switch network topologies at the data plane, ONOS-based controller clusters at the control plane, and the threshold-triggered DQN self-healing agent in the knowledge plane.
Simulation results from this testbed demonstrated that this threshold-triggered DQN self-healing agent outperformed the baseline super spine-leaf switch network algorithms by 53.84% for specific test case scenarios representing varied network states.
Further, it was observed that the DQN self-healing agent maintained switch temperatures within the nominal operating range by activating select external rack fans.
These findings highlight the potential of learning algorithms in building the resilience and autonomy of such critical industrial operational technology networks.
- Create a Microsoft Azure (Ms-Azure) account and get a subscription.
- Create an Ms-Azure resource group and assign it subnets, a network security group, and Bastion.
- Create several Linux virtual machines in this Ms-Azure Resource Group with the following compute and storage specifications:
- Ubuntu Server 22.04 LTS (Gen2, x64), 2 vCPUs, 16 GiB RAM, 128-512 GB SSD/HDD
- Docker ver. 24.0.7
- Mininet ver. 2.3.0
- InfluxDB ver. 2.7.10
- Python ver. 3.12.3
See the Installation Guide
Alternatively, get a physical server and proceed to step 3.
-
On one of the VMs with Mininet installed, run the network topology for an offshore wind farm (a reduced model with 20 wind turbine generators (WTGs) communicating with one offshore substation (OSS)).
-
At the Mininet prompt, run xterm on selected Mininet hosts to start traffic generation using the following data sets:
- MQTT sensor data traffic
- IEC 61850 SV/GOOSE Docker-based data traffic
- Ordinary ping tests
-
To monitor network performance, use the iperf3 tool for active measurements of network latency, throughput, jitter, and packet loss (lost datagrams).
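The iperf3 measurements above can be collected programmatically. The following is a minimal sketch, assuming the client is run in UDP mode with JSON output (`iperf3 -c <server> -u -J`) and that the report is saved or captured as text. The helper name `summarize_iperf3_udp` and the sample values are hypothetical and are not measurements from the testbed.

```python
import json


def summarize_iperf3_udp(report_text):
    """Extract throughput, jitter, and datagram loss from an iperf3 UDP JSON report."""
    report = json.loads(report_text)
    total = report["end"]["sum"]  # end-of-test summary for a UDP run
    return {
        "throughput_mbps": total["bits_per_second"] / 1e6,
        "jitter_ms": total["jitter_ms"],
        "lost_percent": total["lost_percent"],
    }


# Hand-written sample for illustration only; not a result from the testbed.
sample = json.dumps({
    "end": {
        "sum": {
            "bits_per_second": 9.5e6,
            "jitter_ms": 0.042,
            "lost_packets": 3,
            "packets": 1000,
            "lost_percent": 0.3,
        }
    }
})

print(summarize_iperf3_udp(sample))
```

A summary like this could be written to InfluxDB or handed to the knowledge plane, depending on how the monitoring pipeline is set up.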
- A temperature module was designed to generate temperature profiles for the switches in the data plane network topology using a linear approach, because the Mininet emulator does not capture the temperature profiles of virtual OpenFlow switches.
- For this setup, the temperature module was designed as shown in this source file.
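To illustrate what a linear temperature model might look like, here is a minimal sketch. It is not the project's temperature module: it assumes, purely for illustration, that switch temperature rises linearly with link utilization between an idle and a full-load value. The function names and the default temperatures (`idle_temp_c`, `full_load_temp_c`) are hypothetical.

```python
def switch_temperature(utilization, idle_temp_c=35.0, full_load_temp_c=75.0):
    """Linearly interpolate a switch temperature from utilization in [0, 1].

    The idle and full-load temperatures are illustrative assumptions.
    """
    u = min(max(utilization, 0.0), 1.0)  # clamp out-of-range inputs
    return idle_temp_c + (full_load_temp_c - idle_temp_c) * u


def temperature_profile(utilizations, **kwargs):
    """Map a sequence of utilization samples to a temperature profile."""
    return [switch_temperature(u, **kwargs) for u in utilizations]


profile = temperature_profile([0.0, 0.25, 0.5, 1.0])
print(profile)
```

A profile generated this way could be attached to each emulated switch so that the knowledge plane has a temperature signal to act on.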
-
On one of the VMs, download the ONOS ver. 2.0.0 SDN controller.
-
Create a cluster using the "org.onosproject.cluster-ha" ONOS SDN controller feature.
-
Install the following ONOS features:
- org.onosproject.pipelines.basic
- org.onosproject.fwd
- org.onosproject.openflow
- org.onosproject.cpman
- org.onosproject.metrics
-
See more information on ONOS installation and design here.
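The features listed above can be activated from the ONOS CLI (`app activate <name>`) or through the controller's REST API. The sketch below builds, but does not send, one activation request per feature. The host, port `8181`, and the `onos`/`rocks` credentials are assumed defaults and should be replaced with the values of your deployment. The helper name `activation_request` is hypothetical.

```python
import base64
import urllib.request

ONOS_APPS = [
    "org.onosproject.pipelines.basic",
    "org.onosproject.fwd",
    "org.onosproject.openflow",
    "org.onosproject.cpman",
    "org.onosproject.metrics",
]


def activation_request(app_name, host="127.0.0.1", port=8181,
                       user="onos", password="rocks"):
    """Build (but do not send) the POST request that activates one ONOS app.

    Host, port, and credentials are assumptions; adjust them to your setup.
    """
    url = f"http://{host}:{port}/onos/v1/applications/{app_name}/active"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url, method="POST", headers={"Authorization": f"Basic {token}"}
    )


requests_to_send = [activation_request(name) for name in ONOS_APPS]
for req in requests_to_send:
    print(req.get_method(), req.full_url)
    # On a live controller, send with: urllib.request.urlopen(req)
```

Building the requests separately from sending them makes it easy to review the target URLs before touching a running controller.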
-
The knowledge plane interacts with the ONOS SDN controller subsystems using RESTful APIs exposed on the northbound interface.
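As an example of consuming the northbound interface, the sketch below aggregates dropped packets per device from a port-statistics payload such as the one returned by `GET /onos/v1/statistics/ports`. The field names (`statistics`, `device`, `ports`, `packetsRxDropped`, `packetsTxDropped`) are assumptions about the response layout and should be checked against your controller. The sample payload is hand-written and the function name is hypothetical.

```python
def dropped_packets_per_device(port_stats):
    """Sum RX and TX drops over all ports of each device in the payload."""
    totals = {}
    for entry in port_stats.get("statistics", []):
        drops = sum(
            port.get("packetsRxDropped", 0) + port.get("packetsTxDropped", 0)
            for port in entry.get("ports", [])
        )
        totals[entry["device"]] = drops
    return totals


# Hand-written payload for illustration only; not captured from the testbed.
sample_payload = {
    "statistics": [
        {"device": "of:0000000000000001", "ports": [
            {"port": 1, "packetsRxDropped": 2, "packetsTxDropped": 0},
            {"port": 2, "packetsRxDropped": 0, "packetsTxDropped": 5},
        ]},
        {"device": "of:0000000000000002", "ports": [
            {"port": 1, "packetsRxDropped": 0, "packetsTxDropped": 0},
        ]},
    ]
}

print(dropped_packets_per_device(sample_payload))
```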
The knowledge plane's graphical abstract is as shown below:
- On one of the VMs, download Anaconda and install the relevant TensorFlow and Keras dependencies in a new environment (not the base environment).
- The knowledge plane hosts four modules: Observe, Orient, Decide, and Act. These modules exchange network performance information derived from the ONOS SDN controller's topology manager, statistics manager, and flow rule manager, as shown in the self-healing framework.
- Detailed descriptions of each module and associated source files:
- Observe module (Description) (Source File)
- Orient module (Description) (Source File)
- Decide module (Description) (Source File)
- Act module (Description) (Source File)
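The interplay of the four modules can be sketched as a single threshold-triggered pass. This is an illustrative skeleton, not the project's source: the thresholds, the action names, and `placeholder_q` (a stand-in for a trained DQN's output) are all hypothetical. The epsilon-greedy rule shown is a common choice for DQN agents and is assumed here rather than taken from the framework.

```python
import random

# Hypothetical thresholds and action names, chosen only for illustration.
THRESHOLDS = {"latency_ms": 10.0, "temperature_c": 70.0}
ACTIONS = ["no_op", "reroute_flows", "activate_rack_fan"]


def orient(observation):
    """Return the names of metrics that exceed their thresholds."""
    return [name for name, limit in THRESHOLDS.items()
            if observation.get(name, 0.0) > limit]


def decide(q_values, epsilon=0.0, rng=random):
    """Epsilon-greedy choice of an action index from a list of Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def self_healing_step(observation, q_function, epsilon=0.0):
    """One Observe-Orient-Decide-Act pass; the agent runs only on a violation."""
    if not orient(observation):
        return "no_op"
    return ACTIONS[decide(q_function(observation), epsilon)]


def placeholder_q(observation):
    """Stand-in for a trained network: favors the fan when the switch is hot."""
    hot = observation["temperature_c"] > THRESHOLDS["temperature_c"]
    return [0.0, 0.2, 0.9] if hot else [0.0, 0.8, 0.1]


for obs in ({"latency_ms": 4.0, "temperature_c": 55.0},
            {"latency_ms": 4.0, "temperature_c": 82.0},
            {"latency_ms": 25.0, "temperature_c": 55.0}):
    print(obs, "->", self_healing_step(obs, placeholder_q))
```

Gating the agent behind the threshold check means the Q-function is only evaluated when a metric leaves its nominal range, which is the sense in which the loop is threshold-triggered.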
- If you need assistance using this tool, please log an issue here and we will respond within 24 hours.
- Also, feel free to contribute to discussion posts and suggest any points of improvement by logging an issue.
- Mwangi et al. (n.d.), "Implementing self-healing autonomous software-defined IIoT-Edge networks in Offshore Wind Power Plants," submitted to IEEE Transactions on Network and Service Management (November 2024).
@article{mwangi2025tnsm,
title="Implementing self-healing autonomous software-defined IIoT-Edge networks in Offshore Wind Power Plants",
journal="IEEE Transactions on Network and Service Management",
year="2025",
volume="",
issue="",
}
- For more of our studies, and to cite our work, see the Reference list.

