Researchers from the University of Bristol and the Defence Science and Technology Laboratory (Dstl) have developed a groundbreaking approach to autonomous cyber defence for maritime operational technology. Their work, published in the latest issue of *IEEE Transactions on Cybernetics*, introduces a novel simulation environment and explores the application of Multi-Agent Reinforcement Learning (MARL) to defend industrial control systems (ICS) in maritime settings.
The study, led by Alec Wilson and Ryan Menzies, addresses a critical gap in cybersecurity: the vulnerability of Operational Technology (OT) systems. Unlike traditional IT systems, OT infrastructure often relies on legacy systems and lacks modern security controls, making it a prime target for sophisticated cyber-attacks. The researchers developed IPMSRL, a simulation environment that models a generic Integrated Platform Management System (IPMS) as a representative maritime OT system. This environment allows autonomous cyber defence strategies to be tested and refined.
The team employed MARL to enable autonomous decision-making in cyber defence. Two approaches were compared: a shared critic implementation of Multi-Agent Proximal Policy Optimisation (MAPPO) and Independent Proximal Policy Optimisation (IPPO). MAPPO outperformed IPPO, reaching an optimal policy after 800,000 timesteps, whereas IPPO achieved a mean episode outcome of only 0.966 after one million timesteps. Hyperparameter tuning further improved training performance, demonstrating the importance of fine-tuning algorithms for real-world applications.
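The structural difference between the two algorithms can be sketched in a few lines of Python. This is an illustration only, not the researchers' implementation: the `LinearCritic` class, the agent names, and the observation vectors are all hypothetical, and the sketch assumes the shared critic receives the concatenation of every agent's observation, a common MAPPO design that the article does not confirm.

```python
class LinearCritic:
    """Toy linear value function V(x) = w . x, standing in for a neural network."""

    def __init__(self, n_inputs):
        self.weights = [0.0] * n_inputs

    def value(self, features):
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, target, lr=0.1):
        # One gradient step on the squared error between target and V(features).
        error = target - self.value(features)
        self.weights = [w + lr * error * x for w, x in zip(self.weights, features)]
        return error


def joint_observation(observations):
    """Concatenate per-agent observations in a fixed agent order (assumed design)."""
    joint = []
    for agent in sorted(observations):
        joint.extend(observations[agent])
    return joint


# Hypothetical local observations for two defender agents.
observations = {
    "defender_0": [1.0, 0.0],
    "defender_1": [0.0, 1.0, 1.0],
}

# IPPO: one critic per agent, each seeing only that agent's own observation.
ippo_critics = {agent: LinearCritic(len(obs)) for agent, obs in observations.items()}

# MAPPO with a shared critic: a single critic sees the joint observation.
shared_critic = LinearCritic(len(joint_observation(observations)))

# Both variants regress their value estimates towards the same episode return.
episode_return = 1.0
for agent, obs in observations.items():
    ippo_critics[agent].update(obs, episode_return)
shared_critic.update(joint_observation(observations), episode_return)
```

In the independent case each value estimate depends only on what one agent observes, whereas the shared critic can draw on what every agent observes. In both variants the policies themselves still act on local observations; only the critic used during training differs.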
One of the most compelling aspects of the research is its practical relevance. The team tested the MARL defenders under real-world constraints, such as reduced attack detection alert success rates. Even when alert success probabilities dropped to 0.75 or 0.9, the MARL defenders maintained high success rates, winning over 97.5% and 99.5% of episodes, respectively. This resilience underscores the potential of MARL to provide robust cyber defence in challenging environments.
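One way to picture a reduced alert success rate is as a per-event probability that a genuine compromise actually raises an alert. The sketch below is a minimal model under that assumption; the function names and node labels are hypothetical, and the article does not describe how IPMSRL implements alerts internally.

```python
import random


def raise_alerts(compromised_nodes, alert_success_prob, rng):
    """Return the compromised nodes whose detection alert actually fires.

    Each compromised node raises an alert independently with probability
    alert_success_prob (an assumed model of imperfect detection).
    """
    return [node for node in compromised_nodes if rng.random() < alert_success_prob]


def empirical_alert_rate(alert_success_prob, trials, seed=0):
    """Estimate the fraction of compromises that produce an alert."""
    rng = random.Random(seed)
    fired = sum(
        len(raise_alerts(["node"], alert_success_prob, rng)) for _ in range(trials)
    )
    return fired / trials


# The two reduced alert probabilities mentioned in the article.
rate_075 = empirical_alert_rate(0.75, trials=10_000)
rate_090 = empirical_alert_rate(0.9, trials=10_000)
print(f"p=0.75: {rate_075:.3f} of compromises alerted")
print(f"p=0.90: {rate_090:.3f} of compromises alerted")
```

Under this model a defender at 0.75 misses roughly one alert in four, so it has to act on incomplete information about which nodes are compromised.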
The implications for the defence and security sector are significant. As cyber threats continue to evolve, traditional IT-centric solutions often fall short in protecting OT systems. The researchers’ work offers a promising alternative, demonstrating that autonomous cyber defence can effectively safeguard critical maritime infrastructure. By leveraging MARL, defence agencies and private sector organisations can enhance their cyber resilience and better defend against increasingly sophisticated attacks.
The research also highlights the importance of collaboration between academia and government. The University of Bristol and Dstl’s partnership has yielded innovative solutions that address real-world challenges. As cyber threats grow more complex, such collaborations will be crucial in developing the next generation of defence technologies. The study not only advances the field of autonomous cyber defence but also sets a foundation for future research in MARL applications across various operational domains. The original research is available on arXiv.

