In underwater acoustic signal processing, accurately detecting and classifying sounds is crucial for both environmental monitoring and defence applications, but the complexity of ship-radiated and environmental noise makes the task difficult. Recent advances in machine learning have shown promise in improving classification accuracy, yet limited dataset availability and a lack of standardised experimentation continue to hinder progress. A new study introduces GSE ResNeXt, a deep learning architecture that integrates learnable Gabor convolutional layers with a ResNeXt backbone enhanced by squeeze-and-excitation attention mechanisms. The approach aims to address these limitations and improve the robustness and generalisation of underwater acoustic classification systems.
The researchers behind this study, Lucas Cesar Ferreira Domingos, Russell Brinkworth, Paulo Eduardo Santos, and Karl Sammut, highlight the importance of effective signal processing in underwater environments. The GSE ResNeXt model leverages Gabor filters, which act as two-dimensional adaptive band-pass filters, to extend the feature channel representation. By combining these filters with channel attention mechanisms, the model enhances its ability to extract discriminative features, leading to improved training stability and convergence. This integration allows the model to better handle the complexities of underwater acoustic signals, which are often characterised by high levels of noise and variability.
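To make the two building blocks concrete, the following is a minimal sketch of a Gabor filter bank followed by squeeze-and-excitation channel reweighting, written in plain Python. It is not the authors' implementation: in the paper the Gabor parameters are learned during training, whereas here they are fixed, and the kernel size, `sigma`, `wavelength`, toy input, and excitation weights are all illustrative assumptions.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=1.0, psi=0.0):
    """Real part of a 2D Gabor filter as a size x size list of lists (size odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier oscillates along direction theta.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope localises the filter; the cosine sets the pass band.
            envelope = math.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * x_rot / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def correlate_valid(image, kernel):
    """Valid-mode 2D cross-correlation of image with a square kernel."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Rescale each channel by a gate in (0, 1) computed from the channel means."""
    # Squeeze: one global average per channel.
    squeezed = [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]
    # Excite: bottleneck layer with ReLU, then expansion layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    scaled = [[[g * v for v in row] for row in fm] for fm, g in zip(feature_maps, gates)]
    return scaled, gates


# Toy "spectrogram": a sinusoid along the horizontal axis with a period of 4 bins.
spectrogram = [[math.sin(2.0 * math.pi * c / 4.0) for c in range(16)] for _ in range(12)]

# Four orientations; sigma and wavelength are illustrative values.
thetas = [i * math.pi / 4.0 for i in range(4)]
bank = [gabor_kernel(7, sigma=2.0, theta=t, wavelength=4.0) for t in thetas]
feature_maps = [correlate_valid(spectrogram, k) for k in bank]

# Hypothetical, untrained excitation weights (4 channels -> 2 -> 4).
w_reduce = [[0.5, -0.25, 0.1, 0.3], [-0.2, 0.4, 0.6, -0.1]]
w_expand = [[0.7, -0.3], [0.2, 0.5], [-0.6, 0.1], [0.3, 0.3]]
scaled, gates = squeeze_excite(feature_maps, w_reduce, w_expand)
```

Each kernel in `bank` yields one feature channel, which is how a Gabor layer extends the channel representation, and `gates` holds one weight per channel that the attention step uses to emphasise or suppress it. A deep learning framework would replace the explicit loops and make `sigma`, `theta`, and `wavelength` trainable parameters.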
The GSE ResNeXt model was evaluated under three training-test split strategies, each reflecting an increasingly complex classification task. The strategies were designed to address data leakage, temporal separation, and taxonomy. Across them, GSE ResNeXt consistently outperformed baseline models such as Xception, ResNet, and MobileNetV2 in classification performance. Notably, adding Gabor convolutions to the initial layers of the model reduced training time by up to 62%.
A key finding was that temporal separation between the training and testing subsets affected model performance more than the volume of training data did. This underscores the importance of careful experimental design when evaluating the robustness and generalisation of underwater acoustic classification models: a split that ignores when recordings were made can overstate how well a model will perform on recordings made later. By accounting for these factors, researchers can develop more reliable and effective systems for real-world applications.
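A temporally separated split can be sketched as follows. This is an illustration only, not the study's split protocol: the `Segment` fields, the recording identifiers, the toy data, and the cutoff date are all hypothetical. The idea is that every training segment predates every test segment and that no recording contributes segments to both sides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Segment:
    recording_id: str      # hypothetical identifier of the source recording
    recorded_at: datetime  # start time of the source recording
    label: str             # vessel or noise class


def temporal_split(segments, cutoff):
    """Put segments recorded before cutoff in train and the rest in test."""
    train = [s for s in segments if s.recorded_at < cutoff]
    test = [s for s in segments if s.recorded_at >= cutoff]
    # Guard against leakage: a recording must not appear on both sides.
    shared = {s.recording_id for s in train} & {s.recording_id for s in test}
    if shared:
        raise ValueError(f"recordings on both sides of the split: {sorted(shared)}")
    return train, test


# Toy data: two segments cut from rec-01, one from each other recording.
segments = [
    Segment("rec-01", datetime(2020, 3, 1), "cargo"),
    Segment("rec-01", datetime(2020, 3, 1), "cargo"),
    Segment("rec-02", datetime(2020, 6, 15), "tug"),
    Segment("rec-03", datetime(2021, 1, 10), "cargo"),
    Segment("rec-04", datetime(2021, 4, 2), "tug"),
]
train, test = temporal_split(segments, cutoff=datetime(2021, 1, 1))
```

Because segments inherit the start time of their source recording, splitting on that timestamp keeps each recording whole. A random split over segments would instead scatter near-identical neighbours from one recording across both subsets, which is the kind of leakage the split strategies are meant to rule out.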
The study also emphasises the need for future developments to focus on mitigating the environmental effects on input signals. Enhancing the model’s ability to handle varying environmental conditions will be crucial for improving its reliability and generalisation in practical scenarios. The researchers suggest that continued advancements in signal processing techniques and machine learning architectures will play a pivotal role in achieving these goals.
In conclusion, the GSE ResNeXt model represents a significant step forward in underwater acoustic classification. By integrating learnable Gabor convolutional layers with attention mechanisms, the model outperformed the baseline models it was compared against while training more efficiently. The study's findings offer valuable insight into the challenges and opportunities in underwater acoustic signal processing and point the way for future work in this area. As the field continues to evolve, robust and generalisable models will be essential for meeting the demands of environmental monitoring and defence applications.

