
Deep-sea polymetallic nodule image recognition method based on an improved Mask R-CNN model
WENG Zebang, LI Xiaohu, LI Jie, LI Zhenggang, WANG Hao, ZHU Zhimin, MENG Xingwei, LI Huaiming
Journal of Marine Sciences, 2025, Vol. 43, Issue 3: 32-39.
Optical surveys and assessments of deep-sea polymetallic nodules are hampered by low image contrast, small targets, and ambiguous nodule boundaries. This study proposes an improved Mask R-CNN model that incorporates dynamic sparse convolution (DSConv) and a simple parameter-free attention module (SimAM) for nodule image segmentation. SimAM effectively suppresses interference from the sediment background, while DSConv alleviates boundary blurring. The combined model achieves an accuracy of 91.5%, a precision of 78.0%, a recall of 75.1%, and an IoU of 69.4%. When the improved and original models were applied to actual survey lines, the proportion of seabed nodule coverage-rate estimates with an error below 5% rose from 57% for the original model to 77% for the improved model. The approach offers a reliable technical solution for calculating deep-sea polymetallic nodule coverage rate, and its modular design can be extended to other target recognition and image segmentation tasks.
polymetallic nodules / image segmentation / Mask R-CNN / coverage rate / SimAM / DSConv
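The metrics quoted in the abstract (accuracy, precision, recall, IoU) and the coverage rate can all be derived from a binary nodule mask. The following is a minimal sketch in plain Python, not the authors' evaluation code. It assumes masks are nested lists of 0/1 values (1 = nodule pixel), that coverage rate is the fraction of nodule pixels in an image, and that the "error below 5%" criterion means an absolute difference in coverage fraction. The abstract does not specify any of these points, and all function names are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two same-shaped binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Pixel-level accuracy, precision, recall, and IoU for the nodule class."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "iou": tp / (tp + fp + fn) if tp + fp + fn else 0.0,
    }


def coverage_rate(mask):
    """Fraction of pixels labelled as nodule (assumed definition)."""
    total = sum(len(row) for row in mask)
    nodule = sum(1 for row in mask for v in row if v)
    return nodule / total


def share_within_tolerance(pred_cov, true_cov, tol=0.05):
    """Share of images whose absolute coverage error is below tol (assumed criterion)."""
    hits = sum(1 for p, t in zip(pred_cov, true_cov) if abs(p - t) < tol)
    return hits / len(pred_cov)


# Toy 4x4 example (made-up data, for illustration only).
truth = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
pred = [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
metrics = segmentation_metrics(pred, truth)
```

Comparing `coverage_rate(pred)` with `coverage_rate(truth)` for each survey image, and passing both lists to `share_within_tolerance`, gives a statistic of the same kind as the 57% and 77% figures reported above.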