Image segmentation method for deep-sea polymetallic nodules based on an improved Mask R-CNN model
WENG Zebang, LI Xiaohu, LI Jie, LI Zhenggang, WANG Hao, ZHU Zhimin, MENG Xingwei, LI Huaiming
Journal of Marine Sciences (海洋学研究), 2025, Vol. 43, Issue 3: 32-39.
Deep-sea polymetallic nodule image recognition method based on an improved Mask R-CNN model
Segmenting polymetallic nodules in deep-sea optical images is hampered by low image contrast, small targets, and blurred boundaries. This study builds an improved Mask R-CNN (mask region-based convolutional neural network) model that incorporates dynamic sparse convolution (DSConv) and the simple parameter-free attention module (SimAM) to recognize and segment polymetallic nodules in deep-sea images. SimAM effectively suppresses interference from the sediment background, while DSConv alleviates the blurring of nodule boundaries. The model with both modules achieves a segmentation accuracy of 91.5%, a precision of 78.0%, a recall of 75.1%, and an intersection over union (IoU) of 69.4%. When the improved and original models were applied to actual survey lines, the proportion of seabed nodule coverage estimates with an error below 5% rose from 57% for the original model to 77% for the improved model. The approach offers a reliable technical solution for calculating deep-sea polymetallic nodule coverage, and its modular design can be extended to other object recognition and image segmentation tasks.
polymetallic nodules / image segmentation / Mask R-CNN / coverage rate / attention mechanism (SimAM) / dynamic sparse convolution (DSConv)
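The abstract reports accuracy, precision, recall, IoU, and the share of coverage estimates with an error below 5%, but does not spell out how they are computed. The sketch below shows one common pixel-level reading of these quantities for binary nodule masks. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, coverage is assumed to be the fraction of nodule pixels in a frame, and "error" is assumed to be the absolute difference between predicted and reference coverage.

```python
from typing import Dict, Sequence

# Assumption: a mask is a 2-D grid of 0/1 values, 1 = nodule pixel, 0 = sediment.
Mask = Sequence[Sequence[int]]


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Pixel-level accuracy, precision, recall and IoU for two binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # nodule predicted, nodule present
            elif p:
                fp += 1          # nodule predicted, sediment present
            elif t:
                fn += 1          # nodule missed
            else:
                tn += 1          # sediment correctly left unlabelled

    def ratio(num: int, den: int) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


def coverage_rate(mask: Mask) -> float:
    """Fraction of pixels labelled as nodule (assumed definition of coverage)."""
    total = sum(len(row) for row in mask)
    nodule = sum(1 for row in mask for value in row if value)
    return nodule / total if total else 0.0


def share_within_tolerance(predicted: Sequence[float],
                           reference: Sequence[float],
                           tol: float = 0.05) -> float:
    """Share of frames whose absolute coverage error is below `tol`."""
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) < tol)
    return hits / len(predicted) if predicted else 0.0
```

Under these assumptions, a survey line would be scored by calling `coverage_rate` on each predicted and reference mask and passing the two lists to `share_within_tolerance`. If the paper defines error as a relative rather than absolute difference, only the comparison inside `share_within_tolerance` would change.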