Introduction

At present, intelligent coal gangue sorting robots have achieved sorting of different types of coal gangue to a certain extent, but some challenges remain in practical applications1,2,3,4. During sorting, the coal gangue sorting robot recognizes the target coal gangue and obtains its pose information through the recognition system5,6,7. The robot control system combines the pose information, belt speed, and time to calculate the real-time position of the target coal gangue, thereby achieving robotic sorting. However, the actual working environment is complex. As the gangue moves from the recognition area to the grasping area, factors such as belt slippage, deviation, and speed fluctuations change the position and posture of the gangue. Existing methods cannot precisely locate the coal gangue, resulting in empty grasps, missed grasps, and even grasping failures of the manipulator, all of which seriously affect the sorting efficiency of the robot. To solve this problem, the target coal gangue must be accurately located before the robot performs the sorting action. Therefore, by using visual methods to match the target coal gangue recognition image provided by the recognition system with the sorting image collected during sorting, precise positioning of the target coal gangue can be achieved. This can further improve the sorting accuracy and efficiency of the robot.

The research contributions of our work can be summarized as follows:

  1. A coal gangue image matching method combining the SuperPoint and SuperGlue networks is proposed. Applied to the intelligent robotic coal gangue sorting scenario, it provides rich feature information and strong feature matching capability.

  2. The improved MSRCR image enhancement algorithm is used to improve the stability and applicability of SuperPoint feature point detection under complex conditions.

  3. SuperGlue significantly enhances the feature representation and matching capabilities of the network. The matched features are further refined and filtered with PROSAC, thereby ensuring the precision of image matching even under complex conditions.

Next, the related work on image matching in robot sorting scenarios is introduced in Section “Related work”. Section “Proposed method” presents the proposed framework. Section “Experiment” introduces the performance indicators and comparative analysis of the image matching methods, and Section “Discussion and conclusion” presents the conclusions.

Related work

This section introduces related work on the application of image matching in sorting scenarios.

With the widespread application of machine vision in various industrial scenarios, scholars have made progress in research on image matching methods for sorting systems. Yang et al. proposed an improved edge feature template matching algorithm for the rapid sorting of multi-objective workpieces, which achieves high real-time performance when combined with a pyramid layering strategy8. Yao et al. proposed a fast part matching method based on Hu invariant moment features and improved Harris corner points to address the long matching time and low precision of part image matching; this method improves matching speed, but the Harris corner detection operator performs well only at a fixed image scale9. Wang et al. proposed an improved ORB image matching algorithm for the fast recognition and sorting of workpieces with complex surface textures10. It replaces the BRIEF descriptor with the SURF descriptor, enhancing robustness to lighting and image scale changes and improving the matching accuracy of workpiece images under scale variation.

The above traditional image matching methods based on SIFT11,12, SURF13,14,15, BRIEF16,17, and ORB18,19,20 all rely on manually designed descriptors. Although they are effective to a degree, they still have shortcomings in the on-site environment of an intelligent coal gangue sorting robot. For example, for coal gangue images under complex lighting and scale changes, these methods cannot effectively extract stable feature points or feature vectors. Moreover, template-based matching is relatively inefficient, whereas deep learning can extract high-level semantic information from images.

To address the above problems, DeTone et al. proposed the SuperPoint network in 2018, an end-to-end learning approach that takes images as input and outputs feature points and descriptors21. SuperPoint is not only accurate but also has a simple structure that can extract key points in real time. However, in the feature-matching stage, traditional methods often rely on minimizing or maximizing specific hand-crafted metrics, such as squared differences and correlations. The typical Fast Library for Approximate Nearest Neighbors (FLANN) matching method has poor robustness when matching image pairs lacking texture. SuperGlue, a graph neural network feature matching method, provided a new approach to the matching problem22,23,24,25. The SuperGlue network is a mid-level matching network that must be combined with a feature extraction network to fully leverage its performance.

The SuperPoint network extracts highly robust image feature points, while the SuperGlue network is robust to lighting changes, blurring, and texture-poor scenes. We combine these two networks to achieve a significant improvement in the matching accuracy of complex images, and apply this combination to robotic coal gangue sorting in complex scenarios.

Proposed method

Motivation

We conducted research on coal gangue image matching under complex conditions by combining the SuperPoint and SuperGlue networks. During this research, based on the actual working conditions of coal gangue sorting, we found that further optimization can be carried out in the following two aspects.

  1. Due to differences in object distance, scale, and rotation angle of the sorting images acquired in different regions, it is difficult to detect enough key points for matching. To improve the ability to detect feature points, the sorting images are enhanced using the improved MSRCR algorithm.

  2. The sorting process of coal gangue is complex. Coal slag, water stains, and other factors on the belt can seriously interfere with the matching results, reducing the number of matching points and producing incorrect matching pairs. Therefore, the matching results are optimized with PROSAC to improve the matching accuracy.

Therefore, to solve the problem of image matching of coal gangue under complex conditions, we propose a more robust image matching network called IMSSP-Net.

Model construction

The coal gangue sorting robot obtains the recognition result of the target coal gangue from the recognition system as the recognition image. After the coal gangue enters the sorting area, a camera captures the current image as the sorting image. Figure 1 illustrates the overall structure of our network. Firstly, the improved MSRCR algorithm enhances the two images separately. Then, the SuperPoint algorithm extracts feature points and their descriptors from the recognition image and the sorting image. These are then used as input to the SuperGlue algorithm to perform feature matching between the images. Finally, the obtained matching points are filtered using PROSAC to obtain the best matching point pairs. The precise location of the target coal gangue is obtained by framing the matching results with their minimum bounding rectangle.

Fig. 1
figure 1

Overview of the IMSSP-Net method. Images 1 and 2 in the figure represent the original recognition image and the original sorting image, respectively. Iretinex is the image enhanced by MSRCR. Equation (1) normalizes Iretinex and outputs Iout1. Equation (2) implements the power-law transformation of Iretinex and outputs Iretinexγ. Normalizing Iretinexγ using Eq. (1) gives Iout2. Equation (4) performs the brightness adjustment.

Improved MSRCR algorithm and feature extraction

To improve the network's ability to detect feature points in coal gangue images, we use the improved MSRCR for image enhancement and then apply the SuperPoint algorithm to detect the features of the image.

Improved MSRCR algorithm

The MSRCR is a commonly used image enhancement technique that enhances the details and contrast of images while preserving edge information. Its principle involves decomposing the image into different scales, followed by local contrast adjustment and enhancement; finally, the enhanced image is reconstructed. The normalization formula is employed for intensity mapping, normalizing the Retinex components to scale the intensity range of the image. This achieves overall adjustment of image contrast and brightness. The relationship is expressed as follows:

$$I_{\text{out1}} = \frac{I_{\text{retinex}} - \min (I_{\text{retinex}})}{\max (I_{\text{retinex}}) - \min (I_{\text{retinex}})} \times (H_{\text{clip}} - L_{\text{clip}}) + L_{\text{clip}}$$
(1)

In the equation, Iout1 and Iretinex represent the pixel values of the output image and of the image enhanced by the MSRCR algorithm, respectively. max(Iretinex) and min(Iretinex) represent the maximum and minimum pixel values of the MSRCR-enhanced image, and Lclip and Hclip respectively indicate the lower and upper limits of the grayscale levels.
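A minimal NumPy sketch of the normalization in Eq. (1) is given below; it is an illustration only, the variable names are ours, and the MSRCR output is assumed to be a float array.

```python
import numpy as np

def normalize_retinex(i_retinex: np.ndarray, l_clip: float = 0.0,
                      h_clip: float = 255.0) -> np.ndarray:
    """Eq. (1): linearly map the MSRCR output into the [L_clip, H_clip] range."""
    i_min, i_max = i_retinex.min(), i_retinex.max()
    return (i_retinex - i_min) / (i_max - i_min) * (h_clip - l_clip) + l_clip
```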

Power-law transformation is a non-linear image enhancement method that alters the pixel value distribution in order to highlight details and features in the image26. It is based on a simple mathematical principle: original pixel values are mapped through an exponential function to a new pixel value range, thereby changing the shape of the original pixel value distribution. The relationship is expressed as follows:

$$I_{\text{retinex}}^{\gamma} = \left( I_{\text{retinex}} \right)^{\gamma}$$
(2)

In the equation, Iretinex represents the image enhanced by MSRCR, Iretinexγ represents the result of its power-law transformation, and γ is the exponent of the power-law transformation. When γ > 1, the pixel values of the output image become more concentrated, emphasizing details. When 0 < γ < 1, the pixel values become more dispersed, resulting in smoother details. When γ = 1, the power-law transformation has no effect.

Integrating the power-law transformation of Eq. (2) into the normalization formula of MSRCR in Eq. (1), the improved MSRCR algorithm is as follows:

$$I_{\text{out2}} = \frac{I_{\text{retinex}}^{\gamma} - \min (I_{\text{retinex}}^{\gamma})}{\max (I_{\text{retinex}}^{\gamma}) - \min (I_{\text{retinex}}^{\gamma})} \times (H_{\text{clip}} - L_{\text{clip}}) + L_{\text{clip}}$$
(3)

Then Eq. (4) is used to adjust the brightness level range of the output image.

$$I_{{{\text{out}}}} = \left( {I_{{{\text{out2}}}} } \right)^{\beta }, \beta = \frac{{I_{{{\text{out1}}}} }}{{max(I_{{{\text{out1}}}} )}}$$
(4)

In the equation, β represents the enhancement factor, which is used to adjust the brightness level range of the output image; its expression is given in Eq. (4).

The IMSRCR and MSRCR methods were used to enhance the recognition and sorting images respectively, and the results are shown in Fig. 2. Hclip and Lclip in Eq. (1) are written in general form; in the experiment, they are set to 255 and 0, respectively.
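A sketch of the full IMSRCR post-processing chain (Eqs. (1)–(4)) is shown below, reusing normalize_retinex from the earlier sketch. The MSRCR output itself is assumed to be computed elsewhere, and γ = 1.5 is only an illustrative value, not the setting used in this paper.

```python
import numpy as np

def imsrcr_postprocess(i_retinex: np.ndarray, gamma: float = 1.5,
                       l_clip: float = 0.0, h_clip: float = 255.0) -> np.ndarray:
    """Improved MSRCR post-processing sketch following Eqs. (1)-(4)."""
    i_retinex = np.clip(i_retinex, 0, None)                 # power law assumes non-negative values
    i_out1 = normalize_retinex(i_retinex, l_clip, h_clip)   # Eq. (1)
    i_gamma = np.power(i_retinex, gamma)                    # Eq. (2): power-law transformation
    i_out2 = normalize_retinex(i_gamma, l_clip, h_clip)     # Eq. (3): normalize the transformed image
    beta = i_out1 / i_out1.max()                            # Eq. (4): per-pixel enhancement factor
    return np.power(i_out2, beta)                           # Eq. (4): brightness adjustment
```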

Fig. 2
figure 2

The visualization of image enhancement results. (a) The original recognition image. (b) The MSRCR enhanced recognition image. (c) The IMSRCR enhanced recognition image. (d) The original sorting image. (e) The MSRCR enhanced sorting image. (f) The IMSRCR enhanced sorting image.

To evaluate the quality of the enhanced image, this article uses two widely used image quality evaluation metrics, the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM). The results are shown in Table 1.

Table 1 The comparison of image enhancement.

It can be seen that compared with the MSRCR method, the improved method has significantly improved PSNR and SSIM. The IMSRCR image has a more balanced global grayscale, clearer contrast, and richer details.
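Both metrics can be computed with scikit-image, for example as below; this is a sketch, the file names are hypothetical, and the images are assumed to be 8-bit and of equal size.

```python
import cv2
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Hypothetical file names for the original and IMSRCR-enhanced recognition image.
original = cv2.imread("recognition.png", cv2.IMREAD_GRAYSCALE)
enhanced = cv2.imread("recognition_imsrcr.png", cv2.IMREAD_GRAYSCALE)

psnr = peak_signal_noise_ratio(original, enhanced, data_range=255)
ssim = structural_similarity(original, enhanced, data_range=255)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.4f}")
```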

Feature extraction

The SuperPoint algorithm differs from traditional feature detection algorithms in that it simultaneously detects image key points and extracts their descriptors27. It uses a self-supervised convolutional neural network implemented with an encoder–decoder architecture. In the SuperPoint algorithm, the input image first passes through a VGG-style encoder to obtain a feature map. The feature map is then decoded by two branches, one for key point detection and the other for descriptor extraction. The convolutional layers of the SuperPoint network are listed in Table 2.

Table 2 The convolutional layer in SuperPoint network.

The key point detection branch adopts a sliding window approach to calculate, pixel by pixel on the feature map, the probability of being a key point. Each window is typically 8 × 8 pixels; every pixel is evaluated and a key point probability map is generated. This probability map is a 3D tensor of size H/8 × W/8 × 65, where H and W are the height and width of the image, and each cell corresponds to the key point probabilities of an 8 × 8 pixel area (the 65th channel indicates "no key point"). The final key point positions are determined through thresholding and connectivity analysis. Its structure is shown in Fig. 3.

Fig. 3
figure 3

The key point decoder.
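A sketch of how the 65-channel detection output can be turned into a full-resolution key point probability map is given below (channel-first tensors are used here for convenience; the 65th channel is treated as the "no key point" bin, following the published SuperPoint design).

```python
import torch
import torch.nn.functional as F

def keypoint_heatmap(semi: torch.Tensor) -> torch.Tensor:
    """semi: (65, H/8, W/8) raw output of the key point detection branch.
    Returns an (H, W) key point probability map to be thresholded."""
    probs = F.softmax(semi, dim=0)[:-1]                    # drop the "no key point" channel
    hc, wc = probs.shape[1], probs.shape[2]
    probs = probs.permute(1, 2, 0).reshape(hc, wc, 8, 8)   # each cell covers an 8x8 pixel block
    return probs.permute(0, 2, 1, 3).reshape(hc * 8, wc * 8)


heatmap = keypoint_heatmap(torch.rand(65, 45, 60))         # e.g. a 360 x 480 input image
print(heatmap.shape)                                       # torch.Size([360, 480])
```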

The descriptor extraction branch is responsible for extracting a descriptor for each key point from the feature map. This branch converts the feature map into a three-dimensional tensor of size H/8 × W/8 × D, where D is the dimension of the descriptor, extracting local features through a series of convolution and pooling operations. To obtain the final descriptors, the coarse descriptor map is upsampled by interpolation and normalized, transforming it into an H × W × D tensor that contains the descriptor of each pixel. Its structure is shown in Fig. 4.

Fig. 4
figure 4

The descriptor decoder.
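A corresponding sketch of the descriptor branch post-processing: the coarse H/8 × W/8 × D map is upsampled (bicubic interpolation is used here) and L2-normalized to give one unit-length descriptor per pixel.

```python
import torch
import torch.nn.functional as F

def dense_descriptors(coarse_desc: torch.Tensor, img_h: int, img_w: int) -> torch.Tensor:
    """coarse_desc: (D, H/8, W/8) output of the descriptor branch.
    Returns a (D, H, W) map of L2-normalized per-pixel descriptors."""
    desc = F.interpolate(coarse_desc.unsqueeze(0), size=(img_h, img_w),
                         mode="bicubic", align_corners=False)
    desc = F.normalize(desc, p=2, dim=1)                   # unit length along the descriptor axis
    return desc.squeeze(0)


desc = dense_descriptors(torch.rand(256, 45, 60), 360, 480)
print(desc.shape)                                          # torch.Size([256, 360, 480])
```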

The SuperPoint network was used to extract feature points from the original, MSRCR-enhanced, and improved-MSRCR-enhanced images, as shown in Fig. 5. The colored dots in Fig. 5 represent the detected feature points. By comparison, the improved MSRCR method further enriches the color information of the image and yields significantly more feature points than the other two methods, which benefits the subsequent matching.

Fig. 5
figure 5

The Visualization of feature point detection. (a) Feature detection in original recognition image. (b) Feature detection in the MSRCR enhanced recognition image. (c) Feature detection in the IMSRCR enhanced recognition image. (d) Feature detection in original sorting image. (e) Feature detection in the MSRCR enhanced sorting image. (f) Feature detection in the IMSRCR enhanced sorting image.

Feature matching and optimization

After completing the feature point extraction of coal gangue recognition images and sorting images, we need to match the feature points and optimize the matching results.

SuperGlue feature matching

After using SuperPoint to complete the feature detection of coal gangue recognition images and sorting images, it is necessary to match the feature points of the two images through feature matching methods to obtain the target coal gangue position. This article adopts the SuperGlue algorithm, which comprehensively considers the position information of feature points and visual appearance information to improve the accuracy and robustness of matching.

SuperGlue is a feature matching algorithm based on a Graph Neural Network (GNN), which constructs a graph over the feature points and uses graph convolution for feature aggregation and matching. Figure 6 illustrates its network structure, which mainly consists of two parts. The first half combines self-attention and cross-attention mechanisms to extract more discriminative matching descriptors; it significantly improves the accuracy and robustness of feature matching by repeatedly strengthening the feature vector f (Eq. (7)). The optimal matching layer in the second half uses inner products to obtain the score matrix and iteratively optimizes it with the Sinkhorn algorithm to obtain the optimal assignment.

Fig. 6
figure 6

The architecture of SuperGlue.

The first part of the attention GNN is the key point encoder, which raises the dimensionality of the feature points. Each key point combines its position and score information, and a multi-layer perceptron (MLP) embeds the position information into a high-dimensional vector, as shown in Eq. (5). Specifically, the input is the key point p and its corresponding descriptor d extracted by the feature extraction network, where p contains the coordinates and confidence of the feature point. The key points are converted into 256-dimensional data through five fully connected layers, with channel numbers of 3, 32, 64, 128, and 256, respectively, as shown in Fig. 7.

$$x_{i}^{(0)} = d_{i} + MLP(p_{i} )$$
(5)

Among them, di and pi are the i-th descriptor and feature point, respectively.

Fig. 7
figure 7

The Key point encoder.
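Based on Eq. (5) and the layer sizes in Fig. 7, a minimal PyTorch sketch of the key point encoder could look like the following; the exact layer configuration of the released SuperGlue code may differ.

```python
import torch
import torch.nn as nn

class KeypointEncoder(nn.Module):
    """Sketch of the key point encoder in Eq. (5): x_i = d_i + MLP(p_i)."""

    def __init__(self) -> None:
        super().__init__()
        dims = [3, 32, 64, 128, 256]                    # channel sizes from Fig. 7
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.ReLU(inplace=True)]
        self.mlp = nn.Sequential(*layers[:-1])          # no activation after the last layer

    def forward(self, kpts_xy: torch.Tensor, scores: torch.Tensor,
                desc: torch.Tensor) -> torch.Tensor:
        # kpts_xy: (N, 2) coordinates, scores: (N,) confidences, desc: (N, 256) descriptors
        p = torch.cat([kpts_xy, scores.unsqueeze(-1)], dim=-1)  # (N, 3)
        return desc + self.mlp(p)                                # Eq. (5)


# Example: 100 key points with 256-dimensional SuperPoint descriptors.
enc = KeypointEncoder()
x0 = enc(torch.rand(100, 2), torch.rand(100), torch.rand(100, 256))
print(x0.shape)  # torch.Size([100, 256])
```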

The second part of the attention GNN is the graph attention mechanism, which involves two types of undirected edges: intra-image edges εself and cross-image edges εcross. The former connect feature points within the same image, while the latter connect feature points i across the two images to be matched (denoted as A and B). Here (ℓ)xiA denotes the intermediate representation of the i-th element on image A at layer ℓ. The message mε→i aggregates information from all feature points {j : (i, j) ∈ ε}, where ε ∈ {εself, εcross}. From this, each feature i in image A is updated with residual information:

$${}^{(\ell + 1)}x_{i}^{A} = {}^{(\ell)}x_{i}^{A} + {\text{MLP}}\left( \left[ {}^{(\ell)}x_{i}^{A} \, \middle\| \, m_{\varepsilon \to i} \right] \right)$$
(6)

Among them, [· ‖ ·] denotes the concatenation operation and MLP denotes a multi-layer perceptron. When the layer index ℓ is odd, ε = εcross; when ℓ is even, ε = εself. By alternating between the self- and cross-attention messages mε, the network simulates the process of a human repeatedly looking back and forth between two images until differences or similarities are discovered. The final feature description vector used for matching is given in Eq. (7).

$$f_{i}^{A} = W \cdot {}^{(L)}x_{i}^{A} + b$$
(7)

In the above equation, W represents the weight and b the bias.
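The residual update of Eq. (6) followed by the final projection of Eq. (7) can be sketched as below; the attention message m is assumed to come from a self- or cross-attention layer (alternating with the layer index), and the attention computation itself is omitted.

```python
import torch
import torch.nn as nn

class ResidualMessageUpdate(nn.Module):
    """Sketch of Eq. (6): x_i <- x_i + MLP([x_i || m_i])."""

    def __init__(self, dim: int = 256) -> None:
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, 2 * dim), nn.ReLU(inplace=True),
            nn.Linear(2 * dim, dim),
        )

    def forward(self, x: torch.Tensor, message: torch.Tensor) -> torch.Tensor:
        return x + self.mlp(torch.cat([x, message], dim=-1))   # Eq. (6)


# Final projection of Eq. (7): f_i = W x_i^(L) + b.
final_proj = nn.Linear(256, 256)

x = torch.rand(100, 256)    # intermediate representations of 100 key points
m = torch.rand(100, 256)    # aggregated attention messages (placeholder values)
f = final_proj(ResidualMessageUpdate()(x, m))   # matching descriptors f_i
print(f.shape)  # torch.Size([100, 256])
```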

The main function of the optimal matching layer is to generate a matching score matrix and output the final matching pairs. After the feature description vectors fiA and fjB are obtained, their inner products are calculated to form the score matrix S ∈ RH×W, where H and W represent the number of feature points in the recognition image and the image to be matched, respectively. The inner product is calculated as follows:

$$S_{i,j} = \left\langle f_{i}^{A}, f_{j}^{B} \right\rangle, \quad \forall (i,j) \in A \times B$$
(8)

Among them, ⟨·,·⟩ is the inner product. The score matrix is expanded by one row and one column (a "dustbin" channel) to obtain ‾S, as shown in Eq. (9), and unmatched points are assigned directly to that channel. The score of the newly added row and column is a fixed value, which can also be learned during training.

$$\overline{S}_{i,W + 1} = \overline{S}_{H + 1,j} = \overline{S}_{H + 1,W + 1} = z \in R$$
(9)

where S ∈ RH×W is the assignment matrix and ‾S ∈ R(H+1)×(W+1) is the augmented matching matrix; each row represents the matching probabilities between one point and all candidate points. Since each point can have at most one match, the values in each row of ‾S sum to 1. Finally, the Sinkhorn algorithm is used to solve for the maximum of the ‾S assignment, yielding the optimal assignment result.
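The score matrix, dustbin augmentation, and Sinkhorn normalization of Eqs. (8)–(9) can be sketched as follows. This is a simplified version of the SuperGlue formulation: the log-domain implementation and the learned dustbin score are omitted, and z is a placeholder constant.

```python
import torch

def match_scores(f_a: torch.Tensor, f_b: torch.Tensor, z: float = 1.0,
                 iters: int = 20) -> torch.Tensor:
    """Sketch of Eqs. (8)-(9) followed by Sinkhorn normalization."""
    S = f_a @ f_b.t()                          # Eq. (8): S_ij = <f_i^A, f_j^B>
    H, W = S.shape
    S_aug = torch.full((H + 1, W + 1), z)      # Eq. (9): append dustbin row/column with score z
    S_aug[:H, :W] = S
    P = torch.exp(S_aug)
    for _ in range(iters):                     # Sinkhorn iterations:
        P = P / P.sum(dim=1, keepdim=True)     # normalize rows,
        P = P / P.sum(dim=0, keepdim=True)     # then columns.
    return P                                   # soft assignment matrix


P = match_scores(torch.rand(80, 256), torch.rand(90, 256))
print(P.shape)  # torch.Size([81, 91]); the argmax of each row gives the match
```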

PROSAC optimization

Currently, traditional feature optimization methods often adopt the Random Sample Consensus (RANSAC) algorithm. However, since RANSAC selects data randomly without considering the difference in data quality, it may lead to difficulties in convergence under certain circumstances. To overcome this limitation, this paper introduces the PROSAC algorithm. The PROSAC algorithm adopts a semi-random approach. It evaluates the quality of matching point pairs, sorts them in descending order based on their quality, and gives priority to selecting high-quality point pairs. PROSAC can effectively reduce the computation time of the algorithm. Moreover, in cases where RANSAC fails to converge, its semi-random selection method can still ensure the convergence of the algorithm, thereby obtaining more accurate feature point-matching results. Figure 8 shows the results without optimization and the results further optimized with PROSAC. The matching relationship between the recognition image feature points and the sorting image feature points in the figure is connected by colored lines.

Fig. 8
figure 8

The match experiment.

From Fig. 8, we can see that there are some mismatched feature point pairs in Fig. 8(c), which leads to a deviation in the matching result indicated by the blue box in Fig. 8(e). After optimization with the PROSAC algorithm, the mismatched feature point pairs are filtered out, as shown in Fig. 8(d), and the blue box in Fig. 8(e) outputs the correct result.
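A sketch of this filtering step is shown below, assuming an OpenCV build (≥ 4.5) that exposes the USAC framework with the cv2.USAC_PROSAC flag; kpts_rec, kpts_sort, and matches are hypothetical names for the key point arrays and the (i, j, score) match list produced by the previous stages.

```python
import cv2
import numpy as np

def prosac_filter(kpts_rec, kpts_sort, matches, thresh=3.0):
    """Keep only match pairs consistent with a homography estimated by PROSAC."""
    # PROSAC samples the best-ranked correspondences first, so the pairs are
    # sorted by SuperGlue confidence in descending order before estimation.
    matches = sorted(matches, key=lambda m: m[2], reverse=True)
    src = np.float32([kpts_rec[i] for i, _, _ in matches])
    dst = np.float32([kpts_sort[j] for _, j, _ in matches])
    H, mask = cv2.findHomography(src, dst, cv2.USAC_PROSAC, thresh)
    inliers = [m for m, keep in zip(matches, mask.ravel()) if keep]
    return H, inliers
```

The surviving inlier points in the sorting image can then be framed with cv2.minAreaRect to obtain the minimum bounding rectangle of the target gangue, as described in the model construction.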

Experiment

Experimental environment and image acquisition

The experiment adopts our team's existing coal gangue sorting robot experimental platform; the sorting robot used in the experiment has a truss structure. The entire system consists of a belt conveyor, a coal gangue recognition system, a robotic arm sorting system, a robot control system, and a visual servo system, as shown in Fig. 9.

Fig. 9
figure 9

The coal gangue sorting robot system.

The experimental hardware environment is a PC with an i7-10700 processor, 16 GB of memory, and an NVIDIA GeForce RTX 2060 GPU, together with an MV-HS050GC Vickers camera. The industrial camera obtained 1000 images of coal and gangue from the gangue recognition system, including 500 images each of coal and gangue. The resolutions of the coal gangue recognition image and sorting image are 455 × 360 and 1440 × 1080 pixels, respectively. We expanded the original samples to 10,000 by translating, rotating, and scaling the sample images, and constructed a coal gangue recognition sample library to ensure sample diversity. Partial recognition images are shown in Fig. 10. The Gangue Dataset contains mixed images of coal gangue under object distance, rotation angle, and scale transformations. As shown in Fig. 11, they are respectively referred to as the GD-OT, GD-RT, and GD-ST datasets.

Fig. 10
figure 10

The recognition images.

Fig. 11
figure 11

The dataset images.

Experimental results and analysis

The eye-in-hand camera installation mode was used in the experiment. Because images captured in different regions are affected by differences in object distance, scale, rotation angle, and other factors, matching experiments were conducted on the recognition and sorting images under these conditions. Precision (P), Recall (R), the PR curve, and matching time (T) are used as evaluation indicators28.

$${\text{Precision (P)}} = \frac{{N_{{{\text{TP}}}} }}{{N_{{{\text{TP}}}} + N_{{{\text{FP}}}} }} \,$$
(10)
$${\text{Recall (R)}} = \frac{{N_{{{\text{TP}}}} }}{{N_{{\text{P}}} }}$$
(11)

NTP, NFP, and NP represent the number of true positive matching points, the number of false positive matching points, and the total number of matching points, respectively.
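Under these definitions, the indicators reduce to simple counts over the match results; the sketch below uses illustrative numbers only, and the true/false labels are assumed to come from ground truth.

```python
def precision_recall(n_tp: int, n_fp: int, n_p: int) -> tuple[float, float]:
    """Eq. (10) precision over returned matches; Eq. (11) recall over all matches."""
    return n_tp / (n_tp + n_fp), n_tp / n_p


p, r = precision_recall(n_tp=98, n_fp=2, n_p=100)   # illustrative numbers only
print(f"P = {p:.1%}, R = {r:.1%}")                   # P = 98.0%, R = 98.0%
```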

The matching results of our algorithm under different conditions are shown in Figs. 12, 13, 14 and 15. The colored dots in the images are the detected feature points, and the matched points are connected by colored lines.

Comparison experiment of matching different object distances

We set the image scale in our experiment to α and the rotation angle to 0°. To test the matching performance of the algorithm at different object distances, sorting images were collected in the sorting area with vertical distances between the robotic manipulator and the belt of 80 cm, 70 cm, 60 cm, and 50 cm, respectively. Figure 12 shows the matching results of our algorithm on coal gangue recognition images and sorting images at different object distances.

Fig. 12
figure 12

The Match results of different object distances.

From the matching results in Fig. 12, it can be seen that our method can correctly match the target coal gangue at different object distances. To further demonstrate the effectiveness of the algorithm proposed in this paper, the D2-Net dense feature matching method29, the MatchNet matching method30, the SuperPoint-SuperGlue (SP-SG) matching method31, the LR-SuperGlue (LR-SG) matching method32, and the DBSCAN-SuperGlue (DBSCAN-SG) matching method33 were used to match the coal gangue recognition image with the coal gangue sorting image.

Table 3 records the experimental data of matching rates and matching times for different methods, and takes the average value as the result.

Table 3 Matching experimental data results for different object distances.

The experimental results indicate that as the object distance increases, the matching rates of the six methods decrease, while the matching time does not change significantly. The matching rates of the five comparison methods decrease markedly, whereas our method maintains a matching rate of over 97% and a matching time of less than 96 ms. Our method requires slightly more time than the SP-SG and DBSCAN-SG methods, but achieves higher matching accuracy.

Comparison experiment of matching different scales

We set the object distance of the coal gangue recognition image and sorting image to 50 cm and the rotation angle to 0°. The scale factors are set to 0.3α, 0.5α, 0.7α, and 0.9α, respectively. Figure 13 demonstrates that our algorithm can match the target coal gangue under these scaling factors.

Fig. 13
figure 13

The Match results of different scales.

To further demonstrate the effectiveness of the algorithm proposed in this paper, the D2-Net feature matching method, MatchNet matching method, LR-SG matching method, DBSCAN-SG matching method and SP-SG matching method were used to match the coal gangue recognition image with the coal gangue sorting image. Table 4 records the experimental data of matching rates and matching times for different methods, and takes the average value as the result.

Table 4 Matching experimental data results for different scales.

According to the results in Table 4, the following conclusions can be drawn. As the scale increases, the matching time of each method becomes longer, but the matching rate also gradually increases. Our method has clear advantages in speed and matching rate over methods such as D2-Net and MatchNet. The LR-SG method is more sensitive to scale transformations. The DBSCAN-SG method requires less time, but its matching accuracy is poor. Our method achieves a higher matching rate than the SP-SG algorithm and maintains a matching rate of around 98%.

Comparative experiments on different rotation angles

We set the object distance to 50 cm and the scale factor to α; only the rotation angle is changed, with the other experimental conditions unchanged. Coal gangue sorting image samples were obtained at rotation angles of -90°, -45°, 45°, and 90°, with clockwise rotation taken as positive. Matching experiments at the different rotation angles were conducted with the proposed algorithm, as shown in Fig. 14.

Fig. 14
figure 14

The Match results of different rotation angles.

To further demonstrate the effectiveness of the algorithm proposed in this paper, the D2-Net feature matching method, MatchNet matching method, LR-SG matching method, DBSCAN-SG matching method and SP-SG matching method were used to match the coal gangue recognition image with the coal gangue sorting image. Table 5 records the experimental data of matching rates and matching times for different methods, and takes the average value as the result.

Table 5 Matching experimental data results for different rotation angles.

According to Table 5, when the rotation angle varies between -90° and 90°, the matching rates of the six methods first increase and then decrease, with the highest matching rate occurring at a rotation angle of 0°. The matching times of the six methods first decrease and then increase, with the shortest matching time also occurring at 0°. Our method still maintains a matching rate of over 98.1% and a matching time of less than 95 ms, showing a clear advantage over the other methods.

The matching results of complex conditions

The scene of coal gangue sorting is dynamically changing: there may be water stains, coal slag, and foreign objects on the belt. Therefore, the algorithm was additionally validated in these situations, and the results are shown in Fig. 15. The experiment includes matching results under different object distances, angles, and scales.

Fig. 15
figure 15

The comparison of matching results.

The experimental results show that, although the matching environment is complex, our method can still correctly match the target gangue. The Precision, Recall, and matching time are 98.2%, 98.3%, and 84.6 ms, respectively, demonstrating good performance.

In summary, we have compiled the average matching results under different experimental conditions, as shown in Table 6. From the table, it can be seen that the average matching rate of this method for coal gangue recognition and sorting images is 98.2%, with an average matching time of 84.6 ms. The method features a high matching rate, good real-time performance, and strong robustness. This performance advantage can further improve the accuracy of coal gangue sorting robots.

Table 6 The comparison of matching results.

In addition, the comparison of our method's matching performance with that of the other methods is shown in Fig. 16. The larger the area enclosed by the P-R curve (where P is precision and R is recall), the better the matching performance.

Fig. 16
figure 16

P-R Curves.

Discussion and conclusion

In light of alterations in the pose of the target gangue during transport, this paper puts forth an IMSSP-Net fast gangue image matching method to re-obtain the pose information of the target coal gangue. The proposed method facilitates greater precision and efficiency in the robot's grasping. To address the low matching rate of conventional methods in complex contexts, the SuperPoint and SuperGlue networks are combined to achieve precise and effective matching between gangue recognition and sorting images. To address the limited feature extraction capability of traditional methods, the improved MSRCR is adopted to enhance the feature point detection capability of the SuperPoint algorithm. The PROSAC method is then used to refine and filter the matched features, thereby improving the matching precision. The experimental results demonstrate that the proposed IMSSP-Net method is well suited to coal gangue image matching and, compared with other algorithms, can be applied to coal gangue sorting in complex environments. The average matching precision of the algorithm is 98.2%, and the average matching time is 84.6 ms. The matching time and precision currently meet the requirements of coal gangue sorting tasks, but further research and improvement are needed to address gangue stacking under high coal throughput.