Adaptive multi-feature fusion for visible-infrared image registration and character enhancement of bamboo slips

Wan, Teng; Qi, Fengchen; Yang, Yanna; Qi, Ying; Zhang, Qiang; Du, Shaoyi

doi:10.1038/s40494-026-02368-z


Article
Open access
Published: 13 February 2026

Teng Wan¹,
Fengchen Qi¹,
Yanna Yang¹,
Ying Qi¹,
Qiang Zhang¹ &
Shaoyi Du²

npj Heritage Science volume 14, Article number: 96 (2026)

Abstract

Ancient bamboo/wooden slips suffer severe character degradation after millennia of burial, requiring infrared imaging for text identification. This work proposes a multimodal coarse-to-fine registration method to fuse visible and infrared images while preserving texture/color and restoring degraded inscriptions. The approach comprises: (1) Coarse registration using edge-feature-priority strategy, leveraging stable slip contours for global alignment via downsampling; (2) Fine registration with improved ICP algorithm incorporating weighted features and dynamic weight adjustment, transitioning from edge-dominance to corner-dominance for precise local registration; (3) Multi-stage hybrid optimization combining gradient methods with multi-restart simulated annealing, maximizing mutual information for optimal transformation matrices. The method addresses weak texture, modal differences, and severe character degradation by selecting appropriate registration strategies and feature weights at different stages. Experiments demonstrate superior performance over existing methods in visual quality and quantitative metrics. Difference fusion based on registered multimodal images achieves effective degraded character restoration, significantly improving inscription readability.

Introduction

Bamboo and wooden slips, as important writing carriers in ancient China^1,2, were extensively used from the Qin-Han to Wei-Jin periods. Primarily made from wooden or bamboo strips with characters written in ink using brushes, they recorded government decrees, judicial matters, documentation, medicine, and daily life, and their content is exceptionally rich. Since the 20th century, large-scale archaeological excavations at Dunhuang, Juyan, Zhangjiashan, Liye, Changsha Zoumalou, and other sites have unearthed numerous precious artifacts, providing unprecedented primary materials of irreplaceable historiographical value and cultural significance for the study of ancient Chinese history. The organization and study of bamboo slips have not only advanced disciplines such as paleography, history, and legal history, but have also greatly enriched the documentary record of Chinese civilization. Through in-depth study of bamboo slip documents, scholars can reconstruct ancient society more faithfully and correct and supplement the deficiencies of transmitted texts.

However, bamboo slip research faces numerous challenges and difficulties. Character degradation represents one of the core issues affecting bamboo slip studies. Due to erosion over lengthy historical periods and the complexity of burial environments, numerous excavated bamboo slips exhibit varying degrees of character degradation, ink blurring, surface weathering, folding, and damage³. Particularly in visible images, the inscriptions on bamboo slips often exhibit varying degrees of degradation due to background interference, wood aging, and physical damage or fracture⁴, which severely impairs the interpretation and scholarly study of bamboo slip documents.

In recent years, Gong et al.⁵ employed hyperspectral imaging technology to analyze genealogy seals in the Yibin Museum collection, revealing significant differences in pigment reflectance properties across visible and near-infrared bands. Guo et al.⁶ extracted pigment information from ancient paintings using imaging spectroscopy technology, distinguishing background pigments from overlay information, thereby facilitating the authentication and restoration of ancient paintings. Sun et al.⁷ effectively assessed the deterioration degree of Dunhuang murals using hyperspectral imaging. Zhou et al.⁸ utilized hyperspectral imaging to extract blurred seal information from calligraphy and paintings, contributing to further research on these cultural artifacts. Hou et al.⁹ achieved enhancement, extraction, and recognition of faded text in ancient calligraphy and paintings using hyperspectral technology. Deng et al.¹⁰ addressed the challenges of accuracy and efficiency in mold detection for paper-based cultural artifacts by proposing a triple-path multimodal feature fusion network based on hyperspectral imaging, which integrates RGB spatial information, spectral features, and joint spectral-spatial features to achieve precise mold identification. Cucci et al.¹¹ addressed the recognition challenges posed by degraded ancient Egyptian hieroglyphs by proposing a method that combines visible-near infrared hyperspectral imaging with convolutional neural networks, providing a novel technical pathway for digital documentation and text recognition in cultural heritage. Mezina et al.¹² proposed a deep learning approach for anomaly detection in X-ray images of paintings. The research team constructed a dataset using high-resolution X-ray images of the Ghent Altarpiece, and based on this dataset, developed a novel neural network architecture that integrates the Discriminatively Trained Reconstruction Anomaly Embedding Model (DRAEM) with Nested U-Net for anomaly detection.
Vila et al.¹³ successfully revealed ancient Greek text concealed on the back of papyrus using shortwave-infrared hyperspectral imaging, substantially improving the contrast between writing and papyrus substrate. Mocella et al.¹⁴ applied X-ray phase-contrast tomography for virtual unwrapping and deciphering of unrolled Herculaneum papyri. Parsons et al.¹⁵ recovered text invisible to the naked eye from unrolled, extremely fragile ancient Herculaneum papyrus scrolls using non-destructive imaging and machine learning techniques. Lv et al.¹⁶ addressed the complex artistic styles and irregular damage patterns in Kizil Grotto murals by proposing a multimodal restoration method based on diffusion models. Zhou et al.¹⁷ addressed the challenges of scarce annotated data and difficulty in distinguishing visually similar characters in Oracle Bone Script recognition by proposing the OracleNet model, which effectively improves recognition accuracy and robustness in cross-domain scenarios. The application of multimodal imaging technology in cultural heritage preservation and research has provided new technical approaches for addressing bamboo slip character degradation. Particularly in bamboo slip research, infrared images often reveal ink traces that are difficult to identify under visible light, while visible images provide material texture and structural information. For bamboo slips (jiandu), spectral technology enables faded text, originally invisible under visible light, to reappear in the infrared band. Zhang et al.^18,19,20 employed infrared imaging technology to make faded text on bamboo slips reappear in infrared images and compiled this information into publications, thereby advancing bamboo slip research.
Cao et al.²¹ proposed a character restoration method for Qin and Han bamboo slips based on improved conditional generative adversarial networks (cGAN), and constructed a dataset comprising 500 pairs of original and ground truth images of Qin-Han bamboo slip characters. In summary, visible and infrared images are strongly complementary at the information level: the former provides structural and textural details, while the latter reveals concealed or degraded information (Fig. 1). This cross-band complementarity constitutes an important foundation for artifact visualization and restoration, and serves as a critical basis for the registration of infrared and visible images of bamboo slips. However, keeping the two modalities as separate images presents challenges for bamboo slip research, as experts conducting character recognition, fragment rejoining, and various other studies often need to repeatedly compare visible and infrared images, substantially increasing workload and affecting research efficiency.

**Fig. 1: Complementary information from visible and infrared imaging of bamboo slips and the necessity of image registration.**

The development of multimodal bamboo slip image registration technology holds significant positive implications for bamboo slip research. First, due to differences in imaging principles and equipment parameters, substantial geometric misalignment and photometric differences exist between visible and infrared images^22,23, making direct fusion for image enhancement often unable to achieve ideal results. Therefore, proposing high-precision multimodal bamboo slip image registration algorithms to achieve precise alignment between visible and infrared images has become a critical technological component for improving bamboo slip research quality. Precise multimodal image registration can align inter-modal differences, laying the foundation for subsequent image fusion and character enhancement processing. Through image fusion methods that combine complementary information from visible and infrared images of bamboo slips, not only can degraded character information be enhanced while preserving texture information²⁴, significantly improving the recognition of degraded text, but this also provides clearer and more accurate image evidence for subsequent work such as bamboo slip transcription and fragment rejoining. High-quality digitized bamboo slip images not only support remote access and online research, breaking geographical limitations and promoting international cooperation and exchange, but the in-depth development of bamboo slip digitization work also holds important strategic significance for inheriting and promoting excellent traditional Chinese culture. Through establishing comprehensive bamboo slip digital resource libraries²⁵, not only can convenient retrieval and analysis tools be provided for researchers, but the development of interdisciplinary research can also be promoted, driving the cross-disciplinary integration of history, archeology, linguistics, computer science, and other fields, injecting new vitality into bamboo slip research. 
The field of image registration has witnessed significant development over the past decades, with applications extending from medical imaging to cultural heritage preservation. Traditional image registration methods can be broadly categorized into intensity-based and feature-based approaches. The method proposed by Viola and Wells²⁶ directly optimizes similarity metrics between image intensities. Mutual information methods^27,28,29 and Fourier-based methods^30,31 aim to match images by identifying similar pixel intensities in overlapping regions, and have been widely adopted as multimodal image registration approaches. Scale-Invariant Feature Transform (SIFT)^32,33,34 and Speeded Up Robust Features (SURF)³⁵ have been extensively employed for natural image registration.

The Iterative Closest Point (ICP) algorithm was proposed by Besl and McKay³⁶ for rigid point set registration, achieving progressive point set alignment through alternating nearest point matching and least-squares transformation estimation. Chen and Medioni³⁷ previously proposed the multi-view point set alignment concept, which also laid the groundwork for ICP development. Subsequently, Zhang³⁸ extended ICP to accommodate free-form curve and surface registration. These early works established the classical ICP framework based on geometric features (point coordinate distances). As application scenarios expanded, the robustness and convergence of ICP have been continuously improved. Fitzgibbon³⁹ proposed the LM-ICP algorithm incorporating nonlinear optimization and robust kernel functions in registration, thereby enhancing robustness and convergence speed in 2D and 3D point set registration. Segal et al.⁴⁰ further proposed Generalized-ICP, introducing point-to-plane error and local covariance structure into the optimization objective function, making the registration process statistically more rigorous. Yang et al.⁴¹ proposed the Go-ICP algorithm, guaranteeing global optimality under the L2 error metric. Bouaziz et al.⁴² proposed the Sparse Iterative Closest Point (Sparse ICP) algorithm, which introduces sparse regularization into the traditional ICP framework by integrating sparse constraints directly into the optimization model, enabling the algorithm to maintain high registration accuracy and convergence stability even in the presence of noise, occlusion, or partial overlap. Guo et al.⁴³ proposed the Adaptive Weighted Robust Iterative Closest Point (AW-RICP) algorithm, enhancing ICP robustness in scenarios with noise, outliers, and partial overlap through adaptive sparse neighbor selection and weight learning mechanisms. 
Zhou et al.⁴⁴ proposed the Fast Global Registration (FGR) algorithm, which transforms the registration problem into an optimization problem with robust loss functions and utilizes Black-Rangarajan duality for rapid solution, thereby achieving fast registration without requiring initial alignment. However, the FGR algorithm depends on feature quality.
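
The alternation at the heart of classical ICP, nearest-point matching followed by a least-squares transformation estimate, can be illustrated with a minimal 2D rigid sketch. It is not the improved ICP proposed in this work; the function names (`icp_2d`, `fit_rigid_2d`, `apply_rigid`) and the brute-force nearest-neighbour search are assumptions made for illustration only.

```python
import math

def nearest(p, points):
    """Return the point in `points` closest to p (brute-force search)."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def fit_rigid_2d(src, dst):
    """Closed-form least-squares rotation angle and translation mapping src onto dst."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(q[0] for q in dst) / n
    cy_d = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - cx_s, py - cy_s      # centred source point
        bx, by = qx - cx_d, qy - cy_d      # centred target point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, tx, ty

def apply_rigid(points, theta, tx, ty):
    """Rotate by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(src, dst, iterations=20):
    """Alternate nearest-point matching and rigid least-squares fitting."""
    current = list(src)
    for _ in range(iterations):
        matched = [nearest(p, dst) for p in current]
        theta, tx, ty = fit_rigid_2d(current, matched)
        current = apply_rigid(current, theta, tx, ty)
    return current
```

Because the correspondences come from nearest neighbours, a sketch like this only converges when the initial misalignment is small, which is one reason a coarse stage usually precedes ICP-style refinement.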

The advent of deep learning has transformed the trajectory of image registration development, particularly in medical imaging applications. VoxelMorph⁴⁵ introduced an unsupervised learning framework that employs convolutional neural networks to directly predict deformation fields. This approach significantly reduces computational time during inference while maintaining registration accuracy comparable to traditional iterative methods. Arar et al.⁴⁶ proposed an unsupervised multimodal image registration method based on geometry-preserving image-to-image translation, addressing the modal difference challenges that traditional methods struggle to handle by transforming multimodal registration into single-modal registration.

The application of transformer architecture⁴⁷ in image registration tasks has also achieved promising results. Vision Transformer (ViT)⁴⁸ first demonstrated that transformers could be successfully applied to vision tasks, establishing the foundation for subsequent visual transformer development. TransMorph⁴⁹ further demonstrated that visual transformers can capture long-range spatial dependencies more effectively than CNNs, thereby improving registration performance. However, these transformer-based methods require substantial computational resources and large training datasets, limiting their applicability in resource-constrained scenarios.

This work focuses on high-precision registration between visible and infrared images and degraded character enhancement. The objective is to propose a robust and broadly applicable registration and character enhancement framework oriented toward degraded text enhancement, enabling accurate alignment of infrared and visible images of bamboo slips and achieving character restoration based on the registration results. This work provides effective technical support for deep information mining and digital preservation of bamboo slip images, filling a technological gap in the field of cultural heritage conservation. The main contributions are as follows:

A coarse registration module for multimodal bamboo slip images is proposed, utilizing the relatively stable characteristics of edge information across different modalities and under downsampling conditions to achieve global registration between multimodal bamboo slip images.
A fine registration module for globally registered multimodal bamboo slip images is proposed. The Iterative Closest Point (ICP) algorithm is improved by replacing its sole reliance on raw points with edge and corner features and by dynamically adjusting the weights of these features across iterations, which addresses the challenges that weak texture poses for bamboo slip image registration.
A registration optimization module based on a hybrid optimization strategy combining gradient optimization with multi-restart simulated annealing is proposed, fine-tuning registration results through mutual information maximization.

The paper is structured as follows. The Methods section details the registration and character enhancement algorithms. The Results section presents experimental validation of the registration method and character enhancement outcomes. The Discussion section summarizes the main contributions and outlines future research directions.

Methods

This work proposes a multi-level registration framework for weak-texture bamboo slip images, which organically integrates edge features with SIFT corner features and achieves high-precision registration through a coarse-to-fine hierarchical strategy, as shown in Fig. 2. The core concept of this method is to combine the characteristics of bamboo slip images, implementing the registration process hierarchically, and adaptively adjusting the weights of different features at various stages. The entire registration process is divided into three stages: coarse registration, fine registration, and mutual information optimization. Multi-feature fusion and dynamic weight adjustment strategies are incorporated into the first two stages.

**Fig. 2: Framework of the proposed adaptive multi-feature fusion method for bamboo slip registration and character enhancement.**

Multi-feature detection and weighted matching

Multimodal bamboo slip images exhibit significant texture differences, making traditional single features ineffective in addressing registration challenges. This section details the multi-feature extraction and matching strategies designed for bamboo slip characteristics, including adaptive multi-intensity SIFT feature extraction, dedicated bamboo slip edge feature extraction, and feature matching mechanisms based on different weights. Input images are first uniformly converted to grayscale to eliminate the influence of color component differences on feature extraction results. Subsequently, three methods are employed for structural enhancement of the images: First, bilateral filtering is used to smooth noise while preserving edges. Compared to Gaussian filtering, bilateral filtering retains more gradient information at edges, facilitating subsequent edge and corner point identification. Second, histogram equalization is applied to enhance overall image contrast, stretching the dynamic range of dark regions to highlight weak boundaries in low-texture areas and improve the stability of feature point response values. Finally, Laplacian enhancement is used for high-pass processing of the image to emphasize edge contours. While these results are not directly used for registration, they provide auxiliary support for subsequent “texture intensity assessment”.
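
One plausible way to turn a Laplacian high-pass response into a single texture-intensity score is to take its variance. The sketch below assumes a grayscale image stored as a list of rows and uses the 4-neighbour Laplacian; the function name `laplacian_variance` and the choice of kernel are illustrative assumptions, not details confirmed by the source.

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `img` is a list of equally long rows of numeric gray values.
    Smooth images give values near zero; textured images give large values.
    """
    h, w = len(img), len(img[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)
```

A score of this kind could then be compared against a cut-off chosen on representative slips to separate weak-texture from strong-texture images; the source does not state the cut-off it uses.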

Bamboo slip images typically exhibit weak texture and low contrast, posing challenges for traditional SIFT feature extraction. To address this issue and improve structural clarity and feature stability, this work designs an adaptive dual-intensity SIFT feature extraction strategy suited to the special properties of bamboo slip images. We perform texture intensity analysis using the Laplacian operator, calculating an overall texture intensity value as the basis for adaptive parameter adjustment. In regions with weaker and smoother textures, the Laplacian response is smaller, reflected in a smaller variance; in regions with stronger and more complex textures, the Laplacian response is larger, reflected in a larger variance. Applied to bamboo slip images, slips with mild character degradation display clearer characters in visible images, with rich details and edges, representing strong texture; slips with more severe character degradation show increasingly faint or even vanished characters in visible images, revealing only the material background with large smooth areas lacking obvious edges and details, representing weak texture. Based on the texture intensity results, the algorithm distinguishes weak-texture from strong-texture images. For bamboo slips with mild character degradation, visible images contain relatively clear text, and these characters serve as excellent feature point sources: character strokes, turns, and intersections are all detected as corner points, specifically high-intensity corner points. In such cases, we raise the corner detection threshold so that registration relies as far as possible on high-intensity corner points, which typically achieve higher matching accuracy than medium-intensity corner points.
For weak-texture images, we employ more relaxed parameters to extract medium-intensity corner points. After extracting feature points and descriptors separately, the results are merged to form a comprehensive corner point set. This dual-intensity strategy can simultaneously capture prominent features and medium-intensity features in images, substantially improving feature extraction capability in weak-texture regions. To ensure uniform spatial distribution of feature points while avoiding redundancy, two optimization strategies are applied in the algorithm: (1) Non-maximum suppression: Only the feature points with the strongest response values are retained within local regions, effectively reducing redundancy of nearby feature points. Each feature point is examined, and if a feature point with a higher response value exists within the region, the current feature point is suppressed; (2) Local uniformity constraint: The image is divided into grids, with a maximum of 15 strongest-response feature points selected within each grid cell, ensuring uniform distribution of feature points throughout the image and avoiding clustering in locally texture-rich regions.
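
The two spatial-distribution strategies described above can be sketched as follows. Keypoints are represented as hypothetical `(x, y, response)` tuples; the suppression radius and grid cell size are assumed parameters, and only the cap of 15 points per cell comes from the text. The suppression shown is a greedy variant (strongest first), which is one common way to implement the idea.

```python
from collections import defaultdict

def suppress_non_maxima(keypoints, radius):
    """Greedy suppression: visit keypoints from strongest to weakest and keep
    one only if no already-kept (stronger) keypoint lies within `radius`."""
    kept = []
    for x, y, r in sorted(keypoints, key=lambda k: -k[2]):
        if all((x - kx) ** 2 + (y - ky) ** 2 > radius ** 2
               for kx, ky, _ in kept):
            kept.append((x, y, r))
    return kept

def cap_per_cell(keypoints, cell_size, max_per_cell=15):
    """Keep at most `max_per_cell` strongest keypoints in each grid cell."""
    cells = defaultdict(list)
    for x, y, r in keypoints:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y, r))
    result = []
    for pts in cells.values():
        pts.sort(key=lambda k: -k[2])      # strongest response first
        result.extend(pts[:max_per_cell])
    return result
```

Applying suppression before the per-cell cap keeps near-duplicate detections from consuming a cell's quota, though the source does not state the order in which the two steps are applied.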

Bamboo slip edges are shared and stable features across modalities, and they play the primary role in the coarse registration stage, where edge point matching accuracy further improves after downsampling. This work designs an edge feature extraction method specifically tailored to bamboo slip characteristics. The extracted bamboo slip edge contours are shown in Fig. 3. The method focuses on suppressing interference from text details and bamboo slip material texture. Because infrared images reveal more characters than visible images, edges extracted from the two modalities can differ significantly and reduce edge matching accuracy; suppressing text details highlights the main slip contours, which vary less across modalities. In addition, visible images of bamboo slips contain rich texture information, while infrared images contain almost no wood texture information owing to their imaging characteristics: near-infrared light penetrates the texture layer of the wood surface more strongly, reducing the impact of surface structure on imaging, and the various chemical components of wood absorb relatively uniformly in this band, so texture-induced differences in chemical composition are insufficient to produce strong contrast. Without processing, wood texture in visible images would generate numerous false edge responses that have no counterpart in infrared images, reducing the reliability of the registration algorithm. Therefore, we first apply a Gaussian blur with a 5 × 5 kernel to reduce character detail interference while preserving the main structural contours.
Subsequently, image gradients are calculated through Sobel operators, with gradient statistics further analyzed to adaptively set Canny edge detection thresholds. This adaptive threshold strategy better accommodates edge characteristics of different bamboo slip images. After applying Canny edge detection, morphological closing operations are employed to connect broken edges, enhancing edge continuity. External contours are then extracted, and area thresholds are applied to filter small-area noise, preserving main bamboo slip edge contours. This edge extraction method fully considers bamboo slip characteristics, effectively suppressing text detail interference and highlighting main external edges of bamboo slips, providing reliable features for subsequent edge-based registration.
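
The idea of deriving Canny thresholds from gradient statistics can be sketched as below. The source does not say which statistics are used, so the rule here (high threshold = mean + k × standard deviation of the Sobel magnitude, low threshold = a fixed fraction of the high one) is purely an assumption, as are the function names and default parameters.

```python
import statistics

def sobel_magnitudes(img):
    """Sobel gradient magnitude at every interior pixel of a grayscale image
    given as a list of rows."""
    h, w = len(img), len(img[0])
    mags = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mags.append((gx * gx + gy * gy) ** 0.5)
    return mags

def adaptive_canny_thresholds(img, k=1.0, ratio=0.5):
    """Return (low, high) thresholds scaled to the image's own gradients.

    Assumed rule: high = mean + k * std of the gradient magnitude,
    low = ratio * high.
    """
    mags = sobel_magnitudes(img)
    high = statistics.mean(mags) + k * statistics.pstdev(mags)
    return ratio * high, high
```

With a rule of this kind, a low-contrast slip yields proportionally lower thresholds than a high-contrast one, so faint but genuine contours are less likely to be discarded.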

**Fig. 3: Bamboo slip edge contour information extracted through the improved edge extraction method.**

After feature extraction, correspondences between features from different images must be established, and reasonable weight allocation strategies must be designed to fully leverage the effectiveness of different features at various registration stages. First, for corner point feature matching, the algorithm employs a FLANN-based fast matching algorithm. The KD-tree method is used for index construction, with a higher number of checks set to improve matching precision. To enhance matching quality, an adaptive distance ratio threshold strategy is adopted. Traditional SIFT matching typically filters matches with a fixed distance ratio threshold. Strict thresholds may suit texture-rich images with high feature discrimination, but for images such as bamboo slips, with weak texture and large modal differences, overly strict thresholds may leave too few valid matching points. Therefore, for bamboo slip images, adaptive distance ratio thresholds are more conducive to finding matching points. We first calculate the distance ratio distribution of all matching point pairs, then statistically determine appropriate thresholds, ensuring matching strictness while adapting to different image characteristics. To further improve matching reliability, we also implement a bidirectional matching verification strategy, which requires that a pair of feature points correspond to each other in both forward and reverse matching to be considered a valid match. This strategy significantly reduces mismatches and improves matching point reliability. When sufficient matching points are available, we further apply the RANSAC algorithm for geometric consistency verification, eliminating outliers that do not conform to the primary geometric transformation. Second, for establishing edge feature correspondences, we adopt a distance-based nearest point matching strategy.
For each edge sampling point in the source image, the point with minimum Euclidean distance among all edge points in the target image is identified as the corresponding point. To avoid incorrect matching, we set a distance threshold, with point pairs exceeding this threshold considered invalid matches. Additionally, we incorporate distance-based exponential decay weights for feature points, ensuring that closer edge point pairs receive higher weights, further enhancing matching reliability. Finally, we design a weight allocation method based on feature point response values. For corner feature points, matching weights are calculated based on keypoint response values. Specifically, for each matching pair, the geometric mean of the two keypoint response values is taken as the initial weight, then weight values are normalized to ensure higher-quality feature points have greater influence. For edge points, higher weights are assigned in the initial stage to fully utilize the advantages of edge features in coarse registration. Through multi-feature detection and weighted matching strategies, the algorithm can effectively address multimodal registration challenges in bamboo slip images, laying the foundation for subsequent multi-level registration. The adopted response value-based feature weight allocation mechanism and bidirectional matching verification strategy significantly improve matching point reliability, while the introduction of edge features effectively supplements the deficiency of corner features in text regions, enhancing the algorithm’s capability to utilize bamboo slip edge features.
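
The edge correspondence and weighting rules described above can be sketched as follows. The decay constant `sigma`, the threshold `max_dist`, and normalization by the maximum weight are assumptions made for illustration; the source specifies only that edge weights decay exponentially with distance and that corner weights start from the geometric mean of the two response values before normalization.

```python
import math

def match_edges(src_pts, dst_pts, max_dist, sigma):
    """Nearest-point edge matching with a distance gate.

    Returns (src_point, dst_point, weight) triples, where the weight
    exp(-d / sigma) decays with the Euclidean distance d of the pair.
    """
    matches = []
    for p in src_pts:
        q = min(dst_pts, key=lambda c: math.dist(p, c))
        d = math.dist(p, q)
        if d <= max_dist:                  # pairs beyond the gate are invalid
            matches.append((p, q, math.exp(-d / sigma)))
    return matches

def corner_weights(response_pairs):
    """Geometric mean of the two keypoint responses, scaled so the
    strongest match has weight 1 (assumed normalization)."""
    raw = [math.sqrt(a * b) for a, b in response_pairs]
    top = max(raw)
    return [w / top for w in raw]
```

For example, `corner_weights([(4.0, 9.0), (1.0, 1.0)])` gives the first match six times the weight of the second, since the geometric means are 6 and 1.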

Multi-level registration algorithm implementation

This section details the implementation process of the multi-level registration algorithm, including the coarse registration module based on downsampling and feature weighting, the fine registration module based on improved ICP with corner-edge multi-feature fusion and dynamic weight adjustment, and mutual information optimization as three key stages. The algorithm flowchart of this framework is shown in Fig. 4 below. The core innovations lie in the hierarchical processing strategy and dynamic feature weight adjustment mechanism, which effectively address the challenging problems in bamboo slip image registration.

**Fig. 4: Flowchart of the registration algorithm.**

Coarse registration stage

We employ a downsampling strategy in the coarse registration stage, performing 1/2 downsampling on original images to reduce image detail interference while preserving main structural features, providing good initial transformation estimates for subsequent fine registration. Downsampling uses area averaging, which effectively suppresses text detail noise while preserving main structural information.
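
Area-averaging 1/2 downsampling replaces each 2 × 2 block of pixels with its mean. A minimal sketch on a grayscale image stored as a list of rows is given below; the function name `downsample_half` is an assumption, and odd trailing rows or columns are simply dropped.

```python
def downsample_half(img):
    """Halve width and height by averaging each non-overlapping 2x2 block."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]
```

In OpenCV the usual counterpart is `cv2.resize` with `interpolation=cv2.INTER_AREA`, although the source does not state which implementation was used. Averaging acts as a small low-pass filter, which is why thin strokes fade while the long slip contour survives.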

The coarse registration stage adopts an edge-feature-priority weight allocation strategy, with edge feature weights set to 5 and corner feature weights set to 3, using edge points as primary constraints with corner information as auxiliary support. This weight setting is based on two considerations: first, bamboo slip external contours remain relatively stable across modalities, whereas the loss of spatial resolution caused by 1/2 downsampling may shift corner point positions or make corners disappear; second, in the coarse registration stage, global alignment is more important than local details, and local registration is left to the subsequent fine registration stage.
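
The effect of the 5:3 weighting can be seen in a deliberately simplified setting. The sketch below estimates only a translation as the weighted mean displacement of the correspondences; it is a stand-in chosen for brevity, not the affine estimation used in this work, and the function name `weighted_translation` is an assumption.

```python
def weighted_translation(edge_pairs, corner_pairs, w_edge=5.0, w_corner=3.0):
    """Weighted mean displacement of (source, target) point pairs.

    Edge pairs count with weight w_edge and corner pairs with w_corner,
    so the estimate is pulled toward the edge correspondences.
    """
    sx = sy = total = 0.0
    for pairs, w in ((edge_pairs, w_edge), (corner_pairs, w_corner)):
        for (px, py), (qx, qy) in pairs:
            sx += w * (qx - px)
            sy += w * (qy - py)
            total += w
    return sx / total, sy / total
```

With one edge pair suggesting a shift of 2 pixels and one corner pair suggesting 10 pixels, the weighted estimate is (5 × 2 + 3 × 10) / 8 = 5 pixels, compared with 6 pixels for an unweighted mean, so a mismatched corner disturbs the result less.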

Visualization of matching point positions and connections in the coarse registration stage is shown in Fig. 5. For Fig. 5, we selected common cases in bamboo slips, comprising 6 image pairs, with the left image being infrared and the right being visible in each pair. Figure 5a represents bamboo slips with mild character degradation, where visible images show relatively clear text that serves as an excellent feature point source: character strokes, turns, and intersections are all detected as corner points, specifically high-intensity corner points. Since infrared images of bamboo slips are inherently intended to reveal degraded characters, characters appear particularly clear in infrared images. Therefore, in this type of bamboo slip image with mild character degradation, corner point accuracy is very high and registration results are good. In such images, whether edge point information dominates does not affect coarse registration results. Figure 5b, c represent bamboo slips with severe character degradation, where characters in visible images are faded, blurred, and difficult to recognize. In such images, extracting sufficient high-intensity corner points from visible images is challenging, while high-intensity corner points extracted from clear text structures in infrared images mostly exist as medium-intensity corner points in visible images. We therefore lowered the corner point extraction threshold to identify more potential corner points for matching. However, after lowering the threshold, not only potential corner points but also noise points may be extracted, thereby affecting corner point matching accuracy. In this situation, we incorporated bidirectional matching verification and geometric consistency verification in the algorithm to screen corner points and improve accuracy.
We also note that bamboo slip edge information, after downsampling, still maintains main edge structures, with direction information remaining relatively stable across different scales, exhibiting scale invariance. Moreover, long edges are less susceptible to noise interference than isolated corner points, providing edge information with stronger noise resistance. This aligns with the coarse registration stage’s focus on global-level registration of bamboo slip images, so we increased edge information weights in the coarse registration stage, allowing edge information to dominate. Figure 5d, e represent even more severe bamboo slip character degradation cases, where corner point matching remains difficult even after detecting medium-intensity corner points. As shown in Fig. 5d, e, the number of corner point matching pairs is clearly fewer than in Fig. 5a–c. In such cases, registration must rely even more on edge information. Figure 5f also shows bamboo slips with severe character degradation. In this image, we can clearly observe the decrease in corner point matching accuracy after medium-intensity corner point detection: the corner point matching shows obvious mismatches. However, edge point matching is correct, and good coarse registration can still be achieved when edge points dominate. The situations represented by these six image pairs support our choice of an edge-dominated, corner-assisted approach.

**Fig. 5: Visualization of feature point matching results in the coarse registration stage.**

After obtaining SIFT matching points and edge correspondence points, the algorithm uses a weighted affine transformation estimation algorithm to calculate the transformation matrix. The core approach is to repeat matching point pairs multiple times according to their weights. For example, for a point pair (p_i, q_i) with weight w_i, the system duplicates it ⌊w_i⌋ times, thereby granting high-weight points greater influence during transformation estimation. The RANSAC algorithm is used to estimate the affine transformation at this stage:

$$\left(\begin{array}{l}{x}^{{\prime} }\\ {y}^{{\prime} }\\ 1\end{array}\right)=\left(\begin{array}{rcl}{a}_{11} & {a}_{12} & {t}_{x}\\ {a}_{21} & {a}_{22} & {t}_{y}\\ 0 & 0 & 1\end{array}\right)\left(\begin{array}{l}x\\ y\\ 1\end{array}\right)$$

(1)

where the rotation angle θ and scaling factor s can be calculated using the following formulas:

$$\theta =\arctan 2({a}_{21},{a}_{11})\cdot \frac{180}{\pi }$$

(2)

$$s=\frac{\sqrt{{a}_{11}^{2}+{a}_{12}^{2}}+\sqrt{{a}_{21}^{2}+{a}_{22}^{2}}}{2}$$

(3)

To ensure transformation validity, the algorithm verifies the estimated transformation matrix by extracting and checking the rotation angle θ and scaling factor s. When abnormal parameters are detected, the system discards the current transformation and falls back to the initial identity transformation. This conservative strategy keeps the registration process stable: a failed coarse registration does not propagate into the subsequent fine registration and mutual information (MI) optimization stages, and the transformation matrix output by the coarse registration stage, even if not sufficiently precise, still provides a reliable initial value for fine registration. The parameter ranges are determined from actual imaging conditions. Since bamboo slips are typically elongated strips, and the captured multimodal bamboo slip images are intended for practical research use, bamboo slips are generally positioned vertically in both visible and infrared images, with the size ratio of bamboo slips between the two images typically not varying significantly.
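The parameter check described above can be illustrated with a minimal sketch of Eqs. (2) and (3) followed by the identity fallback. The function names and the numeric limits (`max_abs_theta`, `s_range`) are hypothetical placeholders, since the text does not list the exact ranges used in the coarse stage.

```python
import math

# Identity transform as a 2x3 affine matrix ((a11, a12, tx), (a21, a22, ty)).
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))


def rotation_and_scale(a11, a12, a21, a22):
    """Return (theta in degrees, s) following Eqs. (2) and (3)."""
    theta = math.degrees(math.atan2(a21, a11))
    s = (math.hypot(a11, a12) + math.hypot(a21, a22)) / 2.0
    return theta, s


def validate_or_identity(affine, max_abs_theta=5.0, s_range=(0.9, 1.1)):
    """Keep the estimate if theta and s look plausible, else fall back.

    The default limits are illustrative assumptions, not values from the paper.
    """
    (a11, a12, _), (a21, a22, _) = affine
    theta, s = rotation_and_scale(a11, a12, a21, a22)
    if abs(theta) > max_abs_theta or not (s_range[0] <= s <= s_range[1]):
        return IDENTITY
    return affine
```

Under these assumed limits, a pure 30° rotation would be rejected and replaced by the identity, while a 2% uniform scaling with no rotation would pass through unchanged.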

Fine registration stage

After obtaining coarse registration results, the transformation matrix is adjusted to the original image dimensions and applied to the original infrared image, yielding the coarsely registered image. The registered image and original visible image are then input into the fine registration stage. This stage aims to further refine alignment results, perform local registration, and improve overall image registration accuracy. The algorithm employs an improved Iterative Closest Point (ICP) algorithm, innovatively introducing a dynamic weight adjustment mechanism for corner-edge features.

In the fine registration stage, to obtain more precise feature points, corner and edge features are re-extracted using original resolution images. For corner feature extraction, the same method as in Section “Multi-Feature Detection and Weighted Matching” is applied but to images after coarse registration transformation, using adaptive parameters. For edge feature extraction, binary masks are first created and applied to ensure only valid image regions participate in edge detection. This ensures feature points used in the fine registration stage have the highest positional accuracy while avoiding interference from background regions in feature extraction.

The proposed ICP algorithm improvements are primarily reflected in two aspects: (1) Multi-feature fusion: Traditional 2D ICP algorithms are geometric registration methods based on original point correspondences, typically not relying on specific feature points. This causes classical ICP algorithms to struggle with weak-texture, weak-corner image registration, as weak-texture regions lack significant geometric variations, leading to error-prone nearest neighbor searches, while uniform point cloud distributions in smooth regions fail to provide effective registration constraints. Once sufficient constraint information is lacking, algorithms easily fall into local optima, resulting in poor final registration. Our proposed algorithm combines feature points, simultaneously considering edge contour points and corner points, establishing a unified feature framework for fusion. (2) Dynamic weight adjustment: As iterations progress, corner and edge feature weights change dynamically, achieving smooth transition from global alignment to local fine matching. The core iterative process can be expressed as:

$${T}_{k+1}=\Delta {T}_{k}\circ {T}_{k}$$

(4)

where T_k represents the global transformation after the k-th iteration, ΔT_k represents the local incremental transformation estimated at iteration k, and ∘ represents transformation composition operation.

After each iteration, the algorithm checks whether mutual information has improved. If improved, the update is accepted; otherwise, the previous best transformation is maintained. This mutual information-based evaluation mechanism ensures monotonic convergence of the registration process. In each iteration, we use a combination of edge and corner points to estimate the transformation matrix. The combination process can be expressed as:

$${P}_{{\text{combined}}}={P}_{{\text{sift}}}\cup {P}_{{\text{edge}}}$$

(5)

$${W}_{{\text{combined}}}={W}_{{\text{sift}}}\cup {W}_{{\text{edge}}}$$

(6)

where P_sift and P_edge represent coordinate sets of SIFT feature points and edge points respectively, and W_sift and W_edge represent corresponding weight sets. To achieve weighted transformation estimation, we adopt a point duplication strategy, duplicating corresponding points based on weight values, as shown in the following formula:

$${P}_{{\text{weighted}}}={\cup }_{i=1}^{n}{\cup }_{j=1}^{\lfloor {w}_{i}\rfloor }{p}_{i}$$

(7)

where p_i represents the i-th point’s coordinates, w_i represents its weight, and ⌊w_i⌋ represents the number of duplications after rounding down the weight. This point duplication strategy intuitively implements the weighting mechanism, with higher-weighted points having greater influence in transformation estimation. After weighting, the RANSAC method is used to estimate affine transformation.
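As an illustration of Eq. (7), the duplication step can be sketched in a few lines. The function name and the representation of a correspondence as a `(p, q)` tuple are assumptions made for this example; the RANSAC estimation that consumes the duplicated list is not shown.

```python
import math


def duplicate_by_weight(pairs, weights):
    """Repeat each matched pair floor(w) times, as in Eq. (7).

    pairs   : list of correspondences, e.g. ((x, y), (x2, y2))
    weights : one non-negative weight per correspondence
    """
    if len(pairs) != len(weights):
        raise ValueError("pairs and weights must have the same length")
    weighted = []
    for pair, w in zip(pairs, weights):
        # floor(w) copies; a weight below 1 therefore drops the pair entirely
        weighted.extend([pair] * math.floor(w))
    return weighted
```

One consequence of rounding down is that fractional weights are truncated: a weight of 11.5 contributes 11 copies, the same as a weight of 11.0.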

We innovatively introduced dynamic weight adjustment during iterations, implementing gradual increase in corner feature weights and gradual decrease in edge feature weights as iteration count increases through linear interpolation. The specific linear interpolation method we adopt is:

$${w}_{\text{edge}}(i)={w}_{\text{edge}}^{\text{initial}}\cdot \left(1-\frac{i}{N}\right)+{w}_{\text{edge}}^{\text{final}}\cdot \frac{i}{N}$$

(8)

$${w}_{\text{sift}}(i)={w}_{\text{sift}}^{\text{initial}}\cdot \left(1-\frac{i}{N}\right)+{w}_{\text{sift}}^{\text{final}}\cdot \frac{i}{N}$$

(9)

where i is the current iteration number, N is the maximum number of iterations, ${w}_{\text{edge}}^{\text{initial}}=5.0$, ${w}_{\text{edge}}^{\text{final}}=1.0$, ${w}_{\text{sift}}^{\text{initial}}=3.0$, ${w}_{\text{sift}}^{\text{final}}=20.0$.

This dynamic weight adjustment function ensures continuity and smoothness of weight changes, avoiding instability from abrupt transitions. In initial iterations (i = 0), edge weight is 5.0 and corner weight is 3.0; at final iteration (i = N), edge weight decreases to 1.0 while corner weight increases to 20.0.
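The weight schedule of Eqs. (8) and (9) can be sketched as follows, using the initial and final values stated above. The function names are illustrative only.

```python
def interpolate_weight(initial, final, i, n_iters):
    """Linear interpolation between initial and final weight, Eqs. (8)-(9)."""
    t = i / n_iters
    return initial * (1.0 - t) + final * t


def feature_weights(i, n_iters):
    """Return (w_edge, w_sift) at iteration i out of n_iters."""
    w_edge = interpolate_weight(5.0, 1.0, i, n_iters)
    w_sift = interpolate_weight(3.0, 20.0, i, n_iters)
    return w_edge, w_sift
```

With these endpoint values the two weights are equal at i/N = 2/21, so corner features already outweigh edge features after roughly the first tenth of the iterations. For example, assuming N = 50, `feature_weights(25, 50)` gives an edge weight of 3.0 and a corner weight of 11.5.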

We incorporated dynamic weight adjustment with increasing iterations in the fine registration stage because in early registration, after downsampling, edge information’s scale invariance and stronger noise resistance make edge point information more accurate and reliable in coarse registration. Corner points depend on pixel-level precise positioning, while bamboo slips suffer from character degradation with originally blurred or vanished characters in visible images, making correspondence with infrared image characters difficult. Combined with spatial resolution loss during downsampling potentially causing corner position shifts or disappearance, edge features are more important and precise in the coarse registration stage, helping quickly obtain rough but robust alignment.

In the fine registration stage, as images return to original size, corner points can achieve sub-pixel positioning, with all stroke intersections, turning points, endpoints, and other structural details reappearing. These potentially detectable corner points represent precise positions of local geometric structures, with corner matching’s high precision characteristics perfectly meeting fine registration requirements. Meanwhile, edge detection produces edge bands of certain width rather than precise point positions, potentially generating excessive edge pixels in high-resolution images. For fine adjustments required in precise registration, this uncertainty accumulates into larger errors. Therefore, compared to the coarse registration stage, corner points become particularly important in fine registration, needing to gradually dominate. Through this strategy, we ensure that in early fine registration, edge information’s global nature prevents ICP from converging to incorrect local optima; in mid-stage, balanced edge and corner weights achieve stable convergence direction; in late fine registration, corner dominance achieves final precise alignment. While avoiding local optimum traps, this achieves smooth optimization paths, reducing algorithm jumps between different feature types through coarse-to-fine continuous optimization trajectories, providing more stable convergence processes with strong adaptability to common character degradation, blurring, and contamination in bamboo slip images.

Visualization of local precise registration during the fine registration stage is shown in Fig. 6 below. We can observe that after completing the coarse registration stage (image iter0), although global registration has been achieved, registration effectiveness remains suboptimal at edges or local positions. This manifests in the red-green overlay images as red or green bands appearing at bamboo slip edge positions. However, with increasing iteration count, local positions of bamboo slips undergo fine registration. By iteration completion, registration results for all bamboo slip iter50 images have improved, with the originally red or green bands along edges substantially disappearing, transforming to normal yellow. This indicates that after the fine registration stage, registration precision and effectiveness have further improved.

**Fig. 6: Visualization of local precise registration during the fine registration stage.**

To prevent background regions from interfering with registration, we introduce a masking mechanism. The mask is a binary image where regions with value 1 represent valid bamboo slip areas and regions with value 0 represent background. Masks are continuously updated with transformations. Masks ensure only bamboo slip region features participate in registration, effectively avoiding background noise and interference. Particularly for bamboo slip images with irregular shapes and complex backgrounds, mask processing is crucial for improving registration performance.

To ensure reasonable transformation parameters for each iteration, we introduce transformation quality assessment and anomaly rollback mechanisms. For estimated transformation matrices, we extract corresponding scaling factors and rotation angles, then check whether these transformation parameters are within predefined reasonable ranges. If parameters exceed ranges, we roll back to the previous best transformation. This anomaly detection rollback mechanism greatly enhances algorithm robustness, avoiding unreasonable registration results from local optima. Particularly for bamboo slip images, local feature similarity may cause mismatching, producing unreasonable transformation parameters. Through parameter verification and rollback mechanisms, the algorithm avoids these errors, maintaining registration process stability.

Additionally, we introduce mutual information-based quality assessment, accepting new iteration results only when new transformations improve mutual information. Mutual information metrics as multimodal image registration measures can effectively assess registration quality, reliably reflecting actual registration conditions even when images have large visual appearance differences due to modal differences.
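The two acceptance gates described above (parameter plausibility and mutual information improvement) can be combined in a loop like the one below. This is a schematic sketch rather than the authors' implementation: `propose`, `score`, and `is_plausible` are hypothetical callables standing in for the weighted ICP step, the mutual information evaluation, and the rotation/scale range check.

```python
def refine_with_gates(initial, propose, score, is_plausible, n_iters):
    """Iteratively refine a transform, keeping only plausible improvements.

    propose(best, k)  -> candidate transform for iteration k
    score(transform)  -> similarity to maximize (mutual information here)
    is_plausible(t)   -> False triggers a rollback to the previous best
    """
    best = initial
    best_score = score(best)
    history = [best_score]
    for k in range(n_iters):
        candidate = propose(best, k)
        if is_plausible(candidate):
            candidate_score = score(candidate)
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
        # rejected candidates leave the previous best in place
        history.append(best_score)
    return best, best_score, history
```

Because a candidate replaces the current best only when its score is strictly higher, the recorded `history` can never decrease, which is the monotonic behavior referred to earlier in this section.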

Mutual information optimization

After completing fine registration, this work introduces mutual information optimization as the final refinement step to further improve registration accuracy. The introduction of mutual information optimization is based on the following considerations: ICP algorithms primarily rely on spatial correspondences of feature points and edge points, while mutual information is directly based on statistical dependencies of image gray-level distributions. For image pairs with different imaging mechanisms such as infrared and visible images, mutual information can capture deep statistical correlations that traditional similarity measures cannot identify, making it particularly suitable for handling registration problems with significant gray-level distribution differences. The calculation method for mutual information is as follows:

$$MI(X,Y)=H(X)+H(Y)-H(X,Y)$$

(10)

where H(X) and H(Y) are the marginal entropies of random variables X and Y respectively, and H(X, Y) is their joint entropy. In image registration, X and Y represent pixel gray values of the two images.

The core concept of mutual information is evaluating registration quality by analyzing the joint histogram of two images. When two images are perfectly aligned, corresponding pixels’ gray values exhibit the strongest statistical correlation, maximizing mutual information. Based on this principle, we use mutual information maximization as the optimization objective, iteratively adjusting transformation parameters to find optimal registration results. Compared to traditional pixel-difference-based metrics, mutual information has stronger adaptability to nonlinear gray-level relationships between images, effectively handling common issues in bamboo slip images such as illumination variations and contrast differences. This makes it particularly suitable as an evaluation metric for multimodal image registration in our bamboo slip visible and infrared image registration method.
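To make Eq. (10) concrete, the sketch below computes mutual information from a joint histogram of two flattened gray-level sequences. It uses only the standard library so that the steps stay visible; the bin count and the function name are assumptions for this example.

```python
import math
from collections import Counter


def mutual_information(x, y, bins=32, max_value=255):
    """MI(X, Y) = H(X) + H(Y) - H(X, Y) in bits, from a joint histogram.

    x, y : equal-length sequences of gray values in [0, max_value]
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    width = (max_value + 1) / bins
    bx = [min(int(v / width), bins - 1) for v in x]
    by = [min(int(v / width), bins - 1) for v in y]
    n = len(bx)

    def entropy(counts):
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    joint = Counter(zip(bx, by))  # joint histogram of binned gray values
    return entropy(Counter(bx)) + entropy(Counter(by)) - entropy(joint)
```

A small case shows why the measure suits multimodal pairs: for `x = [0, 0, 255, 255]`, the contrast-inverted signal `y = [255, 255, 0, 0]` yields 1 bit of mutual information, exactly as a perfect copy would, whereas the statistically unrelated `y = [0, 255, 0, 255]` yields 0 bits.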

To overcome the computational inefficiency limitations of traditional grid search methods, we designed a multi-stage hybrid optimization strategy to maximize mutual information and obtain optimal transformation matrices, as shown in the following formula. This strategy combines advantages of global search and local refinement, performing global exploration through multi-restart simulated annealing algorithms, then using gradient optimization for local refinement, finally achieving parameter convergence through fine-tuning optimization. This hierarchical optimization design ensures both global convergence and significantly improved computational efficiency.

$${T}^{* }=\arg \mathop{\max }\limits_{T}MI({I}_{ref},T\circ {I}_{moving})$$

(11)

where T^* represents the optimal transformation matrix, I_ref is the reference image, I_moving is the image to be registered, and ∘ denotes transformation operation.

We adopt a multi-restart strategy because it is an effective method for preventing optimization algorithms from falling into local optima. Considering that single simulated annealing runs may be influenced by initial point selection, we employ a five-restart parallel search strategy. First, the algorithm uses the transformation matrix obtained from fine registration as the baseline starting point, which is typically already near the global optimum’s neighborhood. Subsequently, the algorithm generates four additional starting points around the baseline, each generated by adding controlled random perturbations to the baseline transformation. The simulated annealing formula for the multi-restart strategy is:

$${T}_{i}={T}_{{\text{base}}}+\Delta {T}_{i}$$

(12)

where T_base is the baseline transformation matrix from ICP registration, and ΔT_i is the random perturbation matrix for the i-th starting point.

At this stage, bamboo slip images entering mutual information optimization have undergone coarse and fine registration stages. Assuming no registration failure, we consider the visible and infrared images of bamboo slips to have achieved approximate registration. In this case, perturbations should not be too large. However, considering potential registration errors in previous stages leading to larger deviations or situations where parameter detection anomaly rollback mechanisms revert to initial positions, we carefully designed the starting point perturbation strategy to provide sufficient search diversity while not deviating from reasonable transformation ranges, aiming to complete registration for these bamboo slips in the final algorithm stage. Rotation perturbations are limited to ± 5°, covering most angular deviations in bamboo slip images. Scaling perturbations are controlled within 10%, compensating for possible scale differences while avoiding excessive deformation. Translation perturbations are set to 15 pixels, sufficient to handle positional offsets between images while maintaining computational stability.
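A minimal sketch of generating the five starting points under the stated perturbation limits is given below. Several choices here are assumptions, not details from the paper: the transform is represented as a parameter tuple `(theta_deg, scale, tx, ty)` rather than as the matrix of Eq. (12), perturbations are drawn uniformly, the 10% scale limit is applied multiplicatively, and a fixed seed is used for reproducibility.

```python
import random


def make_start_points(base, n_extra=4, seed=0,
                      max_rot_deg=5.0, max_scale_frac=0.10, max_shift_px=15.0):
    """Return the baseline plus n_extra randomly perturbed starting points."""
    rng = random.Random(seed)
    theta, scale, tx, ty = base
    starts = [base]  # the fine-registration result is always tried first
    for _ in range(n_extra):
        starts.append((
            theta + rng.uniform(-max_rot_deg, max_rot_deg),
            scale * (1.0 + rng.uniform(-max_scale_frac, max_scale_frac)),
            tx + rng.uniform(-max_shift_px, max_shift_px),
            ty + rng.uniform(-max_shift_px, max_shift_px),
        ))
    return starts
```

Each returned tuple would then seed one simulated annealing run; keeping the unperturbed baseline as the first entry means the search can never start from a worse set of candidates than the fine-registration result alone.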

Our introduced adaptive cooling mechanism is one of the core improvements to traditional simulated annealing algorithms. Traditional fixed cooling rates struggle to balance global search and local convergence needs, so we introduce a dynamic cooling strategy based on acceptance rate feedback. The algorithm maintains a sliding window to monitor candidate solution acceptance rates in real-time. When acceptance rates are too high, indicating potentially excessive temperature, the algorithm accelerates cooling to promote convergence; when acceptance rates are too low, it slows cooling to maintain sufficient search capability. The target acceptance rate is set at 30%, a value proven effective in theory and practice for balancing exploration and exploitation. Acceptance rate calculation uses a sliding window method:

$${r}_{k}=\frac{1}{W}\mathop{\sum }\limits_{i=k-W+1}^{k}{a}_{i}$$

(13)

where W = 50 is the sliding window size, and a_i ∈ {0, 1} indicates whether the candidate solution was accepted at iteration i.

The neighborhood solution generation strategy employs temperature-dependent perturbation strength control. The cooling rate is adjusted relative to the target acceptance rate of 0.3: when the acceptance rate exceeds 1.5 times the target value (above 0.45), a fast cooling rate of 0.98 is used; when it falls below 0.5 times the target value (below 0.15), a slow cooling rate of 0.995 is used; otherwise the standard cooling rate of 0.99 is used.
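The acceptance-rate feedback of Eq. (13) and the three cooling rates can be sketched as below. The class and function names are illustrative, and the behavior before the window has filled (averaging over the flags seen so far, and returning the target rate when empty) is an assumption, since Eq. (13) is only defined for a full window.

```python
from collections import deque

TARGET_RATE = 0.3   # target acceptance rate from the text
WINDOW = 50         # sliding window size W


def cooling_rate(acceptance_rate, target=TARGET_RATE):
    """Pick the multiplicative cooling factor from the acceptance rate."""
    if acceptance_rate > 1.5 * target:
        return 0.98    # accepting too often: cool faster
    if acceptance_rate < 0.5 * target:
        return 0.995   # accepting too rarely: cool more slowly
    return 0.99        # standard cooling


class AcceptanceTracker:
    """Sliding-window acceptance rate r_k, as in Eq. (13)."""

    def __init__(self, window=WINDOW):
        self.flags = deque(maxlen=window)

    def record(self, accepted):
        self.flags.append(1 if accepted else 0)

    def rate(self):
        if not self.flags:
            return TARGET_RATE  # assumption: neutral value before any data
        return sum(self.flags) / len(self.flags)
```

In an annealing loop, each step would call `record(...)` and then update the temperature as `temperature *= cooling_rate(tracker.rate())`.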

After the simulated annealing phase, the algorithm selects the three best-performing candidate solutions for gradient refinement. We select multiple candidates rather than just the optimal solution because simulated annealing’s randomness may result in other promising regions near the optimal solution; refining multiple candidates enables more comprehensive exploration of the solution space.

During gradient optimization, we use central finite differences to calculate numerical gradients, which provide higher accuracy than forward differences. For a 2 × 3 affine transformation matrix, the algorithm needs to compute gradients for six parameters, each requiring two mutual information evaluations. Perturbation step size selection is crucial: too small a step may cause numerical instability, while too large a step degrades gradient estimation accuracy. We use 10⁻⁴ as the perturbation step size, which achieves a good balance between computational precision and numerical stability.

$$\frac{\partial {\text{MI}}}{\partial {T}_{ij}}=\frac{{\text{MI}}(T+\epsilon {e}_{ij})-{\text{MI}}(T-\epsilon {e}_{ij})}{2\epsilon }$$

(14)

where T_ij represents the (i, j) element of the transformation matrix, e_ij is the matrix of the same size with 1 at position (i, j) and 0 elsewhere, and ε is the perturbation step size.
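Eq. (14) can be sketched for a 2 × 3 matrix stored as nested lists. The `objective` argument is a hypothetical callable standing in for the mutual information evaluation of a warped image pair.

```python
def numerical_gradient(objective, T, eps=1e-4):
    """Central-difference gradient of objective with respect to a 2x3 matrix.

    Uses two objective evaluations per parameter, twelve in total.
    """
    grad = [[0.0] * 3 for _ in range(2)]
    for i in range(2):
        for j in range(3):
            plus = [row[:] for row in T]
            minus = [row[:] for row in T]
            plus[i][j] += eps    # T + eps * e_ij
            minus[i][j] -= eps   # T - eps * e_ij
            grad[i][j] = (objective(plus) - objective(minus)) / (2.0 * eps)
    return grad
```

The truncation error of the central difference is of order ε², compared with order ε for a forward difference, at the cost of one extra evaluation per parameter.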

During gradient update optimization, traditional fixed learning rates struggle to adapt to objective function variations during optimization, so we designed a strategy based on improvement history and no-improvement counts for dynamic adjustment. The algorithm maintains an improvement history window, calculating average improvement and improvement trends over recent iterations. When average improvement is positive and trend is rising, learning rate multiplies by 1.05 (capped at 0.1); when average improvement is non-positive or consecutive no-improvement count exceeds 5, learning rate multiplies by 0.8; when improvement trend decreases, learning rate multiplies by 0.9. Learning rates are strictly limited to ensure numerical stability:

$${T}_{k+1}={T}_{k}+{\alpha }_{k}\nabla {\text{MI}}({T}_{k})$$

(15)
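One possible reading of the learning-rate rules described before Eq. (15) is sketched below. The text does not define the trend measure, the priority among the rules, or a lower bound, so the following are assumptions: the trend is the last minus the first entry of the improvement window, the decay rule for non-positive average improvement or more than 5 stalled steps is checked first, and `min_lr` is a hypothetical floor.

```python
def update_learning_rate(lr, history, no_improve_count,
                         max_lr=0.1, min_lr=1e-6):
    """Adjust the step size alpha_k from recent improvements in MI.

    history          : recent per-iteration MI improvements (oldest first)
    no_improve_count : consecutive iterations without improvement
    """
    if not history:
        return lr
    average = sum(history) / len(history)
    trend = history[-1] - history[0] if len(history) > 1 else 0.0
    if average <= 0 or no_improve_count > 5:
        lr *= 0.8     # stalled or getting worse: shrink strongly
    elif trend > 0:
        lr *= 1.05    # improving and accelerating: grow slightly
    elif trend < 0:
        lr *= 0.9     # improving but slowing down: shrink mildly
    return min(max(lr, min_lr), max_lr)
```

The returned value would be used as α_k in the update of Eq. (15); the cap at 0.1 follows the text, while the floor only prevents the step from collapsing to zero.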

Throughout gradient optimization, we also introduce transformation parameter validity checks. After each parameter update, the algorithm checks whether the new transformation matrix satisfies geometric constraints. If new transformation parameters exceed reasonable ranges, the algorithm automatically rolls back to the previous valid state and retries with reduced learning rate. This protection mechanism ensures optimization always proceeds within physically reasonable parameter space, avoiding unrealistic results from numerical optimization.

After gradient optimization, the algorithm enters the final local fine-tuning stage. This stage aims to make fine adjustments based on already-found excellent solutions, further improving registration accuracy. Fine-tuning search range is strictly controlled within 0.5 units: for rotation parameters, this means 0.5-degree angular adjustments; for scaling parameters, 0.5% ratio adjustments; for translation parameters, 0.5-pixel position adjustments. Fine-tuning optimization uses random search rather than deterministic grid search, more effectively exploring parameter space with limited computational resources. In each iteration, the algorithm randomly generates a candidate solution within the current optimal solution’s neighborhood; if the candidate has higher mutual information, it is accepted and updates the current optimal solution. Random search advantages include avoiding optimal points that grid search might miss, particularly when optimal solutions are not on grid points.
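The fine-tuning step can be sketched as a simple accept-if-better random search. As assumptions for this example, the transform is a parameter tuple `(theta_deg, scale, tx, ty)`, the 0.5% scale limit is expressed as 0.005 on a unitless scale factor, candidates are drawn uniformly around the current best, and `objective` stands in for the mutual information evaluation.

```python
import random


def fine_tune(start, objective, n_iters=200,
              radii=(0.5, 0.005, 0.5, 0.5), seed=0):
    """Random local search: keep a candidate only if it scores higher.

    radii gives the per-parameter search half-width:
    0.5 degrees, 0.5 % scale, 0.5 px in x and y.
    """
    rng = random.Random(seed)
    best = tuple(start)
    best_score = objective(best)
    for _ in range(n_iters):
        candidate = tuple(v + rng.uniform(-r, r) for v, r in zip(best, radii))
        score = objective(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Because candidates are sampled continuously rather than on a lattice, the search is not restricted to multiples of a grid spacing, which is the advantage over grid search noted above. Note that sampling around the current best lets the result drift further than 0.5 units from the starting point over many accepted steps; sampling around the fixed starting point instead would enforce a hard bound.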

To ensure physical reasonableness and practicality of registration results, we establish strict transformation parameter constraint mechanisms. These constraints are formulated based on actual characteristics and imaging conditions of bamboo slip images, ensuring registration result credibility while preventing optimization algorithms from producing unrealistic transformation parameters. First, rotation angle constraints are based on actual deviations possible during bamboo slip imaging. In actual bamboo slip digitization, even with professional imaging equipment, small angular deviations may exist between infrared and visible cameras. Statistical analysis of numerous bamboo slip image pairs reveals that angular deviations are within 5 degrees in most cases, so rotation angle constraints are set to absolute values not exceeding 5 degrees. This constraint covers normal imaging deviations while excluding obviously unreasonable rotation transformations. Second, scaling factor constraints consider effects of camera calibration errors and lens distortion. Theoretically, when using infrared and visible cameras with identical focal lengths to image the same bamboo slip, the two images should have identical scales. However, slight scale differences may exist in practice, mainly due to minor camera calibration errors, nonlinear lens distortion, and subtle differences in imaging distance. Finally, translation parameters have greater tolerance relative to rotation and scaling parameters, as bamboo slip placement may differ between two imaging sessions. However, excessively large translations typically indicate the registration algorithm may have fallen into incorrect local optima. Therefore, while translation parameters have no rigid numerical constraints, the algorithm judges translation reasonableness through geometric consistency verification. 
Anomaly detection mechanisms are important supplements to parameter constraints, checking not only individual parameter reasonableness but also overall consistency of parameter combinations.

Our rollback mechanism design fully considers algorithm robustness requirements. When detecting abnormal parameters, the algorithm doesn’t simply restart optimization but rolls back to the most recent reliable state. Within the multi-level registration framework, this typically means rolling back to reasonable results from the previous iteration or registration stage. This progressive rollback strategy ensures algorithm stability while maximally preserving early optimization achievements.

Character enhancement and visualization optimization

After completing the multi-level registration, to better display and analyze the ancient character content in bamboo slips, we propose an optimization method for character enhancement and visualization based on registration results. The flowchart of the method is shown in Fig. 7 below, which realizes the extraction of characters from the registered infrared images and the enhancement and optimized display of characters on the visible images of bamboo slips.

**Fig. 7: Flowchart of the fusion module.**

First, we preprocess the original visible image and the registered infrared image, enhancing the contrast between characters and background in the infrared image through Contrast Limited Adaptive Histogram Equalization (CLAHE). CLAHE enhances contrast within local regions while avoiding the noise amplification caused by excessive enhancement. Subsequently, we perform adaptive threshold segmentation on the contrast-enhanced infrared image, using Gaussian-weighted adaptive thresholding for character segmentation. The adaptive threshold segmentation formula is as follows:

$$T(x,y)=\frac{1}{| N| }\mathop{\sum }\limits_{(i,j)\in N}{G}_{\sigma }(i-x,j-y)\cdot I(i,j)-C$$

(16)

where N is a 25 × 25 neighborhood centered at (x, y), G_σ is the Gaussian weighting function, and C = 15 is a constant offset.

After completing adaptive threshold segmentation, we perform morphological optimization on the segmented character information, optimizing character completeness and continuity through morphological operations. We use opening and closing operations to connect incomplete broken strokes from segmentation extraction while removing extracted noise information. Subsequently, we convert the stroke-connected character image from three-channel RGB format to four-channel RGBA format, making the background transparent for subsequent operations. During this process, the algorithm inevitably identifies certain dark background regions in infrared images, uncleaned areas on bamboo slips, or image noise as characters, applying contrast enhancement and segmentation extraction to these portions. These are actually noise information, typically existing as isolated black dots or small black regions, differing from coherent and large-area character regions. We need to suppress this noise. We perform connected component analysis on converted black regions using area-based connected component filtering, removing noise regions with insufficient area. Using eight-connectivity labeling algorithm, we calculate the area for each connected component in the labeled image. Components with insufficient area are replaced with colors close to the background rather than completely removed, maintaining visual continuity. Subsequently, we also convert the visible image to four-channel RGBA format, but unlike the infrared image, the visible image only changes format to facilitate subsequent operations between the two images without altering image content. For the morphologically optimized RGBA format character image, we reassign colors to character regions based on alpha channel information. 
Since the characters to be enhanced are black, and black is represented as RGB values (0,0,0) in RGB color representation, black information cannot be obtained through addition operations but only through subtraction to achieve black character reproduction and enhancement. Therefore, this work employs difference fusion to highlight character information, changing extracted characters to the bamboo slip background color. During subtraction, locations on bamboo slips where characters originally existed but displayed bamboo slip background color due to degradation will change back to black. This method effectively subtracts character images from background images, highlighting text content while maintaining background information integrity. Additionally, since images have been registered, we modify the registered infrared image dimensions according to visible image dimensions, ensuring both images have identical dimensions before inputting to the character enhancement module for difference fusion processing. We then define a color matching function to restore colors modified during difference fusion for portions of the original visible image where characters are not degraded or mildly degraded with still-visible characters. These originally black characters have their colors changed during difference fusion, becoming colors different from both background and black within a certain range. We identify this color range and convert it back to black. During this process, we restrict bamboo slip background colors and image background colors from participating in color conversion, ensuring background authenticity. Ultimately, characters on bamboo slips reappear, completing enhancement of degraded bamboo slip characters. 
The character enhancement method plays an important role in bamboo slip image digitization and research, providing clear character display for bamboo slip studies, enhancing the research value and practicality of multimodal bamboo slip images, laying foundations for subsequent character recognition and content analysis, and providing new approaches for digital preservation and cultural heritage transmission of bamboo slips. This method forms a complete technical system with the aforementioned multi-level registration algorithm, from image alignment to content enhancement, providing comprehensive technical support for bamboo slip digitization research.
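The area-based connected component filtering used for noise suppression in this pipeline can be sketched as follows. The function name and the `min_area` parameter are hypothetical, as the text does not give the area threshold. The sketch operates on a binary mask (1 = candidate character pixel) and clears components that are too small; in the method described above those pixels are recolored close to the background rather than deleted, which a later step could do using the cleared positions.

```python
from collections import deque


def filter_small_components(mask, min_area):
    """Clear 8-connected components of 1-pixels smaller than min_area."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    out = [row[:] for row in mask]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # breadth-first flood fill over the 8-neighborhood
            queue = deque([(y, x)])
            seen[y][x] = True
            component = []
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) < min_area:
                for cy, cx in component:
                    out[cy][cx] = 0  # mark as noise for later recoloring
    return out
```

Eight-connectivity matters for brush strokes: pixels that touch only diagonally are still counted as one component, so thin slanted strokes are less likely to be split into small fragments and discarded.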

Results

In this section, we validate the effectiveness of the proposed bamboo slip infrared-visible image registration algorithm through a series of experiments. First, we introduce the constructed bamboo slip infrared-visible image dataset and its creation process. Subsequently, we elaborate on the evaluation metrics and verify the superiority of our method through comparative experiments with existing methods.

Jiandu infrared-visible image dataset

To evaluate the effectiveness of the proposed method in bamboo slip image registration, we employed a dataset of infrared-visible image pairs of bamboo slips. This dataset comprises image pairs from significant excavated documents, predominantly consisting of Han Dynasty wooden slips from Diwan, providing a reliable foundation for research in the field of bamboo slip image registration.

The dataset contains over 840 pairs of high-quality unregistered images. The corresponding image pairs contain identical content but exhibit variations in displacement, rotation, and scale.

Comparative experiments

We compared the proposed algorithm with other multimodal image registration algorithms, including traditional non-deep learning methods: MI⁵⁰, DASC⁵¹,⁵², ICP³⁶, CAO-C2F⁵³, KAZE-SAR⁵⁴, SIFT³², AKAZE⁵⁵, as well as deep learning-based methods including the multimodal image registration module from BSAFusion⁵⁶, NeMAR⁴⁶, and VoxelMorph⁴⁵. For NeMAR and VoxelMorph, we retrained the models using our dataset, with NeMAR trained for 800 epochs and VoxelMorph trained for 1500 epochs.

Red-green overlay images in 2D image registration are important visualization tools for evaluating registration effectiveness and are among the commonly used methods for visualizing 2D registration algorithms. The basic principle involves displaying the fixed image in the red channel and the registered moving image in the green channel, with registration quality assessed by observing their overlap. When two images are perfectly aligned, overlapping regions appear yellow, as the superposition of red and green light produces a yellow effect.
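
As an illustration of this principle, a red-green overlay can be built in a few lines. This is a minimal sketch assuming two grayscale images of identical size; the function name `red_green_overlay` is hypothetical.

```python
import numpy as np

def red_green_overlay(fixed_gray, moving_gray):
    """Place the fixed image in the red channel and the registered moving
    image in the green channel; aligned bright structures appear yellow."""
    if fixed_gray.shape != moving_gray.shape:
        raise ValueError("images must have identical dimensions")
    overlay = np.zeros(fixed_gray.shape + (3,), dtype=np.uint8)
    overlay[..., 0] = fixed_gray   # red channel: fixed image
    overlay[..., 1] = moving_gray  # green channel: registered moving image
    return overlay                 # blue channel stays zero
```

A structure present at the same location in both inputs yields equal red and green values (yellow), whereas a displaced structure leaves a red trace at its position in the fixed image and a green trace at its position in the moving image.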

Different manifestations in red-green overlay images reflect different types of registration errors. Ideal registration results show most regions appearing yellow or near-yellow, indicating perfect feature overlap between the two images, successful identification of correct spatial transformation by the registration algorithm, and effective correction of geometric distortions between images. When translation errors exist, the overall image exhibits red-green separation “ghosting” effects, indicating overall displacement deviation with insufficiently accurate translation parameters in X or Y directions. Rotation errors typically manifest as arc-shaped red-green separation at image edges, indicating angular deviation between images with inaccurate rotation center or angle estimation, with errors more pronounced in peripheral image regions. Scaling errors usually present as good alignment at image centers but gradual red-green separation toward edges, indicating inconsistent scaling ratios between images. More complex cases involve local deformation errors, manifesting as good yellow alignment in some regions while red-green separation in others, typically indicating nonlinear distortions requiring more complex transformation models. The most severe case is complete registration failure, presenting large areas of pure red and green regions with almost no yellow overlap, indicating the registration algorithm failed to find correct correspondences, possibly due to insufficient generalization of the registration algorithm on bamboo slip images or insufficient feature points. Good registration results have clear visual criteria in red-green overlay images.
Excellent registration performance includes complete overlap of main structures with boundary contours showing clear yellow, aligned detail features with accurate overlap of fine features like textures and corner points, clear edges without obvious red-green “fringing” effects, and overall consistency with registration quality maintained throughout the entire image. Different image registration application domains have varying requirements for registration accuracy. Medical imaging requires extremely high precision with perfect overlap of main anatomical structures; remote sensing image registration allows certain errors but landmarks and boundaries should be clearly aligned; industrial inspection requires precise registration of critical defect regions; while in bamboo slip image registration applications, precise registration of text regions on bamboo slips is most important. Text recorded on bamboo slips not only serves as historical “testimony” but also carries rich information about culture, politics, economics, religion, and even language evolution. Therefore, text on bamboo slips is the most important component, and precise text region registration is essential for subsequent enhancement of degraded bamboo slip characters. Visualization results of registration effects on bamboo slips for our proposed method versus comparison experiments, along with original visible and infrared images of bamboo slips, are shown in Fig. 8 below.

**Fig. 8: Visualization of registration results from comparative experiments.**

The precise registration of infrared and visible images in the Diwan Han bamboo slip dataset faces a series of unique and complex technical challenges, primarily stemming from the physical characteristics of bamboo slip artifacts and differences in imaging conditions.

Multimodal imaging differences constitute the most fundamental challenge in the registration process. Infrared and visible imaging employ entirely different spectral ranges and imaging principles, causing the same bamboo slip to present distinctly different visual features under the two modalities. Surface details clearly visible in visible images, such as wood texture, stain distribution, and color variations, may completely disappear or manifest as different gray-scale patterns in infrared images. Conversely, deep ink trace information revealed by infrared imaging’s ability to penetrate surface contamination layers may be unobservable in visible images. This fundamental difference in feature representation makes it difficult for traditional feature point matching-based registration algorithms to find sufficient reliable correspondence points, severely affecting registration accuracy and stability. Differences in content correlation between infrared and visible images make registration challenging. In severely faded bamboo slips, visible images may show almost no text content, while infrared images clearly display textual information. In such cases, fundamental differences exist in the information content of the two images, causing complete failure of traditional image similarity-based registration methods. Even when text portions are visible, contrast and clarity of text in the two imaging modalities may differ significantly, with different edge definitions and stroke thickness representations that interfere with edge or gradient information-based registration algorithms. Furthermore, weak texture features of bamboo slip images further exacerbate registration difficulty. After thousands of years of burial, bamboo slip surfaces contain extremely sparse texture information, lacking obvious corner points, edges, and other geometric features. 
Traditional registration algorithms like SIFT and SURF typically rely on rich local feature descriptors to establish correspondences between images, but in bamboo slip images, these algorithms often cannot extract sufficient stable feature points. Even when some feature points can be detected, due to texture monotony and repetitiveness, these feature points have low discriminability and are prone to mismatching. Natural textures of wooden surfaces become blurred after prolonged weathering, causing substantial attenuation of texture direction and frequency information that could otherwise serve as registration references.

The ICP algorithm demonstrates relatively good registration performance on most image pairs, with red-green overlap regions primarily appearing yellow, indicating relatively accurate spatial alignment. However, misalignment remains in certain complex regions and along bamboo slip contours: in the fifth image pair shown in Fig. 8 (second column, tenth row), green clearly appears at the contour edges with slight red-green separation. The DASC algorithm is relatively unstable, achieving good registration on some image pairs with extensive yellow overlap regions but showing obvious registration deviations on others, particularly the noticeable red-green separation in certain images in the eighth and tenth rows. The MI algorithm performs poorly in most cases, with most regions of the registered images appearing green. The CAO-C2F algorithm, which relies on strong feature points, and the AKAZE algorithm show unstable registration on weak-texture bamboo slip images, achieving good spatial alignment, with yellow overlap regions dominating, on only some image pairs. The KAZE-SAR algorithm demonstrates relatively good robustness, maintaining basic registration accuracy across the test image pairs; while not optimal, its registration failures are relatively rare. The SIFT algorithm, a classic feature matching algorithm that relies on corner features, performs well on visible image pairs with obvious characters, but its registration performance deteriorates significantly when severe text degradation leaves insufficient high-intensity feature points in the visible images.

We evaluated VoxelMorph, NeMAR, and BSAFusion as deep learning-based comparative baseline models for registration tasks. Given that bamboo slip images exhibit extreme dimensional heterogeneity, varying from 100 × 200 to 300 × 3000 pixels, and these model architectures require fixed input dimensions, we initially experimented with 256 × 2048 resolution to better retain the elongated structural characteristics of bamboo slips. Nevertheless, training collapsed with NaN loss during early epochs, completing only 13.8% of the intended training iterations. We subsequently adopted 256 × 256 resolution based on the original input specifications of the respective models, employing proportional scaling with padding for preprocessing. At this reduced resolution, however, the irrecoverable information loss rendered high-precision registration unattainable.
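
The effect of proportional scaling with padding on an elongated slip can be made concrete with a small sketch. The function `letterbox`, its nearest-neighbor resampling, and the centered zero padding are assumptions chosen to keep the example dependency-free; the baselines' actual preprocessing may differ.

```python
import numpy as np

def letterbox(image, size=256, pad_value=0):
    """Scale the longer side to `size` (aspect ratio preserved), then pad
    the shorter side symmetrically. Nearest-neighbor resampling only."""
    h, w = image.shape[:2]
    scale = size / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))

    # Map each output row/column back to its nearest source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image[rows][:, cols]

    # Center the resized slip on a square canvas filled with pad_value.
    canvas = np.full((size, size) + image.shape[2:], pad_value, dtype=image.dtype)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas, scale
```

For a slip that is 300 pixels wide and 3000 pixels long, the scale factor is 256/3000 ≈ 0.085, so the slip occupies a strip only about 26 pixels wide in the 256 × 256 input, which illustrates why fine stroke detail is lost at this resolution.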

Finally, our proposed method demonstrates excellent registration performance across all test image pairs, with red-green overlap regions appearing almost entirely yellow with minimal obvious red-green separation, indicating the algorithm’s superiority over other comparison algorithms in both registration accuracy and stability. Particularly noteworthy is that even in regions with complex textures and low contrast, the algorithm maintains high-precision registration effects, demonstrating its robustness in handling various image conditions.

After algorithm registration, feature points in infrared images should align with corresponding feature points in visible images. Therefore, we measure Euclidean distances between transformed source points and target points to quantify actual registration algorithm performance. To comprehensively evaluate the performance of our proposed multi-level image registration method, we designed an evaluation metric system encompassing multiple dimensions. For information theory aspects, we employ Mutual Information (MI)²⁶ and Normalized Mutual Information (NMI)²⁹ to measure statistical dependencies between two images. For distance measurements, we calculate distance metrics from three dimensions, including Root Mean Square Error (RMSE), Mean Absolute Error (MAE)⁵⁷, and Median Absolute Error (MEE), to evaluate pixel-level registration accuracy. Additionally, we employ Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM)⁵⁸ to measure image-level similarity between transformed visible and near-infrared images. We also introduce metrics such as Normalized Cross-Correlation (NCC), Correlation Coefficient (CC), and Gradient Mutual Information (GMI)⁵⁹ to evaluate inter-image correlation and edge structure preservation capability.
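
A minimal sketch of how the distance and information-theoretic metrics can be computed is given below. The function names, the 3 × 3 homogeneous transform convention, and the histogram bin count are assumptions made for illustration; NMI follows the overlap-invariant form (H(A) + H(B)) / H(A, B), and the exact definitions used for Tables 1 and 2 may differ.

```python
import numpy as np

def point_errors(src_pts, dst_pts, transform):
    """RMSE, MAE, and median error of Euclidean distances between
    transformed source points and target points.
    src_pts, dst_pts: (N, 2) arrays; transform: 3x3 homogeneous matrix."""
    src_h = np.hstack([src_pts, np.ones((len(src_pts), 1))])
    mapped = src_h @ transform.T
    mapped = mapped[:, :2] / mapped[:, 2:3]
    d = np.linalg.norm(mapped - dst_pts, axis=1)
    return {
        "RMSE": float(np.sqrt(np.mean(d ** 2))),
        "MAE": float(np.mean(d)),
        "MEE": float(np.median(d)),
    }

def _entropy(p):
    """Shannon entropy in bits of a probability array (zeros ignored)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(img_a, img_b, bins=32):
    """Return (MI, NMI) estimated from the joint intensity histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    h_a = _entropy(p_ab.sum(axis=1))
    h_b = _entropy(p_ab.sum(axis=0))
    h_ab = _entropy(p_ab)
    mi = h_a + h_b - h_ab
    # NMI is undefined when the joint entropy is zero (constant images).
    nmi = (h_a + h_b) / h_ab if h_ab > 0 else float("nan")
    return mi, nmi
```

Higher MI and NMI indicate stronger statistical dependence between the registered images, while lower RMSE, MAE, and MEE indicate smaller residual point displacement.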

The test set consists of 150 infrared-visible image pairs of bamboo slips, spanning three levels of degradation: 56 pairs (37.3%) with mild degradation, where character strokes are clearly visible with intact structural features, and both modalities yield abundant extractable feature points; 43 pairs (28.7%) with moderate degradation, where strokes are partially blurred or exhibit localized damage, resulting in a reduced number of feature points in visible images; and 51 pairs (34.0%) with severe degradation, where character strokes in visible images are heavily faded or imperceptible, yielding extremely limited extractable feature points. The balanced distribution across these categories ensures comprehensive coverage of registration scenarios with varying difficulty levels. All images were sourced from the Bamboo and Wooden Slips Academic Resources Sharing Platform using standardized imaging equipment and acquisition protocols. All evaluation metrics were computed on this test set, with means and standard deviations reported in Tables 1 and 2.

Table 1 Comparison of registration methods across multiple evaluation metrics (Part 1: MI to MEE)

Table 2 Comparison of registration methods across multiple evaluation metrics (Part 2: SSIM to CC)

Our method achieves the best performance on most key metrics. Among the information-theoretic metrics, its Mutual Information (MI) significantly outperforms that of the second-best method, KAZE-SAR, and its Normalized Mutual Information (NMI) is also the best, showing marked improvement over KAZE-SAR. Among the distance metrics, Mean Absolute Error (MAE) and Median Absolute Error (MEE) are both the best among all compared methods; Root Mean Square Error (RMSE) ranks slightly behind the MI and DASC algorithms but ahead of the remaining algorithms. In image quality assessment, the Structural Similarity Index (SSIM) of our method significantly outperforms that of the other methods, reflecting the perceptual quality advantages of the registered images. Correlation Coefficient (CC) also achieves the best performance. Gradient Mutual Information (GMI) performs well, reflecting the effectiveness of our edge feature fusion strategy in maintaining edge structures. Overall, our method shows superior comprehensive performance.

Ablation study

In our proposed registration algorithm, the fine registration module represents one of the core innovations of our approach. It improves on the original ICP algorithm by incorporating a multi-feature fusion strategy over corner and edge points, a dynamic weight adjustment strategy, and a mutual information-based optimization method. The fine registration stage enables the algorithm to follow smooth optimization paths while avoiding local optimum traps. Through dynamic adjustment of feature weights, it achieves continuous optimization trajectories, reduces abrupt transitions between different feature types, provides more stable convergence, and demonstrates strong adaptability to common issues in bamboo slip images such as character degradation, blurring, and contamination.
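
The idea of shifting weight from edge points to corner points during ICP iterations can be sketched as follows. This is an illustrative simplification under stated assumptions, not the published algorithm: the linear schedule and its endpoints (`w_start`, `w_end`), the rigid 2D motion model, the brute-force nearest-neighbor search, and all function names are hypothetical, and the mutual information optimization stage is omitted.

```python
import numpy as np

def feature_weights(iteration, max_iter, w_start=0.2, w_end=0.8):
    """Assumed linear schedule: the corner weight rises from w_start to
    w_end, so early iterations are edge-dominant, late ones corner-dominant."""
    t = iteration / max(max_iter - 1, 1)
    w_corner = w_start + (w_end - w_start) * t
    return w_corner, 1.0 - w_corner

def weighted_rigid_fit(src, dst, weights):
    """Weighted least-squares rotation R and translation t (Kabsch)
    such that dst is approximated by src @ R.T + t."""
    w = weights / weights.sum()
    mu_s = (w[:, None] * src).sum(axis=0)
    mu_d = (w[:, None] * dst).sum(axis=0)
    H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that det(R) = +1.
    D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, mu_d - R @ mu_s

def nearest(src, dst):
    """Brute-force closest point in dst for every point in src."""
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
    return dst[d.argmin(axis=1)]

def weighted_icp(corners_src, edges_src, corners_dst, edges_dst, max_iter=20):
    """ICP over two feature types with iteration-dependent weights."""
    R_total, t_total = np.eye(2), np.zeros(2)
    c, e = corners_src.copy(), edges_src.copy()
    for it in range(max_iter):
        w_c, w_e = feature_weights(it, max_iter)
        src = np.vstack([c, e])
        dst = np.vstack([nearest(c, corners_dst), nearest(e, edges_dst)])
        # Each feature type shares its total weight equally among its points.
        w = np.concatenate([np.full(len(c), w_c / len(c)),
                            np.full(len(e), w_e / len(e))])
        R, t = weighted_rigid_fit(src, dst, w)
        c, e = c @ R.T + t, e @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Because the weights change gradually from one iteration to the next, the objective being minimized also changes gradually, which is one way to obtain the continuous optimization trajectory described above.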

To validate the effectiveness of the fine registration module, we conducted ablation experiments targeting the fine registration stage. In the ablation study, we subdivided the fine registration module and performed separate ablation experiments on corner features, edge features, and mutual information optimization (MIO). The quantitative metrics of the experiments are presented in Table 3.

Table 3 Ablation results

Through the ablation study results in Table 3, we can observe the contribution of each module in the fine registration stage to the registration performance and their synergistic effects. From the single-module experiments, geometric feature constraints demonstrate more stable registration performance compared to pure mutual information optimization, with corner point constraints showing the most significant effect, followed by edge point constraints. This phenomenon indicates that in the fine registration task of bamboo slip images, structured geometric features can provide more reliable registration guidance. In particular, corner points, as the most prominent local features in images, possess natural advantages in establishing correspondences. In contrast, while mutual information optimization alone can exploit statistical information of images, it tends to fall into local optima when facing the complex textures and low-contrast features of bamboo slip images.

The results of dual-module combinations reveal the complementarity among different modules. When mutual information optimization is combined with either geometric feature constraint, the registration accuracy achieves significant improvement, indicating that mutual information, as a global statistical measure, can effectively compensate for the deficiencies of geometric features in certain regions. It is noteworthy that the joint use of corner points and edge points alone can achieve near-optimal performance, reflecting the synergistic effect of multi-level geometric features in constructing robust correspondences. Corner points provide precise positional anchors, while edge points supplement contour and structural information, jointly forming a complete geometric constraint framework.

The complete model employs a three-module joint optimization strategy and achieves the best performance across all evaluation metrics. Although the improvement over dual-module combinations is limited, this enhancement remains statistically significant, and the reduction in standard deviation indicates that the combination strategy improves the algorithm’s stability. This result confirms that our designed multi-level optimization strategy is both reasonable and necessary: geometric feature constraints provide structural priors for registration, while mutual information optimization performs fine-tuning through global statistical information. The three modules work synergistically, enabling the algorithm to maintain good robustness while ensuring registration accuracy.

Degraded character enhancement

After processing registered infrared and visible images through the bamboo slip degraded character enhancement module, originally degraded characters become clearer. The generated images can, to a certain extent, simultaneously possess texture and color information from the original visible images along with clear character information originally visible only in infrared images. This is beneficial for subsequent bamboo slip fragment combination work, as experts need only examine a single character-enhanced image to simultaneously obtain bamboo slip feature information from both visible and infrared images without comparing between images. When physically joining bamboo slip fragments, images possessing obvious physical characteristics such as color and texture are easier to match with physical objects compared to infrared images showing only clear characters. Bamboo slips buried underground exhibit varying degrees of character degradation and ink blurring due to complex burial environments. As shown in Fig. 9 below, we selected bamboo slips with different degrees of character degradation to comprehensively demonstrate the effects of degraded character enhancement. It is clearly evident that after processing with our proposed character enhancement method, characters on bamboo slips show significant improvement compared to original visible images. For bamboo slip images with mild character degradation, characters become clearer after enhancement, with some degraded strokes reappearing. For bamboo slips with severe character degradation, where character information is barely discernible in original visible images and only visible through infrared imaging, characters reappear in locations where they originally existed but had disappeared due to degradation after character enhancement is completed.

**Fig. 9: Visualization of bamboo slip character enhancement effects.**

Bamboo slips are elongated rather than shaped like standard rectangular images. During overall visualization, image length is severely compressed, making it difficult to observe individual character enhancement. As shown in Fig. 10 below, we extracted local regions from complete bamboo slips to visualize character enhancement effects. Images are grouped in sets of three, showing visible images, infrared images, and character-enhanced result images respectively. Results demonstrate significant improvement in text readability. In visible images, many character strokes appear blurred or have completely disappeared, while infrared images, though revealing some latent ink trace information, suffer from background noise and insufficient contrast. Each modality is individually deficient, but the two provide strongly complementary information. Visible images preserve partial surface ink morphological information, while infrared images supplement ink information from deep and faded regions. After character enhancement algorithm processing, character readability and clarity show significant improvement. Compared to visible images, stroke edges become sharper, character structures more complete, and ink-to-substrate contrast substantially enhanced. Characters originally difficult to recognize or completely invisible in input images show good recovery and enhancement of stroke structures and character features after dual-modal enhancement processing.

**Fig. 10: Visualization of local character enhancement on bamboo slips.**

To quantitatively evaluate the fusion performance, we computed a set of image quality metrics for the 12 registered image pairs presented in Fig. 9, as summarized in Tables 4 and 5. The results demonstrate that the fused images maintain high correlation and structural similarity with the visible light source images, while effectively incorporating information from the infrared modality. The standard deviation (SD) of the fused images indicates enhanced contrast, which facilitates the visual recovery of degraded characters. Furthermore, the color difference metric (ΔE) confirms that the fusion process preserves satisfactory color fidelity.
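
Three of these metrics (CC, PSNR, and SD) can be sketched directly in NumPy. The function name `fusion_metrics` and the 8-bit peak value of 255 are assumptions; ΔE is omitted because the text does not state which color-difference formula was used.

```python
import numpy as np

def fusion_metrics(fused, reference):
    """Correlation coefficient and PSNR of the fused image against a
    reference (e.g. the visible image), plus the fused image's standard
    deviation as a simple contrast indicator. Inputs: same-shape uint8."""
    f = fused.astype(np.float64)
    r = reference.astype(np.float64)
    cc = float(np.corrcoef(f.ravel(), r.ravel())[0, 1])
    mse = float(np.mean((f - r) ** 2))
    # PSNR is unbounded when the two images are identical.
    psnr = float("inf") if mse == 0 else float(10 * np.log10(255.0 ** 2 / mse))
    return {"CC": cc, "PSNR": psnr, "SD": float(f.std())}
```

A larger SD in the fused image than in the visible image is consistent with increased contrast between ink and substrate, while CC and PSNR close to their maxima indicate that the visible appearance is largely preserved.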

Table 4 Comparison of fusion quality metrics (Part 1: CC, PSNR and SD)

Table 5 Comparison of fusion quality metrics (Part 2: SSIM, MI and Delta E)

Discussion

We propose a multi-stage, coarse-to-fine multimodal image registration algorithm and a character enhancement method for bamboo slip preservation and research, comprising two independent modules. The registration module is implemented using a multi-level, multi-stage coarse-to-fine approach, selecting different registration methods and feature weights at various registration stages based on bamboo slip image characteristics, effectively addressing core challenges in bamboo slip registration and achieving superior performance compared to other registration algorithms. In the character enhancement module, this work successfully recovers degraded characters using complementary information from aligned visible and infrared images. The difference fusion-based method combined with adaptive threshold segmentation effectively reconstructs degraded characters while preserving original texture information, significantly improving subsequent bamboo slip text readability and providing new methods for digital preservation and research of bamboo slip artifacts. Additionally, the visualization effects of enhanced bamboo slip visible images support bamboo slip fragment rejoining research by eliminating the need for continuous cross-referencing between visible and infrared images, thereby improving the efficiency of bamboo slip fragment rejoining and reconstruction workflows. Experiments validate the effectiveness of the proposed method. Future work could further verify the method’s generalizability on other image datasets and explore its applications in other cultural heritage preservation domains.

Data availability

The datasets used in this study were obtained from the Bamboo and Wooden Slip Academic Resource Data Sharing Platform (URL: https://jdsjk.nwnu.edu.cn/).

Code availability

The source code is publicly available at https://github.com/qwe1aa/JDRegistration. The repository contains implementation details and algorithms described in this paper.

References

Hu, P. S. & Zhang, D. F. Selected Interpretations of Han Dynasty Bamboo Slips from Xuanquan, Dunhuang (in Chinese) (Shanghai Chinese Classics Publishing House, Shanghai, 2001).
Yu, X. Language and State: A Theory of the Progress of Civilization (in Chinese) 2nd edn (FriesenPress, 2021).
Zhu, J. et al. Rejoining fragmented ancient bamboo slips with physics-driven deep learning. Preprint at https://doi.org/10.48550/arXiv.2505.08601 (2025).
Aguado-Martínez, M. et al. Document scanners for minutiae-based palmprint recognition: a feasibility study. Pattern Anal. Applic. 24, 459–472 (2021).
Gong, M. et al. Analysis of genealogy seals in Yibin museum collection using hyperspectral imaging technology (in Chinese). Sci. Conserv. Archaeol. 33, 78–84 (2021).
Guo, X. et al. Hidden information extraction from ancient paintings using imaging spectroscopy technology (in Chinese). J. Image Graph. 22, 1428–1435 (2017).
Sun, M., Zhang, D., Ren, J., Chai, B. & Sun, J. What’s wrong with the murals at the Mogao Grottoes: a near-infrared hyperspectral imaging method. Sci. Rep. 5, 14371 (2015).
Zhou, X., Shen, H. & Wu, L. Research on extraction of blurred seals using hyperspectral image system (in Chinese). Sci. Conserv. Archaeol. 32, 56–60 (2020).
Hou, M. et al. Extraction and recognition of faded text in ancient calligraphy and painting based on spectral enhancement index and LeNet-5 (in Chinese). Sci. Conserv. Archaeol. 34, 72–80 (2022).
Deng, X. et al. Mold spot detection for paper artifacts based on multimodal feature fusion. npj Heritage Sci. 13, 540 (2025).
Cucci, C. et al. Hyperspectral imaging and convolutional neural networks for augmented documentation of ancient Egyptian artefacts. Heritage Sci. 12, 75 (2024).
Mezina, A., Burget, R. & Kotrly, M. A deep learning approach for anomaly detection in X-ray images of paintings. npj Heritage Sci. 13, 127 (2025).
Vila, A. et al. Ancient Greek text concealed on the back of unrolled papyrus revealed through shortwave-infrared hyperspectral imaging. Sci. Adv. 5, eaav8936 (2019).
Mocella, V., Brun, E., Ferrero, C. & Delattre, D. Revealing letters in rolled Herculaneum papyri by X-ray phase-contrast imaging. Nat. Commun. 6, 5895 (2015).
Parsons, S. et al. EduceLab-Scrolls: verifiable recovery of text from Herculaneum papyri using X-ray CT. Preprint at https://doi.org/10.48550/arXiv.2304.02084 (2023).
Lv, G., Wang, H., Wang, K., Zhao, H. & Zhao, L. Virtual restoration method of Kizil Grotto murals based on multimodal controlled diffusion models. npj Heritage Sci. 13, 554 (2025).
Zhou, S., Wang, X., Qiu, J., Bu, W. & Wang, H. OracleNet: enhancing oracle bone script recognition with adaptive deformation and texture-structure decoupling. npj Heritage Sci. 13, 1–14 (2025).
Zhang, D. (ed.) Diwan Han Bamboo Slips (in Chinese) (Zhongxi Book Company, Shanghai, 2017).
Gansu Bamboo Slips Museum & Gansu Provincial Institute of Cultural Relics and Archaeology (eds) Han Bamboo Slips from Juyan: Volume 4 (in Chinese) (Zhongxi Book Company, Shanghai, 2016).
Hao, S. & Zhang, D. Research on Xuanquan Han Bamboo Slips (in Chinese) (Gansu Culture Publishing House, Lanzhou, 2009).
Cao, S., Pan, T., Wang, Y. & Song, T. Character restoration of Qin and Han bamboo slips based on improved conditional generative adversarial networks. npj Heritage Sci. 13, 1–9 (2025).
Li, H., Ding, W., Cao, X. & Liu, C. Image registration and fusion of visible and infrared integrated camera for medium-altitude unmanned aerial vehicle remote sensing. Remote Sens. 9, 441 (2017).
Shahsavarani, S. et al. Robust multi-modal image registration for image fusion of infrared and visible infrastructure images. Sensors 24, 3994 (2024).
Luo, Y. & Luo, Z. Infrared and visible image fusion: methods, datasets, applications, and prospects. Appl. Sci. 13, 10891 (2023).
Liu, Y. et al. Deepjiandu dataset for character detection and recognition on jiandu manuscript. Sci. Data 12, 398 (2025).
Viola, P. & Wells III, W. M. Alignment by maximization of mutual information. Int. J. Comput. Vision 24, 137–154 (1997).
Maes, F., Collignon, A., Vandermeulen, D., Marchal, G. & Suetens, P. Multimodality image registration by maximization of mutual information. IEEE Trans. Med. Imaging 16, 187–198 (1997).
Pradhan, S. & Patra, D. Enhanced mutual information based medical image registration. IET Image Process. 10, 418–427 (2016).
Studholme, C., Hill, D. L. G. & Hawkes, D. J. An overlap invariant entropy measure of 3d medical image alignment. Pattern Recognit. 32, 71–86 (1999).
Tong, X. et al. Image registration with fourier-based image correlation: a comprehensive review of developments and applications. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 12, 4062–4081 (2019).
Dong, Y., Long, T., Jiao, W., He, G. & Zhang, Z. A novel image registration method based on phase correlation using low-rank matrix factorization with mixture of gaussian. IEEE Trans. Geosci. Remote Sens. 56, 446–460 (2017).
Lowe, D. G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91–110 (2004).
Ma, J. et al. Robust feature matching for remote sensing image registration via locally linear transforming. IEEE Trans. Geosci. Remote Sens. 53, 6469–6481 (2015).
Gao, J. & Cai, X.-f. Image matching method based on multi-scale corner detection. In 13th International Conference on Computational Intelligence and Security (CIS) (IEEE, 2017).
Bay, H. et al. SURF: speeded up robust features. In Computer Vision – ECCV 2006 (eds Leonardis, A., Bischof, H. & Pinz, A.) 404–417 (Springer Berlin Heidelberg, Berlin, Heidelberg, 2006).
Besl, P. J. & McKay, N. D. A method for registration of 3-d shapes. IEEE Trans. Pattern Anal. Mach. Intell. 14, 239–256 (1992).
Chen, Y. & Medioni, G. Object modeling by registration of multiple range images. Image Vis. Comput. 10, 145–155 (1992).
Zhang, Z. Iterative point matching for registration of free-form curves and surfaces. Int. J. Comput. Vision 13, 119–152 (1994).
Fitzgibbon, A. W. Robust registration of 2D and 3D point sets. Image Vis. Comput. 21, 1145–1153 (2003).
Segal, A. V., Hähnel, D. & Thrun, S. Generalized-ICP. Robot. Sci. Syst. 5, 435–442 (2009).
Yang, J., Li, H., Campbell, D. & Jia, Y. Go-ICP: a globally optimal solution to 3D ICP point-set registration. IEEE Trans. Pattern Anal. Mach. Intell. 38, 2241–2254 (2016).
Bouaziz, S., Tagliasacchi, A. & Pauly, M. Sparse iterative closest point. Comput. Graph. Forum 32, 113–123 (2013).
Guo, Y. et al. Adaptive weighted robust iterative closest point. Neurocomputing 508, 225–241 (2022).
Zhou, Q.-Y. et al. Fast global registration. In Computer Vision – ECCV 2016 (eds Leibe, B., Matas, J., Sebe, N. & Welling, M.) 766–782 (Springer International Publishing, 2016).
Balakrishnan, G., Zhao, A., Sabuncu, M. R., Guttag, J. & Dalca, A. V. VoxelMorph: a learning framework for deformable medical image registration. IEEE Trans. Med. Imaging 38, 1788–1800 (2019).
Arar, M. et al. Unsupervised multi-modal image registration via geometry preserving image-to-image translation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020).
Vaswani, A. et al. Attention is all you need. Proc. NeurIPS (2017).
Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. Proc. ICLR (2021).
Chen, J. et al. TransMorph: transformer for unsupervised medical image registration. Med. Image Anal. 82, 102615 (2022).
Mattes, D., Haynor, D. R., Vesselle, H., Lewellen, T. K. & Eubank, W. PET-CT image registration in the chest using free-form deformations. IEEE Trans. Med. Imaging 22, 120–128 (2003).
Kim, S., Min, D., Ham, B., Do, M. N. & Sohn, K. DASC: robust dense descriptor for multi-modal and multi-spectral correspondence estimation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1712–1729 (2017).
Kim, S. et al. DASC: dense adaptive self-correlation descriptor for multi-modal and multi-spectral correspondence. In Proc. IEEE/CVF CVPR (2015).
Jiang, Q. et al. A contour angle orientation for power equipment infrared and visible image registration. IEEE Trans. Power Del. 36, 2559–2569 (2021).
Pourfard, M. et al. KAZE-SAR: SAR image registration using KAZE detector and modified SURF descriptor for tackling speckle noise. IEEE Trans. Geosci. Remote Sens. 60, 1–12 (2022).
Alcantarilla, P. F., Nuevo, J. & Bartoli, A. Fast explicit diffusion for accelerated features in nonlinear scale spaces. IEEE Trans. Pattern Anal. Mach. Intell. 34, 1281–1298 (2013).
Li, H., Su, D., Cai, Q. & Zhang, Y. BSAFusion: a bidirectional stepwise feature alignment network for unaligned medical image fusion. In Proc. AAAI Vol. 39, 4725–4733 (2025).
Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. Numerical Recipes in C: The Art of Scientific Computing 2nd edn (Cambridge University Press, 1992).
Qu, G., Zhang, D. & Yan, P. Information measure for performance of image fusion. Electron. Lett. 38, 313–315 (2002).
Pluim, J. P. W., Maintz, J. B. A. & Viergever, M. A. Image registration by maximization of combined mutual information and gradient information. IEEE Trans. Med. Imaging 19, 809–814 (2000).

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant Nos. U24A20252, 62327808 and 62361053; the Key Research and Development Program of Shaanxi Province of China under Grant Nos. 2024PT-ZCK-66 and 2024CY2-GJHX-48; the Northwest Normal University 2024 Young Faculty Research Capacity Enhancement Program (No. NWNU-LKQN2024-24); the Higher Education Institutions Innovation Fund Project in Gansu Province (No. 2025A-004); the Gansu Provincial Higher Education Industry Support Plan Project (Grant No. CYZC-2024-29); and the Gansu Province 2023 Key Talent Project (Grant No. 2023010).

Author information

Authors and Affiliations

Gansu Engineering Research Center for JianDu, Gansu Provincial Engineering Research Center for Jiandu Intelligent Computing and Digital Humanities, the College of Computer Science and Engineering, Northwest Normal University, Lanzhou, China
Teng Wan, Fengchen Qi, Yanna Yang, Ying Qi & Qiang Zhang
State Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, China
Shaoyi Du

Contributions

T.W. contributed to the conceptualization and methodology of the study. F.C.Q. conducted the experiments and prepared the original draft of the manuscript. Y.N.Y. was responsible for data curation. Y.Q. and Q.Z. reviewed and edited the manuscript. S.Y.D. supervised the study. All authors have read and approved the final version of the manuscript.

Corresponding author

Correspondence to Shaoyi Du.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Wan, T., Qi, F., Yang, Y. et al. Adaptive multi-feature fusion for visible-infrared image registration and character enhancement of bamboo slips. npj Herit. Sci. 14, 96 (2026). https://doi.org/10.1038/s40494-026-02368-z

Received: 15 November 2025
Accepted: 03 February 2026
Published: 13 February 2026
Version of record: 13 February 2026
DOI: https://doi.org/10.1038/s40494-026-02368-z
