Abstract
Tomb excavation is essential for cultural heritage preservation and historical research, but conventional recording methods based on hand-drawn sketches and photographs are time-consuming and prolong artifact exposure. This increases the risk of deterioration caused by environmental fluctuations, microbial attack, and oxidation. To address this issue, this study proposes a rapid documentation workflow integrating a laser scanner and a 360° panoramic camera. The workflow includes three modules: (1) scene construction using high-precision LiDAR point clouds for 3D scene and digital surface model (DSM) generation with multi-temporal registration; (2) dynamic object removal for automatic detection and elimination of occluders during excavation; and (3) orthophoto and 3D model generation from registered multi-temporal point clouds and imagery. Applied to the Wuwangdun tomb, the method produced high-resolution orthophotos and detailed 3D models that support accurate off-site annotation and documentation.
Introduction
Tombs in ancient China carry profound and complex significance: they are not merely resting places for the deceased but also embody political, religious, social, and cultural functions1,2. Archeological excavation remains the only means to recover certain types of material and contextual evidence, and such data are indispensable for a deeper understanding of human history3,4.
Although tomb excavation can reveal a wealth of crucial historical information, it is inherently destructive and irreversible: once excavated, the original stratigraphic relationships and environmental context cannot be restored5,6,7. Consequently, modern archeological practice emphasizes the principles of “preservation first” and “minimal intervention”8,9. When an archeological site is in a relatively stable condition, excavation should be avoided where possible so that opportunities for study and intervention can be reserved for a time when analytical and conservation technologies are more mature10,11. However, such idealized static preservation is not always achievable in practice.
Owing to the combined effects of urban expansion, environmental change, and increasingly rampant tomb looting, many significant burial sites now face damage or complete loss12,13,14. Against this backdrop, salvage excavation has emerged as an emergency, science-based intervention employed when a site faces imminent and irreversible damage15. Because the preservation conditions of artifacts have already changed substantially, salvage excavation must be carried out rapidly to extract the maximum amount of information and to scientifically document the condition of the remains, all under the principle of “minimum intervention.”
Traditional archeological field recording documents the physical form of a location, space, or artifact using drawings and sketches16. Such manual records inevitably omit information and cannot precisely reproduce the site’s geometry and appearance17. Digital 3D recording can generate accurate digital replicas of physical objects more efficiently. Digital replicas of archeological sites enable the wider application of advancing computational techniques in archeology, thereby fostering the development of the field18,19,20,21. Over the past decade, digital 3D recording of archeological sites, features, and artifacts has become standard practice and has largely supplanted traditional analog documentation methods17,22.
During preliminary investigations, we identified several factors that hinder archeological teams from flexibly and dynamically applying 3D modeling technologies in the field. The main issues include, but are not limited to:
- Time inefficiency of conventional image-based reconstruction: in practical excavations, data collected in the morning often must be processed and available by the afternoon, since archeologists use the results to adjust excavation plans. Conventional image-based reconstruction is time-consuming. Typical workflows, including sparse reconstruction, dense reconstruction, mesh generation, and texture mapping, can take hours to days to complete, whereas excavations often require rapid data acquisition and processing23.
- Conflict between recording requirements and on-site excavation activities: to ensure the quality of conventional 3D recording (e.g., UAV surveys), sites are often cleared to avoid occlusions, which conflicts with frequent on-site activities such as digging, cleaning, and measuring, as shown in Fig. 1a.
- GNSS interference and inconsistent georeferencing under protective structures: protective structures, such as metal canopies, are often erected at excavation sites (see Fig. 1b), which can degrade GNSS reception for UAVs and similar platforms24. This increases operational risk and prevents consistent georeferencing across temporal acquisitions, thereby hindering multi-temporal comparative analysis of the tomb site.
a Dynamic objects (e.g., archeologists, tools) present during excavation that can occlude the site and interfere with 3D reconstruction. b A metal shelter erected at the excavation site that can interfere with GNSS signals.
To address the issues above, we propose the following solutions:
- Rapid on-site 3D data acquisition using a portable LiDAR-Camera mapping system: a self-developed portable LiDAR-Camera mapping system was used to complete a full-site scan within minutes and to rapidly acquire high-precision point clouds and imagery. Building on this capability, we propose a fast 3D reconstruction method to enable rapid on-site acquisition and processing. Using our system, full-site data can be captured within 5 min and processed within 30 min, meeting the rapid documentation needs of excavation.
- Intelligent occlusion masking via a visual large language model: an intelligent masking method based on a visual large language model was developed. The method automatically detects and masks occluding objects on site without requiring site clearance, preventing occlusions from degrading 3D reconstruction.
- Iterative Closest Point (ICP)-based temporal registration with multi-sensor fusion: we propose a temporal registration method that fuses multiple sensors, combining ICP with additional sensor data (for example, IMU and trajectory). The method achieves high-precision alignment of multi-temporal datasets even without GNSS, enabling multi-temporal comparison in a unified coordinate frame.
Using the laser scanner and a 360° panoramic camera in combination with the algorithms presented in this paper, we produced high-precision 3D reconstructions and orthophotos, offering essential technical support and data assurance for the excavation. This paper details the technical services implemented at Wuwangdun, including data acquisition, processing, and analysis, and presents the principal results obtained.
Methods
Study site
In the early Western Zhou period, King Cheng enfeoffed Xiong Yi at Jingshan (Danyang), conferring upon him the title “Lord of Chu”25. Under the leadership of Xiong Yi and his successors, Chu evolved from a peripheral fiefdom into one of the “Seven Warring States,” boasting an illustrious history spanning approximately 800 years26. Throughout this period, the Chu political center migrated eastward–relocating from Danyang to Ying, and subsequently to Chen and Juyang. Finally, in 241 BCE, under significant pressure from the burgeoning Qin state, King Kaolie relocated the capital to Shouchun (present-day Huainan, Anhui). This event marked a definitive eastward shift in Chu’s political, economic, and cultural gravity.
Historically, the capitals of Chu were strategically situated along the Han, Yangtze, and Huai River basins, fostering a distinct material and spiritual culture characterized by enduring poetic traditions and exquisitely crafted lacquerware. Consequently, the study of Chu culture is pivotal for understanding the broader development of ancient Chinese civilization. While numerous Chu burials have been excavated to date, the majority are of medium or small scale; large, high-status tombs remain comparatively rare, leaving significant gaps in our understanding of elite Chu mortuary practices27,28,29.
The Wuwangdun Tomb fills this critical void. Situated approximately 15 kilometers east of the Shouchun site, the tomb preserves an intact, self-contained funerary complex (Fig. 2). Roughly rectangular in plan and encircled by a protective moat, the complex covers an internal area of approximately 1.5 million square meters.
The map shows the eastward movement of Chu’s political center from Danyang to Shouchun. The Wuwangdun Tomb (yellow star) is situated adjacent to the final capital, Shouchun (F), reflecting the burial customs of high-ranking nobility in the late Warring States period.
Research framework
To efficiently document data throughout the entire process of tomb excavation, we propose a systematic workflow and introduce several innovative methods. As illustrated in Fig. 3, this approach comprises the following steps:
- Data collection: rapid spatial data acquisition at excavation sites using laser scanners and 360° panoramic cameras.
- Scene reconstruction: the filtered and registered point clouds are transformed into a common coordinate system and fused to produce an integrated global 3D model and a digital surface model (DSM).
- Dynamic object removal: we utilize the Segment Anything Model (SAM)30 to segment dynamic (moving) objects and reproject the resulting masks, thereby excluding these objects from the panoramic images.
- Reconstruction: through precise camera pose estimation, we generate high-resolution orthophotos from the DSM and export textured meshes derived from the 3D model.
The workflow consists of four main stages: data collection, scene reconstruction, dynamic object removal, and final 3D reconstruction. Each stage incorporates specific techniques to address the unique challenges of archeological excavation documentation.
We applied the proposed workflow and methods to the excavation of the Wuwangdun tomb, achieving favorable practical outcomes.
Data collection
In this study, data acquisition was conducted using a rigidly integrated LiDAR-panoramic camera system, and the detailed specifications of the equipment are listed in Table 1. The LiDAR sensor is a SLAM-based scanning unit providing real-time trajectory estimation at a frequency of 10 Hz. The panoramic camera is a dual-fisheye 360° imaging module that outputs equirectangular spherical images at a resolution of 5888 × 2944 pixels and a frame rate of 30 fps.
The LiDAR and camera are rigidly mounted on the same platform, and their spatial relationship was determined through extrinsic calibration. The calibration yields a fixed rigid-body transformation between the LiDAR coordinate frame and the camera coordinate frame, represented by the rotation matrix Rex and translation vector Tex. This extrinsic calibration allows for direct transformation of LiDAR poses into camera poses without the need for additional optimization steps.
Since the LiDAR trajectory is updated at 10 Hz (0.1 s interval) while the camera operates at 30 fps, each image timestamp is associated with the nearest LiDAR pose, so the maximum temporal discrepancy in pose estimation is bounded by half the pose interval (0.05 s). Given the short working distance (≤2 m) and moderate acquisition speed, this temporal offset introduces negligible spatial deviation and satisfies the accuracy requirements of this study.
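Under nearest-pose association, the worst-case time offset is half the 0.1 s pose interval. A minimal sketch with synthetic timestamps (the pose rate, frame rate, and duration are illustrative):

```python
from bisect import bisect_left

def nearest_pose_index(pose_times, t):
    """Return the index of the pose timestamp closest to image time t."""
    i = bisect_left(pose_times, t)
    if i == 0:
        return 0
    if i == len(pose_times):
        return len(pose_times) - 1
    # choose the neighbor with the smaller time gap
    return i if pose_times[i] - t < t - pose_times[i - 1] else i - 1

# LiDAR poses every 0.1 s, camera frames every 1/30 s (synthetic)
pose_times = [k * 0.1 for k in range(101)]      # 0.0 .. 10.0 s
frame_times = [k / 30.0 for k in range(300)]    # 0.0 .. ~9.97 s

offsets = [abs(t - pose_times[nearest_pose_index(pose_times, t)])
           for t in frame_times]
# worst case is half the 0.1 s pose interval
```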
The exported panoramic imagery follows a spherical equirectangular projection model rather than a conventional pinhole camera model. Spatial resolution is characterized by angular resolution rather than focal length and intrinsic parameters. For a panoramic width of 5888 pixels, the horizontal angular resolution is approximately 0.00107 rad per pixel. At a working distance of 2 m, this corresponds to a theoretical ground sampling distance (GSD) of approximately 2.1 mm in the equatorial region. It should be noted that due to spherical projection, spatial resolution varies with latitude and decreases toward the polar regions.
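These figures follow directly from the equirectangular geometry and can be checked in a few lines:

```python
import math

width_px = 5888                      # panorama width in pixels (Table 1)
ang_res = 2 * math.pi / width_px     # horizontal angular resolution, rad/pixel
gsd_mm = ang_res * 2.0 * 1000.0      # arc length at a 2 m working distance, mm
# ang_res is ~0.00107 rad/pixel and gsd_mm is ~2.1 mm at the equator
```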
Data were acquired following the scheme shown in Fig. 4. The mapping system was mounted on a telescopic extension pole (Fig. 4a), allowing us to capture areas that could not be accessed on foot. The Wuwangdun tomb chamber measures approximately 23 m by 22 m. As shown in Fig. 4b, the red markings denote wooden beams. To protect the artifacts, operators stood only on these beams or on soil-covered surfaces; all other wooden structures were off limits, which substantially increased logistical complexity. The chamber was divided into nine structural sectors, and data collection within each sector followed a square-spiral trajectory as illustrated in Fig. 4c.
a Scanner with telescopic pole; b tomb chamber layout, with red areas indicating wooden beams where operators may stand; c square-spiral data-collection trajectory: outer loops capture peripheral areas, and the pole length is adjusted to acquire inner regions.
Scene reconstruction
The scene reconstruction process comprises two main steps: transforming all data into a unified coordinate system, and generating the DSM and 3D model.
First, we need to align the data acquired at different times into a common coordinate system. The coordinate system of the laser scanner is defined by the LiDAR’s position and orientation at the start of the scan, making it inherently a local reference frame. In the context of continuous archeological excavation, such a local system cannot support temporal comparison or cross-session referencing. Therefore, it is essential to employ an algorithmic correction to transform the acquired data into a unified global coordinate system.
For the initial data acquisition, an absolute spatial reference was introduced through a set of control points. Specifically, four control points (Cpt1-Cpt4) were established in stable areas (Region c in Fig. 5) around the excavation site. These points were identified using clearly recognizable targets, and their coordinates were obtained from high-density point cloud measurements and GNSS observations. The control points were used to transform the reconstructed point clouds from the original local coordinate system generated by the SLAM-based scanning process into a unified site coordinate system, as illustrated in Fig. 6.
a Region a is a metal canopy installed to shield the site and imagery from sunlight and rain; b Region b is a compacted-earth area used to stabilize the surrounding ground and prevent collapse; c Region c serves as the registration reference area and is equipped with fixed triangular-prism markers with prominent corner features to enable consistent multi-temporal coordinate alignment; d Region d is the active excavation zone, whose spatial extent changes dynamically as work progresses.
a A high-precision point cloud of the archeological area was acquired using FARO and other surveying devices. b Ground control points were surveyed in an open outdoor area. c The point cloud was registered to the world coordinate system using these control points. d Fixed points within the study area were selected as control points. e Their coordinates were measured from the high-precision point cloud in the world coordinate system.
After this initial georeferencing step, the first-stage reconstruction served as the reference dataset. Point clouds acquired in subsequent excavation stages were registered to this reference using Algorithm 1, ensuring consistent spatial alignment among multi-temporal datasets and enabling reliable comparison of excavation progress.
Algorithm 1
Multi-temporal Point Cloud Registration Workflow
Input: source point cloud Ps, target point cloud Pt, invariant region definition Rinv
Output: aligned source point cloud Pfinal
Phase 1: Pre-processing (Crop)
1: P′s ← CROP(Ps, Rinv) ⊳ Crop source to invariant area
2: P′t ← CROP(Pt, Rinv) ⊳ Crop target to invariant area
Phase 2: Global Coarse Registration (RANSAC)
3: Tcoarse ← I
4: Nmax_inliers ← 0
5: for i ← 1 to K do
6: Ss, St ← RANDOMSAMPLE(P′s, P′t) ⊳ Random point-pair sampling
7: Tcurr ← COMPUTETRANSFORM(Ss, St) ⊳ Estimate R and t
8: ninliers ← COUNTINLIERS(P′s, P′t, Tcurr)
9: if ninliers > Nmax_inliers then
10: Nmax_inliers ← ninliers
11: Tcoarse ← Tcurr
12: end if
13: end for
Phase 3: Fine Registration (ICP)
14: ⊳ Refine alignment by minimizing Euclidean distances
15: Tfinal ← ICP(P′s, P′t, Tcoarse)
Phase 4: Final Transformation
16: Pfinal ← APPLYTRANSFORM(Ps, Tfinal) ⊳ Apply to original full cloud
17: return Pfinal
In practice, all acquired datasets were transformed into the coordinate frame of the initial acquisition. The complete registration workflow is summarized in Algorithm 1. To enhance registration accuracy and robustness under the challenging conditions of an active excavation, we introduce two steps before the standard ICP refinement. First, the multi-temporal point clouds are cropped to the scene's invariant regions (Region c in Fig. 5). Second, an initial alignment is estimated using RANSAC (Random Sample Consensus): random point-pair sampling is used to compute candidate rotations and translations, and the candidate with the largest inlier count is retained. Finally, this coarse alignment is refined with ICP to minimize the Euclidean distances between corresponding points.
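The COMPUTETRANSFORM step shared by the RANSAC and ICP phases, estimating a rigid transform from corresponded points, has a closed-form SVD (Kabsch) solution. The sketch below is a minimal numpy version under the assumption that correspondences are already given; production pipelines (e.g., Open3D) wrap this inside full RANSAC sampling and ICP iteration:

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Least-squares R, t with dst ≈ src @ R.T + t (Kabsch/SVD)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)            # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                             # guard against reflection
    t = c_dst - R @ c_src
    return R, t

# sanity check: recover a known rotation about z and a translation
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
dst = src @ R_true.T + t_true
R_est, t_est = estimate_rigid_transform(src, dst)
```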
Second, DSM generation and 3D model reconstruction are crucial steps in the scene reconstruction process. The data processing workflow for generating the DSM and the 3D mesh model is illustrated in Fig. 7. The process begins with the original point cloud data and proceeds through three main phases: data preprocessing, DSM generation, and 3D model reconstruction.
The process involves point cloud preprocessing, spatial interpolation for DSM creation, and Poisson equation solving for 3D mesh reconstruction.
Data preprocessing
Raw point cloud data typically contains noise and redundant information. To ensure data quality and computational efficiency, the raw data first undergoes a preprocessing stage. This involves denoising to remove measurement errors, filtering to eliminate outliers, and thinning (downsampling) to reduce point density while preserving the underlying geometric structure.
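The thinning step can be illustrated with a simple voxel-grid filter, in which each occupied voxel is replaced by the centroid of its points. This is a minimal numpy sketch; the voxel size and the actual filters used in our pipeline may differ:

```python
import numpy as np

def voxel_downsample(points, voxel=0.05):
    """Thin a point cloud by averaging all points falling in each voxel."""
    keys = np.floor(points / voxel).astype(np.int64)
    # group points by voxel key and keep one centroid per occupied voxel
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, points.shape[1]))
    counts = np.zeros(n_voxels)
    np.add.at(sums, inverse, points)
    np.add.at(counts, inverse, 1.0)
    return sums / counts[:, None]

rng = np.random.default_rng(1)
cloud = rng.uniform(0.0, 1.0, size=(10000, 3))   # synthetic 1 m³ cloud
thinned = voxel_downsample(cloud, voxel=0.1)     # at most 10³ voxels remain
```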
DSM generation
The processed point cloud serves as the basis for DSM creation. To convert the discrete points into a continuous elevation surface, spatial interpolation techniques are employed. As shown in the workflow, Inverse Distance Weighting (IDW) is utilized to interpolate elevation values for unmeasured locations based on the values of surrounding sample points. Following interpolation, the data undergoes gridding and rasterization, transforming the spatial information into a regular grid matrix. This process yields the final DSM, effectively representing the terrain’s topography.
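The IDW step itself is compact; below is a minimal numpy sketch in which the sample layout and the power parameter are illustrative assumptions, not our production settings:

```python
import numpy as np

def idw(sample_xy, sample_z, query_xy, power=2.0, eps=1e-12):
    """Inverse Distance Weighting: elevation at each query point as a
    distance-weighted average of the sample elevations."""
    d = np.linalg.norm(query_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    w = 1.0 / (d ** power + eps)          # closer samples weigh more
    return (w @ sample_z) / w.sum(axis=1)

# four samples on the tilted plane z = x + y
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
z = xy.sum(axis=1)
z_centre = idw(xy, z, np.array([[0.5, 0.5]]))[0]   # equidistant -> plain mean
```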
3D model reconstruction
Parallel to DSM generation, the workflow reconstructs a full 3D surface model. This phase requires preparing a dense point cloud with consistent normals to define the surface orientation correctly. We employ the Poisson Surface Reconstruction method31, which formulates the reconstruction as a spatial Poisson problem. By solving the Poisson equation, the algorithm extracts an isosurface that approximates the point cloud data. Finally, a triangular mesh is generated from this implicit surface, resulting in a continuous, untextured 3D model that accurately captures the geometry of the target object.
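Before Poisson reconstruction, each point needs an oriented normal. A common choice, sketched below for a single near-planar neighborhood, takes the normal as the eigenvector of the neighborhood covariance with the smallest eigenvalue (PCA); libraries such as Open3D or PCL provide this together with consistent normal orientation and the Poisson solver itself:

```python
import numpy as np

def estimate_normal(neighbors):
    """Normal of a local neighbourhood: eigenvector of the covariance
    matrix associated with the smallest eigenvalue (PCA)."""
    centered = neighbors - neighbors.mean(axis=0)
    cov = centered.T @ centered / len(neighbors)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, 0]                     # direction of least variance

# synthetic near-planar patch around z = 0; normal should be ~ ±z
rng = np.random.default_rng(3)
patch = np.column_stack([rng.uniform(-1, 1, 200),
                         rng.uniform(-1, 1, 200),
                         1e-3 * rng.normal(size=200)])
n = estimate_normal(patch)
```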
Dynamic object removal
During the excavation process, the movement of personnel can lead to motion blur and dynamic objects in panoramic images. To enhance the quality of orthophotos and 3D models, it is necessary to remove dynamic objects from panoramic images. Due to the significant distortion in panoramic images, dynamic objects may appear in unconventional forms. Therefore, we propose a dynamic object removal algorithm based on panoramic image cube projection.
The module consists of three main steps: panoramic image projection, motion object detection based on a visual large language model, and re-projection of the detected motion object masks back onto the panoramic image for removal. The detailed steps are as follows:
- Panoramic image projection: first, project the panoramic image onto a cube to obtain perspective images on six faces. The transformation relationship between spherical coordinates and camera coordinates is established, allowing for the mapping of any point on the sphere to the panoramic image.
- Motion object detection based on a visual large language model: we utilize the SAM to segment dynamic objects in the perspective images. By inputting keywords such as “person,” “instrument held by person,” “white instrument,” and “black plastic tube,” we can effectively obtain the range of dynamic objects in the perspective images.
- Re-projection of motion object masks: the segmented masks are re-projected back onto the panoramic image to remove dynamic objects.
The panoramic image is projected onto a cube, yielding six perspective images (Fig. 8b). In the world coordinate system, a point \(P({X}_{W}^{p},{Y}_{W}^{p},{Z}_{W}^{p})\) has an imaging point \(p({X}_{C}^{p},{Y}_{C}^{p},f)\) on the perspective image plane. The transformation relationship under the local projection angle (αk, βk) is as follows:
The process includes a projecting the original panorama onto a cube to obtain b cube face projections. The perspective views are then processed using SAM for detecting and masking dynamic objects to create c masked perspective views. Finally, the masks are re-projected back to generate the d final masked panorama.
Each spherical coordinate (φ, ϕ) is mapped to a pixel of the equirectangular panorama via Eq. (2), which defines how the panorama is stored. Combining Eqs. (1) and (2) yields the panorama pixel coordinates corresponding to any pixel of the perspective view at local projection angle (αk, βk):
Perspective images for any projection viewpoint (αk, βk) can be obtained by using the above formulas. In this study, we use six perspectives: (0°, 0°), (0°, 90°), (0°, −90°), (90°, 0°), (180°, 0°), and (270°, 0°) for motion object detection.
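The mapping from a cube-face pixel back to panorama coordinates can be sketched as follows. The axis conventions assumed here (x right, y down, z forward; yaw α about the vertical axis, pitch β about x) are for illustration and may differ from the exact parameterization of our implementation:

```python
import math

def persp_to_pano(u, v, size, fov_deg, alpha_deg, beta_deg, W, H):
    """Map pixel (u, v) of a square cube-face view at yaw alpha / pitch beta
    to its (column, row) in an equirectangular panorama of size W x H."""
    f = (size / 2) / math.tan(math.radians(fov_deg) / 2)   # focal length, px
    x, y, z = u - size / 2, v - size / 2, f                # ray in face frame
    b = math.radians(beta_deg)                             # pitch about x
    y, z = y * math.cos(b) - z * math.sin(b), y * math.sin(b) + z * math.cos(b)
    a = math.radians(alpha_deg)                            # yaw about vertical
    x, z = x * math.cos(a) + z * math.sin(a), -x * math.sin(a) + z * math.cos(a)
    lon = math.atan2(x, z)                                 # longitude
    lat = math.asin(-y / math.sqrt(x * x + y * y + z * z)) # latitude (y down)
    col = (lon / (2 * math.pi) + 0.5) * W
    row = (0.5 - lat / math.pi) * H
    return col, row

# centre of the front face (0°, 0°) lands at the panorama centre
col, row = persp_to_pano(256, 256, 512, 90, 0, 0, 5888, 2944)
```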
Instance segmentation is a core computer vision task that simultaneously identifies and segments individual objects within an image. In this study, we employ SAM to detect dynamic objects in the perspective images. SAM is a promptable vision foundation model; paired with text grounding, it can segment objects specified by user-defined keywords.
The detection process involves inputting keywords such as “person,” “instrument held by person,” “white instrument,” and “black plastic tube.” The model then generates masks for the detected dynamic objects (Fig. 8c), which are subsequently re-projected onto the panoramic image for removal (Fig. 8d).
3D reconstruction
The reconstruction process involves generating orthophotos and 3D models from the processed panoramic images and point clouds. The orthophotos are created from the DSM, while the 3D models are constructed from the point clouds. The detailed steps are as follows: synchronizing timestamps, estimating image poses, and generating orthophotos and 3D models.
Timestamps of panoramic images and point clouds are synchronized to refer to the same instant, ensuring accurate reconstruction. IMU data from the LiDAR and camera—rigidly connected—provide inherently aligned signals for this purpose. Extrema in each IMU stream are selected as synchronization anchors, and the conversion between their timestamp systems is computed (Fig. 9).
a IMU from the LiDAR sensor. b IMU from the panoramic camera.
The LiDAR outputs an acquisition trajectory, i.e., the pose at each timestamp. For each panoramic image, its timestamp is converted to the LiDAR timebase using the time-alignment results; the corresponding LiDAR pose is then obtained by linear interpolation along the trajectory time series. This procedure enables spatiotemporal alignment between images and point clouds.
Assume the LiDAR pose at time ti is (Ti, Ri) and at time tj is (Tj, Rj), with ti ≤ tk ≤ tj, where tk is the panoramic image timestamp in the LiDAR clock. Let
\(\lambda =({t}_{k}-{t}_{i})/({t}_{j}-{t}_{i})\).
The translation is linearly interpolated as
\(T({t}_{k})=(1-\lambda ){T}_{i}+\lambda {T}_{j}\).
The orientation is interpolated along the SO(3) geodesic,
\(R({t}_{k})={R}_{i}\exp (\lambda \log ({R}_{i}^{\top }{R}_{j}))\),
where \(\exp (\cdot )\) and \(\log (\cdot )\) denote the canonical maps between SO(3) and \({\mathfrak{so}}(3)\) (implemented with the standard closed-form Rodrigues formulas; small-angle handling is applied for numerical stability).
Since the LiDAR sensor and the panoramic camera are rigidly integrated, the camera’s pose can be derived directly by applying the pre-calibrated extrinsic parameters to the interpolated LiDAR trajectory. Let the extrinsic transformation from the camera frame to the LiDAR frame be (Rex, Tex). The camera pose at any frame timestamp tk is then
\({R}_{cam}({t}_{k})=R({t}_{k}){R}_{ex},\quad {T}_{cam}({t}_{k})=R({t}_{k}){T}_{ex}+T({t}_{k})\).
By implementing this direct georeferencing workflow, the system effectively bypasses the computationally intensive global optimization (e.g., bundle adjustment) typically required in traditional image-based reconstruction. This approach not only significantly enhances computational efficiency but also provides superior robustness in archeological environments characterized by low-texture or repetitive patterns, where conventional feature-based matching often fails.
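The interpolation and extrinsic composition described above can be sketched in numpy, assuming rotation matrices as input and the Rodrigues closed forms for the exp and log maps:

```python
import numpy as np

def so3_exp(w):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-10:                     # small-angle handling
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def so3_log(R):
    """Inverse map: rotation matrix -> rotation vector."""
    th = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    if th < 1e-10:
        return np.zeros(3)
    return th / (2 * np.sin(th)) * np.array(
        [R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])

def interpolate_pose(Ti, Ri, Tj, Rj, lam):
    """Linear translation, geodesic (SLERP) rotation at fraction lam."""
    T = (1 - lam) * Ti + lam * Tj
    R = Ri @ so3_exp(lam * so3_log(Ri.T @ Rj))
    return T, R

def camera_pose(T_lidar, R_lidar, R_ex, T_ex):
    """Compose the interpolated LiDAR pose with the fixed extrinsics."""
    return R_lidar @ T_ex + T_lidar, R_lidar @ R_ex

# halfway between identity and a 90° yaw should give a 45° yaw
Rj = so3_exp(np.array([0.0, 0.0, np.pi / 2]))
T, R = interpolate_pose(np.zeros(3), np.eye(3), np.array([1.0, 0, 0]), Rj, 0.5)
```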
Once the DSM, the untextured 3D model, and the image poses are available, panoramic images are orthorectified and projected onto the DSM to produce orthophotos, and the image textures are mapped onto the 3D model to generate a textured mesh.
Results
This section reports two groups of results. First, we demonstrate the effectiveness of the automatic dynamic-object detection and removal pipeline applied during preprocessing to eliminate moving or occluding elements that would otherwise degrade reconstruction. Second, based on the cleaned and registered datasets, we present the multi-temporal orthophotos and 3D reconstructions produced for each chamber, together with quantitative accuracy assessments.
Absolute geometric accuracy assessment
To evaluate the absolute geometric accuracy of the reconstructed datasets, an accuracy assessment was conducted using independently measured control points. In this study, the reconstructed point clouds were first aligned to a reference coordinate system through the proposed multi-temporal registration framework. The reference coordinate system was subsequently related to the site coordinate system, allowing the reconstructed datasets to be expressed in an absolute spatial reference.
After this transformation, the coordinates of several control points were extracted from the registered point clouds through point picking. These coordinates were then compared with their corresponding reference coordinates obtained from GNSS measurements. The coordinate differences between the reconstructed datasets and the reference measurements were used to estimate the absolute positioning accuracy.
To ensure the robustness of the evaluation, four datasets were randomly selected from different excavation stages. Each dataset was registered to the reference point cloud, and the coordinates of the control points were extracted and compared with their corresponding reference coordinates. The resulting accuracy statistics are summarized in Table 2.
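The residual computation behind these statistics reduces to a few lines. The coordinates below are hypothetical and only illustrate the shape of the evaluation, not the actual survey data:

```python
import numpy as np

def accuracy_stats(measured, reference):
    """Per-point 3D residuals between picked and GNSS coordinates, plus the
    mean and RMS errors used to summarize absolute accuracy."""
    res = np.linalg.norm(measured - reference, axis=1)
    return res, res.mean(), np.sqrt((res ** 2).mean())

# hypothetical residuals at four control points (metres)
ref = np.array([[0, 0, 0], [10, 0, 0], [10, 10, 0], [0, 10, 0]], float)
meas = ref + np.array([[0.01, 0, 0], [0, -0.012, 0],
                       [0.008, 0.005, 0], [0, 0, 0.01]])
res, mean_err, rmse = accuracy_stats(meas, ref)   # centimetre-level errors
```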
The results indicate that the absolute positioning accuracy of the reconstructed datasets is approximately 1 cm (mean error of 1.048 cm), demonstrating that the proposed workflow can maintain reliable spatial accuracy while enabling consistent multi-temporal documentation of the excavation process.
It should be noted that the control points used for accuracy evaluation are distributed around the outer boundary of the chamber, whereas the main area of interest is located in the central region. Therefore, the reported residuals primarily reflect the absolute geometric consistency of the overall reconstructed scene. Because the central region is enclosed by the surrounding control points and typically benefits from denser observations and stronger geometric constraints, its local geometric accuracy may be slightly better than that suggested by the boundary control-point residuals. In this sense, the reported control-point errors may be regarded as a conservative estimate for the core study area.
Dynamic-object detection and removal
Conventional 3D recording workflows often require site clearance during acquisition to avoid interference from dynamic objects such as excavation personnel, which conflicts with frequent on-site activities. To address this issue, an automatic occlusion detection and removal pipeline based on a visual large language model was developed; it identifies and removes dynamic occluders without clearing the site. Figure 10 illustrates results from Wuwangdun: Fig. 10a shows the orthophoto generated from the raw imagery, and Fig. 10b shows the orthophoto after dynamic-object removal. After cleaning, excavation personnel, tools, and other occluders are effectively removed, reducing reconstruction artifacts and improving the efficiency and flexibility of field data acquisition. The same-colored bounding box in both panels marks an identical region to facilitate visual comparison.
The red and yellow boundaries delineate the region where dynamic objects were removed. a Orthophoto generated from raw imagery containing dynamic objects; b orthophoto after dynamic-object removal using the proposed method.
Results of multi-temporal acquisitions
We produced multiple orthophotos during data processing to support artifact documentation and analysis, and these products also serve as important references for subsequent archeological study. The Wuwangdun tomb chamber was roofed with multiple layers of wooden planks. During the removal of the planks, a few artifacts were exposed, so we prioritized comprehensive, site-wide data acquisition and processing to generate overall orthophotos. Figure 11 presents three representative overall orthophotos. Initially, the chamber comprised five large burial compartments oriented east (e), west (w), south (s), north (n), and middle (m). As an additional layer of planking was removed, the internal layout was revealed: each of the four peripheral compartments was subdivided into two smaller chambers, producing eight peripheral chambers plus the central chamber, resulting in a nine-room configuration that is consistent with standards for ancient royal tombs.
a Orthophoto of the first layer of planking; b orthophoto of the second layer of planking; c orthophoto of the third layer of planking.
After the nine-chamber configuration was revealed, the plank covers in each chamber were removed, and numerous artifacts were exposed, requiring detailed recording and analysis. Accordingly, separate data acquisition and processing were performed for each chamber, producing high-resolution orthophotos. As shown in Fig. 12, these orthophotos clearly display the distribution and fine details of the artifacts within the chambers. The images also reveal clear differences among chambers: the east and west chambers are noticeably smaller than the north and south chambers, which may reflect differences in function; the west chamber appears undisturbed, whereas the east and north chambers show severe looting and structural damage. These observations carry important archeological implications. In addition, chamber-specific orthophotos enable direct spatial comparison and the analysis of construction techniques and material features at different heights.
The leftmost column shows the orthophotos of the top cover slabs of the nine individual chambers in the Wuwangdun area. The right columns present the orthophotos of each chamber acquired at three subsequent time points. The images are arranged in chronological order from left to right, corresponding to the progressive advancement of the excavation.
All orthophotos were registered to a common coordinate system. Comparison of orthophotos from different dates shows that the products retain accurate georeferencing: the coordinate discrepancy between images is only 0.79 cm (as shown in Table 3). Figure 13 presents overall-area and central-chamber orthophotos acquired on March 16 and March 18. The March 16 dataset includes an additional layer of planking over the central chamber compared with the March 18 dataset. Although the two datasets differ substantially, overlapping regions remain, permitting multi-temporal comparison.
a Orthophoto of the overall site acquired on March 18; b orthophoto of the central chamber acquired on March 18; c orthophoto of the central chamber acquired on March 16.
Traditional manual checks, which estimate relative accuracy by measuring the same object in two images, are subjective and typically provide too few control points for robust evaluation. For a denser, more objective assessment, feature points were extracted from each orthophoto, descriptors were matched using FLANN, and outliers were removed with RANSAC. Fifteen corresponding point pairs were retained; their absolute coordinate distances are listed in Table 3. The mean absolute distance error is 0.79 cm, and the maximum error is 2.42 cm. Figure 14 shows the matching result.
Red points denote the extracted feature points in the two images, and green line segments represent the matched feature pairs. The absolute coordinate error between the images was evaluated using the coordinate distances of matched points, while the relative error was assessed from the differences in matching segment lengths.
From the 15 matched points, \(\binom{15}{2}=105\) unique line segments were formed in each image. For each corresponding segment pair, the difference in length was taken as the relative error. The average segment-length error is 0.78 cm.
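The two consistency metrics above, mean matched-point distance and pairwise segment-length difference, can be reproduced with a short script. The coordinates below are illustrative stand-ins, not the published values from Table 3.

```python
import numpy as np
from itertools import combinations

def consistency_metrics(pts_a, pts_b):
    """Absolute and relative consistency between two sets of matched
    points expressed in the same site coordinate system (units: cm)."""
    pts_a, pts_b = np.asarray(pts_a, float), np.asarray(pts_b, float)
    # Absolute error: Euclidean distance between each matched pair.
    abs_err = np.linalg.norm(pts_a - pts_b, axis=1)
    # Relative error: length difference of every unique segment
    # formed by the matched points (C(n, 2) segments per image).
    seg_err = []
    for i, j in combinations(range(len(pts_a)), 2):
        len_a = np.linalg.norm(pts_a[i] - pts_a[j])
        len_b = np.linalg.norm(pts_b[i] - pts_b[j])
        seg_err.append(abs(len_a - len_b))
    return abs_err.mean(), abs_err.max(), float(np.mean(seg_err))

# Illustrative matched points (site coordinates in cm, ~0.5 cm jitter).
rng = np.random.default_rng(0)
pts_a = rng.uniform(0, 500, size=(15, 2))
pts_b = pts_a + rng.normal(0, 0.5, size=(15, 2))
mean_abs, max_abs, mean_seg = consistency_metrics(pts_a, pts_b)
print(f"pairs: {len(pts_a)}, segments: {15 * 14 // 2}")
```

With 15 points, 105 segment pairs are evaluated, matching the count given above.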
Overall, the multi-temporal orthophotos exhibit at most centimeter-scale absolute coordinate discrepancies and sub-centimeter relative errors between matched features.
For each orthophoto produced, a corresponding 3D model was generated. Figure 15 presents examples of 3D models from the Wuwangdun excavation. All models were registered to a common coordinate frame using the registration method proposed in this study. The 3D models clearly illustrate excavation progress and the spatial relationships among artifacts, providing visual support for methodological adjustments and field decision-making.
a Top layer model: this model primarily reflects the five large burial chambers in the tomb, including the east, west, south, north, and central chambers. b Middle layer model: four crossbeams divide the surrounding four side chambers into eight smaller chambers, forming a nine-chamber layout. c Bottom layer model: clearly displays the distribution of artifacts within the burial chamber and their spatial relationships. d Detailed chamber views: shows detailed texture maps of individual sections of the tomb. Labels “e,” “w,” “s,” “n,” and “m” correspond to the east, west, south, north, and central chambers, respectively, with subscripts indicating specific sub-chambers or layers.
Discussion
Dynamic, efficient, and high-precision 3D recording of tomb excavations remains a core challenge for archeological documentation. To address this challenge, we developed and field-implemented a workflow that tightly couples a laser scanner with a 360° panoramic camera and a set of processing algorithms designed for rapid, robust, and repeatable documentation. The acquisition setup uses a portable scanner mounted on a telescopic pole and a panoramic camera; the scanner records high-density point clouds (10 Hz trajectory) while the camera captures imagery at 30 fps, allowing dense geometric and radiometric sampling of the scene even where operators cannot step. In the Wuwangdun deployment, the chamber was divided into nine structural sectors, and data were collected following a square-spiral trajectory per sector, which ensured full coverage with minimal disturbance to the site.
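The per-sector square-spiral coverage pattern can be sketched as a simple waypoint generator. The paper does not specify the spiral's step length or turn count, so the parameters below are purely illustrative.

```python
import numpy as np

def square_spiral(n_turns, step):
    """Waypoints of an outward square spiral centered at the origin.
    Segment lengths grow as step, step, 2*step, 2*step, 3*step, ..."""
    pts = [np.zeros(2)]
    heading = np.array([1.0, 0.0])         # start moving "east"
    rot90 = np.array([[0.0, -1.0],
                      [1.0, 0.0]])          # 90-degree left turn
    for k in range(1, 2 * n_turns + 1):
        length = ((k + 1) // 2) * step
        pts.append(pts[-1] + heading * length)
        heading = rot90 @ heading
    return np.array(pts)

# Illustrative path: 4 turns, 0.5 m spacing between spiral arms.
path = square_spiral(n_turns=4, step=0.5)
```

Such a pattern samples a sector from the center outward, which is consistent with the goal of full coverage with minimal operator movement.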
The preprocessing and synchronization stage combines IMU-based time alignment, coarse cropping to invariant regions, and an initial RANSAC alignment prior to ICP refinement. IMU synchronization uses characteristic extrema in the two sensors’ inertial traces to establish time offsets and enable frame-level pose interpolation; cropped invariant regions (walls, floors, fixed markers) provide stable geometry for robust multi-temporal registration. RANSAC delivers a reliable initial transform while ICP refines it to the millimeter level, producing unified multi-epoch point clouds suitable for fusion.
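The ICP refinement stage can be illustrated by the closed-form rigid-transform estimate (Kabsch/SVD) that ICP applies at each iteration once correspondences are fixed. This is a generic sketch of that core step, not the exact implementation used in the workflow.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src -> dst:
    the per-iteration update inside ICP, given fixed correspondences."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)      # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Sanity check: recover a known rotation and translation.
rng = np.random.default_rng(1)
cloud = rng.uniform(-1, 1, size=(200, 3))
theta = np.radians(10)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
t_true = np.array([0.2, -0.1, 0.05])
moved = cloud @ R_true.T + t_true
R, t = best_rigid_transform(cloud, moved)
residual = np.linalg.norm(cloud @ R.T + t - moved)
```

In practice the RANSAC stage supplies the coarse transform, after which repeated nearest-neighbor correspondence plus this update converges to the millimeter-level alignment reported above.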
For occlusion handling, we project panoramas into multiple perspective views, apply the Segment Anything Model (SAM) to detect nonartifact occluders, and reproject the resulting masks to both the images and the point clouds. This mask-reprojection pipeline removes personnel, tools, tarps, and other dynamic objects from the imagery and point sets, obviating the need for site clearance during acquisition and reducing reconstruction artifacts. Quantitatively, masking is evaluated by masked-area ratios in orthophotos and point-removal ratios in the point clouds; in our experiments, masking significantly reduced spurious points and lowered correspondence residuals during registration.
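The mask re-projection step can be sketched as a pinhole projection: each 3D point is projected into the perspective view, and points whose pixels fall inside an occluder mask are dropped. The intrinsics and mask below are illustrative, not taken from the deployed system.

```python
import numpy as np

def filter_points_by_mask(points, K, mask):
    """Drop 3D points (camera frame, z > 0) whose projections land on
    masked (occluder) pixels. K: 3x3 intrinsics; mask: HxW bool array."""
    pts = np.asarray(points, float)
    z = pts[:, 2]
    uvw = pts @ K.T                          # pinhole projection
    u = np.round(uvw[:, 0] / z).astype(int)
    v = np.round(uvw[:, 1] / z).astype(int)
    h, w = mask.shape
    inside = (z > 0) & (0 <= u) & (u < w) & (0 <= v) & (v < h)
    keep = np.ones(len(pts), bool)
    keep[inside] = ~mask[v[inside], u[inside]]  # reject masked pixels
    return pts[keep]

# Illustrative intrinsics and a mask covering the left image half.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
mask = np.zeros((480, 640), bool)
mask[:, :320] = True
pts = np.array([[-0.5, 0.0, 2.0],   # projects left of center -> masked
                [ 0.5, 0.0, 2.0]])  # projects right of center -> kept
kept = filter_points_by_mask(pts, K, mask)
```

Applying the same mask in image space and in the projected point cloud keeps the geometric and radiometric products consistent after occluder removal.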
From the unified and cleaned point clouds, we generate a DSM via spatial interpolation (we used IDW for a robust balance of accuracy and efficiency) and produce an untextured surface mesh using Poisson reconstruction to obtain watertight geometry. Textures are then produced by projecting the synchronized panoramic images onto the DSM and textured mesh, yielding high-resolution orthophotos and textured 3D models. The pipeline produced 62 orthophotos and 62 three-dimensional models for the site. Geometric validation based on control points and tie-point comparisons indicates a mean relative coordinate discrepancy of approximately 0.79 cm, with a maximum of 2.42 cm, between reconstructed products, reflecting the internal geometric consistency of the orthophotos and 3D models. In addition, absolute accuracy assessment based on independently measured control points yielded an overall mean error of 1.05 cm, an RMSE of 1.12 cm, and a maximum error of 1.76 cm, indicating centimeter-level absolute geometric agreement with the site reference system.
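The DSM interpolation step can be illustrated with a minimal inverse-distance-weighting (IDW) interpolator over a regular grid. The power and tolerance parameters below are illustrative choices, not those used in the pipeline.

```python
import numpy as np

def idw_dsm(xy, z, grid_x, grid_y, power=2.0, eps=1e-12):
    """Interpolate scattered heights z at planimetric positions xy onto
    a regular grid using inverse-distance weighting (all neighbors)."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    cells = np.column_stack([gx.ravel(), gy.ravel()])
    # Distance from every grid cell to every sample point.
    d = np.linalg.norm(cells[:, None, :] - xy[None, :, :], axis=2)
    w = 1.0 / (d ** power + eps)        # inverse-distance weights
    dsm = (w @ z) / w.sum(axis=1)       # weighted mean height per cell
    return dsm.reshape(gy.shape)

# Illustrative scattered samples on a gentle slope z = 0.1 * x.
rng = np.random.default_rng(2)
xy = rng.uniform(0, 10, size=(100, 2))
z = 0.1 * xy[:, 0]
dsm = idw_dsm(xy, z, np.linspace(0, 10, 21), np.linspace(0, 10, 21))
```

Because IDW is a convex combination of observed heights, interpolated values never overshoot the sampled range, one reason it balances robustness and efficiency for DSM generation.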
Taken together, the implemented hardware-software solution demonstrates practical field readiness: it achieves rapid on-site acquisition, enables off-site processing to shorten artifact exposure time, and yields high-fidelity products that support reliable multi-temporal comparison and archeological interpretation. At the same time, performance remains conditioned by site geometry, dense occlusions under protective structures, and areas of low texture that reduce feature strength; these operational constraints define the practical bounds within which the workflow delivers the stated accuracy.
Data availability
The data used in this study were obtained from the Wuwangdun Digital Archaeology Project. These data are restricted and were used under license for the current study; therefore, they are not publicly available. However, data are available from the authors upon reasonable request and with permission from Wuhan University.
Code availability
The underlying code used and/or analyzed in the current study is available from the corresponding author upon reasonable request.
References
Bin, B. Religious beliefs as reflected in the funerary record. In Lagerwey, J. & Lü, P. (eds) Early Chinese Religion: The Period of Division (220–589), Vol. 2, 989–1073 (Brill, 2010).
Campbell, R., Jaffe, Y., Kim, C., Sturm, C. & Jaang, L. Chinese bronze age political economies: a complex polity provisioning approach. J. Archaeol. Res. 30, 69–116 (2022).
Demoule, J.-P. We still have to excavate—but not at any price. Archaeol. Dialogues 18, 5–10 (2011).
Orser Jr, C. E. Historical Archaeology (Routledge, 2016).
Roosevelt, C. H., Cobb, P. J., Moss, E., Olson, B. R. & Ünlüsoy, S. Excavation is destruction digitization: advances in archaeological practice. J. Field Archaeol. 40, 325–346 (2015).
Evangelista, L. E. S. Resting in Peace or in Pieces? Tomb I and Death Management in the 3rd Millennium BC at the Perdigões Enclosure (Reguengos de Monsaraz, Portugal). PhD thesis, Universidade de Coimbra (2017).
Baker, B. J. Confronting taphonomic challenges from excavation through curation of human remains. J. Archaeol. Method Theory 32, 1 (2025).
Davite, C. Fragility and Durability. Problems and Techniques of the Archaeological Conservation Campaigns 115–128. https://doi.org/10.1007/978-3-030-32754-5_6 (Springer International Publishing, 2020).
Richmond, A. & Bracker, A. L. Conservation: Principles, Dilemmas and Uncomfortable Truths (Routledge, 2009).
Manders, M. In situ preservation: 'the preferred option'. Mus. Int. 60, 31–41 (2008).
Wilkins, C. D. Preserving sacred sites of yesterday for tomorrow: the archaeological conservancy. In Lagerwey, J. & Lü, P. (eds) The Intersection of Sacredness and Archaeology 343–351 (Springer, 2024).
Hollesen, J. Climate change and the loss of archaeological sites and landscapes: a global perspective. Antiquity 96, 1382–1395 (2022).
Ray, N. & Nikolaus, J. Changing urban environments and the impact on coastal cultural heritage at Marsa Matruh, Egypt. J. Marit. Archaeol. 17, 445–464 (2022).
Goelet, O. Tomb robberies in the valley of the kings. In Wilkinson, R. H. & Weeks, K. R. (eds) The Oxford Handbook of the Valley of the Kings 448–466 (Oxford University Press, 2016).
Demoule, J.-P. Rescue archaeology: a European view. Annu. Rev. Anthropol. 41, 611–626 (2012).
Wang, S., Wang, Y., Hu, Q., Li, J. & Ai, M. Unmanned aerial vehicle and structure-from-motion photogrammetry for three-dimensional documentation and digital rubbing of the zuo river valley rock paintings. Archaeol. Prospect. 26, 265–279 (2019).
Lang, M., Hussein, R. & Kluge, P. The 3D digital documentation of shaft K24 in Saqqara. In Lucarelli, R., Roberson, J. A. & Vinson, S. (eds) Ancient Egypt, New Technology: The Present and Future of Computer Visualization, Virtual Reality and Other Digital Humanities in Egyptology 186–212 (Brill, 2023).
Wang, S., Hu, Q., Wang, S., Ai, M. & Zhao, P. Archaeological site segmentation of ancient city walls based on deep learning and lidar remote sensing. J. Cult. Herit. 66, 117–131 (2024).
Niu, W. et al. Dynamically documenting archaeological excavations based on 3D modeling: a case study of the excavation of the #3 fossil of hominin cranium from Yunxian, Hubei, China. Herit. Sci. 12, 279 (2024).
Wong, L. & Santana Quintero, M. Tutankhamen’s two tombs: replica creation and the preservation of our cultural heritage in the digital age. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci 42, 1145–1150 (2019).
Alshawabkeh, Y. & Baik, A. Integration of photogrammetry and laser scanning for enhancing scan-to-hbim modeling of Al Ula heritage site. Herit. Sci. 11, 147 (2023).
Masciotta, M. G., Sanchez-Aparicio, L. J., Oliveira, D. V. & Gonzalez-Aguilera, D. Integration of laser scanning technologies and 360° photography for the digital documentation and management of cultural heritage buildings. Int. J. Archit. Herit. 17, 56–75 (2023).
Croce, V. et al. Comparative assessment of neural radiance fields and photogrammetry in digital heritage: impact of varying image conditions on 3D reconstruction. Remote Sens. 16, 301 (2024).
Ulvi, A. Documentation, three-dimensional (3D) modelling and visualization of cultural heritage by using unmanned aerial vehicle (UAV) photogrammetry and terrestrial laser scanners. Int. J. Remote Sens. 42, 1994–2021 (2021).
Qiang, Y. The exploration of Chu capitals. PhD thesis, Alma Mater Studiorum – Università di Bologna, Bologna (2021).
Chen, J. State formation and bureaucratization: evidence from pre-imperial China. J. Econ. Hist. 84, 690–726 (2024).
Yang, Y. Taotie, dragon, phoenix, and farmer: a highly decorated Qin excavated from Jiuliandun. Early China 38, 129–150 (2015).
Luo, Z. et al. Scientific analysis and research on the Warring States bronze mirrors unearthed from Changsha Chu Cemetery, Hunan Province, China. Archaeometry 64, 1187–1201 (2022).
Meyer, D. Texts, textual communities, and meaning: the genius loci of the Warring States Chu Tomb Guodian One. Asiat. Stud./Études Asiat. 63, 827–856 (2009).
Kirillov, A. et al. Segment anything. In Proc. of the IEEE/CVF International Conference on Computer Vision 4015–4026 (2023).
Kazhdan, M., Bolitho, M. & Hoppe, H. Poisson surface reconstruction. In Proc. Fourth Eurographics Symposium on Geometry Processing (Eurographics Association, 2006).
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2024YFB3908900), the National Natural Science Foundation of China (No. 42371439), and the Hubei Provincial Natural Science Foundation of China (No. 2025AFB434).
Author information
Authors and Affiliations
Contributions
W.W.O. and Q.W.H. designed the study and wrote the original draft. S.H.W. provided guidance on manuscript preparation. P.C.Z., S.L.W., and M.Y.A. conducted data analysis and contributed to the writing and editing. Q.W.H. and X.J.Z. acquired the funding. S.W.H. participated in the manuscript review. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ou, W., Hu, Q., Wang, S. et al. Dynamically documenting archaeological excavations based on a portable LiDAR-camera mapping system: the Wuwangdun full-process recording example. npj Herit. Sci. 14, 237 (2026). https://doi.org/10.1038/s40494-026-02515-6
Received:
Accepted:
Published:
Version of record:
DOI: https://doi.org/10.1038/s40494-026-02515-6