Introduction

With the advancement of geological sciences, the digital representation of outcrops has become an indispensable part of geological research. Outcrops, which are natural surface exposures of rock layers, provide critical information about geological structures, depositional environments, and rock compositions 1,2,3. In the past, geologists relied primarily on field surveys to obtain this information. Fieldwork yields detailed raw data, but its application is limited by restricted data collection areas, operational risks in the field, and difficulties in preserving and sharing data 4,5.

In recent years, the rapid development of drone-based oblique photogrammetry, 3D visualization tools, and deep learning algorithms has significantly improved the methods available for studying geological outcrops. Drone oblique photogrammetry enables high-precision data collection, generating accurate 3D models from high-resolution images captured at multiple angles 6,7,8,9,10. Visualizing drone-captured 3D outcrops is not in itself new: several commercial and research tools already exist, such as VRGS (Virtual Reality Geological Studio), Stratbox, and Lime. These tools are powerful, but each has limitations. For example, VRGS focuses more on immersive experiences, while Stratbox specializes in analyzing rock layers within depositional environments. In comparison, the system proposed in this paper is web-based, which makes it platform-independent, installation-free, and capable of real-time operation. It not only enables high-precision 3D outcrop modeling but also integrates deep learning for automatic rock type recognition, which is particularly useful when dealing with complex geological structures 11,12,13,14,15.

Moreover, Cesium, an open-source 3D geospatial engine, has been widely used in various fields, especially for visualizing geological data 16,17,18,19,20. With Cesium, researchers can interactively observe and analyze 3D geological structures in a virtual environment. However, despite Cesium’s powerful 3D visualization capabilities, accurately reproducing complex geological structures remains challenging due to issues such as data precision, completeness, the accuracy of 3D modeling, and the effective representation of intricate geological features 21,22,23. In this context, this paper develops a geological outcrop characterization platform that combines drone oblique photogrammetry and Cesium 3D visualization. The platform utilizes a deep learning model to automate rock feature analysis, enhancing the accuracy and efficiency of geological research.

As an important tool in image recognition, deep learning has been widely applied to the task of automatic rock classification. In particular, the VGG19 model, with its deep convolutional structure, demonstrates outstanding performance in handling complex images 24,25. Compared to other common deep learning models like YOLO and ResNet, VGG19 excels in recognizing rock textures and details, as its multi-layer convolutional kernels capture fine features at different scales. Therefore, this paper adopts the VGG19 model for automatic identification of rock thin sections. However, thin section analysis alone cannot fully reveal the macroscopic structure of outcrops. Virtual outcrop technology addresses this deficiency. By integrating the results of thin section analysis with virtual outcrop models, this paper proposes a solution for collaborative applications, allowing for the macroscopic display of outcrop spatial distribution while accurately identifying rock types at the microscopic level.

Although several studies have already applied deep learning to rock classification 26,27,28,29,30, demonstrating its widespread use in image recognition, this paper further improves the VGG19 model by optimizing its accuracy, detail recognition, and efficiency. By integrating it with Cesium, this study achieves a collaborative application of virtual outcrops and deep learning, enabling geologists to not only observe the 3D structure of outcrops but also obtain specific lithological information for each area. This significantly enhances the precision and efficiency of geological research.

Given the above background, this paper will detail this new technical approach and showcase its practical application through a case study of the Yiqikelike section. It aims to address the limitations of traditional geological exploration methods and improve the accuracy and usability of geological data. Specific objectives include:

  a) Utilizing drone oblique photogrammetry and Cesium 3D visualization technology to create accurate digital models of geological structures, expanding the scope of geological data acquisition and reducing the risks associated with field exploration.

  b) Applying the VGG19 deep learning algorithm to optimize lithology identification, improving the efficiency and accuracy of core thin section data analysis and accelerating geological information processing and interpretation.

  c) Exploring new methods for integrating Cesium with VGG19 in geological exploration, to promote the further development of geological science research, offering new perspectives and tools for analyzing Earth’s history and its processes of change, thereby advancing both research and practice in the field of geological sciences.

Related technologies and algorithms

UAV oblique photography technology

In the challenging field of geological exploration, the introduction of Unmanned Aerial Vehicle Tilt Photography technology (UTP) has not only significantly enhanced the efficiency of data collection but also greatly improved the quality and safety of exploration processes 31,32. This technology utilizes high-resolution cameras carried by drones to capture images of the Earth’s surface from multiple angles. Through a series of complex post-processing steps, such as multi-view image joint adjustment, dense matching of multi-view images, point cloud construction, and texture mapping, it generates high-precision three-dimensional models. These models are crucial for accurately understanding the geological structures of both the surface and subsurface, especially in situations like mineral development and disaster monitoring where rapid assessment and response are needed.

UTP is particularly suited for areas that are difficult to access or unsafe for human entry, such as mountains, canyons, or other harsh environments 33. By using drones for exploration, key data can be obtained without direct contact, significantly reducing the risk of personnel casualties and health hazards. Moreover, the use of drones substantially lowers the direct costs of geological exploration projects, as it reduces the reliance on traditional ground equipment and manpower while increasing the speed and frequency of data collection.

UTP demonstrates immense potential in the assessment of mineral resources. Tilt photography not only identifies surface mineral indications but also addresses occlusions and distortions through multi-view image joint adjustment. It can create detailed three-dimensional models of the Earth’s surface, aiding in the construction of ore body models and providing more accurate resource estimates. Dense matching of multi-view images further enhances the precision of image matching, while point cloud construction and texture mapping steps transform these data into digital terrain models with detailed layers. This results in the final three-dimensional models not only accurately reflecting the shape of terrain and features but also possessing realistic surface textures (Fig. 1).

Fig. 1. Roadmap of Unmanned Aerial Vehicle Tilt Photography Technology.

Cesium digital outcrop characterization technology

The development of Cesium technology began in 2011, initially as a web-based 3D mapping engine. With the popularization of WebGL technology, Cesium started to support the rendering of more complex 3D terrain and building models. In the field of geology, particularly in creating 3D digital models of geological outcrops, Cesium has demonstrated its unique advantages 34,35.

The process of creating a Cesium digital outcrop begins with collecting geological data, including terrain and geological structures, usually through remote sensing and ground surveys. These data are then converted into the 3D Tiles format and assembled into three-dimensional geological models on the Cesium platform. The models include terrain elevation information and also present the color and texture of rock layers 36,37. By combining 3D visualization with geological data processing, this technology enables geologists to intuitively observe and analyze complex geological structures, such as rock layers, faults, and folds, in a virtual environment. Cesium is currently applied in geological exploration, geological education, and environmental science, helping students and professionals better understand geological structures and processes. Challenges remain in data precision, large-scale data handling, and model realism. Even so, Cesium digital outcrop characterization, supported by advances in big data and artificial intelligence, improves the efficiency and accuracy of geological structure analysis and opens new pathways for teaching, research, and practical application in geology 38,39. It is expected to play an increasingly important role in the future of geological sciences (Fig. 2).
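As a rough illustration of what the 3D Tiles conversion step produces, the following sketch builds a minimal `tileset.json` for a single tile around the study-area coordinates given later in this paper. The helper name `make_tileset`, the content file name `outcrop.b3dm`, the half-extent, the height range, and the geometric error values are all placeholders assumed for illustration; they are not taken from the platform described here.

```python
import json
import math


def make_tileset(lat_deg, lon_deg, half_extent_deg, min_h, max_h, content_uri):
    """Build a minimal 3D Tiles 1.0 tileset with a single root tile."""
    # A 'region' bounding volume is [west, south, east, north] in radians,
    # followed by the minimum and maximum height in metres.
    region = [
        math.radians(lon_deg - half_extent_deg),
        math.radians(lat_deg - half_extent_deg),
        math.radians(lon_deg + half_extent_deg),
        math.radians(lat_deg + half_extent_deg),
        min_h,
        max_h,
    ]
    return {
        "asset": {"version": "1.0"},
        "geometricError": 500.0,  # placeholder value
        "root": {
            "boundingVolume": {"region": region},
            "geometricError": 0.0,  # leaf tile: no further refinement
            "refine": "REPLACE",
            "content": {"uri": content_uri},  # hypothetical file name
        },
    }


# Extent (0.01 degrees) and heights (1500-2500 m) are assumed values.
tileset = make_tileset(42.13605, 83.39018, 0.01, 1500.0, 2500.0, "outcrop.b3dm")
print(json.dumps(tileset, indent=2))
```

A real outcrop model would normally be split into a hierarchy of child tiles with decreasing geometric error, which is what allows the viewer to stream only the detail needed for the current camera position.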

Fig. 2. Cesium Architecture Diagram.

VGG19 lithology identification algorithm

VGG19 is a deep Convolutional Neural Network (CNN) that has shown strong performance in image recognition and has been applied successfully to lithology identification 40,41,42. The core of the algorithm is its deep convolutional structure, a stack of convolutional layers, activation functions (specifically ReLU), pooling layers, and fully connected layers. In the convolutional layers, VGG19 extracts local image features at different depths through multiple convolutional kernels, gradually constructing a comprehensive feature map. The ReLU activation function introduces non-linearity by setting negative responses to zero while passing positive responses unchanged, which enhances the model’s ability to represent complex image features. The pooling layers downsample the feature map, which reduces computational cost and the number of parameters in later layers and helps the model tolerate minor variations in images. The fully connected layers at the end of the network combine the extracted features and make the final decision through a classifier. The design philosophy of VGG19 is to increase expressiveness through network depth, which suits images of rocks with complex structures and rich textures 43,44. With these feature extraction and classification capabilities, VGG19 has become an important tool for lithology identification and provides technical support for digital exploration in geology and related fields. Among its components, the convolutional layer is the most crucial, as its primary function is to extract features from images. The convolution operation can be represented by the following formula:

$$(I * K)(i,j) = \sum_{m} \sum_{n} I(m,n) \cdot K(i - m,j - n)$$
(1)

\(I\) represents the input image.

\(K\) represents the convolution kernel (or filter).

\((i,j)\) represents the pixel position in the image.

\(m\) and \(n\) are the summation indices, which run over the pixel positions of the input image that the kernel overlaps.
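To make the indices in Eq. (1) concrete, here is a minimal sketch in plain Python whose loops mirror the double sum term by term. The function name `conv2d_full` and the small test arrays are illustrative assumptions. Kernel entries outside the kernel's support are treated as zero, so the result is the "full" convolution.

```python
def conv2d_full(image, kernel):
    """Discrete 2D convolution following Eq. (1):
    out[i][j] = sum_m sum_n image[m][n] * kernel[i - m][j - n].
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * (w + kw - 1) for _ in range(h + kh - 1)]
    for i in range(h + kh - 1):
        for j in range(w + kw - 1):
            total = 0.0
            for m in range(h):
                for n in range(w):
                    # Only terms where (i - m, j - n) lies inside the kernel count.
                    if 0 <= i - m < kh and 0 <= j - n < kw:
                        total += image[m][n] * kernel[i - m][j - n]
            out[i][j] = total
    return out


image = [[1, 2], [3, 4]]
kernel = [[0, 1], [2, 3]]
result = conv2d_full(image, kernel)
print(result)  # [[0.0, 1.0, 2.0], [2.0, 10.0, 10.0], [6.0, 17.0, 12.0]]
```

Deep learning frameworks usually compute the closely related cross-correlation (the kernel is not flipped) and keep only the positions where the kernel fits entirely inside the image. Because the kernel weights are learned, the two conventions are equivalent in practice.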

In VGG19, the ReLU (Rectified Linear Unit) is the activation function used to introduce non-linearity into the network. The purpose of the ReLU function is to set all negative values to zero while retaining positive values. Its formula is expressed as:

$$f(x) = \max (0,x)$$
(2)

\(x\) represents the output of the convolutional layer.

The pooling layer is used to reduce the spatial dimensions of the feature map, decreasing the number of parameters and computational load. The most commonly used type of pooling is max pooling, which can be represented as:

$$P(i,j) = \max_{k,l \in [0,M - 1]} I(i + k,j + l)$$
(3)

\(P\) represents the output after pooling.

\(I\) represents the input feature map.

\(M\) represents the size of the pooling window.

\((i,j)\) represents the pixel position.
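Equations (2) and (3) can be sketched together in a few lines of plain Python. The `stride` parameter is an addition not present in Eq. (3): with `stride=1` the function follows the formula literally, while `stride=size` gives the non-overlapping windows commonly used in VGG-style networks. The function names and the sample feature map are illustrative assumptions.

```python
def relu(feature_map):
    """Eq. (2): set negative values to zero and keep positive values."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size, stride):
    """Eq. (3): take the maximum over each size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            window = [
                feature_map[i + k][j + l]
                for k in range(size)
                for l in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


fmap = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.0, 0.0],
    [0.0, 1.5, -0.5, 2.5],
    [-4.0, 3.0, 1.0, -1.0],
]
activated = relu(fmap)
pooled = max_pool(activated, size=2, stride=2)
print(pooled)  # [[4.0, 1.0], [3.0, 2.5]]
```

The 4 x 4 input shrinks to 2 x 2, which shows how pooling reduces the spatial dimensions passed on to later layers.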

The fully connected layer is located at the end of the network and is primarily used for classification tasks. The mathematical representation of a fully connected layer is:

$$y = Wx + b$$
(4)

\(y\) is the output vector.

\(W\) is the weight matrix.

\(x\) is the input vector.

\(b\) is the bias vector.

In VGG19, the output of the last fully connected layer is classified using the softmax function, whose formula is:

$$\sigma (z)_{j} = \frac{e^{z_{j}}}{\sum_{k} e^{z_{k}}}$$
(5)

\(z\) is the input vector.

\(j\) and \(k\) are the indices of classes.

\(\sigma (z)_{j}\) is the predicted probability for the \(j\)-th class.
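The fully connected layer of Eq. (4) followed by the softmax of Eq. (5) can be sketched as follows. The weights, bias, and input are small made-up values chosen only to keep the arithmetic easy to follow. Subtracting the largest logit before exponentiating is a standard numerical-stability step that leaves the result of Eq. (5) unchanged.

```python
import math


def fully_connected(weights, x, bias):
    """Eq. (4): y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(z):
    """Eq. (5): exponentiate each logit and normalise to sum to one."""
    shift = max(z)  # cancels in the ratio; avoids overflow for large logits
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative values: 2 input features, 3 output classes.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -1.0]
x = [2.0, 1.0]

logits = fully_connected(W, x, b)
probs = softmax(logits)
print(logits)  # [2.0, 1.0, 2.0]
print([round(p, 4) for p in probs])
```

Classes with equal logits receive equal probability, and the predicted class is the index of the largest entry in `probs`.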

Through layers of convolution, non-linear activation, pooling, and fully connected operations, VGG19 can extract deep features from original images, effectively applying to lithology identification, classification, and other visual tasks.

Algorithm improvement and integration method

Algorithm improvement strategies

This study makes a series of improvements to the standard VGG19 model to enhance its performance in core lithology identification, especially for rock images with complex components and details. The improvements include refined adjustments to the convolutional layers, feature decomposition and recombination, the application of reinforcement learning algorithms, and a strengthened attention mechanism 45. The improved model can be written as:

$$O(F(P(C\prime (x;W_{c} ,b_{c} ));W_{f} ,b_{f} ))$$
(6)

\(C\prime\) represents the convolutional layer adjusted by the attention mechanism.

\(P\) represents the pooling layer.

\(F\) represents the fully connected layer.

\(O\) represents the output layer.

\(W_{c} ,b_{c}\), \(W_{f} ,b_{f}\) are parameters obtained through transfer learning.

The structure of VGG19’s convolutional layers was adjusted to capture the fine structure and complex texture of the rock surface in greater depth. Multi-scale processing was introduced, applying filters of different sizes at various levels to capture both the macro and micro details of rock features. Feature decomposition and recombination were also introduced into the network: the features of rock images are decomposed into different components, which are then recombined and analyzed through sub-networks. As a result, the model can identify not only the overall features of rocks but also their individual components, such as different minerals and rock structures. Reinforcement learning algorithms were applied to optimize the performance of VGG19, enabling the model to predict and analyze the various components in rock images more accurately. During training, the model continuously adjusts its parameters to maximize the accuracy of component identification. The basic structure can be represented as:

$${\text{Convolutional layer}}:\;C(x;W_{{\text{c}}} ,b_{{\text{c}}} ) = ReLU(W_{{\text{c}}} * x + b_{{\text{c}}} )$$
(7)
$${\text{Pooling layer}}:\;P(C) = MaxPool(C)$$
(8)
$${\text{Fully connected layer}}:F(P;W_{f} ,b_{f} ) = W_{f} P + b_{f}$$
(9)
$${\text{Output layer }}\left( {{\text{Softmax}}} \right):\;O(F) = Softmax(F)$$
(10)

\(x\) represents the input image.

\(W_{c} ,b_{c} ,W_{f} ,b_{f}\) respectively represent the weights and biases of the convolutional and fully connected layers.

\(C,P,F,O\) respectively represent the outputs of the convolutional, pooling, fully connected, and output layers.

The attention mechanism in the VGG19 model was strengthened to focus more on key component features in rock images, adjusting the weights of the output of the convolutional layers, which can be represented as:

$$C\prime (x;W_{c} ,b_{c} ) = A(C(x;W_{c} ,b_{c} )) \cdot C(x;W_{c} ,b_{c} )$$
(11)

\(A( \cdot )\) represents the attention module, which maps the convolutional output to a set of weights.
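The reweighting in Eq. (11) can be illustrated with a toy example. The form of the attention module is not specified here, so the sketch below assumes a simple stand-in: each position is gated by a sigmoid of its distance from the feature-map mean. This choice, and the names `attention_weights` and `apply_attention`, are hypothetical and are not the module used in this study.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def attention_weights(feature_map):
    """Hypothetical A(.): weight each position by how far it exceeds the map mean."""
    values = [v for row in feature_map for v in row]
    mean = sum(values) / len(values)
    return [[sigmoid(v - mean) for v in row] for row in feature_map]


def apply_attention(feature_map):
    """Eq. (11): C' = A(C) * C, applied element by element."""
    weights = attention_weights(feature_map)
    return [
        [a * c for a, c in zip(w_row, c_row)]
        for w_row, c_row in zip(weights, feature_map)
    ]


C = [[0.0, 1.0], [2.0, 5.0]]
C_prime = apply_attention(C)
for row in C_prime:
    print([round(v, 3) for v in row])
```

Strong responses keep most of their magnitude while weak ones are suppressed, which is the qualitative behavior Eq. (11) describes. A learned attention module would replace the fixed gating rule with trainable parameters.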

By assigning higher weights to specific areas in the image, areas containing key lithological component information, such as specific minerals or rock textures, are emphasized. The improvements to the VGG19 model in this study make it more suitable for complex lithological component analysis tasks (Fig. 3). The comprehensive application of these strategies has improved the model’s performance in identifying the microscopic features and components of rocks, especially when dealing with rock images with complex components and details. These improvements not only enhance the accuracy of model recognition but also provide new directions and ideas for the application of deep learning in the geological field. With the advancement of technology and the availability of more data, these methods are expected to play an important role in future lithological analysis.

Fig. 3. Model structure diagram.

Integrated methodology research

In practical research, an innovative integrated framework was designed to combine the 3D visualization capabilities of Cesium with the image recognition technology of VGG19, specifically targeting the recognition of cast thin section images of core samples. The essence of this framework lies in effectively integrating two main functionalities: on one hand, utilizing Cesium’s 3D visualization interface to display and manage cast core thin section images; on the other hand, employing VGG19 to perform deep learning analysis on the uploaded thin section images for identifying rock characteristics.

In operation, users can upload local core thin section images through Cesium’s user interface. These images are then automatically forwarded to the backend VGG19 model for processing and classification. The core of the integrated system is efficient data fusion. Initially, when users upload cast core thin section images, the system automatically performs preprocessing, including image normalization, size adjustment, and format conversion, to ensure the images meet the input requirements of the VGG19 model. Once the VGG19 model completes the classification of the rock samples, the identification results of each sample are stored as labels and linked to their corresponding original images. This information is then transmitted back to the Cesium interface, allowing users to visually explore the specific location and lithology identification results of each cast core thin section in the 3D view. This method enables users to not only obtain detailed lithological information about each sample but also explore the spatial relationships between samples.
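One plausible way to hand the linked results back to the 3D view is as GeoJSON point features, a format CesiumJS can load as a data source. The sketch below is an assumption about the data exchange, not the platform's actual interface: the function name, file names, lithology labels, confidence values, and sample positions are hypothetical placeholders.

```python
import json


def samples_to_geojson(samples):
    """Link each classified thin section to its sampling position."""
    features = []
    for sample in samples:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON positions are ordered longitude, latitude, height.
                "coordinates": [sample["lon"], sample["lat"], sample["height"]],
            },
            "properties": {
                "image": sample["image"],
                "lithology": sample["lithology"],
                "confidence": sample["confidence"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical classification results near the study-area coordinates.
samples = [
    {"image": "section_001.png", "lithology": "sandstone", "confidence": 0.97,
     "lon": 83.39018, "lat": 42.13605, "height": 1800.0},
    {"image": "section_002.png", "lithology": "limestone", "confidence": 0.91,
     "lon": 83.39100, "lat": 42.13650, "height": 1812.0},
]

collection = samples_to_geojson(samples)
print(json.dumps(collection, indent=2))
```

Keeping the image name in the feature properties preserves the link between each label and its original image, so a click on a point in the viewer can show both the thin section and its predicted lithology.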

To enhance the performance of the integrated system, a series of optimization measures were implemented. In the image processing workflow, a batch processing mechanism was developed, allowing the system to handle multiple uploaded images simultaneously, thus improving processing efficiency. For the VGG19 model, specialized hardware acceleration techniques were utilized to ensure rapid and accurate image analysis. In terms of the user interface, the interaction process was carefully designed to ensure users could easily upload images and quickly obtain analysis results. Moreover, considering different users might have varying network bandwidths, the image upload and data transmission processes were optimized to reduce waiting time and improve user experience.
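The batch processing mechanism can be pictured as grouping uploaded images into fixed-size chunks before they are passed to the model. The following sketch shows only that grouping step. The helper name `iter_batches`, the batch size of 4, and the file names are illustrative assumptions.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Ten hypothetical uploads grouped into batches of four.
uploads = [f"section_{i:03d}.png" for i in range(10)]
batches = list(iter_batches(uploads, 4))

for index, batch in enumerate(batches):
    print(index, batch)
print([len(batch) for batch in batches])  # [4, 4, 2]
```

The last batch may be smaller than the others, so any downstream code should not assume a fixed batch length.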

By combining Cesium’s 3D visualization technology and VGG19’s deep learning capabilities, the integrated framework not only enhances the accuracy and efficiency of lithology identification but also provides users with an interactive and intuitive platform for in-depth exploration and analysis of rock samples (Fig. 4). This integrated approach brings new tools to geological science research and practice, helping to more comprehensively understand and analyze geological samples and offering a new, efficient means of analysis for the field of geology.

Fig. 4. Multi-source integration method diagram.

Practical application and results analysis

Overview of the study area

The Yiqikelike section is located at the southern foothills of the Tianshan Mountains in Kuqa City, Aksu Region, Xinjiang Uyghur Autonomous Region, China (coordinates: 42.13605°N, 83.39018°E). The area features complex and unique geological structures, making it a significant region for geological studies 46,47. The section is known for its rich variety of rock types and mineral compositions, which provide valuable information about the region’s geological history and tectonic evolution. Geographically, the area is part of the Tianshan Mountain range and is characterized by diverse landscapes of mountains, basins, and rivers. Its long geological history includes multiple tectonic movements and magmatic activities, which produced complex geological structures. The section exposes rock layers from the Paleozoic to the Cenozoic, covering a wide range of rock types and mineral combinations. The sedimentary rocks primarily include sandstone, shale, and limestone, recording sedimentary environments and climate changes from different geological periods. Igneous rocks, mainly granite and basalt, reflect the region’s magmatic history. Metamorphic rocks, such as gneiss and schist, reveal the metamorphic processes of the deeper crust. The rocks also record a range of sedimentary environments from marine to terrestrial, documenting the regression of ancient oceans and the uplift of land. These rock layers contain abundant fossils, including microfossils, which provide important evidence for studying paleoecology and paleoclimate. In-depth study of this area can therefore provide valuable information for understanding the Earth’s history, tectonic evolution, and resource potential. The rocks and minerals of this area are crucial for understanding the geological structure and evolution of the Tianshan region and the broader Central Asian area 48,49,50,51.

This research performs a detailed lithological analysis of the Yiqikelike section by combining Cesium’s 3D visualization technology with an improved VGG19 deep learning model, with the aim of enhancing the accuracy and efficiency of lithological identification. Cesium’s 3D mapping technology was used to create a detailed digital model of the section. By mapping and displaying geological structures, Cesium enables researchers to intuitively observe the distribution and characteristics of rock layers (Fig. 5). This visualization provides guidance for the collection of rock samples and enhances the understanding of the region’s geological history. Integrating Cesium’s 3D model with the improved VGG19 model offers a novel approach to lithology identification, allowing more accurate interpretation of the composition and structure of rocks and revealing the complex geological history and tectonic activities of the region. These results are significant for predicting future geological changes and for resource development.

Fig. 5. Research area map. a. Schematic diagram of the study area (https://map.tianditu.gov.cn/); b. Structural map of the Kuqa Depression; c. Schematic of sedimentary environmental structures.

Cesium digital outcrop exploration study

The digital outcrop exploration study of the Yiqikelike section uses Cesium to reproduce three-dimensional models of geological structures, providing a unique perspective for observing and analyzing the distribution, structure, and historical evolution of rock layers. The process began with extensive geological mapping and data collection in the area using drone-based oblique photography, covering terrain, rock layer structures, fault distributions, and other features. The modeling process then generated a three-dimensional view of the section that incorporates the color, texture, and spatial distribution of the rock layers. With the detailed Cesium digital outcrop created, the research team was able to conduct a more in-depth analysis of the geological features of the section. This includes but is not limited to:

  a) Rock Layer Distribution and Structural Analysis: Using the Cesium model, researchers can visually observe the distribution and structure of rock layers, identifying fault lines, folds, and other structural features.

  b) Historical Evolution Study: The model provides important clues about the formation and evolution of rocks, aiding researchers in reconstructing the geological history of the area.

  c) Resource Assessment: By analyzing the types and distributions of rock layers, a preliminary assessment of potential mineral resources can be made.

Although Cesium boasts powerful 3D visualization capabilities, significant challenges remain when it comes to accurately reconstructing complex geological structures. These challenges include data precision and completeness, the accuracy of 3D modeling, and the effective representation of intricate geological features. To overcome these challenges, this study adopts several innovative approaches.

First, by employing advanced geological surveying technologies such as drone oblique photogrammetry and laser scanning, we ensure high precision and data completeness. Multi-angle imaging and data fusion significantly enhance the detail and integrity of geological models. Second, we developed new data processing algorithms capable of handling large-scale complex geological datasets. These algorithms remove noise, correct data biases, and automatically fill in missing data sections, further improving the accuracy of 3D modeling. Additionally, our research team worked closely with geological experts to calibrate the models using expert knowledge, ensuring that the models accurately reflect complex geological features. Particularly in the reconstruction of faulting and folding, expert input was used to fine-tune parameters, enhancing the model’s ability to represent complex geological characteristics.

By using Cesium for the digital outcrop exploration of the Yiqikelike section, this study not only provides a new perspective for observing and analyzing the geological structures of this complex region, but also demonstrates the immense potential of 3D mapping technologies in geological sciences. The application of this technology not only advances geologists’ understanding of Earth’s history but also promotes the efficient development and management of geological resources (see Fig. 5). With further technological advancements and applications, Cesium is expected to play an increasingly important role in future geological research and practice (Fig. 6).

Fig. 6. Digital outcrop characterization diagram. a. Digital outcrop characterization map; b. Reverse observation map; c. Detail distribution map.

Application study of VGG19 lithology identification algorithm

In this study, the VGG19 model, a deep convolutional neural network, was employed to analyze rock thin section images from the Yiqikelike section in Xinjiang. The approach processes a large-scale dataset of rock thin section images and performs lithology classification on that basis. The dataset consisted of two parts: rock thin section images from publicly available online databases and rock core samples collected from the Yiqikelike section. It included 1296 images covering 28 types of sedimentary rock, 1724 images covering 40 types of igneous rock, and 2057 images covering 40 types of metamorphic rock, for a total of 5077 images. To enhance the model’s ability to learn the distinguishing features of the various rock thin section images, the dataset was augmented to twice its original size by flipping, rotating, and scaling images. It was then randomly divided into training (80%), validation (10%), and test (10%) sets. Preprocessing such as size adjustment, color normalization, and noise reduction was then carried out to meet the input requirements of the VGG19 model.
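The dataset arithmetic described above can be checked with a short sketch. It uses placeholder records instead of real images, doubles the dataset to stand in for augmentation, and performs a seeded random 80/10/10 split. The rounding rule (truncating the training and validation sizes and assigning the remainder to the test set), the seed, and all names are assumptions, since the exact procedure is not stated.

```python
import random

# Image counts per rock class, as reported for the dataset.
COUNTS = {"sedimentary": 1296, "igneous": 1724, "metamorphic": 2057}


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle with a fixed seed, then cut into train/validation/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Each record is (rock class, index, is_augmented); no image data is involved.
originals = [(rock, i, False) for rock, n in COUNTS.items() for i in range(n)]
augmented = originals + [(rock, i, True) for rock, i, _ in originals]

train_set, val_set, test_set = split_dataset(augmented)
print(len(originals), len(augmented))                # 5077 10154
print(len(train_set), len(val_set), len(test_set))   # 8123 1015 1016
```

Fixing the seed makes the split reproducible, which matters when several models are compared on the same test set.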

To demonstrate the superiority of VGG19 in rock identification, this study conducted comparative experiments with other popular deep learning models, including YOLO and ResNet. Using the same dataset, we tested the performance of YOLO, ResNet, and VGG19 on rock image classification tasks.

According to the results, VGG19 was selected as the primary algorithm due to its advantages in rock identification. First, VGG19 achieved an accuracy of 99.2%, significantly outperforming YOLO (94.8%) and ResNet (96.5%), demonstrating its ability to classify and identify complex rock images more precisely. Additionally, VGG19 exhibited excellent detail recognition capabilities, capturing fine textures and structural features critical for lithology determination. Moreover, VGG19 excelled in handling complex rock structures, with its multi-layer convolutional kernels effectively extracting details at various scales. While VGG19’s processing speed was not as fast as YOLO’s real-time performance, it struck a better balance between accuracy and efficiency, making it suitable for rock classification tasks that require high precision. Therefore, considering accuracy, detail recognition, and processing speed, VGG19’s overall performance makes it a better choice than YOLO and ResNet for rock identification tasks.

| Model | Accuracy | Processing Speed (ms/image) | Detail Recognition Ability | Advantages | Disadvantages |
| --- | --- | --- | --- | --- | --- |
| VGG19 | 99.2% | 45 | High | Strong capability in recognizing rock details (texture, structure), high classification accuracy | Large model size, high computational complexity, longer training time |
| YOLO | 94.8% | 20 | Medium | Fast real-time detection, suitable for large-scale data processing and online analysis | Weaker detail classification ability, higher error rate |
| ResNet | 96.5% | 38 | High | Deep network structure is suitable for processing complex rock features, good classification performance | High computational resource consumption, relatively slower training and inference speed |
| Inception | 95.0% | 42 | Medium | Good adaptability to different types of images, efficient model structure | Inferior to VGG19 in recognizing rock texture and fine structures |
| DenseNet | 97.1% | 40 | High | Fewer model parameters, high recognition accuracy, suitable for large-scale image classification tasks | Complex model, longer training time |

Experiment 1:

We first conducted an experiment using the standard VGG19 model to train on the rock image dataset and obtained baseline performance metrics. The standard model’s classification accuracy was 95.7%, serving as a reference for subsequent model improvements.

Experiment 2:

A multi-scale processing module was introduced to process image details with convolutional kernels of varying scales simultaneously, capturing the micro features in rock images. The results showed that multi-scale processing significantly improved the ability to recognize fine textures, especially in classifying small rock features, increasing accuracy to 97.2%.
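
The idea of processing one input at several scales in parallel can be illustrated with a minimal pure-Python sketch. It is not the module used in this study: fixed mean (box) filters stand in for learned convolutional kernels, and the kernel sizes 1, 3, and 5 as well as the function names are assumptions chosen for illustration.

```python
def box_filter(image, k):
    """Mean filter with a k x k window (k odd) and zero padding."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx]
            out[y][x] = total / (k * k)
    return out


def multi_scale_features(image, kernel_sizes=(1, 3, 5)):
    """Run every scale on the same input and stack the responses per pixel."""
    branches = [box_filter(image, k) for k in kernel_sizes]
    h, w = len(image), len(image[0])
    return [[[b[y][x] for b in branches] for x in range(w)] for y in range(h)]


# A 5 x 5 patch with a single bright grain in the centre.
patch = [[0.0] * 5 for _ in range(5)]
patch[2][2] = 1.0
features = multi_scale_features(patch)
print(features[2][2])
```

Each pixel ends up with one response per scale: the smallest window preserves the isolated grain, while larger windows spread it over the neighbourhood, so fine and coarse texture cues are available side by side.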

Experiment 3:

An attention mechanism was added to improve the model’s ability to recognize key areas within images. This mechanism focused resources on processing the parts of images containing complex rock layers, effectively reducing misclassification. The results indicated that model accuracy further improved to 98.3%, with both precision and recall enhanced, reducing the error rate by approximately 2.1%.
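
The text does not specify which attention variant was used, so the following is only a generic sketch of the principle: each image region receives a score, the scores are normalized with a softmax, and the region features are combined according to the resulting weights. The function names, feature values, and scores are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(region_features, region_scores):
    """Weight each region's feature vector by its attention weight and sum."""
    weights = softmax(region_scores)
    dim = len(region_features[0])
    pooled = [
        sum(w * feats[i] for w, feats in zip(weights, region_features))
        for i in range(dim)
    ]
    return pooled, weights


# Three image regions with 2-D features; the second region scores highest.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 2.0, 0.0]
pooled, weights = attention_pool(features, scores)
print(weights)
print(pooled)
```

The region with the highest score dominates the pooled representation, which is the sense in which attention concentrates the model's capacity on the most informative parts of an image.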

Experiment 4:

We applied reinforcement learning algorithms to fine-tune the model, optimizing its generalization ability across different rock types. Reinforcement learning allowed the model to dynamically adjust parameters based on classification results, significantly improving performance across various rock categories. The final experiment showed that the model’s accuracy reached 99.2%, with greatly improved generalization, especially in tasks involving complex rock layer classification.

Through this series of experiments, the optimized VGG19 model improved classification accuracy by 3.5 percentage points over the standard model (from 95.7% to 99.2%). It also demonstrated higher robustness and efficiency in processing complex rock images. These improvements validate the effectiveness of the model enhancements, laying a solid foundation for its broader application in geological outcrop identification tasks.
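
The accuracy figures reported for the four experiments can be checked with a few lines of arithmetic. The sketch below uses only the reported accuracies; the relative error reduction is a derived quantity added here for illustration, and the variable names are arbitrary.

```python
# Accuracies (%) reported for the four experiments.
stages = [
    ("Standard VGG19", 95.7),
    ("+ multi-scale processing", 97.2),
    ("+ attention mechanism", 98.3),
    ("+ reinforcement-learning fine-tuning", 99.2),
]


def gains(stages):
    """Percentage-point gain of each stage over the previous one."""
    return [
        (name, round(acc - prev_acc, 1))
        for (_, prev_acc), (name, acc) in zip(stages, stages[1:])
    ]


step_gains = gains(stages)
total_gain = round(stages[-1][1] - stages[0][1], 1)

# Relative reduction in error rate (100 - accuracy) from baseline to final model.
baseline_error = 100 - stages[0][1]
final_error = 100 - stages[-1][1]
relative_error_reduction = round(
    100 * (baseline_error - final_error) / baseline_error, 1
)

for name, gain in step_gains:
    print(f"{name}: +{gain} points")
print(f"total: +{total_gain} points, error cut by {relative_error_reduction}%")
```

Expressing the improvement both in percentage points and as a relative drop in error rate avoids ambiguity, since a gain of a few points near 100% accuracy corresponds to a large share of the remaining errors.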

In this research, special attention was given to two core evaluation metrics: the loss function and accuracy. The loss function measures the difference between model predictions and actual labels and guides weight adjustment during the training of deep learning models. Cross-entropy loss, the standard objective for multi-class classification, was used; where classes are imbalanced, it can be weighted per class. Accuracy, the most intuitive performance indicator, reflects the model’s overall effectiveness in classification tasks. In lithology identification, high accuracy means that the model reliably assigns rock images to the correct categories.
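
As a minimal illustration of the two metrics, the sketch below computes the mean cross-entropy loss, that is, the average of -log(p) over the probability p assigned to each sample's true class, together with the classification accuracy. The probabilities and labels are hypothetical and are not taken from the study's dataset.

```python
import math


def cross_entropy(probabilities, labels):
    """Mean negative log-probability assigned to the true class."""
    eps = 1e-12  # guards against log(0)
    losses = [-math.log(max(p[y], eps)) for p, y in zip(probabilities, labels)]
    return sum(losses) / len(losses)


def accuracy(probabilities, labels):
    """Fraction of samples whose most probable class matches the label."""
    hits = sum(
        1
        for p, y in zip(probabilities, labels)
        if max(range(len(p)), key=p.__getitem__) == y
    )
    return hits / len(labels)


# Hypothetical softmax outputs for four images over three rock classes.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.5, 0.4, 0.1],
]
labels = [0, 1, 2, 1]  # the last image is misclassified

print(f"loss = {cross_entropy(probs, labels):.3f}")
print(f"accuracy = {accuracy(probs, labels):.2f}")
```

The two metrics are complementary: accuracy only counts whether the top prediction is correct, while the loss also penalizes correct predictions made with low confidence, such as the third image here.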

Continuous monitoring of the loss function and accuracy during training helped ensure that the model processed complex datasets of rock thin-section images efficiently and accurately, providing a new and efficient method for lithology identification (Fig. 7). Overall, this study affirms the potential of deep learning in geology and opens new directions for future research in this field. As the technology advances, deep learning is expected to play an increasingly important role in geological research and practice.

Fig. 7

Model performance evaluation chart.

The synergistic application of digital outcrop characterization technology and deep learning algorithms

The synergistic application of digital outcrop characterization technology and deep learning algorithms offers significant advantages in geological research, particularly in terms of cross-platform functionality, portability, and interactivity. This approach not only overcomes the limitations of traditional geological research, which relies heavily on manual analysis, but also introduces technical innovations based on existing visualization tools.

Digital outcrop technology, through drone oblique photogrammetry, generates high-precision 3D models that provide clear spatial information about the macroscopic structure of outcrops. However, traditional tools primarily focus on visualizing the outcrop and rarely delve into in-depth analysis of the outcrop’s internal features. By integrating deep learning algorithms, especially in the application of core image recognition, researchers can automatically analyze rock types, mineral compositions, and texture features. Deep learning not only improves the efficiency of image recognition but, through cross-platform implementation (such as Cesium-based 3D mapping technology), enables models to be viewed and interacted with on various devices and platforms. This cross-platform functionality and portability allow geologists to quickly load and manipulate complex outcrop models both in the field and in the lab.

The collaborative application emphasizes convenient user interaction. Through 3D digital outcrop technology, geologists can observe outcrops from different perspectives, modify parameters in real time, adjust visualization effects, or select specific areas for in-depth analysis via an interactive interface. When combined with deep learning algorithms, this interactivity enables automatic classification and identification of core images, mapping the results directly onto the 3D outcrop model, thus achieving seamless integration from microscopic to macroscopic scales (see Fig. 8). For example, deep learning can analyze microscopic image data of cores and automatically identify lithological features, which are then mapped to corresponding regions of the 3D outcrop model, providing a more accurate representation of geological layers. Compared to existing tools like VRGS, Stratbox, and Lime, the collaborative application not only offers the same powerful visualization capabilities but also enhances the model’s intelligence and ease of use. Traditional tools are mostly used for visualization and basic geological structure analysis, while the synergistic application introduces deep learning algorithms for more in-depth lithological identification and cross-platform operational capabilities. Additionally, as deep learning algorithms have the ability to self-learn and optimize, the collaborative application can continuously improve analysis accuracy as the amount of data increases, providing long-term technical support for comprehensive research on core analysis and outcrop models.

Fig. 8

Schematic diagram of the synergistic application. a. Panorama of the study area; b. Close-up view; c. Detailed distribution map; d. Outcrop characterization measurement diagram; e. Core thin section schematic 1; f. Core thin section schematic 2.
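
One way to carry classification results onto the 3D outcrop model is to export them in a standard geospatial format that a Cesium viewer can load. The sketch below writes hypothetical predictions as a GeoJSON FeatureCollection; the coordinates, heights, lithology labels, and field names are placeholders rather than data from the study area, and the actual exchange format used by the platform is not described in the text.

```python
import json


def to_geojson(predictions):
    """Convert per-sample lithology predictions to a GeoJSON FeatureCollection."""
    features = []
    for p in predictions:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON order is longitude, latitude, then optional height.
                "coordinates": [p["lon"], p["lat"], p["height_m"]],
            },
            "properties": {
                "lithology": p["lithology"],
                "confidence": p["confidence"],
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical classifier outputs tied to sample positions on the outcrop.
predictions = [
    {"lon": 83.10, "lat": 42.05, "height_m": 1820.0,
     "lithology": "sandstone", "confidence": 0.97},
    {"lon": 83.11, "lat": 42.05, "height_m": 1824.5,
     "lithology": "mudstone", "confidence": 0.91},
]

collection = to_geojson(predictions)
print(json.dumps(collection, indent=2))
```

On the viewer side, a file like this could be loaded with CesiumJS's `GeoJsonDataSource`, after which each point can be styled by its `lithology` property; that step is an assumption about how such a platform might be wired together.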

The combination of digital outcrop characterization technology and deep learning algorithms not only enhances the automation of geological research but also, through cross-platform functionality and easy interactivity, allows researchers to more conveniently and efficiently identify core images and analyze 3D outcrops. This synergistic application achieves technical innovation based on existing tools, significantly improving both the practicality and accuracy of geological studies.

Discussion

The core of this study lies in integrating Cesium’s digital outcrop representation technology with VGG19’s deep learning lithology identification algorithm for geological exploration. The research found that Cesium’s 3D mapping technology provides an intuitive macroscopic display of geological structures, while VGG19 offers high-precision classification and analysis of rocks at the microscopic level, overcoming the limitations of traditional manual identification. The process of data fusion explored advanced algorithms and techniques to improve the efficiency and effectiveness of integrating different data sources. Particularly, optimizations in the VGG19 model for processing fine textures and complex structures of rocks significantly enhanced the accuracy of lithology identification. Despite challenges in data accuracy, algorithm optimization, and the fusion of different types of data, interdisciplinary collaboration has significantly enhanced the potential applications of this integrated technology. Furthermore, the application of this technology not only enhances understanding of geological structures and history but also provides new methodologies for fields like geological exploration and resource assessment. Future plans include further optimizing this integrated method to enhance its applicability in different geological settings and exploring more data-driven and machine learning-based geological research methods to drive innovation and development in geological sciences.

Conclusion

In this study, we explored the integrated application of two cutting-edge technologies, Cesium and VGG19, in the field of geological sciences, particularly in the geological exploration of the Yiqikelike section in Xinjiang. The main contributions and key issues addressed by this study are summarized as follows:

a) Based on unmanned aerial vehicle (UAV) oblique photography technology, this study successfully implemented a digital outcrop characterization visualization platform using Cesium. This method provided high-resolution three-dimensional geological structural data, offering significant support for detailed analysis of geological structures and rock layers.

b) The study overcame technical challenges in constructing high-precision 3D geological models and processing massive datasets of rock images, achieving effective integration of different data sources.

c) The successful integration of Cesium 3D visualization technology with the VGG19 deep learning lithology identification algorithm represents a technical innovation and provides a new perspective and methodology for geological exploration.

d) The successful application of this integrated method is of significant importance in the academic field and has made a substantial impact in practical application areas such as geological exploration, resource assessment, and environmental monitoring. It provides a more comprehensive and in-depth analysis of geological data, helping to more accurately assess geological resources and environmental risks, and offering a scientific basis for related policymaking.

This research not only demonstrates the wide-ranging application prospects of Cesium and VGG19 in the field of geological sciences but also provides new directions and impetus for future research and practice in geology. Through continuous technological innovation and interdisciplinary collaboration, this integrated approach is expected to play a key role in various branches of geological sciences, opening a new chapter in the field.