Background & Summary

According to the Global Burden of Disease Study in 2019 (https://ghdx.healthdata.org/gbd-results-tool), oral diseases affect nearly 3.5 billion people, posing a large health burden for society. The World Health Organization (WHO) report also states that oral care is expensive and usually outside the components of universal health coverage. People, particularly in low- and middle-income countries, cannot afford such services (https://www.who.int/team/noncommunicable-diseases/global-status-report-on-oral-health-2022). As there are vast numbers of sufferers and a shortage of medical resources, accurate, inexpensive, and accessible methods for diagnosing and treating disease are highly important for three purposes: (1) to improve the dental health care service; (2) to reduce patient costs; (3) to cover more patients, particularly those in remote areas. However, traditional dental diagnosis and treatment methods (e.g., dentists taking X-rays of patients and then obtaining diagnosis and treatment strategies manually)1 have difficulty meeting the requirements of high efficiency, low cost, and accessibility. As one of the most promising technologies for improving medical services and reducing health burdens, imaging-based machine learning technology has been widely introduced into dentistry for assisting clinicians in diagnosis and treatment, image translation, and image segmentation, etc2,3,4,5,6,7,8,9,10.

Owing to the advantages of radiographs in bone imaging, three types of radiographs, shown in Fig. 1, have become the most common tools that help dentists reach a diagnosis and develop treatment strategies. First, CT is a 3D image that provides high-resolution anatomical information on the patient’s case11 and is most commonly used for implant planning because it provides accurate information on the height and width of the jaw and the position of important structures12,13. However, the price of CT machines produced in European and American countries is at least 74,350 dollars, and the radiation dose is 58.9–1025.4 μSv14. Second, extraoral radiographs (e.g., panoramic radiographs (PaX-ray)) show both the mandible and the surrounding oral and maxillofacial structures, including the temporomandibular joints12, and the radiation dose is 5.5–22.0 μSv14. However, because a panoramic radiograph is a two-dimensional image, it cannot provide accurate measurements of the jaw, and the inherent magnification and overlapping of teeth vary depending on the machine12. Third, intraoral X-rays such as periapical radiographs (PeX-ray) provide information on the entire tooth from the crown to the root and are often used to rule out lesions at the apex of the tooth, which may occur when the tooth has become nonvital12. Intraoral X-rays are much cheaper than the two radiographs above, and the radiation dose is < 5 μSv14. However, a PeX-ray can only provide bone information within 2-3 mm around the apex of the tooth and is limited to teeth in one arch12.

Fig. 1

Three different types of dental radiographs. (a) CBCT, (b) PaX-ray, (c) PeX-ray.

The multimodal dental dataset is expected to facilitate advancements in fields such as assisting doctors in diagnosis, image translation, and image segmentation. CBCT images, panoramic radiographs, and periapical radiographs are the three most commonly used types of clinical dental imaging. Researchers can leverage this image data to train models that assist doctors in diagnosis and treatment, including the diagnosis of oral diseases15,16, automatic measurements in orthodontics17,18, and preoperative planning for dental implants etc19,20. The dataset provides a CSV file indicating whether CBCT files contain dental implants, which can be used for implant detection tasks in dentistry. However, it lacks specific disease annotations. Researchers can annotate the dataset according to their tasks, thereby facilitating model training and evaluation.

One of the common tasks in image translation is to convert images from one domain to another. According to the aforementioned report by the WHO, three-quarters of the world’s population threatened by oral diseases reside in low- and middle-income regions. Compared to CBCT, panoramic and periapical radiographs are much cheaper and expose patients to much less radiation. If it were possible to reconstruct CT scans from panoramic or a small number of periapical radiographs, it would significantly reduce patients’ radiation exposure and costs, which is particularly important for low- and middle-income populations. Research on reconstructing CT scans from panoramic images already exists21, and the dataset containing data from different modalities can further advance this task.

Similarly, the multimodal oral dataset supports segmentation tasks for teeth and alveolar bone22,23. Although segmentation is predominantly performed on oral CT data, publicly available CBCT datasets are scarce. Our dataset consists of 329 CBCT scans from 169 patients, which helps address this scarcity and enables researchers to explore segmentation tasks.

Despite the high potential of imaging-based machine learning for dentistry research and clinical use, publicly available oral image datasets for machine learning research are limited. We surveyed all studies mentioned in three recent overviews of dentistry, 74 works in total, and only 2 studies were based on publicly available oral datasets7,24,25. These two datasets are the Tufts Dental Database (Panetta et al., 2022)26 and the Virtual Skeleton Database (Kistler et al., 2013)27. Keeping datasets private prevents third parties from objectively evaluating and exploring a study, which is detrimental to the development of the field of dentistry. To our knowledge, there are five publicly available datasets, as shown in Table 1. However, current publicly available oral datasets also have several limitations.

  • First, the limited number of cases in available datasets, especially CT data, poses a challenge to the development of data-driven deep learning. As shown in Table 1, only one of the public datasets contains CT data; the majority consist mainly of PaX-rays. In addition, among the 74 studies we surveyed, only 13 (17.6%) used CT data from private datasets. Furthermore, among those studies that included more than 100 patients, only 5 (6.7%) made use of CT data7,24,25.

  • Second, the absence of paired data for different modalities in these datasets makes it impossible to compare techniques across modalities. Furthermore, these datasets do not support the development of multi-scenario applications, which refer to applications that must be used in various scenarios due to different medical or other conditions, each requiring data from different modalities. As Table 1 shows, no publicly available datasets currently contain data for all the modalities mentioned above.

  • Third, current public datasets lack diversity and complexity and are often biased towards overhealth and overdisease28, which renders them unable to accurately represent the real clinical setting. In addition, models trained on datasets with these flaws suffer from data drift, resulting in a good performance during training but poor performance during a real deployment. Ultimately, techniques developed based on these datasets are difficult to implement in actual clinical settings.

Table 1 Publicly available oral datasets.

Based on the above considerations, we present a publicly accessible multimodal dental dataset29 that is useful for machine learning research and clinical services. First, the dataset includes 329 CBCT images. All CBCT image data were collected from 169 patients using Smart3D-X (Beijing Langshi Instrument Co., Ltd., Beijing, China) (Fig. 3a). A total of 67 patients had more than one CBCT image taken at different times. Second, this dataset covers the three most common modalities: CBCT images, panoramic radiographs, and periapical radiographs. The periapical radiographs are generated from the CBCT images using cxr-ct (https://github.com/KendallPark/cxr-ct) (Fig. 3a). In this dataset29, 188 CBCT images have paired periapical radiographs. All panoramic radiographs have paired CBCT images and periapical radiographs. Finally, to preserve the variety of the real clinical setting, the dataset contains various types of patients (e.g., an upper jaw with no teeth, all dentures, irregular teeth, and implanted teeth), as shown in Fig. 2. We encourage other researchers in the field to use it to develop and test their methods for assisting clinicians in diagnosis and treatment, image translation, image segmentation, etc.

Fig. 2

Four types of dental conditions in the dataset. (a) The entire upper jaw has no teeth, (b) all dentures, (c) irregular teeth, and (d) implanted teeth.

Methods

Ethics statement

This research has received approval from the Ethics Committee of Guilin Medical University (Approval No: GLMC20230502). The approved content encompasses the collection of imaging data, reconstruction of patients’ oral three-dimensional models, and sharing of imaging data. Within this dataset, all personally identifiable information, except for the patient’s gender and age, has been either removed or regenerated to align with U.S. HIPAA regulations. Moreover, the dataset is exclusively restricted for legitimate scientific research purposes. Additionally, informed consent has been obtained from the patients.

Patient characteristics

Considering the potential hazards of radiological imaging, this study did not recruit volunteers for a prospective experiment involving unnecessary radiological examinations; instead, it used existing patient data. We collected data from all adult patients who visited dental hospitals from 2021 to 2022. After excluding data with quality issues, we sought informed consent from the patients and ultimately obtained it from 169 patients, as shown in Table 2. Eight patients had data for all three modalities.

Table 2 Gender and age distribution in Multi-modal dental dataset.

Data collection

The dataset29 contains 329 volumetric oral cavity CBCT scans, encompassing data from 169 patients, along with 8 panoramic radiographs, each corresponding to a different patient. Additionally, there are 16,203 periapical radiographs available, with three different angle views for each tooth, totaling 5,401 teeth, corresponding to 188 CBCT files.

CBCT is a variation of traditional CT that uses a cone-shaped X-ray beam to capture data of the oral cavity and create a 3D representation of it30,31. Compared to traditional CT, CBCT has many advantages, such as low cost, easy accessibility, and low radiation exposure, and it has been widely used in the field of dentistry32. All CBCT images in the dataset come from a CBCT machine that scans with a large-diameter cone X-ray beam, collects cone-beam projection data with a two-dimensional flat panel detector, and rotates synchronously through 180°–360° around the patient’s head to acquire volumetric image data of the entire scanned area33 (Fig. 3a). All images are reconstructed using the Filtered Back-Projection (FBP) method, and the T-MAR artifact correction function is used to automatically identify high-density substances in the mouth and remove artifacts by deep learning (Fig. 3b). Among the 329 CBCT scans we collected, 327 have an output size of 640 × 640, a slice thickness of 0.25 mm, and a pixel spacing of 0.25 mm × 0.25 mm. The remaining 2 have an output size of 550 × 550, a slice thickness of 0.15 mm, and a pixel spacing of 0.15 mm × 0.15 mm. All images are saved in the Digital Imaging and Communications in Medicine (DICOM) format34.
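The matrix sizes and pixel spacings above imply the physical in-plane extent of each slice. The following minimal sketch works through that arithmetic; the class and variable names are illustrative and are not part of the dataset or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceGeometry:
    """In-plane geometry of one CBCT slice, using values reported in the text."""
    rows: int
    cols: int
    pixel_spacing_mm: float
    slice_thickness_mm: float

    def field_of_view_mm(self):
        """Physical in-plane extent (height, width) in millimetres."""
        return (self.rows * self.pixel_spacing_mm,
                self.cols * self.pixel_spacing_mm)


# Values taken from the text; the names STANDARD and FINE are hypothetical.
STANDARD = SliceGeometry(640, 640, 0.25, 0.25)  # 327 of the 329 scans
FINE = SliceGeometry(550, 550, 0.15, 0.15)      # the remaining 2 scans

for name, geom in (("standard", STANDARD), ("fine", FINE)):
    height, width = geom.field_of_view_mm()
    print(f"{name}: {height:.1f} mm x {width:.1f} mm in-plane")
```

Under these values, the 640 × 640 scans cover 160 mm × 160 mm per slice and the 550 × 550 scans cover 82.5 mm × 82.5 mm. The number of slices per scan is not stated in the text, so the sketch does not compute the extent along the slice axis.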

Fig. 3

(a) Data acquisition process for three modality data. (b) Software processing CT data. (c) Technical validation to ensure high-quality image data.

A panoramic radiograph also uses a cone-shaped X-ray beam to capture data of the oral cavity and creates a single flat 2D image of the curved structure of the entire mouth (Fig. 3a). Compared to CBCT, the panoramic radiograph generates only approximately 1/40 of the radiation but lacks spatial structure information. All panoramic radiographs in the dataset are obtained from the CBCT machine using the principles of narrow-slot and circular orbital tomography. The machine rotates 180° around the patient for data acquisition. The output size of the images is set to 1468 × 2904, the thickness of these images is 0.075 mm, and the pixel spacing is 0.075 mm × 0.075 mm.

A periapical radiograph uses an X-ray beam to capture data of the oral cavity and creates a 2D image of the teeth. Compared to the other radiographs mentioned above, the periapical radiograph focuses on only a small part of the oral cavity (usually covering 3-4 teeth), is recorded on built-in film or intraoral X-ray sensors, and generates little radiation. In the real clinical setting, periapical radiographs are obtained with a portable handheld X-ray generator and built-in film (the film usually measures 40 mm × 30 mm). However, it is nontrivial to collect many periapical radiographs, particularly paired CBCT, panoramic, and periapical radiographs. First, taking radiographs multiple times exposes patients to unnecessary radiation doses. Second, although obtaining a single dental film is simple, 10-30 acquisitions are required to obtain complete oral information, ensure the complexity and diversity of the data, and meet the needs of developing machine learning technology. Finally, in current dental hospitals, the film is usually handed over to the patient and is not stored as data in the hospital.

Because a CT image is obtained with a rotating X-ray source, it contains all the information of a single X-ray image. Thus, many researchers have focused on using CT images to generate the corresponding X-ray images and have achieved good results35,36,37. In this study, we generated periapical radiographs from CBCT images using the Siddon-Jacobs ray-tracing algorithm38,39, which is one of the methods for computing digitally reconstructed radiographs (DRRs). The Siddon-Jacobs ray-tracing algorithm simulates the process of X-rays passing through the human body and being attenuated by human tissue to generate radiographic images. Due to its convenience and efficiency, it is the most commonly used method for computing DRRs40,41,42. Furthermore, research has shown that the images generated using this algorithm exhibit errors within an acceptable range when compared to real images43. Additionally, the dataset is continuously updated, and in the future, we will integrate emerging technologies to generate periapical radiographs.
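To illustrate the attenuation principle behind a DRR, the sketch below projects a tiny volume of attenuation coefficients with the Beer-Lambert law. It is deliberately not the Siddon-Jacobs algorithm: it assumes parallel rays aligned with one voxel axis, so every voxel contributes the same path length, whereas Siddon-Jacobs computes exact ray-voxel intersection lengths for a diverging beam. The function name, the toy volume, and the attenuation values are hypothetical.

```python
import math


def parallel_ray_drr(mu, step_mm, i0=1.0):
    """Project a volume of attenuation coefficients along the y-axis.

    mu is a nested list indexed as mu[z][y][x], in units of 1/mm.
    Each ray is parallel to y and crosses every voxel over step_mm,
    so the line integral reduces to a plain sum (a simplifying assumption).
    """
    image = []
    for plane in mu:                      # one detector row per z index
        n_x = len(plane[0])
        row = []
        for x in range(n_x):              # one detector pixel per x index
            path_integral = sum(line[x] for line in plane) * step_mm
            row.append(i0 * math.exp(-path_integral))  # Beer-Lambert law
        image.append(row)
    return image


# Toy volume: 1 slice, 3 voxels deep along y, 2 voxels wide along x.
# Column x=0 is empty; column x=1 contains one attenuating voxel.
toy = [[[0.0, 0.00],
        [0.0, 0.05],
        [0.0, 0.00]]]
drr = parallel_ray_drr(toy, step_mm=0.25)
```

The ray through the empty column keeps its full intensity, while the ray through the attenuating voxel is reduced by a factor of exp(-0.05 × 0.25). A real DRR would additionally need a conversion from CBCT intensities to attenuation coefficients, which the text does not specify.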

Figure 4 shows that periapical radiograph generation consists of four steps. First, a 60 mm × 50 mm × 50 mm cube is cut out from the 3D CBCT image around the patient’s tooth. The midline lm passes through the tooth, and the line ls of the cube is tangent to the outside of the face. For each patient, 20-32 cubes are obtained. Second, so that the Siddon-Jacobs ray-tracing algorithm can be applied consistently, each cube is rotated so that all cubes share the same orientation: the outside of the face faces the positive direction of the y-axis, and the teeth face the positive direction of the z-axis. Third, the X-ray process is simulated by propagating incident X-ray photons (from a radiation source) through the cube using the Siddon-Jacobs ray-tracing algorithm of the Insight Segmentation and Registration Toolkit (ITK) imaging package. When using this algorithm, we set the distance between the X-ray source and the cube to 1000 mm plus a random value of 0-5 mm, and we use three angles of X-ray incidence (20-25 degrees to the left, 5-10 degrees to the left, and 20-25 degrees to the right) to generate periapical radiographs from different angles. Finally, the film used in practice usually comes in two sizes; considering the size used for adults, the generated periapical radiograph is cropped to 40 mm × 30 mm. In the dataset, there are a total of 329 CBCT files. We attempted to label each tooth in every file; however, severe tooth loss in some CBCT files hindered accurate annotation of each tooth’s position. Additionally, some teeth were incomplete after being segmented from the annotated files. These data were removed following expert quality control procedures. Even after removal, periapical radiographs remain for 188 CBCT files, comprising 16,203 images of 5,401 teeth from three different angles. For machine learning, this still represents a considerable amount of data.
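The geometric parameters of the four steps can be collected in a short sketch. This is not the authors' pipeline: it only converts the stated sizes to voxel counts, draws the stated parameter ranges, and computes centre-crop bounds. The uniform sampling, the sign convention (negative angles for left), the view names, drawing a new distance for every view, and reusing the 0.25 mm CBCT spacing for the projected image are all assumptions made for illustration.

```python
import random

CUBE_MM = (60.0, 50.0, 50.0)   # cube cut around each tooth (from the text)
FILM_MM = (40.0, 30.0)         # final crop size (from the text)

# Three incidence ranges from the text; negative = left is an assumed convention.
VIEW_RANGES_DEG = {
    "left_wide": (-25.0, -20.0),
    "left_narrow": (-10.0, -5.0),
    "right_wide": (20.0, 25.0),
}


def mm_to_voxels(length_mm, spacing_mm):
    """Number of voxels spanning length_mm at the given spacing."""
    return round(length_mm / spacing_mm)


def sample_views(rng):
    """Draw one angle and one source distance per view (uniform: assumed)."""
    return {
        name: {
            "angle_deg": rng.uniform(lo, hi),
            "source_distance_mm": 1000.0 + rng.uniform(0.0, 5.0),
        }
        for name, (lo, hi) in VIEW_RANGES_DEG.items()
    }


def center_crop_bounds(image_px, film_px):
    """Start and stop indices of a centred crop along one axis."""
    start = (image_px - film_px) // 2
    return start, start + film_px


spacing = 0.25  # mm, the CBCT pixel spacing of most scans
cube_voxels = [mm_to_voxels(side, spacing) for side in CUBE_MM]
film_voxels = [mm_to_voxels(side, spacing) for side in FILM_MM]
views = sample_views(random.Random(0))
```

At 0.25 mm spacing, the cube spans 240 × 200 × 200 voxels and the 40 mm × 30 mm crop spans 160 × 120 pixels. Which image axis receives the 40 mm side is not stated in the text, so the crop helper works on one axis at a time.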

Fig. 4

Process of obtaining periapical radiographs from CBCT image data.

Privacy

To ensure the protection of patient privacy, all demographic-sensitive information of patients, except for gender and age, has been either deleted or replaced with new values. Patient names and IDs have been replaced with randomly generated new IDs. Other IDs in the files, such as StudyInstanceUID, have also been regenerated. The date of birth has been removed, and other dates (e.g., study time, etc.) have been randomly offset to fall between 2200 and 2300. However, the chronological order of timestamps for multiple visits by each patient has been retained. The dataset does not include individuals under the age of 18 or patients aged 89 and above.
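One way to shift dates into a far-future range while keeping the order of a patient's visits is to apply a single random offset to all of that patient's dates. The sketch below assumes this scheme; the text does not state how the offsets were drawn, and the function name and the inclusive year range 2200-2299 are hypothetical.

```python
import random
from datetime import date, timedelta


def shift_visit_dates(visits, rng, first_year=2200, last_year=2299):
    """Shift all dates of one patient by the same random number of days.

    A single offset per patient (an assumption) keeps the chronological
    order of the visits and places every date within the target years.
    """
    earliest, latest = min(visits), max(visits)
    min_offset = (date(first_year, 1, 1) - earliest).days
    max_offset = (date(last_year, 12, 31) - latest).days
    offset = timedelta(days=rng.randint(min_offset, max_offset))
    return [visit + offset for visit in visits]


# Made-up visit dates for one patient, for illustration only.
original = [date(2021, 3, 1), date(2022, 6, 15)]
shifted = shift_visit_dates(original, random.Random(0))
```

A constant offset also preserves the interval between visits, which goes beyond what the text guarantees; only the ordering is stated to be retained in the released data.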

Data statistics

The demographics of the patients are summarized in Table 2 and Fig. 5a. As shown in Fig. 5a, the number of patients who had CBCT images taken was far greater than the number of patients who had PaX-rays taken. There are two possible reasons for this disparity. First, PaX-ray is two-dimensional and has significant limitations, including distortion and a lack of spatial structure information; thus, oral surgeons prefer to take CBCT images so that the lesion can be evaluated in three dimensions and its pathology can be determined12. Second, if the patient has a plan for dental implants, CBCT is the primary choice. Among the CBCT images, the number of female patients exceeds that of male patients, and this trend is also observed among patients who have had multiple visits (Table 3).

Fig. 5

(a) Number of images separated by the patient’s gender. (b) The distribution of age for CBCT, PaX-ray, and PeX-ray images.

Table 3 Gender and age distribution in patients with multiple visits.

The age distribution of the patients is shown in Fig. 5b using a boxplot, which indicates that the patients who underwent panoramic radiographs were younger. Table 4 presents the imaging settings for the different types of images; peak kilovoltage (kVp) and X-ray tube current affect the radiation exposure dose, and slice thickness represents the axial resolution34. As shown in Table 4, the slice thickness is set to 0.25 mm for 327 of the 329 CBCT images (99.4%). The remaining two were set to 0.15 mm. For all CBCT images, the kVp was 100. The X-ray tube current was typically between 6 and 8 mA. For panoramic radiographs, the slice thickness was 0.25 mm, the kVp was 100, and the X-ray tube current was set to 10 mA.

Table 4 Scan settings used to acquire the Multi-modal dental dataset.

Data Records

The multimodal dental dataset29 has been released on PhysioNet for users to download. As illustrated in Fig. 6, the dataset is structured hierarchically. CBCT images and panoramic and periapical radiographs are organized into separate folders at the top-level directory. CBCT images and panoramic radiographs are saved in DICOM format, while periapical radiographs are generated by cutting dental slices from CBCT images and irradiating them from three angles. The resulting periapical radiographs are stored in TIF format under the PeX-ray folder within three subfolders. The files within these three folders are named using the patient’s ID followed by an underscore and a number to indicate the patient’s visit number. For example, ‘0006_0’ denotes data from the first visit of patient 0006. In addition to these three folders, the remaining CSV files are as follows: CBCT_Info.csv, PaX_Info.csv, PeX_Info.csv, Patient_Statistics_Info.csv, and Implant_Marking_Info.csv. CBCT_Info.csv contains patient age and gender information for each CBCT file, along with details about the CBCT file itself, such as tube current and tube voltage used for the CBCT image acquisition, as well as the dimensions of the file, etc. PaX_Info.csv is similar to CBCT_Info.csv but specifically records information related to panoramic radiographs. PeX_Info.csv provides statistics on the number of periapical radiographs from different angles for each patient. Patient_Statistics_Info.csv offers patient-level statistics, indicating whether each patient has data for these three modalities and the corresponding file names. Implant_Marking_Info.csv marks whether patients have dental implants. The specific meaning of each column in every CSV file is detailed in Table 5. All files are sorted in ascending order based on patient ID.
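The naming convention described above can be parsed with a few lines of code. The sketch below assumes only what the text states: a patient ID, an underscore, and a visit number, with '0006_0' denoting the first visit, so the number is treated as zero-based. The helper names are hypothetical, and '0012' is a made-up patient ID.

```python
from collections import defaultdict
from typing import NamedTuple


class ScanKey(NamedTuple):
    patient_id: str
    visit_index: int  # 0 for the first visit, per the '0006_0' example


def parse_scan_name(stem):
    """Split a name such as '0006_0' into patient ID and visit index."""
    patient_id, sep, visit = stem.rpartition("_")
    if not sep or not patient_id or not visit.isdigit():
        raise ValueError(f"unexpected name: {stem!r}")
    return ScanKey(patient_id, int(visit))


def visits_by_patient(stems):
    """Group visit indices by patient ID, sorted within each patient."""
    grouped = defaultdict(list)
    for stem in stems:
        key = parse_scan_name(stem)
        grouped[key.patient_id].append(key.visit_index)
    return {pid: sorted(indices) for pid, indices in grouped.items()}


example = visits_by_patient(["0006_1", "0006_0", "0012_0"])
```

Keeping the patient ID as a string preserves the leading zeros, which matters when matching names against the patient-level CSV files.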

Fig. 6

Structure of the data included in the multimodal dental dataset29.

Table 5 The specific meaning of each column in CSV files.

Technical Validation

To obtain high-quality standard data, quality control and calibration of the CBCT scanning device are essential for CBCT and panoramic radiographs. Therefore, an autocalibration procedure is executed daily to ensure calibrated and accurate performance of the CBCT scanner31. Additionally, the manufacturer of the CBCT scanner conducts an annual quality control service to maintain the high quality of the CBCT image.

For periapical radiographs, the performance of cutting CBCT to generate periapical radiographs and the Siddon-Jacobs ray tracing algorithm is key to obtaining standard high-quality data. Therefore, we organized 13 people to label the CBCT images and record the labeled data in the file. One person was selected as the person in charge; the 13 people were divided into groups of two, and the remaining one was in a separate group. Each person was required to label approximately 25 CBCT images. After labeling, the quality of the labels was checked by mutual inspection within each group. The process for checking the labels was as follows: first, we sliced the CBCT images based on these labels and then used 3D Slicer software to inspect each slice to see whether the tooth corresponding to the label was in the middle of the slice and whether the height of the section included the entire portion of the tooth. If there were quality issues with the labeled data, the person who labeled the file had to re-label it. After the checking was completed, we sent all the labeled files to the responsible person, who checked all the labeled files again to ensure the correctness of the labels and the quality of the obtained periapical radiographs. There are currently 188 labeled files that meet the requirements, and the rest will be updated in the future.

Usage Notes

Currently, there is a shortage of publicly available dental datasets, particularly lacking CBCT data. The establishment of the multimodal dental dataset29 aims to provide a broader range of diverse data types to facilitate the advancement of machine learning in dental healthcare services. To access the data, researchers are required to complete the following steps:

  • Become a credentialed user of the PhysioNet platform.

  • Complete the mandatory training.

  • Submit a data access request and await approval.

Once the application is approved, researchers will be granted access to the data.