Abstract
With the development of artificial intelligence, much work has addressed technical improvements in the classification of skin disease. However, few studies have examined the classification system currently used worldwide for skin disease, the chapter on Diseases of the skin and subcutaneous tissue in the International Classification of Diseases, Tenth Revision (ICD-10). This study aimed to develop a new taxonomy of skin disease based on cytology and pathology, and to test its predictive effect on skin disease compared with ICD-10. A new taxonomy (Taxonomy 2) containing 6 levels (levels 3 to 5 were evaluated as Projects 2–4) was developed based on skin cytology and pathology; it represents individual diseases arranged in a tree structure with three root nodes representing: (1) Keratinogenic diseases, (2) Melanogenic diseases, and (3) Diseases related to non-keratinocytes and non-melanocytes. The predictive effects of the new taxonomy, including accuracy, precision, recall, F1, and Kappa, were compared with those of ICD-10 on Diseases of the skin and subcutaneous tissue (Taxonomy 1, Project 1) using the Deep Residual Learning method. For each project, 2/3 of the images were assigned to the training group and the remaining 1/3 to the test group, with the category (class) as the stratification variable. Both the train and test groups in Projects 2 and 3 from Taxonomy 2 had higher F1 and Kappa scores for the prediction of skin disease than the corresponding groups in Project 1 from Taxonomy 1, although the differences were not statistically significant; however, both the train and test groups in Project 4 had a statistically significantly higher F1-score than the corresponding groups in Project 1 (P = 0.025 and 0.005, respectively). The results showed that the new taxonomy developed based on cytology and pathology has an overall better predictive performance for skin disease than ICD-10 on Diseases of the skin and subcutaneous tissue. Level 5 (Project 4) of Taxonomy 2 generalizes better to unknown data in an AI-assisted diagnosis system than the currently used ICD-10 classification system, and may have potential application value in clinical dermatology.
Introduction
A significant rise has been demonstrated in the incidence of the majority of skin diseases over the past decades1. Compared with disorders of other systems, the diagnosis of skin disease depends much more on lesion presentation. With more than 1500 different dermatological diagnoses, the diagnostic accuracy of general practitioners in dermatological disease has been estimated at 48 to 77%2; clinicians therefore face the challenge of increasing diagnostic accuracy and further improving therapeutic efficiency.
Much research has focused on technical improvements in diagnosis, especially with artificial intelligence3. Binder et al.4 used computerized image analysis and an artificial neural network to automatically diagnose pigmented skin lesions. The sensitivity and specificity of the computerized system were 90% and 74%, respectively.
Verma et al.5 classified erythemato-squamous diseases with an ensemble of 5 different data mining techniques, and the results showed that the proposed ensemble method makes more efficient use of the dataset and gives a higher accuracy than the individual data mining techniques.
Sharma et al.6 compared Support Vector Machine and Artificial Neural Network models, along with an ensemble of the two techniques, for classification of erythemato-squamous diseases, and found that the ensemble model achieved the highest accuracy.
Moradi and Mahdavi-Amiri7 proposed a kernel sparse representation based method for segmentation and classification of melanoma images, and the evaluation results showed their approach to be competitive with the available state-of-the-art methods.
Yap et al.8 developed a multimodal classifier, which outperforms a baseline classifier that only uses a single macroscopic image in both binary melanoma detection and in multiclass classification.
Chang and Chen9 combined decision tree data mining with neural network classification methods to construct the best predictive model for six major skin diseases, and found that the neural network model had the highest prediction accuracy.
The main work of these investigations is listed in Table 1. However, all of these investigations focused on improving diagnostic performance with the assistance of artificial intelligence techniques; few studies have addressed the imperfections of the current classification system of dermatology and venereology. The International Classification of Diseases, Tenth Revision (ICD-10) is now used worldwide to keep disease diagnosis consistent; however, the literature on its shortcomings is scant. Recent studies have found deficiencies in the classification of allergic conditions by ICD-10 codes10,11, and a new revision, ICD-11, is currently being developed with the aim of solving these problems12.
With in-depth research on the pathogenesis of skin disease, knowledge of dermatology has improved and the initial classifications of multiple diseases have proved inaccurate. For example, pyogenic granuloma sounds like an infectious disease but is actually a kind of hemangioma; the classification and nomenclature of vascular malformations have also changed13; and sebopsoriasis lacks a specific code14. Modern dermatology therefore urgently needs a more scientific classification. Esteva et al.15 developed a dermatologist-level system for skin cancer classification; although the aim of that study was to test an artificial intelligence capable of classifying skin cancer, it suggests a direction for re-classifying skin disease from different aspects.
Based on the above considerations, we conducted this study to develop a new taxonomy based on cytology and pathology, to test its diagnostic performance using the Deep Residual Learning method, and to compare it with ICD-10 on Diseases of the skin and subcutaneous tissue, in order to find a new classification that benefits prediction and has potential application in clinical practice in dermatology and venereology.
Materials and methods
Figure 1 shows the overall structure of the methodology used in this research; the approach used in this paper is completely data driven.
Taxonomy
Taxonomy 1
ICD-10 Version: 2016, World Health Organization (http://apps.who.int/classifications/icd10/browse/2016/en).
Taxonomy 2
Taxonomy 2 represents 1,000 individual diseases arranged in a tree structure with three root nodes representing: (1) Keratinogenic diseases (KCs), (2) Melanogenic diseases (MCs), and (3) Diseases related to non-keratinocytes and non-melanocytes (Non-KC and non-MC). Taxonomy 2 was derived by dermatologists using a bottom-up procedure. Within the tree structure, individual diseases, initialized as leaf nodes, were merged based on organic or cellular similarity until the entire structure was connected. Taxonomy 2 contains 6 levels, and levels 1–3 are presented in Fig. 2. For each type of disease, a number indicates a different disease, and so on up to level 6.
The taxonomy is used in generating training classes that are both well-suited for machine learning classifiers and medically relevant. The root nodes are used in the first validation strategy and represent the source cell/organization of disease. The children of the root nodes (for example, malignant melanocytic lesions) are used in the second validation strategy, and represent disease classes that have similar clinical treatment plans.
Projects setting
All images come from the following public databases: Atlas (http://www.atlasdermatologico.com.br/), Dermatoweb (http://www.dermatoweb.net/), Dermnet (http://www.dermnet.com/), Dermnetnz (https://www.dermnetnz.org/), Emedicinehealth (https://www.emedicinehealth.com/), Globalskinatlas (http://www.globalskinatlas.com/), Meddean (http://www.meddean.luc.edu/), and Uiowa (https://medicine.uiowa.edu/). A total of 56,571 images were collected. The acquisition program generates a list of images with classification tags for each website, downloads the corresponding images, and produces a picture library with a description of the classification tags.
Taxonomy 1 was defined as Project 1. Based on the resources of the image library, which had to be balanced across the two taxonomies, 11 classes were finally selected for Project 1, including pemphigus, lichen planus, congenital ichthyosis, other dermatitis, pediculosis, scabies, herpes viral infections, unspecified viral infection, gonococcal infection, other sexually transmitted diseases, other congenital malformations of skin, and not elsewhere classified.
Level 3 from Taxonomy 2 is defined as Project 2, and contains a total of 2 classes: Inflammatory diseases; Infectious diseases. Level 4 from Taxonomy 2 is defined as Project 3, and contains a total of 4 classes: Virus, Parasite, Bacteria, Dermatitis. Level 5 from Taxonomy 2 is defined as Project 4, and contains a total of 11 categories: porokeratosis; herpes, simple genital; lichen planus; condilomas acuminados (condylomata acuminata); ichthyosis; viral exanthems; pediculosis pubis; pemphigus; gonorrhea; eczema; sarna noruega (Norwegian scabies).
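Because Projects 2–4 are successive levels of one tree, a level-5 label can be rolled up to its level-4 and level-3 ancestors. The following Python sketch illustrates that roll-up. The class names come from the text, but the parent links in `PARENT` and the helper names are assumptions made for illustration, since the full tree is not reproduced here.

```python
# Hypothetical parent links (child -> parent). Only the class names are taken
# from the text; which level-5 class sits under which level-4 class is assumed.
PARENT = {
    "gonorrhea": "Bacteria",
    "pediculosis pubis": "Parasite",
    "herpes, simple genital": "Virus",
    "eczema": "Dermatitis",
    "Bacteria": "Infectious diseases",
    "Parasite": "Infectious diseases",
    "Virus": "Infectious diseases",
    "Dermatitis": "Inflammatory diseases",
}


def path_to_top(label):
    """Return [label, parent, grandparent, ...] by following PARENT links."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def label_at_level(leaf, level, leaf_level=5):
    """Map a level-5 class to its ancestor at the requested level (3, 4, or 5)."""
    path = path_to_top(leaf)
    steps = leaf_level - level  # how many parent links to follow
    if not 0 <= steps < len(path):
        raise ValueError(f"level {level} is not available for {leaf!r}")
    return path[steps]


if __name__ == "__main__":
    for leaf in ("gonorrhea", "eczema"):
        print(leaf, "->", label_at_level(leaf, 4), "->", label_at_level(leaf, 3))
```

With a mapping of this kind, the same image set can be relabeled for Project 2, 3, or 4 without annotating it three times.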
Data processing instructions
According to Taxonomy 2, 1,847 images were extracted. The images were then screened to ensure that the two taxonomies contained the same images, and a total of 1,160 images were finally obtained.
Predictive model evaluation by deep residual network
After annotation of the images, our predictions on the two taxonomies were based on Deep Residual Learning for Image Recognition, a deep learning method that belongs to the family of convolutional neural networks (CNNs). For a fair comparison, we adopted ResNet-50 pre-trained on ImageNet as the feature extraction network. Specifically, an SGD optimizer with momentum 0.9 and weight decay 5e-4 was adopted, and the initial learning rate was set to 1e-4. The batch size was set to 64 and the drop-out rate was 0.5.
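To make the optimizer settings concrete, the sketch below applies one SGD update with momentum and weight decay to a toy parameter list in plain Python. The text does not name the framework or the exact update rule, so this assumes the common formulation (the one used by PyTorch's `torch.optim.SGD`) in which the weight decay term is added to the gradient before the momentum buffer is updated. The function name `sgd_step` and the toy values are hypothetical.

```python
def sgd_step(weights, grads, velocity, lr=1e-4, momentum=0.9, weight_decay=5e-4):
    """One SGD update with momentum and L2 weight decay (assumed formulation).

    For each parameter: g <- g + weight_decay * w
                        v <- momentum * v + g
                        w <- w - lr * v
    Returns the new weights and the new momentum buffers.
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g = g + weight_decay * w      # fold the L2 penalty into the gradient
        v = momentum * v + g          # accumulate the momentum buffer
        new_velocity.append(v)
        new_weights.append(w - lr * v)
    return new_weights, new_velocity


if __name__ == "__main__":
    # Toy values only; real gradients would come from backpropagation.
    w, v = [1.0, -2.0], [0.0, 0.0]
    w, v = sgd_step(w, [2.0, 0.5], v)
    print(w, v)
```

The defaults mirror the values stated above (learning rate 1e-4, momentum 0.9, weight decay 5e-4).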
Identification of the images according to Taxonomy 1: Project_1 holds the specific information of each picture labeled with the Taxonomy 1 classification system, where entity_id is the unique ID of the picture. Code_1 holds the number of images in each category among the images labeled with the Taxonomy 1 classification system, where code_id is the unique ID of the category.
Identification of the images according to Taxonomy 2 (levels 3–5): Project 2, Project 3, and Project 4 hold the specific information of each picture labeled at levels 3, 4, and 5 of the Taxonomy 2 system, respectively, where entity_id is the unique ID of the picture. The table code_2 holds the number of images in each category among the images labeled at levels 3, 4, and 5 of the Taxonomy 2 system, respectively, where code_id is the unique ID of the category.
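The annotation tables described above, and the screening step that keeps only images labeled under both taxonomies, can be sketched as follows. The field names `entity_id` and `code_id` come from the text; the image IDs, the sample rows, and the helper names are invented for illustration and do not reflect the study's data.

```python
from collections import Counter

# Hypothetical annotation rows as (entity_id, code_id) pairs.
# Project 1 uses ICD-10 codes (L10 pemphigus, L43 lichen planus, B86 scabies);
# Project 4 uses level-5 class names from Taxonomy 2.
project_1 = [("img001", "L10"), ("img002", "L43"), ("img003", "B86"), ("img004", "L43")]
project_4 = [
    ("img001", "pemphigus"),
    ("img002", "lichen planus"),
    ("img004", "lichen planus"),
    ("img005", "eczema"),
]


def keep_shared(rows_a, rows_b):
    """Keep only rows whose entity_id is labeled in both annotation tables."""
    shared = {entity for entity, _ in rows_a} & {entity for entity, _ in rows_b}
    kept_a = [row for row in rows_a if row[0] in shared]
    kept_b = [row for row in rows_b if row[0] in shared]
    return kept_a, kept_b


def images_per_class(rows):
    """Count images per code_id, analogous to the Code_1 / code_2 tables."""
    return Counter(code for _, code in rows)


if __name__ == "__main__":
    p1, p4 = keep_shared(project_1, project_4)
    print(images_per_class(p1))
    print(images_per_class(p4))
```

Screening on the shared `entity_id` set is what lets the two taxonomies be compared on identical images.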
For each project, 2/3 of the images were assigned to the training group and the remaining 1/3 to the test group, with the category (class) as the stratification variable.
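A stratified 2/3 to 1/3 split of this kind can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the study's code: the function name `stratified_split`, the fixed random seed, and the rounding rule for class sizes that are not divisible by 3 are all hypothetical choices.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Split sample indices into train and test sets, class by class.

    Each class contributes about train_fraction of its samples to the
    training set, so class proportions are preserved in both groups.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return sorted(train), sorted(test)


if __name__ == "__main__":
    labels = ["pemphigus"] * 6 + ["scabies"] * 3
    train_idx, test_idx = stratified_split(labels)
    print(len(train_idx), len(test_idx))
```

Splitting within each class, rather than over the pooled images, keeps rare classes represented in both the training and test groups.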
The accuracy, Kappa coefficient, Precision, Recall, and F1-score were calculated and compared between the two taxonomies.
Formulas:
For a given category, TP is the number of samples of that category that are correctly predicted as that category, FP is the number of samples from other categories that are wrongly predicted as that category, and FN is the number of samples of that category that are not predicted as that category.
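For reference, the standard per-class definitions that match the TP, FP, and FN counts above are given below, together with the usual form of the Kappa coefficient. These are the textbook forms. The way per-class values are averaged across classes is not stated in the text, so no averaging scheme is assumed here.

```latex
\begin{align}
\text{Precision} &= \frac{TP}{TP + FP}, &
\text{Recall} &= \frac{TP}{TP + FN}, \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, &
\kappa &= \frac{p_o - p_e}{1 - p_e}.
\end{align}
```

Here $p_o$ is the observed agreement between predicted and true labels (equal to the accuracy) and $p_e$ is the agreement expected by chance from the marginal class frequencies.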
Results
The overall comparison on predicted results between projects
Table 2 shows the comparison of the predicted results of the projects by different categories. Only Project 4 has a higher accuracy than Project 1 in the prediction of skin disease.
Except for the test group in Project 3, all of the train and test groups in Projects 2, 3, and 4 from Taxonomy 2 have a higher precision in the prediction of skin disease than the corresponding groups in Project 1 from Taxonomy 1, although none of the differences is significant. For recall, both the train and test groups in Projects 2, 3, and 4 from Taxonomy 2 are better than the corresponding groups in Project 1 from Taxonomy 1, but only the test group in Project 4 has a statistically significantly higher recall than the test group in Project 1 (P = 0.016).
For the F1-score, both the train and test groups in Projects 2, 3, and 4 from Taxonomy 2 are better than the corresponding groups in Project 1 from Taxonomy 1, and both the train and test groups in Project 4 have a statistically significantly higher F1-score than the corresponding groups in Project 1 (P = 0.025 and 0.005, respectively).
All of the train and test groups in the Projects (2, 3, and 4) from Taxonomy 2 have a higher Kappa value on prediction of skin disease than the corresponding groups in the Project 1 from Taxonomy 1.
Comparisons among classes in Projects
The results showed that all of the parameters, including sensitivity and recall, specificity, positive predictive value (PPV) and precision, negative predictive value (NPV), and F1, for the 11 diseases were better in the train group than in the test group in Project 1 (Table 3). The F1 for some diseases, especially gonococcal infection and herpes viral infections, was much lower in the test group than in the train group.
In contrast, all of the parameters, including sensitivity and recall, specificity, PPV and precision, NPV, and F1, for the classes in the train groups were similar to those in the test groups at the different classification levels in Projects 2–4 of Taxonomy 2 (Project 2/Level 3, Table 4; Project 3/Level 4, Table 5; Project 4/Level 5, Table 6).
Discussion
Descriptive dermatology of the morphological phenomena of skin has been developing for more than two thousand years16. Briefly, our ancestors separated skin disorders according to their location, their appearance, or, more interestingly, their suspected cause. As a consequence, the textbooks that have shaped our education have sometimes adopted very different ways of presenting and classifying skin diseases17. Classification by similarities became more and more difficult as the complexity of disease was realized18. A new classification that may help diagnosis, disease management, and discipline development is urgently needed.
This study developed a new taxonomy (Taxonomy 2) containing 6 levels (Projects 2–4) and covering most skin diseases based on cytology and pathology, which is entirely new work in dermatology and venereology compared with previous work that focused on classifying one or several types of skin disease with AI techniques4,5,6,7,8,9.
To investigate the predictive effect of the new taxonomy on skin disease, we further compared its accuracy, precision, recall, F1, and Kappa with those of ICD-10 using the Deep Residual Learning method. Precision, recall, and F1-score are commonly used to evaluate the predictive effect of models/projects in multi-class prediction. Precision is the number of samples correctly predicted as a category divided by the number of all samples predicted as that category; it measures the proportion of correct predictions among all predictions for the category and corresponds to the positive predictive value. Recall measures the proportion of correctly identified samples among all samples that truly belong to the category and corresponds to sensitivity. The two measures tend to trade off against each other, and the F1-score, their harmonic mean, is used to balance them. Deep CNNs have wide potential application in the diagnosis of skin diseases, with reported accuracy exceeding that of human dermatologists19,20; this is why we applied them to predict diseases based on the different taxonomies, which also avoids the variability of human judgment.
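The metrics discussed above can all be computed from a confusion matrix. The sketch below uses a small, made-up two-class matrix; the function names and the counts are hypothetical and are not the study's results.

```python
def per_class_metrics(confusion, k):
    """Precision, recall, and F1 for class k.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j.
    """
    tp = confusion[k][k]
    fp = sum(row[k] for row in confusion) - tp   # predicted k, truly another class
    fn = sum(confusion[k]) - tp                  # truly k, predicted another class
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def cohen_kappa(confusion):
    """Cohen's kappa: agreement between truth and prediction beyond chance."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    p_o = sum(confusion[i][i] for i in range(n)) / total
    p_e = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total ** 2
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Made-up counts: rows are true classes, columns are predicted classes.
    confusion = [[8, 2],
                 [1, 9]]
    print(per_class_metrics(confusion, 0))
    print(cohen_kappa(confusion))
```

For class 0 in this toy matrix, precision is 8/9 and recall is 8/10, which shows how a model can score differently on the two measures for the same class.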
Our results indicated that the new taxonomy performed better overall, and the final level of classification tested had a significantly higher F1-score than the ICD-10 taxonomy, which means it may generalize better to unknown data and may provide a better taxonomy system for AI-assisted skin disease prediction in the future.
The literature on the shortcomings of ICD-10 is sparse. A compatible version of ICD-10 specifically adapted to dermatology was produced in Spain in 1999 to overcome these shortcomings. González-López et al.21 confirmed that the ICD-10 system does have some minor shortcomings when it comes to coding certain diseases, particularly newly discovered and emerging diseases. A classification of hypersensitivity/allergic diseases was constructed and validated for ICD-11 by crowdsourcing the allergist community11, because of the well-known misclassification and/or under-notification of these diseases in the ICD, which has a direct and highly detrimental impact on hypersensitivity/allergic disease data22. However, a reclassification of the whole disciplinary system of dermatology has not yet been attempted, so we attempted to construct a new taxonomy in this study. The results of the current study indicate that Taxonomy 2 improves disease prediction compared with ICD-10 on skin diseases, and may have potential application value in future clinical practice in dermatology and venereology.
The current study has the following limitations: 1. AI is the only detection technology used for comparison, but it is not the gold standard for prediction, so it carries systematic error, which may affect the comparison results. 2. The dermatological data did not include histopathological images, which may affect classification accuracy. 3. The train and test groups of Project 1 differ on all three parameters, and those of Project 3 and Project 4 differ on precision and F1-score, respectively. Our purpose in dividing the images into 2 groups was to guard against model overfitting, in which a model performs well on the training group but predicts poorly when applied to other data. We used 2/3 of the data to build the model and adjust the parameters; however, the differences between the train and test groups indicate low credibility of the results. In addition, the images of the different disease types are not balanced, which may result from the insufficient quality of the available skin disease images, especially for some types.
Conclusion and future work
In conclusion, this study is an attempt at a more precise and effective classification of skin disease to support the development of the discipline. The new taxonomy based on cytology and pathology that we developed is an innovation relative to, and a challenge to, the current ICD-10 classification of dermatology, and it has been shown to have an overall better predictive performance, including sensitivity and recall, specificity, PPV and precision, NPV, and F1, compared with ICD-10. The new taxonomy has potential application value for clinical practice using AI techniques for skin disease prediction. However, a more comprehensive system that covers more skin diseases and includes different data, such as dermoscopic and histopathological images, is necessary to further confirm the stability of the taxonomy.
Data availability
The data that support the findings of this study are available from the first author (Jin Bu, dr.jinbu@gmail.com) upon reasonable request.
Change history
26 April 2023
A Correction to this paper has been published: https://doi.org/10.1038/s41598-023-34103-0
References
1. Andersen, L. K. & Davis, M. D. The epidemiology of skin and skin-related diseases: A review of population-based studies performed by using the Rochester Epidemiology Project. Mayo Clin. Proc. 88(12), 1462–1467. https://doi.org/10.1016/j.mayocp.2013.08.018 (2013).
2. Federman, D. G. & Kirsner, R. S. The abilities of primary care physicians in dermatology: Implications for quality of care. Am. J. Manag. Care 3(10), 1487–1492 (1997).
3. Lee, K. & Soyer, H. P. Future developments in teledermoscopy and total body photography. Int. J. Dermatol. Venereol. 2(1), 15–18 (2019).
4. Binder, M. et al. Epiluminescence microscopy-based classification of pigmented skin lesions using computerized image analysis and an artificial neural network. Melanoma Res. 8(3), 261–266. https://doi.org/10.1097/00008390-199806000-00009 (1998).
5. Verma, A. K., Pal, S. & Kumar, S. Classification of skin disease using ensemble data mining techniques. Asian Pac. J. Cancer Prev. 20(6), 1887–1894. https://doi.org/10.31557/APJCP.2019.20.6.1887 (2019).
6. Sharma, D. K. Data mining techniques for prediction of different categories of dermatology diseases. Acad. Inform. Manag. Sci. J. 16, 103–116 (2013).
7. Moradi, N. & Mahdavi-Amiri, N. Kernel sparse representation based model for skin lesions segmentation and classification. Comput. Methods Programs Biomed. 182, 105038. https://doi.org/10.1016/j.cmpb.2019.105038 (2019).
8. Yap, J., Yolland, W. & Tschandl, P. Multimodal skin lesion classification using deep learning. Exp. Dermatol. 27(11), 1261–1267. https://doi.org/10.1111/exd.13777 (2018).
9. Chang, C. L. & Chen, C. H. Applying decision tree and neural network to increase quality of dermatologic diagnosis. Expert Syst. Appl. 36(2), 4035–4041 (2009).
10. Tanno, L. K. et al. Undernotification of anaphylaxis deaths in Brazil due to difficult coding under the ICD-10. Allergy 67(6), 783–789. https://doi.org/10.1111/j.1398-9995.2012.02829.x (2012).
11. Tanno, L. K. et al. Constructing a classification of hypersensitivity/allergic diseases for ICD-11 by crowdsourcing the allergist community. Allergy 70(6), 609–615. https://doi.org/10.1111/all.12604 (2015).
12. World Health Organization. Classification of diseases (ICD). Version updated in 2016. https://icd.who.int/browse10/2016/en. Accessed 2019 April 20.
13. Dasgupta, R. & Fishman, S. J. ISSVA classification. Semin. Pediatr. Surg. 23(4), 158–161. https://doi.org/10.1053/j.sempedsurg.2014.06.016 (2014).
14. Goldman, J. A. et al. Digital mucinous pseudocysts. Arthritis Rheum. 20(4), 997–1002. https://doi.org/10.1002/art.1780200413 (1977).
15. Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542(7639), 115–118. https://doi.org/10.1038/nature21056 (2017).
16. Dainichi, T., Hanakawa, S. & Kabashima, K. Classification of inflammatory skin diseases: A proposal based on the disorders of the three-layered defense systems, barrier, innate immunity and acquired immunity. J. Dermatol. Sci. 76(2), 81–89. https://doi.org/10.1016/j.jdermsci.2014.08.010 (2014).
17. Aractingi, S. Classifying skin diseases: Until where should we go. Exp. Dermatol. 26(8), 681–682. https://doi.org/10.1111/exd.13230 (2017).
18. Thomsen, R. J. et al. Classification of skin diseases in nineteenth century America. Int. J. Dermatol. 32(2), 142–147. https://doi.org/10.1111/j.1365-4362.1993.tb01459.x (1993).
19. Zhang, X. et al. Towards improving diagnosis of skin diseases by combining deep neural network and human knowledge. BMC Med. Inform. Decis. Mak. 18(Suppl 2), 59. https://doi.org/10.1186/s12911-018-0631-9 (2018).
20. Chang, W. Y. et al. Computer-aided diagnosis of skin lesions using conventional digital photography: A reliability and feasibility study. PLoS ONE 8(11), e76212. https://doi.org/10.1371/journal.pone.0076212 (2013).
21. González-López, G. et al. Difficulties coding dermatological disorders using the ICD-10: The DIADERM study. Actas Dermosifiliogr. 109(10), 893–899. https://doi.org/10.1016/j.ad.2018.06.006 (2018).
22. Simpson, C. R. et al. Will Systematized Nomenclature of Medicine-Clinical Terms improve our understanding of the disease burden posed by allergic disorders. Clin. Exp. Allergy 37(11), 1586–1593. https://doi.org/10.1111/j.1365-2222.2007.02830.x (2007).
Funding
This work was supported by the National Natural Science Foundation of China (No. 81971482), the Science and Technology Planned Project of Bureau of Education of Guangzhou (No. 1201610221), and the Science and Technology Program of Guangzhou, China (No. 202102080341).
Author information
Contributions
J.B. created the new taxonomy, analyzed the results, prepared the first draft, and finished the manuscript on the basis of comments from the other authors; Y.L. performed the statistical analysis of the whole study; L.-Q. Q., G. H., and P. J. collected all of the imaging data used; H.-F. H. and E.-X. S. managed the project, and E.-X. S. provided the funding support for the study. All authors participated in providing guidance on methods, confirming the results, and reviewing the draft and revised manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The original online version of this Article was revised: The original version of this Article contained an error in the Funding section. Full information regarding the corrections made can be found in the correction notice for this Article.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bu, J., Lin, Y., Qing, LQ. et al. Prediction of skin disease using a new cytological taxonomy based on cytology and pathology with deep residual learning method. Sci Rep 11, 13764 (2021). https://doi.org/10.1038/s41598-021-92848-y
DOI: https://doi.org/10.1038/s41598-021-92848-y




