Addendum to: Nature Communications https://doi.org/10.1038/s41467-022-35400-4, published online 10 December 2022
In the original version of the article, we developed a set of heuristics for comparing stability data, which was used to analyze the dataset from the Perovskite Database Project. After the initial publication, we received feedback from a reader and identified some errors in the original dataset. To check that those errors had no major influence on the conclusions, we examined the data carefully and reran the analysis with the corrected data. The detailed results are listed below.
1. The errors in the data we used to draw our conclusions originate from errors in the original database. They were already present when we downloaded the data from the Perovskite Database, and none of our operations (stability data extraction, statistical analysis, “data.m” file generation) changed the original data. After a careful check, we found 95 errors among the 7419 data points we used (about 1.3%). These are listed in the attached file, “data correction.xlsx”, and primarily concern the “Cell_architecture” parameter, i.e. mostly the nip or pin label.
2. After correcting the errors, we reran the analysis. The new results show negligible changes. This is to be expected, as only a small fraction of the dataset had the wrong “Cell_architecture” label.
For the results related to the device architecture, the data correction resulted in small shifts in the averages and variances and a slightly larger shift in the TA/TB ratios. The analyses of the categories with more data, i.e. nip and pin (Tables 1 and 2), were less affected by the data correction than those with fewer samples, e.g. inorganic HTL and doped organic HTL (Tables 3 and 4), which illustrates the importance of a large dataset. There were no changes in the results of analyses that do not separate devices by cell architecture, such as the analysis of the importance of the tolerance factor (Tables 5 and 6). To summarize, the data correction did not change any of the conclusions.
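As a rough illustration of why relabelling a small number of devices moves group statistics only slightly, the sketch below uses purely synthetic data. Only the counts (7419 data points, 95 label errors) are taken from this addendum; the 70/30 split between nip and pin, the normal distributions, and the helper names `group_stats` and `flip_labels` are assumptions made for the example and do not reproduce our analysis.

```python
import random
import statistics

def group_stats(values, labels):
    """Return {label: (count, mean, sample variance)} for each label."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    return {lab: (len(vs), statistics.fmean(vs), statistics.variance(vs))
            for lab, vs in groups.items()}

def flip_labels(labels, n_errors, rng):
    """Copy labels and swap nip <-> pin on n_errors randomly chosen entries."""
    noisy = list(labels)
    for i in rng.sample(range(len(noisy)), n_errors):
        noisy[i] = "pin" if noisy[i] == "nip" else "nip"
    return noisy

rng = random.Random(0)
N_POINTS, N_ERRORS = 7419, 95  # counts quoted in the addendum

# Synthetic stand-in for log-lifetimes: the split and the normal
# parameters below are assumptions, not values from the database.
true_labels = ["nip" if rng.random() < 0.7 else "pin" for _ in range(N_POINTS)]
values = [rng.gauss(2.0 if lab == "nip" else 2.5, 1.0) for lab in true_labels]
noisy_labels = flip_labels(true_labels, N_ERRORS, rng)

clean = group_stats(values, true_labels)   # labels after correction
noisy = group_stats(values, noisy_labels)  # labels containing the errors
for lab in ("nip", "pin"):
    shift = noisy[lab][1] - clean[lab][1]
    print(f"{lab}: mean shift caused by mislabelling = {shift:+.4f}")
```

In this toy setting the smaller group absorbs a larger relative share of wrongly assigned points, so its mean moves more than that of the larger group, which is the qualitative pattern described above.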

Fig. 1. The kernel density estimation of the log(TS80m) values for n-i-p and p-i-n structured devices. a Before data correction. b After data correction.

Fig. 2. The kernel density estimation of the log(TS80m) values for spiro-based n-i-p structured devices and p-i-n structured devices. a Before data correction. b After data correction.

Fig. 3. The kernel density estimation of the log(TS80m) values for devices with different HTLs and electrodes. a Before data correction. b After data correction.

Fig. 4. The kernel density estimation of the log(TS80m) values for devices with different tolerance regions. a Before data correction. b After data correction.
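For readers unfamiliar with kernel density estimation, the following minimal sketch shows how such a curve can be computed. It is not the code used to produce Figs. 1 to 4: the Gaussian kernel, the normal-reference bandwidth rule, the base-10 logarithm, and the TS80m values in `ts80m_hours` are all assumptions chosen for illustration.

```python
import math
import statistics

def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples`."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb, h = 1.06 * sigma * n^(-1/5).
        # This choice is an assumption; the addendum names no bandwidth.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centred on each sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density

# Hypothetical TS80m values in hours (not taken from the database).
ts80m_hours = [90, 120, 450, 600, 800, 1500, 3000]
log_ts80 = [math.log10(t) for t in ts80m_hours]

density = gaussian_kde(log_ts80)

# Sanity check: a density should integrate to approximately one.
grid = [i * 0.01 for i in range(-200, 801)]
area = sum(density(x) for x in grid) * 0.01
print(f"integral of the estimated density = {area:.3f}")
```

Evaluating `density` on the same grid for the data before and after correction gives two curves that can be compared in the way panels a and b are compared in each figure.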
3. A small number of errors in a large dataset is to be expected, especially for datasets based on manual data extraction, which is the case for the Perovskite Database. Moreover, a small number of errors in a dataset usually does not affect the validity of a statistical analysis beyond somewhat widening the error bars. That is one of the advantages of big data. Errors are not limited to databases of device metrics; all experimental data contain noise. Noise in a large heterogeneous dataset usually does not have a significant influence on the results, as the errors average each other out and are dwarfed by the rest of the data.
To sum up, our work focuses on the development of a set of heuristics that can be used for a rough comparison of stability data. A small number of errors in the dataset used for the analysis does not negate the meaning and the value of this work. When a few errors were found in the original dataset, the question was raised whether this had affected the conclusions drawn from our analysis. We then checked and corrected the data and found that the errors did not influence the reliability of our analysis or the conclusions drawn from it. If anything, this extra check supports the robustness of the analysis.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, Z., Wang, H., Jacobsson, T.J. et al. Addendum: Big data driven perovskite solar cell stability analysis. Nat Commun 15, 4788 (2024). https://doi.org/10.1038/s41467-024-48894-x
DOI: https://doi.org/10.1038/s41467-024-48894-x