Abstract
In the current landscape of software testing, challenges persist in test case data generation, including variability in data quality and the inherent difficulty of data synthesis. These challenges are further exacerbated in scenarios where data are widely distributed across heterogeneous organizational environments. Privacy regulations and security concerns impose strict constraints on data sharing, preventing centralized data aggregation and highlighting the necessity of a federated environment as a more practical solution. To address the privacy protection and data sharing challenges in federated test case data generation, we propose a Generative Adversarial Network (GAN)-based method specifically designed for federated settings. By leveraging the strong data generation capabilities of GANs, the proposed approach is able to generate high-quality and diverse test case data while preserving data privacy. Specifically, through a protocol grammar-based deep learning framework combined with test case encoder–decoder mechanisms and a GAN-driven sample character generator, the proposed method can predict and generate variant test case samples. In the federated environment, each participant trains the generator and discriminator locally, while model parameters are securely aggregated to achieve global model optimization. Experimental results demonstrate that the generated test case data outperform data produced by traditional methods in terms of coverage and effectiveness, significantly enhancing the efficiency and quality of software testing. Ultimately, the proposed framework provides a scalable solution for identifying latent vulnerabilities in critical infrastructure while strictly adhering to data sovereignty requirements in cross-organizational environments.
Introduction
The rapid proliferation of industrial IoT1 and distributed control systems2 has ushered in an era of unprecedented software complexity3, where ensuring system reliability demands rigorous testing methodologies4. In these mission-critical environments, from smart grids to automated manufacturing, the quality of test cases directly determines a system’s resilience to failures and cyber threats5. However, traditional test generation techniques, whether manual or rule-based, face fundamental limitations when applied to modern distributed architectures6. These methods typically assume centralized access to system data, an assumption that no longer holds in federated ecosystems bound by data sovereignty regulations and organizational silos7.
This paradigm shift introduces a critical dilemma: while testing efficacy depends on comprehensive datasets that capture diverse operational scenarios, privacy and regulatory constraints inherently restrict data sharing across organizational boundaries8. The consequences are far-reaching: test suites developed in isolation often exhibit glaring coverage gaps, failing to account for edge cases that emerge only in cross-organizational interactions9. For instance, a manufacturing protocol tested solely within one enterprise’s network may lack validation for interoperability scenarios with partners’ systems, creating latent vulnerabilities10.
Existing federated learning (FL) approaches offer partial solutions by enabling collaborative model training without raw data exchange11, but they remain fundamentally mismatched to the generative nature of test case synthesis12. Classification-oriented federated models fail to address the unique requirements of test generation, where outputs must maintain syntactic validity for target protocols while exhibiting the semantic diversity necessary to uncover hidden defects13. Moreover, current privacy-preserving techniques often impose unacceptable trade-offs: either compromising test case quality through excessive aggregation or introducing computational overhead that negates the real-time responsiveness required in industrial settings14.
GANs15 initially appeared poised to resolve this impasse through their ability to synthesize realistic test cases from learned data distributions. Yet their conventional implementations rely on centralized training datasets, rendering them incompatible with federated environments. Three systemic barriers emerge: First, the direct sharing of protocol traces (e.g., industrial control commands or device telemetry) violates confidentiality requirements that are both legally mandated and commercially essential16. Second, the inherent heterogeneity of distributed systems, where participants may employ different protocol subsets, device configurations, or operational profiles, leads to non-identically distributed data that disrupts model convergence17. Third, the resource constraints of industrial networks make frequent transmission of high-dimensional test data or model parameters prohibitively inefficient18.
The resolution of this tension carries substantial economic and societal implications. As critical infrastructure and industrial systems grow increasingly interconnected, the ability to perform thorough19, privacy-conscious testing across organizational boundaries becomes not merely advantageous but essential20. It represents the difference between detecting a protocol vulnerability during testing versus encountering it during operation, where consequences range from production downtime to safety incidents. This imperative motivates our investigation into decentralized, generative testing frameworks capable of reconciling the competing demands of test coverage, privacy preservation, and operational practicality in multi-party environments.
Background and contribution
Fuzz testing originated with a focus on general software robustness and security. Early tools, such as those developed by Miller et al.21, targeted UNIX programs and were later extended to Windows NT applications22,23. While these approaches successfully exposed vulnerabilities in standalone software, they were not designed for the structured, stateful nature of network protocols.
As networked systems grew in complexity, model-based fuzzing emerged to address protocol-specific challenges. Tools like Peach24 and SPIKE25 leveraged manually crafted models, typically encoded in XML or template formats, to guide test case generation based on known protocol syntax. Although effective when specifications are available, this paradigm demands significant human effort: formal protocol documentation must be interpreted, and in its absence, reverse engineering from network traces becomes necessary. This process is both labor-intensive and error-prone, limiting scalability to new or proprietary protocols.
To alleviate manual modeling burdens, researchers turned to automated protocol inference from network traffic. Discoverer26 identified recurring message patterns to reconstruct application-layer formats, while Prospex27 combined trace analysis with system execution states to improve the fidelity of inferred state machines28. Other efforts employed Hidden Markov Models (HMMs)29 to learn \(\epsilon\)-machines directly from traffic, enabling intelligent fuzzing and anomaly detection. Despite these advances, most methods remained rooted in classical machine learning and did not exploit the representational power of deep learning.
Recently, deep learning has begun reshaping fuzz testing. Godefroid et al.30 demonstrated that seq2seq models could infer and generate syntactically valid PDF objects for parser testing, highlighting the potential of neural approaches in structured input generation. Building on this insight, our work introduces a Generative Adversarial Network (GAN)-based framework tailored for industrial network protocols. Unlike traditional model-based or reverse-engineering-driven methods, our approach learns protocol syntax directly from raw network traces in an end-to-end manner, offering a more direct and scalable path to test case synthesis.
In the realm of privacy protection, traditional models such as k-anonymity31, l-diversity32, and their recent extensions for weighted graphs33 or missing-value datasets34, rely primarily on data suppression and generalization. These techniques are designed for static data publishing, where quasi-identifiers are masked to prevent re-identification.
However, applying these “transparent” models to protocol test case generation presents a fundamental conflict: utility vs. syntactic precision. Industrial protocols (e.g., Modbus, OPC UA) enforce strict byte-level syntax. Generalization techniques (e.g., replacing specific register addresses with ranges) would inevitably break the protocol’s checksums and structural dependencies, rendering the generated test cases non-executable. Furthermore, for dynamic datasets35, maintaining anatomization constraints is computationally prohibitive in high-dimensional generative tasks. Consequently, instead of publishing generalized data, our approach adopts a “black-box” computation model. We utilize Homomorphic Encryption and Differential Privacy to enable collaborative learning on raw syntax features without ever exposing the underlying data to the coordinator or other participants.
While our work focuses on protocol fuzzing, it is crucial to distinguish our approach from existing federated generative models like FedGAN36, MD-GAN37 and FDGAN38. These models typically target image synthesis, relying on Convolutional Neural Networks (CNNs) to capture spatial pixel correlations. However, they lack the mechanisms to enforce the strict, discrete syntactic rules required for industrial protocols. Direct application of such image-centric FedGANs to protocol data often results in high syntax violation rates due to the absence of hierarchical field constraints. In contrast, FAT-CG introduces a syntax-constrained architecture specifically designed to preserve the sequential logic and boundary integrity of protocol messages.
Our core objective is to establish a novel, sustainable framework for generating syntactically valid and semantically meaningful test cases in a federated setting, where raw protocol data cannot be centrally collected due to privacy or regulatory constraints. To this end, we integrate differential automata, GANs, and autoencoders into a unified architecture, dubbed FAT-CG (Federated Adversarial Test Case Generation), that jointly ensures data privacy, syntactic correctness, and cross-protocol adaptability. Key contributions of this work include:
- A hierarchical autoencoder design that compresses protocol-specific features into privacy-preserving latent representations, achieving 93.8% syntactic compliance under federated constraints.
- A dynamic adversarial training protocol enhanced with Paillier homomorphic encryption and gradient confusion, reducing gradient leakage risks by 87.6% compared to vanilla federated learning (FL) approaches.
- Empirical validation across diverse industrial protocols, demonstrating rapid 12.3-minute cross-protocol adaptation and a high discovery rate of 8.7 anomalies per 1,000 test cases, significantly outperforming existing tools.
The work advances the state-of-the-art in privacy-preserving protocol fuzzing and provides a scalable blueprint for quality assurance in distributed industrial ecosystems. Beyond enterprise platform testing, our method holds broad applicability in government and regulated domains where sensitive data cannot be shared openly.
This applies especially to multi-stakeholder industrial management, where direct data access across departmental boundaries is restricted. For example, in smart grid environments, equipment from different vendors (OEMs) must interoperate seamlessly. However, sharing proprietary protocol implementation details or raw operational logs between vendors and grid operators poses significant intellectual property and security risks. FAT-CG addresses this by enabling collaborative vulnerability scanning and robustness testing across these organizational silos without exposing sensitive internal data.
Organization
The remainder of this paper is organized as follows: Section 2 reviews related work on federated learning and GAN-based testing. Section 3 details the FAT-CG architecture, including federated adversarial training and protocol syntax verification. Sections 4 and 5 present experimental results and discuss broader implications. Finally, Section 6 concludes with future research directions.
Preliminaries
The proposed framework integrates six core technologies to address the challenges of privacy-preserving and high-quality test case generation in federated environments. Below, we elaborate on each foundational component, emphasizing their roles and synergies within the system.
Generative adversarial networks (GANs)
GANs are a class of deep learning models designed to synthesize data that closely approximates real-world distributions. A GAN comprises two neural networks: a generator \(G\) that produces synthetic samples, and a discriminator \(D\) that distinguishes between real and generated data. The two networks engage in a minimax game, where \(G\) aims to deceive \(D\), while \(D\) strives to improve its discrimination accuracy. The adversarial objective is formalized as:
\(\min _{G} \max _{D} V(D, G) = \mathbb {E}_{x \sim p_{\text {data}}(x)}[\log D(x)] + \mathbb {E}_{z \sim p_z(z)}[\log (1 - D(G(z)))]\)
where \(z\) is noise sampled from a prior distribution \(p_z\), and \(G(z)\) generates synthetic data. In our framework, the generator implicitly learns protocol syntax rules (e.g., valid Modbus-TCP function codes) through adversarial training, while the discriminator enforces diversity by pushing generated samples to cover the boundaries of the real data distribution. This mechanism ensures the exploration of edge cases critical for robust testing.
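As a purely illustrative numerical sketch (not part of the FAT-CG implementation), the value function can be evaluated for given discriminator outputs. At the theoretical equilibrium, where the discriminator outputs 0.5 for every sample, the objective attains its known optimum of \(-\log 4\):

```python
import numpy as np

def value_function(d_real, d_fake):
    """Minimax objective V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A maximally confused discriminator outputs 0.5 everywhere,
# giving the equilibrium value V* = log(1/2) + log(1/2) = -log 4.
d_real = np.full(1000, 0.5)
d_fake = np.full(1000, 0.5)
v_star = value_function(d_real, d_fake)  # ≈ -1.3863
```

In practice the generator and discriminator are trained alternately, each taking gradient steps on this objective.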
Federated learning (FL)
Federated Learning enables collaborative model training across distributed entities without sharing raw data, thus preserving privacy. The canonical Federated Averaging (FedAvg) algorithm operates as follows: 1. Local training: Each participant updates model parameters \(\theta _i\) using their private dataset. 2. Secure aggregation: A central coordinator computes a weighted average of local parameters:
\(\theta _{\text {global}} = \sum _{i} \frac{n_i}{N} \theta _i\)
where \(n_i\) is the local data volume and \(N = \sum n_i\). In our context, FL faces two key challenges: (1) Non-IID data: participants may observe distinct protocol types or message patterns, necessitating dynamic weight allocation; and (2) privacy risks: model gradients or parameters could leak sensitive information. To mitigate these, we integrate encryption and differential privacy into the aggregation process. Recent extensions incorporate blockchain for verifiable incremental updates and explainable AI for auditable privacy39, which could complement our HE-based aggregation to enhance coordinator trust in industrial deployments.
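The weighted-average step of FedAvg can be sketched as follows (a minimal illustration; `fedavg` is a hypothetical helper name, and real deployments would operate on encrypted or compressed parameters as described later):

```python
import numpy as np

def fedavg(local_params, local_sizes):
    """FedAvg: theta_global = sum_i (n_i / N) * theta_i."""
    sizes = np.asarray(local_sizes, dtype=float)
    weights = sizes / sizes.sum()          # n_i / N per participant
    stacked = np.stack(local_params)       # shape: (num_clients, dim)
    return np.tensordot(weights, stacked, axes=1)

# Two participants with 100 vs. 300 local samples.
theta = fedavg([np.array([1.0, 0.0]), np.array([0.0, 1.0])], [100, 300])
# → array([0.25, 0.75])
```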
Autoencoders (AEs) and hierarchical compression
Autoencoders are neural networks that compress high-dimensional data into low-dimensional latent representations while preserving semantic fidelity. An AE consists of an encoder \(E\) and a decoder \(D\), trained to minimize the reconstruction loss:
\(\mathscr {L} = \Vert x - D(E(x)) \Vert ^2 + \beta \cdot D_{\text {KL}}\big ( q(z \mid x) \,\Vert \, \mathscr {N}(0, I) \big )\)
where the KL-divergence term regularizes the latent space \(z\) to follow a standard normal distribution. In our framework, a hierarchical AE design compresses protocol-specific features (e.g., message headers, payloads) into privacy-preserving latent vectors. This reduces communication overhead by transmitting only compressed parameters (e.g., 32 dimensions vs. 256 original features) and prevents raw data exposure.
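For illustration, the combined loss can be computed in closed form for a diagonal Gaussian posterior (a sketch assuming latent mean `mu` and log-variance `log_var` parameters; these names are illustrative, not the paper's exact network interface):

```python
import numpy as np

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Reconstruction loss plus closed-form KL(q(z|x) || N(0, I))."""
    recon = np.sum((x - x_recon) ** 2)
    kl = 0.5 * np.sum(mu ** 2 + np.exp(log_var) - 1.0 - log_var)
    return recon + beta * kl

x = np.array([1.0, 2.0])
# Perfect reconstruction with a standard-normal latent gives zero loss.
loss = vae_loss(x, x, mu=np.zeros(32), log_var=np.zeros(32))
```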
Homomorphic encryption (HE)
Homomorphic Encryption (HE) allows computations on encrypted data without decryption, ensuring end-to-end privacy during federated aggregation. We adopt the Paillier cryptosystem, which supports additive homomorphism:
\(E(m_1) \cdot E(m_2) \bmod n^2 = E(m_1 + m_2 \bmod n)\)
where \(m_1, m_2\) are plaintexts and \(n\) is the public key modulus. The complete and optimized Paillier encryption scheme is described as follows:
- Key generation: Select large primes \(p\) and \(q\). Compute the public key \(pk = (n = pq, g)\) and the private key \(sk = \lambda = \text {lcm}(p-1, q-1)\).
- Encryption: For a plaintext \(m \in \mathbb {Z}_n\) and a random \(r \in \mathbb {Z}_n^*\), the corresponding ciphertext is computed as \(c = g^m r^n \bmod n^2\).
- Decryption: Given a ciphertext \(c\), compute \(m = L(c^\lambda \bmod n^2) \cdot \mu \bmod n\), where \(L(x) = \frac{x-1}{n}\) and the modular multiplicative inverse parameter is defined as \(\mu = (L(g^\lambda \bmod n^2))^{-1} \bmod n\).
In our framework, participants upload encrypted gradients, and the coordinator performs weighted averaging directly on ciphertexts. This prevents adversaries from inferring sensitive information through parameter analysis or gradient leakage attacks.
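The scheme above, including its additive homomorphism, can be sketched in a few lines (a toy illustration using the twin primes 10007 and 10009 for readability and the common choice \(g = n + 1\); production systems require moduli of at least 2048 bits and a vetted library):

```python
import math
import random

def keygen(p=10007, q=10009):
    """Toy Paillier key generation; real deployments use >=2048-bit primes."""
    n = p * q
    g = n + 1                                   # standard simplifying choice of g
    lam = math.lcm(p - 1, q - 1)                # lambda = lcm(p-1, q-1)
    L = lambda x: (x - 1) // n                  # L(x) = (x - 1) / n
    mu = pow(L(pow(g, lam, n * n)), -1, n)      # (L(g^lambda mod n^2))^-1 mod n
    return (n, g), (lam, mu)

def encrypt(pk, m):
    n, g = pk
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:                  # r must lie in Z_n^*
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

pk, sk = keygen()
# Multiplying ciphertexts adds the underlying plaintexts (42 + 58 = 100).
c_sum = (encrypt(pk, 42) * encrypt(pk, 58)) % (pk[0] ** 2)
decrypt(pk, sk, c_sum)  # → 100
```

This ciphertext-multiplication property is exactly what lets the coordinator compute weighted sums of encrypted gradients without ever seeing the plaintext updates.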
Black-box privacy preservation and differential privacy (DP)
Unlike traditional models that sanitize data for release, Black-box Privacy Preservation focuses on securing the computation output, ensuring that the aggregate results do not reveal individual contributions. This is particularly crucial in our federated setting where the coordinator (recipient) acts as a potential “honest-but-curious” adversary.
Differential Privacy (DP) serves as the theoretical foundation for this black-box guarantee. Originally proposed by Dwork40, DP provides a strict privacy guarantee by injecting calibration noise into data or model updates, ensuring that the addition or removal of a single element in the dataset does not significantly affect any analysis results.
A randomized mechanism \(\mathscr {M}\) satisfies \((\epsilon , \delta )\)-DP if, for adjacent datasets \(D\) and \(D'\) and any measurable set of outputs \(S\):
\(\Pr [\mathscr {M}(D) \in S] \le e^{\epsilon } \cdot \Pr [\mathscr {M}(D') \in S] + \delta\)
We apply the Gaussian mechanism to perturb gradients during federated aggregation:
\(\mathscr {M}(D) = f(D) + \mathscr {N}\big ( 0, \sigma ^2 I \big ), \quad \sigma \ge \frac{\Delta f \sqrt{2 \ln (1.25/\delta )}}{\epsilon }\)
where \(\Delta f\) is the sensitivity of the query. This ensures protection against membership inference attacks while maintaining model utility.
Recent advancements have successfully adapted DP for privacy-enhancing data aggregation in big data analytics41 and numerical quasi-identifiers42. In our FAT-CG framework, we apply the Gaussian mechanism to the gradients before encryption. This ensures that even if the coordinator decrypts the aggregated global update, the result satisfies DP constraints, effectively masking the presence of any specific protocol trace from the participants.
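The gradient perturbation step can be sketched as follows (an illustrative sketch assuming L2 clipping to bound the sensitivity \(\Delta f\); the function and parameter names are ours, not a library API):

```python
import numpy as np

def gaussian_mechanism(grad, sensitivity, epsilon, delta, rng):
    """Clip the gradient to bound its L2 sensitivity, then add Gaussian noise."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, sensitivity / max(norm, 1e-12))
    # Calibration for (epsilon, delta)-DP: sigma >= sqrt(2 ln(1.25/delta)) * Df / eps
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=grad.shape)

rng = np.random.default_rng(0)
noisy = gaussian_mechanism(np.array([3.0, 4.0]), sensitivity=1.0,
                           epsilon=1.0, delta=1e-5, rng=rng)
```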
Protocol syntax analysis
Protocol syntax analysis validates generated test cases against formal specifications. We employ finite state machines (FSMs) and context-free grammars (CFGs) to model protocol rules. For instance, a Modbus-TCP FSM defines states for parsing headers and protocol data units (PDUs), with transitions governed by byte-level constraints (e.g., valid function codes). Generated messages are dynamically verified against these FSMs, ensuring syntactic compliance. Additionally, syntax rules are incorporated as regularization terms in the generator’s loss function, guiding adversarial training toward protocol-conformant outputs.
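A minimal hand-written validity check for Modbus-TCP frames illustrates the idea (a simplified sketch covering only the MBAP header and function-code constraints, not the full FSM or CFG machinery used in FAT-CG; the set of accepted function codes is limited here to common public codes):

```python
VALID_FUNCTION_CODES = {1, 2, 3, 4, 5, 6, 15, 16}  # common public Modbus codes

def check_modbus_tcp(frame: bytes) -> bool:
    """Walk the frame through header -> PDU states, rejecting on any violation."""
    if len(frame) < 8:
        return False                   # 7-byte MBAP header + at least 1 PDU byte
    protocol_id = int.from_bytes(frame[2:4], "big")
    length = int.from_bytes(frame[4:6], "big")
    if protocol_id != 0:
        return False                   # Modbus-TCP fixes the protocol id to 0
    if length != len(frame) - 6:
        return False                   # length field covers unit id + PDU
    function_code = frame[7]
    return function_code in VALID_FUNCTION_CODES

# Read-holding-registers request (function code 3), address 0, quantity 1:
frame = bytes([0, 1, 0, 0, 0, 6, 17, 3, 0, 0, 0, 1])
check_modbus_tcp(frame)  # → True
```

A generator output failing any of these byte-level checks would be rejected before reaching the target system, and the violation can be fed back as a penalty term during training.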
The integration of these technologies addresses the trilemma of privacy, utility, and efficiency in federated test case generation. GANs provide expressive data synthesis, FL enables decentralized collaboration, and AEs/HE/DP collectively safeguard privacy. Protocol syntax analysis and dynamic verification further ensure the functional validity of generated test cases, forming a cohesive framework for industrial testing applications.
System architecture
The proposed Federated Adversarial Test Case Generation (FAT-CG) framework employs a four-layer collaborative architecture, integrating federated learning, autoencoders, and GANs to address the challenges of privacy protection and data generation quality in a federated environment. The architecture is designed to ensure efficient, secure, and high-quality test case generation, making it highly suitable for industrial protocol testing. The overall process is divided into four layers: Initialization Layer, Feature Compression Layer, Federated Adversarial Training Layer, and Verification Optimization Layer. Each layer plays a crucial role in the overall system, and their interactions are carefully designed to optimize performance and ensure data privacy. Figure 1 illustrates the workflow, where protocol messages are progressively transformed from raw data to syntax-compliant test cases through coordinated interactions between distributed nodes and a central coordinator. The system relies on three key algorithms: Secure Weighted Aggregation (SWA) for feature alignment, Homomorphic Encryption Gradient Aggregation (HEGA) for secure parameter transmission, and Dynamic Adversarial Federated Aggregation (DAFA) for optimizing the generative model.
Before detailing the functional layers, we define the security scope of the FAT-CG framework. Our design primarily targets external adversaries (e.g., eavesdroppers or Man-in-the-Middle attackers) attempting to intercept gradient updates during transmission. For the federated participants, we assume they are non-colluding but may be malicious in data injection (addressed by our anomaly verification). Crucially, for the central coordinator (the recipient), we adopt the standard “Honest-but-Curious” assumption. This implies that the coordinator will faithfully execute the SWA and DAFA aggregation protocols but may attempt to infer sensitive properties from the legitimate outcomes.
System Model of FAT-CG. The workflow proceeds in four collaborative layers: (1) Initialization: Participants extract syntax features and perform local anonymization; (2) Feature Compression: Local features are compressed via autoencoders, and latent parameters are aggregated using SWA to align global feature spaces; (3) Adversarial Training: Participants train local generators, uploading encrypted gradients via HEGA. The coordinator updates the global model using DAFA to balance non-IID distributions; (4) Verification: Generated test cases undergo syntax compliance checks and sandbox anomaly detection to provide feedback for model tuning.
Initialization layer: federated environment and data preprocessing
The Initialization Layer serves as the foundation of the system, focusing on establishing a secure federated environment and performing data preprocessing to ensure data privacy and quality. This layer involves three core steps:
Federated node registration: Each participant in the federated environment registers with the central coordinator through a secure communication channel established via a key agreement protocol such as Elliptic Curve Diffie-Hellman (ECDH). This ensures that all communications between participants and the coordinator are confidential and integrity-protected. Additionally, a digital signature scheme such as the Elliptic Curve Digital Signature Algorithm (ECDSA) is employed for mutual authentication between participants and the coordinator, preventing man-in-the-middle attacks.
Data cleaning and anonymization: Participants perform data cleaning and anonymization on their local datasets to remove sensitive information and ensure data privacy. This involves filtering out sensitive fields (e.g., IP addresses, port numbers) and applying differential privacy techniques, such as Laplace noise injection, to achieve data anonymization. The Laplace mechanism adds noise to the data based on the sensitivity of the query function, ensuring that the anonymized data satisfies the \(\epsilon\)-differential privacy guarantee. Our Laplace parameter \(\epsilon\) is calibrated following the stream-anonymization model of Shamsinezhad et al.43, achieving \(\epsilon \le 0.5\) on 1 M-message traces.
Syntax metadata extraction: Participants extract syntactic metadata from the protocol messages in their local datasets to build a feature vector library. This involves using lightweight parsers to extract structural and statistical features from the protocol messages, such as field lengths, character distributions, and n-gram frequencies. The extracted features are then standardized using Z-score normalization to ensure consistency across participants.
Feature compression layer: autoencoder-driven parameter dimensionality reduction
The Feature Compression Layer aims to reduce the communication overhead in the federated learning process by compressing the protocol-specific features into low-dimensional latent representations. This layer utilizes a hierarchical autoencoder design to achieve efficient data compression while preserving the syntactic structure and boundary conditions of the protocol messages.
Hierarchical autoencoder design: The autoencoder consists of an encoder and a decoder. The encoder compresses the input data into a low-dimensional latent vector, while the decoder reconstructs the data from the latent vector. The autoencoder is trained using a loss function that combines reconstruction loss and feature alignment loss to ensure that the latent representations capture the essential syntactic features of the protocol messages. Alternative clustering-based anonymization44 could be hybridized with our AE to further reduce sensitivity of n-gram features.
Federated parameter mapping: Participants train the autoencoder on their local data and upload the latent space parameters to the coordinator. The coordinator then performs a secure weighted aggregation of the parameters using the SWA algorithm. This algorithm ensures that the aggregated parameters reflect the global distribution of the data while preserving the privacy of individual participants’ data.
Federated adversarial training layer: distributed GAN optimization
The Federated Adversarial Training Layer focuses on optimizing the test case generation process through a distributed GAN architecture. This layer involves a dynamic adversarial training protocol that ensures robust model convergence under non-IID data distributions.
Local generator pretraining: Each participant trains a lightweight GAN generator on their local latent vectors to generate preliminary test cases. The generator is trained using a loss function that combines adversarial loss and syntactic consistency loss, ensuring that the generated test cases are both realistic and syntactically valid.
Dynamic adversarial federated aggregation: The coordinator performs secure gradient aggregation using the HEGA algorithm to update the global generator. The global generator is then distributed to participants, who use it to train their local discriminators. The discriminators provide adversarial feedback to the coordinator, which is used to adjust the weights of the participants in the federated training process.
Verification optimization layer: trusted generation and retraining
The Verification Optimization Layer ensures the effectiveness and security of the generated test cases through a combination of syntactic compliance checks and anomaly detection mechanisms. This layer involves two core mechanisms:
Trusted verification protocol: The generated test cases are verified for syntactic compliance using protocol state machines and for anomaly triggering using sandbox environments. The protocol state machines ensure that the generated test cases adhere to the syntactic rules of the protocol, while the sandbox environments monitor the behavior of the target system when the test cases are applied, detecting any anomalies such as memory leaks or crashes.
Incremental retraining: Test cases that trigger anomalies are used to extract key syntactic patterns, which are then used to build an incremental training dataset. Participants perform incremental retraining on their local generators using this dataset, and the coordinator updates the global model using the Differentially Private Federated Averaging (DP-FedAvg) algorithm. This ensures that the model continues to improve over time, focusing on the most critical aspects of the protocol.
Through phased optimization and modular design, this architecture significantly improves the diversity and effectiveness of test cases while preserving data privacy, reducing the communication burden of the federated environment, and providing a scalable solution for industrial protocol testing.
The proposed scheme
Initialization layer
The Initialization Layer is the foundational module of the FAT-CG framework, focusing on establishing a secure federated environment and performing data preprocessing to ensure data privacy and quality. This layer consists of three core steps: Federated Node Registration, Data Cleaning and Anonymization, and Syntax Metadata Extraction. Each step is crucial for ensuring the security and integrity of the data used in the subsequent layers of the framework.
Federated node registration
The first step in the Initialization Layer is the registration of federated nodes, which involves establishing secure communication channels between each participant and the central coordinator. This process ensures that all data transmissions are confidential and protected from potential eavesdropping or tampering.
Elliptic curve Diffie-Hellman (ECDH) key agreement protocol: The framework selects a standard elliptic curve, such as secp256k1, defined by the equation \(E: y^2 = x^3 + ax + b \bmod p\). The base point \(G\) and the order \(n\) of the curve are also defined.
Then, each participant \(P_i\) generates a private key \(d_i \in [1, n-1]\) and computes the corresponding public key \(Q_i = d_i \cdot G\). The coordinator generates a temporary public-private key pair \((d_c, Q_c = d_c \cdot G)\) and broadcasts \(Q_c\). Both participants and the coordinator compute the shared secret key \(K_i = d_i \cdot Q_c = d_c \cdot Q_i\). The x-coordinate of \(K_i\) is hashed to produce the session key \(H(x_K)\).
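The key agreement can be sketched with textbook affine-coordinate arithmetic over secp256k1 (an educational illustration only: the code is not constant-time, omits point validation, and any real deployment should use a vetted cryptographic library):

```python
import secrets

# secp256k1 domain parameters (a = 0, b = 7)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(p1, p2):
    """Affine point addition; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None                               # p1 == -p2
    if p1 == p2:
        s = (3 * x1 * x1) * pow(2 * y1, -1, P) % P
    else:
        s = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (s * s - x1 - x2) % P
    return x3, (s * (x1 - x3) - y1) % P

def scalar_mult(k, point):
    """Double-and-add scalar multiplication k * point."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

d_i = secrets.randbelow(N - 1) + 1                # participant private key
d_c = secrets.randbelow(N - 1) + 1                # coordinator private key
Q_i, Q_c = scalar_mult(d_i, G), scalar_mult(d_c, G)
# Both sides derive the same shared point K = d_i * Q_c = d_c * Q_i.
assert scalar_mult(d_i, Q_c) == scalar_mult(d_c, Q_i)
```

The x-coordinate of the shared point would then be hashed to derive the session key, as described above.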
Mutual authentication: Participants use the Elliptic Curve Digital Signature Algorithm (ECDSA) to sign their identities. Each participant sends a signature \(Sign(d_i, Nonce_c)\) to the coordinator, which verifies the signature to ensure the authenticity of the participant. This step prevents man-in-the-middle attacks and ensures that only authorized participants can join the federated environment.
Data cleaning and anonymization
After establishing secure communication channels, participants perform data cleaning and anonymization on their local datasets to remove sensitive information and ensure data privacy. This step is crucial for complying with data protection regulations and preventing potential privacy breaches.
Sensitive Field Filtering: Participants define a set of sensitive fields \(S\) (e.g., IP addresses, port numbers, unit IDs) that need to be removed from the data. These fields are identified using regular expressions or predefined patterns. After this, the identified sensitive fields need to be removed from the dataset, ensuring that the remaining data does not contain any personally identifiable information (PII) or confidential data.
Differential privacy noise injection: For non-sensitive numerical fields, participants apply the Laplace mechanism to add noise and ensure differential privacy. The noise is sampled from a Laplace distribution with a scale parameter \(b = \frac{\Delta f}{\epsilon }\), where \(\Delta f\) is the sensitivity of the query function and \(\epsilon\) is the privacy budget.
For categorical fields, participants use the exponential mechanism to add noise. This mechanism assigns a score to each possible value based on a scoring function \(q(D, r)\) and selects a value with a probability proportional to \(\exp \left( \frac{\epsilon q(D, r)}{2\Delta q}\right)\), where \(\Delta q\) is the sensitivity of the scoring function.
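Both mechanisms can be sketched as follows (illustrative helper names; the exponential mechanism samples a candidate index with probability proportional to the exponentiated, sensitivity-scaled score):

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Add Laplace(0, b) noise with scale b = sensitivity / epsilon."""
    return value + rng.laplace(0.0, sensitivity / epsilon)

def exponential_mechanism(scores, sensitivity, epsilon, rng):
    """Pick an index with probability proportional to exp(eps * q / (2 * dq))."""
    scores = np.asarray(scores, dtype=float)
    # Subtracting the max is a standard numerical-stability trick.
    weights = np.exp(epsilon * (scores - scores.max()) / (2.0 * sensitivity))
    return rng.choice(len(scores), p=weights / weights.sum())

rng = np.random.default_rng(42)
noisy_len = laplace_mechanism(128.0, sensitivity=1.0, epsilon=0.5, rng=rng)
choice = exponential_mechanism([3.0, 1.0, 0.5], sensitivity=1.0,
                               epsilon=2.0, rng=rng)
```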
Syntax metadata extraction
The final step in the Initialization Layer is the extraction of syntactic metadata from the protocol messages in the local datasets. This step involves analyzing the structure and statistical properties of the protocol messages to build a feature vector library that captures the essential syntactic characteristics of the data.
Context-free grammar (CFG) inference: Participants use lightweight parsers to extract the syntactic structure of the protocol messages and construct a context-free grammar \(G = (V, \Sigma , R, S)\), where \(V\) is the set of non-terminal symbols, \(\Sigma\) is the set of terminal symbols, \(R\) is the set of production rules, and \(S\) is the start symbol.
Because the inferred grammar is typically highly complex, it must be further reduced without sacrificing accuracy. To this end, the minimum description length (MDL) criterion is used to balance the grammar's complexity against its expressiveness and to optimize the constructed grammar. The result is a grammar that is both concise and able to accurately represent the syntactic structure of the protocol messages.
Feature vector construction: First, participants extract structural features such as message length \(L\), field type distribution \(P(t)\), and field length distribution \(P(l)\). After that, participants compute statistical features such as character entropy \(H\), n-gram frequencies \(M_{ij}\), and other statistical properties of the protocol messages.
After calculating the relevant indicators, the extracted features need to be standardized using Z-score normalization to ensure that they have zero mean and unit variance. This step ensures that the feature vectors are comparable across different participants and protocols.
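The feature extraction and Z-score normalization steps can be sketched as follows; the two Modbus-like byte strings and the choice of a three-dimensional feature vector are illustrative assumptions, not the paper's full feature set:

```python
import numpy as np

def char_entropy(msg: bytes) -> float:
    """Shannon entropy H of the byte distribution (bits per byte)."""
    counts = np.bincount(np.frombuffer(msg, dtype=np.uint8), minlength=256)
    p = counts[counts > 0] / len(msg)
    return float(-(p * np.log2(p)).sum())

def feature_vector(msg: bytes) -> np.ndarray:
    """Toy structural + statistical features: length, entropy, 2-gram count."""
    bigrams = len({msg[i:i + 2] for i in range(len(msg) - 1)})
    return np.array([len(msg), char_entropy(msg), bigrams], dtype=float)

msgs = [b"\x00\x01\x00\x00\x00\x06\x01\x03\x00\x00\x00\x0a",
        b"\x00\x02\x00\x00\x00\x06\x01\x06\x00\x01\x00\xff"]
F = np.stack([feature_vector(m) for m in msgs])

# Z-score normalization: zero mean, unit variance per feature column.
mu, sigma = F.mean(axis=0), F.std(axis=0)
F_norm = (F - mu) / np.where(sigma == 0, 1.0, sigma)  # guard constant columns
```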
Through the above operations, the participants and the coordinator have established a secure communication channel using ECDH and ECDSA, ensuring the security of subsequent transmission. In addition, the participants have constructed a grammatical feature matrix \(F \in \mathbb {R}^{N \times d}\) that captures the grammatical features of the protocol messages in their local datasets.
Feature compression layer
The Feature Compression Layer is a critical component of the FAT-CG framework, designed to reduce the communication overhead in the federated learning process while preserving the essential syntactic and semantic information of the protocol messages. This layer leverages a hierarchical autoencoder architecture to compress the high-dimensional feature vectors extracted from the protocol messages into low-dimensional latent representations. The compressed representations are then used in the subsequent layers for efficient and secure federated learning.
Hierarchical autoencoder design
The hierarchical autoencoder is designed to capture the structured patterns and hierarchical nature of protocol messages. It consists of an encoder and a decoder, both of which are composed of multiple layers to extract and reconstruct the features at different levels of abstraction.
Encoder structure: The encoder takes the standardized syntax feature vector \(F \in \mathbb {R}^{N \times d}\) as input and compresses it into a low-dimensional latent vector \(Z \in \mathbb {R}^{N \times k}\), where \(k \ll d\). The encoder is designed with multiple layers to capture the hierarchical structure of the protocol messages:
-
Syntax-Aware Layer: The first layer of the encoder is a fully connected layer that captures the basic syntactic features of the protocol messages, such as field types and lengths. The output of this layer is given by:
$$\begin{aligned} H_1 = \text {ReLU}(F W_1 + b_1) \end{aligned}$$where \(W_1 \in \mathbb {R}^{d \times 512}\) and \(b_1\) are the weights and biases of the layer, respectively.
-
Structure Abstraction Layer: The second layer is a bidirectional Long Short-Term Memory (LSTM) layer that captures the sequential and contextual dependencies in the protocol messages. The output of this layer is:
$$\begin{aligned} H_2 = \text {BiLSTM}(H_1) \in \mathbb {R}^{N \times 256} \end{aligned}$$This layer helps in modeling the relationships between different fields in the protocol messages.
-
Latent Space Mapping Layer: The third layer is another fully connected layer that maps the output of the LSTM layer to the latent space. The output of this layer is:
$$\begin{aligned} Z = \text {Sigmoid}(H_2 W_z + b_z) \end{aligned}$$where \(W_z \in \mathbb {R}^{256 \times 32}\) and \(b_z\) are the weights and biases of the layer, respectively. The latent vector \(Z\) has a dimension of 32, which is significantly smaller than the original feature vector dimension \(d\).
Decoder Structure: The decoder takes the latent vector \(Z\) as input and reconstructs the original feature vector \(F\). The decoder is designed with a mirror image structure to the encoder:
-
Semantic Expansion Layer: The first layer of the decoder is a fully connected layer that expands the latent vector to a higher dimension:
$$\begin{aligned} H_{d1} = \text {ReLU}(Z W_{d1} + b_{d1}) \end{aligned}$$where \(W_{d1} \in \mathbb {R}^{32 \times 256}\) and \(b_{d1}\) are the weights and biases of the layer, respectively.
-
Sequence Reconstruction Layer: The second layer is a unidirectional LSTM layer that reconstructs the sequential structure of the protocol messages:
$$\begin{aligned} H_{d2} = \text {LSTM}(H_{d1}) \in \mathbb {R}^{N \times 512} \end{aligned}$$This layer helps in recovering the temporal dependencies in the protocol messages.
-
Syntax Reconstruction Layer: The third layer is another fully connected layer that maps the output of the LSTM layer back to the original feature space:
$$\begin{aligned} \hat{F} = \text {Linear}(H_{d2} W_{d2} + b_{d2}) \end{aligned}$$where \(W_{d2} \in \mathbb {R}^{512 \times d}\) and \(b_{d2}\) are the weights and biases of the layer, respectively. The reconstructed feature vector \(\hat{F}\) should closely approximate the original feature vector \(F\).
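To make the dimension flow concrete, the following sketch traces a feature batch through the encoder and decoder with randomly initialized layers. For simplicity, plain dense layers with tanh stand in for the BiLSTM/LSTM stages, so this illustrates only the shape transformations (d → 512 → 256 → 32 → 256 → 512 → d), not a trained model:

```python
import numpy as np

rng = np.random.default_rng(42)
N, d = 8, 1024  # batch of messages, original feature dimension (illustrative)

relu = lambda x: np.maximum(x, 0.0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
# Fresh random weights per call: sufficient for a shape-only sketch.
dense = lambda x, d_out: x @ rng.normal(0, 0.01, (x.shape[1], d_out))

def encode(F):
    h1 = relu(dense(F, 512))       # syntax-aware layer (FC + ReLU)
    h2 = np.tanh(dense(h1, 256))   # stand-in for the BiLSTM abstraction layer
    return sigmoid(dense(h2, 32))  # latent space mapping, k = 32

def decode(Z):
    h1 = relu(dense(Z, 256))       # semantic expansion layer
    h2 = np.tanh(dense(h1, 512))   # stand-in for the LSTM reconstruction layer
    return dense(h2, d)            # linear syntax reconstruction

F = rng.normal(size=(N, d))
Z = encode(F)
F_hat = decode(Z)
```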
Structural novelty analysis: Unlike standard Federated Autoencoders (FedAE) that treat input messages as flat vectors, our Hierarchical AE mimics the nested structure of industrial protocols.
-
Level 1 (header encoder, Dim: 1024\(\rightarrow\)128): Extracts fixed-length control fields (e.g., Transaction ID, Unit ID). This layer uses dense connections to preserve exact field values.
-
Level 2 (payload contextualizer, Dim: 128\(\rightarrow\)64): Uses Bi-LSTM to capture variable-length semantic dependencies (e.g., coil values following a specific write command).
-
Level 3 (latent bottleneck, Dim: 64\(\rightarrow\)32): Fuses structural and semantic features into a compact representation.
The proposed hierarchical design differs fundamentally from standard flat Federated Autoencoders (FedAE). A standard FedAE treats the protocol message as a uniform vector, often blurring the boundaries between control fields and data payloads during compression. In contrast, our architecture explicitly decouples syntactic constraints (processed by the Syntax-Aware Layer) from semantic dependencies (processed by the LSTM-based Structure Abstraction Layer). This design acts as a structural regularizer, ensuring that the latent vector Z retains the strict field boundaries required by the protocol specification-a capability verified by the 96.8% compression ratio achieved without losing header integrity (detailed in Section 5.3.3).
Loss function and training
The autoencoder is trained using a loss function that combines reconstruction loss and feature alignment loss to ensure that the latent representations capture the essential syntactic and semantic information of the protocol messages:
We first formalize the reconstruction loss, which measures the difference between the original feature vector \(F\) and the reconstructed feature vector \(\hat{F}\):
$$\begin{aligned} \mathcal {L}_{\text {rec}} = \frac{1}{N} \sum _{i=1}^{N} \left\| F_i - \hat{F}_i \right\| _2^2 \end{aligned}$$This loss ensures that the autoencoder can accurately reconstruct the input data.
Next, the feature alignment loss is formally defined. It uses contrastive learning to ensure that similar protocol messages are mapped to similar latent representations:
$$\begin{aligned} \mathcal {L}_{\text {align}} = \max \left( 0,\; \left\| z_i - z_i^{+} \right\| _2^2 - \left\| z_i - z_i^{-} \right\| _2^2 + \alpha \right) \end{aligned}$$where \(z_i^{+}\) and \(z_i^{-}\) denote the latent vectors of a similar and a dissimilar message, respectively, and \(\alpha = 0.2\) is a margin parameter. This loss encourages the autoencoder to preserve the semantic relationships between the protocol messages in the latent space.
The total loss is then a weighted sum of the reconstruction loss and the feature alignment loss, with the reconstruction loss serving as the dominant constraint and the alignment loss as a complementary regularizer:
$$\begin{aligned} \mathcal {L}_{\text {AE}} = \mathcal {L}_{\text {rec}} + \lambda \mathcal {L}_{\text {align}} \end{aligned}$$where \(\lambda = 0.5\) is a balancing factor.
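A minimal sketch of the combined training objective, assuming a margin-based contrastive form for the alignment term (the paper's exact contrastive formulation may differ):

```python
import numpy as np

def reconstruction_loss(F, F_hat):
    """Mean squared reconstruction error between input and output."""
    return float(np.mean((F - F_hat) ** 2))

def alignment_loss(z_a, z_pos, z_neg, margin=0.2):
    """Contrastive margin loss: pull similar messages together in latent
    space, push dissimilar ones at least `margin` apart (one plausible
    formulation; assumed for illustration)."""
    d_pos = np.sum((z_a - z_pos) ** 2)
    d_neg = np.sum((z_a - z_neg) ** 2)
    return float(max(0.0, d_pos - d_neg + margin))

def total_loss(F, F_hat, z_a, z_pos, z_neg, lam=0.5):
    """Weighted sum: reconstruction dominates, alignment regularizes."""
    return reconstruction_loss(F, F_hat) + lam * alignment_loss(z_a, z_pos, z_neg)

# Toy usage: perfect alignment (pos == anchor, neg far away) leaves only
# the reconstruction term.
loss = total_loss(np.zeros((2, 2)), np.ones((2, 2)),
                  np.zeros(2), np.zeros(2), np.ones(2))
```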
Federated parameter mapping
After training the autoencoder on their local data, participants upload the latent space parameters to the coordinator, which performs a secure weighted aggregation of the parameters using the SWA algorithm. This algorithm ensures that the aggregated parameters reflect the global distribution of the data while preserving the privacy of individual participants’ data.
For parameter encryption and upload, each participant encrypts its local latent vector using the Paillier homomorphic encryption scheme and uploads the encrypted parameters to the coordinator. The encrypted latent vector for participant \(i\) is given by:
$$\begin{aligned} \tilde{Z}_i = \text {Enc}(Z_i) \end{aligned}$$where \(\text {Enc}\) denotes the Paillier encryption function.
After obtaining the participants' encrypted vectors, the coordinator performs a secure weighted aggregation on them, computing the weighted average of the encrypted parameters with weights proportional to the amount of data each participant holds. The aggregated latent vector is given by:
$$\begin{aligned} \text {Enc}\left( Z_{\text {global}}\right) = \prod _{i=1}^{K} \text {Enc}(Z_i)^{w_i} \bmod N^2 \end{aligned}$$where \(w_i = \frac{n_i}{\sum _{j=1}^{K} n_j}\) and \(N\) is the Paillier modulus. The coordinator then decrypts the aggregated latent vector to obtain the global latent representation:$$\begin{aligned} Z_{\text {global}} = \text {Dec}\left( \text {Enc}\left( Z_{\text {global}}\right) \right) = \sum _{i=1}^{K} w_i Z_i \end{aligned}$$
After this, the coordinator ensures that the global latent space is aligned with the local latent spaces by using the Maximum Mean Discrepancy (MMD) constraint:
$$\begin{aligned} \text {MMD}^2\left( Z_i, Z_{\text {global}}\right) = \left\| \mathbb {E}\left[ \phi (Z_i)\right] - \mathbb {E}\left[ \phi (Z_{\text {global}})\right] \right\| _{H}^2 \end{aligned}$$where \(\phi\) is a Gaussian kernel mapping and \(H\) is the Reproducing Kernel Hilbert Space (RKHS). This constraint ensures that the global latent distribution is consistent with the local latent distributions. The complete procedure for aligning local feature spaces into a global representation, encompassing parameter encryption, secure aggregation, and MMD-based verification, is formally outlined in Algorithm 1.
Secure weighted aggregation for feature alignment.
The output of the Feature Compression Layer is a standardized global latent vector \(\tilde{Z}_{\text {global}} \in \mathbb {R}^{N \times 32}\), which is used in the subsequent layers for efficient and secure federated learning. The global latent vector captures the essential syntactic and semantic information of the protocol messages while significantly reducing the communication overhead. Additionally, the federated autoencoder parameters, including the encrypted aggregation weights \(W_{\text {global}}\), are distributed to the participants for use in the next layer.
By effectively compressing the high-dimensional feature vectors into low-dimensional latent representations, the Feature Compression Layer ensures that the federated learning process is both efficient and secure, making it well-suited for the distributed and privacy-sensitive nature of the FAT-CG framework.
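The encryption and weighted aggregation steps can be illustrated with a toy Paillier implementation. The tiny primes and integer-quantized latent values below are purely for demonstration; a production deployment would use keys of at least 2048 bits (e.g., via the `phe` library) and a fixed-point encoding of the fractional weights:

```python
import math
import random

def keygen(p=293, q=433):
    """Toy Paillier keypair with g = n + 1 (demo-sized primes only)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because g = n + 1
    return (n,), (lam, mu, n)

def enc(pk, m):
    (n,) = pk
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def dec(sk, c):
    lam, mu, n = sk
    n2 = n * n
    L = (pow(c, lam, n2) - 1) // n
    return (L * mu) % n

def aggregate(pk, ciphers, weights):
    """Homomorphic weighted sum: prod Enc(z_i)^w_i = Enc(sum w_i * z_i)."""
    (n,) = pk
    n2 = n * n
    acc = 1
    for c, w in zip(ciphers, weights):
        acc = (acc * pow(c, w, n2)) % n2
    return acc

pk, sk = keygen()
ciphers = [enc(pk, z) for z in [12, 7, 30]]  # quantized latent values
agg = aggregate(pk, ciphers, [3, 5, 2])      # integer weights ~ n_i
```

Note that the coordinator never sees the individual plaintexts: only the weighted sum is decrypted, which is the privacy property the Feature Compression Layer relies on.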
Federated adversarial training layer
The Federated Adversarial Training Layer is a core component of the FAT-CG framework, responsible for optimizing the test case generation process through a distributed GAN architecture. This layer leverages the adversarial interaction between generators and discriminators to enhance the quality and diversity of the generated test cases while preserving data privacy. The training process involves local pretraining of generators, dynamic adversarial aggregation, and adaptive weight allocation to address the challenges of non-Independent and Identically Distributed (non-IID) data.
Local generator pretraining
Each participant in the federated environment pretrains a local generator using the compressed latent vectors received from the Feature Compression Layer. The generator is designed to produce synthetic protocol messages that are syntactically valid and semantically coherent. The architecture of the generator consists of multiple layers to ensure the generation of high-quality test cases:
The first layer of the generator is a fully connected layer that expands the latent vector to a higher dimension, capturing the semantic information of the protocol messages:
$$\begin{aligned} H_{g1} = \text {ReLU}(Z W_{g1} + b_{g1}) \end{aligned}$$where \(W_{g1}\) and \(b_{g1}\) are the weights and biases of the layer, respectively.
The second layer is a unidirectional Long Short-Term Memory (LSTM) layer that generates the sequential structure of the protocol messages:
$$\begin{aligned} H_{g2} = \text {LSTM}(H_{g1}) \end{aligned}$$This layer ensures that the generated messages adhere to the temporal dependencies and structural patterns of the protocol.
The third and final layer is another fully connected layer that maps the output of the LSTM layer to the original feature space, producing the final synthetic protocol message:
$$\begin{aligned} G(Z) = \text {Linear}(H_{g2} W_{g2} + b_{g2}) \end{aligned}$$where \(W_{g2}\) and \(b_{g2}\) are the weights and biases of the layer, respectively.
The generator is trained using a loss function that combines adversarial loss and syntactic consistency loss:
The adversarial loss measures the difference between the distribution of the generated data and the real data, encouraging the generator to produce realistic test cases:
$$\begin{aligned} \mathcal {L}_{\text {adv}} = \mathbb {E}_{z \sim p(Z)}\left[ \log \left( 1 - D(G(z))\right) \right] \end{aligned}$$where \(D\) is the discriminator.
The syntactic consistency loss ensures that the generated protocol messages are syntactically valid by aligning the generated data with the global latent space:
$$\begin{aligned} \mathcal {L}_{\text {syn}} = \left\| \text {Encoder}(G(z)) - \tilde{Z}_{\text {global}} \right\| _2^2 \end{aligned}$$where \(\text {Encoder}\) is the encoder from the Feature Compression Layer.
The total loss function is a weighted sum of the adversarial loss and the syntactic consistency loss:
$$\begin{aligned} \mathcal {L}_{G} = \mathcal {L}_{\text {adv}} + \gamma \mathcal {L}_{\text {syn}} \end{aligned}$$where \(\gamma = 0.3\) is a balancing factor. This composite loss function drives the local optimization process, ensuring that the generator learns both the statistical distribution and the structural constraints of the protocol. The detailed iterative training procedure is summarized in Algorithm 2.
Local generator training with syntax regularization.
Dynamic adversarial federated aggregation
The DAFA mechanism is the core engine of FAT-CG. After local pretraining, the generators and discriminators are aggregated in a federated manner to optimize the global generator and enhance the quality of the generated test cases. The aggregation process involves the following steps:
First, the generator parameters are aggregated. In this step, each participant encrypts its local generator gradients with the Paillier scheme and sends the ciphertexts to the coordinator, which then performs a secure weighted aggregation of the encrypted parameters:
$$\begin{aligned} \text {Enc}\left( \nabla \theta _g^{\text {global}}\right) = \prod _{i=1}^{K} \text {Enc}\left( \nabla \theta _g^{(i)}\right) ^{w_i} \bmod N^2 \end{aligned}$$where \(w_i = \frac{n_i}{\sum _{j=1}^{K} n_j}\) and \(N\) is the Paillier modulus. The coordinator decrypts the aggregated parameters to obtain the global generator update:
The complete execution flow of this secure aggregation protocol, including quantization, encryption, and weighted summation, is formally outlined in Algorithm 3. The global generator parameters are then updated:
$$\begin{aligned} \theta _g^{\text {global}} \leftarrow \theta _g^{\text {global}} - \eta \nabla \theta _g^{\text {global}} \end{aligned}$$where \(\eta\) is the learning rate.
Homomorphic encryption gradient aggregation.
Next, the discriminators are enhanced in a federated manner. The coordinator distributes the global generator to all participants, who then train their local discriminators on both real data and data generated by the global generator. The discriminator loss function is:
$$\begin{aligned} \mathcal {L}_{D} = -\mathbb {E}_{x \sim F}\left[ \log D(x)\right] - \mathbb {E}_{z \sim p(Z)}\left[ \log \left( 1 - D(G_{\text {global}}(z))\right) \right] \end{aligned}$$where \(F\) is the real data and \(G_{\text {global}}\) is the global generator. The participants compute the discriminator confidence scores and upload them to the coordinator.
Finally, the aggregation weights are adaptively reallocated. The coordinator adjusts each participant's weight based on its discriminator confidence score to address the non-IID data distribution:
$$\begin{aligned} w_i^{\prime } = \frac{w_i \exp (\alpha c_i)}{\sum _{j=1}^{K} w_j \exp (\alpha c_j)} \end{aligned}$$where \(c_i\) is the confidence score of participant \(i\) and \(\alpha = 0.5\) is a feedback strength coefficient. While HEGA (Algorithm 3) handles the specific atomic operation of secure gradient transmission, the overarching coordination of these steps—ranging from local generator pre-training and secure aggregation to adaptive weight updating and global model broadcasting—constitutes the complete DAFA protocol. The global execution workflow and the interaction logic between the Coordinator and Participants are systematically summarized in Algorithm 4.
Dynamic adversarial federated aggregation - global workflow.
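The adaptive reallocation step can be sketched as follows. The exponential boosting of data-proportional base weights is one plausible instantiation of the rule described above, with feedback strength alpha = 0.5; the participant data sizes and confidence scores are invented for illustration:

```python
import numpy as np

def adaptive_weights(data_sizes, confidences, alpha=0.5):
    """Reweight participants: start from data-proportional weights, then
    boost participants whose discriminators report higher confidence."""
    base = np.asarray(data_sizes, dtype=float)
    base = base / base.sum()
    boosted = base * np.exp(alpha * np.asarray(confidences, dtype=float))
    return boosted / boosted.sum()  # renormalize to a valid weighting

# Participant 1 has the most data; participant 0 the highest confidence.
w = adaptive_weights([1000, 4000, 500], [0.9, 0.4, 0.7])
```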
Output of the federated adversarial training layer
The output of the Federated Adversarial Training Layer is a globally optimized generator \(G_{\text {global}}\) that has been trained to produce high-quality, syntactically valid test cases. The dynamic aggregation and adaptive weight allocation mechanisms ensure that the generator can effectively handle the challenges of non-IID data and limited communication resources. The optimized generator is then used in the subsequent Verification Optimization Layer to further refine the generated test cases and ensure their effectiveness in detecting potential vulnerabilities in the target system.
By leveraging the adversarial interaction between generators and discriminators in a federated setting, the Federated Adversarial Training Layer significantly enhances the quality and diversity of the generated test cases while preserving data privacy and security. This layer is a key component of the FAT-CG framework, enabling efficient and secure test case generation in a distributed environment.
Verification optimization layer
The Verification Optimization Layer is the final and crucial component of the FAT-CG framework, designed to ensure the effectiveness, reliability, and security of the generated test cases. This layer comprises two primary mechanisms: the Trusted Verification Protocol and the Incremental Retraining mechanism. These mechanisms work in tandem to validate the generated test cases against the protocol specifications and the target system’s behavior, ensuring that the test cases are not only syntactically valid but also capable of triggering potential anomalies or vulnerabilities.
Trusted verification protocol
The Trusted Verification Protocol is responsible for assessing the generated test cases to ensure they adhere to the protocol’s syntactic rules and can trigger relevant anomalies in the target system. This protocol consists of two main components: syntax compliance checking and anomaly triggering assessment.
For the former, syntax compliance checking involves validating the generated test cases against the protocol’s syntactic rules to ensure they are structurally and semantically valid. This is achieved through the use of protocol state machines, which are deterministic finite automata (DFA) constructed based on the protocol’s specifications.
The protocol state machine is built by extracting the syntactic rules from the protocol specification. For example, in the case of Modbus-TCP, the state machine includes states for parsing the MBAP header, PDU, and specific function codes. The state machine is defined as \(M = (Q, \Sigma , \delta , q_0, F)\), where \(Q\) is the set of states, \(\Sigma\) is the set of input symbols (e.g., byte values), \(\delta\) is the transition function, \(q_0\) is the initial state, and \(F\) is the set of accepting states.
On this basis, the generated test cases are input into the state machine in the form of byte sequences. The state machine transitions through the states based on the input bytes, and if the final state is an accepting state, the test case is deemed syntactically valid. If the final state is not an accepting state, the test case is marked as invalid, indicating a syntax violation.
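Syntax compliance checking can be sketched as a tiny DFA. The three-state "header byte, unit ID, function code" layout below is a simplified stand-in for the real Modbus-TCP state machine, whose full state set is much larger:

```python
# Toy DFA that accepts a minimal "header then function code" byte layout.
# States and transitions are illustrative, not the full Modbus-TCP grammar.
ACCEPT = "accept"

def make_dfa():
    valid_codes = {0x03, 0x06, 0x10}   # a few known function codes
    def step(state, byte):
        if state == "start":           # expect a 0x00 protocol-id byte
            return "header" if byte == 0x00 else None
        if state == "header":          # expect a unit ID in 1..247
            return "unit" if 1 <= byte <= 247 else None
        if state == "unit":            # expect a known function code
            return ACCEPT if byte in valid_codes else None
        return None                    # no transitions out of accept/dead

    return step

def syntax_valid(message: bytes) -> bool:
    """Run the byte sequence through the DFA; valid iff it ends accepting."""
    step, state = make_dfa(), "start"
    for b in message:
        state = step(state, b)
        if state is None:
            return False
    return state == ACCEPT
```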
The latter, anomaly triggering assessment, involves injecting the generated test cases into a sandbox environment to monitor the target system's behavior and detect any anomalies or vulnerabilities. This is achieved through a combination of resource monitoring, crash detection, and logical error detection, whose detection indicators are as follows:
The system's memory usage is monitored over time; if the memory growth rate exceeds a predefined threshold (e.g., 1 MB/s), a memory leak is reported. In parallel, the system's signals are monitored; a fatal signal (e.g., SIGSEGV) indicates that a crash has occurred. Finally, the system's response is compared in real time with the expected response derived from the protocol specification; an inconsistent response length or value indicates a logical error.
Incremental retraining
The Incremental Retraining mechanism is designed to continuously improve the quality of the generated test cases by incorporating the insights gained from the anomaly triggering assessment. This mechanism involves two main steps: hotspot data mining and federated fine-tuning.
Hotspot data mining involves extracting key syntactic patterns and byte distributions from the test cases that triggered anomalies. These patterns are used to build an incremental training dataset that focuses on the aspects of the protocol that are most likely to reveal vulnerabilities. For each test case that triggered an anomaly, the syntactic patterns (e.g., function code combinations) and byte distributions (e.g., high-entropy regions) are extracted. These features are analyzed to identify common patterns that are associated with anomalies.
Federated fine-tuning involves using the incremental dataset to fine-tune the global generator in a federated manner. This process ensures that the generator continues to improve over time, focusing on the most critical aspects of the protocol. Each participant fine-tunes its local generator on the incremental dataset \(D_{\text {inc}}\), computing the gradients \(\nabla \theta _g^{\text {inc}}\) and adding Gaussian noise to ensure differential privacy:
$$\begin{aligned} \tilde{\nabla } \theta _g^{\text {inc}} = \nabla \theta _g^{\text {inc}} + \mathcal {N}\left( 0, \sigma ^2 I\right) \end{aligned}$$where \(\sigma = \frac{\sqrt{2\ln (1.25/\delta )}}{\epsilon }\) satisfies \((\epsilon , \delta )\)-differential privacy. The noisy gradients are then uploaded to the coordinator, which computes their weighted average with weights proportional to the participants' data contributions. The global generator is updated using the aggregated gradients:
$$\begin{aligned} \theta _g^{\text {global}} \leftarrow \theta _g^{\text {global}} - \eta \sum _{i=1}^{K} w_i \tilde{\nabla } \theta _g^{\text {inc},(i)} \end{aligned}$$ensuring that it continues to improve while preserving data privacy.
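The noise calibration can be sketched directly from the stated sigma formula. The L2 clipping step below is an added assumption used to bound the gradient sensitivity; the source does not specify the clipping norm:

```python
import math
import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity=1.0):
    """Noise scale for the Gaussian mechanism:
    sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def privatize_gradient(grad, epsilon=1.0, delta=1e-5, clip=1.0, rng=None):
    """Clip the gradient to L2 norm `clip` (assumed sensitivity bound),
    then add calibrated Gaussian noise before upload."""
    rng = rng or np.random.default_rng(0)
    grad = np.asarray(grad, dtype=float)
    norm = np.linalg.norm(grad)
    if norm > clip:
        grad = grad * (clip / norm)
    sigma = gaussian_sigma(epsilon, delta, clip)
    return grad + rng.normal(0.0, sigma, grad.shape)

s = gaussian_sigma(1.0, 1e-5)          # noise scale for eps=1, delta=1e-5
g = privatize_gradient([3.0, 4.0])     # clipped to unit norm, then noised
```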
Output of the verification optimization layer
The output of the Verification Optimization Layer is a set of high-quality, syntactically valid test cases that are capable of triggering significant anomalies in the target system. The layer also produces a report that includes metrics such as the syntax violation rate (SVR) and the anomaly trigger rate (ATR), providing insights into the effectiveness of the generated test cases. Additionally, the layer outputs an incremental dataset that is used to fine-tune the global generator, ensuring continuous improvement in the quality of the generated test cases.
By combining the Trusted Verification Protocol and the Incremental Retraining mechanism, the Verification Optimization Layer ensures that the generated test cases are not only syntactically valid but also effective in detecting potential vulnerabilities in the target system. This layer is a critical component of the FAT-CG framework, providing a robust and reliable method for validating and improving the generated test cases.
Experiments
Experimental objectives and evaluation metrics
The primary objective of the experiments is to validate the effectiveness of the proposed FAT-CG framework in generating high-quality test cases while preserving data privacy and ensuring efficient communication in a federated environment. To holistically assess the framework’s performance, a multi-dimensional evaluation strategy was employed, aligning metrics with both technical requirements and practical deployment constraints.
Privacy preservation constituted a primary focus, given the stringent confidentiality demands in multi-party industrial environments. Two complementary metrics were adopted: GLR, quantifying the feasibility of reconstructing raw training data from encrypted gradients using cosine similarity thresholds, and Membership Inference Attack Success Rate (MIA-SR), measuring an adversary’s ability to discern whether specific samples were part of the training set. These metrics collectively evaluate resilience against advanced attacks targeting federated learning vulnerabilities.
Test case quality was assessed through syntactic validity and defect detection efficacy. Syntax Compliance Rate (SCR), derived from deterministic finite automata (DFA) verification, measured the proportion of generated cases adhering to protocol specifications. Meanwhile, Anomaly Trigger Density (ATD), calculated as anomalies per 1,000 test cases in controlled sandbox environments, quantified the framework’s capacity to uncover latent system vulnerabilities.
Communication efficiency, crucial for resource-constrained industrial networks, was evaluated via Federated Round Time (FRT), capturing end-to-end parameter aggregation latency, and Communication Overhead Reduction (COR), reflecting compression efficiency achieved through hierarchical autoencoders. These metrics validated the framework’s scalability in bandwidth-limited settings.
Finally, cross-protocol generalization was examined through Cross-Protocol Adaptation Time (CPAT), measuring the duration required to adapt pre-trained generators to unseen protocols, alongside SCR/ATD performance on heterogeneous datasets. We explicitly define CPAT as the minimal training time T required for the generator to converge on a new protocol distribution \(P_{target}\). Formally, CPAT is reached when the Kullback-Leibler (KL) divergence satisfies:
$$\begin{aligned} D_{\text {KL}}\left( P_{gen}^{(t)} \,\Big \Vert \, P_{target}\right) < \delta \end{aligned}$$where \(P_{gen}^{(t)}\) is the generator distribution at epoch t, and \(\delta =0.05\) is the empirical stability threshold. This rigorous definition ensures that adaptation implies statistically significant similarity to the target protocol’s syntax, rather than mere convergence of the loss function. We also assess SCR and ATD metrics on the transferred models to verify the quality of the adapted test cases.
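The CPAT stopping rule can be sketched over discrete syntax histograms (e.g., function-code frequencies). The three-bin distributions below are invented for illustration:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL(P || Q), e.g., over function-code histograms."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def adaptation_converged(gen_hist, target_hist, delta=0.05):
    """CPAT stopping rule: the generator's syntax distribution is within
    delta (in KL divergence) of the target protocol's distribution."""
    return kl_divergence(gen_hist, target_hist) < delta

target = [0.50, 0.30, 0.20]  # target protocol's field distribution
early  = [0.90, 0.05, 0.05]  # generator shortly after transfer
late   = [0.52, 0.29, 0.19]  # generator after adaptation
```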
By integrating these dimensions, the experiments provided a comprehensive validation paradigm, balancing theoretical rigor with industrial practicality. Each metric was carefully aligned with real-world attack scenarios, operational constraints, and testing objectives, ensuring the evaluation’s relevance to both academic research and industrial deployment.
Experimental environment and datasets
The experimental setup was designed to emulate real-world industrial federated ecosystems, balancing computational realism with controlled reproducibility. A heterogeneous network architecture was constructed, comprising four NVIDIA Jetson AGX Xavier edge devices (32GB memory each) to simulate distributed participants and a central coordinator hosted on an NVIDIA A100 GPU server (256GB memory). This configuration mirrors typical industrial deployments where resource-constrained edge nodes collaborate through a centralized management entity.
To ensure protocol diversity and operational authenticity, three distinct industrial datasets were employed: Modbus-TCP, OPC UA, and CoAP. The detailed specifications for each dataset, including message volumes, feature dimensions, and source simulators, are summarized in Table 1.
Specifically, we utilized established industrial simulators (e.g., ModbusPal, Eclipse Milo) to generate high-fidelity synthetic traces that replicate the behavior of real-world industrial control systems. As shown in Table 1, the datasets cover a range of protocol complexities-from binary (Modbus) to nested XML (OPC UA). In the federated setting, these datasets were partitioned among 5 participants (\(K=5\)) using a Dirichlet distribution with a concentration parameter \(\alpha = 0.5\). This setup creates a non-independent identically distributed (Non-IID) environment, rigorously simulating the organizational silos and data sovereignty constraints found in cross-enterprise scenarios.
This Non-IID partitioning ensures high data heterogeneity, strictly emulating industrial data silos; for instance, Participant A holds 80% of Read Holding Registers commands, while Participant B holds 90% of Write Multiple Coils commands. This setup rigorously tests the framework’s ability to learn global syntax rules from skewed local views.
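The per-class Dirichlet split described above can be sketched as follows; the three message classes and sample counts are illustrative, while the class-wise `numpy.random.dirichlet` split is the standard recipe for this kind of Non-IID simulation:

```python
import numpy as np

rng = np.random.default_rng(7)

def dirichlet_partition(labels, n_participants=5, alpha=0.5):
    """Split sample indices among participants so that each class (e.g.,
    each Modbus function code) is distributed according to Dirichlet(alpha).
    Smaller alpha yields more skewed local views."""
    labels = np.asarray(labels)
    parts = [[] for _ in range(n_participants)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_participants))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for p, chunk in enumerate(np.split(idx, cuts)):
            parts[p].extend(chunk.tolist())
    return parts

# 3 message classes, 100 samples each; alpha = 0.5 as in the experiments.
labels = np.repeat([0, 1, 2], 100)
parts = dirichlet_partition(labels)
```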
The software stack integrated PySyft and TensorFlow Federated for federated learning orchestration, coupled with OpenSSL 3.0 for implementing ECDH key exchange and Paillier homomorphic encryption. Network conditions were emulated using Linux Traffic Control (TC) to replicate industrial-grade latency (10–50 ms) and bandwidth fluctuations (50–200 Mbps), ensuring ecological validity for communication efficiency metrics. To ensure the reproducibility of our results and provide transparency regarding the experimental conditions, the detailed specifications of the hardware clusters, software dependencies, and key hyperparameter settings used throughout the study are enumerated in Table 2.
To strictly evaluate performance against both academic state-of-the-art and industrial standards, four explicit baselines were systematically compared:
-
Centralized GAN (Upper Bound): A GAN trained on the pooled raw dataset without privacy constraints, representing the theoretical ceiling of generative quality.
-
Boofuzz (Industrial Standard): A widely-used model-based fuzzer (derived from Sulley) configured with standard protocol templates. This serves as the representative for rule-based tools like Peach.
-
Federated CNN: A non-generative federated baseline to benchmark the specific advantage of GAN architectures.
-
DP-FedAvg: A standard privacy-preserving federated learning approach to evaluate the trade-off introduced by our encryption mechanisms.
Experiments were structured into four groups to isolate framework capabilities: privacy protection against gradient inversion and membership inference attacks; test case quality through syntactic and anomaly-based validation; communication efficiency under bandwidth constraints; and cross-protocol generalization via meta-learning adaptation. This multi-faceted design ensured comprehensive evaluation of FAT-CG’s technical innovations while anchoring results in industrially relevant constraints.
Experimental process and key steps
Privacy security verification
To assess the privacy protection capabilities of the FAT-CG framework, this study designs two attack models, the Gradient Leakage Attack and the Membership Inference Attack, which target gradient reverse-engineering and training-set membership inference risks during federated training. By quantifying the effectiveness of the defense mechanisms against these attacks, we validate the framework's robustness in realistic industrial scenarios.
Gradient leakage attack verification
The Gradient Leakage Attack aims to recover original training data by analyzing encrypted gradients aggregated during federated learning. Assuming control of the central coordinator, the attacker can access the encrypted gradient \(\text {Enc}(\nabla \theta _g^{\text {global}})\) and possesses knowledge of the generator architecture G and latent vector distribution p(Z). The attack employs a proxy loss function optimization method based on Paillier homomorphic encryption properties:
$$\begin{aligned} \mathcal {L}_{\text {attack}}(Z') = \left\| \nabla \theta \left( G(Z')\right) - \nabla \theta _g^{\text {global}} \right\| _2^2 + \lambda \, \text {TV}\left( G(Z')\right) \end{aligned}$$where \(\text {TV}(\cdot )\) represents the total variation regularization term that enforces data smoothness. By iteratively optimizing the latent vector \(Z'\) using Projected Gradient Descent (PGD), false data \(G'=G(Z')\) is generated, and its difference from the real gradient is computed. Ultimately, the Gradient Leakage Risk (GLR) is measured by the proportion of samples with a cosine similarity larger than 0.8.
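The GLR metric itself can be sketched directly from its definition as the fraction of recovered samples whose cosine similarity to the corresponding real sample exceeds 0.8; the 2-D sample vectors below are invented for illustration:

```python
import numpy as np

def gradient_leakage_rate(recovered, originals, threshold=0.8):
    """Fraction of recovered samples with cosine similarity to the
    matching real sample above `threshold`."""
    recovered = np.asarray(recovered, dtype=float)
    originals = np.asarray(originals, dtype=float)
    num = np.sum(recovered * originals, axis=1)
    den = (np.linalg.norm(recovered, axis=1) *
           np.linalg.norm(originals, axis=1)) + 1e-12
    return float(np.mean(num / den > threshold))

real = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
fake = np.array([[1.0, 0.1], [1.0, 0.0], [1.0, 1.1]])
glr = gradient_leakage_rate(fake, real)  # two of three pairs match closely
```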
In the experimental setup, the PGD learning rate is set to \(lr=0.01\), the number of iterations \(T=500\), and the regularization weight \(\lambda =0.1\). As shown in Table 3, the baseline scheme without defense mechanisms (Baseline 1) exhibits a high GLR of 78.2%. When using Paillier encryption alone, this percentage decreases to 42.5%. With the introduction of gradient confusion mechanisms, this solution further suppresses the GLR to 9.7%, with a mean data similarity of \(0.23 \pm 0.09\), indicating the difficulty for attackers to recover meaningful information from encrypted gradients.
Membership inference attack verification
The Membership Inference Attack aims to determine whether specific samples participated in federated training. The attacker trains a shadow model \(D_{\text {shadow}}\) based on the global generator \(G_{\text {global}}\) and public data \(D_{\text {pub}}\) to infer membership relationships using confidence scores \(s(x)=D_{\text {shadow}}(x)\). The evaluation metrics used are the Area Under the ROC Curve (AUC-ROC) and Membership Inference Attack Success Rate (MIA-SR).
As shown in Table 4, the baseline scheme without defenses achieves an AUC-ROC of 85.6% and an MIA-SR of 82.1%. Employing differential privacy (Baseline 3) lowers the MIA-SR to 65.4%, while our solution, through adversarial regularization, further reduces it to 48.9%, approaching the level of random guessing (50%). This outcome indicates that adversarial training effectively confuses the statistical features of generated data, weakening the discriminatory ability of the attack model.
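Both evaluation metrics can be computed directly from the shadow model's confidence scores. The sketch below uses the rank-statistic form of AUC and scores MIA-SR as the best balanced accuracy over all thresholds; the threshold sweep is one common convention and an assumption on our part, not necessarily the paper's exact protocol:

```python
import numpy as np

def auc_roc(member_scores, nonmember_scores):
    # rank-statistic form of AUC: probability that a random member
    # outscores a random non-member (ties count half)
    wins = sum(1.0 if m > n else 0.5 if m == n else 0.0
               for m in member_scores for n in nonmember_scores)
    return wins / (len(member_scores) * len(nonmember_scores))

def mia_success_rate(member_scores, nonmember_scores):
    # best balanced accuracy achievable over all confidence thresholds
    best = 0.0
    for t in set(member_scores) | set(nonmember_scores):
        tpr = np.mean([s >= t for s in member_scores])
        tnr = np.mean([s < t for s in nonmember_scores])
        best = max(best, (tpr + tnr) / 2)
    return float(best)

members, outsiders = [0.6, 0.4], [0.5, 0.3]
print(auc_roc(members, outsiders), mia_success_rate(members, outsiders))  # 0.75 0.75
```

An AUC-ROC and MIA-SR near 0.5 under these definitions means membership confers no usable scoring advantage, which is the target the adversarial regularization approaches.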
Dynamic analysis of attack effects
To further elucidate the temporal robustness of defense mechanisms, Fig. 2 illustrates the trend of Gradient Leakage Attack success rate with the number of iterations. The horizontal axis represents the attack iteration steps (0–500), while the vertical axis represents GLR. The experiment compares three scenarios:
- No Defense (Baseline 1): GLR rapidly increases with iterations, stabilizing at 78.2% after 300 steps.
- Paillier Encryption Only: GLR grows slowly, reaching 42.5% after 500 steps.
- This Solution (Encryption + Confusion): GLR consistently remains below 10%, with no converging trend.
Defense mechanism effects.
Results discussion
The experiments demonstrate that the hierarchical defense mechanisms of this solution significantly enhance the privacy security of federated testing. Paillier encryption prevents gradient plaintext exposure through homomorphic operations, while gradient confusion disrupts attackers’ optimization objectives by introducing irreversible noise. Moreover, adversarial regularization dynamically adjusts the generated distribution, effectively blurring the boundaries between training data and public data. These mechanisms work in synergy, making it challenging for attackers to perform effective inferences even with prior knowledge of protocols (such as Modbus function code structures). However, the experiments also reveal a defense side effect: adversarial regularization may lead to a decrease in discriminator classification accuracy by approximately 5%, requiring optimization balancing between privacy and model utility.
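The additive homomorphism that lets the coordinator aggregate gradients it cannot read can be illustrated with a toy Paillier implementation. This is a sketch only: the tiny primes are chosen for readability, and production systems use vetted libraries with moduli of at least 2048 bits.

```python
from math import gcd
import random

# Toy Paillier keypair (illustrative primes only)
p, q = 293, 433
n, n2 = p * q, (p * q) ** 2
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1)
mu = pow(lam, -1, n)                            # valid because g = n + 1

def encrypt(m, rng=random.Random(7)):           # fixed seed for determinism
    while True:
        r = rng.randrange(1, n)
        if gcd(r, n) == 1:
            break
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

def add(c1, c2):
    # homomorphic addition: multiplying ciphertexts adds the plaintexts
    return c1 * c2 % n2

print(decrypt(add(encrypt(12), encrypt(30))))  # 42
```

Multiplying two ciphertexts modulo \(n^2\) yields an encryption of the sum of the plaintexts, which is exactly the aggregation operation the coordinator performs on encrypted gradient updates.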
Quantifying the privacy-utility tradeoff
While the defense mechanisms effectively mitigate leakage risks, strict differential privacy constraints inevitably introduce noise that may degrade generation quality. To identify the optimal operating point, we analyzed the sensitivity of the SCR to the privacy budget \(\epsilon\). As illustrated in Fig. 3, a smaller \(\epsilon\) (stronger privacy) results in a lower SCR due to excessive gradient perturbation. However, at \(\epsilon =0.5\), the system achieves a sweet spot, maintaining a high SCR of 93.8% while providing sufficient theoretical privacy guarantees. This empirical finding justifies our hyperparameter selection in the subsequent generation experiments.
Privacy-utility tradeoff analysis. The curve illustrates the sensitivity of generation quality (SCR) to the differential privacy budget \(\epsilon\). A smaller \(\epsilon\) (stronger privacy) introduces excessive noise, degrading syntax compliance. The chosen parameter \(\epsilon =0.5\) represents the optimal operating point, balancing rigorous privacy protection with high test case validity.
Beyond resisting algorithmic attacks, this hierarchical defense architecture explicitly aligns with the Data protection by design and by default principle mandated by GDPR Article 25. Specifically, the autoencoder-driven feature compression enforces the principle of Data Minimization by discarding non-essential raw attributes at the source. Furthermore, the integration of Paillier homomorphic encryption and differential privacy ensures that all cross-organizational parameter exchanges undergo rigorous Pseudonymization and Encryption, thereby satisfying the stringent regulatory requirements for secure data processing in multi-party industrial environments.
Generation quality verification
The quality of generated test cases is a core dimension for evaluating the effectiveness of the FAT-CG framework. In this section, through quantitative analysis of SCR and ATD, coupled with cross-protocol generalization testing, the system validates the syntax effectiveness and defect detection capability of the generated data.
Syntax compliance evaluation
Syntax compliance verification is based on a protocol state machine implemented as a deterministic finite automaton (DFA). Taking Modbus-TCP as an example, the state machine \(M=(Q,\Sigma ,\delta ,q_0,F)\) comprises MBAP header parsing (initial state \(q_0\)), function code validation (intermediate state \(q_1\)), and PDU processing (acceptance state F). Test cases are fed into the state machine as byte streams; a test case is deemed compliant if the machine terminates in an accepting state. Verification on 10,000 generated samples (Modbus-TCP protocol) yields an SCR of 93.8% for this solution (Table 5), significantly superior to the baseline scheme (76.4%) and random generation (12.3%). Further analysis indicates that the hierarchical autoencoder effectively captures protocol syntax rules through latent space alignment (the indicator \(L_{\text {align}}\) in Section 4.2.2), reducing errors such as field-length violations and function code overflow.
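A minimal sketch of such an acceptor for Modbus-TCP follows, assuming the standard MBAP layout; the accepted function-code set is a hypothetical subset chosen for illustration, not the paper's full validation table:

```python
# Hypothetical accepted function codes (read/write coils and registers)
VALID_FC = {1, 2, 3, 4, 5, 6, 15, 16}

def accepts(frame: bytes) -> bool:
    """Minimal DFA: q0 = MBAP header checks, q1 = function code, F = PDU."""
    if len(frame) < 8:                          # header + function code needed
        return False
    protocol_id = int.from_bytes(frame[2:4], "big")
    length = int.from_bytes(frame[4:6], "big")
    if protocol_id != 0:                        # q0: Modbus protocol id is 0
        return False
    if length != len(frame) - 6:                # q0: declared length matches
        return False
    if frame[7] not in VALID_FC:                # q1: function code validation
        return False
    return length <= 254                        # F: unit id + PDU within limit

def scr(frames):
    # Syntax Compliance Rate: fraction of frames the DFA accepts
    return sum(map(accepts, frames)) / len(frames)

valid = bytes([0, 1, 0, 0, 0, 6, 17, 3, 0, 107, 0, 3])  # read 3 registers
print(scr([valid, valid[:-1]]))  # 0.5 (truncation breaks the length field)
```

Because every check is a transition on the byte stream, the acceptor runs in a single pass per frame, which makes scoring 10,000 generated samples inexpensive.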
To verify cross-protocol adaptability, this solution reuses the Modbus pre-trained generator on OPC UA (XML nested protocol) with just 1,000 fine-tuning samples, achieving an SCR of 91.2%. Figure 4 compares the trend of SCR with the number of fine-tuning samples across different protocols. The horizontal axis represents the number of fine-tuning samples (0-1,000), while the vertical axis represents SCR. The experiments show:
- Modbus-TCP (binary protocol): The pre-trained model starts with an SCR of 93.8% and stabilizes after fine-tuning.
- OPC UA (nested protocol): Initial SCR is only 32.1%, rising to 87.5% after fine-tuning on 500 samples.
- CoAP (lightweight protocol): SCR reaches 89.7% after fine-tuning on 200 samples.
Relationship between SCR and fine-tuning sample size.
Anomaly trigger capability evaluation
Anomaly trigger capability is evaluated through dynamic monitoring in a sandbox environment. Test cases are injected into a Modbus Slave simulator, monitoring memory leaks (\(\Delta _{\text {RSS}} > 1\ \text {MB/s}\)), process crashes (SIGSEGV signal), and logic errors (deviation in response values). The results show that the ATD of this solution reaches 8.7 anomalies per thousand test cases (Table 6), a 2.7-fold improvement over the baseline scheme (3.2). Further analysis delineates the statistical distribution of the triggered anomalies: memory leaks constitute the majority at 62%, predominantly induced by abnormally long PDUs exceeding 260 bytes; logic errors account for 28%, frequently arising from illegal register addresses or invalid function code sequences; while process crashes, representing 10% of the anomalies, are predominantly attributable to buffer overflow vulnerabilities. To rigorously validate the proposed framework against industry standards and theoretical limits, we conducted a comprehensive comparative analysis involving Random Fuzzing, Boofuzz (template-based), and Centralized GANs. The quantitative benchmarking results across privacy safety, syntax compliance, and anomaly detection metrics are summarized in Table 7.
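The three monitored conditions can be sketched as a simple classifier. The thresholds follow the text; the function and argument names are illustrative, not the framework's actual monitoring API:

```python
import signal

MB = 1024 * 1024

def classify_anomaly(rss_growth_per_s, exit_signal, response, expected):
    """Sandbox-monitor sketch mapping observations to the three
    anomaly classes used in the evaluation."""
    if exit_signal == signal.SIGSEGV:
        return "process_crash"
    if rss_growth_per_s > 1 * MB:          # Delta_RSS > 1 MB/s
        return "memory_leak"
    if response != expected:               # deviation in response values
        return "logic_error"
    return None

def atd(anomaly_count, n_cases):
    # ATD: anomalies triggered per thousand test cases
    return 1000 * anomaly_count / n_cases

print(atd(87, 10_000))  # 8.7, the reported ATD of this solution
```

Checking the crash signal first mirrors the natural priority in a sandbox: a segfaulted target yields no meaningful response or memory telemetry for the weaker checks.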
Figure 5 displays the ATD distribution of different generation methods. The horizontal axis represents test case numbers (1-1,000), while the vertical axis shows cumulative anomaly counts. This solution (green curve) exhibits a significantly higher anomaly growth rate than the baseline (blue curve) after 200 test cases, indicating its more efficient exploration of protocol boundary conditions.
ATD distribution of different generation methods.
Results discussion
The experiments demonstrate that FAT-CG achieves high-quality test case generation through synergistic optimization of adversarial training and syntax constraints. The improvement in SCR (93.8%) is attributed to the structural feature compression of the autoencoder (The indicator \(L_{\text {recon}}\) in Section 4.2.2) and dynamic validation of protocol state machines. The significant advantage in ATD (8.7 vs. 3.2) reflects the generator’s ability to explore protocol edge conditions, such as generating unconventional function code combinations driven by adversarial loss (The indicator \(L_{\text {adv}}\) in Section 4.3.1). The cross-protocol testing further proves that the Meta-Learning mechanism (MAML) enables rapid transfer of syntax knowledge, requiring only a small number of samples to adapt to heterogeneous protocol structures.
Comparison with state-of-the-art baselines
We benchmarked FAT-CG against two critical baselines: (i) Centralized GAN, representing the ideal performance scenario where privacy is ignored; and (ii) Boofuzz, the industry-standard model-based fuzzer. Table 7 presents the quantitative comparison. Against the Centralized GAN, FAT-CG achieves comparable Anomaly Detection Rates (8.7 vs. 9.8) and Branch Coverage (28.4% vs. 31.2%). The slight performance gap is the expected cost of our privacy guarantees (HE+DP), yet the results prove that decentralized training can approximate centralized quality without exposing raw data. Explicitly comparing with Boofuzz, our framework demonstrates a fundamental advantage. While Boofuzz achieves high Syntactic Compliance (99.2%) due to its rigid use of predefined templates, it lacks the ability to learn evolving data patterns, resulting in a low Anomaly Detection Rate (2.1). In contrast, FAT-CG outperforms Boofuzz by a factor of 4.1 in detecting anomalies (8.7 vs. 2.1). This confirms that our generative approach effectively captures deep-state edge cases that static model-based fuzzers (like Boofuzz or Peach) typically miss.
Efficiency verification
The communication and computational efficiency of the federated testing framework directly impact its deployability in industrial scenarios. In this section, through quantitative analysis of FRT and COR, combined with multi-protocol extension experiments, the system evaluates the performance advantages of the FAT-CG framework in resource-constrained environments.
Communication latency and computational overhead
The measurement of FRT covers the complete process of parameter encryption, transmission, aggregation, and decryption. To simulate real industrial networks, the experiments inject random delays (10–50 ms) and bandwidth fluctuations (50–200 Mbps) using Linux Traffic Control (TC). As shown in Table 8, the average FRT for this solution decreases to 185 ms (standard deviation ±18 ms), a 42% reduction compared to the baseline scheme of 320 ms. This optimization stems from the parameter compression of the hierarchical autoencoder (32-dimensional latent space) and the parallel aggregation mechanism of Paillier homomorphic encryption. For instance, in the OPC UA protocol (data dimension \(d=1024\)), the original parameter size is compressed from 4,096 KB to 64 KB, reducing transmission time by 96.7%. In terms of total training convergence time, typically requiring \(T=1000\) rounds, our framework completes global optimization in approximately 185 seconds (185 ms \(\times\) 1000), whereas the baseline requires 320 seconds. This 42% reduction in convergence latency is critical for time-sensitive industrial testing cycles.
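As a back-of-the-envelope check, a simple per-round latency model (link delay plus serialization time plus encryption cost; all parameter values below are illustrative, not measured) shows why compressing the 4,096 KB update to 64 KB dominates the round time even after paying for encryption:

```python
def round_time_ms(payload_kb, bandwidth_mbps, link_delay_ms, crypto_ms):
    # one-way link delay + serialization time + encryption cost
    transfer_ms = payload_kb * 8192 / (bandwidth_mbps * 1e6) * 1000
    return link_delay_ms + transfer_ms + crypto_ms

# raw 4,096 KB update vs. a 64 KB compressed-and-encrypted update
raw = round_time_ms(4096, 100, 30, 0)          # transfer-dominated
compressed = round_time_ms(64, 100, 30, 85)    # crypto cost amortized
print(round(raw), round(compressed))  # 366 120
```

Under this toy model the compressed round is roughly three times faster on a 100 Mbps link, qualitatively matching the measured 320 ms versus 185 ms trend.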
Figure 6 compares the trends of FRT with different data dimensions. The horizontal axis represents data dimensions (256–1024), while the vertical axis represents FRT (ms). Experimental results demonstrate that the Baseline Scheme, which employs plain transmission, exhibits a linear increase in FRT with growing dimensionality, culminating in 320 ms at 1024 dimensions. In contrast, the proposed solution, incorporating integrated encryption and compression, displays a markedly flatter growth curve, reaching only 185 ms at the maximum dimension and exhibiting a substantially reduced slope.
Relationship between data dimension and FRT.
Communication compression efficiency
The calculation formula for COR is \(1 - \frac{N_{\text {compressed}}}{N_{\text {original}}}\), where \(N_{\text {original}}\) is the original GAN parameter size, and \(N_{\text {compressed}}\) is the compressed parameter size. As shown in Table 9, in the 1024-dimensional data scenario, the COR for this solution reaches 98.44%, reducing communication volume by nearly 30 times. This efficiency is attributed to the feature distillation mechanism of the hierarchical autoencoder (The indicator \(L_{\text {AE}}\) in Section 4.2.2), which preserves critical protocol structures through syntax feature alignment (The indicator \(L_{\text {align}}\) in Section 4.2.2) while filtering out redundant information.
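The COR formula reduces to a one-line computation; plugging in the 4,096 KB raw and 64 KB compressed parameter sizes from the preceding subsection reproduces the reported 98.44%:

```python
def cor(n_original, n_compressed):
    # COR = 1 - N_compressed / N_original (any consistent size unit)
    return 1 - n_compressed / n_original

# 4,096 KB of raw generator parameters vs. the 64 KB compressed update
print(round(cor(4096, 64) * 100, 2))  # 98.44
```

Because the units cancel, the same function applies whether sizes are measured in kilobytes or in parameter counts.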
Further analysis reveals a strong dependence of communication efficiency on the latent space dimension. Figure 7 illustrates the curve of COR with latent dimension k, where the horizontal axis represents the latent dimension (16–64) and the vertical axis represents COR. When \(k=32\), COR is 98.44%, and it decreases to 95.12% when \(k=64\), indicating that moderate dimension compression achieves an optimal balance between syntax retention and efficiency.
This substantial reduction in communication overhead directly translates to system scalability. Based on the compression ratio of 98.44% shown in Table 9 (1024 dimensions compressed to 32 latent dimensions), the FAT-CG framework drastically lowers the bandwidth requirement per participant. Theoretically, under a fixed bandwidth budget (e.g., a 100 Mbps industrial link), this efficiency allows the coordinator to aggregate updates from over 60 concurrent participants simultaneously, far exceeding the capacity of a baseline system transmitting raw gradients, which would saturate the network with fewer than 5 nodes. Consequently, although our physical testbed was constrained to a smaller cluster, the architectural projection confirms that the framework is capable of scaling to >50 nodes and processing >1M messages in dense industrial IoT ecosystems without inducing network congestion.
COR curve with latent dimension k.
Multi-protocol scalability
To verify the framework’s scalability, experiments tested three protocols (Modbus-TCP, OPC UA, CoAP) under different hardware configurations for FRT distribution. As shown in Fig. 8, the horizontal axis represents the protocol type, and the vertical axis represents FRT (ms). This solution performs optimally in Modbus-TCP (binary protocol) with 185 ms but slightly increases to 212 ms in OPC UA (nested protocol) due to XML parsing overhead, still significantly outperforming the baseline scheme (410 ms).
FRT distribution of protocols under different hardware configurations.
Results discussion
The experiments demonstrate that FAT-CG achieves efficient federated training under industrial network constraints through collaborative design of hierarchical compression and homomorphic encryption. The reduction in FRT (185 ms vs. 320 ms) primarily benefits from the latent space mapping (\(Z = \sigma (H_2 W_z + b_z)\)) distillation of high-dimensional features, while the improvement in COR (98.44%) reflects the precise extraction ability of the autoencoder for protocol syntax structures. Cross-protocol testing further indicates that the framework’s adaptability is superior for lightweight protocols (such as CoAP) due to their simple field structures and higher compression efficiency.
Edge hardware profiling
We deployed the participant client on NVIDIA Jetson Nano (4GB RAM) to profile the overhead of Paillier Homomorphic Encryption.
As shown in Table 10, although Paillier encryption introduces an 85 ms overhead, the hierarchical autoencoder compresses the parameter vector size by 98.4%, reducing transmission time from an estimated 1,200 ms (for raw models) to 40 ms. Thus, the computational cost of encryption is effectively amortized by the savings in communication bandwidth, making the total round time acceptable for industrial testing cycles.
Cross-protocol generalization verification
The heterogeneity of industrial protocol ecosystems demands that testing frameworks possess the ability to quickly adapt to new protocols. In this section, through the joint evaluation of CPAT and Cross-Protocol Generation Quality (SCR/ATD), the FAT-CG framework’s generalization capacity and knowledge transfer efficiency in dynamic industrial environments are validated.
Meta-learning-driven rapid adaptation
To address the diversity of protocol syntax structures, this solution proposes a dynamic generator based on the Transformer architecture, integrating protocol type embeddings and a multi-branch decoding mechanism. During the pre-training phase using the Modbus-TCP dataset (50,000 samples), foundational syntax representations are constructed through meta-learning optimization (MAML). When faced with a new protocol (such as OPC UA), only a small number of support set samples (1,000 samples) are required to achieve adaptation through 1–3 gradient updates. Figure 9 illustrates the relationship between CPAT and the size of the support set: as the support set increases from 500 to 2,000 samples, the CPAT for OPC UA decreases from 18.7 minutes to 8.2 minutes, and the SCR increases from 82.4% to 93.1%, validating the effectiveness of few-shot learning.
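The MAML inner/outer loop can be illustrated on toy quadratic tasks, a didactic stand-in for the per-protocol generation losses (the task definition and all hyperparameters below are assumptions for illustration):

```python
import numpy as np

def maml(tasks, alpha=0.1, beta=0.5, meta_steps=200):
    """MAML sketch on quadratic tasks L_i(t) = ||t - c_i||^2 / 2.
    Inner step: t_i = t - alpha * grad L_i(t); the outer objective is
    min_t sum_i L_i(t_i), the formula quoted in the results discussion."""
    theta = np.zeros_like(tasks[0])
    for _ in range(meta_steps):
        meta_grad = np.zeros_like(theta)
        for c in tasks:
            adapted = theta - alpha * (theta - c)        # inner update
            meta_grad += (1 - alpha) * (adapted - c)     # d L_i(adapted) / d theta
        theta -= beta * meta_grad / len(tasks)           # outer update
    return theta

tasks = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
theta = maml(tasks)  # converges to the task centroid [0.5, 0.5]
```

The meta-learned initialization sits where one or two inner gradient steps reach any individual task optimum quickly, which is the same property that lets the pre-trained generator adapt to a new protocol with only a handful of updates.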
Correlation between CPAT and support set size.
Performance comparison across heterogeneous protocols
Experiments evaluate generation quality across three protocol types: binary, nested, and lightweight. As shown in Table 11, this solution achieves 91.2% SCR and 7.5 ATD on OPC UA (XML nested protocol), a 23.0% and 257% improvement over the baseline. Figure 10 compares the evolution curves of ATD for different protocols, with the horizontal axis representing test rounds (1–10) and the vertical axis representing ATD. The results reveal distinct behavioral trends: for Modbus-TCP, a binary protocol, ATD remains stable between 8.5 and 9.2, which can be attributed to its straightforward structure and well-defined message boundaries; in the case of OPC UA, a nested protocol, ATD starts at a lower value of 5.1 but rises to 7.5 after five rounds of incremental training, illustrating the framework’s ability to progressively learn complex hierarchical syntax; meanwhile, CoAP, as a lightweight protocol, exhibits ATD fluctuation between 6.2 and 7.1, resulting from the inherent combinatorial complexity of its option fields that leads to less consistent anomaly triggering.
ATD evolution curves of different protocols.
Results discussion
The experiments demonstrate that FAT-CG achieves effective transfer of cross-protocol syntax knowledge through meta-learning and protocol-aware attention mechanisms. The reduction in CPAT (OPC UA: 12.3 minutes) is attributed to the MAML algorithm’s decoupling of foundational syntax features (\(\min _\theta \sum \mathscr {L}_{T_i}(\theta - \alpha \nabla _\theta \mathscr {L}_{T_i})\)), while the improvement in SCR (91.2% vs. 68.2%) reflects the Transformer’s multi-head attention (\(\text {Attention}(Q,K,V)\)) modeling capability for nested labels. As shown in Table 12, compared to existing works, this solution surpasses CNN-based federated methods in protocol generalization12, with a 63% reduction in CPAT and a 2.1-fold increase in ATD, confirming the advantages of dynamic architecture design.
Ablation study
To rigorously validate the contribution of each architectural component, we conducted a comprehensive ablation study by removing key modules from the FAT-CG framework. We evaluated the impact on four dimensions: Privacy (GLR), Efficiency (Bandwidth Consumption), Syntax Quality (SCR), and Diversity (ATD). The quantitative results are summarized in Table 13.
- w/o Autoencoder (AE): We removed the hierarchical autoencoder and transmitted raw gradients directly. As shown in Table 13, while SCR and ATD remained high, the communication overhead increased drastically (bandwidth \(\times\) 28), and FRT spiked to 320 ms. This confirms that the AE is critical for deployment in bandwidth-constrained industrial networks.
- w/o Paillier HE: We replaced the homomorphic encryption with plaintext aggregation. Although this reduced FRT by approximately 40%, the GLR surged to 78.2%, completely violating the privacy requirements of the federated environment.
- w/o Adversarial Training: We replaced the GAN component with a standard federated Variational Autoencoder (VAE). While the model maintained a high Syntax Compliance Rate (94.1%), the ATD dropped significantly to 3.5. This indicates that, without the adversarial loss, the generator fails to explore the edge cases necessary for effective fuzzing.
Conclusions
This study significantly advances the field of federated industrial testing through the introduction of a novel privacy-preserving framework that harmonizes syntactic validity, adversarial diversity, and cross-protocol adaptability. The primary objective was to investigate information security, data transmission, and data integrity within international IAS data sharing environments. By integrating hierarchical autoencoders with homomorphic encryption, the framework effectively addresses persistent communication overhead challenges in federated learning, achieving near-lossless feature compression while rigorously preserving protocol integrity. The embedding of adversarial training within a syntax-constrained latent space enables the framework to not only surpass rule-based tools in anomaly detection performance but also substantially mitigate gradient leakage risks, a critical limitation inherent in prior federated GAN approaches. These innovations collectively validate the hypothesis that decentralized collaboration can successfully coexist with rigorous testing standards, even under stringent data sovereignty constraints.
Beyond its immediate applications in protocol testing, this work redefines conventional privacy-utility trade-offs in distributed generative modeling. The proposed layered defense mechanism, which combines cryptographic safeguards with adversarial regularization, offers a generalizable blueprint adaptable to other sensitive domains such as healthcare and fintech, where data isolation and model efficacy are equally paramount. The framework’s demonstrated success in non-IID industrial environments further indicates that domain-specific feature distillation can effectively alleviate data heterogeneity challenges, a finding carrying profound implications for the future of federated learning within IoT ecosystems. By enabling robust collaborative testing without centralized data aggregation, this research aligns closely with global trends favoring decentralized, regulation-compliant AI systems.
Looking forward, four strategic pathways emerge to extend this research. First, future work will focus on mitigating risks from malicious recipients. We plan to integrate Secure Multi-Party Computation (SMPC) protocols or hardware-based Trusted Execution Environments (TEEs). These technologies will allow for gradient aggregation without ever exposing the plaintext or decryption keys to the central coordinator, thereby closing the trust gap in adversarial settings. Second, extending the framework to support stateful protocols (e.g., SIP) will require the integration of temporal logic verification with recurrent neural architectures to accurately capture session-dependent behaviors. Third, edge-centric optimization through lightweight encryption schemes and asynchronous aggregation protocols could significantly enhance real-time responsiveness for latency-sensitive industrial applications. Finally, fostering human-AI synergy through hybrid interfaces that embed expert knowledge directly into adversarial training cycles would help bridge automated generation with contextual insights, thereby addressing niche vulnerabilities often overlooked by purely data-driven methods.
From a broader perspective, the FAT-CG framework exemplifies how federated systems can transcend traditional compromises between privacy and operational precision. By enabling secure, syntax-aware test generation across organizational boundaries, this work lays a solid foundation for a new era of collaborative quality assurance in critical infrastructure. Potential applications extend to governmental software systems; for instance, environmental management platforms requiring encrypted data handling, such as those supporting the ABS-Clearing House (ABS-CH), digital sequence information (DSI) exchange, and management related to online trade and e-commerce.
As industrial networks continue to grow in both interconnectivity and architectural fragmentation, the proposed approach will become increasingly indispensable for maintaining security and interoperability simultaneously-a dual imperative for the sustainable evolution of cyber-physical ecosystems. Future efforts aimed at generalizing these principles across additional domains could catalyze a paradigm shift in how distributed intelligence systems are developed, tested, and ultimately trusted.
Data availability
To ensure protocol diversity and operational authenticity, three industrial datasets were employed: Modbus-TCP (50,000 messages spanning read/write operations), OPC UA (30,000 requests covering Read, Write, and Browse functions), and CoAP (20,000 IoT interaction messages including GET, PUT, POST, and DELETE operations). Data requests can be made to the corresponding author via email: lu.yiqing@fecomee.org.cn.
References
Liang, W. & Ji, N. Privacy challenges of iot-based blockchain: a systematic review. Clust. Comput. 25, 2203–2221 (2022).
Wang, X., Sun, Y. & Ding, D. Adaptive dynamic programming for networked control systems under communication constraints: A survey of trends and techniques. Int. J. Netw. Dyn. Intell. 85–98 (2022).
Casti, J. L. On system complexity: Identification, measurement, and management. In Complexity, language, and life: Mathematical approaches, 146–173 (Springer, 1986).
Kumar, S., Aggarwal, A. G. & Gupta, R. Modeling the role of testing coverage in the software reliability assessment. Int. J. Math. Eng. Manag. Sci. 8 (2023).
Aghababaeyan, Z. et al. Black-box testing of deep neural networks through test case diversity. IEEE Trans. Softw. Eng. 49, 3182–3204 (2023).
Rampérez, V., Soriano, J., Lizcano, D. & Lara, J. A. Flas: A combination of proactive and reactive auto-scaling architecture for distributed services. Future Gener. Comput. Syst. 118, 56–72 (2021).
Xie, H. et al. A verifiable federated learning algorithm supporting distributed pseudonym tracking. In International Conference on Database Systems for Advanced Applications, 173–189 (Springer, 2024).
Sai, S., Hassija, V., Chamola, V. & Guizani, M. Federated learning and nft-based privacy-preserving medical-data-sharing scheme for intelligent diagnosis in smart healthcare. IEEE Internet Things J. 11, 5568–5577 (2023).
Chen, Z. & Jiang, L. Promise and peril of collaborative code generation models: Balancing effectiveness and memorization. In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 493–505 (2024).
Tram, H. T. N. Automated vulnerability scanning tools for securing cloud-based e-commerce supply chains. J. Appl. Cybersecurity Anal. Intell. Decis. Syst. 12, 11–21 (2022).
Xie, H. et al. Verifiable federated learning with privacy-preserving data aggregation for consumer electronics. IEEE Trans. Consum. Electron. 70, 2696–2707 (2024).
Yang, Y. et al. Federated learning for software engineering: a case study of code clone detection and defect prediction. IEEE Trans. Softw. Eng. 50, 296–321 (2024).
Preuveneers, D. et al. Chained anomaly detection models for federated learning: An intrusion detection case study. Appl. Sci. 8, 2663 (2018).
Singh, G., Sood, K., Rajalakshmi, P., Nguyen, D. D. N. & Xiang, Y. Evaluating federated learning-based intrusion detection scheme for next generation networks. IEEE Trans. Netw. Serv. Manag. 21, 4816–4829 (2024).
Goodfellow, I. et al. Generative adversarial networks. Commun. ACM 63, 139–144 (2020).
Ghoshal, A., Kumar, S. & Mookerjee, V. Dilemma of data sharing alliance: When do competing personalizing and non-personalizing firms share data. Prod. Oper. Manag. 29, 1918–1936 (2020).
Li, Z. et al. Data heterogeneity-robust federated learning via group client selection in industrial iot. IEEE Internet Things J. 9, 17844–17857 (2022).
Ji, X., Tian, J., Zhang, H., Wu, D. & Li, T. Joint device selection and bandwidth allocation for cost-efficient federated learning in industrial internet of things. IEEE Internet Things J. 10, 9148–9160 (2023).
Xie, H. et al. Industrial wireless internet zero trust model: Zero trust meets dynamic federated learning with blockchain. IEEE Wirel. Commun. 31, 22–29 (2024).
Zhao, L., Xie, H., Zhong, L. & Wang, Y. Explainable federated learning scheme for secure healthcare data sharing. Health Inf. Sci. Syst. 12, 1–14 (2024).
Miller, B. P., Fredriksen, L. & So, B. An empirical study of the reliability of unix utilities. Commun. ACM 33, 32–44 (1990).
Ghosh, A., Shah, V. & Schmid, M. An approach for analyzing the robustness of windows nt software. In 21st National Information Systems Security Conference, Crystal City, VA, vol. 10 (Citeseer, 1998).
Ghosh, A. K., Schmid, M. & Shah, V. Testing the robustness of windows nt software. In Proceedings Ninth International Symposium on Software Reliability Engineering (Cat. No. 98TB100257), 231–235 (IEEE, 1998).
Eddington, M. Peach fuzzing platform. Peach Fuzzer 34, 32–43 (2011).
Biyani, A. et al. Extension of spike for encrypted protocol fuzzing. In 2011 Third International Conference on Multimedia Information Networking and Security, 343–347 (IEEE, 2011).
Cui, W., Kannan, J. & Wang, H. J. Discoverer: Automatic protocol reverse engineering from network traces. In USENIX Security Symposium, 1–14 (Boston, MA, USA, 2007).
Comparetti, P. M., Wondracek, G., Kruegel, C. & Kirda, E. Prospex: Protocol specification extraction. In 2009 30th IEEE Symposium on Security and Privacy, 110–125 (IEEE, 2009).
Wondracek, G., Comparetti, P. M., Kruegel, C., Kirda, E. & Anna, S. S. S. Automatic network protocol analysis. In NDSS, vol. 8, 1–14 (Citeseer, 2008).
Whalen, S., Bishop, M. & Crutchfield, J. P. Hidden markov models for automated protocol learning. In Security and Privacy in Communication Networks: 6th International ICST Conference, SecureComm 2010, Singapore, September 7-9, 2010. Proceedings 6, 415–428 (Springer, 2010).
Godefroid, P., Peleg, H. & Singh, R. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), 50–59 (IEEE, 2017).
Sweeney, L. k-anonymity: A model for protecting privacy. International journal of uncertainty, fuzziness and knowledge-based systems 10, 557–570 (2002).
Machanavajjhala, A., Kifer, D., Gehrke, J. & Venkitasubramaniam, M. l-diversity: Privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data (TKDD) 1, 3–es (2007).
Riyana, S., Sasujit, K. & Homdoung, N. Achieving privacy preservation constraints based on k-anonymity in conjunction with adjacency matrix and weighted graphs. ECTI Trans. Comput. Inf. Technol. (ECTI-CIT) 18, 34–50 (2024).
Riyana, S., Nanthachumphu, S. & Riyana, N. Achieving privacy preservation constraints in missing-value datasets. SN Comput. Sci. 1, 227 (2020).
Riyana, S. Achieving anatomization constraints in dynamic datasets. ECTI Trans. Comput. Inf. Technol. (ECTI-CIT) 17, 27–45 (2023).
Rasouli, M., Sun, T. & Rajagopal, R. FedGAN: Federated generative adversarial networks for distributed data. arXiv preprint arXiv:2006.07228 (2020).
Hardy, C., Le Merrer, E. & Sericola, B. MD-GAN: Multi-discriminator generative adversarial networks for distributed datasets. In 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 866–877 (IEEE, 2019).
Dong, Y., Liu, Y., Zhang, H., Chen, S. & Qiao, Y. FD-GAN: Generative adversarial networks with fusion-discriminator for single image dehazing. Proc. AAAI Conf. Artif. Intell. 34, 10729–10736 (2020).
Bhardwaj, T. & Sumangali, K. A federated incremental blockchain framework with privacy preserving XAI optimization for securing healthcare data. Sci. Rep. 15, 38001 (2025).
Dwork, C. Differential privacy. In International colloquium on automata, languages, and programming, 1–12 (Springer, 2006).
Riyana, S., Sasujit, K. & Homdoung, N. Privacy-enhancing data aggregation for big data analytics. ECTI Trans. Comput. Inf. Technol. (ECTI-CIT) 17, 440–456 (2023).
Riyana, S. (\(lp_1\),..., \(lp_n\))-privacy: privacy preservation models for numerical quasi-identifiers and multiple sensitive attributes. J. Ambient Intell. Humaniz. Comput. 12, 9713–9729 (2021).
Shamsinezhad, E., Banirostam, H., BaniRostam, T., Pedram, M. M. & Rahmani, A. M. Providing and evaluating a model for big data anonymization streams by using in-memory processing. Knowl. Inf. Syst. 1–34 (2025).
Shamsinezhad, E., Banirostam, T., Pedram, M. M. & Rahmani, A. M. Anonymizing big data streams using in-memory processing: A novel model based on one-time clustering. J. Signal Process. Syst. 96, 333–356 (2024).
Funding
This research was partially funded by the GEF project "Strengthening coordinated approaches to reduce invasive alien species (IAS) threats to globally significant agrobiodiversity and agroecosystems in China" (project number 9874). This paper has also received funding from the Excellent Talent Training Funding Project in Dongcheng District, Beijing (project number 2024-dchrcpyzz-9).
Author information
Authors and Affiliations
Contributions
Z.W. and L.Z. designed the research plan and methodology and prepared the original draft. Z.W., L.Z., and F.Me. proposed the research ideas, contributed to conceptualization, and carried out reviewing and editing. Z.Z. and Y.L. contributed significantly to the analysis as well as reviewing and editing. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wang, Z., Zhao, L., Meng, F. et al. Secure multi-party test case data generation through generative adversarial networks. Sci Rep 16, 5085 (2026). https://doi.org/10.1038/s41598-026-35773-2