Abstract
Test case data generation remains difficult in software testing: data quality varies, and realistic data are hard to synthesize. These difficulties are compounded when data are distributed across heterogeneous organizational environments. Privacy regulations and security concerns strictly constrain data sharing and prevent centralized aggregation, which makes a federated environment the more practical option. To address the privacy protection and data sharing challenges of federated test case data generation, we propose a Generative Adversarial Network (GAN)-based method designed specifically for federated settings. By exploiting the data generation capability of GANs, the approach generates high-quality, diverse test case data while preserving data privacy. Specifically, it combines a protocol grammar-based deep learning framework with a test case encoder–decoder mechanism and a GAN-driven sample character generator to predict and generate variant test case samples. In the federated environment, each participant trains the generator and discriminator locally, and model parameters are securely aggregated to optimize the global model. Experimental results show that the generated test case data outperform traditional methods in coverage and effectiveness, improving the efficiency and quality of software testing. The framework offers a scalable way to identify latent vulnerabilities in critical infrastructure while adhering to data sovereignty requirements in cross-organizational environments.
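The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a plain FedAvg-style weighted average of locally trained parameters; the abstract does not specify the aggregation rule or the security mechanism, so the function name `federated_average`, the parameter layout, and the client sizes below are hypothetical, and the secure (encrypted or masked) transport is omitted entirely.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flattened list of weights.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    aggregated: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        aggregated[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Three hypothetical participants holding locally trained generator weights.
clients = [
    {"generator.w": [1.0, 2.0], "generator.b": [0.0]},
    {"generator.w": [3.0, 4.0], "generator.b": [1.0]},
    {"generator.w": [5.0, 6.0], "generator.b": [2.0]},
]
sizes = [100, 300, 600]  # illustrative local dataset sizes, not from the paper
global_params = federated_average(clients, sizes)
```

In a full round, the same averaging would presumably be applied to the discriminator parameters as well, and the resulting global parameters sent back to each participant before the next round of local training.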
Data availability
To ensure protocol diversity and operational authenticity, three industrial datasets were employed: Modbus-TCP (50,000 messages spanning read/write operations), OPC UA (30,000 requests covering Read, Write, and Browse functions), and CoAP (20,000 IoT interaction messages including GET, PUT, POST, and DELETE operations). Data requests can be sent to the corresponding author by email: lu.yiqing@fecomee.org.cn.
Funding
This research was partially funded by the GEF project—Strengthening coordinated approaches to reduce invasive alien species (IAS) threats to globally significant agrobiodiversity and agroecosystems in China, with project number 9874. This paper has also received funding from the Excellent Talent Training Funding Project in Dongcheng District, Beijing, with project number 2024-dchrcpyzz-9.
Author information
Contributions
Z.W. and L.Z. designed the research plan and methodology and prepared the original draft. Z.W., L.Z., and F.Me. proposed the research ideas, contributed to the conceptualization, and reviewed and edited the manuscript. Z.Z. and Y.L. contributed significantly to the analysis and to reviewing and editing the manuscript. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wang, Z., Zhao, L., Meng, F. et al. Secure multi-party test case data generation through generative adversarial networks. Sci Rep (2026). https://doi.org/10.1038/s41598-026-35773-2


