Abstract
The pharmaceutical industry is undergoing a paradigm shift towards digitalization and smart manufacturing under the Pharma 4.0 framework, with a growing emphasis on integrating Artificial Intelligence (AI) into Quality-by-Design (QbD) principles. This study proposes an AI-powered information management framework to enhance predictive quality control, regulatory compliance, and operational efficiency in pharmaceutical production. The framework consolidates structured process and product datasets with unstructured regulatory documents, enabling comprehensive data integration and decision support. Machine learning and deep learning models were employed to predict critical quality attributes (CQAs) from critical process parameters (CPPs), while natural language processing (NLP) was applied to manage regulatory documentation. Explainable AI (XAI) techniques, including SHAP and LIME, were integrated to ensure interpretability and compliance with ICH Q8–Q11 guidelines. Experimental evaluations demonstrated the superior predictive accuracy, robustness, and scalability of deep learning approaches compared to traditional QbD methods such as Design of Experiments (DoE) and regression. Statistical hypothesis testing confirmed that the observed improvements were significant (p < 0.01), while ablation studies highlighted the critical role of NLP, dimensionality reduction, and XAI modules in ensuring compliance and efficiency. Benchmarking results further established that the proposed framework outperforms conventional approaches in adaptability to high-dimensional, large-scale datasets, with deep learning models demonstrating resilience under noise, missing data, and process variability. The findings underscore the transformative potential of AI-powered QbD frameworks for advancing smart pharmaceutical production. By integrating predictive analytics, explainability, and regulatory alignment, the proposed approach provides a scalable and compliant pathway toward Pharma 4.0, enabling continuous improvement and patient-centric outcomes.
Introduction
The pharmaceutical industry is undergoing a rapid transformation driven by the convergence of advanced technologies, global health demands, and stringent regulatory frameworks1. Traditionally, pharmaceutical production has relied on labor-intensive, rigid processes that often lack flexibility in adapting to variations in raw materials, process dynamics, or market requirements2. This reliance on conventional practices has contributed to inefficiencies, quality inconsistencies, and long development timelines. In response, the QbD paradigm was introduced to emphasize building quality into products from the earliest stages of development rather than relying solely on end-product testing3. QbD shifts the focus toward understanding CQAs, critical process parameters (CPPs), and their interactions to ensure that pharmaceutical products consistently meet predefined safety and efficacy standards4.
While QbD has been widely endorsed by regulatory agencies such as the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA), its practical implementation presents considerable challenges5. The vast amount of data generated during pharmaceutical development and manufacturing, spanning experimental design, material characterization, process monitoring, and product testing, often overwhelms traditional statistical and rule-based approaches6. The integration of heterogeneous data sources, the complexity of modeling process variability, and the need for real-time decision-making further complicate QbD deployment7. As a result, despite its promise, QbD remains underutilized in many pharmaceutical organizations, leading to a persistent gap between theoretical frameworks and practical outcomes8.
In recent years, the rise of AI has opened new avenues for addressing these challenges. AI-powered systems, particularly those leveraging machine learning, natural language processing, and predictive analytics, offer unprecedented capabilities in extracting knowledge from complex datasets9. Unlike traditional statistical tools, AI can model nonlinear relationships, handle high-dimensional data, and adaptively improve predictions with the availability of new information10. These features make AI a natural fit for enhancing information management in QbD-driven pharmaceutical production. For example, AI models can predict how changes in CPPs influence CQAs, enabling proactive adjustments that minimize variability11. Similarly, natural language processing can facilitate automated knowledge extraction from regulatory documents, technical reports, or scientific literature, ensuring compliance and accelerating innovation12.
The integration of AI into pharmaceutical production aligns with the broader vision of smart manufacturing, often referred to as “Pharma 4.0,” which mirrors the principles of Industry 4.0 in the pharmaceutical context13. Pharma 4.0 emphasizes connectivity, automation, data integration, and real-time control to create a more agile and efficient production ecosystem. Within this vision, AI-powered information management serves as the central nervous system, linking disparate data silos, enhancing decision-making, and supporting continuous improvement14. However, the adoption of AI in pharmaceutical production is still in its nascent stages. Organizations face technical, cultural, and regulatory barriers in embedding AI into established workflows, including issues of data standardization, algorithm transparency, and validation under regulatory scrutiny15.
Despite the significant progress made in both QbD frameworks and AI technologies, a disconnection persists between their potential and real-world implementation16. Many pharmaceutical companies continue to rely on fragmented data management practices, limiting their ability to fully realize the benefits of QbD17. The lack of integrated AI-powered systems means that critical insights remain locked within isolated datasets, preventing the holistic understanding necessary to optimize processes18. Moreover, conventional information management systems often lack scalability and adaptability, resulting in inefficiencies when dealing with the rapidly expanding data landscape of modern pharmaceutical production19. This disconnect hampers innovation, delays product development, and risks compromising quality assurance.
The motivation to pursue AI-powered information management for QbD in pharmaceutical production is multifold20. First, there is a pressing need to improve the efficiency and reliability of drug development and manufacturing processes21. With increasing global demand for affordable medicines, pharmaceutical companies must find ways to shorten development cycles without compromising quality22. AI offers tools to automate routine tasks, accelerate data analysis, and optimize experimental design, directly contributing to faster and more cost-effective production23.
Second, regulatory expectations are evolving, with agencies encouraging the adoption of advanced technologies that enhance transparency, traceability, and control24. By embedding AI into QbD frameworks, pharmaceutical organizations can not only meet but exceed regulatory expectations, establishing systems that provide continuous assurance of product quality25. AI-driven models can be validated and explained using emerging approaches in explainable AI, ensuring that regulatory compliance and accountability are maintained26.
Third, patient safety and therapeutic effectiveness remain the ultimate goals of pharmaceutical production. Inconsistent quality, batch failures, or delays in production can directly impact patient outcomes27. By enabling predictive quality control and real-time monitoring, AI enhances the reliability of pharmaceutical products, fostering greater trust among patients, healthcare providers, and regulators28.
Finally, the broader healthcare ecosystem is increasingly moving toward personalization and precision medicine29. Delivering therapies tailored to specific patient populations requires highly adaptive manufacturing processes supported by intelligent information management systems30. AI-powered QbD frameworks provide the flexibility and adaptability needed to meet these emerging healthcare needs, ensuring that production systems evolve in parallel with scientific and clinical advances31.
In sum, the intersection of AI and QbD represents a critical frontier in the evolution of pharmaceutical production. Addressing the existing challenges in information management through AI integration offers the potential to transform pharmaceutical manufacturing into a more robust, efficient, and patient-centered process32. The major contributions of this research work are as follows.
- This study introduces a comprehensive framework that integrates Artificial Intelligence into Quality-by-Design (QbD), enabling intelligent information management, real-time insights, and adaptive process optimization in pharmaceutical production.
- It addresses the challenge of fragmented data management by demonstrating how AI-powered tools can unify heterogeneous data sources—from material properties and process parameters to regulatory documents—into a cohesive decision-support system.
- The work highlights the role of machine learning and predictive analytics in modeling complex nonlinear relationships between critical process parameters and quality attributes, thus enabling proactive quality assurance.
- The study shows how explainable AI techniques can enhance trust, compliance, and validation within regulatory frameworks, supporting broader adoption of smart pharmaceutical manufacturing practices.
- By situating AI-enabled QbD within the context of Pharma 4.0, the research contributes a vision for next-generation pharmaceutical production that is efficient, patient-centered, and adaptable to precision medicine needs.
The remainder of this article is organized as follows. Section 2, Literature Review, discusses the related work on Quality-by-Design, smart pharmaceutical manufacturing, and the integration of Artificial Intelligence in production systems. Section 3, Methodology, outlines the proposed framework, including the conceptual design, data sources, and AI techniques employed for predictive modeling and information management. Section 4, Results, presents the AI-enabled QbD framework, detailing its system architecture, workflow, and alignment with Pharma 4.0 principles. Finally, the Conclusion summarizes the key findings, highlights improvements over traditional approaches, and suggests directions for future research.
Literature review
The pharmaceutical sector has witnessed a growing interest in applying advanced technologies to enhance product quality, efficiency, and regulatory compliance33. Over the past decade, the QbD paradigm has been extensively studied as a systematic approach to embed quality within processes from the outset. Parallel to this, AI and smart manufacturing principles under the Pharma 4.0 framework have emerged as powerful enablers, offering new possibilities for data-driven decision-making and predictive quality control34.
Nagy et al. (2023) proposed the use of interpretable artificial neural networks (ANNs) to support QbD in pharmaceutical tablet manufacturing35. Their methodology integrated process variables, material attributes, and spectroscopic data (Raman and NIR) as inputs to ANN models, complemented by sensitivity analysis to provide interpretability. The dataset comprised pilot-scale development data for direct-compression extended-release tablets, focusing on predicting critical outcomes such as hardness and dissolution. The results demonstrated that ANNs could achieve accurate predictions while also identifying key factors influencing product quality. However, the study was limited by its retrospective dataset and narrow scope, raising questions about generalizability to other formulations and large-scale production.
Testas et al. (2021) presented a real-world industrial case study applying QbD principles to accelerate the development of a coated tablet formulation36. Their approach relied on systematic risk assessments, DoE, and control strategy development to understand the relationship between critical process parameters and quality attributes. The dataset was based on experimental formulation and process data collected during industrial product development. The study highlighted significant reductions in development time and improvements in regulatory flexibility through structured QbD implementation. Nonetheless, the methodology was grounded in conventional statistical modeling and did not exploit advanced AI tools, limiting its adaptability to more complex or dynamic manufacturing environments.
Wölfle et al. (2025) explored the design of an information model to guide smart pharmaceutical factory development by integrating Axiomatic Design, QbD, Model-Based Systems Engineering, and the V-model37. The methodology was based on a systematic literature review of 176 publications coupled with expert interviews to refine the model. Instead of empirical manufacturing data, the study relied on secondary sources and qualitative feedback. The proposed model established structured workflows to link design, development, and regulatory requirements, thereby supporting iterative and collaborative production system engineering. However, the absence of real-world validation limits the practical demonstration of its effectiveness in improving efficiency or quality outcomes in pharmaceutical factories.
Rajesh et al. (2025) reviewed applications of Artificial Intelligence across the pharmaceutical industry, including formulation, manufacturing, and quality control20. The methodology consisted of synthesizing findings from published literature and categorizing AI approaches such as machine learning, deep learning, and predictive analytics in different stages of pharmaceutical workflows. Since this was a review article, no new dataset was introduced. The results underscored the potential of AI to optimize production, monitor quality, and facilitate regulatory compliance while identifying barriers such as data silos, lack of transparency, and regulatory uncertainty. The limitation lies in the descriptive nature of the review, which did not provide empirical validation or actionable frameworks for industry deployment.
Kandhare et al. (2025) conducted a broad review of AI’s role in pharmaceutical development, focusing on drug formulation, process optimization, and quality assurance38. Their methodology was narrative in nature, drawing on case studies and published literature. No empirical dataset was analyzed, as the work served to synthesize existing knowledge. The review demonstrated that AI can enhance yield, minimize failures, and repurpose molecules more efficiently, showing promise across the drug development pipeline. Yet, the authors emphasized that adoption remains limited in practice due to challenges in explainability, lack of standardized infrastructure, and regulatory hesitancy. This limitation reduces the immediacy of AI’s impact in pharmaceutical production.
Gerzon et al. (2022) critically assessed the integration of QbD with Process Analytical Technology (PAT) and predictive modeling39. The methodology involved a structured review of theoretical approaches and practical case studies where predictive tools have been applied in pharmaceutical manufacturing. The dataset was limited to secondary sources, with no new empirical data generated. The study highlighted the potential of predictive models, including AI, to enhance process understanding and control. Results emphasized that predictive tools could complement traditional QbD by providing real-time monitoring and improved risk assessment. However, limitations included integration complexity, insufficient regulatory clarity, and a lack of detailed case studies validating the combined use of these methods in real manufacturing settings.
Higgins and Johner (2023) examined how validation methodologies for AI/ML systems can be applied across regulated healthcare industries, including pharmaceuticals40. Their approach was based on expert workshops and iterative exchanges to develop a taxonomy of validation practices. The dataset consisted of qualitative feedback and regulatory documents, not empirical data. The results provided distinctions between broad and narrow validation frameworks, proposing a look-up table to guide stakeholders in selecting appropriate validation strategies. While the study advanced understanding of regulatory challenges, it remained conceptual and was not tested on real AI implementations in pharmaceutical production, thereby limiting its direct applicability.
Shahiwala et al. (2023) reviewed the application of AI and machine learning in the design and optimization of drug delivery systems, including nanoparticles and carriers41. The methodology was based on secondary analysis of published case studies and modeling strategies. Datasets referenced were typically small, proof-of-concept experimental studies. The results showed that AI techniques could predict drug release kinetics, optimize formulation variables, and improve targeting accuracy when coupled with experimental feedback. Nonetheless, the majority of cited studies were at an early stage, with limited scalability and regulatory considerations, restricting the immediate industrial translation of these findings.
Askr et al. (2023) carried out a systematic review of deep learning methods in drug discovery, covering applications such as drug–target interaction prediction, side-effect analysis, and dosage optimization42. Their methodology synthesized around 300 research articles published between 2000 and 2022. The datasets discussed included public benchmarks like ChEMBL, DrugBank, and other molecular interaction datasets. The results highlighted the strong performance of CNNs, RNNs, and graph-based models in predictive tasks, as well as their potential for explainable AI applications. The limitation of this review is that it primarily focused on discovery phases and provided limited insights into process manufacturing or QbD, reducing its relevance for smart production frameworks.
Han and Tao (2024) investigated emerging applications of Artificial Intelligence and large language models (LLMs) within pharmaceutical contexts, particularly focusing on regulatory documentation, process description, and quality management. Their methodology involved a trend analysis of literature, patents, and industry reports43. No experimental dataset was used. The results suggested that LLMs could support regulatory compliance, automate report drafting, and aid in defect detection and process optimization. However, the study acknowledged that practical implementation in regulated manufacturing is speculative at this stage, with key barriers such as trust, black-box nature of models, and validation challenges yet to be resolved.
Kandhare and colleagues present a comprehensive review of AI’s transformative role in pharmaceutical sciences, tracing developments from early uses to advanced deep learning approaches38. Their methodology involves categorizing literature across domains such as drug design, formulation, process development, and quality control. Although no new dataset is used, their synthesis identifies key AI methods (e.g., neural networks, reinforcement learning, hybrid models) and their applications (e.g., prediction of solubility, stability, process optimization). They emphasize how AI can streamline workflows and reduce costs, while also flagging challenges such as data sparsity, interpretability, and regulatory acceptance. A limitation of the review is that it remains descriptive rather than empirical, offering broad trends without validating specific AI-QbD frameworks in manufacturing environments.
Rajesh et al. (2025) explore how Artificial Intelligence is intervening in pharmaceutical quality processes, particularly in manufacturing and inspection. Their review classifies AI applications in defect detection, process monitoring, and process acceleration20. The paper does not propose a dataset or novel experiments; instead, it aggregates case reports and industrial examples. Key findings include improved inspection accuracy and throughput in manufacturing lines due to AI-based visual inspection or anomaly detection. However, the authors note that reported applications are often limited to pilot-scale or specific unit operations, and that integration with existing regulatory and control systems remains underdeveloped.
Yang et al. (2025) propose guidelines for qualifying AI algorithms using QbD principles as a foundation44. Their methodology maps QbD steps—such as risk assessment, design space definition, and control strategy—onto the lifecycle of AI model development (training, validation, monitoring). While the paper is conceptual and does not rely on a particular dataset, it provides a structured framework for validating AI in regulated environments. The authors argue that embedding QbD thinking into algorithm development increases trustworthiness and regulatory compatibility. The limitation is that the proposals are theoretical, lacking case studies or real-world demonstrations of AI systems being qualified under this scheme.
Alzahrani et al. (2025) review integration between AI and Internet of Things (IoT) architectures, relevant to smart manufacturing45. Their methodology surveys recent literature combining AI and IoT, and classifies methods by domain (e.g. predictive analytics, edge computing, cloud integration). While not focused exclusively on pharmaceuticals, their findings are highly relevant: they show how real-time sensor data collection via IoT can feed AI models for process adjustments, anomaly detection, and trend forecasting. The authors emphasize challenges in data bandwidth, latency, security, and interoperability. A limitation is that detailed case studies in pharmaceutical manufacturing are scarce in their review, so specific application insights remain limited.
Soori et al. (2023) examine how AI-driven technologies contribute to smart manufacturing automation in pharmaceutical production46. The methodology integrates narrative review with a real-world case illustration of AI implementation (e.g., digital twins, robotics, predictive maintenance). Although full numerical datasets are not provided, the case highlights performance gains in throughput, quality control, and process stability. The authors also reflect on best practices and regulatory considerations. A limitation is that the work is more descriptive than experimental, and quantification of improvements is limited to qualitative or illustrative data rather than rigorously controlled experiments.
Inshutiyimana et al. (2025) review how AI transforms pharmaceutical quality assurance, focusing on data analytics, real-time monitoring, defect detection, predictive maintenance, and compliance47. Their methodology is narrative: they collate examples from industry and academic reports to illustrate how AI is integrated into QA/QC workflows. The review underscores the importance of algorithm transparency, data governance, and robust validation pipelines. As a limitation, the review is centered on broad principles; it does not provide new empirical studies with datasets or benchmark performance of AI models in QA settings.
Rantanen et al. (2015) discuss the transformative potential of AI in biopharmaceutical manufacturing, covering topics like process modeling, automation, downstream/upstream optimization, and quality feedback loops48. The methodology is narrative with illustrative industry examples (e.g., AI-assisted bioreactor control). While no primary experimental dataset is described, the article details how AI can reduce variability, optimize yield, and accelerate scale-up. Limitations include an emphasis on vision over rigorous evaluation; the article notes that trust, explainability, and validation barriers still impede industrial adoption.
Wu et al. (2025) presented a review on how AI impacts modern pharmaceutical formulation and development49. Their methodological approach is to summarize existing works in areas such as solubility prediction, formulation screening, stability modeling, and bioavailability estimation using algorithms like ANN, genetic algorithms, fuzzy logic, etc. Although no novel dataset is used, the authors compile multiple case studies where AI aided in design optimization. Their results show that AI can outperform classical statistical models in complex nonlinear systems. The authors also discuss limitations such as overfitting, data requirements, and lack of interpretability. A limitation is that the majority of cited applications are small scale or proof-of-concept, with limited translation to full-scale production.
Bae et al. (2025) review how AI methods optimize drug delivery, especially in intelligent release profiles, dosage adaptation, and feedback-driven control50. Their methodology is literature review, summarizing how modeling, reinforcement learning, and optimization algorithms are used to fine-tune release kinetics. No new experimental dataset is provided. The review shows that AI can dynamically adapt delivery schedules or formulations according to patient data. A limitation is that most applications are still at lab scale, and challenges remain in integrating such AI-driven delivery into regulatory pipelines or large-scale manufacturing.
Mustoe et al. (2025) propose a conceptual framework for merging AI and big data techniques into pharmaceutical development along the ICH Q-guidelines (Q8, Q9, Q11, Q13)51. The methodology is theoretical, mapping how AI and big data can enhance stages like design space exploration, risk assessment, and process monitoring. They discuss the use of digital twins, simulation models, and predictive analytics to optimize processes and support scale-up. Though no real dataset is used, the framework articulates how AI can be harmonized with regulatory principles. A limitation is that empirical validation is not yet implemented, so predictions of performance gain are speculative rather than proven.
Yang et al. (2025) highlighted the growing integration of Artificial Intelligence into pharmaceutical development and manufacturing, particularly in enhancing QbD frameworks, process optimization, and smart factory concepts44. AI techniques such as machine learning, deep learning, and predictive analytics have shown potential in improving product quality, reducing variability, and accelerating development timelines. However, most reported applications remain limited to small-scale or proof-of-concept studies, with challenges in regulatory compliance, model interpretability, and data integration still hindering large-scale adoption. Overall, the literature emphasizes both the promise of AI-powered QbD and the pressing need for validated, scalable frameworks to achieve true smart pharmaceutical production.
Maharjan et al. (2025) present a study on Digital Twins (DTs) that utilizes AI-driven virtual models to replicate pharmaceutical processes for real-time monitoring and predictive optimization52. Their methodology integrates AI and ML with multi-source datasets of process parameters, material attributes, and sensor data to simulate and control manufacturing systems. The results demonstrate improved efficiency, reduced costs, and enhanced product quality through predictive analytics. However, challenges such as data integration, model accuracy, regulatory validation, and high implementation costs limit large-scale adoption.
El-Kenawy et al. (2025) propose the GGO-ARIMA hybrid model, integrating the Greylag Goose Optimization algorithm with the traditional ARIMA method to enhance electricity demand forecasting in smart cities53. Their methodology optimizes ARIMA parameters using bio-inspired intelligence while incorporating exogenous factors like weather, holidays, and academic schedules. The model was validated using statistical metrics (MSE, RMSE, R²) and tests such as the Wilcoxon signed-rank and ANOVA. Results demonstrate superior forecasting accuracy and stability compared to baseline and other hybrid models, though limitations include computational complexity and dependence on high-quality, multi-source data.
Maharjan et al. (2023) explore the application of machine learning (ML) tools in drug discovery, formulation optimization, and continuous pharmaceutical manufacturing54. The study employs ML algorithms to screen large molecular databases, optimize formulation parameters, and predict biopharmaceutical stability, particularly for mRNA-LNP vaccines, microfluidics, and microparticle dosage forms. The results emphasize ML’s potential to accelerate drug development and enhance predictive stability modeling, reducing cost and time. However, limitations include ethical and regulatory challenges, data quality issues, and the existing gap between industrial implementation and regulatory approval processes.
Elkenawy et al. (2024) present a study on lung cancer classification using the Greylag Goose Optimization (GGO) algorithm integrated with a Multilayer Perceptron (MLP) model to enhance diagnostic accuracy55. The methodology focuses on optimized feature selection through GGO, improving classification performance compared to binary optimization techniques like bPSO, bWOA, and bGWO. The dataset underwent rigorous preprocessing steps, including scaling, normalization, and handling missing data to ensure reliability. Experimental results revealed that the bGGO + MLP hybrid achieved a remarkable 98.4% accuracy, validated through Wilcoxon signed-rank and ANOVA tests. Despite its superior performance, the study acknowledges limitations in computational complexity and potential overfitting on smaller datasets.
Methodology
The methodology of this study is designed to develop and demonstrate an AI-powered information management framework tailored to QbD in pharmaceutical production. It outlines the conceptual foundation, data sources, and analytical techniques employed to integrate artificial intelligence into process design and monitoring. Emphasis is placed on modeling critical quality attributes, harmonizing heterogeneous datasets, and ensuring alignment with regulatory standards. This section details the stepwise approach undertaken to design, implement, and evaluate the proposed framework. Unlike previous AI-QbD approaches that primarily focus on isolated predictive modeling or statistical optimization, the proposed framework introduces a unified, multi-layered integration of machine learning, deep learning, NLP, and explainable AI within a regulatory-compliant architecture. This integration allows simultaneous handling of structured process parameters and unstructured regulatory documentation, enabling both predictive accuracy and interpretability. Moreover, the inclusion of explainability modules aligned with ICH Q8–Q11 guidelines ensures transparency and regulatory readiness—differentiating this framework from existing models that emphasize performance without addressing compliance and traceability.
Conceptual framework
The conceptual framework of this study is designed to merge the principles of QbD with the analytical power of AI to support smart pharmaceutical production. At its core, the model integrates structured datasets derived from process parameters and product attributes with unstructured textual data from regulatory guidelines and technical documentation. These heterogeneous data sources are systematically preprocessed and funneled into AI-driven modules for predictive modeling, optimization, and knowledge extraction.
In line with QbD principles, the framework emphasizes three major components: (i) identification of CQAs and CPPs, (ii) development of robust predictive models to link CPPs with CQAs, and (iii) establishment of a knowledge-driven control strategy. AI techniques, such as machine learning for predictive quality control, deep learning for process optimization, and NLP for regulatory document analysis, are incorporated to provide both accuracy and interpretability. Explainable AI further ensures that the framework complies with regulatory expectations by enabling transparent decision-making.
This integration allows the proposed system to move beyond conventional statistical models and establish a dynamic, adaptive, and scalable information management model. The framework is not limited to predictive analytics but also addresses regulatory alignment, enabling continuous improvement and compliance with global standards such as ICH Q8–Q11.
Figure 1 highlights how QbD principles are operationalized within the AI-powered framework, linking critical data sources to outcomes such as enhanced decision-making, regulatory alignment, and continuous improvement. As summarized in Table 1, the results demonstrate the comparative performance of the proposed AI-powered QbD framework against baseline methods. The table highlights improvements in predictive accuracy, robustness, and compliance, confirming the effectiveness of the integrated approach. These findings underscore the contribution of AI modules in enhancing pharmaceutical production outcomes.
Data sources and collection
The proposed framework relies on diverse data streams that collectively inform the QbD approach in pharmaceutical production. Data were categorized into three primary groups: process parameters, product attributes, and regulatory/knowledge documents. Process-related data included CPPs such as temperature, pressure, mixing time, and pH, all of which directly influence product quality. Product-related data focused on CQAs, such as dissolution rate, hardness, particle size distribution, and stability indicators. In addition, unstructured textual data from regulatory guidelines, batch records, and technical reports were incorporated to ensure compliance and knowledge alignment.
The collection of these datasets aimed to capture the multi-dimensional nature of pharmaceutical manufacturing. Structured data were obtained from experimental records, laboratory instruments, and manufacturing sensors, whereas unstructured data were sourced from standard regulatory documents and organizational knowledge repositories. This combination of heterogeneous data ensures that the AI-powered system not only predicts outcomes but also contextualizes them within regulatory and operational boundaries.
The integrated dataset used for framework development comprised approximately 15,000 structured process–product data pairs collected from laboratory, pilot-scale, and industrial production lines, along with about 1,200 regulatory and technical documents processed through NLP modules. The data were randomly partitioned into training (70%), validation (15%), and testing (15%) subsets to ensure balanced evaluation and to prevent overfitting. All datasets underwent integrity verification, anonymization, and alignment with Good Manufacturing Practice (GMP) data-handling standards to ensure traceability and compliance.
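As a minimal illustration of this partitioning step, the sketch below reproduces the 70/15/15 split on a synthetic stand-in dataset; the column names and the dissolution-rate target are hypothetical placeholders, not the actual study data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical structured dataset: CPP inputs and a dissolution-rate CQA target.
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.normal(size=(1000, 4)),
                  columns=["temperature", "pressure", "mixing_time", "pH"])
df["dissolution_rate"] = 0.5 * df["temperature"] + rng.normal(size=1000)

X, y = df.drop(columns=["dissolution_rate"]), df["dissolution_rate"]

# 70% train, then split the remaining 30% evenly into validation and test (15%/15%).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=42)
```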
As shown in Fig. 2, the AI-powered QbD framework consolidates structured process and product datasets with unstructured regulatory documents to enable holistic information management. As presented in Table 2, the data sources integrated into the AI-powered QbD framework are categorized into process parameters, product attributes, and regulatory knowledge. This classification ensures comprehensive coverage of both structured and unstructured inputs. Such integration is essential for enabling predictive quality control, compliance, and informed decision-making in pharmaceutical production.
Data preprocessing and integration
Effective data preprocessing is a critical step in ensuring the reliability and accuracy of AI-powered QbD frameworks. Given the heterogeneity of pharmaceutical data, preprocessing involves the standardization and cleaning of datasets to eliminate inconsistencies, missing values, and noise. Structured numerical data such as process parameters and product attributes undergo normalization and scaling to ensure comparability across different units and ranges. Unstructured textual data, including regulatory documents and batch records, are processed through tokenization, stemming, and semantic analysis to extract relevant information for compliance and decision-making.
To enhance computational efficiency, feature extraction and dimensionality reduction techniques are employed. Methods such as Principal Component Analysis (PCA) and autoencoders are applied to reduce redundancy and highlight the most informative features, thereby improving model interpretability and performance. For textual data, techniques such as Term Frequency–Inverse Document Frequency (TF-IDF) and word embeddings are utilized to capture semantic meaning. Finally, the framework emphasizes integration of structured and unstructured data into a unified dataset. This integration enables holistic modeling, where process data, product measurements, and regulatory knowledge collectively inform predictive quality control and optimization. By combining diverse data modalities, the system supports more robust decision-making aligned with QbD principles. As depicted in Fig. 3, both structured and unstructured data undergo preprocessing and feature extraction to form a unified dataset for AI-driven modeling and decision support. As outlined in Table 3, the framework applies tailored preprocessing techniques to both structured and unstructured data before integration. These steps ensure that numerical datasets are standardized and textual records are semantically represented. The combined outcome is a unified dataset that supports holistic predictive modeling and decision support.
Artificial intelligence techniques
The proposed framework incorporates a spectrum of AI techniques to operationalize QbD within pharmaceutical production. Central to this approach are machine learning and deep learning models that support predictive quality control. Algorithms such as Random Forests, Gradient Boosting Machines, and Support Vector Machines are employed for early-stage modeling and feature selection, while deep neural networks and convolutional architectures are leveraged to capture complex nonlinear relationships between CPPs and CQAs. These predictive models enable proactive monitoring and optimization of manufacturing processes, reducing variability and ensuring consistent product quality.
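As a minimal baseline sketch (reusing the synthetic split from the partitioning example above, so the numbers are illustrative only), two of the named ensemble regressors can be fitted and compared for a single CQA:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Fit two early-stage ensemble baselines on CPP inputs and compare
# validation RMSE for a single CQA (data from the split sketched above).
baselines = {
    "Random Forest": RandomForestRegressor(n_estimators=300, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}
for name, reg in baselines.items():
    reg.fit(X_train, y_train)
    rmse = np.sqrt(mean_squared_error(y_val, reg.predict(X_val)))
    print(f"{name}: validation RMSE = {rmse:.3f}")
```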
In parallel, NLP is employed to manage unstructured regulatory and documentation data. Techniques such as named entity recognition, topic modeling, and semantic embedding models (e.g., BERT) are used to extract and align regulatory knowledge with process-level requirements. This facilitates compliance with global standards (e.g., ICH Q8–Q11, FDA/EMA guidelines) and accelerates the translation of regulatory expectations into actionable process controls.
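As one hedged example of the semantic-embedding step, a general-purpose BERT encoder from HuggingFace Transformers can map regulatory text fragments to dense vectors; in practice a domain-tuned checkpoint would replace bert-base-uncased.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Encode regulatory text fragments into dense semantic vectors.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

snippets = ["The design space is proposed by the applicant per ICH Q8."]
batch = tokenizer(snippets, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # (batch, tokens, 768)
embeddings = hidden.mean(dim=1)                  # mean-pool to one vector per snippet
```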
Given the regulated nature of pharmaceutical manufacturing, XAI is integrated to address the transparency gap often associated with advanced AI models. Methods such as SHAP (Shapley Additive Explanations), LIME (Local Interpretable Model-Agnostic Explanations), and saliency mapping are deployed to provide interpretable outputs, thereby ensuring trustworthiness and regulatory acceptance. By linking decision pathways with model predictions, XAI not only enhances accountability but also supports validation processes required by regulatory agencies. Figure 4 illustrates the synergy between predictive modeling, regulatory documentation processing, and interpretability modules, aligning the framework with regulatory acceptance requirements. As detailed in Table 4, the framework employs a range of AI techniques, including machine learning, deep learning, NLP, and explainable AI. Each method addresses distinct application areas, from predictive modeling and regulatory management to interpretability. Together, these techniques strengthen both predictive performance and compliance within pharmaceutical production.
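A brief sketch of the SHAP workflow, assuming the Random Forest fitted in the baseline sketch above, shows how per-CPP attributions are generated for review:

```python
import shap

# Attribute each prediction to individual CPPs for the fitted Random Forest.
rf = baselines["Random Forest"]          # fitted in the baseline sketch
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_val)

# Global summary: ranks CPPs (temperature, pH, ...) by mean |SHAP| impact.
shap.summary_plot(shap_values, X_val)
```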
The flowchart in Fig. 5 illustrates the recommended mapping between QbD phases and corresponding AI techniques. Early development and risk-assessment stages favor classical machine-learning models (RF, SVM), design-space modeling leverages deep-learning architectures (CNN, DNN), control-strategy phases benefit from hybrid DL + PLS methods, and continuous-monitoring stages integrate NLP and XAI modules for regulatory compliance.
Framework implementation
The implementation of the proposed framework emphasizes a structured workflow for information flow and decision support within pharmaceutical production. Data originating from process parameters, product attributes, and regulatory sources are systematically captured, preprocessed, and integrated into a unified dataset. This dataset is then funneled into AI modules responsible for predictive modeling, optimization, and compliance verification. Outputs from these modules are subsequently directed to a decision-support layer, which provides actionable insights for operators, quality managers, and regulatory stakeholders. This structured workflow ensures traceability, transparency, and alignment with QbD principles.
The system architecture integrates AI techniques directly with the stages of QbD, including identification of CQAs, design space establishment, and control strategy development. Machine learning and deep learning models form the predictive engine, NLP aligns regulatory documentation with operational requirements, and explainable AI (XAI) ensures interpretability of model outputs. The architecture operates in a modular fashion, enabling flexibility and scalability across different pharmaceutical processes and production scales.
The tools, platforms, and software environment supporting this implementation include Python and R for machine learning and statistical modeling, TensorFlow and PyTorch for deep learning, and SpaCy or HuggingFace Transformers for NLP tasks. Visualization and decision-support layers are built using dashboarding tools such as Power BI and Tableau, while cloud-based platforms (AWS, Azure) or hybrid infrastructures are employed for scalability, real-time analytics, and secure data management. Together, these technological components form an integrated environment capable of operationalizing AI-powered QbD in practice.
As illustrated in Fig. 6, the system architecture spans from data ingestion and preprocessing to AI-driven decision support, integrating diverse data sources into a unified pipeline. As summarized in Table 5, the implementation of the AI-powered QbD framework leverages a combination of advanced tools and platforms across data preprocessing, predictive modeling, NLP, and explainable AI. Visualization and decision-support technologies further enable real-time monitoring, while cloud infrastructures ensure scalability and secure deployment. This integration of technologies underpins the framework’s adaptability and industrial relevance.
Infrastructure requirements: edge vs. cloud computing
The deployment of AI-powered QbD frameworks in pharmaceutical environments requires careful selection of computational infrastructure to balance latency, security, and scalability. Two primary paradigms—edge computing and cloud computing—serve distinct but complementary roles.
Edge computing: edge nodes, located near production equipment or PAT systems, enable real-time inference and decision support with minimal latency. This setup is essential for continuous manufacturing lines where process control decisions must occur within milliseconds. Edge devices host lightweight versions of the trained models, optimized through quantization or pruning to ensure rapid execution on resource-constrained hardware.
Cloud computing: cloud infrastructures (e.g., AWS, Azure, GCP) provide scalable storage, high-performance training resources (GPUs/TPUs), and centralized model version management. They facilitate data aggregation across multiple sites, global analytics, and periodic retraining of AI models. However, reliance on cloud computing may introduce latency and regulatory constraints concerning data residency and security.
A hybrid edge–cloud architecture is therefore recommended: real-time monitoring and inferencing are performed locally at the edge, while model retraining, auditing, and long-term storage are handled in the cloud. This configuration ensures both responsiveness and compliance, aligning with Pharma 4.0 requirements for secure, adaptive, and scalable AI deployment.
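As an illustrative sketch of the edge-optimization step mentioned above, PyTorch dynamic quantization can compress the linear layers of a predictor for CPU inference on resource-constrained nodes; the small architecture below is a stand-in, not the deployed model.

```python
import torch
import torch.nn as nn

# Stand-in surrogate for the trained CQA predictor (12 CPP inputs -> 1 CQA).
model = nn.Sequential(nn.Linear(12, 64), nn.ReLU(), nn.Linear(64, 1))
model.eval()

# Dynamic int8 quantization of the linear layers shrinks the model and
# speeds up CPU inference on edge nodes near the production line.
edge_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)
torch.save(edge_model.state_dict(), "cqa_predictor_edge.pt")
```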
Evaluation and validation strategy
The robustness of the proposed AI-powered QbD framework requires rigorous evaluation and validation to ensure its credibility, regulatory acceptance, and practical utility in pharmaceutical production. Accuracy is assessed by comparing the predictive outputs of machine learning and deep learning models against experimentally measured CQAs. Metrics such as Root Mean Square Error (RMSE), Mean Absolute Error (MAE), R², and classification-based measures (Precision, Recall, and F1-score) are used depending on the modeling task.
Robustness is validated through stress-testing the framework under varying data distributions, noise levels, and batch-to-batch variability. Techniques such as cross-validation, bootstrapping, and sensitivity analysis are employed to evaluate the stability of model predictions. Scalability is examined by simulating larger datasets and higher-dimensional inputs to assess whether the framework maintains computational efficiency and predictive reliability when deployed in industrial-scale environments.
To demonstrate value, the framework is benchmarked against traditional QbD and statistical approaches, including DoE and regression-based models. Performance improvements in predictive accuracy, adaptability, and interpretability are compared quantitatively. Moreover, regulatory compliance and interpretability form a critical dimension of validation. The incorporation of XAI techniques such as SHAP and LIME ensures that decision pathways are transparent and can be validated by regulatory authorities. This dual validation—technical and regulatory—strengthens confidence in deploying the system in a real-world pharmaceutical context. As illustrated in Fig. 7, the evaluation and validation strategy incorporates accuracy, robustness, and scalability assessments, complemented by benchmarking and compliance validation for regulatory acceptance. As presented in Table 6, the evaluation and validation of the AI-powered QbD framework span multiple dimensions, including accuracy, robustness, scalability, benchmarking, and regulatory compliance. Each criterion is assessed through established methods such as cross-validation, stress-testing, and explainable AI. This multi-faceted evaluation ensures both technical reliability and alignment with regulatory expectations.
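The sketch below shows how the stated regression metrics and a 5-fold cross-validated stability check could be computed with scikit-learn, assuming the fitted model and data splits from the earlier sketches:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Held-out regression metrics for the model fitted in the earlier sketches.
y_pred = rf.predict(X_test)
print("RMSE:", np.sqrt(mean_squared_error(y_test, y_pred)))
print("MAE :", mean_absolute_error(y_test, y_pred))
print("R2  :", r2_score(y_test, y_pred))

# Stability check: 5-fold cross-validated R2 over shuffled partitions.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(rf, X, y, cv=cv, scoring="r2")
print(f"CV R2 = {scores.mean():.3f} +/- {scores.std():.3f}")
```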
Primary data integration and validation
In addition to simulated and historical datasets, the proposed AI-powered QbD framework was validated using primary data collected from pilot-scale pharmaceutical production lines. These data were obtained under controlled manufacturing conditions in collaboration with industrial partners. The dataset encompassed real process measurements such as temperature, pH, mixing time, and pressure for solid-dosage formulations, as well as corresponding product quality metrics including hardness, dissolution rate, and stability. All primary data were anonymized and de-identified to preserve proprietary information while ensuring statistical validity. The inclusion of real production data enabled comparative benchmarking between simulated and operational environments, demonstrating that the framework maintained consistent predictive accuracy (R² > 0.91) and robustness across both data sources. This validation confirms that the proposed approach is not limited to synthetic datasets but performs reliably under true manufacturing variability, reinforcing its industrial relevance and practical deployability.
Validation metrics aligned with GMLP standards
To ensure regulatory compliance and alignment with Good Machine Learning Practice (GMLP) principles established by the U.S. FDA and Health Canada, additional validation metrics were incorporated into the evaluation framework. These metrics include model traceability, data provenance integrity, bias assessment, and retraining reproducibility—all critical for ensuring transparent and accountable AI systems in regulated pharmaceutical environments.
- Model traceability: each trained model was version-controlled with documented metadata linking specific training datasets, preprocessing pipelines, and algorithm configurations to facilitate audit readiness.
- Data provenance integrity: all datasets used for training and validation were cataloged with complete lineage tracking to verify their source, preprocessing methods, and transformations.
- Bias and performance drift assessment: distributional checks were performed to detect potential bias across input variables, ensuring fairness and consistent predictive behavior across diverse manufacturing batches (a minimal screening sketch follows at the end of this subsection).
- Reproducibility validation: repeat experiments were conducted under identical conditions to confirm reproducibility of performance metrics (variation < 2% across runs).
The incorporation of these GMLP-aligned validation metrics strengthens the regulatory reliability of the proposed AI-powered QbD framework, ensuring that model development and deployment conform to evolving machine learning quality standards in the pharmaceutical domain.
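As a minimal sketch of the drift and bias screen referenced above, a two-sample Kolmogorov-Smirnov test can compare each CPP's historical distribution against a new production batch; the data here are synthetic and the 1% threshold is illustrative.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
train_batch = {"temperature": rng.normal(25.0, 0.5, 500)}  # hypothetical history
new_batch = {"temperature": rng.normal(25.6, 0.5, 120)}    # shifted production run

# Two-sample KS test per CPP: flag distributional drift at the 1% level.
for col in train_batch:
    stat, p_value = ks_2samp(train_batch[col], new_batch[col])
    if p_value < 0.01:
        print(f"Drift suspected in {col}: KS={stat:.3f}, p={p_value:.2e}")
```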
Experimental results
The experimental results presented in this section demonstrate the performance and applicability of the proposed AI-powered QbD framework within pharmaceutical production settings. The evaluation focuses on predictive accuracy, robustness under varying process conditions, and scalability when applied to heterogeneous datasets. Comparative analyses with traditional QbD and statistical approaches highlight the advantages of integrating machine learning, deep learning, and NLP techniques into the decision-making workflow. Furthermore, the inclusion of explainable AI ensures that model outputs are transparent and interpretable, thereby reinforcing regulatory acceptance. Overall, the results validate the framework’s ability to support predictive quality control, process optimization, and compliance management in alignment with QbD principles.
Predictive accuracy of AI models
The predictive capability of the proposed framework was evaluated by assessing the performance of both machine learning (ML) and deep learning (DL) models in predicting CQAs from CPPs. Several ML algorithms, including Random Forest (RF), Gradient Boosting Machines (GBM), and Support Vector Machines (SVM), were compared against deep learning architectures such as Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs). The evaluation was conducted across multiple datasets representative of pharmaceutical production processes.
Model performance was quantified using standard regression and classification metrics, namely Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²) for regression tasks, along with Precision, Recall, and F1-score for classification-oriented outcomes. Results indicate that deep learning models consistently outperformed classical ML approaches in capturing nonlinear relationships between CPPs and CQAs, particularly in larger and more complex datasets. Machine learning models, however, demonstrated competitive performance in smaller datasets with limited variability, suggesting their utility in early-stage development where data availability may be restricted.
Overall, the comparative analysis underscores the importance of deep learning for robust predictive quality control, while also highlighting the complementary role of machine learning methods in situations where interpretability and computational efficiency are prioritized. Figure 8 illustrates the comparative performance of AI models, highlighting that CNN and DNN outperform RF, GBM, and SVM in terms of R², precision, recall, and F1-score while maintaining lower error values. As reported in Table 7, deep learning models (CNN, DNN) achieve lower errors (RMSE/MAE) and higher R² than traditional machine learning baselines across CQA prediction tasks from CPPs. The table also shows superior classification metrics (Precision, Recall, F1-score) for CNN/DNN where categorical outcomes are evaluated. These results underscore the advantage of nonlinear modeling for capturing complex CPP–CQA relationships.
Comparative analysis: deep learning vs. hybrid (DL + PLS) approaches
To further strengthen the methodological justification of model selection, an additional comparison was performed between standalone deep learning architectures and hybrid approaches that integrate Deep Learning with Partial Least Squares (DL + PLS). The hybrid models were constructed by coupling CNN or DNN feature extraction layers with a PLS regression head to capture linear interpretability while preserving nonlinear feature representation.
The comparative results revealed that while hybrid DL + PLS models improved interpretability and performed well on small to medium datasets, standalone deep learning architectures (CNN, DNN) achieved superior accuracy, lower error rates, and better scalability on high-dimensional pharmaceutical process datasets. Hybrid models remain advantageous in regulatory contexts that prioritize transparency and simpler validation workflows. Table 8 summarizes this direct comparison, indicating that the hybrid DL + PLS configurations provide competitive performance while offering enhanced explainability suitable for QbD compliance scenarios.
These findings validate that while deep learning models deliver the best predictive accuracy, hybrid DL + PLS methods provide a balanced compromise between performance and interpretability—supporting regulatory adoption and explainable AI principles in pharmaceutical QbD applications.
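A conceptual sketch of this coupling, using synthetic data and an untrained stand-in trunk in place of the actual trained DNN, shows how deep features can feed a linear PLS head:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X_np = rng.normal(size=(500, 12)).astype(np.float32)  # hypothetical CPP matrix
y_np = X_np[:, 0] ** 2 + rng.normal(size=500)         # nonlinear CQA response

# Untrained stand-in for the trained DNN trunk that extracts nonlinear features.
trunk = nn.Sequential(nn.Linear(12, 64), nn.ReLU(), nn.Linear(64, 16), nn.ReLU())
trunk.eval()
with torch.no_grad():
    features = trunk(torch.from_numpy(X_np)).numpy()

# Linear, interpretable PLS head maps the learned features to the CQA.
pls_head = PLSRegression(n_components=4).fit(features, y_np)
y_hat = pls_head.predict(features)
```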
Robustness and sensitivity analysis
To ensure the reliability of the proposed AI-powered QbD framework, robustness and sensitivity analyses were conducted across multiple datasets and experimental conditions. Cross-validation (k-fold) and bootstrapping were employed to validate the stability of model performance under varying training and testing partitions. These approaches provided insights into the consistency of predictive outcomes and minimized the risk of overfitting to specific data subsets.
The robustness of the models was further examined by introducing controlled levels of noise, missing values, and process variability into the input data. Results indicated that deep learning models maintained higher predictive accuracy under moderate levels of disturbance, whereas machine learning models exhibited a sharper decline in performance. This suggests that deep architectures are more resilient to uncertainties commonly observed in real-world pharmaceutical processes.
In addition, a sensitivity analysis was performed to quantify the impact of input parameter fluctuations on critical quality attribute (CQA) predictions. Parameters such as temperature, pH, and mixing time were perturbed within ± 10% of their nominal values. Deep learning models demonstrated smoother response curves, whereas traditional ML approaches showed higher volatility, highlighting the advantage of nonlinear modeling in capturing complex process dynamics. Figure 9 highlights the sensitivity analysis results, showing that while all models experienced performance degradation with disturbances, deep learning approaches exhibited higher resilience and stability. As shown in Table 9, deep learning models (CNN and DNN) demonstrated superior robustness under both noise and missing data conditions compared to traditional machine learning approaches. While all models exhibited some performance degradation with increasing disturbances, CNN and DNN maintained consistently lower RMSE values. These findings highlight the resilience of deep learning approaches for real-world pharmaceutical data variability.
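The perturbation procedure can be expressed compactly; the sketch below shifts one CPP at a time by ±10% of its value and records the mean absolute change in the predicted CQA, reusing the model and validation split from the earlier sketches:

```python
import numpy as np

def cqa_sensitivity(model, X, col_idx, delta=0.10):
    """Mean absolute shift in the predicted CQA when one CPP moves +/-10%."""
    base = model.predict(X)
    X_up, X_dn = X.copy(), X.copy()
    X_up[:, col_idx] *= 1 + delta
    X_dn[:, col_idx] *= 1 - delta
    return (np.abs(model.predict(X_up) - base).mean(),
            np.abs(model.predict(X_dn) - base).mean())

X_arr = X_val.to_numpy()  # validation CPPs from the earlier split sketch
for idx, name in enumerate(X_val.columns):
    up, dn = cqa_sensitivity(rf, X_arr, idx)
    print(f"{name}: +10% -> {up:.3f}, -10% -> {dn:.3f}")
```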
Scalability assessment
The scalability of the proposed AI-powered QbD framework was assessed by evaluating its performance on large-scale and high-dimensional datasets representative of industrial pharmaceutical production environments. The objective was to determine whether the framework maintains predictive accuracy and stability while operating under computationally demanding conditions. Models were tested on progressively larger datasets, ranging from thousands to hundreds of thousands of samples, with feature dimensionality increased by incorporating additional process parameters and product attributes.
The analysis focused on computational efficiency, runtime, and memory usage across machine learning and deep learning models. Results indicated that traditional machine learning models (Random Forest, Gradient Boosting, SVM) scaled reasonably well up to medium-sized datasets but exhibited performance bottlenecks in terms of training time and memory consumption at industrial-scale simulations. By contrast, deep learning models (CNN, DNN) demonstrated superior scalability, maintaining stable runtime and memory requirements relative to dataset growth while preserving predictive accuracy.
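The scaling measurements can be reproduced in outline with standard tooling, as in the sketch below; note that tracemalloc tracks Python/NumPy-level allocations only and adds overhead, and the dataset sizes and model settings are reduced placeholders.

```python
# Sketch of runtime and peak-memory measurement as dataset size grows
# (assumptions: scikit-learn baseline model, tracemalloc for memory).
import time
import tracemalloc
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(9)
for n in (10_000, 50_000, 100_000):
    X, y = rng.normal(size=(n, 20)), rng.normal(size=n)
    tracemalloc.start()
    t0 = time.perf_counter()
    RandomForestRegressor(n_estimators=50, n_jobs=-1, random_state=9).fit(X, y)
    runtime = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()   # (current, peak) in bytes
    tracemalloc.stop()
    print(f"n={n:>7}: runtime={runtime:.1f}s  peak_mem={peak / 1e6:.0f} MB")
```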
Additionally, industrial-scale simulation results highlighted the practical feasibility of deploying the framework in Pharma 4.0 environments. While initial training times for deep learning models were longer than those of machine learning counterparts, the efficiency of parallelization and GPU acceleration enabled faster convergence on large datasets, making them more suitable for real-time decision support in production-scale settings. As detailed in Table 10, scalability tests revealed that CNN and DNN maintained high accuracy (R² > 0.92) with significantly lower runtime growth compared to traditional models when scaling from 10K to 100K samples. Machine learning approaches such as RF, GBM, and SVM exhibited longer runtimes and higher memory demands at larger dataset sizes. These results confirm the superior scalability and efficiency of deep learning models for industrial-scale pharmaceutical applications.
Bias mitigation, cybersecurity, and failure-mode analysis
In addition to scalability testing, supplementary analyses were performed to ensure that the framework adheres to fairness, security, and reliability standards essential for industrial deployment.
Bias mitigation: given the heterogeneous nature of pharmaceutical datasets that may vary by formulation, scale, or equipment type, model training incorporated stratified sampling and balanced weighting techniques to prevent bias toward dominant data groups. Performance metrics were monitored across sub-populations to verify uniform accuracy, thereby ensuring equitable model behavior across different manufacturing contexts.
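A sketch of this mitigation step is given below; the group labels, data, and inverse-frequency weighting scheme are illustrative assumptions standing in for the study's formulation-, scale-, and equipment-based sub-populations.

```python
# Sketch of stratified splitting plus balanced sample weights, followed
# by per-group accuracy monitoring (group labels are illustrative).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(800, 5))
groups = rng.choice(["line_A", "line_B", "line_C"], size=800, p=[0.7, 0.2, 0.1])
y = X[:, 0] + (groups == "line_C") * 0.5 + rng.normal(scale=0.1, size=800)

# Stratify the split on the group label so minority lines stay represented.
X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, groups, stratify=groups, random_state=2)

# Inverse-frequency weights keep the dominant group from driving the fit.
freq = {g: np.mean(g_tr == g) for g in np.unique(g_tr)}
weights = np.array([1.0 / freq[g] for g in g_tr])

model = GradientBoostingRegressor(random_state=2)
model.fit(X_tr, y_tr, sample_weight=weights)

# Monitor accuracy per sub-population to verify uniform behaviour.
for g in np.unique(g_te):
    m = g_te == g
    rmse = mean_squared_error(y_te[m], model.predict(X_te[m])) ** 0.5
    print(f"{g}: RMSE={rmse:.3f}")
```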
Cybersecurity protocols: for AI-integrated PAT systems, cybersecurity measures were implemented to safeguard data integrity and model reliability. Secure data transmission was maintained using encrypted communication protocols (TLS 1.3), and model-serving endpoints were protected through multi-factor authentication and periodic penetration testing. These controls minimize risks of data tampering or unauthorized inference manipulation that could compromise product quality or regulatory compliance.
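On the client side, enforcing a TLS 1.3 floor can be expressed with the Python standard library alone, as sketched below; the endpoint URL is a hypothetical placeholder, and the authentication and penetration-testing controls described above sit outside this snippet.

```python
# Minimal sketch of enforcing TLS 1.3 for calls to a model-serving
# endpoint (standard-library ssl; the URL is a hypothetical placeholder).
import ssl
import urllib.request

ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_3   # refuse anything below TLS 1.3

# Example request to a (hypothetical) prediction endpoint over HTTPS.
req = urllib.request.Request("https://pat-gateway.example.com/predict")
# with urllib.request.urlopen(req, context=ctx) as resp:
#     print(resp.status)
```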
Failure-mode and effects analysis (FMEA): a structured failure-mode analysis was conducted to identify potential points of failure within the automated decision-making loop, including sensor malfunctions, data-drift-induced mispredictions, or incorrect control adjustments. Each potential failure was ranked by severity and likelihood, and corresponding mitigation strategies—such as anomaly detection triggers, human-in-the-loop verification, and redundant monitoring—were integrated into the decision-support layer.
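The ranking step can be illustrated with the classical risk priority number, RPN = severity × occurrence × detection; the detection score is an assumption beyond the severity and likelihood criteria named above, and all entries and thresholds are illustrative.

```python
# Sketch of FMEA ranking via the classical RPN = S x O x D formulation
# (scores on a 1-10 scale; entries and the threshold are illustrative).
failure_modes = [
    {"mode": "sensor malfunction",       "S": 8, "O": 4, "D": 3},
    {"mode": "data-drift misprediction", "S": 7, "O": 5, "D": 6},
    {"mode": "incorrect control action", "S": 9, "O": 2, "D": 4},
]

for fm in failure_modes:
    fm["RPN"] = fm["S"] * fm["O"] * fm["D"]

# Highest-risk modes first; an RPN above the threshold triggers a
# mitigation such as anomaly-detection alarms or human-in-the-loop review.
for fm in sorted(failure_modes, key=lambda f: f["RPN"], reverse=True):
    action = "mitigate" if fm["RPN"] >= 100 else "monitor"
    print(f"{fm['mode']:<28} RPN={fm['RPN']:>3} -> {action}")
```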
These additional safeguards collectively enhance the framework’s trustworthiness, demonstrating that the proposed AI-powered QbD system not only scales computationally but also upholds fairness, cybersecurity resilience, and operational safety under Pharma 4.0 conditions.
Figure 10 illustrates the trade-offs among runtime, predictive accuracy, and memory utilization when scaling AI models to large datasets. Each curve represents the trajectory of performance from small-scale (10K samples) to large-scale (100K samples) scenarios, with bubble size denoting memory consumption at scale. The visualization highlights that deep learning models (CNN, DNN) not only sustain higher predictive accuracy but also achieve superior runtime efficiency with moderate memory requirements compared to traditional machine learning models. Conversely, Random Forest, Gradient Boosting, and SVM exhibit slower runtimes and higher memory usage at industrial-scale datasets, underscoring their limitations for large-scale pharmaceutical applications. This analysis confirms the scalability and industrial readiness of deep learning approaches for AI-powered QbD frameworks.
Comparative computational cost and model version control
To complement scalability assessment, a comparative analysis of computational costs and model version control mechanisms was conducted to evaluate the feasibility of integrating the proposed AI-powered QbD framework into continuous manufacturing environments.
Computational cost analysis: the total computational expenditure for model training and inference was benchmarked across machine learning and deep learning models under identical hardware configurations (NVIDIA RTX A6000 GPU, 64 GB RAM). Training cost per iteration and inference latency were recorded to estimate real-time deployability. Deep learning models, particularly CNN and DNN, incurred higher training costs (average 1.8× relative to Random Forest) but achieved faster inference once deployed, making them suitable for real-time PAT feedback loops in continuous production settings. Resource utilization remained below 70% GPU load during inference, confirming operational viability within typical industrial control cycles.
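The benchmarking procedure can be outlined as follows; this is a sketch assuming PyTorch, and the layer sizes and batch shapes are placeholders rather than the production architecture.

```python
# Sketch of per-iteration training cost and inference latency benchmarking
# (assumptions: PyTorch; architecture and batch shapes are placeholders).
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

def sync():
    # Ensure queued GPU kernels finish before reading the clock.
    if device == "cuda":
        torch.cuda.synchronize()

model = torch.nn.Sequential(
    torch.nn.Linear(32, 128), torch.nn.ReLU(),
    torch.nn.Linear(128, 1)).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()
X = torch.randn(64, 32, device=device)
y = torch.randn(64, 1, device=device)

# Training cost per iteration, averaged over 100 steps.
sync(); t0 = time.perf_counter()
for _ in range(100):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()
sync()
train_ms = (time.perf_counter() - t0) / 100 * 1e3

# Inference latency per batch once the model is deployed.
model.eval()
with torch.no_grad():
    sync(); t0 = time.perf_counter()
    for _ in range(100):
        model(X)
    sync()
infer_ms = (time.perf_counter() - t0) / 100 * 1e3

print(f"training: {train_ms:.2f} ms/iter  inference: {infer_ms:.2f} ms/batch")
```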
Model hyperparameters: deep learning models were trained using a learning rate of 0.001, a batch size of 64, and 100 epochs, with early stopping (patience = 10) applied to prevent overfitting. Optimization used the Adam optimizer with default parameters β₁ = 0.9 and β₂ = 0.999. Machine-learning baselines (Random Forest, Gradient Boosting, and SVM) were tuned through grid search with 10-fold cross-validation. Hyperparameter selection was guided by validation-loss minimization and convergence consistency across folds to ensure robust and reproducible performance.
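The stated configuration corresponds to the sketch below, assuming Keras for the deep model and scikit-learn's GridSearchCV for the baselines; the architecture, data, and search grid are placeholders.

```python
# Sketch of the stated training configuration (assumptions: Keras DNN and
# scikit-learn grid search; data, architecture, and grid are placeholders).
import numpy as np
from tensorflow import keras
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(3)
X, y = rng.normal(size=(1000, 16)), rng.normal(size=1000)

dnn = keras.Sequential([
    keras.Input(shape=(16,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),
])
dnn.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001,
                                            beta_1=0.9, beta_2=0.999),
            loss="mse")
dnn.fit(X, y, batch_size=64, epochs=100, validation_split=0.2,
        callbacks=[keras.callbacks.EarlyStopping(patience=10,
                                                 restore_best_weights=True)],
        verbose=0)

# Baselines tuned by grid search with 10-fold CV, as described above.
grid = GridSearchCV(RandomForestRegressor(random_state=3),
                    {"n_estimators": [100, 300], "max_depth": [None, 10]},
                    cv=10, scoring="neg_root_mean_squared_error")
grid.fit(X, y)
print(grid.best_params_)
```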
Model version control: a continuous versioning and deployment protocol was implemented using a modular model registry integrated with MLOps pipelines (e.g., MLflow). Each retrained model version was tagged with corresponding batch identifiers, dataset metadata, and validation scores to ensure full traceability and regulatory readiness. Version rollbacks were tested under simulated process disturbances to verify that previous stable models could be reinstated seamlessly without production disruption.
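A minimal version-tagging sketch is shown below; it assumes an MLflow tracking server and model registry are configured, and the batch identifiers and metadata values are illustrative.

```python
# Sketch of tagging a retrained model version for traceability
# (assumptions: MLflow tracking/registry configured; tags illustrative).
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
X_train, y_train = rng.normal(size=(400, 8)), rng.normal(size=400)
X_val, y_val = rng.normal(size=(100, 8)), rng.normal(size=100)

with mlflow.start_run(run_name="cqa-model-retrain"):
    model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
    # Tag the retrained version with batch identifiers and dataset metadata
    # so any deployed model can be traced back during a regulatory audit.
    mlflow.set_tag("batch_id", "B-2024-0117")     # illustrative identifier
    mlflow.set_tag("dataset_version", "v3.2")     # illustrative metadata
    mlflow.log_metric("val_r2", model.score(X_val, y_val))
    mlflow.sklearn.log_model(model, artifact_path="model",
                             registered_model_name="cqa_predictor")
# A rollback then amounts to promoting an earlier registered version back
# to production through the registry, without retraining.
```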
This comparative analysis demonstrates that while deep learning models demand higher initial computational investment, they offer superior long-term efficiency and traceability through structured version control. These findings underscore the framework’s readiness for integration into continuous manufacturing, where real-time adaptability, reproducibility, and system stability are paramount.
Statistical hypothesis testing
To rigorously validate the observed improvements of the AI-powered QbD framework over traditional QbD and statistical approaches, a series of statistical hypothesis tests were conducted. Pairwise comparisons between machine learning/deep learning models and conventional methods such as DoE and regression analysis were performed using paired t-tests and Wilcoxon signed-rank tests to account for both parametric and non-parametric assumptions. Additionally, one-way ANOVA was applied to assess whether significant performance differences existed across all model groups when evaluated on predictive accuracy (R²), error measures (RMSE, MAE), and classification metrics (Precision, Recall, F1-score).
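These comparisons can be sketched as follows, assuming per-fold error arrays for each model; the values below are synthetic placeholders, not the study's results.

```python
# Sketch of the hypothesis-testing procedure with scipy.stats
# (per-fold RMSE arrays are synthetic placeholders).
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
rmse_dnn = rng.normal(0.80, 0.03, size=10)   # placeholder per-fold RMSEs
rmse_rf = rng.normal(0.90, 0.04, size=10)
rmse_doe = rng.normal(1.05, 0.05, size=10)

# Paired comparisons on the same folds: parametric and non-parametric.
t_stat, p_t = stats.ttest_rel(rmse_dnn, rmse_doe)
w_stat, p_w = stats.wilcoxon(rmse_dnn, rmse_doe)

# One-way ANOVA across all model groups.
f_stat, p_f = stats.f_oneway(rmse_dnn, rmse_rf, rmse_doe)

# 95% CI for the paired RMSE difference (t-distribution approximation).
diff = rmse_dnn - rmse_doe
ci = stats.t.interval(0.95, len(diff) - 1,
                      loc=diff.mean(), scale=stats.sem(diff))
print(f"paired t: p={p_t:.4f}  Wilcoxon: p={p_w:.4f}  ANOVA: p={p_f:.4f}")
print(f"95% CI of RMSE difference: ({ci[0]:.3f}, {ci[1]:.3f})")
```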
For each comparison, 95% confidence intervals (CIs) and p-values were reported to determine the statistical significance of differences in model performance. Results confirmed that deep learning models (CNN, DNN) significantly outperformed both traditional QbD approaches and machine learning models across all datasets (p < 0.01). While tree-based machine learning models (RF, GBM) exhibited marginal improvements over traditional QbD approaches, their differences were less consistent, with some results not reaching statistical significance (p > 0.05).
These findings demonstrate that the enhanced predictive accuracy and robustness of the AI-powered QbD framework are not coincidental but are statistically significant, reinforcing its credibility for deployment in pharmaceutical manufacturing environments. Figure 11 presents the significance matrix, showing that deep learning approaches consistently outperformed DoE, regression, and tree-based models, with improvements validated through pairwise hypothesis testing. As presented in Table 11, statistical hypothesis testing confirmed that CNN and DNN achieved significantly better performance than traditional QbD approaches and machine learning models, with p-values well below 0.01 in most cases. While RF and GBM showed marginal improvements, several comparisons were not statistically significant. These results validate that the performance gains of deep learning models are both reliable and statistically robust.
Ablation study
To investigate the relative importance of individual modules within the AI-powered QbD framework, an ablation study was conducted. The framework was systematically re-evaluated after removing specific components, including NLP for regulatory alignment, XAI for interpretability, and dimensionality reduction for feature optimization. Each reduced configuration was compared with the full framework to quantify performance degradation across predictive and interpretability metrics.
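The ablation protocol follows the pattern sketched below, in which the pipeline is rebuilt with one module disabled at a time and re-scored; a simple toggle stands in for the framework's actual modules, with dimensionality reduction shown as the example.

```python
# Sketch of the ablation loop: rebuild the pipeline with a module toggled
# off and compare cross-validated scores (data and toggle are stand-ins).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(6)
X = rng.normal(size=(500, 40))                       # placeholder feature space
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=500)

def build_pipeline(use_dim_reduction=True):
    """Rebuild the predictive pipeline with optional modules toggled."""
    steps = []
    if use_dim_reduction:
        steps.append(("pca", PCA(n_components=10)))
    steps.append(("model", RandomForestRegressor(random_state=6)))
    return Pipeline(steps)

configs = {
    "full framework": {"use_dim_reduction": True},
    "w/o dim. reduction": {"use_dim_reduction": False},
}
for name, cfg in configs.items():
    scores = cross_val_score(build_pipeline(**cfg), X, y, cv=5, scoring="r2")
    print(f"{name:<20} R^2 = {scores.mean():.3f}")
```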
Results demonstrated that removing NLP led to a measurable decline in regulatory compliance alignment, reducing the framework’s ability to process and map unstructured documentation into actionable knowledge. Excluding XAI modules did not significantly affect predictive accuracy but critically impaired interpretability, limiting the framework’s capacity for regulatory acceptance and decision traceability. Eliminating dimensionality reduction resulted in higher computational costs and slightly reduced predictive accuracy due to redundancy and noise in the input space. In contrast, the full framework consistently achieved superior performance, confirming that each module contributes uniquely to ensuring robust, scalable, and regulatory-compliant pharmaceutical production. Figure 12 demonstrates that while traditional QbD methods underperform, each module of the AI-powered framework contributes uniquely to accuracy, compliance, and interpretability. As shown in Table 12, the ablation study demonstrated that removing NLP reduced compliance alignment, while excluding XAI critically weakened interpretability. The absence of dimensionality reduction led to longer runtimes and slightly lower accuracy, confirming its role in efficiency. Overall, the full framework consistently outperformed reduced configurations and traditional QbD methods across all evaluation dimensions.
Benchmarking against traditional QbD approaches
To contextualize the advantages of the proposed AI-powered QbD framework, its performance was benchmarked against traditional Quality-by-Design methods, including DoE and regression-based models. Conventional approaches rely heavily on factorial designs and linear regression to identify CPPs and their influence on CQAs. While these methods provide valuable insights in controlled experimental settings, they often fail to generalize effectively in high-dimensional and dynamic manufacturing environments.
The benchmarking focused on three dimensions: predictive accuracy, adaptability, and computational efficiency. Deep learning and machine learning models within the AI-QbD framework consistently demonstrated superior predictive performance, particularly when modeling nonlinear CPP–CQA relationships. Furthermore, AI-based models proved more adaptable to heterogeneous datasets and evolving process conditions, whereas traditional methods exhibited limited flexibility outside the predefined design space.
Trade-off analysis revealed that traditional approaches remain advantageous in terms of simplicity and interpretability, making them useful in early-stage experimentation or small-scale processes. However, the AI-powered framework outperformed in large-scale, real-world production contexts by ensuring higher accuracy, faster optimization cycles, and improved compliance through integrated regulatory alignment. As illustrated in Fig. 13, deep learning models (CNN and DNN) outperform traditional QbD approaches such as DoE and regression, showing higher accuracy, adaptability, and efficiency. As summarized in Table 13, CNN and DNN significantly outperformed traditional QbD methods such as DoE and regression in terms of predictive accuracy and adaptability, while maintaining efficient runtimes. Conventional approaches retained advantages in interpretability but lacked scalability to complex datasets. These results highlight the superior performance of AI-powered models for modern pharmaceutical manufacturing under Pharma 4.0.
Regulatory compliance and interpretability
Given the highly regulated nature of pharmaceutical production, model interpretability and compliance with international guidelines (ICH Q8–Q11) are essential for the adoption of AI-powered frameworks. To address this, XAI methods such as SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-Agnostic Explanations) were integrated into the framework. These methods provide transparency by identifying which input features most significantly contribute to predictions, thereby enabling traceability of decision-making processes.
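Their integration follows the standard usage pattern sketched below; the tree-based model, data, and feature names are illustrative stand-ins for the framework's predictor.

```python
# Sketch of SHAP (global) and LIME (local) attributions on a stand-in
# model (assumptions: shap and lime packages; names are illustrative).
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
feature_names = ["temperature", "pH", "mixing_time", "pressure"]
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(scale=0.1, size=300)
model = RandomForestRegressor(random_state=7).fit(X, y)

# Global attributions: mean |SHAP| ranks the dominant CPPs.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
for name, imp in zip(feature_names, np.abs(shap_values).mean(axis=0)):
    print(f"{name}: mean |SHAP| = {imp:.3f}")

# Local attributions: LIME explains a single batch-level prediction.
lime_explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                      mode="regression")
exp = lime_explainer.explain_instance(X[0], model.predict, num_features=4)
print(exp.as_list())
```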
Case analyses demonstrated the practical value of XAI in regulatory contexts. For instance, SHAP plots consistently highlighted critical process parameters such as temperature and pH as key determinants of dissolution rate predictions. LIME explanations further confirmed the local influence of specific input features on individual predictions, ensuring that quality managers could verify decision pathways for batch-level outcomes. This interpretability aligns with ICH Q8’s requirement for process understanding and ICH Q9’s focus on risk management, while also supporting ICH Q10 and Q11 guidelines that emphasize robust control strategies and lifecycle management.
By embedding interpretability into predictive modeling, the framework not only enhances trustworthiness but also addresses the concerns of regulatory authorities regarding the use of black-box AI models in pharmaceutical manufacturing. This ensures that AI-driven QbD remains both scientifically valid and regulatory-compliant. Figure 14a shows global feature importance using SHAP, highlighting pH and temperature as dominant factors, while Fig. 14b illustrates local explanations with LIME, emphasizing features like pressure for individual predictions. Together, they ensure both holistic transparency and instance-level traceability, supporting regulatory compliance in AI-powered QbD. As shown in Table 14, SHAP provides global feature importance for process understanding, while LIME delivers local instance-level explanations for traceability. Their combined integration strengthens alignment with ICH Q8–Q11 guidelines. This ensures both transparency and regulatory acceptance of the AI-powered QbD framework.
Quantitative evaluation of interpretability
To complement qualitative assessments of model interpretability, a quantitative analysis was conducted using the SHAP Value Stability Index (SVSI) to measure the consistency of feature importance across repeated model runs and data partitions. The SVSI metric evaluates the variance of normalized SHAP values for key features (e.g., temperature, pH, mixing time) across 10-fold cross-validation runs.
Results showed that CNN and DNN models achieved mean SVSI = 0.87 and 0.89, respectively, indicating high stability of interpretability outputs. In comparison, hybrid DL + PLS models exhibited slightly lower stability (SVSI = 0.82) due to the coupling of linear and nonlinear components. These results suggest that the interpretability provided by SHAP explanations remains both consistent and reliable across data subsets and retraining instances.
Furthermore, LIME explanations were assessed for local explanation fidelity (LEF) by correlating surrogate model outputs with original predictions (Pearson r > 0.90). This confirms that local interpretations accurately approximate model reasoning at the instance level. Quantifying interpretability in this way reinforces confidence in using XAI methods for regulatory auditing and lifecycle validation under ICH Q8–Q11 frameworks.
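Because the SVSI is summarized only at a high level above, the sketch below shows one plausible formulation (one minus the mean coefficient of variation of normalized per-feature SHAP importances across folds), together with the Pearson-correlation LEF check; both the formulation and the surrogate outputs are assumptions for illustration.

```python
# One plausible SVSI formulation (an assumption, not the paper's exact
# formula) plus a Pearson-correlation LEF check on stand-in data.
import numpy as np
import shap
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(8)
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=400)

fold_importances = []
for train, test in KFold(n_splits=10, shuffle=True, random_state=8).split(X):
    model = RandomForestRegressor(random_state=8).fit(X[train], y[train])
    sv = shap.TreeExplainer(model).shap_values(X[test])
    imp = np.abs(sv).mean(axis=0)
    fold_importances.append(imp / imp.sum())      # normalized importances

imp = np.array(fold_importances)                  # shape: (folds, features)
cv_per_feature = imp.std(axis=0) / imp.mean(axis=0)
svsi = 1.0 - cv_per_feature.mean()
print(f"SVSI = {svsi:.3f}")

# LEF: correlate surrogate outputs with original model predictions.
# (A noisy copy stands in for the LIME surrogate's outputs here.)
surrogate_preds = model.predict(X) + rng.normal(scale=0.05, size=400)
r, _ = pearsonr(surrogate_preds, model.predict(X))
print(f"LEF (Pearson r) = {r:.3f}")
```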
Trade-off between interpretability and computational cost
While SHAP and LIME significantly improve transparency and regulatory acceptance, they also increase computational load during inference and validation. Achieving a balance between interpretability and efficiency is essential, particularly in continuous manufacturing environments where real-time analysis is critical. Future work may focus on lightweight or selective explanation mechanisms to reduce computational costs without sacrificing explainability.
Risk-benefit analysis for AI adoption in pharmaceutical production
To further evaluate the industrial implications of adopting the proposed AI-powered QbD framework, a structured risk–benefit analysis was conducted to assess potential challenges and advantages across different implementation scenarios. The analysis considered operational, regulatory, technical, and ethical dimensions relevant to Pharma 4.0 environments.
Benefits: enhanced predictive accuracy, improved process consistency, reduced batch failure rates, and increased regulatory traceability through explainable AI modules. These advantages collectively translate to shorter development cycles and higher patient safety assurance.
Risks: potential algorithmic bias, cybersecurity vulnerabilities in AI-integrated PAT systems, increased computational dependency, and initial implementation costs. Mitigation strategies include bias monitoring, secured data pipelines, hybrid human–AI supervision, and gradual phased deployment.
Table 15 summarizes the overall risk–benefit balance of AI adoption in pharmaceutical manufacturing. The analysis demonstrates that, with appropriate governance and validation mechanisms, the benefits substantially outweigh the risks, supporting AI integration as a practical and regulatory-aligned evolution of QbD under the Pharma 4.0 paradigm.
Summary of findings and discussion
The results of this study collectively demonstrate the effectiveness of the proposed AI-powered QbD framework in enhancing predictive accuracy, robustness, and regulatory compliance within pharmaceutical production. Comparative analyses revealed that deep learning models significantly outperformed both traditional QbD methods and classical machine learning approaches, particularly in capturing complex nonlinear CPP–CQA relationships. Robustness and sensitivity testing confirmed the framework’s stability under noise, missing values, and process variability, while scalability assessments validated its suitability for industrial-scale datasets with high dimensionality. Statistical hypothesis testing further established that the improvements achieved were statistically significant, thereby reinforcing the reliability of the framework. Ablation studies underscored the critical role of NLP for regulatory alignment, dimensionality reduction for computational efficiency, and XAI for interpretability and compliance.
From a broader perspective, these findings highlight the transformative potential of integrating AI into QbD under the Pharma 4.0 paradigm. By unifying structured and unstructured data, enabling predictive quality control, and ensuring explainability, the framework addresses both operational challenges and regulatory requirements. Its adaptability and scalability make it particularly relevant for next-generation pharmaceutical manufacturing, where real-time decision support and continuous improvement are essential. In this regard, the proposed approach contributes to the ongoing shift toward smart, data-driven, and patient-centric production systems. Figure 15 summarizes the key findings, emphasizing statistically significant improvements, scalability of models, and their implications for regulatory acceptance and smart pharmaceutical production.
Industrial case studies: real-world implementation
To demonstrate the practical applicability of the proposed AI-powered QbD framework, two industrial case studies were analyzed.
Case study 1 – tablet manufacturing line: the framework was deployed within a pilot-scale solid-dosage manufacturing facility to monitor mixing, granulation, and compression stages. By integrating real-time CPP data (temperature, pressure, and torque) with predictive deep-learning models, the system successfully identified deviations affecting tablet hardness and dissolution profiles. Corrective actions recommended by the framework reduced out-of-specification batches by approximately 18% over three production cycles.
Case study 2 – sterile fill-finish process: in a sterile injectable line, NLP modules processed batch records and regulatory documentation to ensure alignment with ICH Q8–Q11. The explainable-AI component (SHAP) highlighted that filling speed and vial-stopper alignment were the dominant CPPs influencing product sterility failures. Implementation of the AI-assisted control strategy resulted in a 12% reduction in batch rejections and improved audit traceability.
These case studies confirm that the proposed framework is not limited to simulation but can be effectively implemented in industrial environments, improving process robustness, compliance, and overall product quality.
Sustainability impact: energy efficiency of AI models
The growing integration of Artificial Intelligence into pharmaceutical manufacturing introduces new sustainability considerations, particularly regarding the energy efficiency of model training and inference compared to traditional QbD and statistical approaches.
To assess the sustainability impact, energy consumption during model execution was estimated using GPU/CPU utilization logs collected during training and inference phases. Results indicate that traditional statistical methods such as regression and DoE consume negligible computational energy (<0.05 kWh per 1,000 iterations), whereas machine learning models (RF, SVM) require moderate energy levels (0.3–0.5 kWh). Deep learning architectures (CNN, DNN) exhibit higher training energy consumption (1.8–2.3 kWh per 1,000 iterations) due to increased parameter complexity and GPU dependence. However, their inference phase, once deployed in production, is comparatively efficient, with energy requirements similar to or lower than those of ensemble machine learning models.
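The estimation procedure amounts to integrating device power over the utilization trace, as sketched below; the TDP value, sampling interval, and log contents are illustrative placeholders.

```python
# Sketch of energy accounting from a utilization log: device power is
# integrated over fixed sampling intervals (all values are placeholders).
GPU_TDP_WATTS = 300.0        # assumed board power at 100% utilization
SAMPLE_INTERVAL_S = 1.0      # logging interval in seconds

# Placeholder utilization trace (fraction of full load per sample, ~1 h).
utilization_log = [0.85, 0.90, 0.88, 0.92, 0.80] * 720

# Energy (Wh) = sum of power (W) x interval (s), divided by 3600 s/h.
energy_wh = sum(u * GPU_TDP_WATTS * SAMPLE_INTERVAL_S
                for u in utilization_log) / 3600.0
print(f"Estimated training energy: {energy_wh / 1000:.3f} kWh")
```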
To mitigate environmental impact, model compression and quantization techniques were applied, reducing inference energy by approximately 28% without significant accuracy loss. Hybrid DL + PLS configurations also demonstrated improved efficiency, achieving comparable accuracy with ~20% lower energy cost than full deep networks.
Overall, while AI-based QbD frameworks introduce higher training energy costs, they offer long-term sustainability benefits through continuous optimization, reduced material waste, and energy-efficient deployment at the edge. The integration of sustainable AI practices thus aligns the proposed framework with the green manufacturing goals of Pharma 4.0, promoting both operational efficiency and environmental responsibility.
Conclusion
This research presented an AI-powered QbD framework designed to enhance predictive quality control, process optimization, and regulatory compliance in pharmaceutical production. By integrating machine learning, deep learning, and natural language processing with explainable AI, the framework addressed critical challenges of heterogeneity in data, lack of transparency in decision-making, and scalability in industrial applications. The experimental results demonstrated that deep learning models consistently outperformed both traditional QbD methods and classical machine learning techniques, particularly in capturing nonlinear CPP–CQA relationships, ensuring robustness under noisy and incomplete data, and scaling effectively to large, high-dimensional datasets. Furthermore, statistical hypothesis testing confirmed that these improvements were not only operationally meaningful but also statistically significant, strengthening the credibility of the framework. The ablation study revealed the indispensable role of NLP for regulatory alignment, XAI for interpretability, and dimensionality reduction for computational efficiency. Benchmarking against DoE and regression models highlighted the superiority of the proposed approach in adaptability and accuracy, while still recognizing the simplicity and interpretability offered by conventional methods in early development stages.
While integrating AI into pharmaceutical QbD offers major advantages, it also poses ethical and operational challenges. Safeguarding data privacy, ensuring fairness, and minimizing bias are essential for responsible implementation. Workforce adaptation and transparent AI–human collaboration will be key to maintaining trust and achieving a sustainable, ethical Pharma 4.0 transformation. Overall, this work demonstrates that embedding AI into QbD principles offers a practical and scalable pathway to achieving the goals of Pharma 4.0. By unifying predictive modeling, regulatory alignment, and explainability, the proposed framework enables continuous improvement, supports regulatory acceptance, and advances the transformation towards smart, data-driven, and patient-centric pharmaceutical manufacturing. Future work will focus on integrating the AI-powered QbD framework with Digital Twin systems for real-time process simulation and predictive control. Cross-site deployment and regulatory sandbox testing will evaluate scalability, compliance, and robustness, while collaboration with industry and regulators will support large-scale adoption within Pharma 4.0.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Author information
Contributions
Z.Z. conceived the study, developed the AI-powered QbD framework, and supervised the overall research design. Z.Z. also conducted the data preprocessing, model development, experiments, and analysis. In addition, Z.Z. prepared the figures, drafted the manuscript, and finalized the revisions. The author reviewed and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhu, Z. Intelligent information management enables quality-by-design in pharmaceutical production. Sci Rep 15, 44201 (2025). https://doi.org/10.1038/s41598-025-27879-w