Introduction

AI is changing the way we solve problems, and healthcare is not exempt. Saenz et al. (2023) discuss recent FDA approvals of biomedical systems with at least some degree of autonomy for use in clinical and non-clinical settings, highlighting the growing integration of modern AI techniques into different aspects of patient care. Examples include the automatic generation of annotations used by radiologists in diagnostic image interpretation, and the recent triumph of LumineticsCore for timely delivery of insulin in point-of-care diagnostics1. Even over-the-counter biosensing technologies advertise increased capabilities as a result of some form of AI integration2. Nevertheless, the adoption of AI into healthcare is not without its challenges: a complicated and shifting landscape of governmental policies addressing liability, ethics, economics, and other important considerations makes the task of providing clear guidelines to physicians and healthcare technology practitioners especially difficult1,3,4.

Ambulatory and wearable sensing technology for general wellness monitoring has become increasingly popular, and its adoption for early disease detection is a prevailing idea among early-stage medical start-ups, judging by the volume of research publications and conference presentations that extend the usefulness of the smartwatch into a medical device5,6,7,8,9,10. With advances in Large Language Models (LLMs) and the underlying mechanisms for Chain of Thought (CoT) reasoning, which allow biophysics models to be integrated as instrument models in wearable technology, the development of digital health applications is accelerating, especially with the availability of models that focus on consistency checking and resiliency evaluation. Yet for all that we are now able to do, there is surprisingly no corresponding increase in studies that assess the patient's experience of care beyond the general customer satisfaction survey11,12,13,14,15,16. Typically, the ultimate goal of the customer satisfaction survey is profit maximization for the service provider, whereas an assessment of care aims to optimize the psychological factors within the context of the healthcare service. One might argue that a comprehensive study of the patient's experience of care should assess how much of the human qualities of empathy, understanding, gentle touch, and so on is lost in AI-integrated healthcare.

However, there is currently limited data on the general population's perception of the integration of AI into healthcare. A recent study found that the majority of Americans wanted to be notified if AI was involved in their care, with females, older adults, non-Hispanic Whites, and more educated people more likely to express this desire17. In another study on the acceptability of AI, a decisive positive correlation was found with perceived utility, positive attitude, and perceived trustworthiness, and a negative correlation with poor computer literacy and negative attitudes towards computers18,19. Since patients are ultimately the beneficiaries of AI integration in healthcare, directly or indirectly, it is important to ensure that AI is well received. In addition, patient acceptability and public trust are important to ensure patient engagement, wide dissemination, and the successful integration of AI into healthcare.
In this study, we lay foundational work for an AI affinity score, using a mathematical model and prevailing Machine Learning (ML) techniques to build models that leverage a patient's general information and biological data to determine their experience of care strictly on the basis of the degree of AI integration. The AI affinity score is designed to predict the degree of AI integration that will maximize a patient's experience of care, and it was evaluated on data generated from a survey of participants from North America, Asia, and Africa regarding their perceptions and acceptability of AI integration into healthcare.

Distribution of attitudes towards AI integrated Healthcare

We conducted this study using the survey method, with the aim of assessing how the degree of AI integration in digital health systems and general healthcare services impacts the patient's experience of care. The research was approved by the Institutional Review Board (IRB) for CSU San Bernardino under number IRB-FY2025-63. Informed consent was obtained from all subjects, and all methods were performed in accordance with the relevant guidelines and regulations of the IRB. The data collected revealed that most respondents are familiar with AI, with 97% acknowledging awareness of the technology and nearly 60% having used AI-powered tools. This widespread familiarity with AI provided an opportunity to examine variations in perceptions of AI-integrated healthcare. These variations are explored across demographic groups using Kernel Density Estimation (KDE) and box plots. While overall perceptions remain neutral, with most affinity scores centered around 0.5, significant differences are observed across gender, age, education level, and regional factors.
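As an illustration of this exploratory step, the sketch below reproduces the kind of KDE and box-plot comparison used here, assuming a pandas DataFrame with hypothetical column names (affinity_score, gender, education); the actual survey schema may differ.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical survey table; the real data uses the study's own field names.
df = pd.read_csv("survey_responses.csv")

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Normalized KDE of affinity scores, one curve per gender category (cf. Fig. 1).
sns.kdeplot(data=df, x="affinity_score", hue="gender", common_norm=False, ax=axes[0])
axes[0].set_title("KDE of AI affinity scores by gender")

# Box plot of affinity scores by education level (cf. Fig. 2).
sns.boxplot(data=df, x="education", y="affinity_score", ax=axes[1])
axes[1].set_title("AI affinity scores by education level")

plt.tight_layout()
plt.show()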

Fig. 1 Normalized KDE plot for affinity scores across demographic groups.

Figure 1 shows that male and female respondents have similar perceptions of AI-integrated healthcare, as indicated by overlapping affinity scores. However, respondents identifying as “Other” demonstrate consistently lower scores, suggesting less favorable views. Aligning with the KDE analysis, the “Other” category also has a noticeably lower median and narrower inter-quartile range (Fig. 2). Regarding age, while medians are similar, younger respondents exhibit slightly greater variability in their affinity scores, which may indicate more diverse perceptions among younger participants. Education level shows the most significant differences between subgroups, with respondents possessing advanced and moderate education displaying tightly clustered, higher scores, indicative of more favorable and consistent views. Conversely, those with lower education levels exhibit broader and lower scores, as confirmed by the lower median. Regional differences also play a significant role in shaping perceptions. Respondents from Asia show the highest and most consistent affinity scores, as evidenced by a narrow distribution and higher median in both the KDE and box plots. This suggests a more positive and unified perception of AI-integrated healthcare in this region. In contrast, participants from North America and other regions exhibit more variability in their responses, reflecting a diversity of opinions on the topic.

Fig. 2 Box plot for affinity scores across demographic groups.

Noticeable differences emerge when comparing perceptions of digital technology and AI integration across age and regional groups. While respondents generally express positive views toward digital technology, their attitudes toward AI integration are notably more cautious. This is reflected in a higher proportion of respondents adopting a negative or neutral stance on AI integration (Fig. 4) compared to digital technology applications (Fig. 3).

Fig. 3 Digital approval level across age and region groups.

Fig. 4 AI support level across age and region groups.

Older demographics tend to favor digital technology more, whereas younger respondents show slightly greater openness towards AI. Regionally, Asian respondents demonstrate stronger support for AI despite harboring more skepticism towards digital technology. On the surface, perceptions of AI integration in healthcare generally lean towards neutrality, but they vary most significantly with level of education and region, which play pivotal roles in shaping attitudes on AI generally and especially its integration into healthcare systems. It is therefore important to account for these demographic nuances when addressing public perceptions and fostering trust in AI integration.

Predicting a patient’s experience of care in an AI integrated healthcare system

To develop a computational model to reason over general preferences and attitudes around the integration of AI into healthcare, we introduce the AI affinity coefficient \(\alpha \in (-1, 1) \subset \mathbb {R}\) as a measure of the deviation of a response to a survey question, \(Q_{i} \in Q\), from neutrality, which is realized as \(\alpha = 0\). When a response is in favor of AI, \(\alpha \rightarrow 1\); the actual realized value of \(\alpha\) depends on the strength of the sentiment expressed. The reverse is true for a response not in favor of AI: \(\alpha \rightarrow -1\). For each study participant, we calculate an AI affinity score such that for the kth respondent the following holds:

$$\begin{aligned} \textit{AI affinity score}(A_{k}) = \prod _{i=1}^{n}\alpha _{i}^{k}W(Q_{i}) \end{aligned}$$
(1)

We choose \(W(Q_{i}) \in (0, 1)\) such that

$$\begin{aligned} \sum _{i=1}^{n} W(Q_{i}) = 1 \end{aligned}$$
(2)
  • \(\alpha _{i}^{k}\) is the AI affinity coefficient of the kth respondent's response to the ith question

  • \(W(Q_{i})\) is the weight assigned to the ith question

  • \(n\) is the total number of related questions in the survey

In this study, the weights of the survey questions were selected based on expert opinion on their perceived importance or influence (implicit or otherwise) on AI affinity. Subsequently, we present a deep learning model that predicts AI Affinity Scores to determine the degree of AI integration into care that will optimize a patient's experience of care.
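A minimal sketch of Eqs. (1)-(2), implemented as written above (a weighted product over question-level coefficients), is shown below; the coefficients and weights are illustrative and are not the expert-assigned weights used in the study.

import numpy as np

def affinity_score(alphas: np.ndarray, weights: np.ndarray) -> float:
    """Compute A_k for a single respondent.

    alphas  : per-question AI affinity coefficients, each in (-1, 1)
    weights : per-question weights W(Q_i), normalized so they sum to 1 (Eq. 2)
    """
    weights = weights / weights.sum()           # enforce Eq. (2)
    return float(np.prod(alphas * weights))     # Eq. (1) as stated above

# Three hypothetical questions: the respondent leans pro-AI on Q1 and Q3, mildly anti on Q2.
alphas = np.array([0.6, -0.2, 0.8])
weights = np.array([0.5, 0.3, 0.2])             # illustrative expert-assigned importance
print(affinity_score(alphas, weights))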

Supervised learning for predicting AI affinity scores

The dataset we used for model prediction included 24 predictors derived from 320 patient survey responses. The ages of this patient cohort ranged from 18 to over 46 years, with categories of 18–25, 26–35, 36–45, and 46+. These predictors covered demographics such as gender, education, region, and occupation, as well as familiarity with and attitudes toward AI and robotics in healthcare. First, we performed Principal Component Analysis (PCA) to determine the top five relevant features for the prediction, in order to handle the high-dimensional data and minimize redundancy. Specifically, PCA revealed that the most influential predictors included patients' attitudes toward AI integration, their concerns about the use of AI and robot assistants in healthcare and the service industry, their digital health usage behaviors and familiarity, and their level of trust in AI tools. Subsequently, we partitioned the dataset using a 60/20/20 train-test-validation split, i.e., 60% of the data is used to fit the model, 20% is used to evaluate the trained model, and 20% is used to validate the trained model.
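The following sketch illustrates the dimensionality-reduction and partitioning steps under the assumption that the 24 predictors have already been encoded numerically; the arrays here are placeholders rather than the study data.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

X = np.random.rand(320, 24)   # placeholder for the 24 encoded survey predictors
y = np.random.rand(320)       # placeholder for the computed AI Affinity Scores

# Standardize, then project onto the leading principal components.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=10)    # the downstream network consumes the top PCA components
X_pca = pca.fit_transform(X_scaled)
print("explained variance ratio:", pca.explained_variance_ratio_)

# 60/20/20 train/test/validation partition.
X_train, X_tmp, y_train, y_tmp = train_test_split(X_pca, y, test_size=0.4, random_state=42)
X_test, X_val, y_test, y_val = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)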

Model training

The models were trained to support both continuous and categorical prediction of AI Affinity Scores. These include a deep learning regression model, a classification model using the same deep learning architecture, a baseline linear regression model, and a Random Forest classifier for interpretability. We implemented a feedforward neural network (FNN) for the regression model consisting of an input layer that accepts the top 10 PCA-transformed components, two hidden layers with 64 and 32 neurons, each using ReLU activation, and an output layer containing a single neuron with a linear activation function for affinity score prediction. We use dropout layers with a 20% rate after each hidden layer and an Adam optimizer with a learning rate of 0.001. We minimize the average squared difference between observed and predicted affinity scores using the Mean Squared Error (MSE) loss function. The training procedure with the 60/20/20 train-test-validation split was performed for up to 2000 epochs with a batch size of 32. Since the dataset is small (approximately 300 samples), we included an EarlyStopping callback with a patience of 50 epochs to prevent overfitting. With this modification, training stopped automatically once the model's performance on the validation set plateaued, avoiding unnecessary training up to the preset 2000 epochs. The ML model's outcome after training is the Affinity Score, a metric that captures participants' degree of receptiveness to the use of AI and robots as assistants in healthcare.

The classification model shares the same architecture but uses a softmax-activated output layer with three units for the affinity categories (Low = 0, Medium = 1, High = 2). Affinity scores were binned into three ordinal classes using quantile-based cutoffs. Since the distribution of affinity labels is imbalanced, with the "Medium" class dominating, we incorporated class weights during training; the weights were computed from the inverse class frequencies and used to balance the loss contribution across categories.

The third model we trained was a linear regression model using the same PCA features to compare performance. This basic model provides a baseline for continuous prediction that is interpretable and easy to implement. Additionally, we trained a Random Forest classifier using the binned affinity classes. This model is useful for evaluating robustness across categorical prediction tasks and enables further comparison to the neural classification model. All our deep learning models were built and trained using the TensorFlow library, with the Keras API used to define the neural network architectures. Classical models (Random Forest and linear regression) were implemented using scikit-learn.
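A sketch of the regression network described above, continuing from the split in the previous sketch; the architecture and hyperparameters follow the text, while details such as restore_best_weights are assumptions.

from tensorflow import keras
from tensorflow.keras import layers

def build_regressor(n_inputs: int = 10) -> keras.Model:
    """FNN: input -> 64 ReLU -> dropout -> 32 ReLU -> dropout -> linear output."""
    model = keras.Sequential([
        layers.Input(shape=(n_inputs,)),        # top PCA-transformed components
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(32, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(1, activation="linear"),   # continuous affinity score
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
                  loss="mse", metrics=["mae"])
    return model

early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=50,
                                           restore_best_weights=True)

model = build_regressor()
history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=2000, batch_size=32,
                    callbacks=[early_stop], verbose=0)

# The classifier variant replaces the output layer with Dense(3, activation="softmax"),
# uses a categorical cross-entropy loss, and passes class_weight to fit().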

Model evaluation

For the deep learning regression model, evaluation on the test set yielded a low MSE of 0.0020. The \(\text {R}^{2}\) score was 0.9339, indicating strong agreement between predicted and observed values. Figure 5 shows that the sorted squared errors (blue line) follow an increasing trend, indicating a subset of samples for which the model finds accurate prediction harder. The red line, which represents the Mean Squared Error (MSE), serves as a reference for the model's overall performance; samples above this line indicate areas where the model struggles more with prediction. To further evaluate model performance, we applied a paired t-test to determine whether the differences between the actual and predicted affinity scores are statistically significant. The test produced a t-statistic of 0.5043 and a p-value of 0.6158. Since the p-value is well above the commonly used threshold of \(\alpha = 0.05\), we fail to reject the null hypothesis. This indicates no significant difference between the predicted and actual scores, suggesting that the model's predictions are strongly aligned with the ground truth. The Mean Absolute Error (MAE), which quantifies the average magnitude of prediction error, was 0.0356, meaning that, on average, the predicted scores deviate from the true values by just 0.0356 units. We also observed that the deep learning model tends to regress toward the mean, producing predictions close to the average and with reduced variance, which is common among models trained on small datasets. For comparison, we trained a linear regression model; surprisingly, the linear model outperformed the deep learning model, achieving an \(\text {R}^{2}\) of 0.91, a lower MSE of 0.0016, and an MAE of 0.0327.

The second model, the deep learning classifier, achieved a test accuracy of 90%, but the confusion matrix revealed that most predictions fall into the "Medium" category, likely due to class imbalance in the data. To reduce this bias, we applied class weighting during training, which improved sensitivity for the "Low" and "High" classes. However, class imbalance remains a limitation in the overall classification performance.

The third model, the linear regression baseline, performed well, achieving a low Mean Squared Error (MSE) of 0.0017, a Mean Absolute Error (MAE) of 0.0337, and a high \(\text {R}^{2}\) score of 0.9388. This suggests that the model was able to closely approximate the continuous Affinity Scores. These results demonstrate that for small datasets, linear models can provide fair performance with minimal overfitting.

The last model, the Random Forest classifier, achieved a test accuracy of 82% with strong F1 scores for the Medium and High categories. It attained a precision of 0.89 and a recall of 0.80 for the High category, as well as a precision of 0.79 and a recall of 0.98 for the Medium category. However, it struggled with the Low category, yielding a much lower recall of 0.31, which indicates frequent misclassifications. The confusion matrix confirms that while the model correctly identified most Medium scores, many Low scores were misclassified as a result of the underlying distribution of the data.
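The reported test-set metrics and the paired t-test can be computed as in the sketch below, continuing from the fitted model above.

from scipy import stats
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_pred = model.predict(X_test).ravel()

print("MSE :", mean_squared_error(y_test, y_pred))
print("MAE :", mean_absolute_error(y_test, y_pred))
print("R^2 :", r2_score(y_test, y_pred))

# Paired t-test: H0 is that the mean difference between observed and predicted scores is zero.
t_stat, p_value = stats.ttest_rel(y_test, y_pred)
print(f"t = {t_stat:.4f}, p = {p_value:.4f}")   # p > 0.05 -> fail to reject H0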

Fig. 5 Training error surface.

The confusion matrix in Fig. 6 helps evaluate the performance of the deep learning model (with regression) in classifying AI Affinity Scores. For visualization, the model’s continuous predictions and corresponding ground truth values were discretized post-hoc into three ordinal categories: “0” for Low, “1” for Medium, and “2” for High. This binning was applied only after model training and did not affect the regression model itself. The diagonal elements represent correct predictions and show strong performance for the Medium category, with 43 correct predictions. Additionally, 8 correct predictions were made for the Low category. The High category performed the worst, with only 3 correct predictions, indicating that the model struggles to accurately classify high-affinity scores. Most misclassifications appear in the off-diagonal elements. Specifically, Medium scores were often misclassified as both Low and High, and High scores were misclassified as Medium in five instances. This pattern suggests that the model has difficulty distinguishing between Medium and High scores, possibly due to overlapping feature distributions.
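A sketch of the post-hoc discretization behind Fig. 6 follows; the tertile cutoffs used here are assumptions chosen to split the observed scores into three roughly equal bins, and the arrays continue from the evaluation sketch above.

import numpy as np
from sklearn.metrics import confusion_matrix

# Tertile cutoffs estimated from the observed test scores.
low_cut, high_cut = np.quantile(y_test, [1 / 3, 2 / 3])

def to_class(scores):
    """Map continuous scores to 0 = Low, 1 = Medium, 2 = High."""
    return np.digitize(scores, bins=[low_cut, high_cut])

cm = confusion_matrix(to_class(y_test), to_class(y_pred), labels=[0, 1, 2])
print(cm)   # rows = observed class, columns = predicted class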

Fig. 6 Confusion matrix.

Comparison between the predicted and observed affinity scores

The plot in Fig. 8 shows the relationship between the observed (true) AI Affinity Scores and the model's predicted AI Affinity Scores. The red dashed line represents the ideal scenario where the predicted values match the observed values exactly. A key observation from Fig. 8 is that most points cluster around the red dashed line, indicating that the model's predictions are reasonable when compared to the actual values.

Paired T-test for synthetic data evaluation

The total sample size is 320; we generated 80 additional samples synthetically by resampling the original dataset with replacement (bootstrapping). Each sample in this synthetic dataset retained the same feature distribution as the original data. The distribution of the 80 synthetic AI Affinity Scores was treated as the observed scores for evaluation. We then added random Gaussian noise \(\sim N(\mu =0, \sigma =0.05)\) to the observed distribution of AI Affinity Scores to simulate the deep learning model's predictions; these noisy scores represent the predicted scores for the synthetic dataset. We performed a paired t-test to compare the observed and predicted affinity scores for the 80 synthetic rows and obtained a t-statistic of 0.496 and a p-value of 0.621. This p-value indicates no significant difference between the observed and predicted scores, showing stable model predictions. These results from the synthetic data evaluation show a strong alignment between the model's predictions and the observed values and attest to the robustness of the trained model, as seen in Fig. 7. Overall, the model's predictions align with the observed values with high accuracy (low MAE and strong alignment in the scatter plot), and the high p-value from the paired t-test indicates strong statistical agreement. This shows that the model effectively captures the relationship between features and the target variable.
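The synthetic-data check described above can be reproduced with the sketch below, where y stands for the vector of original affinity scores (a placeholder from the earlier sketches).

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

observed = rng.choice(y, size=80, replace=True)                    # bootstrap resample
predicted = observed + rng.normal(loc=0.0, scale=0.05, size=80)    # simulated predictions

t_stat, p_value = stats.ttest_rel(observed, predicted)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")   # p >> 0.05 -> no significant difference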

Fig. 7 Distribution of predicted versus observed AI affinity scores from the deep learning model and the linear regression baseline model (N = 320).

Fig. 8 Predicted versus observed affinity scores.

Impact of gender, age group & level of education on AI integrated healthcare

It is important to note that the survey constrains participant selections to a digital health space, i.e., the subsequent analysis reflects the preferences and attitudes of the referenced demographics in a universe where digital health is realized. For example, we find that older populations are more inclined towards AI integration when an acceptance of digital health is already present, whereas the opposite might be true in a more relaxed universe where digital health is optional. The group statistics for the distribution of AI Affinity Scores over gender are presented in Table 1 below. With fewer than 10 respondents identifying as neither male nor female, we do not have enough data to include this group in the analysis.

Table 1 Group statistics for AI affinity scores over gender.

Table 2 below shows that there is no statistically significant difference in AI Affinity Scores based on gender, implying that gender has no impact on the preferred degree of AI integration into healthcare.

Table 2 1-way ANOVA involving groups of AI affinity scores over gender.
Table 3 Group statistics for AI affinity scores over age group.
Table 4 1-way ANOVA involving groups of AI affinity scores over age group.

Similarly, Table 3 shows the group statistics for the distribution of AI Affinity Scores over age group, while Table 4 below shows that there is no statistically significant difference in AI Affinity Scores based on age group, implying that age has no impact on the preferred degree of AI integration into healthcare.

For the distribution of AI affinity scores over education level, we are particularly interested in the impact of the degree of academic exposure to the theory and application of AI. To this end, we only consider participants with at least some college-level exposure to AI, either directly through instruction or indirectly via informal interactions within the academic community.

Table 5 Group statistics for AI affinity scores over level of education.
Table 6 1-way ANOVA involving groups of AI affinity scores over level of education.

Table 5 shows the group statistics for the distribution of AI Affinity Scores over level of education, while Table 6 below shows that there is a statistically significant difference in AI Affinity Scores based on level of education at \(\alpha = 0.10\) (p-value = 0.09771), implying that level of education has an impact on the preferred degree of AI integration into healthcare. When comparing the group of people with advanced degrees directly against people with some college education (ignoring the group with only a college degree) at \(\alpha =0.05\), we find stronger evidence against the null hypothesis for ANOVA, with a p-value of 0.04629.
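The one-way ANOVA over education groups can be computed as sketched below, assuming a DataFrame df with hypothetical 'education' and 'affinity_score' columns.

from scipy import stats

groups = [g["affinity_score"].values for _, g in df.groupby("education")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.3f}, p = {p_value:.5f}")   # compare against alpha = 0.10 (or 0.05)

# A direct comparison of two groups (e.g., advanced degree vs. some college) can reuse
# f_oneway with only those two groups.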

Functional groups over AI affinity scores

For practical considerations and effective adoption, it is useful to categorize AI Affinity Scores, i.e., to associate them with domain-specific labels tailored to support decisions and minimize uncertainty over a space of healthcare protocols that optimize a patient's experience of care. To this end, an arbitrary number of categories of degree of AI integration may be assigned, and AI Affinity Scores can be distributed over the chosen categories using threshold functions. However, it may be at least marginally better to assign labels based on groups that arise implicitly from a Bayesian nonparametric statistical analysis of the data, under the assumption that a fully AI-integrated healthcare is the expectation and that deviations to other degrees of AI integration happen with concentration parameter \(\alpha\).

First, we pool the distribution of AI Affinity Scores into a single group and refer to the resulting distribution as \(\phi\), modeled as a Gaussian mixture over k clusters.

$$\begin{aligned} GMM(\phi ) = \sum _{j=1}^{k} \pi _{j}P(\phi ; \mu _{j},\sigma _{j}) \end{aligned}$$
(3)
  • \(\pi _{j}\) is the mixture coefficient of the jth component

  • \(\mu _{j}, \sigma _{j}\) are the model parameters of the jth Gaussian distribution

Then we state the Dirichlet prior as having the simple form:

$$\begin{aligned} p(\phi ) = GMM(\phi ) \end{aligned}$$
(4)
such that the following holds:

$$\begin{aligned} \frac{1}{\alpha }&\sim \Gamma (1,1) \\ \{\pi _{1}, \ldots , \pi _{k}\} \mid \alpha&\sim Dir\left( \frac{\alpha }{k}\right) \\ \{\mu _{1}, \ldots , \mu _{k}\}&\sim N(0,1) \\ \left\{ \frac{1}{\sigma _{1}}, \ldots , \frac{1}{\sigma _{k}}\right\}&\sim \Gamma (1,1) \end{aligned}$$

With TensorFlow Probability, we derive both the number of inferred clusters and the cluster element distribution, as shown in Table 7 below:

Table 7 Inferred clusters from Dirichlet process mixture model (DPMM).

Healthcare administrators can use this information to design intervention protocols and care packages for a functional AI-integrated healthcare system with five degrees (levels) of AI integration. The categories over which the AI Affinity Scores distribute can also be used to guide marketing strategies tailored to different populations. Those with high scores are likely early adopters of AI interventions, while those with lower scores may require targeted strategies to encourage adoption. Also, in line with upholding strong ethics, it is crucial to avoid marginalizing patients who prefer traditional care; under this lens, AI affinity scores can preserve human-centered care for those with lower AI affinity20,21. The asymmetry in cluster size shown in Table 7 allows for proper resource planning and allocation for targeted interventions over the set of functional categories based on their popularity.
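The study's clustering was implemented with TensorFlow Probability; as a lighter-weight illustration of the same idea, the sketch below uses scikit-learn's BayesianGaussianMixture with a Dirichlet-process prior, which infers the effective number of clusters by shrinking the weights of unused components toward zero (the concentration value and component cap are assumptions, and y is a placeholder for the vector of affinity scores).

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

scores = y.reshape(-1, 1)                       # AI Affinity Scores as a single feature

dpgmm = BayesianGaussianMixture(
    n_components=10,                            # upper bound on k; unused components get ~0 weight
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=1.0,             # concentration parameter alpha (assumed value)
    random_state=0,
)
labels = dpgmm.fit_predict(scores)

active = dpgmm.weights_ > 0.01                  # components that actually receive probability mass
print("inferred clusters:", int(active.sum()))
print("cluster sizes:", np.bincount(labels, minlength=10))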

Discussion and conclusion

The AI Affinity Score allows healthcare providers to personalize care delivery based on an individual's preference for AI integration, optimizing their experience. Tailoring AI integration to patient preferences can enhance engagement and satisfaction18,22,23. Patients with higher AI Affinity Scores may be more receptive to AI-driven interventions, while those with lower scores may prefer human-centered approaches. Although AI-based therapy can be effective, its success depends on acceptability, trust, and attitude toward AI24,25. Research shows that satisfaction with care is linked to engagement, adherence to therapy, and improved outcomes26,27. Embedding the AI Affinity Score in electronic medical records at intake can help determine the appropriate level of AI integration, potentially leading to better outcomes.

Our data show differences in AI Affinity Scores across levels of education, which can inform the allocation of AI-based technologies to areas with higher affinity, freeing human resources for regions with lower affinity scores and reducing healthcare inequities. More granular data is needed to assess variations within countries, including rural-urban differences, regional variations, and intra-city variations in AI Affinity Scores; this can help policymakers and healthcare administrators reduce disparities. In general, the data suggest that individuals with lower education tend to have less favorable attitudes towards AI-integrated healthcare24. This aligns with our findings and can inform resource planning and allocation, although additional data are necessary. Tailored messaging can leverage affinity scores to encourage adoption and engagement. Several studies have highlighted the importance of developing new models of patient segmentation and customizing communication strategies and healthcare delivery to meet the needs of different patient groups. AI Affinity Scores can be integrated into new models that incorporate social determinants of health, neighborhood characteristics, and consumer data to address the needs of different populations by stratifying people based on their preferences for AI-integrated healthcare28,29,30. The AI Affinity Score can also be used to monitor trends in AI acceptance over time, offering valuable insights into shifting attitudes and informing continuous improvements in AI applications to ensure they remain patient-centered.

A systematic review showed that patients generally accepted AI integration in healthcare when effectiveness is demonstrated, providers remain involved, and the integration maximizes the individual strengths of human providers and AI. There are several limitations associated with a survey-based study, as noted by other researchers, including non-response bias, selection bias (with younger and better-educated participants overrepresented in the studies reviewed), and a digital divide between older and younger individuals22. It is important to acknowledge these limitations in our model as well, especially since the data collection process introduced a selection bias: because the survey was conducted online, participants are more likely to be familiar with digital technologies and to have a positive attitude toward technology. This bias is mitigated by constraining our study with the assumption that digital health is not optional, as stated earlier. Additionally, the model relies on limited demographic variables, whereas attitudes toward AI integration are influenced by many factors. Incorporating additional data could improve the model's accuracy, but the current model's simplicity allows for lower variance and more stable predictions across several datasets. Adding more variables may increase complexity but also introduce more variability, potentially reducing generality and predictive stability31,32,33. Ultimately, we believe that the AI Affinity Score offers a practical tool for tailoring AI integration to individual patient preferences, enhancing engagement, optimizing healthcare delivery, and improving health outcomes.