Abstract
Overeating contributes to obesity and poses a significant public health threat. The SenseWhy study (2018–2022) monitored 65 individuals with obesity in free-living settings, collecting 2302 meal-level observations (48 per participant), using an activity-oriented wearable camera, a mobile app, and dietitian-administered 24-hour dietary recalls. Micromovements (e.g., bites, chews) were manually labeled from 6343 hours of footage spanning 657 days. Psychological and contextual information was gathered before and after meals through Ecological Momentary Assessments (EMAs). We predicted overeating episodes based on EMA-derived features and passive sensing data (mean AUROC = 0.86; mean AUPRC = 0.84). Using semi-supervised learning on EMA-derived features alone, we identified five distinct overeating phenotypes: “Take-out Feasting,” “Evening Restaurant Reveling,” “Evening Craving,” “Uncontrolled Pleasure Eating,” and “Stress-driven Evening Nibbling.” These results highlight the complex interplay between behavioral, psychological, and contextual factors associated with overeating, providing a foundation for personalized interventions.
Introduction
Obesity remains a significant public health challenge despite various pharmacological and behavioral interventions. Since 1975, the global prevalence of obesity has tripled. As of 2022, 2.5 billion adults aged 18 years and older were considered overweight, including over 890 million who were classified as having obesity1. Traditional behavioral weight loss interventions often fail to provide long-term results, with many individuals experiencing weight regain within 12 months post-intervention2,3. One common target of obesity interventions and treatments is overeating4. However, these efforts are often unsuccessful, potentially because the specific patterns and behaviors that contribute to overeating are not well understood5. A critical reason for this knowledge gap is the lack of a multifaceted approach to studying eating behavior, which often excludes investigation into the complex and interrelated factors contributing to overeating. Consequently, most current treatments do not account for the dynamic interplay of psychological, contextual, and physiological factors, highlighting the need for more personalized and adaptive intervention strategies6.
Considerable effort has been devoted to identifying predictors of overeating, primarily using self-reported data in combination with linear statistical approaches. Most of these studies have focused on single proximal determinants such as stress, cravings, and loss of control (LOC) eating as potential predictors of overeating7,8,9,10. For example, recent studies have identified various single predictors of overeating, such as emotional eating and impulsive responses to food cues11,12,13. While Ecological Momentary Assessment (EMA)—a research methodology that involves repeatedly sampling participants’ behaviors, experiences, and moods in real-time and in their natural environments—provides valuable insights into eating behaviors, it can sometimes suffer from limitations related to the accuracy of meal timing and portion-size reporting. Factors such as recall bias or delayed entries may affect the precision of self-reported data; therefore, objective measures are needed to improve our understanding of overeating.
Wearable sensors are a promising source of objective data on overeating behaviors and their predictors. Wearables can collect data passively and continuously, enabling researchers to obtain behavioral measurements that are both richer and more frequent than those obtained through self-reported measures. Applying wearable sensors to eating studies can increase data reliability and open new avenues for analyzing the occurrence and co-occurrence of overeating predictors by providing richer, finer-grained datasets.
In this study, we first applied machine learning algorithms to identify features that predict overeating using passive sensing data, with and without ecological momentary assessment (EMA) inputs. We then used these identified features to construct distinct clusters of overeating and eating episodes, allowing us to differentiate theoretically and clinically relevant patterns of problematic overeating behaviors. This approach provides individualized data for personalized, adaptive interventions, overcoming the limitations of current one-size-fits-all strategies.
Results
Out of the initial 65 participants, 48 adults were included in the subsequent analyses: seven participants dropped out of the study, five lacked dietitian-administered recalls, and five recorded fewer than 10 meals during the study period, as shown in Fig. 1. The final EMA-only dataset comprised 2302 meal-level observations from participants with obesity, averaging 48 meals per participant. Participants had a mean age of 41 years (range: 21–66), and 77.1% were female. Baseline demographics and the EMA response summary are presented in Table 1. Supplementary Table 1 summarizes the EMA and passive sensing data, including demographic information, EMA response details, and statistics related to passive sensing features for 700 meals (average of 17 meals per participant). To illustrate the distribution of observations across participants, Supplementary Fig. 1 shows the distribution of EMA and feature-complete data (incorporating all available features) per participant. Supplementary Fig. 2 presents the hourly distribution of meals for EMA and feature-complete data across all participants. Details on wear time, adherence, and participant feedback for the SenseWhy wearable camera are available in Supplementary Note 1, Supplementary Table 2, and Supplementary Fig. 3.
Supervised overeating detection
We selected XGBoost as the best-performing model after comparing it with SVM and Naïve Bayes. While SVM is particularly effective in high-dimensional spaces and Naïve Bayes is efficient and robust, XGBoost, a non-linear ensemble method, proved more effective in capturing complex patterns in the data (Fig. 2). We further evaluated the model by examining the training and validation errors across iterations to ensure robustness (see Supplementary Fig. 4). For the detection of overeating, we conducted three separate analyses:
Performance of XGBoost, SVM, and Naïve Bayes models across three data scenarios: a EMA-only data, b Passive sensing-only data, and c Feature-complete data. Each graph shows the AUROC, AUPRC, and Brier score loss for each model to allow comparison of performance across different data inputs and modeling approaches.
EMA-only analysis
XGBoost yielded a mean (SD) AUROC of 0.83 (0.02), an AUPRC of 0.81 (0.02), and a Brier score loss of 0.13 (0.01). The top five features identified by SHAP were: light refreshment (negative), pre-meal biological hunger (positive), perceived overeating (positive), evening eating (positive), and pleasure-driven desire for food (mixed association). In this context, a positive association means the feature increases the likelihood of overeating, a negative association means it reduces the likelihood, and a mixed association indicates varying effects depending on the instance.
Passive sensing-only analysis
XGBoost resulted in an AUROC of 0.69 (0.04), an AUPRC of 0.69 (0.05), and a Brier score loss of 0.18 (0.02). The top five features were: number of chews (positive), chew interval (negative), chew-bite ratio (negative), number of bites (positive), and chew rate (mixed).
Feature-complete dataset
In this combined dataset, XGBoost achieved an AUROC of 0.86 (0.04), an AUPRC of 0.84 (0.04), and a Brier score loss of 0.11 (0.02). To improve prediction probability calibration, we applied post-calibration using the sigmoid method (Platt’s scaling), which resulted in better alignment between predicted probabilities and observed outcomes (Supplementary Fig. 5). The top five predictive features were: perceived overeating (positive), number of chews (positive), light refreshment (negative), loss of control (positive), and chew interval (negative).
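For readers less familiar with the reported metrics, the following is a minimal sketch of how AUROC and the Brier score loss can be computed from predicted probabilities. It is not the study's evaluation code; the function names and the toy labels and probabilities are illustrative assumptions.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the study): 1 = overeating episode.
y_true = [1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7, 0.8]

toy_brier = brier_score(y_true, y_prob)
toy_auroc = auroc(y_true, y_prob)
```

A lower Brier score indicates better-calibrated probabilities, which is why it is reported alongside the ranking-based AUROC and AUPRC.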
Figure 3 displays SHAP dot and bar plots, highlighting feature importance and their impact across all study levels.
a EMA-only, b Passive sensing-only, and c Feature-complete (combining EMA and passive sensing). The left panels show the mean absolute SHAP values, ranking features by their overall contribution to the model, while the right panels show the distribution of SHAP values for individual feature impacts associated with overeating (red for high feature values and blue for low feature values).
Semi-supervised overeating phenotype clustering
After removing 56 zero-calorie meals (e.g., non-caloric foods and beverages), we arrived at a final dataset of 2,246 meals, of which 369 (16.4%) were identified as overeating episodes. This adjustment ensured that the analysis focused on meals that accurately represented eating behavior, allowing for more reliable detection of overeating patterns. We employed the semi-supervised clustering pipeline and evaluated cluster separability using the silhouette score for 2 to 35 clusters. The separability was visually confirmed through 2D projection using UMAP. The pipeline, applied to the entire dataset of both normal and overeating meals, identified 30 distinct clusters, with a maximum silhouette score of 0.53 and a homogeneity score of 0.66 (Supplementary Fig. 6). To define a cluster as an overeating cluster, we set a threshold of 0.05 for the proportion of total overeating instances.
Visual inspection further confirmed separability in five predominant clusters, each characterized by a high proportion of overeating instances (Supplementary Fig. 7). The final clustering solution achieved a mean purity of 81.4%, a cumulative proportion of overeating instances of 0.85, and an entropy score of 0.36 across the five overeating clusters, further validated using GMM (Supplementary Note 2). Additionally, the silhouette score of 0.59 supported the coherence and distinctiveness of the identified clusters.
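The cluster-level quantities above can be illustrated with a short sketch. It assumes that purity is the share of the majority class within a cluster, that entropy is the binary Shannon entropy of the overeating proportion, and that a cluster is flagged as an overeating cluster when it holds at least 5% of all overeating instances. These readings, the function name, and the toy data are assumptions rather than the authors' implementation.

```python
from math import log2


def cluster_summary(labels, clusters, min_share=0.05):
    """Summarize each cluster given 0/1 overeating labels and cluster ids."""
    total_over = sum(labels)
    summary = {}
    for c in sorted(set(clusters)):
        members = [y for y, k in zip(labels, clusters) if k == c]
        n_over = sum(members)
        p = n_over / len(members)           # overeating proportion in cluster
        purity = max(p, 1 - p)              # majority-class share
        entropy = sum(-q * log2(q) for q in (p, 1 - p) if q > 0)
        share = n_over / total_over         # share of all overeating meals
        summary[c] = {
            "purity": purity,
            "entropy": entropy,
            "share": share,
            "overeating_cluster": share >= min_share,
        }
    return summary


# Toy data (assumed): ten meals assigned to three clusters.
labels = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
clusters = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
summary = cluster_summary(labels, clusters)
```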
Z-score analysis of overeating clusters
We applied z-score analysis to highlight differences within each cluster while identifying shared patterns and overarching themes across clusters. For contextual and psychological factors within each cluster, we selected features with z-scores exceeding the predefined cut-off (\({|z|}\ge 1\)), allowing us to identify the dominant co-occurring factors within the same clusters and uncover shared patterns across clusters. Figure 4 presents a polar bar plot illustrating the z-scores for each contextual and psychological factor across all overeating clusters. We characterized each phenotype based on which feature-level z-scores were large in magnitude for each cluster, highlighting the distinctive characteristics of the cluster. Cluster labels (or names) were assigned to concisely reflect the key factors of each cluster and to provide an intuitive understanding of the overeating patterns observed in the data. A detailed analysis of all contextual factors exceeding the z-score threshold is provided in Table 2. Based on these results, we characterized the following overeating phenotypes:
-
Take-out Feasting
Preference for indulging in restaurant-sourced meals (take-out), often enjoyed in a social setting, emphasizing the social aspect of shared dining experiences.
-
Evening Restaurant Reveling
Pleasure-driven indulgence in food, with a preference for restaurant-sourced meals (dine-in), typically consumed in the evening as part of social dining experiences.
-
Evening Craving
Eating in the evening, often involving self-prepared meals and characterized by hunger, serving as a way to unwind at the end of day.
-
Uncontrolled Pleasure Eating
Focus on the hedonic aspect of food, involving eating for pleasure, often perceived as overeating with loss of control, and accompanied by task-oriented distractions.
-
Stress-driven Evening Nibbling
Eating in the evening in response to stress and feelings of loneliness.
The radial axis represents the magnitude of the z-scores, with bold labels indicating features with a \({|z|}\ge 1\). The circumference is divided into five phenotype clusters, each represented by a unique color, highlighting the key features that differentiate these clusters. This visualization provides insights into the characteristic features of each overeating phenotype, such as psychological factors, contextual influences, and behavioral patterns, enabling a deeper understanding of their unique profiles.
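The feature selection behind these phenotype descriptions can be sketched as follows. The text does not spell out how the z-scores were standardized, so this sketch assumes each z-score compares a cluster's feature mean with the mean and population standard deviation over all meals. The function name, feature names, and toy values are hypothetical.

```python
from statistics import mean, pstdev


def cluster_z_scores(features, cluster_rows):
    """z = (cluster mean - overall mean) / overall SD, per feature (assumed form)."""
    z = {}
    for name, values in features.items():
        mu, sd = mean(values), pstdev(values)
        cluster_mean = mean(values[i] for i in cluster_rows)
        z[name] = (cluster_mean - mu) / sd if sd > 0 else 0.0
    return z


# Toy binary EMA features for eight meals (assumed, not study data).
features = {
    "evening": [0, 0, 0, 0, 1, 1, 1, 1],
    "stress": [1, 0, 1, 0, 1, 0, 1, 0],
    "social": [1, 1, 1, 1, 0, 0, 0, 0],
}
cluster_rows = [4, 5, 6, 7]  # meals belonging to one cluster

z = cluster_z_scores(features, cluster_rows)
# Keep only features that meet the |z| >= 1 cut-off used in the text.
dominant = {name: score for name, score in z.items() if abs(score) >= 1.0}
```

In this toy cluster, evening eating is over-represented (z = 1) and social eating is under-represented (z = -1), while stress does not differ from the overall mean and is dropped.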
Discussion
In the face of a global obesity crisis imposing staggering healthcare costs on societies, the need for innovative and effective interventions has never been more critical. Obesity treatment presents the unique challenge of modifying a core behavior—eating—that is essential for survival yet influenced by several factors. Research indicates that eating is affected not only by physiological hunger but also by psychological factors such as stress and emotions, social interactions, and environmental cues14,15. The complex nature of obesity necessitates a nuanced understanding of eating behaviors and their surrounding influences. Recognizing this complexity, the health sector is increasingly embracing innovative technologies such as wearable sensors and EMA to detect, monitor, and interpret eating behaviors16,17. These tools offer promising avenues for capturing detailed data on dietary intake and contextual factors, enabling more personalized and effective interventions. By leveraging these advancements, researchers can gain valuable insights into overeating patterns and develop strategies that address the multifaceted nature of eating behaviors, ultimately advancing our efforts to combat obesity.
Findings from our study using EMA data revealed that light refreshments, pre-meal biological hunger, perceived overeating, and pleasure-driven desire for food were key factors influencing overeating. Our results demonstrate that biological hunger and pre-meal pleasure-driven desire (i.e., appetite) reflect distinct influences on eating behavior. Biological hunger is a physiological drive based on energy requirements, whereas pleasure-driven desire leads to food intake based on hedonic value. Prior research supports that despite satiation, hedonic factors contribute to excessive intake when highly palatable foods are available18,19. Additionally, our finding of a negative association between light refreshments and overeating suggests that smaller meals may help prevent overeating, aligning with existing literature20. Perceived overeating positively contributed to objective overeating, highlighting the role of self-awareness. Because perceived overeating was assessed post-meal, this suggests individuals can reliably reflect on their eating behavior after consumption and may benefit from interventions that enhance this self-awareness, such as by using mobile health (mHealth) apps to track eating behaviors, provide feedback, and promote mindfulness when eating21,22.
Given the significance of eating speed and bite and chewing patterns in influencing overeating23,24, our analysis of bite and chew counts indicates the potential for defining data-driven thresholds beyond which the likelihood of overeating substantially increases, as demonstrated by SHAP values. Specifically, our results show that at thresholds of approximately 500 chews and 75 bites per meal, SHAP values increase rapidly, signifying an elevated risk of overeating (Supplementary Figs. 8a, c). These findings could inform early interventions targeting individuals who exceed these thresholds. Notably, the model’s AUROC of 0.69 and AUPRC of 0.69 for passive sensing data alone demonstrate moderate predictive performance, supporting the potential for real-time feedback systems to encourage users to reflect on their eating behavior. Such systems could translate complex patterns—such as elevated chew rate or bite count—into simple, actionable feedback or nudges (e.g., timely reminders to slow down), while also providing an objective measure for evaluating the success of various behavioral nutrition interventions. Adding EMA features further enhanced the model’s performance, increasing the AUROC to 0.86 and the AUPRC to 0.84, substantially improving predictive accuracy.
We also found that a high number of chews—around 500 or more—might indicate a prolonged meal duration, potentially leading to overeating due to extended exposure to food (see Supplementary Fig. 8a). Conversely, when the number of chews is relatively low, approaching approximately 100, the SHAP values also increase, suggesting that a large amount of food was consumed rapidly (see Supplementary Fig. 8b). This rise in SHAP values at a lower number of chews corresponds to a higher bite rate, serving as a proxy for eating pace. A higher bite rate in these instances indicates that food was being consumed more quickly, further elevating the likelihood of overeating during these meals. Furthermore, our SHAP analysis of the chew-bite ratio (Supplementary Fig. 8d) indicates that a higher average number of chews per bite predicts a lower likelihood of overeating. This finding suggests that thoroughly chewing food might enhance satiety and reduce the risk of overeating, contributing to prior literature reporting that increased chewing promotes fullness and regulates appetite hormones25,26.
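To make the micromovement features concrete, the sketch below derives the chew-bite ratio (average chews per bite, as described above) together with per-minute bite and chew rates from meal-level counts. The exact feature definitions used in the study are not given here, so the rate formulas, the function name, and the example meals are assumptions.

```python
def micromovement_features(n_chews, n_bites, duration_min):
    """Derive simple pace features from labeled chew and bite counts."""
    if n_bites <= 0 or duration_min <= 0:
        raise ValueError("n_bites and duration_min must be positive")
    return {
        "chew_bite_ratio": n_chews / n_bites,        # average chews per bite
        "bite_rate_per_min": n_bites / duration_min,  # proxy for eating pace
        "chew_rate_per_min": n_chews / duration_min,
    }


# Two hypothetical meals with the same number of bites.
slow_meal = micromovement_features(n_chews=480, n_bites=40, duration_min=20.0)
fast_meal = micromovement_features(n_chews=100, n_bites=40, duration_min=5.0)
```

The second meal has far fewer chews per bite and a much higher bite rate, which corresponds to the rapid-consumption pattern discussed above.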
Through clustering analysis, we uniquely identified distinct overeating phenotypes that align with well-documented behaviors in the literature, offering insights into the drivers of overeating and their potential implications for intervention strategies. The “Evening Craving” and “Stress-driven Evening Nibbling” phenotypes reflect circadian-driven eating and emotional eating behaviors, respectively. The “Evening Craving” phenotype, characterized by nighttime eating of self-prepared meals and driven by biological hunger as a way to unwind, aligns with research on circadian rhythm disruptions influencing eating patterns. Studies have indicated that eating later in the day or at night can lead to increased hunger and a preference for energy-dense foods27. For instance, Goel et al.28 found that individuals with delayed circadian timing exhibited heightened appetite in the evening, which may promote late-night eating. This behavior is also consistent with findings on Night Eating Syndrome (NES), where individuals consume a significant portion of their daily intake during the night due to altered circadian eating rhythms29,30. These findings suggest that interventions targeting meal timing, such as structured eating schedules and circadian-based dietary strategies, may help mitigate the metabolic consequences of eating later in the day.
Our operationalization of evening eating diverges from the traditional definition of NES, which involves nocturnal awakenings to eat due to hunger or emotional distress, accompanied by heightened stress levels before and after the meal. Given the limitations of our data, we categorized evening eating as consumption occurring between 5 p.m. and 6 a.m., without the capacity to objectively assess nocturnal awakenings or associated stress responses.
The “Stress-driven Evening Nibbling” phenotype reflects emotional eating patterns widely reported in the literature. Emotional eating involves consuming food in response to negative emotions rather than physiological hunger, often leading to overeating, particularly of high-calorie “comfort” foods. Research indicates that stress can elevate cortisol levels, which increases cravings for energy-dense foods and can trigger overeating later in the day31,32. Studies33,34 demonstrated that individuals under stress are more likely to engage in emotional eating, using food as a coping mechanism to alleviate negative feelings such as loneliness or anxiety. While most studies emphasize the impact of stress on increased eating, some research suggests that stress can also lead to reduced appetite in certain individuals32. We observed a group of meals that were not classified as overeating episodes but had some of the highest pre- and post-meal stress levels. These instances often involved snacking rather than consuming large meals, suggesting a pattern related to undereating or altered eating behaviors under stress. This indicates that stress responses can be heterogeneous, and individual differences may influence whether stress leads to overeating or undereating. Strategies like mindfulness-based stress reduction and structured coping mechanisms have been shown to help regulate emotional eating behaviors and mitigate stress-related disruptions in appetite35.
Furthermore, the “Evening Restaurant Reveling” phenotype reflects the concept of social facilitation of eating, where the presence of others affects food intake. Our results show that this phenotype specifically involves eating with family and friends, indicating that individuals are more likely to overeat in comfortable social settings. Studies have shown that individuals tend to consume more food when eating in groups with friends compared to eating alone or with strangers, owing to extended meal duration and the influence of social norms. This pattern is partly explained by behavioral mimicry, where individuals adjust their eating to match that of their companions36,37,38. However, prior literature suggests that individuals with obesity are less likely to overeat in groups where they feel less comfortable39.
Similarly, the “Take-out Feasting” phenotype aligns with research on convenience eating and the impact of readily accessible, energy-dense foods on consumption patterns. The accessibility of take-out or fast food, combined with social settings, has been shown to lead to overconsumption due to larger portion sizes and the high palatability of foods40,41. Cohen et al.42 discussed how environmental factors, such as the ubiquity of fast-food outlets and marketing strategies, contribute to automatic eating behaviors that override internal hunger cues, especially in social situations where food is a central component.
In our analysis, we identified another distinct overeating phenotype—“Uncontrolled Pleasure Eating”—characterized by overeating for pleasure and a loss of control during tasks such as work or study. The inclusion of loss of control in this phenotype suggests a deeper psychological component, where external cues or emotional states trigger compulsive eating behaviors43. Environments associated with work or study may contribute to this behavior, as cognitive load can impair self-regulation, leading to mindless eating and a diminished ability to control food intake44. Overall, these phenotypes highlight the multifaceted nature of overeating behaviors, encompassing circadian misalignment, emotional stress, social influences, environmental convenience, and psychological factors such as loss of control. Understanding these distinct patterns is essential for designing targeted interventions that address the underlying mechanisms of overeating and support more effective treatments to reduce overeating.
The phenotypes were derived at the meal level, as our primary objective was to characterize meal-based overeating patterns. Because individuals may exhibit multiple overeating phenotypes across different meals, a single individual-level classification can obscure the nuanced interplay of factors, which prior research has shown to overlap rather than remain mutually exclusive45,46. While this study focuses on meal-level classification, future work should explore personalized phenotype trajectories over time, potentially leading to more individualized frameworks for identifying and intervening on shifting overeating behaviors.
Another avenue of research involves examining these phenotypes in relation to clinically relevant variables (e.g., BMI). Because all participants in this study met obesity criteria, BMI variability was limited, precluding robust analyses of phenotype-BMI associations. Future work could assess whether specific phenotypes correlate with broader health markers or weight trajectories, offering deeper insights into the underlying drivers and progression of overeating and obesity.
A strength of this research was the use of a personalized, objective measure of overeating. Overeating in the literature is often defined subjectively, relying on an individual’s perception of whether they consumed more than needed11, and while this is important, it has long been known that subjective overeating is fraught with recall and participant bias, resulting in errors47. Moreover, overeating is typically reported over extended periods (days, weeks, or months), with little attention given to overeating at the meal level. Although overeating over long periods contributes to weight gain, the promise of sensors48 and EMA10 makes timely interventions feasible. Without a solid meal-level definition of overeating, however, we may not be able to identify when, where, and how to intervene effectively. We define overeating episodes as those where an individual’s energy intake exceeds their personal meal/snack average by one standard deviation. Fixed thresholds49 (e.g., meals over 1,000 kcal) do not adjust for differences in BMI, sex, or typical eating habits, potentially misclassifying normal consumption as overeating for some individuals while overlooking it in others. Our method is also consistent with participants’ own perceptions of overeating, as people often consider themselves to have overeaten when they consume more than what is typical for them personally50. This approach assumes relatively stable meal patterns and energy intake over the 14-day study period. Future studies of longer duration might segment data into shorter intervals for recalculating z-scores or employ dynamic methods (e.g., rolling window averages, adaptive Bayesian models) to continuously update reference distributions and account for evolving eating behaviors over time.
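The personalized definition can be sketched in a few lines. This is a simplified illustration, not the study's code: it pools all of a participant's eating events rather than handling meals and snacks separately, uses the sample standard deviation (an assumption), and relies on hypothetical participant ids and calorie values.

```python
from statistics import mean, stdev


def flag_overeating(kcal_by_participant):
    """Flag meals whose energy intake exceeds the person's mean by one SD."""
    flags = {}
    for pid, kcals in kcal_by_participant.items():
        threshold = mean(kcals) + stdev(kcals)  # needs at least two meals
        flags[pid] = [k > threshold for k in kcals]
    return flags


# Hypothetical intakes (kcal per meal) for two participants.
intakes = {
    "p01": [400, 500, 600, 1200],
    "p02": [900, 1000, 1100, 1000],
}
flags = flag_overeating(intakes)
```

In this toy example, the 1,000 kcal meals of participant p02 are typical for that person and are not flagged, whereas a fixed 1,000 kcal cut-off would treat them as borderline overeating.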
A further strength of this work is the use of advanced machine learning techniques through a clustering approach that captures complex data representations, enabling deeper insights into nuanced eating patterns. While clustering algorithms are inherently sensitive to data characteristics, we mitigated potential limitations by leveraging a DNN encoder to extract nonlinearities from the features and learn latent representations. Additionally, UMAP preserved the intrinsic data topology without distorting the overall data structure. Through two-dimensional projection visualization, we confirmed cluster separability with minimal overlap. While we acknowledge the necessity of testing our method on an independent dataset to assess its generalizability, our novel pipeline effectively addresses common clustering pitfalls, enhancing the reliability and validity of our findings. Moreover, we note that addressing class imbalance with SMOTE can introduce synthetic examples that may not fully represent the true data distribution. While critiques of SMOTE are valid, our Supplementary Note 3 provides further analyses indicating its continued utility in this context51.
Our study additionally distinguishes itself by observing individuals with obesity in their natural settings, encouraging them to maintain their usual routines. This approach captures authentic, often problematic eating habits, which we redefine using digital longitudinal data. By characterizing distinct overeating phenotypes via clustering methodologies, we can stratify individuals during a run-in phase, enabling more targeted phenotype-specific interventions that improve both effectiveness and scalability in addressing obesity. Although we prioritized accurate caloric intake estimates via 24-hour recalls collected by dietitians, the actual classification of overeating episodes was data-driven, allowing the process to be automated for broader adoption. For large-scale implementation, mobile applications and self-reporting tools could estimate caloric intake while wearable devices passively track behavioral and physiological signals, reducing user burden. Future work could evaluate the trade-offs between accuracy and feasibility to ensure that these automated methods align with established dietary assessments, enabling timely, phenotype-specific interventions to help individuals reduce their propensity to overeat.
Despite these notable strengths, some additional limitations warrant discussion. Reactivity to continuous measurement is a recognized concern in studies using wearable cameras and EMA methodologies. To address this, participants were instructed to “act naturally” and maintain their usual routines, and the devices were designed to be small, lightweight, and require minimal user input, thus reducing awareness of data collection. The two-week study duration also allowed participants to acclimate to the procedures, further diminishing potential reactivity. In addition, prior research suggests that activity-oriented cameras can reduce perceived surveillance and social discomfort, helping to mitigate reactivity52. Nevertheless, some level of reactivity is inevitable, even with these precautions. EMA can reduce biases compared to other self-report methods53, yet it remains vulnerable to underreporting or stigma54. Integration of wearable sensors and cameras provides an objective means to validate EMA by capturing micromovement patterns, meal timing, dietary composition, and other consumption behaviors16,48. Discrepancies between self-reported EMA data and sensor-derived metrics may indicate inaccuracies arising from stigma or recall bias. Moving forward, triangulating data from EMA, passive sensing, and dietitian-led recalls can yield a more comprehensive and validated dataset that improves our understanding of human behavior.
In conclusion, there has been growing interest in examining phenotypes of overeating. Our study supports a promising new direction by utilizing both EMA and sensor data to lay the foundation for testing timely personalized interventions that mitigate overeating.
Methods
To improve the accuracy of dietary intake assessments, we incorporated passive data collection methods using wearable cameras. Wearable cameras provide objective, direct verification of meals through recorded footage with validated timestamps, capturing both meal timing and dietary composition. Additionally, participants completed 24-hour dietitian-administered dietary recalls, which included a photo-assisted feature to minimize reporting biases and ensure professional oversight of caloric intake estimations55. To analyze the collected data, we applied advanced machine learning techniques to identify patterns in eating behavior. Specifically, we used wearable device data and machine learning models to detect meal timing and examine the relationships between behavioral, psychological, and contextual predictors of overeating episodes. This methodological approach enables a comprehensive analysis of the co-occurring factors that contribute to overeating.
Study design
The SenseWhy study involved 65 adult participants with obesity (BMI ≥ 30 kg/m²) residing in the Chicago Metropolitan Area, and was conducted over a 14-day (two-week) period56. Each participant received a sensing suite, including a chest-worn, activity-oriented wearable camera secured with a neck-worn lanyard and an under-the-shirt magnetic pad. The camera featured an upward-facing sensor array that captured video and thermal images of the face and surrounding areas, enabling visual tracking of eating behaviors in naturalistic settings. An infrared sensor enhanced data quality, particularly for nighttime or low-light eating conditions. Participants could pause recording to protect bystander privacy when others were present. Further details on device placement and orientation are illustrated in Supplementary Fig. 9. Participants used a customized mobile application, installed on their personal smartphones, to record both dietary intake and event-based EMA surveys. The daily number of EMAs varied according to individual eating frequency, averaging approximately 4 entries per day. For each eating event logged, participants completed two pre-meal surveys and one post-meal survey. The pre-meal surveys—administered at the “Decided To” and “About To” stages—collectively consisted of 8 EMA items along with a meal image capture component for food content reporting. Following the meal, participants completed a post-meal survey that similarly comprised 8 EMA items, a meal image capture, and content reporting. This self-initiated, event-based design aimed to capture real-time, context-specific data while minimizing unnecessary notifications or disruptions.
We used the validated multiple-pass method with the Nutritional Data System for Research (NDSR)57 to collect 24-hour dietary recalls. This approach includes five steps: (1) listing all foods and beverages, (2) reviewing for forgotten items, (3) gathering additional details, (4) probing for forgotten foods, and (5) reviewing the entire recall58. The before- and after-meal photographs and brief descriptions for each eating event were provided to the dietitians via a web-portal to assist during recalls. Image-assisted dietary recall methods have been shown to reduce underreporting and recall bias59. Images were processed by trained dietary interviewers as an initial step, with subsequent steps completed during the recall interview. Although these images were not used for calorie estimation, they helped confirm foods consumed, thereby enhancing recall accuracy.
This study was approved by the Institutional Review Board (IRB) at Northwestern University under protocol number STU00204564, ensuring compliance with ethical standards.
Operationalization and validation of meal data collection
Self-reported EMA data on meals were collected using a mobile application. To validate caloric intake and meal composition, data from 24-hour dietary recalls and self-reported app data were merged by matching meals from both approaches within 60-minute windows. This process resulted in an EMA-only dataset containing validated caloric intake for each meal. From the camera footage, trained annotators labeled the precise start and end times of each meal following a standardized protocol. An eating episode was defined as a series of consecutive feeding gestures with intervals not exceeding 15 min, based on definitions from previous research60. Drinking gestures were also labeled separately. Additionally, fine-grained annotations of chews and bites were conducted. A bite was defined as a large jaw open/close movement when food touches the mouth, followed by chews defined as consecutive small jaw open/close movements.
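The episode definition above (consecutive feeding gestures separated by no more than 15 min) can be illustrated with a minimal sketch. The function name, the timestamp representation, and the example gesture times are assumptions made for illustration only; the study's labels were produced by human annotators following its own protocol.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=15)  # largest allowed interval between consecutive gestures


def group_into_episodes(gesture_times, max_gap=MAX_GAP):
    """Group feeding-gesture timestamps into (start, end) eating episodes."""
    episodes = []
    for t in sorted(gesture_times):
        if episodes and t - episodes[-1][1] <= max_gap:
            episodes[-1][1] = t  # gap is within the limit: extend the current episode
        else:
            episodes.append([t, t])  # gap is too large: start a new episode
    return [(start, end) for start, end in episodes]


# Hypothetical gesture times, in minutes after noon.
base = datetime(2020, 1, 1, 12, 0)
gestures = [base + timedelta(minutes=m) for m in (0, 5, 18, 40, 44)]
episodes = group_into_episodes(gestures)
for start, end in episodes:
    print(start.time(), "to", end.time())
```

The 22-minute gap between the third and fourth gestures exceeds the limit, so the five gestures form two episodes.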
To accurately associate meals with the camera footage labels, we merged each meal using the start and end time labels from all data sources. We visually confirmed these associations by comparing the contextual self-reported data and the food content from the dietitian-administered 24-hour dietary recall with the camera footage. These validated meals formed a dataset comprising EMA responses and passive sensing data. We defined an overeating episode as any meal that met or exceeded a z-score of one (that is, one standard deviation above the mean) relative to an individual’s usual daily energy intake56,61,62. By analyzing the calorie distribution of all meals an individual consumed, we estimated a personalized cutoff at a z-score of one for typical meal-level consumption. Overeating was then operationalized as a dichotomous variable, with 1 indicating an overeating meal and 0 representing a non-overeating meal. Supplementary Fig. 10 displays 14-day eating profiles for several individuals, showing energy intake for all consumed meals.
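As a rough illustration of the personalized cutoff, the sketch below labels one participant's meals against the mean plus one standard deviation of that participant's own meal-level calories. The function name and the calorie values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def label_overeating(meal_kcal, z_cutoff=1.0):
    """Label each meal 1 if it meets or exceeds mean + z_cutoff * SD of this person's meals."""
    mu = mean(meal_kcal)
    sd = stdev(meal_kcal)  # assumption: sample SD; the source does not specify
    threshold = mu + z_cutoff * sd
    labels = [1 if kcal >= threshold else 0 for kcal in meal_kcal]
    return labels, threshold


# Hypothetical meal-level energy intake (kcal) for a single participant.
meals = [400, 500, 600, 500, 1200]
labels, threshold = label_overeating(meals)
print(labels, round(threshold, 1))
```

Because the threshold is computed per participant, the same 1200 kcal meal could be labeled differently for someone whose typical meals are larger.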
Preprocessing and feature extraction
We extracted psychological, contextual, and behavioral features known to be associated with obesity for each meal from various data sources. Supplementary Table 3 provides detailed descriptions of EMA questions, camera-derived features, and the timing of data collection (before or after an eating event). Psychological features were collected using EMA questionnaires administered through the study’s mobile application. Participants responded to Likert-scale questions related to stress, emotions, affect, and hunger before and after each meal.
Contextual features were derived from EMA questions focusing on social and environmental triggers during meals. To streamline the raw responses and address correlations among different answers, we preprocessed the data to consolidate responses into specific categories. Meals cooked from scratch, prepared using ingredient delivery services, or made from frozen prepackaged foods were classified as self-prepared meals. Meals obtained from restaurants were labeled as restaurant-sourced meals, while meals consisting solely of snacks, cereals, or beverages were designated as light refreshments. Participants’ activities during meals were categorized based on whether they were engaged in other tasks. If participants were socializing, watching TV, working, studying, or driving while eating, these instances were classified as task-oriented distractions. If they were not engaged in any of the activities listed above, this was termed focused eating. Meals consumed alone were labeled as solo dining, whereas those eaten with others were termed social dining experiences. Evening eating was defined as any meal consumed between 5 p.m. and 6 a.m., based on meal distribution patterns observed in our dataset.
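The consolidation rules above can be sketched as a small mapping function. All raw response labels (for example `frozen_prepackaged` or `watching_tv`) and the output field names are hypothetical stand-ins for the actual EMA response options, and the treatment of the exact 5 p.m. and 6 a.m. boundaries is an assumption.

```python
from datetime import time

# Hypothetical raw EMA response labels; the study's real option names may differ.
SELF_PREPARED = {"cooked_from_scratch", "ingredient_delivery", "frozen_prepackaged"}
LIGHT_REFRESHMENT = {"snack", "cereal", "beverage"}
DISTRACTIONS = {"socializing", "watching_tv", "working", "studying", "driving"}


def is_evening(meal_time):
    """Evening eating: 5 p.m. up to (not including) 6 a.m.; the window wraps past midnight."""
    return meal_time >= time(17, 0) or meal_time < time(6, 0)


def categorize_meal(source, activities, n_companions, meal_time):
    """Collapse raw responses for one meal into the consolidated contextual categories."""
    if source in SELF_PREPARED:
        meal_source = "self_prepared"
    elif source == "restaurant":
        meal_source = "restaurant"
    elif source in LIGHT_REFRESHMENT:
        meal_source = "light_refreshment"
    else:
        meal_source = "other"
    distracted = bool(DISTRACTIONS & set(activities))
    return {
        "source": meal_source,
        "attention": "task_oriented_distraction" if distracted else "focused_eating",
        "company": "solo" if n_companions == 0 else "social",
        "evening": is_evening(meal_time),
    }


example = categorize_meal("frozen_prepackaged", ["watching_tv"], 0, time(23, 30))
print(example)
```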
Behavioral features were extracted from fine-grained annotations obtained through detailed analysis of camera footage for each meal, using a standardized annotation protocol. These features included the total number of chews (chewing motions during a meal), the total number of bites (instances where food was brought to the mouth), the chew rate (number of chews divided by the meal duration in minutes), the bite rate (number of bites divided by the meal duration in minutes), the chew-bite ratio (ratio of total chews to total bites), and the meal duration itself, validated through camera recordings. Annotation was performed by trained raters from a third-party professional labeling service, following a structured training program developed by the research team. Inter- and intra-rater agreement was monitored through an internal auditing system, periodic quality checks, and final validation by the research team. Further details on the annotation protocol, rater training, and quality control measures are provided in Supplementary Note 4. Building upon the extracted features, we conducted our analysis at three levels to evaluate the predictive capabilities of different data subsets. The first level, termed the passive-sensing-only analysis, included only the behavioral features obtained from the camera footage. The second level, the EMA-only analysis, utilized the psychological and contextual features derived from the EMA questionnaires. The third level, the feature-complete analysis, incorporated all available features. A detailed description of the analytical pipeline is shown in Fig. 5.
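The rate and ratio features defined above follow directly from three annotated quantities per meal. A minimal sketch, assuming the counts and duration are already available; the function and field names are illustrative rather than taken from the study's code:

```python
def behavioral_features(n_chews, n_bites, duration_min):
    """Derive the camera-based behavioral features for one meal from annotated counts."""
    if duration_min <= 0:
        raise ValueError("meal duration must be positive")
    return {
        "n_chews": n_chews,
        "n_bites": n_bites,
        "duration_min": duration_min,
        "chew_rate": n_chews / duration_min,  # chews per minute
        "bite_rate": n_bites / duration_min,  # bites per minute
        # Ratio is undefined for a meal with no annotated bites.
        "chew_bite_ratio": n_chews / n_bites if n_bites else float("nan"),
    }


# Hypothetical meal: 600 chews and 40 bites over 20 minutes.
feats = behavioral_features(600, 40, 20.0)
print(feats["chew_rate"], feats["bite_rate"], feats["chew_bite_ratio"])
```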
Fig. 5 (a) Conceptual framework for an integrated ML pipeline aimed at identifying and addressing overeating phenotypes in practice. The pipeline begins with data collection, integrating multimodal sources such as sensor data, EMA data, and dietary recalls to capture a comprehensive view of eating behaviors. This is followed by preprocessing to clean and harmonize the data, ensuring consistency across modalities, and feature extraction to derive key indicators such as chew rate, bite frequency, and emotional states associated with eating episodes. The next stage involves the development of ML models for overeating detection and phenotype ideation. Supervised learning models identify key features predictive of overeating episodes, leveraging behavioral and psychological features, while clustering techniques group individuals into distinct overeating phenotypes based on shared behavioral and contextual patterns. Once phenotypes are characterized, they can be integrated into personalized treatment strategies, tailoring interventions to address specific overeating patterns (context- and behavior-driven overeating). These treatments may include real-time feedback systems to prompt users to reflect on their behaviors, along with recommendations for sustainable behavioral changes. The framework culminates in system deployment, where real-time feedback and monitoring enable continuous assessment of eating behaviors and treatment efficacy. Data from deployed systems can feed back into the pipeline, enabling refinement and validation of models and interventions over time. This iterative process supports the practical application of overeating phenotype identification and management in real-world settings, creating a closed-loop system for adaptive and effective health interventions. (b) Methodological approach used in this study. The process begins with data preparation, including the labeling and validation of meal times from sensor data and the integration of psychological and contextual factors from EMA data and 24-hour dietary recalls. Overeating detection is performed using supervised models, incorporating SMOTE for imbalanced data and Bayesian optimization for fine-tuning. A semi-supervised clustering approach identifies overeating phenotypes, leveraging a non-linear encoder, UMAP for dimensionality reduction, and K-means clustering with z-score analysis for phenotype characterization. Evaluation metrics include AUROC, AUPRC, and Brier score loss for model performance, SHAP for interpretability, and clustering metrics such as silhouette score, homogeneity, and entropy.
Overeating detection: supervised machine learning
Prior to training machine learning models, we first performed feature preprocessing, including standardization of continuous variables (subtracting the mean and dividing by the standard deviation) to ensure zero mean and unit variance. For each subset of features (EMA-only, passive-sensing-only, and feature-complete), we implemented a 5-fold cross-validation procedure, stratified by class (overeating vs. non-overeating) to maintain class proportions within each fold at the meal level. Within each fold, the data were randomly split into training (60%), validation (20%), and test (20%) sets, ensuring no meal data cross-contamination between the sets. To assess the impact of meal-based splitting, we conducted an additional participant-level split analysis where participants were assigned exclusively to one of the training, validation, or test sets. This evaluation was conducted to assess whether the meal-based and participant-based splitting strategies yielded comparable results. Full details of this analysis are provided in Supplementary Note 5. To address the imbalance between overeating and non-overeating labeled meals, we applied the Synthetic Minority Oversampling Technique (SMOTE)63 to the training set. This ensured that the model learned from an equal representation of overeating and non-overeating meals, enhancing its ability to generalize across both categories.
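To make the oversampling step concrete, the following is a simplified, dependency-free sketch of the core SMOTE idea: each synthetic sample is placed at a random position on the line segment between a minority-class point and one of its nearest minority-class neighbours. It is not the implementation used in the study, and the function name, parameters, and toy data are assumptions. As in the procedure described above, such oversampling would be applied to the training split only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority-class neighbours of x, excluding x itself
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x towards nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy training split: 4 overeating meals (minority) against 10 non-overeating meals.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
n_majority = 10
new_points = smote(minority, n_majority - len(minority), k=2)
print(len(minority) + len(new_points), "minority samples after oversampling")
```

Because every synthetic point lies between two existing minority points, the oversampled class stays within the region already occupied by real overeating meals.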
We evaluated several machine learning models, including XGBoost, support vector machines (SVM), and Naïve Bayes in order to predict overeating episodes. Model training was performed on the upsampled training set using SMOTE. We performed hyperparameter optimization using Bayesian optimization64, utilizing the validation set to evaluate model performance during the optimization process. Final performance metrics were derived by averaging the results across the five folds. We also reported the standard deviation of these results, computed from the between-fold results, to assess the variability and robustness of the model’s performance. Ultimately, we selected the best-performing model based on the final performance metrics.
Supervised evaluation and explainability
Model performance was evaluated on the test set within each fold of the cross-validation procedure, using multiple evaluation metrics, including the Area Under the Receiver Operating Characteristic Curve (AUROC), the Area Under the Precision-Recall Curve (AUPRC), and the Brier loss score. We reported the mean and standard deviation of these metrics across the five cross-validation folds. Furthermore, we employed SHapley Additive exPlanations (SHAP)65 to interpret the models and identify the importance and directionality of individual features contributing to the prediction of overeating outcomes. SHAP analysis was conducted on the best-performing fold for each dataset (EMA-only, passive-sensing-only, and feature-complete), allowing us to understand how each feature influenced the model’s predictions and to uncover the underlying factors associated with overeating episodes. The results from SHAP also provided insights into the relative importance of behavioral, psychological, and contextual factors in predicting overeating episodes.
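For readers less familiar with these metrics, the sketch below computes two of them from first principles on a toy example. The Brier score is the mean squared difference between predicted probability and outcome, and the AUROC is the probability that a randomly chosen overeating meal receives a higher score than a randomly chosen non-overeating meal. The labels and predicted probabilities are invented for illustration and are not study results.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_score):
    """Share of positive/negative pairs ranked correctly; ties count as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented test-fold labels and predicted overeating probabilities.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
auc_value = auroc(y_true, y_prob)
brier_value = brier_score(y_true, y_prob)
print(auc_value, brier_value)
```

In this example three of the four positive/negative pairs are ordered correctly, giving an AUROC of 0.75.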
Overeating phenotype extraction: semi-supervised machine learning
We focused on EMA-based data in phenotype generation primarily to ensure comparability with past and future studies, facilitating broader implementation in future interventions. This approach also provided the largest set of recorded meals, thereby increasing statistical power to capture a wider range of overeating episodes and enabling the identification of more distinct behavioral patterns. To uncover the intrinsic structure of our data and enhance interpretability, we implemented a semi-supervised learning pipeline that combined dimensionality reduction and clustering techniques. Dimensionality reduction was performed through a two-step process. First, we employed a deep neural network architecture consisting of a feedforward multilayer perceptron (MLP) serving as a non-linear encoder. This encoder transformed the high-dimensional input data \({\rm{X}}\in {\mathbb{R}}^{n\times d}\) into a lower-dimensional latent representation \({\rm{Z}}\in {\mathbb{R}}^{n\times h}\), where \(n\) is the number of samples, \(d\) is the number of original features, and \(h\) is the dimension of the reduced feature space. The encoder compresses the input by learning a compact latent representation, ensuring that \(h < d\) while retaining the most informative aspects of the data. The encoder captured non-linear relationships within the data, passively positioning similar data points closer together in the latent space while pushing dissimilar samples apart. The transformation is given in Eq. (1):
\[{\rm{Z}}={f}_{{encoder}}({\rm{X}};\Theta )\qquad (1)\]
where \({f}_{{encoder}}\) denotes the non-linear function parameterized by weights \(\Theta\) learned during training.
Subsequently, the latent representations \({\rm{Z}}\) obtained from the encoder were further processed using the Uniform Manifold Approximation and Projection (UMAP)66, a non-linear manifold learning technique. UMAP reduced the data to a two-dimensional space \({\rm{Y}}\in {\mathbb{R}}^{n\times 2}\), preserving both local and global data structures while eliminating redundant and irrelevant features. This step facilitated visualization and interpretability of the high-dimensional data by mapping it onto a two-dimensional space, as given in Eq. (2):
\[{\rm{Y}}={\rm{UMAP}}({\rm{Z}})\qquad (2)\]
Following dimensionality reduction, we applied the k-means clustering algorithm to the transformed data \({\rm{Y}}\). To validate the robustness of our clustering results, we performed a sensitivity analysis using an alternative clustering method, Gaussian Mixture Modeling (GMM), which allows for soft assignments where each point has a probability of belonging to multiple clusters. We selected the number of clusters that maximized the silhouette score67, indicating the most appropriate number of clusters. Visual confirmation was performed using the two-dimensional UMAP projection to ensure well-separated clusters corresponding to distinct overeating meals.
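The selection of the number of clusters by silhouette score can be sketched without external libraries. The code below is an illustrative simplification: it uses a hand-written k-means with a deterministic farthest-point initialisation (an assumption chosen for reproducibility, not the study's configuration) and toy two-dimensional points standing in for the UMAP projection.

```python
import math


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d) for d in dists.values())  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy 2D embedding with three well-separated groups of four points each.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
points = (square
          + [(x + 10.0, y) for x, y in square]
          + [(x, y + 10.0) for x, y in square])
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(best_k)
```

A sensitivity check in the spirit of the GMM analysis would repeat the same selection with a different clustering method and compare the chosen number of clusters.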
Z-score analysis for cluster interpretation
To interpret clusters, we computed cluster-level z-scores for each variable. This involved calculating cluster means of a given variable, and then computing a z-score across cluster means for that variable. The resulting value characterizes the deviation of a given cluster from the average value of a given variable in terms of standard deviation units. Specifically, for each feature \(j\) and within each cluster \(i\), we computed the z-score in Eq. (3):
\[{z}_{{ij}}=\frac{{\mu }_{{ij}}-{\mu }_{j}}{{\sigma }_{j}},\qquad {\mu }_{j}=\frac{1}{{\rm{k}}}\sum _{i=1}^{{\rm{k}}}{\mu }_{{ij}}\qquad (3)\]
Here, \({\rm{k}}\) is the total number of clusters, \({\mu }_{{ij}}\) represents the mean of feature \(j\) within cluster \(i\), \({\mu }_{j}\) is the overall mean of cluster-specific means for feature \(j\), and \({\sigma }_{j}\) is the standard deviation of the cluster means for feature \(j\).
This z-score represents how many standard deviations the cluster mean \({\mu }_{{ij}}\) deviates from the mean of cluster means \({\mu }_{j}\) for that feature. Features with high absolute z-scores (e.g., \(|{z}_{{ij}}|\ge 1\)) were considered substantially different from the average cluster mean, indicating their substantial influence on overeating behaviors within that cluster.
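The cluster-level z-score can be computed in a few lines once the per-cluster feature means are available. In this sketch the function name and the table of cluster means are hypothetical, and the population standard deviation across the cluster means is an assumption, because the text does not say whether a population or sample estimator was used.

```python
from statistics import mean, pstdev


def cluster_z_scores(cluster_means):
    """cluster_means[i][j] is the mean of feature j in cluster i; returns z[i][j]."""
    k = len(cluster_means)
    n_features = len(cluster_means[0])
    z = [[0.0] * n_features for _ in range(k)]
    for j in range(n_features):
        column = [cluster_means[i][j] for i in range(k)]
        mu_j = mean(column)  # mean of the cluster-specific means
        sigma_j = pstdev(column)  # assumption: population SD over the k cluster means
        for i in range(k):
            z[i][j] = (column[i] - mu_j) / sigma_j if sigma_j > 0 else 0.0
    return z


# Hypothetical means of two features across three clusters.
means = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
z = cluster_z_scores(means)
# Flag (cluster, feature) pairs whose absolute z-score is at least 1.
flagged = [(i, j) for i, row in enumerate(z) for j, v in enumerate(row) if abs(v) >= 1]
print(flagged)
```

The second feature has identical means in every cluster, so it never distinguishes a cluster and receives a z-score of zero throughout.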
Clustering evaluation metrics
Overeating was not included as a feature in the clustering analysis; instead, we evaluated clusters based on the proportion of overeating episodes within each cluster, following clear criteria outlined below. We evaluated the resulting overeating clusters using per-cluster purity68, which measures the extent to which each cluster contains data points from a single class. The silhouette score was used to assess how well-separated and cohesive the clusters are by comparing intra-cluster and inter-cluster distances. Entropy69 quantifies the uncertainty within each cluster, with lower values indicating higher purity. For overall clustering assessment, homogeneity70 measures how uniform the clusters are with respect to the true class labels across the entire dataset. Lastly, we calculated the proportion of correctly assigned overeating meals to assess the effectiveness of the clustering approach in identifying relevant clusters.
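Per-cluster purity and entropy can be illustrated with a short sketch that compares cluster assignments with the overeating labels after clustering. The function and field names and the toy assignments are hypothetical, and the base-2 logarithm for entropy is an assumption.

```python
import math
from collections import Counter


def cluster_purity_entropy(cluster_ids, class_labels):
    """Per-cluster purity, entropy (bits), and share of overeating meals (label 1)."""
    by_cluster = {}
    for c, y in zip(cluster_ids, class_labels):
        by_cluster.setdefault(c, []).append(y)
    summary = {}
    for c, ys in by_cluster.items():
        counts = Counter(ys)
        n = len(ys)
        summary[c] = {
            "purity": max(counts.values()) / n,  # share of the dominant class
            "entropy": -sum((m / n) * math.log2(m / n) for m in counts.values()),
            "overeating_share": counts.get(1, 0) / n,
        }
    return summary


# Toy example: cluster 0 holds only overeating meals, cluster 1 is evenly mixed.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
overeating = [1, 1, 1, 1, 1, 1, 0, 0]
summary = cluster_purity_entropy(cluster_ids, overeating)
print(summary)
```

A cluster made up entirely of overeating meals has purity 1 and entropy 0, whereas an evenly mixed cluster has purity 0.5 and the maximum binary entropy of 1 bit.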
Data availability
The data supporting this study cannot be shared openly due to the inclusion of sensitive participant information and the need to comply with Institutional Review Board (IRB) guidelines. Requests for data access can be considered on a case-by-case basis, subject to appropriate ethical approvals and data-sharing agreements.
Code availability
Implementation details of the machine learning models used in this study can be accessed at https://github.com/HAbitsLab/SenseWhy.git.
References
WHO. Obesity and Overweight. https://www.who.int/news-room/fact-sheets/detail/obesity-and-overweight (WHO, 2021).
Flore, G. et al. Weight maintenance after dietary weight loss: systematic review and meta-analysis on the effectiveness of behavioural intensive intervention. Nutrients 14, 1259 (2022).
Hall, K. D. & Kahan, S. Maintenance of lost weight and long-term management of obesity. Med. Clin. North Am. 102, 183–197 (2018).
Blüher, M. Obesity: global epidemiology and pathogenesis. Nat. Rev. Endocrinol. 15, 288–298 (2019).
Ruhm, C. J. Understanding overeating and obesity. J. Health Econ. 31, 781–796 (2012).
Gonul, S. et al. An expandable approach for design and personalization of digital, just-in-time adaptive interventions. J. Am. Med. Inf. Assoc. 26, 198–210 (2019).
Yau, Y. H. & Potenza, M. N. Stress and eating behaviors. Minerva Endocrinol. 38, 255–267 (2013).
Abdella, H. M., El Farssi, H. O., Broom, D. R., Hadden, D. A. & Dalton, C. F. Eating behaviours and food cravings; influence of age, sex, BMI and FTO genotype. Nutrients 11. https://doi.org/10.3390/nu11020377 (2019).
Tanofsky-Kraff, M., Schvey, N. A. & Grilo, C. M. A developmental framework of binge-eating disorder based on pediatric loss of control eating. Am. Psychol. 75, 189–203 (2020).
Goldschmidt, A. B. et al. Ecological momentary assessment of eating episodes in obese adults. Psychosom. Med. 76, 747–752 (2014).
Godet, A., Fortier, A., Bannier, E., Coquery, N. & Val-Laillet, D. Interactions between emotions and eating behaviors: main issues, neuroimaging contributions, and innovative preventive or corrective strategies. Rev. Endocr. Metab. Disord. 23, 807–831 (2022).
Ahlich, E. & Rancourt, D. Boredom proneness, interoception, and emotional eating. Appetite 178, 106167 (2022).
Modrzejewska, A., Czepczor-Bernat, K., Modrzejewska, J. & Matusik, P. Eating motives and other factors predicting emotional overeating during COVID-19 in a sample of polish adults. Nutrients 13, 1658 (2021).
Wansink, B. Environmental factors that increase the food intake and consumption volume of unknowing consumers. Annu. Rev. Nutr. 24, 455–479 (2004).
Hill, J. O., Wyatt, H. R. & Peters, J. C. Energy balance and obesity. Circulation 126, 126–132 (2012).
Bell, B. M. et al. Automatic, wearable-based, in-field eating detection approaches for public health research: a scoping review. npj Dig. Med. 3. https://doi.org/10.1038/s41746-020-0246-2 (2020).
Fernandes, G. J. et al. HabitSense: a privacy-aware, AI-enhanced multimodal wearable platform for mHealth applications. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 8, 1–48 (2024).
Lowe, M. R. & Butryn, M. L. Hedonic hunger: a new dimension of appetite. Physiol. Behav. 91, 432–439 (2007).
Witt, A. A. & Lowe, M. R. Hedonic hunger and binge eating among women with eating disorders. Int J. Eat. Disord. 47, 273–280 (2014).
Rolls, B. J., Roe, L. S. & Meengs, J. S. Reductions in portion size and energy density of foods are additive and lead to sustained decreases in energy intake. Am. J. Clin. Nutr. 83, 11–17 (2006).
Frayn, M. & Knäuper, B. Emotional eating and weight in adults: a review. Curr. Psychol. 37, 924–933 (2018).
Favieri, F., Marini, A. & Casagrande, M. Emotional regulation and overeating behaviors in children and adolescents: a systematic review. Behav. Sci. 11, 11 (2021).
Salley, J. N., Hoover, A. W., Wilson, M. L. & Muth, E. R. Comparison between human and bite-based methods of estimating caloric intake. J. Acad. Nutr. Diet. 116, 1568–1577 (2016).
Hossain, D., Ghosh, T. & Sazonov, E. Automatic count of bites and chews from videos of eating episodes. IEEE Access 8, 101934–101945 (2020).
Zhu, Y. & Hollis, J. H. Increasing the number of chews before swallowing reduces meal size in normal-weight, overweight, and obese adults. J. Acad. Nutr. Dietetics 114, 926–931 (2014).
Li, J. et al. Improvement in chewing activity reduces energy intake in one meal and modulates plasma gut hormone concentrations in obese and lean young Chinese men. Am. J. Clin. Nutr. 94, 709–716 (2011).
Teixeira, G. P. et al. Role of chronotype in dietary intake, meal timing, and obesity: a systematic review. Nutr. Rev. 81, 75–90 (2023).
Goel, N. et al. Circadian rhythm profiles in women with night eating syndrome. J. Biol. Rhythms 24, 85–94 (2009).
Allison, K. C. et al. The Night Eating Questionnaire (NEQ): psychometric properties of a measure of severity of the Night Eating Syndrome. Eat. Behav. 9, 62–72 (2008).
Stunkard, A. J., Allison, K. C., Lundgren, J. D. & O’Reardon, J. P. A biobehavioural model of the night eating syndrome. Obes. Rev. 10, 69–77 (2009).
Adam, T. C. & Epel, E. S. Stress, eating and the reward system. Physiol. Behav. 91, 449–458 (2007).
Torres, S. J. & Nowson, C. A. Relationship between stress, eating behavior, and obesity. Nutrition 23, 887–894 (2007).
Greeno, C. G. & Wing, R. R. Stress-induced eating. Psychol. Bull. 115, 444–464 (1994).
Sultson, H., Kukk, K. & Akkermann, K. Positive and negative emotional eating have different associations with overeating and binge eating: construction and validation of the Positive-Negative Emotional Eating Scale. Appetite 116, 423–430 (2017).
Torske, A., Bremer, B., Hölzel, B. K., Maczka, A. & Koch, K. Mindfulness meditation modulates stress-eating and its neural correlates. Sci. Rep. 14, 7294 (2024).
Herman, C. P., Roth, D. A. & Polivy, J. Effects of the presence of others on food intake: a normative interpretation. Psychol. Bull. 129, 873–886 (2003).
de Castro, J. M. Family and friends produce greater social facilitation of food intake than other companions. Physiol. Behav. 56, 445–445 (1994).
Herman, C. P. The social facilitation of eating. A review. Appetite 86, 61–73 (2015).
Robinson, E., Benwell, H. & Higgs, S. Food intake norms increase and decrease snack food intake in a remote confederate study. Appetite 65, 20–24 (2013).
Rydell, S. A. et al. Why eat at fast-food restaurants: reported reasons among frequent consumers. J. Am. Diet. Assoc. 108, 2066–2070 (2008).
Rosenheck, R. Fast food consumption and increased caloric intake: a systematic review of a trajectory towards weight gain and obesity risk. Obes. Rev. 9, 535–547 (2008).
Cohen, D. & Farley, T. A. Eating as an automatic behavior. Prev. Chronic Dis. 5, A23 (2008).
Tanofsky-Kraff, M. et al. A prospective study of pediatric loss of control eating and psychological outcomes. J. Abnorm. Psychol. 120, 108–118 (2011).
Wansink, B. & Sobal, J. Mindless eating: the 200 daily food decisions we overlook. Environ. Behav. 39, 106–123 (2007).
Acosta, A. et al. Selection of antiobesity medications based on phenotypes enhances weight loss: a pragmatic trial in an obesity clinic. Obesity 29, 662–671 (2021).
Koning, E., Vorstman, J., McIntyre, R. S. & Brietzke, E. Characterizing eating behavioral phenotypes in mood disorders: a narrative review. Psychol. Med. 52, 2885–2898 (2022).
Schoeller, D. A. Limitations in the assessment of dietary energy intake by self-report. Metab. Clin. Exp. 44, 18–22 (1995).
Dong, Y., Hoover, A., Scisco, J. & Muth, E. A new method for measuring meal intake in humans via automated wrist motion tracking. Appl. Psychophysiol. Biofeedback 37, 205–215 (2012).
Moraes, C. E. F. et al. Food Consumption during Binge eating episodes in binge eating spectrum conditions from a representative sample of a Brazilian metropolitan city. Nutrients 15. https://doi.org/10.3390/nu15071573 (2023).
van Strien, T., Herman, C. P. & Verheijden, M. W. Eating style, overeating, and overweight in a representative Dutch sample. Does external eating play a role? Appetite 52, 380–387 (2009).
Tarawneh, A. S., Hassanat, A. B., Altarawneh, G. A. & Almuhaimeed, A. Stop oversampling for class imbalance learning: a review. IEEE Access 10, 47643–47660 (2022).
Alharbi, R. et al. I can’t be myself: effects of wearable cameras on the capture of authentic behavior in the wild. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2. https://doi.org/10.1145/3264900 (2018).
Shiffman, S., Stone, A. A. & Hufford, M. R. Ecological momentary assessment. Annu. Rev. Clin. Psychol. 4, 1–32 (2008).
Macdiarmid, J. & Blundell, J. Assessing dietary intake: who, what and why of under-reporting. Nutr. Res. Rev. 11, 231–253 (1998).
Wilfley, D. E., Schwartz, M. B., Spurrell, E. B. & Fairburn, C. G. Using the eating disorder examination to identify the specific psychopathology of binge eating disorder. Int. J. Eat. Disord. 27, 259–269 (2000).
Alshurafa, N. I. et al. Rationale and design of the SenseWhy project: a passive sensing and ecological momentary assessment study on characteristics of overeating episodes. Digit. Health 9, https://doi.org/10.1177/20552076231158314 (2023).
Schakel, S. F. Maintaining a nutrient database in a changing marketplace: keeping pace with changing food products—a research perspective. J. Food Compos. Anal. 14, 315–322 (2001).
Baranowski, T. 24-Hour Recall and Diet Record Methods. Nutritional Epidemiology, 3rd edn, Monographs in Epidemiology and Biostatistics (Oxford Academic, 2013), https://doi.org/10.1093/acprof:oso/9780199754038.003.0004.
Boushey, C. J., Spoden, M., Zhu, F. M., Delp, E. J. & Kerr, D. A. New mobile methods for dietary assessment: review of image-assisted and image-based dietary assessment methods. Proc. Nutr. Soc. 76, 283–294 (2017).
Zhang, S. et al. NeckSense: a multi-sensor necklace for detecting eating activities in free-living conditions. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 4. https://doi.org/10.1145/3397313 (2020).
Alshurafa. Detecting real-time episodic overeating for just-in-time interventions. Ann. Behav. Med. 52, S1-S838 https://doi.org/10.1093/abm/kay013 (2018).
Thomas, J. G., Doshi, S., Crosby, R. D. & Lowe, M. R. Ecological momentary assessment of obesogenic eating behavior: combining person-specific and environmental predictors. Obesity 19, 1574–1579 (2011).
Chawla, N. V., Bowyer, K. W., Hall, L. O. & Kegelmeyer, W. P. SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002).
Snoek, J., Larochelle, H. & Adams, R. P. Practical Bayesian optimization of machine learning algorithms. Adv. Neural Inf. Proc. Syst 25, 2951–2959 (2012).
Lundberg, S. M. & Lee, S.-I. in Advances in Neural Information Processing Systems Vol. 30 (eds I. Guyon et al.) (Curran Associates, 2017).
McInnes, L., Healy, J., Saul, N. & Großberger, L. UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3, 861 (2018).
Rousseeuw, P. J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987).
Eick, C. F., Zeidat, N. & Zhao, Z. Supervised clustering-algorithms and benefits. in Proc. 16th IEEE International Conference on Tools with Artificial Intelligence, 774–776 (IEEE Computer Society, 2004).
Deng, S., Sheng, L., Nie, J. & Deng, F. A clustering method based on information entropy payload. arXiv. https://arxiv.org/abs/2209.06582 (2022).
Rosenberg, A. & Hirschberg, J. V-Measure: a conditional entropy-based external cluster evaluation measure. in Proc. 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL) (Association for Computational Linguistics, 2007).
Acknowledgements
This work has been funded by the US National Institute of Diabetes and Digestive and Kidney Diseases with award number 5K25DK113242. The funders had no role in the analysis, interpretation of data, or preparation of the manuscript. The authors would like to thank Drs. Donald Hedeker, Bonnie Spring, Santosh Kumar, Linda Van Horn, and Evan Forman, as well as Glenn Fernandes, Soroush Shahi, Saki Amagai, Bonnie Nolan, Helen Zhu, and Jeb Sumeracki for their support during this project.
Author information
Contributions
F.S., C.R., T.S., M.P., and N.A. participated in conceptualization. F.S., B.W., and N.A. were involved in data curation, investigation, and formal analysis. N.A. was involved in funding acquisition, supervision, and project administration. F.S., C.R., T.S., and N.A. were involved in validation. F.S., B.W., C.R., R.M., J.S., A.L., and N.A. were involved in visualization, methodology, and writing the original draft. All the authors edited and reviewed the drafted manuscript. Every author had complete access to all the data, and the decision to submit the manuscript for publication was made collectively.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Shahabi, F., Wei, B., Romano, C. et al. Unveiling overeating patterns within digital longitudinal data on eating behaviors and contexts. npj Digit. Med. 8, 567 (2025). https://doi.org/10.1038/s41746-025-01698-9
DOI: https://doi.org/10.1038/s41746-025-01698-9