Background & Summary

The COVID-19 pandemic was a global health crisis that altered daily life worldwide1. Beyond the direct health impacts of the coronavirus, the pandemic triggered widespread implementation of public health measures including lockdowns, social distancing, quarantine protocols, and restrictions on social gatherings2. These containment strategies, while crucial for controlling viral transmission, created secondary consequences for population mental health3,4,5. The sudden disruption of social connections, economic uncertainty, and prolonged periods of isolation contributed to increases in symptoms of anxiety, depression, and psychological distress across diverse populations, which have not yet returned to pre-pandemic levels6. Moreover, traditional face-to-face mental health services became limited or inaccessible.

In response to this challenge, the Corona Health mHealth app was developed to capture various aspects of individuals’ pandemic experiences and mental health outcomes7. Through ecological momentary assessments (EMAs), the mHealth app enabled real-time data collection by prompting users to report their experiences and emotions as they occurred in their natural environments. By utilizing smartphone technology and a momentary assessment approach, Corona Health aimed to bridge the gap between mental health monitoring needs and the limited traditional clinical resources during the pandemic. Moreover, a crucial function of Corona Health was that participants received immediate automated feedback on their psychological well-being based on their responses. For example, if they exceeded a clinically significant cut-off, participants were advised to seek help and were pointed to further information about support options, such as a crisis hotline.

The Corona Health platform hosted multiple concurrent studies addressing different aspects of pandemic-related health and well-being. This dataset specifically contains data from the “Mental Health for Adults (18 Years +)” study, one of five studies implemented within Corona Health8. The dataset comprises baseline responses from 2,704 participants as well as 11,541 repeated EMA responses collected over nearly five years (July 2020 to January 2025), spanning multiple pandemic phases from the initial outbreak through longer-term adaptation periods. With multilingual support and both self-reported questionnaire data and objective sensor measurements, the dataset provides statistical power for examining mental health responses and behavioral patterns across diverse population groups and pandemic phases.

Moreover, this dataset offers potential for secondary analysis across multiple research domains. Mental health researchers can utilize the longitudinal data to investigate psychological resilience patterns and examine temporal dynamics of psychopathological symptoms and well-being indicators during prolonged crisis periods. The data enables deeper insights into the distribution of mental health problems, needs, and at-risk groups, as well as associations between a comprehensive set of mental health indicators over time, including analyses in conjunction with behavioral smartphone sensing data. Digital health researchers can evaluate smartphone-based mental health assessment approaches and examine user engagement patterns with health apps during crisis periods, while EMA methodology researchers can analyze compliance patterns and optimal assessment frequencies in longitudinal smartphone-based studies.

The comprehensive, multilingual dataset thus provides a valuable resource for advancing the understanding of pandemic mental health impacts and digital health monitoring approaches. It contributes to a growing body of openly available, high-quality psychological and behavioral data collected during the COVID-19 pandemic. Related efforts include the COVID-19 Snapshot Monitoring (COSMO) dataset from Germany, which offers repeated cross-sectional insights into public risk perceptions, behaviors, and attitudes over 69 survey waves9. Further data comprises information about the impact of the COVID-19 pandemic from individuals in France10. Moreover, a dataset from Turkey focusing on distress intolerance and anxiety across the pandemic11 and a longitudinal observation of changes in psychosocial factors in Japan during the pandemic12 are other related contributions.

The present dataset complements these efforts through several distinctive features: it combines validated clinical instruments (e.g., PHQ-9, GAD-7) with passive smartphone sensing data (i.e., GPS location and app usage statistics), enabling joint analysis of self-reported mental health outcomes and objective behavioral indicators. Furthermore, unlike cross-sectional designs, this dataset captures within-person longitudinal trajectories via repeated EMA assessments over nearly five years, providing granular insight into individual-level mental health dynamics across multiple pandemic phases. This integration of EMA methodology with mobile sensing in a single longitudinal framework offers unique opportunities for research at the intersection of digital phenotyping and mental health monitoring. Together, these initiatives enhance international comparative research and provide context for studying the psychological, behavioral, and digital health dimensions of crisis response.

Previous Publications Using This Dataset

The comprehensive nature and extended timeframe of the Corona Health app have made it a valuable resource for the scientific community, resulting in multiple publications that have contributed to our understanding of pandemic-related mental health outcomes. A total of 17 publications have utilized data from the broader Corona Health app7,13,14,15,16,17,18,19, with 9 publications specifically drawing from the “Mental health for adults (18 years +)” dataset presented in this paper. These papers used the dataset at different points in time (e.g., while the app was still active), and therefore the data considered may differ based on several factors (e.g., only specific items were used, or only a particular population was analyzed), which are briefly described in the following and summarized in Table 1.

Table 1 Summary of Studies on Smartphone Use, Mental Health, and Quality of Life Based on Data from the “Mental Health for Adults (18 Years +)” study.

Research has explored smartphone data and mental health outcomes with mixed results. One study of 752 participants found limited predictive value of smartphone usage features for detecting insomnia symptoms20. Investigation of GPS-based environmental factors in 249 participants revealed that regional characteristics like COVID-19 infection rates could predict depressive symptoms21. Analysis of smartphone social interaction data from 490 participants showed messenger use negatively predicted both depressive and anxiety symptoms22.

Studies on social media revealed concerning patterns: extraversion paradoxically amplified negative effects of social media use on depression in 486 participants23, while research with 627 participants confirmed a positive association between social media usage and depression symptoms24. Smartphone communication patterns in 364 participants showed age-dependent relationships with loneliness and social well-being25.

Regarding quality of life during the COVID-19 pandemic, latent class analysis of 2,137 German adults identified four distinct patterns (resilient, recovering, delayed, chronic), with the delayed class showing steepest decline and slowest recovery26. Further analysis of this cohort found support- and meaning-focused coping beneficial for quality of life, while escape-avoidance coping showed strong negative associations27. A comprehensive assessment of 1,396 participants revealed diminished quality of life among women, younger individuals, and those with pandemic-related employment disruptions28.

This release provides the complete, longitudinal dataset spanning the full collection period (July 2020 to January 2025), including all baseline assessments (n = 2,704), all repeated EMA responses (n = 11,541), and associated sensor data (i.e., GPS and app usage statistics). Unlike the prior publications, which analyzed temporal snapshots or specific variable subsets, this release enables: (1) comprehensive replication and extension of previous findings; (2) novel analyses across the full pandemic timeline, including understudied recovery and adaptation phases; (3) integration of questionnaire and sensor data that were not jointly analyzed in prior work; and (4) methodological investigations of EMA compliance and engagement patterns across the complete study duration.

Methods

Study Design and App Framework

All data were collected using the Corona Health mHealth app. The app was publicly available on Android and iOS platforms and supported multilingual deployment across eight languages (i.e., German, English, Spanish, French, Hungarian, Italian, Russian, and Serbian). The app enabled real-time data collection of self-reported mental health measures and sensor-based contextual data. In general, the technical platform of Corona Health followed a modular design, hosting five distinct research studies focusing on various aspects of mental and physical health outcomes in adults and adolescents:

  • Mental Health for Adults (18 Years +; subject of this paper)

  • Mental Health for Adolescents (12 to 17 Years)14

  • Physical Health for Adults (18 Years +)13

  • Recognizing Stress for Adults (18 Years and up)29

  • Acceptance of Pandemic Apps30

Corona Health was developed based on the TrackYourHealth framework15,31, which provided robust support for EMA studies, including multilanguage content delivery, sensor integration, and compliance with data privacy and regulatory standards (e.g., General Data Protection Regulation and Medical Device Regulation in the European Union). All user data were stored anonymously, with mobile sensing features activated only upon user consent.

This dataset specifically contains data from the “Mental Health for Adults (18 years +)” study only. Data collection for this specific study spanned nearly five years, from July 21, 2020, to January 25, 2025, covering multiple phases of the COVID-19 pandemic from initial outbreak through longer-term adaptation periods. The dataset comprises 2,704 baseline questionnaire responses and 11,541 EMA responses from 1,488 participants, providing comprehensive longitudinal mental health data throughout the pandemic period. The questionnaires assessed multiple domains including demographics, quality of life, psychological well-being, coping mechanisms, and pandemic-related concerns.

Input Data

Questionnaire Data

Each questionnaire was created via a structured and automated content pipeline. It started with Excel-based templates, selected for their readability and easy editing by non-IT experts. These templates were converted to JSON for automated distribution via a RESTful API. The baseline questionnaire took approximately 20 minutes to complete, while repeated EMAs took approximately 10 minutes. Repeated EMAs were scheduled weekly, based on an initially configured schedule, or could be triggered by users themselves; participants received local push notifications directly from Corona Health accordingly. All responses were collected through the respective smartphone application in a structured JSON format. The questionnaire framework supported multiple question types including multiple-choice questions, Likert scale ratings, text input fields, and slider-based responses. The content pipeline ensured consistency across multiple languages and allowed for dynamic content updates while maintaining questionnaire integrity.
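The Excel-to-JSON conversion step can be illustrated with a minimal sketch. The element fields (elementtype, questiontype, label, required, item_de) mirror the codebook columns described later in this paper, but the exact JSON schema used by Corona Health is an assumption here:

```python
import json

def row_to_element(row: dict) -> dict:
    """Convert one codebook row (as a dict) into a questionnaire element.

    Non-question elements (e.g., pagebreaks, headlines) carry only their
    type; questions additionally carry response format and metadata.
    """
    element = {"elementtype": row["elementtype"]}
    if row["elementtype"] == "question":
        element.update({
            "questiontype": row["questiontype"],
            "label": row["label"],
            "required": row.get("required", "false") == "true",
            "text": {"de": row.get("item_de", "")},
        })
    return element

# Fabricated example row, not taken from the actual codebook.
row = {"elementtype": "question", "questiontype": "SingleChoice",
       "label": "phq9_1", "required": "true", "item_de": "..."}
print(json.dumps(row_to_element(row)))
```

Such elements would then be bundled per questionnaire and delivered to the app via the RESTful API described below.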

Mobile Sensing Data

Mobile sensing was implemented via two mechanisms:

  • Opportunistic sensing: GPS-based coarse location data (accuracy limited to 11.1 km) and device operating system metadata were collected automatically during questionnaire completion for participants who provided consent.

  • Participatory sensing: Aggregated app usage data (Android only) including screen time and application foreground activity for selected apps (top 5 used apps and predefined social media apps) were collected with explicit user permission.

Sensor data were collected only at the time of questionnaire completion and contingent on user permission. On Android, app usage statistics were obtained through the UsageEvents API provided by the Android operating system, which recorded timestamped application-level foreground and background activity events. These raw events were aggregated on the device into daily-level metrics (e.g., total screen time, per-app usage durations, and activity/inactivity intervals) before transmission to the backend. The specific data collected included:

  • Total daily phone usage duration (screen-on time)

  • Daily foreground usage time for the five most frequently used apps and predefined social media apps

  • Timestamps of first and last usage per tracked application

  • Duration of background (foreground service) usage for apps running without visible activity

  • Time intervals of user activity and inactivity across the day
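The on-device aggregation of raw usage events into such daily metrics can be sketched as follows. The event tuples and millisecond timestamps are illustrative assumptions, not the actual types returned by Android's UsageEvents API:

```python
# Fabricated event stream: (timestamp in ms, package name, event kind).
events = [
    (0, "com.whatsapp", "FOREGROUND"),
    (600_000, "com.whatsapp", "BACKGROUND"),
    (900_000, "com.instagram.android", "FOREGROUND"),
    (1_200_000, "com.instagram.android", "BACKGROUND"),
]

def total_foreground_ms(evts):
    """Sum foreground durations by pairing FOREGROUND/BACKGROUND events."""
    total, open_at = 0, {}
    for ts, pkg, kind in evts:
        if kind == "FOREGROUND":
            open_at[pkg] = ts
        elif kind == "BACKGROUND" and pkg in open_at:
            total += ts - open_at.pop(pkg)
    return total

print(total_foreground_ms(events))  # → 900000 ms (15 minutes)
```

Per-app durations and activity/inactivity intervals follow the same pairing logic before the daily summaries are transmitted to the backend.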

For location data, coarse-grained GPS coordinates were obtained via the device’s native location services and were processed to maintain an accuracy of 11.1 km to protect participant privacy while enabling regional analysis. Device information including operating system version and device type were automatically recorded with each questionnaire submission.

Data Management and Processing

Backend Infrastructure and Storage

All data were stored in a relational database with structured schemas for users, questionnaires, and sensor events. The data architecture supported multilingual content and dynamic feedback based on in-app rules. The database used a relational structure with core entities including Users, Studies, Questionnaires, Questionnaire Elements, Answersheets, and Feedback components.

The backend was developed using the Laravel PHP framework and followed RESTful API design principles with JSON:API specification for data exchange32. Data collection and study participation were managed through a set of RESTful API endpoints. Participants retrieved available studies via GET /studies/ and accessed study-level metadata through GET /studies/{id}. Enrollment was completed by submitting a subscription request to POST /studies/{id}/subscribe. Questionnaire content, including item structure, question types, response options, and validation parameters, was delivered to the app via GET /questionnaires/{id}/structure. Completed responses were submitted through POST /questionnaires/{id}/answersheets, which triggered the server-side validation pipeline described below.
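The endpoint structure described above can be summarized in a small helper; the base URL is a placeholder assumption, and only the documented routes are reconstructed:

```python
BASE = "https://api.example.org"  # hypothetical base URL for illustration

def endpoint(action, study_id=None, questionnaire_id=None):
    """Return (HTTP method, URL) for the documented study-participation actions."""
    routes = {
        "list_studies": ("GET", f"{BASE}/studies/"),
        "study_meta": ("GET", f"{BASE}/studies/{study_id}"),
        "subscribe": ("POST", f"{BASE}/studies/{study_id}/subscribe"),
        "structure": ("GET", f"{BASE}/questionnaires/{questionnaire_id}/structure"),
        "submit": ("POST", f"{BASE}/questionnaires/{questionnaire_id}/answersheets"),
    }
    return routes[action]

# Example: enrolling in a (hypothetical) study with ID 42.
method, url = endpoint("subscribe", study_id=42)
```

A client would thus first list studies, fetch metadata, subscribe, retrieve the questionnaire structure, and finally submit completed answersheets.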

Further, the database used a relational structure (i.e., MySQL) in which studies were associated with one or more questionnaires, each composed of polymorphic elements (i.e., page breaks, headlines, text fields, and questions). Submitted responses were serialized and stored as JSON objects alongside associated sensor data and client device metadata. The complete entity-relationship model is described in7.

This architecture ensured scalable and secure data collection while maintaining system responsiveness across diverse mobile devices and network conditions (more information on requirements and technical implementation are described in7,31).

Data Anonymization and Quality Assurance

All data were collected and stored anonymously. Participants were assigned anonymous user IDs upon registration, and no personally identifiable information was collected or stored. The anonymous design ensured that individual participants could not be identified from the collected data. However, this approach also had limitations (e.g., it prevented the creation of persistent user profiles). As a result, if a user deleted and reinstalled the Corona Health app or switched to a new smartphone, they were treated as a new user in the dataset. Several measures were implemented to ensure data quality:

  • Real-time validation of questionnaire responses within the app

  • Automatic detection and handling of incomplete responses

  • Timestamp validation for all data entries

  • Automated backup and integrity checking procedures

  • Offline functionality for the mHealth app to ensure data collection continuity during network interruptions
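Two of the listed quality measures, timestamp validation and incomplete-response detection, can be sketched as follows; the concrete rules used by the Corona Health backend are assumptions for illustration:

```python
from datetime import datetime, timezone

def valid_timestamp(iso_ts: str) -> bool:
    """Accept only parseable ISO 8601 timestamps that do not lie in the future."""
    try:
        ts = datetime.fromisoformat(iso_ts)
    except ValueError:
        return False
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assume UTC for naive timestamps
    return ts <= datetime.now(timezone.utc)

def is_complete(response: dict, required_labels: list) -> bool:
    """A response is complete if every required item has a non-empty value."""
    return all(response.get(label) not in (None, "") for label in required_labels)
```

Responses failing either check would be flagged or rejected before storage.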

Ethics, Regulatory Compliance, and Participant Recruitment

The study was approved by the Ethics Committee of the University of Würzburg (Ref No. 130/20-me). Corona Health complied with the Medical Device Regulation (MDR) and General Data Protection Regulation (GDPR). Further, the development process of the Corona Health platform followed risk-based software validation according to IEC 62304 and IEC 82304 standards for medical device software and healthcare applications as well as the GAMP 5 regulations (standard work of the pharmaceutical industry)16.

Participants were recruited through self-selection via the publicly available Corona Health mHealth app on Google Play and the Apple App Store, launched in July 2020. The app was promoted through social media channels (Twitter) and newsletters. Participants provided informed consent digitally through the onboarding process of Corona Health before participation, which included detailed information about data collection procedures, anonymization protocols, and the voluntary nature of participation. All participants were required to consent to data collection before proceeding to study enrollment.

The privacy policy explicitly informed participants that anonymized findings may be published in scientific journals for research purposes. Importantly, the study employed an anonymous-by-design approach: no personally identifiable information (e.g., names, addresses, email, phone numbers) was collected or stored, and GPS coordinates were limited to 11.1 km resolution. Prior to public data deposition, additional anonymization measures were undertaken in consultation with the institutional data protection officer, including removal of potentially identifying variables and masking of low-frequency demographic categories. The released dataset constitutes anonymized data that cannot reasonably be linked to identifiable individuals.

Data Records

The dataset, comprising the baseline responses (Baseline.csv) and repeated EMA responses (EMA.csv), can be found at B2SHARE (EUDAT)8.

The repository also includes sensor data (GPS/APP_Baseline/EMA.csv; i.e., GPS and app usage statistics) collected during the responses and a detailed description of all components from the baseline and repeated EMA that can be found in a comprehensive codebook (Codebook_Baseline_EMA.xlsx).

Codebook for Baseline and EMA Questionnaires

The codebook provides the structural template for both the baseline and repeated EMA questionnaires used within Corona Health. It is available as a multi-sheet Excel file (Codebook_Baseline_EMA.xlsx), with one sheet dedicated to each version of the questionnaire. Each row in the codebook represents an individual element (e.g., question, headline, or page break), and each column defines specific metadata associated with that element.

The columns are structured as follows:

  • Column A - elementtype: Defines the type of element in the questionnaire (e.g., pagebreak, headline, text, or question).

  • Column B - questiontype: Specifies the response format for questions (e.g., SingleChoice, MultipleChoice, Scale, Knob).

  • Columns C-E - min, max, step: Numeric configuration parameters used primarily for scaled or slider-based inputs (e.g., setting the response range and increment).

  • Column F - required: Indicates whether a response is mandatory (true or false).

  • Column G - label: Unique variable name used for storing and linking participant responses in the baseline and EMA files.

  • Column H - item_de: German-language version of the question text displayed in the app.

  • Columns I-O - answer_1 to answer_7: Encoded answer options for multiple-choice and single-choice formats, including both numeric codes and display labels.

The Excel file maintains this structure consistently across multiple languages. Columns A through O reflect the German version (item_de), while subsequent columns provide translations for all other supported languages (e.g., English, Spanish, French, Hungarian, Italian, Russian, and Serbian). The same structure is applied to the codebook sheet representing the EMA questionnaire.

The label field (Column G) provides the semantic identifier used to link response values across all datasets (e.g., cope1, phq9_1, whoqol_env1; see Fig. 1). This naming system allows correct mapping between questions in the codebook and their corresponding values in the baseline and EMA responses.
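A minimal sketch of this label-based mapping, with fabricated inline CSV contents standing in for the real codebook and Baseline.csv:

```python
import csv
import io

# Fabricated stand-ins; the real files contain many more columns and rows.
codebook_csv = (
    "elementtype,label,item_de\n"
    "question,phq9_1,Frage 1\n"
    "question,cope1,Frage 2\n"
)
baseline_csv = "user_id,collected_at,phq9_1,cope1\nu01,2020-08-01T09:00:00,2,3\n"

# Build a label -> question-text lookup from the codebook sheet.
labels = {r["label"]: r["item_de"] for r in csv.DictReader(io.StringIO(codebook_csv))}
row = next(csv.DictReader(io.StringIO(baseline_csv)))

# Annotate each response column with its question wording via the label.
annotated = {labels[k]: v for k, v in row.items() if k in labels}
print(annotated)
```

The same lookup works unchanged for EMA.csv, since the labels are shared across both questionnaires.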

Fig. 1 Label Field as Semantic Identifier in Data Files.

Structure of the Baseline and repeated EMA Questionnaire

The baseline questionnaire is structured into 12 consecutive pages, each defined by a pagebreak element in Column A (elementtype) of the codebook. These pagebreaks reflect the logical flow and user experience of Corona Health during questionnaire administration. Figure 2 illustrates the flow a user experienced when answering the initial baseline questionnaire (see7 for more technical information).

Fig. 2 Flowchart of the Baseline Questionnaire.

Each page corresponds to a specific thematic block, covering core constructs from validated instruments as well as custom items relevant to the COVID-19 context. Note that the following page descriptions highlight the primary instruments and constructs; additional items within each thematic block are fully documented in the codebook. In general, the grouping of items into pages was based on thematic and conceptual similarity rather than empirical factor structures, with the primary goal of creating a coherent user experience and logical flow for participants. For example, Page 7 combines the SOEP loneliness items with the PHQ-9 depression scale, as both assess aspects of psychological and social well-being. This page-level organization was designed for questionnaire administration purposes and does not imply a unified underlying construct. The pages are defined and structured as follows:

  • Page 1 - Study Introduction: Introduction and welcome message, providing general study information and instructions.

  • Page 2 - Background and Screening: Includes initial screening (e.g., whether the questionnaire is completed for oneself or another person), followed by questions on demographic background, household characteristics, socioeconomic status, and direct COVID-19-related impacts.

  • Page 3 - Health and Well-being: Covers general health status using the Mini European Health Module (MEHM)33, family well-being, experiences of interpersonal violence (adapted from PHQ-D)34,35, and perceived stigma using items from the Inventory of Stigmatizing Experiences (ISE)36.

  • Page 4 - Coping Strategies: Contains the full 28-item Brief-COPE Inventory (Coping Orientation to Problems Experienced)37, assessing individual coping strategies in response to stress.

  • Page 5 - Personality Traits: Administers the 10-item version of the Big Five Inventory (BFI-10)38, which measures personality traits across five domains.

  • Page 6 - Pandemic-related Stressors: Includes additional PHQ-D items focused on psychosocial stressors relevant to the pandemic context (e.g., concerns about income, health, or social connection)34,35.

  • Page 7 - Emotional Well-being: Assesses loneliness using items from the German Socio-Economic Panel (SOEP)39 and depressive symptomatology with the PHQ-9 scale40.

  • Page 8 - Panic and Anxiety Symptoms: Screens for panic disorder symptoms using the PHQ-PD module41 and evaluates anxiety levels with the Generalized Anxiety Disorder 7-item scale (GAD-7)42.

  • Page 9 - Quality of Life: Administers the WHOQOL-BREF instrument (World Health Organization Quality of Life), covering physical, psychological, social, and environmental well-being43.

  • Page 10 - Sleep and Lifestyle Factors: Includes the ISI-7 (Insomnia Severity Index) to assess sleep quality and difficulties44, and items about physical activity levels and alcohol use.

  • Page 11 - Support Needs and Resilience: Captures participants’ unmet support needs and documents meaningful or positive experiences during the pandemic.

  • Page 12 - Closing and Feedback: Presents a closing thank-you message and allows participants to leave optional free-text feedback.

Each page was presented sequentially in Corona Health, and the grouping into thematic blocks facilitates later analysis by construct. All questions are labeled in Column G of the codebook (label), which provides the corresponding variable names for linking responses in the dataset.

The repeated EMA questionnaire consists of 10 pages, each also marked by a pagebreak entry in Column A (elementtype) of the EMA sheet in the codebook file. While thematically similar to the baseline, the EMA version includes a reduced number of items, focusing on recurring and longitudinally relevant domains. The selection of EMA items was guided by two primary criteria: (1) minimizing participant burden by reducing questionnaire completion time to approximately 10 minutes, and (2) prioritizing constructs expected to show meaningful within-person variability over time (e.g., depressive symptoms, anxiety, quality of life) while omitting stable trait-like measures assessed only at baseline (e.g., personality traits via the BFI-10, coping styles via the Brief-COPE). The flowchart of the repeated EMA questionnaire is depicted in Fig. 3 (see7 for more technical information).

Fig. 3 Flowchart of the Repeated EMA Questionnaire.

The questionnaire is organized as follows:

  • Page 1 - Session Introduction: Introduction and welcome message displayed to participants at the start of the session.

  • Page 2 - Current Context and Well-being: Gathers current living situation, recent pandemic-related experiences, and includes questions from the MEHM and on perceived family climate.

  • Page 3 - Pandemic-related Stressors: Focuses on psychosocial stressors, adapted from the PHQ-D.

  • Page 4 - Emotional Well-being: Administers the Loneliness Scale items from the SOEP to assess perceived social isolation and includes the PHQ-9 scale, used to measure depressive symptoms and their severity over the past weeks.

  • Page 5 - Panic and Anxiety Symptoms: Contains the PHQ-PD module for panic symptoms and the GAD-7 scale to assess general anxiety.

  • Page 6 - Quality of Life: Includes a shortened version of the WHOQOL-BREF instrument (i.e., the EUROHIS-QOL45) to measure quality of life across key domains.

  • Page 7 - Sleep and Lifestyle Factors: Captures responses on the ISI-7 as well as physical activity and alcohol use.

  • Page 8 - Support Needs and Resilience: Asks participants about support needs and allows them to reflect on positive experiences during the pandemic period.

  • Page 9 - Closing and Feedback: A thank-you and closing screen, offering participants the opportunity to leave open feedback.

The structure of the EMA questionnaire mirrors that of the baseline, with each item defined in the codebook by its metadata. Specifically, Column G (label) provides the unique variable name used to link responses to corresponding entries in the EMA response file (EMA.csv).

Structure of Baseline and EMA Responses

The baseline questionnaire responses are stored in the file Baseline.csv. Each row corresponds to a single participant’s completed baseline assessment. The columns are structured as follows:

  • Column A - user_id: Anonymous identifier assigned to the participant upon enrollment.

  • Column B - collected_at: Timestamp indicating when the baseline questionnaire was submitted (in ISO 8601 format).

  • Columns C-EU - Questionnaire responses: Each of these columns corresponds to a specific item in the baseline questionnaire. The variable names used in the column headers are defined in Column G (label) of the codebook and reflect the semantic content of the questions (e.g., phq9_1, cope5, whoqol_env2; see Fig. 1).

  • Column EV - client_os: Operating system of the participant’s mobile device (e.g., android, ios).

  • Column EW - client_device: Device model or type (e.g., Google Pixel 5, iPhone 12).

The user_id is consistently used across other data components (e.g., GPS and app usage) to enable linkage while maintaining privacy (see Fig. 4). Each response column (C-EU) is mapped directly to the respective questionnaire item as described in the codebook. This enables correct identification of question wording, response format, and associated scale parameters. The response structure supports multilingual alignment, as the underlying variable labels are language-independent and consistent across translations.
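The user_id-based linkage can be sketched as follows, using fabricated inline CSV snippets in place of the released files:

```python
import csv
import io

# Fabricated stand-ins for Baseline.csv and GPS_Baseline.csv.
baseline = (
    "user_id,collected_at,phq9_1\n"
    "u01,2020-08-01T09:00:00,2\n"
    "u02,2020-08-02T10:00:00,1\n"
)
gps = (
    "user_id,sensordata_collected_at,sensordata_latitude,sensordata_longitude\n"
    "u01,2020-08-01T09:00:05,49.8,9.9\n"
)

# Index GPS records by participant for O(1) lookup during the join.
gps_by_user = {}
for rec in csv.DictReader(io.StringIO(gps)):
    gps_by_user.setdefault(rec["user_id"], []).append(rec)

# Attach each participant's GPS records to their baseline row.
linked = [
    {**row, "gps": gps_by_user.get(row["user_id"], [])}
    for row in csv.DictReader(io.StringIO(baseline))
]
# u01 has one matching GPS record; u02 (e.g., no location consent) has none.
```

The same join applies to EMA.csv and the app usage files, since user_id is shared across all data components.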

Fig. 4 user_id as Linkage Between Data Files.

The repeated EMA responses are stored in the file EMA.csv. Each row represents a single follow-up assessment submitted by a participant. The dataset is organized as follows:

  • Column A - user_id: Anonymized identifier assigned to each participant. This identifier is consistent across datasets and can be used to link EMA responses to baseline data and sensor records.

  • Column B - collected_at: Timestamp indicating when the EMA questionnaire was submitted (ISO 8601 format).

  • Columns C-BW - Questionnaire responses: These columns contain participants’ answers to the EMA items. Each column corresponds to a specific questionnaire variable defined in Column G (label) of the codebook. Variable names are consistent across language versions and represent the semantic content of the item (e.g., phq9_1, whoqol_phys1, isi7_3; see Fig. 1).

  • Column BX - client_os: Operating system of the participant’s device (e.g., android, ios).

  • Column BY - client_device: Device model or type used by the participant at the time of questionnaire completion (e.g., Samsung Galaxy A52, iPhone 13).

As in the baseline file, the response variables (Columns C-BW) are directly linked to the codebook entries using the unique labels defined in Column G.
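As an example of deriving a scale score from one EMA row, the nine PHQ-9 items can be summed; this sketch assumes the standard 0-3 item coding (total score range 0-27), which should be verified against the codebook before use:

```python
def phq9_sum(row: dict) -> int:
    """Sum the nine PHQ-9 items (total 0-27 under standard 0-3 coding)."""
    return sum(int(row[f"phq9_{i}"]) for i in range(1, 10))

# Fabricated example row; real rows additionally contain all other EMA items.
example = {f"phq9_{i}": 1 for i in range(1, 10)}
print(phq9_sum(example))  # → 9
```

Analogous aggregation applies to the GAD-7, ISI-7, and EUROHIS-QOL items via their codebook labels.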

GPS Location Data (Android and iOS)

Coarse-grained GPS location data were collected from participants who granted explicit location permission at the time of questionnaire completion. These data were recorded once per submission and are stored in a separate file (GPS_Baseline/EMA.csv). The following columns are included:

  • Column A - user_id: Anonymous participant identifier that can be used to match GPS records to corresponding questionnaire entries.

  • Column B - sensordata_collected_at: Timestamp indicating when the GPS data were recorded, typically matching the time of EMA submission.

  • Column C - sensordata_altitude: Altitude in meters above sea level as reported by the device’s GPS sensor (if available).

  • Column D - sensordata_longitude: Longitude in decimal degrees, rounded to 0.1° to ensure spatial privacy (approx. 11.1 km resolution).

  • Column E - sensordata_latitude: Latitude in decimal degrees, similarly rounded to 0.1°.

No continuous tracking was performed; GPS coordinates were obtained only at the time of response submission. The spatial resolution was intentionally limited to preserve user privacy and comply with GDPR. These data allow for coarse-grained spatial analyses, such as linking questionnaire responses with regional-level variables (e.g., urban density, local COVID-19 infection rates), while minimizing re-identification risk.
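The coordinate coarsening described above amounts to rounding to one decimal degree, which corresponds to roughly 11.1 km of latitude; a minimal sketch:

```python
def coarsen(lat: float, lon: float) -> tuple:
    """Round coordinates to 0.1 degree (~11.1 km of latitude)."""
    return round(lat, 1), round(lon, 1)

# Fabricated coordinates for illustration.
print(coarsen(49.7913, 9.9534))  # → (49.8, 10.0)
```

Note that 0.1 degree of longitude spans less ground distance at higher latitudes, so the stated 11.1 km resolution refers to the latitude axis.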

App Usage Statistics (Android only)

App usage data were collected from Android users who granted explicit sensor permission and are available in respective files (APP_Baseline/EMA.csv) as structured columns prefixed with appdata_. These data were recorded once per response submission, corresponding to the day of participation. The columns include general screen usage metrics as well as detailed statistics for the five most frequently used apps on that day.

The following columns are included (column letters refer to APP_Baseline/EMA.csv):

  • Column A - user_id: Anonymous user identifier (also used for questionnaire data).

  • Column B - appdata_apps: JSON object listing all apps used on the day of submission, including social media and top 5 apps.

  • Column C - appdata_beginTime: Timestamp (Unix format) of the first app usage event on the day.

  • Column D - appdata_collected_at: Timestamp when app usage data were collected (i.e., questionnaire submission time).

  • Column E - appdata_endTime: Timestamp (Unix format) of the last app usage event on the day.

  • Column F - appdata_screenTime_activeTimes: JSON or structured list of intervals during which the phone was actively used.

  • Column G - appdata_screenTime_useTimes: Total screen-on time across all apps.

  • Column H - appdata_sleepTimes: Inferred inactive periods, approximating rest or sleep phases (JSON or range format).

For each of the five most frequently used apps, the dataset contains the following fields, repeated in blocks of four columns (Columns I-AB):

  • Column I, M, Q, U, Y - appdata_top5Apps_appX_packageName: Package name of app X (e.g., com.whatsapp).

  • Column J, N, R, V, Z - appdata_top5Apps_appX_completeFGServiceUseTime: Time app X spent running in foreground service mode.

  • Column K, O, S, W, AA - appdata_top5Apps_appX_completeUseTime: Total foreground usage time for app X.

  • Column L, P, T, X, AB - appdata_top5Apps_appX_dailyValues: Encoded JSON or structured summary of hourly usage or usage events for app X.

App names are reported as Android package identifiers. All app usage data are aggregated per day and linked to each questionnaire entry. These columns follow the user metadata and precede the questionnaire responses in the dataset. In general, they are only populated if app usage permission was granted by the participant.
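For analysis, the repeated four-column blocks are often easier to handle in long format (one row per user and app slot). A minimal pandas sketch on an invented two-row stand-in; package names and durations are illustrative, and only two of the five app slots are shown:

```python
import pandas as pd

# Invented stand-in for APP_Baseline/EMA.csv; the real files repeat the
# four documented fields for app slots 1-5 (Columns I-AB).
app = pd.DataFrame({
    "user_id": ["a1", "a2"],
    "appdata_top5Apps_app1_packageName": ["com.whatsapp", "com.instagram.android"],
    "appdata_top5Apps_app1_completeUseTime": [5400, 3600],
    "appdata_top5Apps_app2_packageName": ["com.spotify.music", None],
    "appdata_top5Apps_app2_completeUseTime": [1200, None],
})

# Stack the per-app blocks: one row per (user, app slot), dropping
# empty slots for users with fewer recorded apps.
records = []
for i in (1, 2):  # real data: range(1, 6)
    sub = app[["user_id",
               f"appdata_top5Apps_app{i}_packageName",
               f"appdata_top5Apps_app{i}_completeUseTime"]].copy()
    sub.columns = ["user_id", "package", "use_time"]
    records.append(sub)
usage_long = pd.concat(records, ignore_index=True).dropna(subset=["package"])
```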

Technical Validation

The Corona Health mHealth app incorporated multiple layers of technical validation to ensure the accuracy, consistency, and reliability of the collected data. Upon initial use, participants were required to authenticate themselves through a registration process, during which the app automatically generated and managed an anonymized authentication account; users did not need to provide any credentials. This process generated a unique anonymous identifier that was used to enforce data integrity rules across the platform. Figure 5 shows the onboarding process for a user when starting the app and registering for a study (see7 for more technical information).

Fig. 5 Flowchart of the Onboarding Process.

Technically, the system was designed to guarantee that certain data could only be submitted once per participant. In particular, the baseline demographic data and initial mental health assessments were locked to a single submission per user, while repeated EMA responses were versioned and linked to the same identifier, enabling longitudinal tracking without redundancy.

All server-side endpoints that perform write operations (e.g., submitting questionnaire responses or sensor data) enforced a multi-layered validation pipeline before any data were persisted. First, the system verified authentication, confirming that the incoming request originated from a registered user with a valid anonymous session token. Second, an authorization check ensured that the authenticated user held the appropriate privileges to perform the requested operation (e.g., submitting a baseline response only if no prior baseline existed for that user, or submitting an EMA response only if the user was enrolled in the corresponding study). Third, a syntactical validation step confirmed that the JSON payload transmitted to the server conformed to the expected structural format, rejecting malformed or incomplete request bodies before further processing.

Beyond structural integrity, two layers of semantical validation were applied: the system first verified that all fields marked as required in the questionnaire schema were present and non-empty (i.e., that all mandatory items had been answered), and then checked whether the submitted values fell within the permissible ranges defined for each question type. For instance, Likert scale items were validated against their predefined minimum and maximum values, slider-based inputs were checked against their configured range and step parameters, and single-choice items were verified against the set of allowed response codes as specified in the codebook. Responses failing any of these validation steps were rejected by the API and not stored in the database, ensuring that only structurally and semantically valid data entered the system.
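The two semantic layers described above can be illustrated with a short sketch. The codebook structure, item names, and field names below are invented for the example and do not reflect the actual TrackYourHealth server implementation:

```python
# Invented mini-codebook: item -> validation rule (illustrative only).
codebook = {
    "phq_1": {"type": "likert", "required": True, "min": 0, "max": 3},
    "stress_slider": {"type": "slider", "required": False,
                      "min": 0, "max": 100, "step": 5},
    "gender": {"type": "single_choice", "required": True, "allowed": {1, 2}},
}

def validate(payload: dict) -> list[str]:
    errors = []
    for item, rule in codebook.items():
        value = payload.get(item)
        # Layer 1: required items must be present and non-empty.
        if value is None:
            if rule["required"]:
                errors.append(f"{item}: missing required answer")
            continue
        # Layer 2: values must fall within the permissible range.
        if rule["type"] in ("likert", "slider"):
            if not (rule["min"] <= value <= rule["max"]):
                errors.append(f"{item}: {value} outside "
                              f"[{rule['min']}, {rule['max']}]")
            elif rule["type"] == "slider" and (value - rule["min"]) % rule["step"]:
                errors.append(f"{item}: {value} violates step {rule['step']}")
        elif rule["type"] == "single_choice" and value not in rule["allowed"]:
            errors.append(f"{item}: code {value} not in allowed set")
    return errors

print(validate({"phq_1": 2, "gender": 1}))  # -> [] (accepted)
print(validate({"phq_1": 7, "stress_slider": 42, "gender": 1}))  # -> 2 errors
```

Responses yielding a non-empty error list would be rejected by the API rather than stored, mirroring the behavior described above.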

This robust validation framework was implemented within a secure RESTful API architecture, based on the TrackYourHealth platform15, which supports version-controlled survey deployment, multilingual item consistency, and real-time rule enforcement. Together with compliance to GDPR, MDR, and IEC 62304/82304 software validation standards, these technical safeguards ensured the methodological rigor and quality of the data collected throughout the multi-year study period.

In addition to the real-time validation mechanisms enforced during data collection, several post-export quality assurance steps were performed on the released dataset. It was verified that all column headers in Baseline.csv and EMA.csv match the corresponding label entries in the codebook, ensuring correct harmonization between questionnaire definitions and stored responses. Cross-file linkage integrity was confirmed by checking that all user_id values in the sensor data files (GPS_Baseline/EMA.csv, APP_Baseline/EMA.csv) have corresponding entries in the questionnaire response files. Missing data patterns were examined across all response variables; item-level missingness is present due to participant attrition, skipped non-mandatory items, and varying engagement over time. The temporal distribution of responses was inspected to confirm that all timestamps fall within the documented collection period (i.e., July 2020 to January 2025) and that no implausible patterns, such as duplicate timestamps or pre-launch submissions, are present.
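Two of these checks, cross-file linkage and temporal plausibility, can be reproduced by data users in a few lines. The frames and the created_at column name below are stand-ins, not the released schema:

```python
import pandas as pd

# Miniature stand-ins for EMA.csv and GPS_EMA.csv (invented values;
# "created_at" is an assumed timestamp column name).
ema = pd.DataFrame({
    "user_id": ["a1", "a2", "a3"],
    "created_at": pd.to_datetime(["2020-08-01", "2021-03-15", "2024-12-30"]),
})
gps = pd.DataFrame({"user_id": ["a1", "a3"]})

# Cross-file linkage: every sensor record must match a questionnaire row.
assert gps["user_id"].isin(ema["user_id"]).all()

# Temporal plausibility: all timestamps inside the collection window.
start, end = pd.Timestamp("2020-07-01"), pd.Timestamp("2025-01-31")
assert ema["created_at"].between(start, end).all()
```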

The reliability and scientific utility of the dataset are further demonstrated by its prior use in peer-reviewed research. A total of nine publications have already drawn on the dataset described in this data descriptor (see Table 1), demonstrating its value across a range of psychological, behavioral, and digital health research domains.

Usage Notes

Several aspects should be considered by researchers aiming to reuse the dataset effectively. The data are provided in structured files, including baseline and repeated EMA responses (Baseline/EMA.csv), accompanied by a comprehensive multilingual codebook (Codebook_Baseline_EMA.xlsx). The codebook defines each item, question format, and predefined answer options across all supported languages, facilitating consistent variable mapping.

In general, Tables S.1 and S.2 in the Supplementary Materials provide an overview of participant demographics and study engagement. The baseline sample (n = 2,704) was predominantly female (54.3%), German nationals (98.0%), and in a partnership (60.6%). The age distribution was concentrated in the 25–54 age range (66.8%), with smaller representation among younger adults aged 18–24 (13.0%) and older adults aged 65 and above (4.5%). The majority of participants held academic-level qualifications (42.1%), and most reported no COVID-19 infection at baseline (92.6%). Regarding platform use, Android devices were more common (66.9%) than iOS (33.1%), and 1,981 participants granted GPS tracking permission while 416 Android users permitted app usage tracking. Of the 1,488 participants who completed at least one EMA, engagement varied considerably, with a median of 8 responses per participant (range: 2–196). Response frequency was highest during 2020–2021 (12,254 responses combined), declining substantially in subsequent years as the pandemic progressed.

Researchers should account for this temporal distribution and the potential for selection bias when interpreting findings or generalizing results. Compared to the general German adult population, women are slightly overrepresented, while older adults aged 65 and above are substantially underrepresented. The sample is considerably more educated and almost exclusively comprises German nationals46. These characteristics are consistent with biases commonly observed in voluntary smartphone-based research, where younger, more educated, and more digitally engaged participants tend to be overrepresented, while older adults and those with migration backgrounds are underrepresented47. Researchers should exercise caution when generalizing findings to the broader German population, particularly to older adults, individuals with lower educational attainment, and those with migration backgrounds.

Due to the longitudinal design, EMA responses were completed at varying intervals. Each response includes a timestamp, but time gaps between assessments differ across participants and should be accounted for in temporal analyses. Additionally, missing data are present due to participant attrition or skipped items. Researchers are encouraged to examine missingness patterns and apply suitable imputation or exclusion strategies. In rare cases, values may be present in the dataset that fall outside the allowed response ranges as defined in the codebook (e.g., invalid scale values or out-of-range categorical codes). These anomalies likely reflect app-level data entry errors or unexpected client-side behavior and should be identified and handled during data preprocessing.
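One simple preprocessing strategy is to set such out-of-range values to missing using the permissible ranges from the codebook. A minimal pandas sketch with invented item names and ranges:

```python
import pandas as pd

# Invented (item -> min/max) ranges; in practice these come from the
# codebook (Codebook_Baseline_EMA.xlsx).
ranges = {"phq_1": (0, 3), "gad_1": (0, 3)}
ema = pd.DataFrame({"phq_1": [0, 3, 9], "gad_1": [1, 2, 2]})

for col, (lo, hi) in ranges.items():
    # mask() replaces out-of-range entries with NaN so they are treated
    # as missing in downstream analyses.
    ema[col] = ema[col].mask(~ema[col].between(lo, hi))

print(ema["phq_1"].isna().sum())  # -> 1 invalid value flagged
```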

Sensor data are available for participants who explicitly granted permission at the time of questionnaire completion. These data include coarse-grained GPS coordinates corresponding to an approximate spatial resolution of 11.1 km and app usage statistics, the latter collected only on Android devices. Both types of sensor data were captured opportunistically and linked to individual questionnaire submissions rather than through continuous tracking. At baseline, 1,981 participants granted GPS tracking permission, while 416 Android users permitted app usage tracking. App usage data include total daily screen time, foreground usage durations for the five most frequently used and predefined social media apps, foreground service (background) activity, and intervals of activity and inactivity that can approximate behavioral routines such as sleep. These mobile sensing features provide valuable context for interpreting mental health assessments and facilitate analyses of digital behavior patterns in relation to psychological states. However, researchers should note that app usage statistics are available exclusively for Android users, limiting generalizability of findings derived from these data to approximately two-thirds of the sample. Analyses combining app usage with mental health outcomes should account for this platform-specific availability and consider potential selection effects associated with Android versus iOS users.

Importantly, researchers should note that GPS coordinates were collected only at the time of questionnaire submission, not through continuous tracking. As such, the location data reflect where participants completed assessments rather than comprehensive mobility behavior. This design is well-suited for linking mental health responses to regional-level contextual factors (e.g., local COVID-19 incidence rates, urban versus rural environment, regional socioeconomic indicators) but is not appropriate for analyses requiring true mobility patterns, movement trajectories, or home range estimations. Studies investigating associations between GPS-derived mobility indices and mental states would require continuous location sampling, which was not implemented in Corona Health due to privacy considerations and battery consumption constraints. Suitable analyses with the available GPS data include examining whether regional characteristics predict mental health outcomes, comparing urban versus rural respondents, or linking responses to publicly available area-level data (e.g.,21). In contrast, researchers should avoid inferring individual mobility behavior, social contact patterns based on location changes, or time spent at home versus away from the single-timepoint GPS measurements.
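Because coordinates are rounded to 0.1°, area-level linkage reduces to an exact-key join on the rounded latitude/longitude pair. A sketch with an invented area-level table (the urban flag and all values are illustrative):

```python
import pandas as pd

# GPS records as released: coordinates pre-rounded to 0.1 degrees.
gps = pd.DataFrame({
    "user_id": ["a1", "a2"],
    "sensordata_latitude": [49.8, 52.5],
    "sensordata_longitude": [9.9, 13.4],
})
# Hypothetical area-level indicator table keyed on the same 0.1-degree grid.
areas = pd.DataFrame({
    "lat": [49.8, 52.5],
    "lon": [9.9, 13.4],
    "urban": [False, True],
})

linked = gps.merge(
    areas,
    left_on=["sensordata_latitude", "sensordata_longitude"],
    right_on=["lat", "lon"],
    how="left",
)
```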

To ensure participant anonymity and adhere to GDPR regulations, certain responses were removed or generalized prior to data release. For example, items such as profession in healthcare and low-frequency demographic characteristics were excluded or masked to prevent potential re-identification. These modifications are described in the Data Records section and should be taken into account when conducting subgroup analyses.

No specific scripts or software are required to analyze the dataset. However, it is recommended to use data manipulation tools such as Python (e.g., pandas) or R (e.g., tidyverse) for efficiently handling the structured data, particularly when aligning participant responses with the corresponding questionnaire items defined in the codebook.
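For example, the codebook's label column can be used to attach human-readable item texts to the response columns. The two-column codebook view and truncated question texts below are an assumed simplification of Codebook_Baseline_EMA.xlsx:

```python
import pandas as pd

# Assumed simplified view of the codebook: unique label plus English
# question text (texts abbreviated here for illustration).
codebook = pd.DataFrame({
    "label": ["phq_1", "gad_1"],
    "question_en": ["Little interest or pleasure...", "Feeling nervous..."],
})
ema = pd.DataFrame({"phq_1": [2], "gad_1": [1]})

# Build a label -> question mapping and rename the response columns.
rename_map = dict(zip(codebook["label"], codebook["question_en"]))
readable = ema.rename(columns=rename_map)
```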

For technical details regarding the architecture, implementation, and methodological framework of Corona Health, researchers are referred to the related publication7. In addition, researchers are encouraged to consult the studies that have already utilized this dataset (see Table 1), as they may offer methodological insights and comparative benchmarks for secondary analyses.

Note that some responses from the original data were removed or modified due to the risk of participant re-identification when considering combinations of certain items. For example, the baseline questionnaire included a question about whether the respondent worked in a healthcare profession. Due to the specificity of this question and the relatively small number of individuals it applies to within the sample, it posed a potential risk of re-identification, especially when considered alongside other sociodemographic variables such as age, gender, and geographic region. To mitigate this risk and comply with data protection regulations, responses to this item were removed from the shared dataset.

Additionally, 22 participants reported having a diverse gender identity. Given the low frequency of this response and its potential to allow indirect identification, these participants were masked, and their gender category was randomly replaced with either male or female in the publicly available dataset. This modification was performed solely for data protection purposes and does not reflect the original responses of the affected individuals. The approach follows a practice used by the Federal Statistical Office of Germany, where cases with the gender categories “diverse” or “unknown” are redistributed to the categories “male” and “female” using a predefined recoding procedure, as these categories currently cannot be reported separately for methodological reasons48. Furthermore, the data were not deleted, as doing so would raise ethical concerns: collecting information from gender-diverse individuals only to later exclude it from analysis would be inappropriate and contrary to the principles of inclusive and respectful research practice.