Background & Summary

The COVID19-Snapshot Monitoring (COSMO) is a research project that aimed to gain insights into how the German population perceived the COVID-19 pandemic, the political measures taken to contain the pandemic and its psychosocial impact on everyday life in Germany between March 2020 and November 2022. The collected data were intended to provide political stakeholders, the media and the population with helpful knowledge about the psychological aspects of the pandemic and to prevent misinformation (e.g., regarding the acceptance of political measures) and knee-jerk reactions based on unfounded assumptions about how members of the public feel or behave1,2. In addition, the pandemic yielded a unique opportunity for social and behavioural scientists to study human behaviour in a crisis. The evidence gathered can support future pandemic preparedness3. In addition, the presented data can help evaluate some of the predictions made by behavioural science and thus also add to the discussion of behavioural science’s usefulness in guiding crisis responses4. For example, in the COSMO study, efforts were made to contextualise behaviours and public health measures that were heavily debated in the media, such as mask wearing5, rapid testing6 and nationwide lockdowns7. Future users can make use of the data by assessing perceptions in relations to such policy changes (e.g. as collected by the Oxford Covid-19 Government Response Tracker (OxCGRT8)).

Methods

A methodological preprint published at the survey’s outset9,10 was utilised to establish a standard protocol, which was recommended by WHO Europe11 for use as a WHO-tool12 and adopted by about 40 countries (e.g., Denmark13, Spain14, Finland15, Iran16).

Ethical review

Ethical clearance was obtained from the University of Erfurt’s institutional review board (#20200302/2020831/20200501). It was approved that the study follows the ethical guidelines of the University of Erfurt, that the safety and well-being of the subjects was ensured at any time, that there were no ethical concerns for the recruitment of the participants, the implementation of the study design or the subsequent data analysis and processing, and that the undertaken measures safeguard the ethical standards as proposed by the DGPs (German Psychological Society). Participants gave informed consent to take part in the study voluntarily, with the understanding that their data would be used solely for scientific purposes and for supporting public health communication efforts. They were informed that the anonymized data would be stored permanently and may be shared with other researchers for future scientific use. Informed consent was obtained through an online consent form, presented prior to the start of the actual study: Participants had to actively agree to the terms and confirm their understanding before proceeding to the questionnaire. To ensure privacy and data protection, all data was collected in an anonymized way. No personally identifying information (such as names or contact details) was collected or stored.

Sampling method

The project comprised 69 cross-sectional surveys conducted in Germany between March 2020 and November 2022. While the surveys were performed weekly to biweekly in the first year of the project, the frequency of survey administration decreased later. Each survey took between 20 and 30 minutes and was completed by approximately 1,000 individuals (N = 69,013). The data provider Bilendi GmbH recruited and compensated the respondents. Data collection typically took place between Tuesday afternoon and Wednesday at midnight during the data collection weeks. Such a short data collection period was chosen as at some timepoints during the pandemic COVID-19 cases, decisions and media debates about political measures were advancing quickly. As the survey aimed to provide a rather exact snapshot, the data were collected within roughly 36 hours. The sampling was quota-representative in terms of age (18–74 years of age) and gender (crossed), as well as the German federal states, according to census data17. Because the quota sample was based on the German census data, which did not include more than two genders at the time, only men and women were sampled by quota. For a better impression of the sample, Table 1 provides the quotas in the sample in comparison to the available census data. As can be seen, the sample was higher educated than the German population, but please note that this was not a stratification criterion. When quotas were completed, participants were screened out after providing their quota-relevant demographics.

Table 1 Representativeness of the sample as compared to 2011 census data.

Measurements

Table 2 provides an overview of constructs available in the dataset. Note that for most constructs there is a set of items or a validated scale available. The questionnaire covered socio-demographic information; the monitoring of risk perceptions18; emotional responses to the pandemic19,20,21; self-efficacy22,23, trust in institutions24; adherence to protective measures; a range of concerns; general burden and wellbeing; vaccination-related attitudes, intentions and behaviours; attitudes towards upcoming and implemented policies and conspiratorial tendencies25,26. All original questionnaires are available online (https://osf.io/w38cn/). Note that the single questionnaires may contain variables not included in the dataset, because there were not assessed at least five times, and this was the criterion for including the variables in this data set.

Table 2 Summarised overview of constructs available in the dataset.

Due to extreme time pressure, high workload and limited resource there was no formalized advisory process regarding which variables to assess or to drop from the study; topics were gathered from press briefings and the news, interactions with study partners, meetings with government and administration representatives and discussions with other scientists (see acknowledgements). Thus, the process can be described as rather bottom up, problem-oriented and use-inspired, partially due to the speed of changes happening during the pandemic as well as the short intervals between the data collections. Most variables and items were therefore newly constructed as they responded to events or behaviours that became important over the course of the pandemic. Yet, there are several validated scales that were used repeatedly, namely the 5 C vaccination readiness27, conspiracy beliefs25,26, pandemic fatigue28, wellbeing (WHO520,21), resilience29 and satisfaction with life30. Detailed information about the items included in the data set, their measurement and English translations are presented in the codebook, which is also available in the repository31.

Data Records

The dataset is available at PsychArchives31 and provides as a csv and Excel file. For use in R we suggest using the csv file. All survey measures that were assessed in at least five surveys were considered for publication. This selection was made to create larger samples per potential question and identify themes and variables that were relevant for a substantial period of time. Some variables were assessed only once or belonged to single experiments and are not presented here. The data set contains a column for each measure and a row for each participant. Two columns identify the time point of the sampling in the form of the chronological number of the survey (TIME, numerical order from 1 to 70; the survey TIME = 36 was used for a related longitudinal follow-up survey, and is not included in the cross-sectional dataset described here), as well as the specific survey date (TIME_NOMINAL, ranging from 03.03.20 to 29.11.22; indicating the day on which the 36-hour data collection period began). Note that there is an additional time-point indicated by TIME = 71 from September 2023 that only includes socio-demographics, vaccination readiness and intention to get vaccinated. We do not consider this a full wave but report this follow up for completeness.

Technical Validation

The data provider guaranteed the data quality. The company regularly checked the quality of participants’ answers in its own surveys and removes bots and unreliable participants. We further ensured the cross-sectional nature of the data collection by not inviting participants who had participated in one of the previous 20 weeks to reduce conditioning effects. An additional check was also run in R, which identified and removed duplicates within a wave and in relation to previous waves for which a participant would still be blocked. Other than that, no validation or clean-up processes were implemented. Subsequent users of the dataset can further clean the data (e.g., by removing participants with fast or slow survey durations or whose answer patterns indicate inattentive responding).

Usage Notes

The presented dataset is unique, as it covers behavioural data from the pandemic from its beginning across almost three important and eventful years. The data collections were highly adaptive, which led to the rapid generation of items. While this confers the advantage that the themes and topics that were brought up by the media, society or politics could be integrated swiftly, creating a thorough theoretical foundation for each item or research question was not always possible. In addition, the large quantity of data allows us to explore a nearly endless number of relationships. While we believe that exploring the data may be valuable, confirmative research will require preregistration. We therefore suggest preregistering the research question and hypotheses at hand.

The data is made available for public use (Sharing level 0), allowing open and immediate access (https://psycharchives.org/en/about#sharing_levels).

COVID-19 Behavioural data dashboard

In order to explore the data in an interactive way, we also provide code for the COSMO Explorer (https://explore.healthpsychology.uni-bamberg.de/shiny/cosmo-explorer-en/), a web-based app that allows users to directly investigate variables (Fig. 1). The variables in the final version include risk perceptions, several protective behaviours, worries and fears, trust, attitudes towards several implemented and discussed measures and regulations, willingness to be vaccinated and several reasons for or against vaccination and belief in conspiracies. Details about the measurement of each variable (e.g., the measurement range) are directly described in a table below the visualisation. The recent version visualises:

  1. (1)

    The development of variable means over time (as shown in Fig. 1): The app allows users to examine how one or multiple variables unfold over time. Variables can be stratified according to various dimensions (as exemplified in Fig. 1), including demographics, such as age, gender, or education, as well as the acceptance of implemented response measures, allowing us to compare specific subgroups in the population. Confidence intervals allow users to estimate significant differences between timepoints and groups.

  2. (2)

    Development of variable distributions over time: The app visualizes the change of distributions for single variables. This information can help to better understand changes in mean values.

  3. (3)

    Correlations between variables: The app visualises relationships between variables by displaying the development of correlation coefficients over time (e.g., risk perception and acceptance of policies). While causation cannot be implied, information about the relationships between perceptions, attitudes and behaviours allows theory testing or development and can indicate important targets for future research questions, interventions and measures.

Fig. 1
figure 1

User interface for the COSMO Explorer. The interactive app allows users to investigate the development of variable means and distributions, as well as their correlations over time. *Major events: (1) first lockdown, 22.03.2020; (2) loosening of restrictions, 06.05.2020; (3) launch of Corona-Warning App, 16.06.2020; (4) first large protests in Berlin, 01.08.2020; (5) partial lockdown, 02.11.2020; (6) first vaccines undergoing approval, 15.11.2020; (7) lockdown, 16.12.2020; (8) first loosening of restrictions after lockdown, 01.03.2021; (9) 3 G regulations (access to public spaces when vaccinated, tested or recovered), 24.11.2021; (10) new government taking over, 08.12.2021; (11) expiration of 3 G regulations, 19.03.2021.

The development of variables and their correlations can be interpreted against the background of incidence rates and important pandemic events, which can also be displayed in the app (see Figure note; for more events consult respective repositories, e.g.8). The COSMO Explorer can be adapted to visualize COSMO data from other countries as well. The app is implemented in R using the Shiny framework. Researchers from other countries are invited to use the code and add other variables included in the dataset to the explorer.

Limitations

The urgency of the situation incurs some limitations to the study, including limited opportunities for scientific review and validation, as described above. This was also pointed out in the WHO standard protocol that was developed based on this work11.