Introduction

Since mid-2020, a condition characterised by variable and disabling persistent symptoms following SARS-CoV-2 infection has become a public health concern in many countries. This condition is diversely known as ‘long COVID’1,2, ‘post-acute COVID-19 syndrome’3,4, ‘post-COVID-19 condition’5,6 or ‘post-acute sequelae of COVID-19’7 among other denominations. An even more diverse range of definitions has been used for this condition, although most are based on the notion of persisting symptoms that emerged during infection and/or within a certain time frame and that continue over an abnormally long period of time5,8,9. As a result, long COVID prevalence has been found to vary widely in the population10,11,12,13.

The various factors reported to be associated with this condition may be grouped into several categories: sociodemographic characteristics (sex, age, ethnicity, education, income, place of residence)14,15,16,17,18,19,20,21,22,23,24,25,26, health status (pre-existing physical and mental comorbidities)16,17,19,21,22,23,24,25,26,27,28,29,30, health behaviours and lifestyle (smoking and alcohol use, physical activity, vaccination)17,18,20,21,31,32, work-related factors (occupational status, type of occupation, occupational changes, work-related infection)17,25,33,34, infection-related factors (number of infections, severity, number and type of symptoms)16,18,21,23,24,35,36, use of the health system35 as well as perceived stress, loneliness or concerns about COVID-1927,37. However, these factors were mostly assessed without comprehensively considering the confounding and (potentially) mediating pathways of effect. In particular, relative risk estimates are often unadjusted or minimally adjusted for age and sex or, on the contrary, multi-adjusted for heterogeneous factors, which has led to a confused view of the factors at play and obscured the relevant risk pathways for understanding long COVID and its prevention. These limitations also weaken the relevance of the systematic reviews and meta-analyses conducted to date38,39,40,41,42.

This incomplete and somewhat lacking aetiological knowledge generates uncertainty and hampers the development of sound prevention strategies. Moreover, most aetiological studies to date have been conducted in individuals infected with SARS-CoV-2 in healthcare settings, which results in an overrepresentation of individuals with severe forms of COVID-19. Most of these studies also consider long COVID as a complication specific to SARS-CoV-2 infection, whereas the main features of this condition do not appear to be specific. The persistent symptoms that define long COVID, including fatigue, breathlessness, disturbed attention, pain, swinging mood, anxiety and sleep disturbances, can be experienced after various infections or independently of any infection43,44,45. Furthermore, persistent symptoms are frequent in various medical conditions, suggesting that some mechanisms may not be specific to long COVID but may rather encompass biological, psychological and social factors46.

Based on a nationwide random sampling survey conducted in France after the Omicron waves in autumn 202213, the present study was designed to address some of the aforementioned shortcomings of previous studies and to assess factors associated with long COVID in a structured epidemiological investigation. The study therefore uses two control groups (infected without long COVID and never infected) and a step-by-step process to assess the factors categorised according to a conceptual model that accounts for the relationships between these factors. To improve the robustness of the results, four definitions of long COVID were used. Three of them are symptom-based and require a different number of symptoms (≥1 or ≥2), a maximum delay of occurrence from infection (≤3 months vs unlimited) and different impacts on daily living; and the fourth definition considers only the participant’s perception of having long COVID. These definitions may capture different aspects of the condition, which may be associated with distinct factors. Very few studies to date have addressed the issue of different definitions13 and assessed the robustness of their results.

Our findings reveal a broad spectrum of factors associated with long COVID, pertaining either to basic and early defined sociodemographic characteristics (age, sex), current socioeconomic position (household size, financial security and retirement), physical and mental comorbidities (number of pre-existing chronic diseases, respiratory disease, mental and sensory disorders), health behaviours (SARS-CoV-2 vaccination), short-term work-related factors (impact of COVID-19 pandemic on occupation and work conditions), SARS-CoV-2 infection-related factors (number of SARS-CoV-2 infections and number of initial symptoms) or COVID-19-related representations (overall perceptual experience of COVID-19 severity, long COVID information). These results strongly suggest that long COVID should be viewed not only as a complication of SARS-CoV-2 infection but also as part of a broader network of contextual factors that influence the individual’s risk of long COVID far beyond SARS-CoV-2 infection.

Methods

Survey stages and data collection

Between 2 September and 31 December 2022, 10,615 participants aged ≥ 18 years and living in mainland France were selected using a sampling method based on the random digit dialling of landline and mobile telephone numbers (participation rate: 44%, Fig. 1 and Supplementary Table 1)13. In the first stage, they were interviewed by telephone using the computer-assisted telephone interviewing (CATI) system. Information was collected on their sex, age, socioeconomic characteristics and all SARS-CoV-2 infections occurring up until the day of the interview (date of onset, diagnosis approaches, workplace infection, hospitalisation and intensive care unit admission, Supplementary Table 2). Participants were further asked about their present symptoms (from a list of 31 symptoms in random order; Supplementary Table 2) with details about the date of onset, alternative diagnoses and impact on daily functioning (Supplementary Table 2). Participants’ overall perception about having had long COVID was also assessed.

Fig. 1: Flow chart of the study.

*Sampling ratio 1/2.0. †Sampling ratio 1/5.7 (sampling ratio of 1/7.0 during the first 3 weeks of the survey and 1/5.0 thereafter; the change is due to the higher levels of SARS-CoV-2 infection than initially expected).

In the second stage, three sub-samples of participants were invited to continue the survey on an online platform (computer-aided web interview, CAWI), which collected detailed information about pre-existing chronic conditions (from a list of 20 conditions plus ‘other’ commonly used in French health surveys47), health behaviours (smoking, alcohol use, physical activity), vaccination, healthcare use in the past 12 months, social support and impact of the COVID-19 pandemic on income, occupation and social life. To ensure an adequate sample size (>150) in each group, the sampling ratio was set to 1.0 for participants with suspected long COVID, 2.0 for those reporting SARS-CoV-2 infection without long COVID, and 5.7 for those who did not report infection (Fig. 1).

To limit the selection and information biases, the study was presented to participants as a health survey after the COVID pandemic, without specifying its particular objective concerning long COVID. Nevertheless, professionals conducting the interviews were informed that one goal of the study was to identify participants with long COVID, although they had to follow the algorithms implemented in the CATI system in which SARS-CoV-2 infections and symptoms were recorded in separate parts of the questionnaire; the investigated factors were recorded in the CAWI stage of the study.

Ascertainment of long COVID

Four definitions were used to identify long COVID (Supplementary Table 3): first, standard definition of World Health Organisation post-COVID-19 condition (WHO-PCC) if at least one symptom from the list of 31 appeared within 3 months of a probable SARS-CoV-2 infection, lasted for at least 2 months, had an impact (even low) on daily functioning and, according to a physician, could not be explained by an alternative diagnosis; second, strengthened WHO-PCC definition: ‘moderate or strong impact’ WHO-PCC in which at least one symptom had at least a moderate impact on daily functioning; third, National Institute for Health and Care Excellence (NICE) definition of long COVID if participants reported a cluster of at least two symptoms that appeared after SARS-CoV-2 infection, lasted for at least 12 weeks post-infection and could not be explained by an alternative diagnosis; fourth, self-reported ‘perceived’ long COVID if the participant positively answered the question ‘Do you think that you have had a long form of COVID-19?’.
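The four definitions can be read as decision rules applied to each participant's symptom records. The following Python sketch illustrates that reading; it is not the survey's actual algorithm. The field names, the impact coding (0 to 3) and the day thresholds (3 months taken as 90 days, 2 months as 60 days, 12 weeks as 84 days) are assumptions introduced here for illustration.

```python
from dataclasses import dataclass

@dataclass
class Symptom:
    onset_day: int   # days from infection to symptom onset (assumed coding)
    last_day: int    # days from infection to interview, symptom still present
    impact: int      # 0 none, 1 low, 2 moderate, 3 strong (assumed coding)
    explained: bool  # True if a physician gave an alternative diagnosis

def who_pcc(symptoms, min_impact=1):
    """At least one symptom: onset within ~3 months, lasting >= ~2 months,
    with impact on daily functioning and no alternative diagnosis."""
    return any(
        0 <= s.onset_day <= 90
        and s.last_day - s.onset_day >= 60
        and s.impact >= min_impact
        and not s.explained
        for s in symptoms
    )

def nice_long_covid(symptoms):
    """At least two unexplained symptoms arising after infection and
    still present >= 12 weeks post-infection."""
    qualifying = [
        s for s in symptoms
        if s.onset_day >= 0 and s.last_day >= 84 and not s.explained
    ]
    return len(qualifying) >= 2

def classify(symptoms, perceived):
    """Return the status under each of the four definitions."""
    return {
        "who_pcc": who_pcc(symptoms),
        "who_pcc_strengthened": who_pcc(symptoms, min_impact=2),
        "nice": nice_long_covid(symptoms),
        "perceived": perceived,
    }

# Hypothetical participant: one early low-impact symptom, one late strong one
example = [Symptom(10, 120, 1, False), Symptom(200, 300, 3, False)]
result = classify(example, perceived=False)
```

In this hypothetical case the participant meets the standard WHO-PCC and NICE rules but not the strengthened or perceived ones, which shows how the definitions can disagree for the same person.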

Conceptual model of factors involved in long COVID aetiology

Due to the well-known bidirectionality between certain groups of factors such as chronic conditions, health behaviours and working conditions48, which were assessed within the same 1-year time frame before the interview, no directed acyclic graph could be created49. Nonetheless, a conceptual model was constructed from the literature review exploring the factors associated with long COVID with the aim to (1) define relevant sets of factors potentially associated with the condition and the main relationships between these sets and (2) identify the minimal set of covariates to be adjusted for each given set of factors and, symmetrically, discard covariates likely to be intermediates whose control would lead to overadjustment.

The following eight sets of factors, operating in a chronological and potentially causal order, were thus distinguished and evaluated in six stages (Fig. 2): first, basic and early defined sociodemographic characteristics: age, sex, geographic origin and education level; second, current socioeconomic position (SEP) and living conditions: household size, employment status, occupation, employer, household income, financial security (satisfaction), size of place of residence, region and deprivation level of the place of residence evaluated by the French Deprivation Index; third, pre-existing chronic physical and mental comorbidities (diagnosed by a physician): number of diseases and diseases grouped according to the chapters of the International Classification of Diseases; fourth, health behaviours: smoking, alcohol drinking, moderate and vigorous physical activity, sedentariness and SARS-CoV-2 vaccination (number of injections); fifth, short-term work-related factors: type of employer, impact of COVID-19 pandemic on occupation and work conditions, and caregiving of elderly or disabled relative. Note that the third, fourth and fifth sets were tested simultaneously at the same third stage, since no order of precedence could be firmly established due to the bidirectional relationships between factors in these sets. Sixth, SARS-CoV-2 infection-related factors: number of infections (at the time of the interview, recorded independently of symptoms, Supplementary Table 2), hospitalisation, number and type of initial symptoms and infection during work; seventh, healthcare and social support: number of general practitioner consultations, number of specialists consulted and contacts with family and friends; eighth, COVID-19-related representations: perception of COVID-19 severity irrespective of one’s own situation and information about long COVID.

Fig. 2: Conceptual model and sets of risk factors for long COVID.

Eight sets of factors for long COVID are distinguished and evaluated in six stages in the statistical analysis (the third, fourth and fifth sets are tested at the same third stage). SEP socioeconomic position, ICU intensive care unit admission.

At each stage, a final model retained the factors independently (statistically) associated with the risk of long COVID. For each set of factors, the optimal adjustments include all the factors retained in the final model of the preceding stage.
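The stage-wise rule can be sketched as a simple loop in which each stage's final model supplies the adjustment set for the next. This Python sketch is illustrative only: `fit_final_model` is a hypothetical placeholder for the model-fitting and selection step, and the toy version below keeps factors from a fixed, made-up set instead of fitting any regression.

```python
def staged_selection(stages, fit_final_model):
    """stages: lists of candidate factor names, in conceptual-model order.
    fit_final_model(candidates, adjust_for) returns the factors retained
    in the final model of that stage (hypothetical callback)."""
    retained = []     # factors kept in the final model of the preceding stage
    per_stage = []
    for candidates in stages:
        retained = fit_final_model(candidates, adjust_for=retained)
        per_stage.append(list(retained))
    return per_stage

# Toy stand-in: pretend these factors are independently associated
TOY_ASSOCIATED = {"age", "sex", "n_comorbidities"}

def toy_final_model(candidates, adjust_for):
    # Keep previously retained and new factors that remain "associated"
    return [f for f in adjust_for + candidates if f in TOY_ASSOCIATED]

stages = [
    ["age", "sex", "education"],
    ["income", "household_size"],
    ["n_comorbidities", "smoking"],
]
per_stage = staged_selection(stages, toy_final_model)
```

Because the callback receives the factors retained so far, a factor kept at an early stage can still drop out later if it is no longer independently associated after further adjustment.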

Statistical analysis

Based on the conceptual model, two series of Poisson regression with robust (sandwich) variance were constructed hierarchically to derive prevalence ratios (PR) and 95% confidence intervals (CI) of long COVID50,51. Two control groups were considered: (1) SARS-CoV-2 infected participants currently without long COVID: a comparison based on the paradigm of long COVID as a specific complication of SARS-CoV-2 infection; and (2) never-infected participants according to the paradigm of long COVID as a non-specific condition, potentially arising from causes other than SARS-CoV-2 infection (Fig. 3). In the first case (comparison with previously infected patients), six series of models were successively constructed to test, stage-by-stage, the factors belonging to the eight aforementioned sets. In the second case (comparison with never-infected participants), infection-related factors were not tested and instead only five series of models were constructed.
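As an illustration of how a prevalence ratio is obtained from a Poisson model with a robust variance, the following self-contained Python sketch fits log E[y] = b0 + b1·x for a single binary factor by Newton-Raphson and computes a sandwich (HC0) variance for b1. It is a minimal, unweighted sketch on made-up data, not the SAS analysis used in the study; the function name and toy counts are assumptions.

```python
import math

def modified_poisson_pr(y, x, n_iter=50, tol=1e-10):
    """Fit log E[y] = b0 + b1*x and return (PR, lower, upper) for x,
    using a robust (sandwich) variance for the 95% CI."""
    b0 = math.log(sum(y) / len(y))  # start at the overall log prevalence
    b1 = 0.0
    for _ in range(n_iter):
        u0 = u1 = a00 = a01 = a11 = 0.0
        for yi, xi in zip(y, x):
            mu = math.exp(b0 + b1 * xi)
            u0 += yi - mu              # score for b0
            u1 += (yi - mu) * xi       # score for b1
            a00 += mu                  # information matrix entries
            a01 += mu * xi
            a11 += mu * xi * xi
        det = a00 * a11 - a01 * a01
        d0 = (a11 * u0 - a01 * u1) / det
        d1 = (a00 * u1 - a01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Sandwich variance: A^-1 B A^-1, with B built from squared residuals
    a00 = a01 = a11 = m00 = m01 = m11 = 0.0
    for yi, xi in zip(y, x):
        mu = math.exp(b0 + b1 * xi)
        r2 = (yi - mu) ** 2
        a00 += mu
        a01 += mu * xi
        a11 += mu * xi * xi
        m00 += r2
        m01 += r2 * xi
        m11 += r2 * xi * xi
    det = a00 * a11 - a01 * a01
    var_b1 = (a01 * a01 * m00 - 2 * a01 * a00 * m01 + a00 * a00 * m11) / det ** 2
    se = math.sqrt(var_b1)
    return math.exp(b1), math.exp(b1 - 1.96 * se), math.exp(b1 + 1.96 * se)

# Toy data: 30/100 cases among exposed, 15/100 among unexposed
x = [1] * 100 + [0] * 100
y = [1] * 30 + [0] * 70 + [1] * 15 + [0] * 85
pr, lo, hi = modified_poisson_pr(y, x)
```

With a single binary factor the fitted PR equals the ratio of the two prevalences (here 0.30/0.15 = 2.0). The robust variance matters because a binary outcome is not Poisson-distributed, so the model-based variance would overstate the uncertainty.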

Fig. 3: Overview of the comparisons performed.

1: Analysis according to the paradigm of long COVID as a specific complication of SARS-CoV-2 infection. 2: Analysis according to the paradigm of long COVID as a non-specific condition, potentially arising from causes other than SARS-CoV-2 infection.

Only statistically significant factors at a given stage (i.e. associated with the risk of long COVID) were considered as a potential explanatory variable at the following stage. At each stage, interactions (reflecting departures from the multiplicative Poisson model used52) were tested between the factors significantly associated with the risk of long COVID. Type 1 error was set at 0.05 (two-sided).
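To make the notion of departure from the multiplicative model concrete, consider two binary factors. The symbols x_1, x_2 and β_0 to β_3 are introduced here for illustration and do not appear in the original analysis:

```latex
\log \mathrm{E}[Y \mid x_1, x_2] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2,
\qquad
\mathrm{PR}_{11} = e^{\beta_1 + \beta_2 + \beta_3}
                 = \mathrm{PR}_{10}\,\mathrm{PR}_{01}\,e^{\beta_3}
```

When β_3 = 0, the prevalence ratio for joint exposure is the product of the two individual prevalence ratios; testing an interaction amounts to testing whether β_3 differs from 0 at the chosen type 1 error level.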

Due to the variability of long COVID definitions and the modest agreement between these definitions13, the robustness of the results was assessed in a series of sensitivity analyses. The standard WHO-PCC definition was used as the primary outcome, while the three other definitions (NICE, strengthened WHO-PCC and self-reported perceived long COVID) were used for sensitivity analyses. To minimise the risk of reverse causality between long COVID and certain factors (e.g. low household income and financial insecurity may increase the risk of long COVID, which can further reduce income and worsen financial security), a fourth sensitivity analysis considered only participants with infections of less than 1 year. It was expected that such reverse pathways would be less likely in a reduced time frame of 1 year.

All estimates of percentages and PR were weighted to account for the selection probability of participants as well as the structure of the French adult population, thus allowing the generalisation of results to the French population. In the first stage of the survey, (1) design weights, reflecting the individual selection probability, were calculated using information about the number of phone numbers generated, the number of phone numbers owned by the respondent, and the number of eligible persons in the household in the case of landlines; (2) design weights were then calibrated to adjust to the French population structure (age, education level, household size, urbanisation and region of residence) as reported by the Labor Force Survey (conducted by the French National Institute for Statistics and Economic Studies53) using the raking ratio54. In the second stage of the survey, re-weighting was performed to account for the different sampling ratios by group (1, 2 and 5.3) and the non-response rate, since respondents differed from non-respondents in terms of age, sex, employment situation, education level and chronic conditions. SAS version 9.2 software (SAS Institute, Cary, NC) was used.
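The raking ratio step can be illustrated with a short Python sketch of iterative proportional fitting on two margins. This is a generic illustration, not the calibration code used in the survey; the function name, the two calibration variables and the target totals are made-up assumptions.

```python
def rake(weights, categories, targets, n_iter=200, tol=1e-10):
    """Adjust weights so that weighted totals match target margins.
    categories: {variable: [label per respondent]}
    targets:    {variable: {label: population total}}"""
    w = list(weights)
    for _ in range(n_iter):
        max_change = 0.0
        for var, labels in categories.items():
            # Current weighted total in each category of this variable
            totals = {}
            for wi, lab in zip(w, labels):
                totals[lab] = totals.get(lab, 0.0) + wi
            # Scale each weight by target / current total of its category
            for i, lab in enumerate(labels):
                factor = targets[var][lab] / totals[lab]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return w

def margin(w, labels, label):
    """Weighted total of one category."""
    return sum(wi for wi, lab in zip(w, labels) if lab == label)

# Toy example: four respondents with hypothetical design weights
design_weights = [1.0, 2.0, 1.5, 1.0]
categories = {
    "age": ["18-49", "18-49", "50+", "50+"],
    "sex": ["F", "M", "F", "M"],
}
targets = {
    "age": {"18-49": 60.0, "50+": 40.0},
    "sex": {"F": 52.0, "M": 48.0},
}
raked = rake(design_weights, categories, targets)
```

Each pass rescales the weights to match one margin at a time, and the passes are repeated until all margins agree with their targets simultaneously. The target margins must share the same grand total for the procedure to converge.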

Consent and ethics

Informed consent was obtained from all individual participants included in the study. The survey planning, conduct and reporting were in line with the Declaration of Helsinki and French laws. The survey was approved by the institutional review board of Santé Publique France, the French Public Health Agency, on 19 August 2022.

Results

Characteristics of participants

A total of 1813 participants completed the detailed interview (CAWI) with a response rate of 43%, similar across the three groups (Fig. 1). Overall, 55% of CAWI participants had a probable SARS-CoV-2 infection, with 88% reporting a positive test and only a very small proportion being hospitalised (1%). Among infected participants, 7.1% met the standard WHO-PCC definition, of whom 60% were infected for the first time within the past year and 64% had at least one symptom with at least moderate impact on daily functioning (strengthened WHO-PCC definition). Further, 7.3% met the NICE definition and 13.0% had self-reported perceived long COVID. The overlap between the long COVID definitions was less than 50% for most definitions as shown by the Venn diagrams (provided in Supplementary Fig. 1). Among participants meeting the standard WHO-PCC definition, 71% were women, 82% were aged between 25 and 64 years and 67% were employed. The main symptoms reported and the proportion of symptoms meeting the standard WHO-PCC definition are given in Supplementary Table 4.

Factors associated with long COVID

The characteristics of participants according to the main categories (WHO-PCC, non-WHO-PCC infected and never-infected) are shown in Supplementary data file 1, along with PRs, both age and sex-adjusted and optimally adjusted at the appropriate stages. The final models obtained at each of the six predefined stages (retaining only the factors independently associated with the risk of long COVID) are reported in Supplementary data file 2 and Supplementary data file 3 (participants infected less than 1 year ago), Supplementary data file 4 (NICE definition), Supplementary data file 5 (PCC with at least moderate impact), and Supplementary data file 6 (self-reported perceived long COVID). Factors positively or negatively associated with long COVID according to the various definitions and control groups are summarised in Supplementary data file 7.

Sociodemographic background and early defined determinants

Female sex appeared to be very strongly associated with PCC with more than a twofold increased prevalence in women versus men when compared with non-WHO-PCC infected and never-infected participants. The association of PCC with age appeared to be nonlinear, with low prevalence in older age being more apparent in comparison to never-infected (more than 10-fold less in persons ≥75 years) than to infected participants (about 4-fold less in persons ≥75 years). Interestingly, this pattern was largely unchanged after six stages of adjustment contrary to the higher prevalence associated with female sex, which decreased by more than 50%, suggesting that this association was partly mediated by intermediate covariates, especially those linked to comorbidities and SARS-CoV-2 infection. Neither geographic origin nor education level was associated with the risk of PCC regardless of the comparison groups.

Current socioeconomic position and living condition factors

Most SEP factors, including employment status, occupation, household income, size of place of residence, deprivation level and region, were not associated with the risk of PCC. Only household size (in comparison with the never-infected group) and especially financial security (in comparison with both groups) were associated with a higher risk of PCC. However, these factors were no longer associated with PCC after adjusting for comorbidities (household size) and for COVID-19-related representations (financial security).

Pre-existing comorbidities, health behaviours and short-term work-related factors

The number of chronic comorbidities was strongly associated with the risk of PCC, with dose-response relationships, using both comparison groups, and PRs only slightly decreased after subsequent adjustments. In addition to the number of diseases (overall chronic disease burden), two categories of comorbidities were independently associated with PCC risk: mental disorders, especially in comparison with never-infected participants, and uncorrected visual and hearing impairments. Once adjusted for comorbidities, only the impact of the COVID-19 pandemic on occupation and work conditions as well as vaccination were associated with PCC risk among short-term work-related factors and health behaviours, respectively. These factors remained independently associated with PCC risk in comparison to the never-infected group.

SARS-CoV-2 infection-related factors

In infected participants, the number of infections and initial COVID-19 symptoms were strongly associated with the risk of PCC with dose-response relationships. Among the initial SARS-CoV-2 symptoms reported, myalgia was independently associated with PCC risk in addition to the number of symptoms. The two- to threefold higher risk of PCC in the 20 participants hospitalised for COVID-19 was not statistically significant.

Healthcare and social support and overall perception

Contacts with the healthcare system as assessed by the number of general practitioner consultations and the number of specialist consultations were no longer associated with the risk of PCC in adjusted models that notably included comorbidities. Social support assessed by contact frequency with family and friends was not associated with PCC. On the contrary, overall perception of COVID-19 severity and long COVID information were independently and strongly associated with the risk of PCC.

Interaction or effect modification

Very few interactions were significant at each modelling stage, and most were associated with p values above 0.01 (Supplementary data file 2). Age increased the impact of the number of comorbidities in comparison to infected participants, whereas female sex decreased the effect of mental disorders in comparison to non-infected participants. Myalgia during the acute COVID-19 phase decreased the impact of the number of comorbidities in comparison to infected participants.

Sensitivity analyses

The final models obtained at each stage based on the alternative definitions of PCC and in participants infected within the past year provided very similar results to those of the main analysis (Supplementary data files 3–7). In particular, the factors identified in the main analysis were all confirmed, with the exception of the initial symptom of myalgia and household size, which were not consistently associated with PCC risk in sensitivity analyses. Occupational status, especially retirement, which was not associated with the risk of PCC using the standard definition, was found to be associated with long COVID instead of age in two out of four sensitivity analyses (strengthened WHO-PCC requiring at least moderate impact on daily activities and self-reported perceived long COVID).

As compared with the main analyses using the standard WHO-PCC definition, associations were sometimes stronger (as reflected by higher or lower PRs) with alternative definitions and especially the strengthened definition of WHO-PCC (e.g. number of individual chronic conditions, mental disorders, vaccination). The analysis based on self-reported ‘perceived’ long COVID yielded different initial symptoms of COVID-19 from the main analysis such as fatigue, dyspnoea and anosmia-ageusia and uncovered the higher risk of long COVID associated with hospitalisation and lower education levels in infected participants. Interactions were also seldom observed in sensitivity analyses, and overall, no consistent pattern of interaction or effect modification was observed.

Discussion

This study confirmed or uncovered 15 factors associated with long COVID pertaining to seven out of eight sets of factors: age and sex (among basic and early defined sociodemographic characteristics); household size, financial security and retirement (current socioeconomic position); crude number of pre-existing chronic diseases, respiratory disease, mental and sensory disorders (physical and mental comorbidities); SARS-CoV-2 vaccination (health behaviours); impact of COVID-19 pandemic on occupation and work conditions (short-term work-related factors); number of SARS-CoV-2 infections and number of initial symptoms (SARS-CoV-2 infection-related factors); and overall perceptual experience of COVID-19 severity and long COVID information (COVID-19-related representations). By contrast, 36 tested factors were not consistently associated with long COVID: geographic origin and education level (among basic and early defined sociodemographic characteristics); occupation, employer, household income, size of the place of residence, region, deprivation level of the place of residence (current socioeconomic position); cardiovascular, endocrine and metabolic, osteoarticular condition, cancer, obesity and injury sequelae (physical and mental comorbidities); smoking, alcohol consumption, moderate and vigorous physical activity and sedentariness (health behaviours); caregiving for elderly or disabled relative (short-term work-related factors); hospitalisation, type of initial symptoms (N = 11) and infection during work (SARS-CoV-2 infection-related factors); and number of general practitioner consultations, number of specialists consulted and contacts with family and friends (healthcare and social support). However, the broad spectrum of associated factors strongly suggests that long COVID should be viewed not only as a complication of SARS-CoV-2 infection but also as part of a broader network of contextual factors that influence the individual’s risk of long COVID far beyond SARS-CoV-2 infection. It is noteworthy that no significant interaction between factors retained at any stage of the analysis was evidenced, thus supporting the relatively simple conceptual model shown in Fig. 2 (excluding the seventh set of factors).

Many previous studies found female sex to be associated with the risk of long COVID14,16,17,18,19,20,21,22,25. However, the present study also points to the intermediate role played by previous SARS-CoV-2 infections and comorbidities: adjusting for these factors reduced the increased risk associated with female sex by about half. Indeed, women are more frequently infected by SARS-CoV-2, and infected women present a higher rate of multimorbidity than men55,56. Consequently, our results, which are supported by a multicentre prospective cohort showing that sex-related variables may mediate a large part of the sex differences in the risk of long COVID among infected individuals57, may be important to avoid unfounded sex stereotypes. A reduced risk of long COVID in participants aged ≥55 years and especially ≥75 years (or retired participants in certain analyses) was also clearly evidenced in our study, with no impact of successive adjustments on potential intermediate covariates. This finding, already shown in population-based studies10,11,12,13, is in sharp contrast with clinical studies conducted among hospitalised patients in which older age was associated with a higher risk of long COVID40. Our study adds to the large body of evidence that, at a population level, long COVID is primarily a working-age condition10,11,12,13. Those with more family and professional responsibilities, such as women and working-age adults, are more likely to report an impact on these activities and therefore to meet the long COVID and especially WHO-PCC criteria after SARS-CoV-2 infection.

Financial security appeared to be the strongest social determinant of long COVID in this study, although its effect was largely influenced by comorbidities, which was an expected result due to the strong relationship between SEP and chronic disease risk56,58. No other structural SEP indicator such as occupation, income level, geographic origin or deprivation level was associated with the risk of long COVID, contrary to the findings of earlier studies. This may be explained by the lower reported rates of infection in lower SEP participants due to their underuse of diagnostic tests for COVID-19. When using self-reported ‘perceived’ long COVID, education level was associated with long COVID. Instead of the type of occupation and employer, the negative impact of the COVID-19 pandemic on occupation and work conditions was linked to the risk of long COVID. This result, which has not been previously reported, highlights the impact of poorer working conditions caused by the pandemic and its management59.

The number of pre-existing comorbidities was strongly associated with long COVID with a roughly linear trend. However, several comorbidity groups had an effect ‘above the mean’: respiratory diseases and especially mental disorders, as previously reported16,17,19,21,22,23,24,25,26,27,28,29,30. In particular, these results support another population-based study in which respiratory diseases and depression were the only conditions associated with the higher risk of long COVID in addition to multimorbidity60. This study also showed a consistently higher risk associated with uncorrected sensory disturbances, which may reflect the biological and social fragility of these individuals. Contrary to several studies16,17,19,21,35,40, obesity was not identified as a significant risk factor for long COVID, although differences in exposure may explain this result (e.g. in 2014, French mean body mass index was 3.5 points lower than that of the US61). Except for vaccination, health behaviours were not associated with the risk of long COVID once adjusted for comorbidities, which is contrary to some studies reporting associations between smoking and long COVID. Regarding vaccination, its strong protective effect against SARS-CoV-2 infection and thus indirectly against long COVID is no longer debated, despite the variable protection observed in infected participants, as in our study32,62,63,64.

This study confirmed that the number of acute infections as well as the number of initial symptoms are important determinants of long COVID16,18,21,23,24,35,36. However, none of the initial symptoms were consistently and specifically associated with a higher risk of long COVID. The association of long COVID with the number of initial symptoms may be interpreted as the consequence of multi-systemic infection (despite the limited evidence in the literature of any link between long COVID and persistent symptoms in non-hospitalised patients) or may be seen as the overarching tendency of certain individuals to experience somatic symptoms, which would be consistent with the frequent observation of intense symptoms in subjects without abnormal physical or biological findings in long COVID45,65,66,67,68,69,70.

Although contacts with the healthcare system and social support as assessed in this study were not associated with the risk of long COVID, people’s perception of COVID-19 severity and access to long COVID information were strongly and consistently associated with the condition. These findings should be interpreted with caution. Although they may result from psychological mechanisms linking COVID-19-related concerns to the risk of long COVID, as previously reported26,36, they may also show the contrary, namely that people’s experience of long COVID leads them to re-evaluate the severity of COVID-19 and seek more information about the condition.

The present study has several strengths. Firstly, it used a structured epidemiological approach to comprehensively evaluate the factors associated with long COVID according to two main paradigms for the condition: a specific SARS-CoV-2 complication and a non-specific syndrome as observed, for example, in chronic fatigue syndrome71,72 or Gulf War illness73,74. Secondly, the use of a large nationwide population-based random sample following the potent Omicron waves in France in spring-summer 2022 along with a satisfactory participation rate allowed us to make inferences in the general population setting. Thirdly, the comprehensive symptom assessment (date of onset, explanations by alternative diagnoses and impact on daily activities using a Likert scale) and the use of several definitions of long COVID as either symptom-based (with several severity thresholds) or self-reported (‘perceived’) allowed us to build robust knowledge (through sensitivity analyses) and to use different perspectives on long COVID (healthcare community and patients). The use of several definitions of the condition strengthens the inferences that may be drawn from the results.

However, several limitations should be acknowledged. Firstly, the cross-sectional nature of the study limits the extent to which causal inferences can be made. Although the chronology of exposure was unambiguous for many of the factors considered here, adjustments were made with regard to this chronology, and sensitivity analyses limited to participants infected within the past year provided similar results, caution is still needed when causally interpreting some of the associations found in this study. Several dose-effect relationships observed between the number of chronic morbidities, number of infections, number of initial symptoms and the risk of long COVID, which are highly implausible in the reverse direction, nonetheless provide strong arguments for a close, if not causal, relationship between these factors and long COVID. Secondly, the exclusively self-reported nature of the data, despite being collected by experienced professionals in the CATI stage, raises the issue of misclassification errors. Since the participants were not informed about the precise objectives of the study, these errors are likely non-differential, thus tending to attenuate the associations observed. Thirdly, due to the spontaneous recovery of long COVID patients over time, it was not possible to assess the effect of different COVID waves or variants and to confirm whether the Omicron waves were associated with a lower risk of long COVID as previously reported75,76 (the indirect evidence from prevalence studies demonstrating stable or decreasing prevalence rates argues for this lower risk10,11,13). Finally, despite the large sample size, some strata were sparse, which motivated our use of Poisson regression with robust variance51,77, and power was limited for detecting small effects, or even moderate effects associated with rare exposures (e.g. hospitalisation for COVID-19).
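The attenuation expected from non-differential misclassification, mentioned among the limitations above, can be illustrated with a short numerical sketch. The risks, sensitivity and specificity used below are hypothetical values chosen for illustration only; they are not estimates from this study, and the function names are our own. The sketch assumes a binary outcome classified with the same sensitivity and specificity in exposed and unexposed participants, with sensitivity + specificity > 1.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Risk that would be measured when the outcome is misclassified.

    True cases are detected with probability `sensitivity`; non-cases are
    wrongly counted as cases with probability (1 - specificity).
    """
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def risk_ratios(risk_exposed, risk_unexposed, sensitivity, specificity):
    """Return (true risk ratio, observed risk ratio).

    The same sensitivity and specificity are applied to both groups,
    which is what makes the misclassification non-differential.
    """
    true_rr = risk_exposed / risk_unexposed
    observed_rr = (
        observed_risk(risk_exposed, sensitivity, specificity)
        / observed_risk(risk_unexposed, sensitivity, specificity)
    )
    return true_rr, observed_rr


if __name__ == "__main__":
    # Hypothetical illustration: 10% vs 5% true risk, imperfect self-report.
    true_rr, observed_rr = risk_ratios(0.10, 0.05, sensitivity=0.80, specificity=0.95)
    print(f"true RR = {true_rr:.2f}, observed RR = {observed_rr:.2f}")
```

Under these assumptions the observed risk ratio lies between 1 and the true risk ratio, which is the sense in which such errors tend to attenuate associations rather than create them.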

Almost 4 years after its emergence, the debate regarding the nature and causes of long COVID is still ongoing78,79. In this debate, epidemiological evidence has rarely been used, and hypotheses have not been formally tested. Our study provides quantitative arguments for the use of an integrative, non-reductionist clinical and theoretical perspective on long COVID, which has long proven useful in the field of chronic diseases or illnesses80, with the consideration of biological, psychological and social processes81. In addition to SARS-CoV-2 infections and their effective prevention by vaccination and other protective behaviours, this study strongly supports the role played by both physical and mental comorbidities, poorer working conditions and low financial security as well as the perception of COVID-19 severity and information about long COVID. These factors indicate a context that favours delayed recovery or maladjustment82, especially since two-thirds of adults with long COVID are in active employment. SARS-CoV-2 infection may be the proverbial straw that broke the camel’s back for those living in such unfavourable circumstances. Indeed, the working part of the population was hardest hit by the COVID-19 pandemic59, while the health and quality of life of this group deteriorated much more than those of the non-active or retired population in France and several European countries over the last 20 years83,84.

In conclusion, this study provides evidence for the multidimensional network of factors associated with long COVID in the general population. Although several factors are associated with SARS-CoV-2 infection, others are related to people’s background (sociodemographic characteristics, working conditions, pre-existing chronic comorbidities and overall perceptual experience of COVID-19), which probably exacerbate and maintain the condition and its symptoms. In addition to vaccination and other protective behaviours for COVID-19, contextual factors should be better taken into account when developing strategies aimed at limiting the burden of this condition in the general population, which should primarily target the most affected group of working age.