Introduction

Sexually transmitted infections (STIs) comprise a diverse group of pathogens that can cause disease and death, with implications for individuals, communities, and healthcare systems1,2,3,4,5. These pathogens propagate within sexual networks, defined as groups of individuals connected through sexual relationships, and are transmitted during sexual acts when individuals form partnerships1,6,7,8,9,10.

The risk of acquiring an STI within a sexual network is commonly described by the concept of “sexual risk behavior”11. However, this concept is poorly defined, as no single behavior exclusively determines the risk of STI acquisition11. STI risk is influenced not only by an individual’s sexual behavior but also by the “ecology” surrounding the individual within the sexual network, including the behaviors of direct partners, their partners, and their partners’ partners1,7,10,12. An individual with multiple recent sexual partners may remain uninfected, while another with only one partner may acquire an STI, depending on their position within the network and the dynamics of infection transmission within the network.

Since the discovery of the HIV pandemic, extensive research has focused on understanding sexual risk behavior13,14. This research identified specific behaviors, such as having multiple sexual partners or engaging in sex with female sex workers, as factors that increase the risk of acquiring STIs1,14,15,16,17,18,19. Additionally, properties of sexual networks, including concurrency, clustering, and degree correlation, have been shown to influence STI transmission risk1,12,20,21,22. Yet no single sexual behavior or sexual network metric can, on its own, capture the risk of acquiring an STI.

Complicating this matter further is the difficulty of accurately measuring sexual behavior13,22. Although substantial data have been collected over the past decades, most of these data rely on self-reports13,23. The sensitive nature of sexual behavior, along with non-random social desirability and recall biases, can compromise their accuracy20,24,25,26. Biomarker studies have also revealed discrepancies between self-reported sexual behavior and biological evidence, highlighting the limitations of self-reported data27,28.

In this study, a novel perspective is proposed to functionally define and quantify sexual risk behavior within a population by leveraging its direct consequence—STI acquisition. This is achieved through mathematical modeling, which provides a mechanistic framework for understanding how STIs propagate within sexual contact networks, directly linking sexual behavior to STI acquisition and transmission29,30,31.

The approach is applied to herpes simplex virus type 2 (HSV-2) infection32, which serves as a suitable proxy biomarker for sexual risk behavior due to its incurable, lifelong nature, rendering it largely unaffected by treatment patterns23,33,34,35. HSV-2 antibody prevalence (seroprevalence) also provides a reliable measure of lifetime exposure to the infection32,33. The analysis focuses on the United States (U.S.), where robust, standardized, population-based, sex- and age-stratified HSV-2 data have been collected over several decades through the National Health and Nutrition Examination Surveys (NHANES)36,37,38,39,40, facilitating the application of this approach.

Fundamentally, this approach resembles reverse engineering, as it deconstructs observed HSV-2 infection patterns in the population to infer the underlying aggregate population-level sexual risk behavior that produced these patterns. This approach is demonstrated by addressing a specific question: How has population-level sexual risk behavior evolved in the U.S. over the past few decades?

Materials and methods

Mathematical model

This study employed a deterministic, compartmental, population-level dynamical model previously developed to characterize HSV-2 transmission dynamics in the U.S. population41. A detailed description of the model structure, governing equations, and parameterization is provided in Section S1 of the Supplementary Material, while a brief overview is presented here.

The model is based on current knowledge of the natural history and epidemiology of HSV-2 infection41,42,43,44,45. It consists of a system of coupled nonlinear differential equations that stratify the population by HSV-2 status, stage of infection, sex, age group, and sexual risk group41. The model was implemented and analyzed using MATLAB R2019a (The MathWorks Inc., Natick, MA, USA).

The population was divided into 20 age groups, each representing a five-year interval (0–4, 5–9, …, 95–99 years)41. The model incorporated two stages of viral shedding—primary infection and reactivation—along with latent periods without viral shedding between reactivation episodes41. Individuals who acquired HSV-2 for the first time progressed from the primary infection stage to a lifelong latent phase marked by episodic reactivations41,42,43,44,45.

The force of infection was determined by the sexual contact rate and the probability of HSV-2 transmission per partnership between a susceptible and an infected individual of the opposite sex41. This probability depended on the transmission probability per coital act during each stage of infection, the frequency of sexual acts within the partnership, and the duration of the partnership41.
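To illustrate how these ingredients can combine, the following minimal sketch uses a binomial "escape" formulation in which coital acts are treated as independent. The functional form, the function names, and the single-stage simplification are assumptions made for illustration; the model's actual equations are those given in Section S1 of the Supplementary Material.

```python
def per_partnership_transmission_probability(p_act, acts_per_year, duration_years):
    """Probability of at least one transmission over a partnership.

    Assumption: acts are independent, each with probability p_act
    (an illustrative binomial form, not the source model's exact equation).
    """
    n_acts = acts_per_year * duration_years
    return 1.0 - (1.0 - p_act) ** n_acts


def force_of_infection(contact_rate, p_partnership, prevalence_in_partners):
    """Illustrative yearly hazard of infection for a susceptible individual.

    contact_rate: new partnerships per year
    p_partnership: transmission probability per partnership
    prevalence_in_partners: fraction of potential partners who are infected
    """
    return contact_rate * p_partnership * prevalence_in_partners
```

Under this form, the per-partnership probability saturates as the number of acts grows, so longer partnerships contribute progressively less additional risk per act.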

Sexual risk behavior

Sexual risk behavior for men at a given time t was expressed using the function \(\rho \left( {t;a,i} \right)\) that estimates the sexual contact rate, representing the risk of acquiring HSV-2 infection for a specific age group a and risk group i.

This function serves as a summary measure of sexual risk behavior in the population, capturing both the distribution and intensity of exposure risk to infection. It reflects not only partner change rates but also sexual network factors that influence exposure risk, such as variability in partner change patterns, concurrency, clustering, and degree correlation1,10,12,46,47,48. \(\rho \left( {t;a,i} \right)\) was parametrized as the product of three functions: one time-dependent, one age-dependent, and one risk group-dependent:

$$\rho \left( {t;a,i} \right)=\eta \left( t \right) \times \psi \left( a \right) \times f\left( i \right)$$

Here, \(\eta \left( t \right)\) describes the risk of exposure to HSV-2 infection over time, and \(\psi \left( a \right)\) captures how this risk varies by age. \(\eta \left( t \right)\) was modeled non-parametrically as a piecewise function with five-year intervals to quantify population-level sexual risk behavior across successive five-year calendar periods. Similarly, \(\psi \left( a \right)\) was modeled as a piecewise function with five-year intervals to capture variations in sexual risk behavior across different five-year age groups.

Meanwhile, \(f\left( i \right)\) is a power-law function that characterizes how the risk of exposure varies across different risk groups within the population. Its form was informed by sexual behavior data50, the structure of sexual networks as scale-free networks50, and findings from network and modeling analyses46,51,52,53,54. It is defined as:

$$f\left( i \right)={i^\theta }$$

where \(\theta\) is the exponent parameter that determines the degree of variation in exposure risk across risk groups11,41,55,56.
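The product form of \(\rho \left( {t;a,i} \right)\) can be sketched as follows. This is a minimal illustration that assumes \(\eta\) and \(\psi\) are piecewise-constant over five-year bins and that the time axis starts in 1950; the function names, the starting year, and the example values are hypothetical and are not the fitted estimates.

```python
def piecewise_value(x, start, width, values):
    """Value of a piecewise-constant function defined on equal-width bins.

    Assumption: the function is constant within each bin and is clamped
    to the first or last bin outside the covered range.
    """
    index = int((x - start) // width)
    index = max(0, min(index, len(values) - 1))
    return values[index]


def rho(t, a, i, eta_values, psi_values, theta, t_start=1950, bin_width=5):
    """rho(t; a, i) = eta(t) * psi(a) * f(i), with f(i) = i ** theta."""
    eta = piecewise_value(t, t_start, bin_width, eta_values)  # calendar time
    psi = piecewise_value(a, 0, bin_width, psi_values)        # age in years
    f = i ** theta                                            # risk group
    return eta * psi * f
```

For example, with hypothetical values `eta_values=[1.0, 2.0]`, `psi_values=[0.0, 0.0, 0.0, 1.5]`, and `theta=2.0`, an individual aged 17 in risk group 2 in 1956 falls in the second time bin and the fourth age bin, giving 2.0 × 1.5 × 4 = 12.0.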

The population was stratified into five sexual risk groups to capture the variation in sexual risk behavior, ranging from the general population to high-risk groups41. The proportion of individuals in each risk group was informed by NHANES data on the reported number of sexual partners in the past 12 months40.

Sexual partner mixing across different age and sexual risk groups was modeled using mixing matrices that accounted for both assortative mixing (preferential partner selection within the same age or risk group) and proportionate mixing (random partner selection without preference for age or risk group)41,57.
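A commonly used way to blend these two mixing modes is a convex combination of an identity matrix (fully assortative) and a proportionate-mixing term. The sketch below shows that form for a single stratification. The parameter name `assortativity` and the exact formula are assumptions for illustration and may differ from the matrices specified in Section S1.

```python
def mixing_matrix(contact_rates, group_sizes, assortativity):
    """Row-stochastic mixing matrix M[i][j]: probability that a partner of
    someone in group i belongs to group j.

    Assumed form: M = e * I + (1 - e) * proportionate term, where the
    proportionate term weights each group by its share of all partnerships
    offered (contact rate times group size).
    """
    supply = [c * n for c, n in zip(contact_rates, group_sizes)]
    total = sum(supply)
    k = len(supply)
    return [
        [
            assortativity * (1.0 if i == j else 0.0)
            + (1.0 - assortativity) * supply[j] / total
            for j in range(k)
        ]
        for i in range(k)
    ]
```

With `assortativity=1.0` partners are chosen only within the same group, and with `assortativity=0.0` every row equals the groups' shares of total partnerships.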

Sexual risk behavior among women was derived by balancing sexual contact rates, ensuring that the number of contacts formed by women in any specific subpopulation (defined by age and risk group) with men in another subpopulation equals the number of contacts formed by men in the latter subpopulation with women in the former subpopulation41,57. Further details on this balancing procedure are provided in Section S1 of the Supplementary Material.
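The balancing constraint can be made concrete with a small sketch. It assumes per-capita contact rates are stored as nested lists indexed by subpopulation, and derives women's rates from men's so that total contacts match in both directions. The function and variable names are hypothetical, and the procedure actually used is the one described in Section S1.

```python
def balance_women_contact_rates(men_rates, men_sizes, women_sizes):
    """Derive women's per-capita contact rates from men's.

    men_rates[m][w]: yearly contacts per man in subpopulation m with women
    in subpopulation w. The result women_rates[w][m] satisfies
        women_rates[w][m] * women_sizes[w] == men_rates[m][w] * men_sizes[m]
    so that both sexes report the same total number of contacts.
    """
    n_men = len(men_sizes)
    n_women = len(women_sizes)
    return [
        [men_rates[m][w] * men_sizes[m] / women_sizes[w] for m in range(n_men)]
        for w in range(n_women)
    ]
```

Each subpopulation here stands for one combination of age group and risk group.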

Data sources and model parameters

Model parameter values were sourced from primary studies on the natural history and epidemiology of HSV-2 infection, with detailed descriptions of the parameters, their justifications, and sources provided elsewhere41. Demographic data, including population size by age and sex, along with historical trends and future projections, were obtained from the Population Division database of the United Nations Department of Economic and Social Affairs58.

HSV-2 seroprevalence data by age and sex were extracted from ten publicly available biennial rounds of the nationally representative, population-based NHANES surveys, conducted between 1988 and 201636,37,38,39,40. The 1976–1980 round was excluded due to survey procedural differences and previously identified limitations related to its measures41.

Each survey round followed standardized methodologies for both analytical and laboratory procedures36,37,38,39,40. Data on demographics, sexual behavior, and HSV-2 laboratory testing from each round were extracted, merged, and analyzed following NHANES standardized guidelines59. Sampling weights were applied to all NHANES-derived estimates to ensure representativeness of the U.S. population.
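As a simple illustration of what applying sampling weights means for a prevalence estimate, the sketch below computes a weighted proportion. The function and argument names are hypothetical and do not correspond to NHANES variable names, and variance estimation under the complex survey design is not addressed.

```python
def weighted_prevalence(is_seropositive, weights):
    """Weighted proportion of seropositive respondents.

    is_seropositive: booleans, one per respondent
    weights: survey sampling weights, one per respondent
    """
    total_weight = sum(weights)
    positive_weight = sum(w for pos, w in zip(is_seropositive, weights) if pos)
    return positive_weight / total_weight
```

A respondent with a larger weight represents more people in the target population, so one seropositive respondent with weight 2.0 among two seronegative respondents with weight 1.0 each yields a prevalence of 0.5 rather than one third.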

Model calibration

The demographic part of the model was first fitted to reproduce the sex-specific and age-specific demographic projections for the U.S., as provided by the Population Division of the United Nations58. Following this, the epidemiological part of the model was fitted to NHANES time-series, sex-specific, and age-specific HSV-2 seroprevalence data from 1988 to 201640.

Model fitting was conducted using a non-linear least-squares method60,61. This calibration process enabled the estimation of the \(\eta \left( t \right)\) and \(\psi \left( a \right)\) functions, which quantify the population-level sexual risk behavior across successive five-year calendar periods and adult age groups, respectively.

Fitting parameters were varied to minimize an objective function defined as the sum of squared differences between model-predicted and observed HSV-2 seroprevalence among women and men for each age group and survey round, following established approaches41,56,62. Model fit was assessed using formal goodness-of-fit metrics, including the root mean square error. This calibration process produced a uniquely identifiable parameter set. The optimization was conducted using the Nelder-Mead simplex algorithm60, a widely used, derivative-free numerical method for minimizing multidimensional functions.
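The objective function and the goodness-of-fit metric described above can be written compactly. This sketch assumes the model predictions and the observed seroprevalence values have been flattened into two equal-length lists covering every sex, age group, and survey round; the function names are illustrative.

```python
import math


def sum_of_squares(predicted, observed):
    """Objective: sum of squared differences between model and data."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def root_mean_square_error(predicted, observed):
    """Goodness of fit: square root of the mean squared difference."""
    return math.sqrt(sum_of_squares(predicted, observed) / len(observed))
```

In Python, an objective of this kind could be minimized with a derivative-free routine such as `scipy.optimize.minimize` with `method="Nelder-Mead"`; the study itself performed the optimization in MATLAB.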

Uncertainty analysis

An uncertainty analysis was conducted to quantify uncertainty in the estimated population-level trends in sexual risk behavior, following an established validation strategy widely applied in similar modeling studies and cost-effectiveness analyses, and consistent with recommended practices for such applications41,55,56,63,64,65,66,67,68. In the absence of established empirical uncertainty bounds for HSV-2 biological parameters and other inputs, this approach provided a practical and accepted means to capture uncertainty in this study.

This approach involved 500 model runs using Latin Hypercube Sampling (LHS) from a multidimensional distribution of model parameters, applying ±30% uncertainty to the point estimates of these parameters41. LHS is a stratified sampling technique that efficiently captures variability across the full parameter space while requiring fewer simulations than simple random sampling69,70.

Each parameter range was divided into 500 equally probable intervals, with one value randomly selected from each interval without replacement. These values were then randomly combined across parameters to generate 500 unique parameter sets. For each set, the model was refitted to the data. The resulting distributions of estimates were used to derive the mean estimates and corresponding 95% uncertainty intervals (UIs), reflecting the uncertainty in projections driven by joint variation in biological and behavioral parameters.
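The sampling step can be sketched as follows. The sketch assumes each parameter is positive and uniformly distributed over ±30% of its point estimate, so that equal-width intervals are equally probable; the parameter names in the usage note are hypothetical, and the refitting of the model to each sampled set is not shown.

```python
import random


def latin_hypercube(point_estimates, n_samples=500, relative_range=0.30, seed=0):
    """Latin Hypercube Sample over +/- relative_range around each estimate.

    Assumption: uniform marginals and positive point estimates.
    Returns a list of n_samples dicts, one parameter set per model run.
    """
    rng = random.Random(seed)
    columns = {}
    for name, estimate in point_estimates.items():
        low = estimate * (1.0 - relative_range)
        high = estimate * (1.0 + relative_range)
        width = (high - low) / n_samples
        # One draw inside each stratum, so every interval is used exactly once.
        column = [low + (k + rng.random()) * width for k in range(n_samples)]
        rng.shuffle(column)  # random pairing of strata across parameters
        columns[name] = column
    return [{name: columns[name][k] for name in columns} for k in range(n_samples)]
```

Calling `latin_hypercube({"p_act": 0.01, "shedding_frequency": 0.2})` would return 500 parameter sets in which each parameter's 500 strata are each represented once.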

Sensitivity analyses

To further assess the robustness of the estimated population-level trends in sexual risk behavior, sensitivity analyses were conducted to examine the impact of extreme variations in key model parameters. Specifically, each parameter was varied independently by ±80% from its baseline value. These parameters included the per-coital-act probability of HSV-2 transmission, the frequency of HSV-2 shedding, the degree of assortative mixing by sexual risk group, and the degree of assortative mixing by age group. These analyses provided an additional assessment of the model’s capacity to reliably reproduce estimated sexual risk behavior trends across a broad range of parameter assumptions.
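A one-at-a-time design of this kind can be enumerated as in the sketch below, which builds one scenario per parameter and direction while holding all other parameters at baseline. The names and baseline values are hypothetical, and any clipping needed for bounded parameters (for example, a mixing parameter restricted to the interval from 0 to 1) is an additional assumption not shown here.

```python
def one_at_a_time_scenarios(baseline, relative_change=0.80):
    """Build sensitivity scenarios varying a single parameter at a time.

    Returns a list of (parameter_name, factor, parameter_set) tuples,
    two per parameter: baseline * (1 - change) and baseline * (1 + change).
    """
    scenarios = []
    for name, value in baseline.items():
        for factor in (1.0 - relative_change, 1.0 + relative_change):
            scenario = dict(baseline)  # copy so other parameters stay at baseline
            scenario[name] = value * factor
            scenarios.append((name, factor, scenario))
    return scenarios
```

With four parameters this yields eight scenarios, each of which would then be passed through the same fitting procedure as the baseline analysis.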

Results

The model demonstrated robust fits to the U.S. population size (Fig. S1 of Supplementary Material), age-specific HSV-2 seroprevalence data across NHANES rounds for women (Fig. 1) and men (Fig. 2) aged 15–49 years, temporal trends in HSV-2 seroprevalence among all individuals aged 15–49 years (Fig. 3a), and sex-specific temporal trends in HSV-2 seroprevalence among individuals aged 15–49 years (Fig. 3b). The root mean square error was 1.3 percentage points, indicating a low average deviation between model-predicted and observed HSV-2 seroprevalence across age groups, sexes, and survey rounds.

Fig. 1

Model fitting of age-specific HSV-2 seroprevalence among women in the United States. Comparison of model-estimated age-specific HSV-2 seroprevalence with NHANES data for women aged 15–49 years across survey rounds from 1988 to 2016. HSV-2 denotes herpes simplex virus type 2, and NHANES, National Health and Nutrition Examination Surveys.

Fig. 2

Model fitting of age-specific HSV-2 seroprevalence among men in the United States. Comparison of model-estimated age-specific HSV-2 seroprevalence with NHANES data for men aged 15–49 years across survey rounds from 1988 to 2016. HSV-2 denotes herpes simplex virus type 2, and NHANES, National Health and Nutrition Examination Surveys.

Fig. 3

Model fitting of HSV-2 seroprevalence over time in the United States. (a) Comparison of model-fitted temporal trends in HSV-2 seroprevalence among all individuals aged 15–49 years with NHANES data. (b) Comparison of model-fitted sex-specific temporal trends in HSV-2 seroprevalence among individuals aged 15–49 years with NHANES data. HSV-2 denotes herpes simplex virus type 2, and NHANES, National Health and Nutrition Examination Surveys.

Figure 4 shows the model-estimated temporal trend in population-level sexual risk behavior in the U.S. from 1950 to 2020. The results indicate a gradual increase in sexual risk behavior beginning in the early 1960s and peaking in the early 1980s, followed by a steady decline through 2020. Notably, the 1990s exhibited a sharp decline in sexual risk behavior compared to the levels observed in the 1980s.

Fig. 4

Temporal trend in population-level sexual risk behavior in the United States, 1950–2020. Model-estimated mean and 95% uncertainty interval for sexual risk behavior over time.

Figure 5 shows the model-estimated age-specific variation in population-level sexual risk behavior in the U.S. The results indicate that sexual risk behavior is highest among individuals aged 15–24 years, followed by a gradual decline with increasing age.

Fig. 5

Age-specific variation in population-level sexual risk behavior in the United States. Model-estimated mean and 95% uncertainty interval of sexual risk behavior by age group.

Figure S3 in the Supplementary Material presents the results of the sensitivity analyses assessing the impact of extreme variations in key model parameters. Across all four analyses, the model produced consistent estimates for the temporal evolution of population-level sexual risk behavior, reinforcing the robustness of the study’s findings and conclusions.

Discussion

This study presented an approach to defining and quantifying sexual risk behavior at the population level. The approach leverages the fact that HSV-2 infection, which results from sexual behavior and an individual’s position within a sexual network, leaves a permanent biological marker—detectable antibodies in the blood. This serological marker serves as an objective, measurable indicator of sexual risk, enabling the evaluation of aggregate sexual risk behavior within the population over time through the use of mathematical modeling.

By applying this approach to the U.S. population, the analysis revealed a distinct wave of sexual risk behavior that began in the early 1960s, peaked in the early 1980s, and declined thereafter, with a particularly sharp decrease observed during the 1990s. These trends align with socio-cultural shifts in the U.S. over the past several decades, including the “sexual revolution” of the 1960s and changes in sexual behavior following the introduction of oral contraceptives71,72. Oral contraceptives facilitated greater sexual freedom by decoupling sex from the risk of pregnancy. The decline in sexual risk behavior observed from the late 1980s onward also aligns with the widespread concern and behavioral changes triggered by the discovery of the HIV/AIDS pandemic73.

These findings are consistent with evidence from self-reported sexual behavior data in the U.S., which indicate that, starting in the 1960s, there was a decrease in the age of sexual debut, an increase in premarital sex, and a rise in the number of sexual partners74,75. For example, the proportion of women reporting premarital sex before age 15 increased from 1% among those born in the early 1900s to 12% among those born between 1968 and 1973, while premarital sex before age 18 rose from less than 10% to over 50%75,76. Women born in the 1950s were three to four times more likely than earlier cohorts to report five or more sexual partners. Among men, the proportion rose from 34% to 50%, then remained stable in later cohorts74,75. These behavioral trends are consistent with the model-inferred patterns (Fig. 4), which indicate a substantial rise in sexual risk behavior beginning in the 1960s and continuing through the 1980s.

Similarly, studies have documented declines in various sexual behaviors following the recognition of the HIV/AIDS pandemic, including reductions in the number of sexual partners, an increase in monogamous partnerships, decreased sex with female sex workers, and greater use of condoms76,77. Moreover, the observed trend aligns with evidence from other regions indicating declines in HSV-2 seroprevalence following the recognition of the HIV/AIDS pandemic, including in Africa78, the Americas79, Asia80, and Europe81.

Sexual risk behavior was found to vary with age, peaking in young adulthood, specifically among individuals aged 15–24 years. This pattern aligns with NHANES data, which show a similar distribution in the reported number of sexual partners in the U.S. (Fig. S2 in Supplementary Material). Moreover, this trend is consistent with patterns of sexual behavior observed in other regions82.

The findings should be interpreted in the context of two caveats. First, the observed changes in sexual risk behavior may lag behind actual changes in sexual behavior by a few years. This delay occurs because the measure reflects the risk of exposure to infection, which only becomes evident after individual-level behavioral changes aggregate across the population, leading to appreciable alterations in the structure of sexual networks that influence infection transmission dynamics.

Second, the study specifically quantified sexual risk behavior associated with the acquisition of HSV-2 infection. Since different sexual behaviors and network statistics can affect the transmission dynamics of various STIs in distinct ways1,12,18,83, the patterns of sexual risk behavior identified in this study may not be directly applicable to other STIs. However, because sexual behavior is ultimately the driver of STI transmission, the findings are likely to remain qualitatively valid for other STIs.

This study has limitations. HSV-2 shedding was assumed to occur at a constant frequency regardless of the duration of infection, but evidence suggests that shedding declines over time84. HSV-2 infectiousness was assumed to be invariable regardless of symptoms, although this likely has a limited impact, as most viral shedding is asymptomatic42,85.

To maintain a streamlined model structure, the model did not explicitly differentiate between modes of sexual transmission, and transmission among men who have sex with men (MSM) was not modeled separately. While HSV-2 transmission among MSM is intense due to higher contact rates86,87, the majority of transmissions at the population level occur through heterosexual contact, given the substantially larger size of the heterosexual population88,89. Broader behavioral shifts—such as those triggered by the AIDS pandemic—also affected both MSM and heterosexual populations, shaping HSV-2 transmission dynamics across groups. Moreover, HSV-2 acquisition among MSM is indirectly captured, as the model is calibrated separately to sex-stratified population-level data. With these considerations, the absence of explicit MSM modeling is not likely to have an appreciable influence on the findings or conclusions.

The model did not account for concurrent HIV transmission within the population, which could influence HSV-2 dynamics due to the epidemiological interaction between these two infections12,44,90,91 and the differential impact of AIDS-related mortality11,44. However, HIV prevalence in the U.S. has never been high enough to appreciably affect the derived estimates.

The study has strengths. It utilized standardized, high-quality, population-based data from NHANES spanning three decades, providing reliable and representative estimates for the U.S. population36,37,38,39,40. A sophisticated mathematical model was employed to capture the complex dynamics of HSV-2 transmission41, grounded in quality data on the natural history and transmission parameters of the infection41,42,43,44,45. The model demonstrated robust fits to empirical data, with predicted trends closely aligning with observed patterns. Furthermore, the uncertainty and sensitivity analyses reinforced the validity of the point estimates and demonstrated the robustness of the model projections across a wide range of parameter assumptions. Lastly, the consistency between model outputs and real-world data provides additional confidence in the study’s findings.

In conclusion, this study demonstrated an approach to functionally define and quantify sexual risk behavior at the population level by leveraging its direct consequence, STI acquisition. The analysis identified age-specific variations in sexual risk behavior in the U.S. population and revealed a distinct wave of sexual risk behavior that began in the early 1960s, peaked in the early 1980s, and declined thereafter, with a particularly sharp decrease during the 1990s. This approach offers a framework that can be applied to other populations and extended to different STIs, providing deeper insights into the dynamics of sexual risk behavior and its variation across age groups and over time.