Introduction

Humans recalibrate their choices and trade-offs between the present (e.g., spend now) and the future (e.g., save for retirement), responding to the fluctuating availability of internal and external resources. Such trade-offs are common, because intertemporal decisions pervade personal and professional life. In these intertemporal trade-offs, the value of delayed rewards diminishes over time, a phenomenon quantified as delay discounting (Frederick et al. 2002). In a typical experimental paradigm of delay discounting, participants are asked to make a series of choices between smaller-sooner (SS) and larger-later (LL) rewards (Madden and Bickel, 2010). For instance, a typical trial in such a task could be framed as, “Would you rather receive $100 today or $200 in 15 days?” The extent of a participant’s inclination towards the SS option reflects their delay discounting rate k, often characterized by a hyperbolic function (Mazur, 1987):

$$V=\frac{A}{1+{kD}}$$

where V, A, and D represent the subjective value of the delayed reward, the reward magnitude, and the delay, respectively. Since Samuelson’s seminal work (1937), delay discounting has been extensively studied, revealing associations with a diverse range of psychological traits (Shamosh and Gray, 2008; Silverman, 2003). Importantly, several studies have pointed to gender-based differences in delay discounting rates (Kirby and Maraković, 1995). This has significant implications, because it highlights that different genders may exhibit unique tendencies and preferences when faced with choices. Hence, incorporating considerations of gender differences in policy-making, regulations, and resource allocation can better cater to the diverse needs of gender groups, thereby reducing gender gaps and increasing equity and inclusivity.
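As a minimal illustration of the hyperbolic model above, the following Python sketch evaluates V = A / (1 + kD) and finds the discount rate at which the example choice ($100 today versus $200 in 15 days) is a matter of indifference. The function names are hypothetical and chosen only for this illustration.

```python
def hyperbolic_value(amount, delay, k):
    """Subjective value V = A / (1 + k * D) of a delayed reward."""
    return amount / (1.0 + k * delay)


def indifference_k(ss_amount, ll_amount, ll_delay):
    """Discount rate k at which an immediate SS reward and the delayed LL reward
    have the same subjective value (solves ss = ll / (1 + k * D) for k)."""
    return (ll_amount / ss_amount - 1.0) / ll_delay


# Example trial from the text: $100 today versus $200 in 15 days.
k_star = indifference_k(100, 200, 15)            # 1/15 per day
value_at_k_star = hyperbolic_value(200, 15, k_star)

# A steeper discounter (k above k_star) values the delayed $200 below $100
# and would therefore be expected to choose the SS option.
value_steeper = hyperbolic_value(200, 15, 0.1)
```

Under this model, a participant who picks the SS option on this trial behaves as if their k exceeds 1/15 per day, and a participant who picks the LL option behaves as if their k is below it.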

Nevertheless, despite early evidence for gender differences in delay discounting (Kirby and Maraković, 1995), there is growing debate regarding gender effects without a definite resolution. Various theories have been proposed to explain the impact of gender on delay discounting, yet they are fragmented across disciplines and offer conflicting insights. For example, evolutionary theory and social role theory approach individual differences in behavior from biological and social viewpoints, respectively. Hence, a comprehensive exploration of the causes and ramifications of this phenomenon is imperative. The current research aims to (a) compare and synthesize pertinent theories from diverse disciplines, (b) quantitatively assess the magnitude and direction of gender effects on delay discounting through meta-analysis, and (c) pinpoint possible moderating (contextual) factors influencing gender effects on delay discounting.

Gender differences in delay discounting

Research on gender differences and similarities is crucial for two main reasons: the significant impact of prevalent stereotypes on behavior, necessitating an evaluation of their accuracy (Boysen et al. 2022), and the frequent referencing of psychological gender differences in crucial policy dialogues, such as formulating gender-specific educational policies or elucidating factors contributing to women’s underrepresentation in leadership positions in technology companies (Seo et al. 2017). Meta-analyses offer advantages in assessing accumulated insights about gender differences by ascertaining the replicability of specific disparities, providing a nuanced understanding through quantifying their magnitude, and enabling systematic exploration of moderators like social context.

Discussions on gender differences often focus on two main factors: evolution and environment. Evolutionary psychologists analyze the origins of gender differences from a biological evolution perspective, suggesting adaptive advantages. Social psychologists, in contrast, explore how gender differences manifest and are explained across various social and environmental contexts, emphasizing the shaping role of social roles, cultural expectations, and environmental influences. Understanding which theory—evolutionary or social role theory—better explains gender differences in delay discounting is crucial for advancing our understanding of human decision making. The following sections will separately review these theories and integrate findings from diverse studies through meta-analysis to elucidate the primary drivers of gender differences in delay discounting.

Gender differences in delay discounting from the perspective of the theory of evolution

Evolutionary psychology focuses on the functional aspects of decision making, aiming to uncover why specific preferences and biases exist. It posits that psychological gender differences have evolved through adaptive mechanisms, where behaviors beneficial to survival and reproduction have been selected over generations (Buss and Schmitt, 1993). Key concepts in this framework include sexual selection, originally proposed by Darwin, and parental investment. Sexual selection explains how males and females exhibit different behaviors in competition for mates and in choosing mates, influencing traits such as aggression. For instance, males often compete to display resources, which may translate into preferences for smaller immediate rewards in social contexts. Conversely, females, with their higher parental investment, may prioritize long-term stability and larger rewards to ensure reproductive success. Parental investment theory further elucidates gender differences, highlighting how females, due to their substantial investment in offspring, may favor delayed rewards to secure resources and support for their children’s well-being. In contrast, males, investing less biologically, may prioritize immediate gains to enhance their competitive status.

According to evolutionary psychologists like Mealey (2000), gender-specific behaviors and traits have evolved and persisted because they offer adaptive advantages in reproductive success. This perspective underscores the evolutionary roots of gender differences, emphasizing how behaviors and preferences have been shaped over millennia to maximize fitness. However, evolutionary psychology is not without criticisms. It can exhibit hindsight bias by retrospectively explaining behaviors without robust empirical evidence. Moreover, it risks reinforcing gender stereotypes by attributing roles solely based on biological imperatives, potentially limiting our understanding of diverse gender identities and roles. Furthermore, it may overlook the complex interplay of social and environmental factors that also contribute to gender differences, particularly in rapidly changing cultural contexts.

In conclusion, while evolutionary psychology provides valuable insights into the origins of gender differences in behaviors such as delay discounting, it should be approached critically, considering its strengths and limitations in explaining the complexities of contemporary gender dynamics.

Gender differences in delay discounting from the perspective of social role theory

Social role theory, as supported by scholars like Eagly and Wood (1991, 2013), asserts that gender differences in behavior stem from societal perceptions of roles and expectations. These perceptions are informed by observations of individuals fulfilling gender-specific roles, leading to the development of gender role beliefs and stereotypes. Men are often encouraged towards roles that emphasize assertiveness, competition, and risk-taking, while women are guided towards nurturing, cooperative, and risk-averse roles.

In the context of delay discounting, social role theory predicts distinct behavioral patterns between genders. Men, conditioned by societal norms that value immediacy, may exhibit a higher tendency for delay discounting. This preference for immediate gratification aligns with traits associated with male roles, such as decisiveness and action orientation. Several studies report that men tend to discount future rewards more steeply than women (Kirby and Maraković, 1995; Lv et al. 2021, 2023; Xiao et al. 2022).

Conversely, women, influenced by roles that prioritize long-term planning and stability, often display lower rates of delay discounting. This inclination reflects traits such as patience, prudence, and strategic long-term thinking, which are valued in female roles. Women are more likely to wait for larger future rewards, demonstrating a willingness to delay gratification for greater benefits (Odum and Rainaud, 2003; Reynolds et al. 2006).

The divergence in delay discounting behaviors between genders can be attributed to societal expectations and norms associated with gender roles. Men may feel societal pressure to assert dominance and take risks, influencing their decision-making towards immediate rewards. In contrast, women may face expectations to prioritize stability and long-term outcomes, affecting their tendency to delay gratification.

While social role theory provides insights into gender differences in delay discounting, it may reinforce gender stereotypes and overlook individual variability within genders. Additionally, societal changes and cultural contexts may challenge traditional gender roles, influencing behavioral patterns in unpredictable ways.

In conclusion, social role theory offers a nuanced understanding of how societal expectations and gender roles shape behaviors like delay discounting. By exploring these dynamics, researchers can gain insights into the complexities of gender differences in decision-making processes, contributing to a broader understanding of human behavior across diverse social contexts.

From theories to empirical studies of gender effects on delay discounting

A growing body of empirical research has focused on the impact of gender on delay discounting. Nevertheless, few scholars have thoroughly examined the relevant interdisciplinary theories or synthesized the existing empirical findings, which are at times conflicting. Kirby and Maraković (1995) conducted pioneering research on gender differences in delay discounting. Their initial studies revealed similar delay discounting rates between females and males in various samples, with no statistically significant difference. However, the limited sample sizes (N1 = 21, N2 = 18) prevented an in-depth analysis of this outcome. In subsequent research (Kirby and Maraković, 1996), the sample size was expanded to 258 individuals. Using the same delay discounting measure, they found a significantly higher delay discounting rate in males than in females. The researchers cautiously suggested that these gender differences in delay discounting may stem from differences in impulsivity as a personality trait.

Over the subsequent decade, researchers extensively investigated gender differences in delay discounting among both animal and human subjects. For instance, employing primary reward stimuli such as food or cocaine, studies using animals as subjects yielded inconsistent results (Koot et al. 2009; Perry et al. 2007; Perry et al. 2008). In human studies, conflicting results emerged, with a prevailing trend towards impulsive decision making in females (Beck and Triplett, 2009; Logue and Anderson, 2001; Reynolds et al. 2006; Smith and Hantula, 2008). The incongruity and opposing conclusions in the literature captured researchers’ attention. Weafer and de Wit (2014) conducted a comprehensive review examining gender differences in delay discounting across human and animal experiments. Subsequent reviews have explored the potential impact of task design and reward types on outcomes and sought explanations rooted in evolutionary theory (Daly and Wilson, 1978), three-factor theories (Cloninger, 1987), and reinforcement sensitivity theories (Gray, 1970).

Taking a neuroscientific approach, Peper et al. (2013) used tract-based diffusion tensor imaging and magnetization transfer imaging to examine the quality of frontostriatal white matter tracts as a predictor of delay discounting of hypothetical rewards. The study also examined the role of the sex hormones testosterone and estradiol as potential predictors of both functional connectivity and impulsive decision making. Estrogen, a circulating gonadal hormone, has been found to modulate neurotransmitter activity within the prefrontal cortex (PFC) (Keenan et al. 2001), a region crucial for intertemporal decision making (Wang et al. 2014; Xue et al. 2009). Despite observing a greater discounting rate in males than in females, the study did not find a direct association between sex hormones and impulsive decision making in either gender, except for a correlation between testosterone levels and increased diffusion of the frontostriatal tract specifically in men. In conclusion, this study provides evidence for a greater delay discounting rate in men, contrary to previous findings, while failing to establish a direct link between sex hormones and impulsive decision making in males or females. In a separate study, Lv et al. (2023) investigated gender-specific neural mechanisms underlying intertemporal decision making using resting-state functional connectivity (rsFC) across three independent samples. Their findings revealed a positive correlation between log k and rsFC linking the right dorsomedial prefrontal cortex (rDMPFC) with the anterior cingulate cortex/right superior frontal gyrus in females, while males exhibited a negative correlation between log k and rsFC connecting the rDMPFC with the left orbitofrontal cortex/right superior frontal gyrus. These results contribute to a better understanding of gender-related differences in decision impulsivity and the associated neural substrates.

A review of empirical research on the impact of gender on delay discounting has revealed valuable insights from the existing literature. It has been observed that the reliability of gender effects is influenced by several other factors (Weafer and de Wit, 2014). The primary aim of this meta-analysis is to clarify the manifestation of gender differences in delay discounting and their possible influencing factors. First, the influence of age should be taken into account, as social factors such as gender roles and gender identity undergo changes over the lifespan. Thus, the present study aims to investigate the developmental trajectory of gender differences in delay discounting in relation to age. Second, recognizing that gender is a socially and culturally constructed concept, it is also crucial to acknowledge the potential moderating effects of cultural and ethnic factors.

The present meta-analytical approach

In the process of literature retrieval, we only found one relatively relevant meta-analysis, which mainly explored the gender differences in impulsivity, and delay discounting was only a small part of it (k = 15). Its results reported no gender differences in delay discounting tasks (Cross et al. 2011). The purpose of the present meta-analysis is to fill a gap in the literature by investigating the presence of gender differences in delay discounting. This quantitative and systematic review aims to accomplish the following research goals: (1) testing various theoretical hypotheses for gender effects through a meta-analysis of existing studies; and (2) examining additional moderators, such as reward magnitude, socioeconomic status (SES), age and region.

We imposed no restrictions on the time frame, national origin or age of participants in our search, aiming to gather a comprehensive sample of existing research. Furthermore, we specifically included Chinese databases in our search to address potential cultural differences. Thus, our analysis is expected to offer a reliable representation of the available data in alignment with our research objectives.

Method

Literature search

A systematic search was conducted across academic databases including Web of Science, ScienceDirect, PubMed for English records, and CNKI for Chinese records, as well as specialized economics databases such as EconLit and RePEc, covering the period from 1930 (the year of the seminal publication on intertemporal choice) to December 2023. The search aimed to include studies on gender differences in intertemporal decision making without bias towards gender-specific terms. Search terms included “intertemporal decision making” OR “intertemporal choice” OR “delay discounting” to ensure a comprehensive review. Additionally, the search criteria were broadened to include abstracts for increased relevance in article retrieval.

Inclusion/exclusion criteria

The screening process followed a set of predefined inclusion and exclusion criteria. First, all conference abstracts, reviews, and commentaries were excluded from consideration. Then, studies were required to involve a behavioral measure of delay discounting, wherein participants made choices between a smaller-sooner (SS) reward and a larger-later (LL) reward. Additionally, delay discounting had to be quantified either as the discounting rate (k) of the hyperbolic discounting model or as the area under the curve (AUC) traced by the indifference points. Studies were excluded if they measured intertemporal preference using variations of the delay discounting task, used non-monetary rewards, or solely engaged in model comparisons.

Second, we excluded studies involving animals and participants with physical or mental conditions (e.g., those with mental disorders, physical ailments, addiction, unique professions, or obesity). Although previous research has indicated that conditions like addiction and obesity may influence delay discounting levels, these specific groups were beyond the scope of our investigation. While we did consider longitudinal studies, our analysis focused solely on the first time of delay discounting measurement. We also eliminated studies from the same open database, including those from the Human Connectome Project.

Third, the included studies must include data on both genders to examine gender differences. We included studies that (a) directly investigated the relationship between gender and intertemporal choice, (b) categorized participants by gender while assessing their intertemporal preferences, or (c) although not primarily focusing on gender effects, presented the relationship in a correlation table or provided adequate details for effect size calculation. When only a gender contrast was mentioned without ample data for effect size calculation, we reached out to the corresponding authors for additional information.

Studies investigating the effects of interventions on delay discounting, often employing a between-subject design, were excluded. Here, interventions encompassed a range of approaches such as drug administration, cortical stimulation, meditation, and emotion induction.

Our screening procedure adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (Page et al. 2021), with screening conducted by a single reviewer (the first author). The screening procedure is visually represented in Fig. 1. Data collection, interaction with authors, and the application of inclusion/exclusion criteria yielded 118 effect sizes from 109 studies. The studies and samples included in the final meta-analysis are detailed in Supplementary Tables S1 and S2, with corresponding numbering.

Fig. 1: Flow chart of studies included in the meta-analysis.

Coding of moderators

The coding methods of all moderators are detailed in Supplementary Table S3. The first author coded the moderators, and any ambiguities in coding were resolved through discussions among all authors. Each study was coded independently. If a single study contained multiple samples, they were coded separately.

Analysis

Effect size extraction

When dealing with continuous outcome data, the standardized mean difference (SMD) is often calculated as the outcome and summary measure of each sample (Borenstein et al. 2009). Cohen’s d is a prevalent representation of the SMD. In the present study, Hedges’ g was selected as the effect size metric; it is derived from Cohen’s d by applying a correction for small-sample bias. The correction of Cohen’s d was performed using Hedges’ method (Cumming and Williams, 2012; Hedges, 1982). The correction factor (J) can be approximated by \(1-\frac{3}{4\times {df}-1}\), where df = N − 2 for two independent groups and N represents the total number of participants.

$${\rm{Hedges}}^{\prime}\; {\rm{g}}={\rm{Cohen}}^{\prime} {\rm{s}}\; {\rm{d}}\times {\rm{J}}$$

The calculation of Hedges’ g is contingent upon the nature of the available data. Ideally, Hedges’ g can be calculated from the respective sample sizes of the two groups (n1/n2) and the means (M1/M2) and standard deviations (SD1/SD2) of their performance in the intertemporal choice task.

$${{ES}}_{{sm}}=\frac{{M}_{1}-{M}_{2}}{{S}_{p}}$$

Sp is the pooled standard deviation, defined as:

$${S}_{p}=\sqrt{\frac{{(n}_{1}-1){{{SD}}_{1}}^{2}+{(n}_{2}-1){{{SD}}_{2}}^{2}}{\left({n}_{1}-1\right)+{(n}_{2}-1)}}$$
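The computation from group summary statistics can be sketched in Python as follows. This is an illustrative sketch rather than the code used in the study (which relied on the R package esc). It assumes the common approximation J = 1 − 3 / (4·df − 1) with df = n1 + n2 − 2, and the function names and example numbers are hypothetical.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation S_p of two independent groups."""
    numerator = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    return math.sqrt(numerator / ((n1 - 1) + (n2 - 1)))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (M1 - M2) / S_p."""
    return (m1 - m2) / pooled_sd(n1, sd1, n2, sd2)


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d multiplied by the small-sample correction factor J.

    Assumption: J = 1 - 3 / (4 * df - 1) with df = n1 + n2 - 2.
    """
    df = n1 + n2 - 2
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    return cohens_d(m1, sd1, n1, m2, sd2, n2) * j


# Hypothetical summary statistics: group 1 = males, group 2 = females.
d_example = cohens_d(10.0, 2.0, 20, 9.0, 2.0, 20)   # 0.5
g_example = hedges_g(10.0, 2.0, 20, 9.0, 2.0, 20)   # 0.5 * (1 - 3/151)
```

With 20 participants per group the correction shrinks d by about 2%, and the shrinkage becomes negligible as samples grow.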

Unfortunately, Hedges’ g is sometimes not directly obtainable from the provided data. In instances where studies presented only one-way ANOVA outcomes, the standardized mean difference (SMD) can be calculated from the F-value of the one-way ANOVA. In a one-way ANOVA with two groups, the numerator degrees of freedom (df) is always 1. The formula for this conversion is as follows:

$${\rm{Cohen}}^{\prime} {\rm{s}}\; {\rm{d}}=\sqrt{F\left(\frac{{n}_{1}+{n}_{2}}{{n}_{1}{n}_{2}}\right)\left(\frac{{n}_{1}+{n}_{2}}{{n}_{1}+{n}_{2}-2}\right)}$$

For studies that only reported the results of independent-samples t-tests, the formula for calculating SMD is as follows:

$${\rm{Cohen}}^{\prime} {\rm{s}}\; {\rm{d}}=\frac{t({n}_{1}+{n}_{2})}{\sqrt{({n}_{1}+{n}_{2}-2)({n}_{1}{n}_{2})}}$$
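The two conversions above can be sketched in Python as follows. The function names are hypothetical. Because F is a squared quantity, the direction of the difference is lost, so the sketch takes the sign as an explicit argument that would have to be set from the reported group means.

```python
import math


def d_from_t(t, n1, n2):
    """Cohen's d from an independent-samples t statistic (keeps the sign of t)."""
    return t * (n1 + n2) / math.sqrt((n1 + n2 - 2) * (n1 * n2))


def d_from_f(f, n1, n2, sign=1):
    """Cohen's d from a two-group one-way ANOVA F value.

    `sign` (+1 or -1) is supplied by the coder, since F carries no direction.
    """
    n = n1 + n2
    return sign * math.sqrt(f * (n / (n1 * n2)) * (n / (n - 2)))


# With two groups F equals t squared, so both routes should agree.
d_t = d_from_t(2.0, 30, 30)
d_f = d_from_f(4.0, 30, 30)
```

The agreement between the two routes for F = t² is a quick consistency check when a study reports both statistics.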

For two groups with unequal numbers of subjects, the SMD can be derived from the point-biserial correlation r by solving the following relationship for d, where N = n1 + n2:

$$r=\frac{d}{\sqrt{{d}^{2}+\tfrac{({N}^{2}-2\times N)}{{n}_{1}{n}_{2}}}}$$
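For reference, rearranging the relationship above gives d directly in terms of r. This is plain algebra on the preceding formula, taking N = n1 + n2 as an assumption about the notation: square both sides, collect the terms in d², and take the root with the sign of r.

$$d=\frac{r}{\sqrt{1-{r}^{2}}}\sqrt{\frac{{N}^{2}-2N}{{n}_{1}{n}_{2}}}$$

When the two groups are equal in size and N is large, the second factor approaches 2, which recovers the familiar approximation \(d\approx 2r/\sqrt{1-{r}^{2}}\).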

In addition, Hedges’ g can also be calculated (Lipsey and Wilson, 2001) from unstandardized/standardized regression coefficients (b/β) and chi-square tests (χ2). Effect sizes were computed in R Version 4.3.0 (R Core Team, 2021) using the package esc (Lüdecke, 2018). Under our coding scheme, a negative value indicates that females exhibit a stronger preference for the SS reward than males, that is, higher delay discounting in females. Because the measurement of delay discounting varied across studies, the sign of each effect size was determined according to the specific measure employed. For example, if a study shows that males have higher delay discounting than females and the area under the curve (AUC) is used (where a higher value signifies lower discounting), the raw male-minus-female difference would be negative, so its sign was reversed before coding. Conversely, when discounting rates (e.g., k values) are used, the raw difference is already positive.
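The sign convention described above can be sketched as a small Python helper. It assumes that each raw effect size was first computed as male minus female on whatever measure the study reported; the measure labels and the function name are hypothetical.

```python
# True if larger values of the measure mean steeper discounting.
HIGHER_MEANS_MORE_DISCOUNTING = {
    "k": True,       # hyperbolic discounting rate
    "log_k": True,   # log-transformed discounting rate
    "auc": False,    # area under the curve: larger = less discounting
}


def align_direction(raw_g, measure):
    """Return g signed so that positive values mean males discount more than females.

    Assumption: raw_g is the male-minus-female difference on `measure`.
    """
    if HIGHER_MEANS_MORE_DISCOUNTING[measure]:
        return raw_g
    return -raw_g


# Males discount more in both hypothetical studies, measured two different ways.
coded_from_k = align_direction(0.30, "k")       # stays positive
coded_from_auc = align_direction(-0.30, "auc")  # flipped to positive
```

Keeping the direction rule in a single lookup table makes it easy to audit which measures were flipped.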

Effect size synthesis and moderator analysis

Analyses were conducted in R using several packages. Effect sizes were calculated with the esc package (Lüdecke, 2018) and then supplied as pre-calculated effect sizes to the escalc function of the metafor package for meta-analysis. To estimate the overall effect size, we used the random-effects model, which accounts for both within-study variance and between-study variance. Specifically, we first obtained the sampling variance of each study by squaring the standard error of its effect size. To incorporate heterogeneity across studies, we then estimated the between-study variance (τ²) using the DerSimonian–Laird method. The weight of each study was computed as the reciprocal of the sum of its variance and the between-study variance (i.e., weight = 1 / (variance + τ²)). Finally, we computed the weighted sum of the effect sizes and divided it by the sum of all weights to obtain the weighted average effect size. Hedges’ g was used to represent the effect size for each included sample and was calculated such that a positive effect size indicates that males choose more immediate small rewards and show greater delay discounting and more impulsive behavior than females. Hedges’ g is an effect size for the difference in means that corrects for the small-sample bias of Cohen’s d. Effect sizes were interpreted using Cohen’s (1988) suggested cutoff values of 0.20 (small), 0.50 (medium), and 0.80 (large).
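The pooling steps described in this paragraph can be sketched in plain Python. This is an illustrative re-implementation of the DerSimonian–Laird procedure, not the metafor code used in the study, and the function name and example data are hypothetical.

```python
import math


def random_effects_dl(effects, variances):
    """Random-effects pooled estimate with the DerSimonian-Laird tau^2 estimator.

    Returns (pooled_effect, standard_error, tau2).
    """
    # Fixed-effect (inverse-variance) weights and estimate, needed for Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the DerSimonian-Laird moment estimator of tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights: 1 / (within-study variance + tau^2).
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2


# Hypothetical data: two equally precise samples with very different effects.
pooled, se, tau2 = random_effects_dl([0.0, 1.0], [0.01, 0.01])
```

In this toy example the large disagreement between the two samples yields a sizeable τ², which widens the standard error of the pooled estimate compared with a fixed-effect analysis.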

To ensure the independence of effect sizes, a single effect size was extracted from each sample. In cases where a sample demonstrates varying effect values across different conditions, such as differing amounts, two prevalent approaches are typically employed: selecting a single effect value from these conditions or calculating a mean effect value from the multiple values. As we plan to incorporate variables such as amount size and delay time as moderating factors in our subsequent analysis, we have randomly selected one effect value for further processing (Lipsey and Wilson, 2001).

Addressing heterogeneity is a critical step in meta-analysis, where either fixed-effect models or random-effects models are commonly employed (Borenstein et al. 2009). The fixed-effect model assumes that all included studies share a common true effect, with differences between studies arising solely from sampling error. In contrast, the random-effects model allows for variability in the true effects across studies, assuming that the effect sizes are drawn from a distribution, making it more suitable for cases with substantial heterogeneity. This model accounts for both within-study variance (sampling error) and between-study variance (heterogeneity), assuming that the true effects across studies follow a normal distribution. Existing research indicates that studies of intertemporal decision making are heterogeneous in their measures and sample sources. Therefore, we employed a random-effects model. Specifically, we implemented the random-effects model using the rma function from the metafor package in R and estimated the between-study variance (τ²) using the restricted maximum likelihood (REML) method, which is widely recognized for its accuracy in random-effects meta-analysis.

At the same time, to address the dependency of effect sizes arising from multiple independent samples within the same study and the potential correlation of effect sizes reported by the same author, we conducted a multilevel random-effects model analysis and compared it with the traditional random-effects model to select the most suitable model. This model takes into account the hierarchical dependencies of effect sizes, specifically at the author level (author_id) and study level (study_id). We implemented this multilevel meta-analysis using the rma.mv function from the metafor package in R (Harrer et al. 2021).

To assess heterogeneity, we used Cochran’s Q test to determine its significance and the I² statistic to quantify the proportion of total variation attributable to heterogeneity (DerSimonian and Laird, 1986). If the Q test was significant and I² exceeded 50%, heterogeneity was considered substantial, warranting the use of a random-effects model (Higgins and Thompson, 2002; Higgins et al. 2003).
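A minimal Python sketch of these two heterogeneity statistics follows. It uses the standard definitions of Cochran's Q (weighted squared deviations from the fixed-effect estimate) and I² = max(0, (Q − df) / Q) × 100; the function names and data are hypothetical, and the p-value of Q (from a chi-square distribution with df = number of samples − 1) is left out to keep the sketch dependency-free.

```python
def cochran_q(effects, variances):
    """Cochran's Q: inverse-variance weighted squared deviations from the fixed-effect mean."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))


def i_squared(q, n_effects):
    """I^2 in percent: share of total variation attributable to heterogeneity."""
    df = n_effects - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical data: two equally precise samples with very different effects.
q_example = cochran_q([0.0, 1.0], [0.01, 0.01])
i2_example = i_squared(q_example, 2)
```

An I² above the 50% threshold mentioned above, as in this toy example, would point towards a random-effects model.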

In the moderation analysis, we employed both subgroup analysis and individual meta-regression analysis. For categorical variables, we used subgroup analysis to explore potential sources of heterogeneity. Specifically, we treated the categorical variable as a grouping factor to assess the differences in effect sizes between the groups. For continuous variables, we conducted individual meta-regression analysis to test the significance of potential moderating factors. Each moderating variable was included in a separate meta-regression model, rather than incorporating multiple moderators into a single model, to avoid multicollinearity and to clearly assess the independent effects of each moderating variable. These analytical methods help provide a deeper understanding of the impact of moderating variables on effect sizes and ensure the accuracy and reliability of the results.

Examining reporting biases

We examined two types of reporting biases: publication and time-lag biases. Gender was not the primary interest in many eligible studies included in our meta-analysis, making the publication bias less likely. However, studies targeting gender differences could exhibit a publication bias. We explored publication bias in two ways. First, we plotted contour-enhanced funnel plots (Peters et al. 2008). Asymmetrical forms in standard funnel plots indicate small-study effects (Schwarzer et al. 2015), in which smaller studies report larger effects than larger studies with more participants. In contour-enhanced funnel plots, the levels of statistical significance are visualized, facilitating the identification of whether the asymmetry was related to the levels of significance (e.g., publication bias). After the visual inspection of funnel plots, a series of Egger’s regressions were conducted to check the symmetry of the contour-enhanced funnel plot at the statistical level (Egger et al. 1997). Second, we conducted subgroup analysis with primary interest as the independent variable (0 = gender was not the primary interest; 1 = gender was one of the primary interests) to see whether the effects differed significantly.

We also examined possible time-lag bias, which arises when studies with more significant outcomes (e.g., lower p values) are published earlier than studies with less significant effects (Page et al. 2021). We inspected time-lag bias by entering the mean-centered publication year into the meta-regression model. A significant regression coefficient indicates the existence of time-lag bias.
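For readers unfamiliar with Egger's regression, the classic form of the test (Egger et al. 1997) regresses each standardized effect (effect divided by its standard error) on its precision (one divided by the standard error); an intercept that differs from zero suggests funnel-plot asymmetry. The Python sketch below implements that classic ordinary least squares form under stated assumptions. It is not necessarily the variant run in the study, and the function name and data are hypothetical.

```python
import math


def egger_regression(effects, standard_errors):
    """Classic Egger regression: (effect / SE) regressed on (1 / SE) by OLS.

    Returns (intercept, slope, se_intercept). The intercept divided by its
    standard error can be referred to a t distribution with n - 2 df.
    """
    x = [1.0 / se for se in standard_errors]
    y = [g / se for g, se in zip(effects, standard_errors)]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and the usual OLS standard error of the intercept.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = sse / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, se_intercept


# Hypothetical small-study effect: the effect grows with the standard error.
ses = [0.05, 0.10, 0.20, 0.40]
gs = [0.3 + 1.0 * se for se in ses]
intercept, slope, se_intercept = egger_regression(gs, ses)
```

In this constructed example the effects equal 0.3 plus one standard error, so the regression recovers an intercept of 1 (the asymmetry) and a slope of 0.3 (the underlying effect).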

Assessment of study quality

Study quality was assessed to identify potential sources of bias that may have led to misleading conclusions. Given that this article reports a descriptive meta-analysis based on mean differences in performance between genders in intertemporal decision-making tasks, and with reference to existing quality assessment tools, we employed a self-developed Data Quality Index (DQI) to assess the quality of the included studies. This tool focuses on the following five aspects: (1) Control Variables: whether the study controls for other variables that may have an impact, scored 1 if yes and 0 if no or unclear; (2) Sample Size: the sample size score is determined logarithmically (Score = lg(N)), so that larger samples receive higher scores; (3) Journal Tier: journals are categorized into SCI and SSCI journals in the first and second quartiles, other SCI and SSCI journals, and Peking University core journals, scored 2 points, 1 point, and 0 points, respectively; (4) Data Validity: 2 points are assigned if the data validity rate is 0.9 or above, 1 point for validity rates between 0.8 and 0.9, and 0 points for validity rates below 0.8 or unreported; (5) Gender Ratio: R = number of females / total number of participants is calculated, and 1 point is assigned if 0.40 ≤ R ≤ 0.60, otherwise 0 points. Finally, the total score for each study is calculated, with a higher score indicating better overall quality.
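The five DQI components can be combined as in the following Python sketch. It assumes that lg denotes the base-10 logarithm and that the total is the simple sum of the five component scores; the function name, argument names, and tier labels are hypothetical.

```python
import math

# Hypothetical labels for the three journal tiers described above.
JOURNAL_TIER_POINTS = {"q1_q2": 2, "other_sci_ssci": 1, "pku_core": 0}


def dqi_score(controls_other_variables, n_total, journal_tier, validity_rate, n_female):
    """Total Data Quality Index for one study (assumed to be a simple sum)."""
    control = 1 if controls_other_variables else 0
    sample_size = math.log10(n_total)          # assumption: lg = log base 10
    tier = JOURNAL_TIER_POINTS[journal_tier]
    if validity_rate is None or validity_rate < 0.8:
        validity = 0                           # below 0.8 or unreported
    elif validity_rate >= 0.9:
        validity = 2
    else:
        validity = 1
    female_ratio = n_female / n_total
    balance = 1 if 0.40 <= female_ratio <= 0.60 else 0
    return control + sample_size + tier + validity + balance


# Two hypothetical studies.
score_high = dqi_score(True, 100, "q1_q2", 0.95, 50)     # 1 + 2 + 2 + 2 + 1
score_low = dqi_score(False, 1000, "pku_core", None, 900)  # 0 + 3 + 0 + 0 + 0
```

Because the sample-size term is logarithmic, a tenfold increase in participants adds only one point, which keeps very large samples from dominating the index.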

To ensure objectivity and reliability in quality assessment, two researchers independently conducted the process, and their inter-rater reliability coefficient was calculated as a key indicator. Any discrepancies in quality scores were resolved through discussion until full agreement was reached.

Results

Descriptive summaries

A total of 109 eligible articles and 118 effect sizes were identified through our screening process, with detailed characteristics of these studies provided in the Supplementary Information. The individual sample sizes varied from 6 to 23,677 participants, with a mean of 567.56 participants (SD = 2238.82). The mean age of participants ranged from 9.26 to 83.60 years (M = 23.51, SD = 12.49; Fig. 2A), with individual ages ranging from 9 to 94 years. The majority of study samples exhibited gender balance (Fig. 2B), with 53.80% females on average (SD = 11.20%). These studies were conducted across 16 countries (Fig. 2C), with three samples having a global participant distribution. The samples came predominantly from developed Western nations, with fewer samples from Eastern countries. Publication years spanned from 1995 to 2023, with 86.44% of studies published after 2010 and 97.46% after 2000 (Fig. 2D). The 118 effect sizes included in the current synthesis ranged from −1.29 to 4.28 (M = 0.19, SD = 0.60). Information on moderator features (e.g., means, standard deviations, range) is summarized in Supplementary Table S3.

Fig. 2: Descriptive information of included studies.

Yellow lines are kernel density estimate curves. A Age distribution of participants; B Distribution of female proportion across samples; C Distribution of countries across samples; D Publication year distribution of studies.

Overall effects

We compared the results of the traditional random-effects model and the multilevel meta-analysis model, with detailed results presented in Table 1. Although the effect size estimates from both models were similar and statistically significant, the traditional random-effects model exhibited a smaller standard error and a higher z-value, indicating more precise effect size estimates. Additionally, the AIC and BIC values of the traditional random-effects model were lower than those of the multilevel meta-analysis model, suggesting that the traditional model had a slight advantage in terms of model fit. Therefore, although the multilevel model offers a more nuanced approach to handling data dependencies, we concluded that the traditional random-effects model sufficiently meets the needs of this study and provides more precise fit results. Consequently, the traditional random-effects model was selected for subsequent analyses.

Table 1 Results of the model comparison.

The difference in intertemporal decision-making between males and females, as estimated using the traditional random-effects model, yielded a small yet significant effect size, with Hedges’ g = 0.1762, 95% CI [0.0640, 0.2885], p = 0.0021 (see Fig. 3 for a forest plot). According to Cohen’s (1988) guidelines, this falls within the “small effect” range. Although the effect is small, it is statistically significant, indicating that males are more likely than females to choose the immediate smaller reward (SS) over the delayed larger reward (LL), reflecting a higher rate of delay discounting among males. Assuming that discounting rates are approximately normally distributed, Hedges’ g = 0.1762 suggests that the average discounting rate for males is 0.1762 standard deviations higher than that for females. Specifically, if the discounting rate for females is \(k_f\) and the discounting rate for males is \(k_m\), then \(k_m \approx k_f + 0.1762\sigma\), where \(\sigma\) is the standard deviation of the discounting rate. A leave-one-out sensitivity analysis was conducted by sequentially removing each sample (Viechtbauer, 2010); the overall effect size remained significant (p < 0.05) after removing any single sample. This indicates that the overall effect size is robust and that no individual sample had an undue influence on the results.
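
For readers less familiar with the effect size metric, the sketch below shows how a Hedges’ g is conventionally obtained from group means and standard deviations, and how the reported confidence interval relates to the reported p-value. It is illustrative only: the function names are hypothetical, the male-minus-female sign convention is assumed from the interpretation given above, and the standard textbook formulas are used rather than anything confirmed as the authors' exact procedure.

```python
import math


def hedges_g(mean_m, sd_m, n_m, mean_f, sd_f, n_f):
    """Return (g, variance of g) for a male-minus-female mean difference."""
    df = n_m + n_f - 2
    pooled_sd = math.sqrt(((n_m - 1) * sd_m ** 2 + (n_f - 1) * sd_f ** 2) / df)
    d = (mean_m - mean_f) / pooled_sd           # Cohen's d
    j = 1.0 - 3.0 / (4.0 * df - 1.0)            # small-sample correction factor
    var_d = (n_m + n_f) / (n_m * n_f) + d ** 2 / (2.0 * (n_m + n_f))
    return j * d, j ** 2 * var_d


def p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Return (z, two-sided normal p) implied by an estimate and its 95% CI."""
    se = (ci_high - ci_low) / (2.0 * z_crit)    # CI half-width divided by 1.96
    z = estimate / se
    return z, math.erfc(abs(z) / math.sqrt(2.0))
```

Calling `p_from_ci(0.1762, 0.0640, 0.2885)` gives a z-value of roughly 3.08 and a p-value close to the reported 0.0021, so the interval and the p-value are mutually consistent under a normal approximation.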

Fig. 3: Forest plot of individual effect sizes of gender effects.

Note. Individual effect sizes are ordered by effect size. The overall estimate with 95% confidence intervals is represented with a diamond.

Effect size heterogeneity

Heterogeneity within the data set was examined to assess variation in effect sizes across samples. The analysis showed significant heterogeneity, Q(117) = 18458.44, p < 0.001, indicating substantial differences in effect sizes. The variance in true effect sizes was also notable, with τ = 0.58 and τ² = 0.34, and most of the observed variability reflected between-sample heterogeneity rather than sampling error, I² = 97.30%. According to the 75% rule proposed by Hunter and Schmidt (1990), further investigation of potential moderators is warranted. The overall estimate may not adequately represent the range of effect sizes in the sample, so exploring potential moderators is essential to identify factors contributing to this variability.
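
The heterogeneity statistics reported above can be illustrated with a short sketch. It uses the DerSimonian-Laird moment estimator for τ² and the Q-based formula for I²; these are assumptions made for illustration, because the text does not state which estimator produced the reported values, and other estimators (for example REML) will generally give somewhat different numbers for the same data.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, DerSimonian-Laird tau^2, and I^2 (in percent).

    Requires at least two studies.
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled_fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Q: weighted squared deviations from the fixed-effect pooled estimate.
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, tau2, i2


def random_effects_pool(effects, variances, tau2):
    """Inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / (v + tau2) for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    return pooled, (1.0 / sum_w) ** 0.5
```

For two hypothetical studies with effects 0.1 and 0.3 and sampling variance 0.01 each, this gives Q = 2, τ² = 0.01, and I² = 50%.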

Subgroup and moderator analyses

In addition to conducting an overarching effect-size analysis, our study delved into moderation effects to gain deeper insights. We employed rigorous methods such as meta-regression and subgroup analyses to identify potential moderators, thereby segmenting the data into more meaningful subsets. A pivotal aspect of these moderator analyses entailed ensuring a sufficiently large sample size, as underscored by Hedges and Pigott (2004), to bolster statistical power and robustness of our findings.

Chronological age and types of participants

A meta-regression with age as a continuous moderator yielded b = −0.012, z = −2.669, p = 0.008, indicating a gradual reduction in the gender effect with increasing age. Additionally, a subgroup analysis was conducted using participant type as the grouping variable to explore the moderating effect of age group on gender differences. The results showed Q = 5.39, p = 0.066, suggesting that the gender effect differed only marginally across age groups. Specifically, gender differences in delay discounting were significant only in adults, with males exhibiting higher impulsivity than females (estimated mean g = 0.1798), while no significant differences were observed in children, adolescents, or the elderly.
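
A meta-regression slope of the kind reported for age can be sketched as a weighted least-squares fit with inverse-variance weights. The example below is a simplified illustration with a single moderator and a user-supplied τ²; the function name and inputs are hypothetical, and the authors' software may apply additional adjustments not shown here.

```python
import math


def meta_regression(effects, variances, moderator, tau2=0.0):
    """Weighted least-squares fit of effect size on one moderator.

    Returns (intercept, slope, se_slope, z_slope).
    """
    w = [1.0 / (v + tau2) for v in variances]
    s_w = sum(w)
    s_wx = sum(wi * x for wi, x in zip(w, moderator))
    s_wy = sum(wi * y for wi, y in zip(w, effects))
    s_wxx = sum(wi * x * x for wi, x in zip(w, moderator))
    s_wxy = sum(wi * x * y for wi, x, y in zip(w, moderator, effects))

    # Closed-form solution of the 2x2 weighted normal equations.
    det = s_w * s_wxx - s_wx ** 2
    slope = (s_w * s_wxy - s_wx * s_wy) / det
    intercept = (s_wxx * s_wy - s_wx * s_wxy) / det
    se_slope = math.sqrt(s_w / det)
    return intercept, slope, se_slope, slope / se_slope
```

A negative slope, as in the reported b = −0.012, means that the predicted male-female difference shrinks as the mean age of a sample increases.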

Region

After excluding the global studies that covered multiple regions, we conducted a subgroup analysis on the geographical location of participants in the remaining studies, grouped by continent. The results revealed significant differences between regions, Q = 17.96, p = 0.0004. Gender differences in delay discounting were predominantly observed in studies conducted in Asia, where males exhibited a stronger preference for immediate small rewards (estimated mean g = 0.2391).
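
The between-subgroup Q test used in such analyses can be sketched as follows. The inputs are the pooled estimate and standard error of each subgroup; the helper names are hypothetical, and the chi-square tail probability is computed from closed-form expressions for integer degrees of freedom so that the example needs only the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum of Poisson-type terms.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= t / i
            total += term
        return math.exp(-t) * total
    # Odd df: start from erfc and add half-integer gamma terms.
    total = sum(t ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(t)) + math.exp(-t) * total


def q_between(subgroup_estimates, subgroup_ses):
    """Between-subgroup Q statistic, its degrees of freedom, and p-value."""
    w = [1.0 / se ** 2 for se in subgroup_ses]
    grand = sum(wi * g for wi, g in zip(w, subgroup_estimates)) / sum(w)
    q = sum(wi * (g - grand) ** 2 for wi, g in zip(w, subgroup_estimates))
    df = len(subgroup_estimates) - 1
    return q, df, chi2_sf(q, df)
```

The text does not report the degrees of freedom of the regional test. If it had three (four continent groups, which is an assumption), `chi2_sf(17.96, 3)` would be about 0.0004, in line with the reported p-value.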

None of the other moderators reached statistical significance. Tables 2 and 3 summarize the results of the subgroup analyses and the meta-regression analyses, respectively.

Table 2 Results of the subgroup analyses for the comparison between males and females.
Table 3 Results of the meta regression analyses for the comparison between males and females.

Publication bias

Since the present meta-analysis consists mostly of data obtained from published studies, the final sample might not be representative of the entire population of studies in existence. It is, therefore, possible that studies producing non-significant results, because they have a lower probability of publication, did not enter our meta-analysis and this problem might exaggerate the magnitude of the observed effect under consideration.

We plotted a contour-enhanced funnel plot using all studies (Fig. 4), which visually appeared symmetric, with most observations clustered at the top due to infrequent extreme effects with large standard errors. While visual inspection suggested minimal asymmetry, we also applied the Egger et al. (1997) method to formally assess publication bias. The test result (z = 0.65, p = 0.51) suggests that publication bias is unlikely to account for the present results. Additionally, the absence of significant differences based on whether gender was a primary interest further alleviates concerns about publication bias (Q = 0.01, p = 0.928).
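
The logic of Egger's regression test can be sketched in its classical form: each standardized effect (effect divided by its standard error) is regressed on precision (the inverse of the standard error), and an intercept far from zero signals funnel-plot asymmetry. The sketch below is an assumption-laden illustration; the classical version yields a t statistic with k − 2 degrees of freedom, whereas the z statistic reported above presumably comes from a model-based variant of the test.

```python
import math


def egger_test(effects, standard_errors):
    """Classical Egger regression: returns (intercept, se_intercept, t)."""
    x = [1.0 / se for se in standard_errors]                  # precision
    y = [g / se for g, se in zip(effects, standard_errors)]   # standardized effect
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    # Residual variance of the ordinary least-squares fit.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = rss / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + x_bar ** 2 / s_xx))
    t = intercept / se_intercept if se_intercept > 0 else math.nan
    return intercept, se_intercept, t
```

In this formulation the slope estimates the underlying effect, while the intercept captures the tendency of less precise studies to report systematically larger (or smaller) effects.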

Fig. 4: Contour-enhanced funnel plot based on all effect sizes.

Each black point represents an observation. The funnel is centered at 0 (the null hypothesis of no difference). The criterion of statistical significance is p ≤ 0.05.

Time-lag bias was examined by predicting effects using centered publication year. The results did not reach the 0.05 level of significance, b = 0.011, z = 1.067, p = 0.286.

Assessment of study quality

Considering that all included studies are observational in nature, we employed a self-developed Data Quality Index (DQI), informed by quality assessment tools for observational studies. Two psychology researchers independently evaluated all included studies. Ratings ranged between 1 and 10 points (M = 5.79, SD = 1.37). The Kendall coefficient of concordance between the two evaluators was 0.79, p < 0.001, indicating a significant level of agreement in their quality ratings across all included studies.
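
Kendall's coefficient of concordance (W) can be computed from the two raters' scores as sketched below. The implementation ranks each rater's scores with average ranks for ties and applies the standard tie correction; the function names are hypothetical, and W is undefined (division by zero) in the degenerate case where every rater gives all studies the same score.

```python
def average_ranks(scores):
    """Rank scores from 1 to n, giving tied scores their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0   # average of the 1-based positions
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def kendalls_w(ratings_by_rater):
    """Kendall's W with tie correction.

    ratings_by_rater: one list of scores per rater, all over the same studies.
    """
    m = len(ratings_by_rater)
    n = len(ratings_by_rater[0])
    rank_rows = [average_ranks(row) for row in ratings_by_rater]
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((r - mean_sum) ** 2 for r in rank_sums)

    # Tie correction: sum of (t^3 - t) over groups of tied scores per rater.
    tie_term = 0.0
    for row in ratings_by_rater:
        counts = {}
        for score in row:
            counts[score] = counts.get(score, 0) + 1
        tie_term += sum(t ** 3 - t for t in counts.values())

    return 12.0 * s / (m ** 2 * (n ** 3 - n) - m * tie_term)
```

W ranges from 0 (no agreement in rank order) to 1 (identical rank order), so the reported value of 0.79 indicates that the two raters ordered the studies similarly but not identically.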

Discussion

The present study made a concerted attempt to resolve the long-standing debate on gender differences in delay discounting by synthesizing theoretical and empirical findings from various disciplines over nearly the past 30 years. Despite significant progress in this field, the evidence has not conclusively resolved the issue. Empirical results on the relationship between gender and delay discounting often conflict with each other, and the influence of possible moderating factors remains unknown. This meta-analysis provides the first comprehensive and systematic review of gender differences in delay discounting, revealing the overarching direction of these differences and key potential moderating factors. The findings indicate that males exhibit a greater preference for immediate small rewards, demonstrating a higher degree of delay discounting compared to females.

Our meta-analysis offers distinct advantages over prior ones, and by so doing, extends their findings. Firstly, it encompasses a wider and more inclusive selection of studies. Initial observations revealed that certain studies, while not explicitly centered on gender differences, did report on the gender-specific aspects of delay discounting. To ensure a comprehensive review, the search terms were broadened to include a diverse range of literature on delay discounting, rather than solely focusing on “gender/sex differences”. Secondly, our meta-analysis hones in on a more specialized aspect. Prior studies primarily delved into investigating gender differences in impulsivity, which encompasses a spectrum of impulsive traits, behaviors, and decision-making processes, of which delay discounting is just one facet. Additionally, within the realm of delay discounting, considerations extend beyond impulsivity to encompass evaluations of rewards and time. Hence, a specialized examination of gender differences in delay discounting is deemed essential. Thirdly, our meta-analysis delves deeper into the underlying reasons for the inconsistent findings on gender differences in delay discounting noted in earlier studies through moderation effects and subgroup analyses. Our objective is to offer a comprehensive elucidation of these discrepancies by exploring the associated theoretical frameworks.

Limitations and future research

Sample bias and study selection

Despite efforts to include non-English literature, such as a Chinese database, this meta-analysis only partially mitigates the potential impact of monolingual bias on the results (Johnson, 2021). Additionally, achieving a balanced number of male and female participants is crucial when examining gender differences. However, only 47% of the included studies had a balanced gender ratio (0.45–0.55 female participants), as many studies were not primarily focused on gender differences. This imbalance may have weakened the observed effect size for gender differences and was also evident in the moderating effect analysis, where the ratio significantly influenced gender-related effects. Furthermore, the reliance on self-reported measures in the included studies may introduce variability. Considering that participants’ choices in both self-reported questionnaires and laboratory tasks were driven by their subjective preferences, the potential limitations of self-reported data remain. Future research could incorporate objective measures, such as eye-tracking or neuroimaging, to complement traditional self-reported data and mitigate these limitations. Another limitation is the exclusion of unpublished studies. While including such studies could provide a more comprehensive representation of the evidence within the field, their quality is often difficult to assess, which may reduce the reliability of the meta-analytic results (Cook et al. 1993). Future studies could attempt systematic retrieval of unpublished studies and explore practical methods to address the challenges associated with evaluating the quality of these studies.

Domain-specific generalizability

The conclusions drawn are limited to monetary rewards (secondary reinforcers) and to choice-based tasks, which potentially limits their generalizability to other domains of resource allocation. There remains an ongoing debate regarding whether delay discounting is universal across domains (Odum, 2011; Odum et al. 2020) or specific to particular contexts (Jimura et al. 2011; Weatherly et al. 2010). Future research should explore appropriate methods to integrate various types of delay discounting tasks and measurement approaches more comprehensively, thereby enhancing the generalizability of the results and providing a more thorough understanding of the delay discounting phenomenon. In addition, although not all English-language articles sampled exclusively from English-speaking countries, and some studies came from Eastern countries such as China, the sample was still composed mainly of studies from the Americas and Europe, with 65% of participants originating from these regions. Therefore, caution is advised when generalizing the findings to other nations, particularly those in developing regions.

The remainder of the discussion is organized around the moderator analysis results, the theoretical comparison, and the implications.

Findings of the moderator and subgroup analyses

The moderation analyses indicate that age (both as a continuous and as a categorical variable) moderates gender differences in delay discounting. The geographical region of participants also moderates the gender effect on delay discounting. These findings are discussed in turn below.

First, when age was treated as a continuous variable, the regression analysis showed that gender differences gradually weaken as age increases. We then used participant type as a grouping variable to explore the moderating effect of age on gender differences. The results showed significant gender differences solely among adults, indicating that age acts as a moderator of gender differences in delay discounting behavior. The fluctuation of gender differences in delay discounting over the lifespan corresponds with fluctuations in sex hormone levels (Ober et al. 2008), indicating that gender differences in delay discounting are not static. Specifically, gender differences are not significant during adolescence, peak during young and middle adulthood, and gradually decrease with age, particularly post-menopause, leading to the eventual disappearance of most differences. These research outcomes imply that sex hormones are pivotal in shaping and perpetuating gender-specific trends in delay discounting behavior. Moreover, changes in gray matter in the brain further support the dynamic developmental patterns of gender differences in delay discounting across age groups. Studies indicate that while males and females exhibit similar gray matter density at age 8, females experience a more rapid increase in gray matter density during adolescence, resulting in higher overall gray matter density compared to males (Kaczkurkin et al. 2019).

Second, the impact of “region” on gender differences in delay discounting appears to be significant. Gender differences in this behavior are predominantly observed in studies conducted in Asian countries, while no such differences are noted in the Americas. Furthermore, subgroup analysis based on country type indicates that gender differences in delay discounting are mainly present in developing countries and are not significant in developed countries. Economic conditions and the gender gap, as captured by the Global Gender Gap Index (GGGI), are likely to influence gender differences in delay discounting across different regions. Economic status is commonly gauged through metrics such as GDP, inflation rates, and the Gini coefficient. The GGGI provides a comprehensive evaluation of gender differences within a country, encompassing aspects like economic participation, educational opportunities, health access, and political empowerment. A prior study that surveyed delay discounting in 61 countries to explore global variation proposed that economic status could underlie such differences (Ruggeri et al. 2022). It is hypothesized that developing countries may exhibit higher economic disparities or distinct gender roles in economic decision making. In terms of social decision making, if women are often assigned roles like caregivers, they may prioritize immediate needs due to limited time and resources for long-term planning. Consequently, even in more gender-equal societies, women may exhibit lower rates of delay discounting. Additionally, cultural norms and traditions can impact individual choices. Societies that emphasize men as primary economic providers may encourage a focus on long-term financial gains over immediate rewards, potentially explaining higher rates of delay discounting among men in regions with lower gender inequality.

In addition to the demographic variables identified in this study—age and region—future research should explore psychological moderating factors, such as personality traits (e.g., self-control, risk preference) and cognitive abilities (e.g., executive function, working memory), as these may play a critical role in gender differences in delay discounting (Keidel et al. 2021). For instance, a recent study using genetic data found a weak to moderate positive correlation between trait impulsivity and delay discounting (Gustavson et al. 2020). Furthermore, the ability to imagine future events is recognized as a key cognitive factor influencing delay discounting. Research has shown that episodic imagery can effectively reduce delay discounting (Rung and Madden, 2018; Scholten et al. 2019). By incorporating these psychological variables, future studies can provide a more nuanced understanding of the relationship between gender and delay discounting behavior, offering valuable theoretical insights for interventions and policy-making.

In delay discounting tasks, the type of reward (real or hypothetical monetary rewards) is an important experimental design factor, raising concerns about the ecological validity and consistency of task results. Some studies suggest that real and hypothetical monetary rewards have similar incentive effects on delay discounting. For instance, Johnson and Bickel (2002) compared the impact of real and hypothetical monetary rewards on participants’ decision-making behavior and found no significant differences in discounting rates between the two reward conditions. Subsequently, Madden et al. (2003, 2004) confirmed these findings in a replication study, ruling out potential effects of repeated measurements. Lagorio and Madden (2005) introduced forced-choice and free-choice conditions into the delay discounting task, similarly finding no significant differences in decision-making behavior between real and hypothetical rewards. Furthermore, Bickel et al. (2009) used functional magnetic resonance imaging (fMRI) to explore the neural mechanisms underlying impulsive decision making with real and hypothetical monetary rewards. They found that both types of rewards activated brain regions associated with reward pathways and prefrontal control functions to a similar extent, with no significant differences in activation levels. These studies collectively indicate that hypothetical monetary rewards can serve as effective incentives, comparable to real rewards, in delay discounting tasks. These findings align with the results of our subgroup analysis, which revealed no significant differences in the effects of real versus hypothetical monetary rewards on the gender differences observed in delay discounting. However, some researchers have proposed alternative perspectives. For example, Weafer and de Wit (2014) summarized in their review that gender differences in delay discounting tasks may be context-dependent. 
Specifically, they suggested that women tend to exhibit higher discounting rates and prefer immediate rewards in hypothetical reward tasks, whereas men may exhibit steeper discounting in tasks involving real rewards or opportunities to receive rewards through lotteries. While this perspective provides an intriguing interpretative framework, it is important to note that the conclusions drawn by Weafer and de Wit are primarily based on a review of existing studies and lack empirical data or meta-analytic evidence. As such, the generalizability and reliability of their claims remain to be further validated. Therefore, when interpreting gender differences in delay discounting tasks, caution should be exercised in considering the role of contextual factors, and future research should incorporate larger sample sizes and diverse experimental designs to provide more robust empirical evidence.

Theoretical comparison and explanation

Our research reveals a small but statistically significant gender disparity in delay discounting, with males exhibiting higher rates than females. This indicates a stronger inclination among males towards immediate small rewards. These findings are consistent with the theories of gender differences discussed earlier, whether from an evolutionary perspective or from social role theory, which lends further support to our results.

From an evolutionary perspective, gender differences in decision preferences may reflect genetic foundations of early human survival strategies. Due to higher competitive pressures and challenges in resource acquisition throughout evolution, males may seek immediate rewards to maximize resource utilization or manage risks efficiently, leading them to exhibit a tendency towards immediate small rewards in delay discounting tasks. Simultaneously, traditional social role divisions expect males to demonstrate adventurous, competitive, and decisive traits, while females are expected to exhibit cautious, cooperative, and long-term perspectives (Eagly, 1987; Wood and Eagly, 2015). These gender role expectations become internalized during individual development, shaping their decision preferences. Therefore, males may prefer immediate small rewards in delay discounting tasks, aligning with societal norms associated with their gender roles.

However, based on moderation and subgroup analyses, social role theory appears to more effectively explain the gender differences we observed. Our study reveals significant gender disparities in delay discounting across different age groups and regions, indicating that environmental and social factors play a crucial role in shaping these differences. Specifically, variations in age and geographical location seem to moderate the differences between men and women in delay discounting tasks, aligning with expectations from social role theory.

In childhood and adolescence, gender roles are still developing (Martin and Ruble, 2010; Stynes et al. 2021), leading to a less pronounced divergence in decision making between genders as individuals are primarily exploring various social roles (Benish-Weisman et al. 2022). Young adults, however, are at a crucial stage where gender role expectations and cultural norms strongly influence their decision-making processes (Hutchison et al. 2016). Differences in educational opportunities, career expectations, and social pressures faced by men and women during this phase are reflected in their intertemporal decision-making patterns. For instance, men tend to favor high-risk, high-reward long-term investments, while women often prioritize stability and safety with lower-risk short-term investments. However, as individuals age, societal roles stabilize, and the influence of social expectations weakens. Gender differences in decision making may diminish in old age as older adults prioritize personal values and quality of life over gender norms (Lu et al. 2023). Health status and economic conditions also play a significant role in shaping intertemporal decision making among the elderly, potentially overshadowing gender disparities. Moreover, regional factors further impact gender differences in decision making. Cultural and economic variations across regions shape distinct gender role expectations and norms (Best and Puzio, 2019; Mazzuca et al. 2023). In some regions, men are expected to display competitiveness, while women are tasked with family and community care-taking roles (Wang et al. 2024). These expectations influence men and women’s decision-making tendencies. In certain cultures, men are encouraged to pursue economic independence and career advancement, leading to long-term financial planning and investment. Conversely, women may prioritize family and community, focusing on short-term budgeting. 
Regional gender role expectations not only influence decision content but also the factors considered during the decision-making process.

While age and region are influential in determining gender differences in intertemporal decision making, it is crucial to recognize that these disparities are not universally moderated or eradicated. Factors such as individual distinctions, educational backgrounds, and job characteristics can significantly impact intertemporal decision making. Therefore, a holistic understanding of gender variations in intertemporal decision making requires a comprehensive evaluation of multiple elements, including age, region, and individual attributes.

Implications

The dynamics of gender differences

The present study highlights the intriguing notion that gender disparities exhibit greater prominence in adults, indicating that age plays a significant role in modulating such distinctions. This discovery underscores the importance of investigating how individual decision-making tendencies evolve throughout the lifespan, prompting further exploration into the dynamic progression of gender differentials across various age cohorts. As individuals transition through distinct life stages, they encounter evolving social, economic, and psychological pressures, which directly influence the manifestation of gender variations at different points in time. During early phases, educational and career considerations may exert a substantial impact, potentially accentuating gender variations in preferences for immediate gratification. With advancing age, familial responsibilities, social interactions, and long-term planning may assume greater significance in shaping the expression of gender disparities. Furthermore, the interplay of physiological and psychological transformations over the lifespan, including alterations in cognitive abilities and social needs, is likely to significantly contribute to gender differentials. Consequently, future research endeavors should place heightened emphasis on comprehensively examining the dynamic trajectory of gender distinctions across diverse age demographics to enhance our understanding of the underlying mechanisms governing this phenomenon. It is imperative to recognize that gender differentials are not static traits but rather dynamic attributes that necessitate contextualization within the broader framework of the lifespan. This perspective carries implications for tailoring interventions in a more personalized and phased manner and for advancing a more holistic theoretical framework within gender studies.

The critical role of culture in shaping individual decision-making preferences

Gender differences exhibit greater prominence in certain regions, underscoring the profound influence of culture on individuals’ values and decision-making behaviors. This discovery offers fresh empirical backing for cultural psychology, enhancing our comprehension of how culture molds behavioral patterns. In Asia and developing nations, entrenched cultural values notably shape individual cognition and behavior. Gender roles in these cultures come with specific social expectations, directly impacting decision-making performance. For instance, certain cultures prioritize financial success and competition for men, while women are expected to prioritize family and social connections. These gender role divisions not only define individuals’ ideas of success and accomplishment but also influence their preferences for immediate rewards versus long-term investments. The present study’s findings also reflect the role of culture in shaping education, family values, and media influence. Through social learning, individuals gradually internalize perceptions of gender roles, significantly influencing their decision-making processes. If a culture links immediate rewards with males, individuals may tend to adopt this behavior to align with societal norms. Conversely, societies that highlight women’s roles in long-term planning and family may lead women to focus on stability and long-term investments, showing less inclination toward immediate rewards. In sum, this research sheds light on how culture shapes gender differences and decision-making behaviors, underscoring culture’s significance in individual psychological processes. These insights carry weighty implications for developing culturally attuned decision-making approaches and fostering appreciation for cultural diversity.

Developing delay discounting models that capture interactions among multiple factors

Traditional discounted utility models (DU models) and their variations (e.g., hyperbolic discounting models) are commonly used to study delay discounting. However, these models assume a single discounting rate, overlooking individual and contextual heterogeneity, and thus struggle to fully explain the complexity of behaviors observed in delay discounting. This limitation aligns with the perspective of Frederick et al. (2002), who argued that delay discounting is not merely the result of a single discounting process but is driven by multiple psychological mechanisms. Their multiple-motives account provides a theoretical framework for understanding inter-individual and cross-context behavioral differences in delay discounting. For instance, in the study of gender differences, the behaviors exhibited by males and females in delay discounting may not simply stem from differences in discounting rates but rather from the interplay of various psychological motives. Females may place greater emphasis on the reliability of future rewards (anticipatory utility), while males may be more influenced by emotional factors or immediate gratification motives. Such differences can be more comprehensively explained through the lens of the multiple-motives account. Therefore, research into gender differences in delay discounting should focus on the interaction of these psychological mechanisms rather than solely attributing differences to the levels of discounting rates. Future studies should aim to develop delay discounting models that account for the interactions among multiple contexts and motives. These models should be capable of adapting to dynamic preference changes, incorporating factors such as temporal shifts, contextual adjustments, and learning effects, to explain heterogeneity across contexts and individuals.
Such models would not only provide a more accurate understanding of gender differences but also shed light on cultural variations, social norms, and their influence on delay discounting behaviors. By developing these models, we can achieve a more comprehensive understanding of the complexity of delay discounting and provide theoretical support for designing personalized intervention strategies.
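
To make the single-rate models discussed above concrete, the sketch below contrasts the hyperbolic form given in the Introduction, V = A / (1 + kD), with an exponential discounted utility form, and computes the discounting rate at which a person is indifferent between two options. The function names are hypothetical, and the exponential form V = A·exp(−rD) is one common parameterization of the DU model assumed here for illustration.

```python
import math


def hyperbolic_value(amount, k, delay):
    """Subjective value under hyperbolic discounting, V = A / (1 + kD)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, r, delay):
    """Subjective value under exponential (discounted utility) discounting."""
    return amount * math.exp(-r * delay)


def indifference_k(sooner_amount, later_amount, delay):
    """Hyperbolic k at which an immediate and a delayed reward are valued equally."""
    return (later_amount / sooner_amount - 1.0) / delay


def prefers_sooner(sooner_amount, later_amount, delay, k):
    """True if the immediate reward exceeds the discounted delayed reward."""
    return sooner_amount > hyperbolic_value(later_amount, k, delay)
```

For the example trial from the Introduction ($100 today or $200 in 15 days), the indifference point is k = 1/15 per day: under the hyperbolic model, anyone with a larger k prefers the immediate $100, and anyone with a smaller k prefers to wait. A single k per person is exactly the simplification that the multiple-motives perspective calls into question.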

Practical significance

Delay discounting is widely used to predict and measure behavioral disorders, with research showing a strong correlation between addiction disorders and higher delay discounting rates (MacKillop et al. 2011). Gender differences in these behavioral disorders have also garnered significant attention, with men being notably more likely than women to engage in addictive behaviors such as gambling and online gaming (Chóliz, 2016; Su et al. 2019). Therefore, understanding the role of gender in delay discounting can provide theoretical support for developing personalized interventions to optimize decision-making processes (Gillebaart and de Ridder, 2015). For example, in the education sector, gender-specific financial literacy programs can help men strengthen long-term planning awareness, while enhancing women’s risk management skills. In the financial sector, banks and investment institutions can design differentiated products based on gender characteristics, such as offering incentive-based savings tools for men to promote long-term financial planning and providing more investment education to women to enhance their risk decision-making capabilities. At the same time, policies should consider regional cultural and economic contexts. In developing countries, efforts should focus on strengthening women’s economic independence and raising awareness among men about the importance of saving, while in developed countries, public education should promote rational consumption and saving. These integrated interventions can reduce gender differences in delay discounting and foster healthier long-term decision-making behaviors.