Human cooperation is a remarkable phenomenon, distinguishing our species from others in the animal kingdom. Unlike most mammals, including our closest genetic relatives, humans achieve large-scale cooperation among unrelated individuals. This propensity for prosociality has been crucial to our success as a species. However, the evolutionary origins and mechanisms that maintain human prosociality remain an important puzzle. An emerging explanation for the evolution of human cooperation is cultural group selection1,2. Extending multilevel selection processes2,3,4, cultural group selection theory emphasizes two key conditions: substantial between-group variation coupled with minimal within-group variation in the proportion of altruistic behavior; and sufficiently strong intergroup competition as a form of group-level selection. Specifically, when groups with high internal homogeneity form, cooperators can, on average, obtain higher payoffs than defectors (Fig. 1a). When this first condition is met, sufficiently strong between-group selection processes drive the evolution of cooperation. In fact, cultural \(\:{F}_{ST}\), an index of cultural differentiation between groups, measured for the real-world human population, is typically large enough for cultural group selection to work5, despite the small genetic \(\:{F}_{ST}\)—an index of genetic differentiation between groups, resulting from frequent intergroup migration. Consequently, the debate surrounding cultural group selection primarily addresses the second condition: intergroup competition’s nature and strength.

Researchers have proposed two intergroup selection processes that could make this possible6,7: differential group survival rate8, where cooperative groups survive better under strong group-level selection pressures (e.g., climate change or epidemics), and intergroup conflicts (e.g., warfare), which can directly influence group members’ survival and reproduction rates. The latter process is exemplified by Choi and Bowles’9 model of coevolution between intergroup conflicts and parochial altruism. Empirical evidence has also shown that intergroup competition elicits altruism, with studies illustrating increased reciprocity during wartime10 and enhanced third-party punishment and cooperation in economic games with intergroup payoff conflicts11,12,13,14,15,16,17. Intergroup conflicts are also likely to favor types of altruism influenced by group boundaries, e.g., parochial altruism or limited and personal prosociality, which are observed in non-WEIRD societies18,19. Nevertheless, the extent to which this type of group-level selection drove the evolution of cooperation remains debated due to difficulties estimating war frequencies in prehistoric societies20,21.

However, cultural group theorists offer two alternative processes for intergroup competition6,7,22: selective migration to successful groups23 and imitation of successful groups. Considering humans’ unique social learning abilities24, the latter process, upon which this study focuses, is likely to be a powerful channel of intergroup selection, allowing cooperation to diffuse beyond group boundaries (Fig. 1b). A key distinction from warfare-based selection processes is that this cultural diffusion does not require intergroup conflict. Consequently, this mechanism may cause generous and impersonal prosociality—a type of altruism unaffected by group boundaries and more commonly observed in WEIRD societies18. If this intergroup competition process was indeed cooperation evolution’s primary driver, we would expect cooperative behaviors and norms to continue spreading across group boundaries even in modern times, which have a lower frequency of intergroup conflicts compared to prehistoric eras.

Fig. 1

Graphical description of the spread of cooperative behavior through imitation of successful out-groups in social dilemma situations. Red devil icons represent defectors, blue smiley icons represent cooperators. (a): Conceptual diagram of the payoff structure in social dilemmas. Cooperators can always gain higher payoffs by switching to defection, while a group of cooperators achieves higher collective payoffs than a group of defectors. (b): Cooperation spread process. \(T_{0}\): Defector group observes a more successful group of cooperators. \(T_{1}\): They copy the successful group. \(T_{2}\): Cooperative behaviors increase within the group through repetition of \(T_{1}\) (see Discussion section for possible processes stabilizing cooperative behavior within groups).

Despite these significant implications, there remains little empirical examination of cooperative behavior spreading through the imitation of successful groups. Modeling analysis of experimental results suggests a strong tendency for humans to imitate the successful25, and historical anecdotes from early Christian communities’ expansion in the Roman Empire suggest the imitation of successful individuals from neighboring groups26. Furthermore, theoretical evidence suggests that group-beneficial behaviors in coordination games can indeed spread through payoff-biased imitation27. Therefore, even in social dilemma situations, less-successful groups can potentially copy the cooperative behaviors of out-groups, possibly leading to the diffusion of cooperation across the population. Unlike previous studies assuming intergroup payoff competition in economic game experiments11,12,13,14,15,16,17, our research explores this possibility without direct competition between groups. We first examine the causal impact of cooperative out-group information on subsequent cooperative behavior through laboratory experiments. We subsequently analyze country-level data to investigate how accessibility to other groups’ information influences different types of prosocial values, such as personal prosociality (e.g., trust in family, willingness to fight for country) or impersonal prosociality (e.g., general trust).

We conducted two-condition experiments to manipulate opportunities for observing successful out-group information and examine its impact on cooperation in less-successful groups. In both conditions, groups of four participants played a repeated public goods game (PGG) divided into two 30-round blocks using a partners protocol (Fig. 2a). Each round, participants received an endowment of 20 points. They then decided how much to contribute (\(X_{i}\in\{0,1,\ldots,20\}\)) to the public good, keeping \(20-X_{i}\) points for themselves. For each point contributed, each group member earned 0.4 points (i.e., MPCR = 0.4). Thus, the per-round payoff of participant \(i\) was \(20-X_{i}+0.4\sum_{j=1}^{4}X_{j}\), with \(X_{i}=0\) as the Nash equilibrium. To establish non-cooperative groups in the 1st block, all participants played with three non-cooperative bot players showing negative reciprocity (Fig. 2c; see Methods for bot algorithm details). After the 1st block, participants received summary feedback on their group’s results. Here, participants randomly assigned to the “control” condition were presented with only their own group’s information, while participants assigned to the “out-group information” condition were also shown results from a more cooperative out-group (Fig. 1b). This out-group information was identical for all participants in this condition, representing a more cooperative and successful group (mean contribution: 16.7 points, mean payoff: 30.1 points; see Methods for details). Participants then played PGG for 30 rounds in the 2nd block with the same group composition. This design allowed us to manipulate exposure to cooperative behaviors of the out-group and examine its impact on subsequent cooperative behaviors. In Experiment 1, participants continued playing with bot players in the 2nd block to control for others’ behaviors.
In Experiment 2, to examine whether payoff-biased imitation’s effects on cooperation diffusion would be amplified through interaction, we had four participants play the public goods game together in the 2nd block (Fig. 2a). Importantly, participants’ rewards were determined solely by their own group’s results in PGG, with no payoff-based competition between groups (see Methods for reward details).
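The payoff rule described above can be sketched in a few lines of Python. The endowment (20 points), group size (four) and MPCR (0.4) come from the text; the function name `round_payoffs` and the example contribution profiles are illustrative assumptions, not part of the study materials.

```python
ENDOWMENT = 20  # points received each round (from the text)
MPCR = 0.4      # marginal per-capita return (from the text)


def round_payoffs(contributions):
    """Return each group member's payoff for one round of the public goods game.

    `contributions` lists the points X_j that each member puts into the public good.
    """
    for x in contributions:
        if x != int(x) or not 0 <= x <= ENDOWMENT:
            raise ValueError("each contribution must be an integer from 0 to 20")
    # Every member earns MPCR points for each point contributed by anyone.
    public_return = MPCR * sum(contributions)
    # Payoff = points kept + share of the public good.
    return [ENDOWMENT - x + public_return for x in contributions]


# Hypothetical contribution profiles for a four-person group.
full_cooperation = round_payoffs([20, 20, 20, 20])
full_defection = round_payoffs([0, 0, 0, 0])
one_free_rider = round_payoffs([0, 20, 20, 20])
```

Under these assumptions, full cooperation yields 32 points per member and full defection yields 20, while a lone free rider among three full contributors earns 44. This is the social dilemma structure: defecting always pays more for the individual, yet a fully cooperative group outperforms a fully defecting one.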

Fig. 2

Overview of the experimental design used to test the effect of successful out-group information on cooperative behavior. (a): Two experimental designs. In both experiments, participants play 30-round public goods games in each block. During the manipulation phase between blocks, a summary of results in the 1st block is displayed. Participants are randomly assigned to either the treatment group (blue), which is informed of another successful group, or the control group (red), which is not informed, and then play the 2nd block with the same members (partners protocol). In Experiment 1, participants play with bot players in both blocks, while in Experiment 2 they play with real participants in the 2nd block. (b): Summary of results in the 1st block shown during the manipulation phase. The control group can see information only about their own group (Group 1), while the treatment group (“out-group information” condition) can also see information about another group (Group 2) that achieved higher average cooperation points (16.7 out of 20) and payoffs (30.1 out of 32). (c): Bot player algorithm. The contribution amount \(x_{i,t}\) of bot player \(i\) in round \(t\) was sampled from a normal distribution \(N(0.8\cdot\mu_{t-1},3)\), rounded to the nearest integer, and truncated to [0, 20], where \(\mu_{t-1}\) represents the average contribution of other players in round \(t-1\).
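The bot rule in panel (c) can be sketched as follows. Two points are assumptions here, because the caption does not settle them: the second parameter of \(N(\cdot, 3)\) is treated as a standard deviation, and "truncated" is implemented as clipping to the range rather than resampling. The function name `bot_contribution` is hypothetical; the actual algorithm is described in the Methods.

```python
import random


def bot_contribution(others_prev_mean, rng, sd=3.0):
    """Sample one bot contribution for round t.

    others_prev_mean: average contribution of the other players in round t-1.
    rng: a random.Random instance (passed in so that runs are reproducible).
    sd: assumed to be the standard deviation of the normal distribution.
    """
    draw = rng.gauss(0.8 * others_prev_mean, sd)
    # Round to the nearest integer, then clip to the allowed range [0, 20].
    return min(20, max(0, round(draw)))


# Usage: bots respond to a hypothetical previous-round average of 10 points.
rng = random.Random(2021)
samples = [bot_contribution(10, rng) for _ in range(1000)]
```

Because the bot targets 0.8 times what the others contributed on average, its expected contribution always sits below theirs, which is what makes the bots negatively reciprocal and drives contributions downward over rounds.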

While experimental studies allow us to examine the causal relationship between observing cooperative out-groups and following cooperative behavior, they have difficulty measuring various types of prosociality beyond economic game contexts. One way to address these problems would be to obtain quantitative real-world data, which approximate the experimental conditions and observations, and test our hypothesized model. To this end, we conducted a cross-cultural study using countries as units of analysis. Following the results from our experiments, we investigated how independent variables approximating opportunities to observe out-group information predict various types of prosocial values obtained from the World Values Survey. This analysis controlled for countries’ political systems, quality of rule of law and regional correlations. In this survey, we aimed to augment and complement our experimental findings with broader real-data observations.

Results

Study 1: laboratory experiments

We verified random assignment and ensured comparability between conditions in both experiments by examining the difference in contribution amounts in the 1st block, finding no significant effect of condition (see Methods for detailed results of the manipulation check). Furthermore, we confirmed that participants in the out-group information condition received information about a more cooperative and successful group than their own (Fig. 2b; mean contribution: 16.7 points, mean payoff: 30.1 points). Indeed, these values were substantially higher than those observed in participant groups (mean contribution: 6.1 points, mean payoff: 23.2 points during the 1st block in both experiments). Note that over the course of the research, we switched from the frequentist null-hypothesis-significance-testing approach (for analyzing behavioral data of the two experiments) to the Bayesian statistical modeling approach (for analyzing participants’ reciprocity in the experiments and cross-cultural data). Although we report these analyses in the order they were conducted, we subsequently reanalyzed all the frequentist tests using hierarchical Bayesian models and confirmed qualitatively the same results (see Supplementary information for details).

Experiment 1

We conducted Experiment 1 in October 2021 with a sample of 105 students from Hokkaido University (53 participants in the out-group information condition and 52 in the control condition; see Methods for further information). Figure 3 provides a visual summary of the results from Experiment 1. To examine the immediate effect of observing a successful out-group on cooperative behavior, we first analyzed contributions in the 1st round of the 2nd block. Figure 3a displays the participants’ 1st-round contributions for each condition in both blocks, with horizontal lines representing mean values. To statistically test our manipulation’s impact between blocks, we performed GLMM analysis, predicting 1st-round contributions using block and condition dummy variables, with participant ID as a random effect. The results revealed a significant interaction effect between block and condition variables (2.829, 95% CI [0.752, 4.902], t(158.951) = 2.667, p = .008), suggesting that observing information regarding a cooperative group led to higher cooperative behavior in the immediately following round.
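To make the block × condition interaction concrete: with dummy-coded block and condition, the fixed-effect interaction corresponds to a difference-in-differences of cell means, that is, the change from the 1st to the 2nd block in the out-group information condition minus the same change in the control condition. The sketch below illustrates only this interpretation. It ignores the participant-level random effect of the GLMM, and all numbers are made up for illustration.

```python
from statistics import mean

# Hypothetical 1st-round contributions, keyed by (condition, block).
toy_cells = {
    ("control", 1): [8, 10, 9],
    ("control", 2): [9, 11, 10],
    ("outgroup", 1): [8, 9, 10],
    ("outgroup", 2): [12, 13, 14],
}


def interaction_estimate(cells):
    """Difference-in-differences of cell means (block x condition interaction)."""
    change_outgroup = mean(cells[("outgroup", 2)]) - mean(cells[("outgroup", 1)])
    change_control = mean(cells[("control", 2)]) - mean(cells[("control", 1)])
    return change_outgroup - change_control


did = interaction_estimate(toy_cells)  # (13 - 9) - (10 - 9) = 3 for the toy data
```

A positive value means that contributions rose more between blocks in the out-group information condition than in the control condition, which is the pattern the reported interaction coefficient describes.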

Figure 3b illustrates the mean contributions across rounds for both blocks in each condition. Solid lines represent observed values, with shaded regions indicating standard errors. We estimated linear regression lines using GLMM with participant ID as a random effect, depicted by dashed lines. The interaction effect between block and condition was significant (2.681, 95% CI [1.973, 3.390], t(6284) = 7.419, p < .001), suggesting that the intercepts (i.e., 1st-round contributions) in the 2nd block differed significantly between conditions, consistent with our previous analysis. In both blocks, mean contributions decreased over rounds (fixed effect of round: −0.207, 95% CI [−0.226, −0.189], t(6190) = −21.644, p < .001), likely due to participants playing the public goods game with non-cooperative bot players exhibiting negative reciprocity. Furthermore, the three-way interaction between round, block, and condition, which equals the difference between the slopes of both conditions for the 2nd block, was significant (−0.099, 95% CI [−0.136, −0.061], t(6190) = −5.145, p < .001), indicating a significantly steeper decline in the 2nd block for the out-group information condition. Consequently, contributions in the out-group information condition eventually decreased to levels similar to those in the control condition.

Finally, for ad hoc analysis, we divided the 30 rounds for each block into six chunks comprising five rounds. Figure 3c shows estimated marginal means and standard errors calculated using GLMM for each chunk. Comparing the marginal means between conditions reveals that the manipulation effect persists significantly until chunk 2. However, the difference gradually diminishes in subsequent chunks (see Supplementary Table S1-2 for comprehensive analysis results across all chunks).
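The chunking step is simple bookkeeping and can be sketched as below. Note that the reported values are estimated marginal means from a GLMM, whereas this sketch only computes raw per-chunk averages; the function names are hypothetical.

```python
def chunk_of(round_number, chunk_size=5, n_rounds=30):
    """Map a 1-based round number to its 1-based chunk index (rounds 1-5 -> chunk 1)."""
    if not 1 <= round_number <= n_rounds:
        raise ValueError("round_number out of range")
    return (round_number - 1) // chunk_size + 1


def chunk_means(contributions, chunk_size=5):
    """Average a block's per-round contributions within consecutive chunks."""
    if len(contributions) % chunk_size != 0:
        raise ValueError("number of rounds must be a multiple of chunk_size")
    return [
        sum(contributions[start:start + chunk_size]) / chunk_size
        for start in range(0, len(contributions), chunk_size)
    ]


# Usage with a hypothetical 30-round contribution series.
example_means = chunk_means(list(range(1, 31)))
```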

These experimental results demonstrate that imitating successful groups led to the inter-group transmission of cooperative behavior. However, due to the experimental design, participants again played with non-cooperative bot players in the 2nd block, quickly resulting in mutual defection and preventing the spread of cooperation within the group. To address this limitation, we conducted Experiment 2, where participants played PGG with other actual participants in the 2nd block. This allowed us to examine whether interaction with other participants could amplify payoff-biased imitation’s effects on cooperation diffusion. We conducted power analysis to achieve a statistical power of 0.8 for Experiment 2 using the results of Experiment 1. Based on this analysis, we determined that a sample of approximately 168 participants would be appropriate (see Supplementary Figure S3 for details).

Experiment 2

We conducted Experiment 2 in June 2022 with 152 students from Hokkaido University, evenly divided with 76 participants in each of the out-group information and control conditions. Per Fig. 4a, similar to Experiment 1, we observed a large difference in contributions between the two conditions in the 1st round of the 2nd block immediately after the manipulation. GLMM analysis revealed a significant interaction effect between block and condition variables (2.421, 95% CI [0.507, 4.342], t(272.491) = 2.479, p = .014), indicating the inter-group transmission of cooperative behavior.

Figure 4b illustrates high cooperation rates across most rounds in both the out-group information and control conditions when participants interacted with each other in the 2nd block. In fact, nine out of 19 groups in each condition achieved full cooperation (all participants contributing max 20 points) for five or more rounds. We estimated linear regression lines using GLMM with participant ID as a random effect, depicted by dashed lines in Fig. 4b. The interaction effect between block and condition was significant (2.166, 95% CI [1.511, 2.820], t(8939) = 6.483, p <.001), indicating a significant difference in intercepts in the 2nd block between conditions. In both blocks, mean contributions decreased over rounds as in Experiment 1 (fixed effect of round: −0.193, 95% CI [−0.210, −0.176], t(8927)=−21.839, p <.001). However, we found no significant three-way interaction effect between round, block, and condition (0.004, 95% CI [−0.149, 0.157], t(8939) = 0.053, p =.958), i.e., no difference in slopes between conditions in the 2nd block. Unlike Experiment 1, cooperation did not decay as much in the 2nd block when participants played with each other, rather than with bot players.

Figure 4c illustrates estimated marginal means and standard errors calculated using GLMM for each chunk. Comparing the marginal means between conditions reveals that the difference lasts longer in Experiment 2 than in Experiment 1, indicating that transmitted cooperation persisted to some extent in Experiment 2, unlike Experiment 1. However, we did not observe a diffusion process where cooperation in the out-group information condition increased relative to the control group over rounds. In summary, our analysis of cooperative behavior in two experiments shows that, while imitation of a cooperative out-group allows cooperative behaviors to be culturally transmitted to less-successful groups, the subsequent interaction process does not result in the spread of cooperation without additional mechanisms, e.g., punishment. In the next section, we examine the impact of observing out-group cooperation on the underlying reciprocity using model estimation.

Fig. 3

Figure 3: Summary of Experiment 1 results showing the effect of observing successful out-group information. Red represents the “Control” condition where the feedback of the 1st block results includes only the participant’s own group. Blue represents the “Out-group information” condition where the feedback of the 1st block results includes not only the participant’s own group but also a successful out-group. (a): Violin plot of contributions in the 1st round for each condition. Dots represent observed data, horizontal lines represent mean values. (b): Line plot showing mean contributions across 30 rounds in each block. Solid lines and shaded regions represent observed values and their standard errors, dashed lines represent predicted values based on GLMM estimates for the linear regression model. (c): Bar plot showing mean contributions across six chunks in each block (one chunk includes five rounds). Error bars represent standard errors. Significance signs for differences between conditions based on estimated marginal means from GLMM results are shown (2nd block chunk 1: Control 10.58 vs. Out-group information 14.21, t(175.67) = 4.60, p = .00; 2nd block chunk 2: Control 6.85 vs. Out-group information 9.44, t(175.67) = 3.28, p = .01).

Model Estimation on participants’ reciprocity

The observed increase in cooperation may result from changes in internal models such as reciprocity rates, which represent the proportion of a participant’s contribution relative to the mean contribution of others in the previous round. To investigate this possibility, we estimated reciprocity rates from the data using hierarchical Bayesian modeling. Figure 5 provides a visual summary of participants’ reciprocity for each block in Experiments 1 and 2 (see Supplementary Information on detailed reciprocity models, how to construct Fig. 5 and additional estimated values of parameters). The x-axis shows the average contribution of the other three members in the t-1 round, while the y-axis shows the participant’s contribution in the t round. Bubble sizes represent the number of data points, dotted lines show estimated mean reciprocity, and shaded regions represent the 95% credible interval (HDI) of the predictions. We found a substantial effect of condition on the reciprocity slope parameter. In the 2nd block for Experiment 1, the mean estimate of the difference in slopes (i.e., reciprocity) between conditions was 0.16 (95% HDI [0.10, 0.22]). For Experiment 2, this difference was 0.09 (95% HDI [−0.03, 0.21]), suggesting that information about another successful out-group may lead to an increase in reciprocity rates, to some extent, in their own groups. This finding implies that the spread of cooperative behaviors may arise not only from direct imitation of the behavior, but also from changes in underlying psychosocial processes, e.g., reciprocity rates.
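As a rough illustration of what a reciprocity slope measures, the sketch below pairs a participant's contribution in round t with the others' mean contribution in round t-1 and fits an ordinary least-squares line. This is a simplification assumed for illustration only: the estimates reported here come from hierarchical Bayesian models described in the Supplementary Information, and the data and function names below are hypothetical.

```python
def lagged_pairs(own, others_mean):
    """Pair the others' mean in round t-1 (x) with the own contribution in round t (y)."""
    if len(own) != len(others_mean):
        raise ValueError("series must have the same length")
    return list(zip(others_mean[:-1], own[1:]))


def ols_slope_intercept(pairs):
    """Ordinary least-squares slope and intercept for (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("x values must vary to estimate a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical participant who gives 0.8 times what the others gave in the previous round.
others = [10, 5, 15, 0, 20]
own = [6, 8, 4, 12, 0]
slope, intercept = ols_slope_intercept(lagged_pairs(own, others))
```

For this toy participant the fitted slope is 0.8 with an intercept of 0. A slope closer to 1 would indicate that the participant more fully matches the others' previous contributions.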

Fig. 4

Figure 4: Summary of Experiment 2 results showing the effect of observing successful out-group information with human-human interaction. As in Fig. 3, red represents the “Control” condition and blue represents the “Out-group information” condition. (a): Violin plot of contributions in the 1st round for each condition. Dots represent observed data, horizontal lines represent mean values. (b): Line plot showing mean contributions across 30 rounds in each block. Solid lines and shaded regions represent observed values and their standard errors, dashed lines represent predicted values based on GLMM estimates for the linear regression model. (c): Bar plot showing mean contributions across six chunks in each block. Error bars represent standard errors. Significance signs for differences between conditions based on estimated marginal means from GLMM results are shown (2nd block chunk 1: Control 13.11 vs. Out-group information 15.59, t(262.34) = 3.16, p = .01; 2nd block chunk 4: Control 13.09 vs. Out-group information 14.94, t(262.34) = 2.35, p = .09; 2nd block chunk 5: Control 12.70 vs. Out-group information 14.69, t(262.34) = 2.54, p = .06).

Study 2: Cross-cultural survey

To test our hypothesized model derived from the experiments’ results, we conducted a cross-cultural survey using countries as units of analysis. As shown in Table 1, we approximated the opportunity to observe successful out-groups using the World Press Freedom Index score28 as the independent variable (the analysis using the Freedom on the Net29 index as a supplementary independent variable shows qualitatively similar trends). The World Press Freedom Index score is represented by the average of scores from 2013 to 2021, as these editions used a consistent evaluation method. Consequently, we employed average scores from comparable time periods as representative values for other variables.

To approximate the prosociality, we utilized various prosocial indicators from the World Values Survey30,31 as dependent variables. These included impersonal prosociality types prevalent in WEIRD societies, such as general trust (“Most people can be trusted”), trust in individuals with different characteristics (religion and nationality), and trust in strangers. We also incorporated personal prosociality types observed in non-WEIRD countries, including trust in family, willingness to fight for one’s country, and the social norm of obedience, which is assessed by whether obedience is an important quality in children and has been reported to exhibit a different pattern from general trust32. We obtained dependent variables from the World Values Survey, using data from Wave 6 (2010–2014) and Wave 7 (2017–2022). When data from multiple waves were available for a country, we averaged the values to align with the timeframe of our independent variables.

Our models incorporated variables for each country’s political system and rule of law quality to control for factors potentially influencing press freedom and prosocial values. We used a dummy variable for polity based on each country’s polity score ranging from −10 (hereditary monarchy) to +10 (consolidated democracy)33. For analytical convenience, we classified countries with average scores greater than 0 from 2013 to 2018 as democracies (1) and average scores below 0 as autocracies (0). Rule of law quality was represented by the average of scores between 2013 and 2021 from the World Bank, capturing perceptions of societal rule adherence34. The dataset comprised 75 countries when using press freedom as the independent variable (see Methods and Supplementary Table S3 for further information on the data).
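The variable construction described above (averaging over available years or survey waves, then dichotomizing the polity score) can be sketched as follows. The function names and example values are hypothetical. The text does not say how an average of exactly 0 is handled, so the sketch returns `None` in that case rather than guessing.

```python
from statistics import mean


def average_available(values):
    """Average the values that exist, e.g. a country's scores from Wave 6 and/or Wave 7."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no data available for this country")
    return mean(present)


def polity_dummy(yearly_scores):
    """Return 1 (democracy) if the mean polity score is > 0, 0 (autocracy) if it is < 0.

    Returns None when the mean is exactly 0, a case the stated rule does not cover.
    """
    if any(not -10 <= s <= 10 for s in yearly_scores):
        raise ValueError("polity scores range from -10 to +10")
    avg = mean(yearly_scores)
    if avg > 0:
        return 1
    if avg < 0:
        return 0
    return None


# Usage with made-up values for a single country.
trust_score = average_available([0.25, 0.75])    # both waves available
single_wave = average_available([None, 0.4])     # only one wave available
regime = polity_dummy([9, 9, 10, 10, 10, 10])    # scores for 2013-2018
```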

Table 1 Summary of all variables included in the cross-cultural analysis examining relationships between information access and prosocial values. The table presents variable names, descriptions, data sources and coverage years. The independent variables are two indicators approximating how much people in a country can observe information about out-groups. The World Press Freedom Index score is represented by the average of scores from 2013 to 2021, as these editions used a consistent evaluation method (for results with the variable Freedom on the Net, see Supplementary Information). The dependent variables are indicators representing various types of prosocial values. Data from Wave 6 (2010–2014) and Wave 7 (2017–2022) of the World Values Survey were used (averaged if multiple Waves’ data were available). Control variables include a binary variable indicating the country’s polity status and an indicator of the level of confidence in the quality of societal rules, including police and courts. In the analysis, standardized scores were used for all variables except for the binary polity variable.
Fig. 5

Figure 5: Posterior estimates of individual reciprocity parameters from participants’ contribution decisions in both experiments. As in Figures 3 and 4, red represents the “Control” condition and blue represents the “Out-group information” condition. The x-axis shows the average contribution of the other three members in the t-1 round, while the y-axis shows the participant’s contribution in the t round. The bubble chart represents observed data, with the size of the dots indicating the number of data points. Dotted lines show the estimated mean reciprocity, and shaded regions represent the 95% credible interval (HDI) of the predictions.

Figure 6 displays prosocial values as a function of World Press Freedom scores for each polity type. Red represents autocratic, and blue represents democratic countries. We plot raw data as dots and linear regressions as lines. The estimated slopes’ coefficients are shown at the top of each panel. Here, we highlight the different relationships between press freedom and two types of prosocial values. Particularly within democratic countries, as press freedom increases, we observe a rise in impersonal prosocial values (top row in Fig. 6) such as general trust or trust in out-groups; conversely, we observe a decrease in personal prosocial values (bottom row in Fig. 6). This suggests a qualitative difference between two types of prosociality, as discussed in previous studies18,35.

Fig. 6

Relationships between press freedom (proxy for out-group information access) and various forms of prosociality across countries, analyzed separately for democratic and autocratic regimes. Each panel shows country-level data with World Press Freedom Index scores on the x-axis and different prosocial measures on the y-axis. Red represents autocracy, and blue represents democracy. Dots indicate observed data, while line plots and shaded regions represent linear regression lines and 95% confidence intervals based on predicted values. The estimated slope’s coefficient for each regression line is displayed at the top of each panel.

We conducted two hierarchical linear regression analyses. For both models with countries as units of analysis, spatial autocorrelation can lead to inflated false-positive rates (i.e., Galton’s problem), necessitating control for this correlation. Following previous studies36,37, we employed hierarchical models with six world regions as random intercepts for our analysis (see Methods for detailed information). First, we constructed Model 1 to predict the expected value \(\:\widehat{{y}_{i}}\) of the dependent variable \(\:{y}_{i}\), using the dummy variable for each country’s political system and the variable indicating press freedom as independent variables:

$$\begin{aligned} y_{i} &\sim N\left(\widehat{y}_{i},\ \sigma_{y}\right)\\ \widehat{y}_{i} &= \alpha_{1,j}+\alpha_{2}\cdot \mathit{democracy}_{i}+\mathit{wpf}_{i}\cdot\left(\beta_{1}+\beta_{2}\cdot \mathit{democracy}_{i}\right),\end{aligned}$$
(1)

where \(\mathit{democracy}_{i}\) is a dummy variable representing the political system (democracy equals 1, autocracy equals 0), \(\mathit{wpf}_{i}\) is the standardized World Press Freedom index of each country \(i\), and \(j\) represents the global region (\(j\in\{1,2,\ldots,6\}\)). \(\alpha_{1,j}\), \(\alpha_{2}\), \(\beta_{1}\), and \(\beta_{2}\) are parameters, with \(\alpha_{1,j}\) representing a random effect specific to the global region affecting the intercept. We assumed that \(\alpha_{1,j}\) follows a normal distribution (i.e., \(\alpha_{1,j}\sim \mathrm{Normal}(\mu_{\alpha_{1}},\sigma_{\alpha_{1}})\)). \(\alpha_{2}\) represents the impact of democracy on the intercept, \(\beta_{1}\) is the default slope coefficient for press freedom, and \(\beta_{2}\) represents the additional slope coefficient for press freedom in democratic countries.

Furthermore, previous research has shown that the rule of law index is a significant predictor of societal cooperation tendencies38,39. Therefore, we developed Model 2 to control for the rule of law:

$$\begin{aligned} y_{i} &\sim N\left(\widehat{y}_{i},\ \sigma_{y}\right)\\ \widehat{y}_{i} &= \alpha_{1,j}+\alpha_{2}\cdot \mathit{democracy}_{i}+\mathit{wpf}_{i}\cdot\left(\beta_{1}+\beta_{2}\cdot \mathit{democracy}_{i}\right)\\ &\qquad +\mathit{rol}_{i}\cdot\left(\beta_{3}+\beta_{4}\cdot \mathit{democracy}_{i}\right),\end{aligned}$$
(2)

where \(\mathit{rol}_{i}\) is the standardized score of the rule of law index for each country \(i\). Here, \(\beta_{3}\) and \(\beta_{4}\) are additional parameters, with \(\beta_{3}\) indicating the default slope coefficient for rule of law and \(\beta_{4}\) indicating the additional slope coefficient for rule of law in democratic countries. Note that we confirmed the absence of multicollinearity issues in Model 2 by checking Variance Inflation Factors after removing interaction terms (for detailed model specifications see Methods). We used cmdstanr version 0.8.0 (CmdStan version 2.34.1) and R version 4.4.0 for our analyses. The posterior distributions reported below were based on 40,000 samples for Model 1 and 80,000 samples for Model 2. We confirmed convergence through ESS and Rhat values (most below 1.001, all below 1.02), as well as density and trace plots (see Supplementary Information on model convergence).
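The linear predictors of Models 1 and 2 can be written as small Python functions to make the role of each coefficient explicit. This is a sketch of the deterministic part only, with made-up parameter values; it does not reproduce the Stan models, priors or sampling used in the analysis. Model 1 is obtained from Model 2 by setting the rule-of-law terms to zero.

```python
def predict_model2(alpha_region, alpha2, beta1, beta2, beta3, beta4,
                   democracy, wpf, rol):
    """Expected value of the dependent variable under Model 2.

    alpha_region: intercept of the country's world region (alpha_{1,j}).
    democracy: 1 for democracies, 0 for autocracies.
    wpf, rol: standardized press freedom and rule of law scores.
    """
    return (alpha_region
            + alpha2 * democracy
            + wpf * (beta1 + beta2 * democracy)
            + rol * (beta3 + beta4 * democracy))


def predict_model1(alpha_region, alpha2, beta1, beta2, democracy, wpf):
    """Model 1 is Model 2 without the rule-of-law terms."""
    return predict_model2(alpha_region, alpha2, beta1, beta2, 0.0, 0.0,
                          democracy, wpf, 0.0)


# Hypothetical parameter values, used only to read off the press-freedom slopes.
params = dict(alpha_region=0.1, alpha2=0.2, beta1=-0.1, beta2=0.4)
slope_autocracy = (predict_model1(**params, democracy=0, wpf=1.0)
                   - predict_model1(**params, democracy=0, wpf=0.0))
slope_democracy = (predict_model1(**params, democracy=1, wpf=1.0)
                   - predict_model1(**params, democracy=1, wpf=0.0))
```

A one-unit increase in standardized press freedom changes the prediction by β1 in autocracies and by β1 + β2 in democracies, which is why the sum β1 + β2 is the quantity of interest for democratic countries.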

Figure 7 displays the estimated posterior distributions of coefficients (slopes) in our two hierarchical Bayesian models. For each panel, the outer solid lines represent the 95% Bayesian credible interval (HDI) of the estimates, while the inner shaded region shows the 80% HDI. In Model 1, which includes a dummy variable for polity and the standardized score of World Press Freedom (wpf) as independent variables, we observe that, for impersonal prosocial values (general trust, trust in others who have a different nationality or religion, and trust in strangers), the estimated partial regression coefficients of the wpf slope within the democracy class (β1 + β2) are greater than zero. Conversely, for personal prosocial values (trust in family, social norm for obedience, and willingness to fight for country), these estimated values are less than zero. Model 2, which adds the effect of Rule of Law (rol), shows that, for impersonal prosocial values, the estimated partial regression coefficients of the wpf slope decrease compared to Model 1. In addition, the estimated coefficients of the rol slope have effects in a direction similar to those of the wpf slope, with a particularly strong influence in the model with General Trust as the dependent variable (leftmost). Conversely, for personal prosocial values, the estimated negative effects of wpf on dependent variables persist and even strengthen. Comparing the Widely Applicable Information Criterion (WAIC40) for Models 1 and 2 revealed distinct patterns across dependent variables. For impersonal prosocial values, Model 2 consistently yielded lower WAIC values than Model 1, suggesting superior model fit. Conversely, for personal prosocial values, Model 1 consistently outperformed Model 2, as indicated by lower WAIC values across all measures (Supplementary Table S4 contains a comprehensive comparison of model performance). For information on other parameter estimates, such as region-specific intercepts, see Supplementary Figures S8, S10.
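For readers unfamiliar with how an interval such as the 95% HDI of β1 + β2 is obtained from posterior draws, the sketch below shows one common sample-based approach: add the two coefficients draw by draw, then take the shortest interval that contains the requested share of the draws. This is an illustrative assumption about the computation, not a description of the code used in the analysis, and the toy draws are made up.

```python
import math


def sum_of_slopes(beta1_draws, beta2_draws):
    """Combine paired posterior draws into draws of beta1 + beta2 (democracy slope)."""
    if len(beta1_draws) != len(beta2_draws):
        raise ValueError("draws must be paired")
    return [b1 + b2 for b1, b2 in zip(beta1_draws, beta2_draws)]


def hdi(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = math.ceil(mass * n)  # number of draws the interval must cover
    if k < 2:
        raise ValueError("need more samples for this interval")
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


# Toy draws with one extreme value: the shortest 95% window excludes the outlier.
toy_draws = list(range(20)) + [1000]
toy_interval = hdi(toy_draws)
```

If the resulting interval for β1 + β2 lies entirely above (or below) zero, the draws give consistent support for a positive (or negative) press-freedom slope among democracies.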

Fig. 7

Estimated coefficients (Slopes) of two hierarchical Bayesian models. For each panel, the outer solid lines represent the 95% Bayesian credible interval (HDI) of the estimates, while the inner shaded region shows the 80% HDI. (a) Posterior distribution estimates of Model 1, which includes Polity and World Press Freedom as independent variables. β1: slope for autocracy (autoc) countries as a function of World Press Freedom (wpf) values; β1 + β2: slope for democracy (democ) countries. (b) Posterior distribution estimates of Model 2, which includes Polity, World Press Freedom, and Rule of Law (rol) as independent variables. β3: slope for autocracy (autoc) countries as a function of Rule of Law (rol) values; β3 + β4: slope for democracy (democ) countries.

Discussion

This study primarily aimed to investigate the possibility of cultural evolution of cooperation through payoff-biased imitation of successful out-groups. We thus conducted a series of experimental studies and a cross-cultural survey. Our experiments demonstrated that cooperative behavior can transmit across group boundaries, and fitting reciprocity models confirmed the influence of successful out-group information on decision-making processes. However, the effect of payoff-biased imitation did not endure and gradually disappeared in both experiments. The cross-cultural survey provided observational support for different relationships between two types of prosociality and accessibility to outside information, approximated by press freedom. This classification of prosociality aligns with Henrich’s18 findings, distinguishing between impersonal prosociality prevalent in WEIRD societies and personal prosociality observed in non-WEIRD countries. While democratic countries are known to exhibit WEIRD characteristics, including impersonal prosociality35, our Model 1 results show that press freedom’s impact persists even after controlling for polity and regional correlations. This suggests that intermittent access to out-group information and associated types of prosociality might facilitate cooperation with unrelated individuals, as in our experimental setup. However, Model 2 results indicate that trust in top-down institutional punishment systems (rule of law) exhibits similar effects and even stronger influences on some impersonal prosocial values, while the influence of press freedom approached zero. Conversely, we found that lower press freedom predicts higher personal prosociality, such as trust in family and willingness to fight for the country, even after controlling for the effect of rule of law quality. This suggests that limited access to external information may boost parochial types of altruism.

Our findings offer two significant insights into the cultural group selection processes in the evolution of cooperation. First, many previous studies have assumed or argued that warfare is the primary form of group selection9,41. However, our research suggests an alternative process. We demonstrate the possibility of intergroup selection occurring through cultural processes, specifically payoff-biased imitation, on the premise of intergroup variation. This cultural transmission mechanism could play a crucial role in the evolution of cooperation, potentially rivaling the importance of conflict-based selection. Furthermore, in the contemporary era of rapid and global information transmission, this process might accelerate the evolution of cooperation.

Second, our experimental results highlight the importance of within-group selection processes for transmitted cooperative behaviors. While we found evidence of cooperative behaviors spreading across group boundaries, they did not diffuse within the groups. The primary reason for this failure is the incentive structure of the public goods game, where defection dominates cooperation, regardless of the number of cooperators. Previous theoretical models have shown that group-beneficial behaviors can spread through payoff-biased imitation in coordination games27. In these games, the alternative strategy becomes favored by within-group selection once its proportion exceeds a certain threshold. Experimental studies have demonstrated that people with reciprocal preferences subjectively transform the payoff structure of social dilemmas into an assurance game42, which has the same frequency-dependent nature as coordination games. Thus, we had conjectured that imitated cooperation could also diffuse within a group, but our findings contradicted this expectation. This finding suggests that payoff-biased imitation from cooperative out-groups alone is insufficient, and additional mechanisms are necessary to spread and maintain cooperation within groups. As numerous previous studies have indicated, punishment may be necessary to transform simple social dilemmas into coordination-like games and sustain cooperation within groups43,44,45. Similarly, reputation mechanisms could serve this function to maintain within-group cooperation46.

Our experiments show out-group success information enhances in-group cooperation, while cross-cultural data link press freedom with out-group trust. This apparent paradox likely reflects different mechanisms: experimental participants received specific information about one successful group, potentially motivating their own group’s improvement. In contrast, press freedom exposes individuals to diverse information about multiple groups over years, fostering generalized prosociality. These findings might represent complementary rather than contradictory processes—immediate responses could focus on in-group enhancement through emulation, while sustained exposure to varied information may promote universal cooperation. Future research could test this hypothesis through experiments that manipulate both the target of cooperation (in-group vs. out-group members) and the duration of exposure to out-group information.

In addition, the expected treatment effect may have been obscured by a factor specific to our experimental design. Participants knew the public goods game would last 30 rounds, potentially enabling strategic decisions for long-term payoffs. This might explain the high cooperation rates achieved in both conditions of the 2nd block in Experiment 2, possibly creating a ceiling effect that prevented us from precisely examining our manipulation’s impact. Our cross-country data analysis also faces limitations in establishing a causal link. Despite controlling for variables likely related to both press freedom and prosocial values, we cannot discount the possibility of omitted factors affecting both. To address this limitation and strengthen causal inferences, future research could employ time-series analyses spanning several decades to a century, examining relationships between variables measured in the distant past and present-day outcomes.

In conclusion, two experiments and a cross-cultural survey revealed that opportunities to observe successful groups promote cooperative behavior and generous types of prosociality. Furthermore, our experimental results showing that transmitted cooperative behavior does not diffuse within non-successful groups, coupled with survey findings indicating that rule of law quality predicts generous prosociality, suggest that other mechanisms, such as punishment, may be crucial for maintaining within-group cooperation. Together, our results advance our understanding of group-level selection processes, demonstrating that cultural processes such as payoff-biased imitation can facilitate cooperation beyond group boundaries, and that intergroup conflicts are not the sole mechanism for spreading cooperative traits. Our empirical data contribute to theoretical modeling of cooperation evolution dynamics47 while offering insights into how moral preferences evolve through social learning48, ultimately helping to address the puzzle of large-scale cooperation specific to humans.

Methods

All methods, including the two experiments and the cross-cultural survey, were performed in accordance with the relevant guidelines and regulations. The experimental protocols were approved by the research ethics committee at the Center for Experimental Research in Social Sciences at Hokkaido University, Japan (Experiment 1: number 3–16; Experiment 2: number 6–17). All participants provided informed consent before taking part in the experiments.

Study 1: experiments

Participants

In Experiment 1, we recruited 105 students from Hokkaido University (53 in the out-group information condition, 52 in the control condition; 55 males, 50 females; \(\:{M}_{AGE}=\:19.886\), \(\:S{D}_{AGE}\:=\:1.396\)). For Experiment 2, 152 students from the same university participated (76 in each condition; 99 males, 49 females, 4 other; \(\:{M}_{AGE}=19.211\), \(\:S{D}_{AGE}\:=1.533\)). We determined the sample size for Experiment 2 based on computer simulations using MCMC samples from the reciprocity model estimated from Experiment 1 data, aiming to achieve a statistical power of 0.8 (see Supplementary Figure S3 for details). No participants took part in both experiments. All participants were Japanese. No participant was excluded from the analyses.

Experimental setting

We developed the experimental program using oTree (Ver. 5.8.549). The experimental program used in this study is publicly available at https://www.otreehub.com/projects/multi-group-pgg-experiment/. The experiment consisted of a four-person public goods game with two blocks of 30 rounds each. Except for the 2nd block of Experiment 2, participants played with bot players. We designed the non-cooperative bot player algorithm to make participants feel they were interacting with real participants. The contribution amount \(\:{x}_{i,t}\) of bot player \(\:i\) in round \(\:t\) was sampled from a normal distribution (i.e., \(\:{x}_{i,t}\sim Normal(0.8\cdot\:{\mu\:}_{t-1},\:3)\)), rounded to the nearest integer, and truncated between 0 and 20. Here, \(\:{\mu\:}_{t-1}\) represents the average contribution of other players in round \(\:t-1\). For the first round of each block, we set the contributions of the three bot players to 18, 15, and 12 points, respectively.
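
The bot rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' oTree code: the function name `bot_contribution` is hypothetical, the second parameter of the normal distribution is treated here as a standard deviation, and "truncated" is read as clamping the rounded draw into the range 0 to 20. Both readings are assumptions.

```python
import random

ENDOWMENT = 20              # upper bound on a contribution, per the text
FIRST_ROUND = (18, 15, 12)  # fixed first-round contributions of the three bots


def bot_contribution(mean_others_prev, rng, scale=3.0):
    """Draw one bot contribution from Normal(0.8 * mu_{t-1}, scale).

    Assumption: `scale` is used as a standard deviation, and truncation
    is implemented by clamping the rounded draw into [0, ENDOWMENT].
    """
    draw = rng.gauss(0.8 * mean_others_prev, scale)
    return min(ENDOWMENT, max(0, round(draw)))


rng = random.Random(1)  # seeded only to make the sketch repeatable
print([bot_contribution(15.0, rng) for _ in range(5)])
```

Because the mean of the draw is only 80% of what the other players contributed in the previous round, a bot following this rule tends to undercut its group, which is what makes it non-cooperative.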

Experimental procedure

Participants made all decisions individually on personal computers in separate partitioned rooms, ensuring anonymity. Before the experiment, all participants were told they would play 60 rounds of the public goods game in two 30-round blocks with the same group. In reality, only the 2nd block of Experiment 2 involved actual interactions between participants. For all other blocks, participants played the PGG with three bot players. We instructed participants that eight participants were divided into two groups of four, implying the existence of another group besides their own. After each round, participants received feedback on their group members’ average contribution and payoff in a table format. The order of other members was randomized each trial to prevent individual tracking. After the 1st block, participants received feedback on the overall average contribution and payoff for the entire block. The information presented at this phase differed by condition. In the out-group information condition, participants saw average contribution and reward points for both their group and the other group from the 1st block. In reality, this information was pre-prepared by the experimenters, specifically constructed to represent a highly cooperative and successful out-group. The displayed values (average contribution: 16.7 points, average payoff: 30.1 points) were derived from 100 simulations of 30-round public goods games assuming a 100% reciprocity rate (i.e., the contribution amount follows a normal distribution with mean \(\:{\mu\:}_{t-1}\) and variance 3) among four players with initial contributions of 20 points. This design ensured that all participants in the treatment condition observed a group that was unambiguously more cooperative and successful than their own group, which typically achieved lower cooperation levels due to interaction with non-cooperative bot players. The control condition only presented information about the participant’s own group. Importantly, there were no penalties for lower cooperation rates compared with the other group. As the two groups were not in competition, there was no rational reason to act on this information.
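
As a rough illustration of how the displayed out-group values could be generated, the following Python sketch simulates fully reciprocal groups under the description above. It is not the authors' simulation code. The function name is hypothetical, and the rounding and clamping of contributions to whole points between 0 and 20 is an assumption carried over from the bot algorithm. Only contributions are simulated, so the sketch is not expected to reproduce the reported figures exactly.

```python
import random
import statistics


def simulate_reciprocal_group(rng, n_players=4, n_rounds=30,
                              endowment=20, variance=3.0):
    """Return the mean contribution of one simulated 30-round block."""
    sd = variance ** 0.5               # the text specifies a variance of 3
    current = [endowment] * n_players  # all players start at 20 points
    all_contributions = list(current)
    for _ in range(n_rounds - 1):
        nxt = []
        for i in range(n_players):
            # mu_{t-1}: mean contribution of the other players last round
            others = [c for j, c in enumerate(current) if j != i]
            draw = rng.gauss(statistics.fmean(others), sd)
            # Assumption: whole points, clamped into [0, endowment]
            nxt.append(min(endowment, max(0, round(draw))))
        current = nxt
        all_contributions.extend(current)
    return statistics.fmean(all_contributions)


rng = random.Random(2024)  # arbitrary seed for repeatability
runs = [simulate_reciprocal_group(rng) for _ in range(100)]
print(round(statistics.fmean(runs), 1))
```

Because contributions cannot exceed the endowment, random noise can only pull a group that starts at 20 points downward, so the long-run average settles below 20 even under full reciprocity.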

Experimental rewards

After the experiment, participants received Amazon gift cards as rewards based on their accumulated points. We calculated the reward amount by exchanging the total points earned across all rounds at a rate of 0.8 yen per point, rounding up to the nearest 10 yen. We informed participants about this reward calculation method before the experiment began.
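
The conversion rule can be written as a short Python function. The name `reward_yen` is hypothetical, and exact rational arithmetic is used here only to avoid floating-point surprises when rounding up. The point totals in the example are invented.

```python
import math
from fractions import Fraction

YEN_PER_POINT = Fraction(8, 10)  # 0.8 yen per point


def reward_yen(total_points):
    """Convert accumulated points to yen, rounded up to the nearest 10 yen."""
    yen = Fraction(total_points) * YEN_PER_POINT
    return math.ceil(yen / 10) * 10


print(reward_yen(1500))  # 1200 yen exactly, so no rounding is needed
print(reward_yen(1501))  # 1200.8 yen is rounded up to 1210
```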

Task comprehension

After receiving instructions about the experiment’s structure and reward calculation method, participants answered comprehension questions to ensure they understood the task. Those who answered correctly proceeded to the experiment. For incorrect responses, we provided feedback with the correct answers and explanations of the experimental content. This process continued until participants answered all questions correctly.

Manipulation check for two experiments

For both Experiments 1 and 2, we examined differences between the two conditions in the 1st block to verify that they were comparable before the manipulation. We performed GLMM analyses for both experiments, with participant ID as a random effect, condition as the independent variable, and contribution amounts across all rounds in the 1st block as the dependent variable. Results showed no significant effect of condition in either experiment (difference between conditions, Experiment 1: 0.545, 95% CI [−1.009, 2.098], t(103) = 0.687, p = .494; Experiment 2: −0.321, 95% CI [−1.572, 0.931], t(150) = −0.502, p = .616), indicating that the two conditions did not differ detectably before the manipulation.

Overview of reciprocity models in experiments 1 & 2

We modeled the reciprocal decision-making process underlying cooperation. The reciprocity model for Experiment 1 is as follows:

$$\begin{aligned} &{y}_{o}\sim\:N\left({\widehat{y}}_{o},\:{\sigma\:}_{o}\right)\\& \:{\widehat{y}}_{o}={\alpha\:}_{1,\:i}+{\alpha\:}_{2}\cdot\:Bloc{k}_{o}\\& \qquad +Mean.other{s}_{o}\left({\beta\:}_{1,i}+{\beta\:}_{2}\cdot\:Bloc{k}_{o}+{\beta\:}_{3}\cdot\:Bloc{k}_{o}\cdot\:Conditio{n}_{o}\right)\\& \:{\alpha\:}_{1,\:i}\sim\:N\left({\mu\:}_{{\alpha\:}_{1}}\:,\:{\sigma\:}_{{\alpha\:}_{1}}\right)\\& \:{\beta\:}_{1,i}\sim\:N\left({\mu\:}_{{\beta\:}_{1}}\:,\:{\sigma\:}_{{\beta\:}_{1}}\right),\end{aligned}$$
(3)

where the subscript \(\:o\) denotes the observation (contribution in all rounds except the 1st round; \(\:o\in\:\{1,2,\dots\:,6090\}\)) and the subscript \(\:i\) denotes the participant (\(\:i\in\:\{1,2,\dots\:,105\}\)). The variable \(\:{y}_{o}\) represents the observed contribution amount in the PGG, which is truncated between 0 and 20. \(\:Block\) and \(\:Condition\) are dummy variables (0 for 1st block, 1 for 2nd block; 0 for control condition, 1 for out-group information condition, respectively), and \(\:Mean.others\) represents the average contribution of the other three members in the previous round. We set weakly informative priors: \(\:Normal(0,\:100)\) for \(\:{\alpha\:}_{2}\), \(\:{\beta\:}_{2}\), \(\:{\beta\:}_{3}\), and \(\:{\mu\:}_{*}\), and \(\:Uniform(0,\:100)\) for \(\:{\sigma\:}_{*}\).
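
To make the structure of Eq. 3 concrete, the sketch below evaluates its linear predictor for a single observation in Python. The function name and all parameter values are hypothetical, chosen only so that the arithmetic is easy to follow. They are not estimates from the experiments.

```python
def predicted_contribution(alpha1_i, alpha2, beta1_i, beta2, beta3,
                           block, condition, mean_others):
    """Linear predictor (y-hat) of Eq. 3 for one observation."""
    # Reciprocity slope: baseline, plus a 2nd-block shift, plus an extra
    # shift that applies only in the 2nd block of the out-group condition.
    slope = beta1_i + beta2 * block + beta3 * block * condition
    return alpha1_i + alpha2 * block + mean_others * slope


# Hypothetical values for illustration only
params = dict(alpha1_i=2.0, alpha2=1.0, beta1_i=0.5, beta2=0.25, beta3=0.125)
for block, condition in [(0, 0), (1, 0), (1, 1)]:
    y_hat = predicted_contribution(block=block, condition=condition,
                                   mean_others=8.0, **params)
    print(block, condition, y_hat)
```

In the 2nd block, the two conditions differ only through the term `mean_others * beta3`, so \(\:{\beta\:}_{3}\) captures how much more strongly participants in the out-group information condition reciprocate the previous contributions of their group members.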

The model for Experiment 2 is as follows:

$$\begin{aligned} &{y}_{o}\sim\:N\left({\widehat{y}}_{o},\:{\sigma\:}_{o}\right)\\& \:{\widehat{y}}_{o}={\alpha\:}_{1,\:i}+{\alpha\:}_{2}\cdot\:Bloc{k}_{o}+\:\:Mean.other{s}_{o}({\beta\:}_{1,i}+{\beta\:}_{2,j}\cdot\:Bloc{k}_{o}\cdot\:(1-Conditio{n}_{o}) \\&+{\beta\:}_{3,j}\cdot\:Bloc{k}_{o}\cdot\:Conditio{n}_{o})\\&\:{\alpha\:}_{1,\:i}\sim\:N\left({\mu\:}_{{\alpha\:}_{1}}\:,\:{\sigma\:}_{{\alpha\:}_{1}}\right)\\&\:{\beta\:}_{1,i}\sim\:N\left({\mu\:}_{{\beta\:}_{1}}\:,\:{\sigma\:}_{{\beta\:}_{1}}\right)\\& \:{\beta\:}_{2,j}\sim\:N\left({\mu\:}_{{\beta\:}_{2}}\:,\:{\sigma\:}_{{\beta\:}_{2}}\right)\\&\:{\beta\:}_{3,j}\sim\:N\left({\mu\:}_{{\beta\:}_{3}}\:,\:{\sigma\:}_{{\beta\:}_{3}}\right), \end{aligned}$$
(4)

where the subscript \(\:o\) denotes the observation (\(\:o\in\:\{1,2,\dots\:,8816\}\)) and the subscript \(\:i\) denotes the participant (\(\:i\in\:\{1,2,\dots\:,152\}\)), while the subscript \(\:j\) represents the group (\(\:j\in\:\{1,2,\dots\:,38\}\)). We set weakly informative priors: \(\:Normal(0,\:100)\) for \(\:{\mu\:}_{*}\) and \(\:{\alpha\:}_{2}\), and \(\:Uniform(0,\:100)\) for \(\:{\sigma\:}_{*}\).

Study 2: Cross-cultural survey

Overview of model in cross-cultural survey

For both Model 1 (Eq. 1) and Model 2 (Eq. 2), each country \(\:i\) was nested within one of the six global regions \(\:j\). The six global regions were based on World Bank classifications. To ensure an adequate number of countries in each region, we merged the “North America” region with “Europe and Central Asia” due to their shared cultural backgrounds, despite their geographical distance. We set the random intercept \(\:{\alpha\:}_{1,j}\), assuming \(\:{\alpha\:}_{1,j}\sim\:N\left({\mu\:}_{{\alpha\:}_{1}}\:,\:{\sigma\:}_{{\alpha\:}_{1}}\right)\). The subscript \(\:j\) represents global regions (1: Sub-Saharan Africa, 2: East Asia and Pacific, 3: Europe and Central Asia, 4: Latin America and Caribbean, 5: Middle East and North Africa, and 6: South Asia). We set weakly informative priors: \(\:Normal(0,\:100)\) for \(\:{\mu\:}_{*}\), \(\:{\alpha\:}_{2}\), and \(\:{\beta\:}_{*}\), and \(\:Uniform(0,\:100)\) for \(\:{\sigma\:}_{*}\).

Dataset

For our cross-cultural study, we used the World Press Freedom Index as our primary independent variable. This annual indicator, published by Reporters Without Borders, evaluates media freedom in each country based on surveys of experts and journalists. We also employed the Freedom on the Net index by Freedom House as a supplementary measure, which assesses internet freedom, privacy protection, and digital rights across countries.

From the World Values Survey, we extracted several dependent variables to measure prosocial values: general trust (“Generally speaking, would you say that most people can be trusted or that you need to be very careful in dealing with people?”), trust in others of different nationality (“How much do you trust people of another nationality?”) or religion (“How much do you trust people of another religion?”), trust in strangers (“How much do you trust people you meet for the first time?”), trust in family (“How much do you trust your family?”), social norm for obedience (“Here is a list of qualities that children can be encouraged to learn at home. Which, if any, do you consider to be especially important? - Obedience”), and willingness to fight for country (“Of course, we all hope that there will not be another war, but if it were to come to that, would you be willing to fight for your country?”). We calculated country-level averages after removing missing values. We reversed the coding of some variables to ensure that higher values indicate more positive outcomes and then standardized them.
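
The reverse-coding and standardization steps can be illustrated with a minimal Python sketch. The helper names and the country-level means are hypothetical, the example assumes a four-point item on which 1 is the most trusting response, and the use of the sample standard deviation is also an assumption, since the text does not say which variant was used.

```python
import statistics


def reverse_code(values, low, high):
    """Flip a scale so that its lowest category becomes the highest."""
    return [low + high - v for v in values]


def standardize(values):
    """Convert values to z-scores (mean 0, sample standard deviation 1)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1)
    return [(v - mean) / sd for v in values]


# Hypothetical country-level means on a 1-4 item where 1 = "trust completely"
country_means = [1.8, 2.4, 3.1, 2.7]
z = standardize(reverse_code(country_means, low=1, high=4))
print([round(x, 2) for x in z])
```

After both steps, the country with the lowest raw mean (the most trusting one on the original coding) has the highest standardized score.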

As control variables, we used a binary variable representing each country’s political system (see main text for coding details) and the Rule of Law indicator (again see main text).