Abstract
The invention of agriculture is widely thought to have spurred the emergence of large-scale human societies. It has since been argued that only intensive agriculture can provide enough surplus for emerging states. Others have proposed it was the taxation potential of cereal grains that enabled the formation of states, making writing a critical development for recording those taxes. Here we test these hypotheses by mapping trait data from 868 cultures worldwide onto a language tree representing the relationships between cultures globally. Bayesian phylogenetic analyses indicate that intensive agriculture was as likely the result of state formation as its cause. By contrast, grain cultivation most likely preceded state formation. Grain cultivation also predicted the subsequent emergence of taxation. Writing, although not lost once states were formed, more likely emerged in tax-raising societies, consistent with the proposal that it was adopted to record those taxes. Although consistent with theory, a causal interpretation of the associations we identify is limited by the assumptions of our phylogenetic model, and several of the results are less reliable owing to the small sample size of some of the cross-cultural data we use.
Main
For more than a century, scholars have debated how and why large-scale complex human societies emerged1. Early explanations for the formation of the state suggested that the development of agriculture, which enabled sedentism and surplus production, led to the formation of hierarchies with elites controlling the new states2,3. However, the long gap between the development of agriculture and the widespread emergence of states has encouraged a refinement of this view. It is now commonly suggested that it was only intensive agriculture that provided enough surplus for the needs of emerging states4,5. The timing for the appearance of agricultural intensification—preceding the establishment of hierarchies—is key to supporting this view. A recent study6 investigating the co-evolution of intensive resource use and socio-political hierarchy in the Pacific found support for a reciprocal relationship, suggesting that social as well as material factors can drive the emergence of political complexity. Another explanation is that, rather than an agricultural surplus, it was the taxation potential of different crops that was crucial to state formation7,8. Scott8 argues that cereal grains, which require fixed fields, grow above ground, ripen at a predictable time and are readily stored, provide the ideal crop for tax collection. By contrast, roots and tubers are not easily discoverable, have no fixed ripening time, can be left in the ground until needed and do not store well once harvested. Wheat, barley, millet and more recently rice and maize would therefore have provided the key to state formation because of their taxable potential8. Scott8 also points to the fact that all the earliest states that emerged were based on grain: wheat and barley in Mesopotamia, Egypt and the Indus Valley and millet followed by rice in the Yellow River Valley, and, in the New World, maize became a new state crop. 
To test this view, Mayshar and colleagues9 used cross-regional data to show that the cultivation of grain is correlated with hierarchy but not with land productivity, suggesting that it was the taxability of grain rather than its potential for producing a surplus that enabled local hierarchies to exploit this feature of grain to their own advantage.
In support of the taxation view, Scott8 also reiterates an earlier argument10,11 that it was only with the adoption of writing as the means to record taxes due and those paid that an emerging state would have been at all viable. The role of record-keeping in state formation and maintenance has previously been tested by Basu and colleagues12 using a worldwide data sample, which showed a positive and nonlinear association between community size and record keeping. More recently, Stasavage13 has used the same dataset to suggest that the availability of writing was positively correlated with the emergence of states.
The quantitative studies9,12,13 testing the link between crop type and appropriability, writing and state formation have all made extensive use of data from the Standard Cross-Cultural Sample (SCCS)14. The SCCS is a subset of 186 cultures from the 1,291 cultures in the Ethnographic Atlas (EA)15, which includes 1,781 separate variables. The SCCS was originally developed to try to deal with the problem of the non-independence of data through the shared histories of cultures in cross-cultural studies. It was hoped that the problems of proximity and common descent of cultures could be addressed by retaining a single culture for each region. However, the SCCS still shows high levels of both spatial and cultural autocorrelation, which seriously limits the interpretation of statistical analyses of global cross-cultural studies using these data16. To identify independent instances of correlated change in traits, a cultural phylogeny is required17. This has the added advantage of not only being able to test reliably for correlated evolution between traits across the phylogeny but also enabling the inference of relative timing for the appearance of those traits. If the best-fitting model is one in which the gain or loss of one trait is contingent on the presence or absence of the other, the direction of causation can be inferred17. This approach has been successfully applied in a number of studies to model the co-evolution of cultural traits along the branches of language family phylogenies18,19,20,21. The SCCS is a particularly rich data source for cultures worldwide, which means that the recent development of global language supertrees22,23 has now unlocked those data for phylogenetic cross-cultural analyses. A phylogenetic approach using a global supertree and drawing on SCCS data has already been used to explore marriage patterns24, food sharing25 and genital mutilation26 worldwide.
Here, we use phylogenetic analysis to test for correlated evolution between pairs of traits linked to the origin of the state and to investigate the relative timing of the evolution of those traits, from which the likely direction of causality can be inferred17. We do this using a new posterior treeset of world languages22, representing the genealogical relationships between cultures, derived using Bayesian phylogenetic inference techniques, matched to 868 cultures in the EA and to the 186 cultures in the SCCS.
One approach is to consider cultural complexity in the form of a set of classificatory criteria27, spanning technological specialization and social stratification, population density and the use of money (for example, refs. 6,12,28). Another approach is to focus on a single classification, such as political complexity (for example, ref. 29), defined as the number of distinct jurisdictional levels beyond the local community27 from autonomous bands to petty and larger chiefdoms to states. Here, we take this second approach, testing hypotheses concerning the evolution of political complexity and particularly the move from autonomous bands and chiefdoms to fully functioning states. We use the EA/SCCS variable Jurisdictional Hierarchy Beyond Local Community (EA033, SCCS237) as the measure of political complexity (following refs. 9,13,29), where a non-state is defined as having zero to two jurisdictional levels and a state is defined as having at least three levels. We test a number of hypotheses for the emergence of states worldwide using a Bayesian phylogenetic approach30:
(1) The intensification of agriculture, which produces a surplus, is the crucial factor enabling the emergence of states4,5.
(2) Cereal grain agriculture and its potential for taxation is key to state formation8,9,13.
(3) Writing is adopted to manage taxation and is key to the emergence and maintenance of the state8,10,11,12,13.
Results
We matched 868 societies from the EA to the posterior treeset of the Global Supertree developed by Bouckaert and colleagues22 (Fig. 1). Using EA data on jurisdictional hierarchy beyond the local community (EA033) (Fig. 2), we tested several hypotheses for the emergence and maintenance of states. In tests that included measures of agriculture and jurisdictional hierarchy only, we were able to use the full EA data to maximize the reliability of our estimates. However, data on taxation and writing were only available in the SCCS dataset (a subset of EA data) (Supplementary Fig. 1), with taxation data only available for 83 societies. All traits showed a phylogenetic signal (PhyloD)31 significantly stronger than expected under a random distribution of data on the tree (Methods; Supplementary Table 2), violating the assumption of independence among societies and supporting the use of phylogenetic methods to account for their shared histories. At the same time, all traits, except writing, showed a significant difference from Brownian motion, suggesting that other unmodelled factors may play a role.
Fig. 1: A maximum clade credibility tree of the global treeset22 with larger language families identified, matched to the data for state, intensive agriculture and grain from the EA. The maximum clade credibility tree is just one tree from the posterior distribution of 1,000 trees. All the analyses we report were performed on the full posterior distribution of trees, integrating over the considerable uncertainty in ancestral relationships between the world’s languages. The figure was produced using Treeviewer44.
Fig. 2: A plot of EA data on state and non-state societies. The base map for this figure was obtained from ESRI’s World Terrain Reference (https://goto.arcgisonline.com/maps/Reference/World_Reference_Overlay) and generated using ArcGIS Pro 3.5.3 (ref. 45).
Correlated evolution
To test for correlated evolution between pairs of binary traits under an assumption of phylogenetic inheritance, we used the stepping-stone sampler method32 within the BayesTraits30 analyses to compare the log marginal likelihoods of the dependent model (where traits evolve together) and the independent model (where traits are restricted to evolve separately) for each pair of traits (Supplementary Table 3). Log Bayes factor (BF) values were calculated for each pair of traits, with values <2 indicating weak evidence, 2–5 positive evidence, 5–10 strong evidence and >10 very strong evidence33. Grain and taxation show positive evidence of correlated evolution (BF of 3.38), non-grain agriculture and the state show strong evidence (BF of 7.51), and the rest of the trait pairs show very strong evidence of correlated evolution.
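The model comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the convention documented for BayesTraits, in which the log Bayes factor is twice the difference between the log marginal likelihoods of the dependent and independent models. The function names and the example likelihood values are hypothetical and are not taken from Supplementary Table 3.

```python
def log_bayes_factor(lml_dependent: float, lml_independent: float) -> float:
    """Assumed convention: log BF = 2 * (log marginal likelihood of the
    dependent model - log marginal likelihood of the independent model)."""
    return 2.0 * (lml_dependent - lml_independent)


def evidence_category(log_bf: float) -> str:
    """Map a log Bayes factor onto the verbal scale used in the text."""
    if log_bf > 10:
        return "very strong"
    if log_bf >= 5:
        return "strong"
    if log_bf > 2:
        return "positive"
    return "weak"


# Hypothetical log marginal likelihoods from two stepping-stone runs.
example_bf = log_bayes_factor(lml_dependent=-400.0, lml_independent=-411.3)
print("example log BF:", round(example_bf, 2), evidence_category(example_bf))

# Log BF values reported in the text, classified on the same scale.
reported = {
    "grain / taxation": 3.38,
    "non-grain agriculture / state": 7.51,
    "intensive agriculture / state": 53.56,
}
for pair, bf in reported.items():
    print(pair, bf, evidence_category(bf))
```

The classification reproduces the labels given in the text for the three reported values; the likelihoods themselves come from the stepping-stone sampler and cannot be reconstructed from the Bayes factors alone.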
Intensive agriculture
Intensive agriculture was present in 241 (28%) societies we sampled. In total, 66 states practiced intensive agriculture (72%), whereas 26 states did not (28%). A total of 163 non-states had intensive agriculture (22%), and 587 did not (78%). Accounting for non-independence due to shared ancestry on the global language phylogeny, we find very strong evidence for correlated evolution between intensive agriculture and the emergence of states (BF of 53.56) (Fig. 4 and Supplementary Table 3). The rate matrix (Fig. 4) shows the eight transitions between the two states of the two traits. Each transition rate (for example, from trait state 0,0 to trait state 0,1, q12) has the mean rate across the model iterations and the percentage of iterations with a zero rate (for example, where no transition took place in half of the iterations, Z = 50%). The rate matrix in Fig. 4 indicates that the presence of intensive agriculture makes the emergence of states somewhat more likely (rate of transition to statehood in the presence of intensive agriculture (q34 = 0.06) is twice the rate of transition to statehood in the absence of intensive agriculture (q12 = 0.03)). However, we find stronger evidence that the presence of a state makes the transition to intensive agriculture more likely (q24 (0.31) is six times the rate of q13 (0.05)). We also find strong evidence that the presence of statehood makes the loss of intensive agriculture less likely (q31 (0.27) is seven times the rate of q42 (0.04)) but no evidence that intensive agriculture sustains statehood (q21 (0.31) is equal to q43 (0.31)). This result provides limited support for hypothesis 1 that the surplus provided by agricultural intensification was important for the emergence of states. We find stronger evidence that the presence of the state encouraged intensive agriculture and that once a state gained intensive agriculture, it was much less likely to lose it than a non-state.
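The rate comparisons in this paragraph can be made explicit with a short sketch. It assumes the state numbering implied by the text (state 1 = 0,0; state 2 = 0,1; state 3 = 1,0; state 4 = 1,1, with intensive agriculture as the first trait and statehood as the second) and uses only the mean rates quoted above; the helper names `describe` and `ratio` are hypothetical.

```python
# Assumed state numbering: (intensive agriculture, state), 0 = absent, 1 = present.
STATES = {1: (0, 0), 2: (0, 1), 3: (1, 0), 4: (1, 1)}
TRAITS = ("intensive agriculture", "state")

# Mean transition rates quoted in the text for Fig. 4.
rates = {
    "q12": 0.03, "q34": 0.06,  # gain of statehood without / with intensive agriculture
    "q13": 0.05, "q24": 0.31,  # gain of intensive agriculture in non-states / states
    "q31": 0.27, "q42": 0.04,  # loss of intensive agriculture in non-states / states
    "q21": 0.31, "q43": 0.31,  # loss of statehood without / with intensive agriculture
}


def describe(rate_name: str) -> str:
    """Spell out which trait changes, and against which background, for e.g. 'q24'."""
    start = STATES[int(rate_name[1])]
    end = STATES[int(rate_name[2])]
    changed = 0 if start[0] != end[0] else 1   # index of the trait that changes
    background = 1 - changed                   # index of the trait held fixed
    verb = "gain" if end[changed] == 1 else "loss"
    context = "present" if start[background] == 1 else "absent"
    return f"{verb} of {TRAITS[changed]} with {TRAITS[background]} {context}"


def ratio(numerator: str, denominator: str) -> float:
    """Ratio of two mean transition rates."""
    return rates[numerator] / rates[denominator]


for num, den in [("q34", "q12"), ("q24", "q13"), ("q31", "q42"), ("q43", "q21")]:
    print(f"{num} ({describe(num)}) / {den} ({describe(den)}): {ratio(num, den):.2f}")
```

The ratios are computed from rounded mean rates, so they match the "twice", "six times" and "seven times" in the text only approximately. They also ignore the proportion of iterations in which a rate is zero (Z), which the full analysis takes into account.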
Grain
Grain was the main crop in 484 societies (56%), but not the main crop in 379 societies (44%) (Fig. 3). A total of 76 states depended on grain (84%), whereas 14 states were dependent on other crops (16%). In total, 386 non-states depended on grain (52%), whereas 362 non-states did not (48%). The cultivation of grains shows very strong evidence of correlated evolution with the emergence of states (BF of 22.59) (Supplementary Table 3). The rate matrix in Fig. 5 indicates that the presence of grain agriculture makes the emergence of states possible (the rate of transition to statehood in the presence of grains (q34) is low but positive (0.07), whereas the rate of transition to statehood in the absence of grains (q12) is zero). In addition, we find evidence that the presence of a state may make the transition to grain agriculture slightly more likely (q24 (0.12) is greater than q13 (0.07)). We find no evidence that the presence of statehood makes the loss of grain cultivation less likely (q42 (0.07) is equal to q31 (0.07)) and no evidence that grain agriculture sustains statehood (q21 (0.39) is equal to q43 (0.39)). This result supports hypothesis 2, which posits that grain agriculture was important for the emergence of states.
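As a check on the descriptive counts, the within-group percentages quoted above can be recomputed from the raw numbers. The sketch below is illustrative only (the function name is hypothetical); these raw proportions treat societies as independent data points, which is precisely the assumption the phylogenetic analysis is designed to relax.

```python
def row_percentages(with_grain: int, without_grain: int) -> tuple:
    """Percentage of societies in a group with and without grain as the main crop."""
    total = with_grain + without_grain
    return round(100 * with_grain / total), round(100 * without_grain / total)


# Counts quoted in the text: (grain is main crop, grain is not main crop).
counts = {
    "states": (76, 14),
    "non-states": (386, 362),
}

for group, (grain, no_grain) in counts.items():
    pct_grain, pct_no_grain = row_percentages(grain, no_grain)
    print(f"{group}: {pct_grain}% grain, {pct_no_grain}% non-grain")
```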
Fig. 3: A plot of EA data on societies with grain present and absent. The base map for this figure was obtained from ESRI’s World Terrain Reference (https://goto.arcgisonline.com/maps/Reference/World_Reference_Overlay) and generated using ArcGIS Pro 3.5.3 (ref. 45).
Scott8 argued that there were only grain states, such that no states emerged in societies based on other forms of agriculture. Although our results generally support this pattern, a minority of cases in the EA dataset did not. In 14 of the societies classified as states (16%), grain was not the major crop: 10 had roots or tubers as the main crop, 3 had tree fruits and 1 had vegetables. The majority of these 14 societies were small states (10), and most (9) were in the Atlantic–Congo language family, all based in tropical Africa.
To investigate this further, we pruned the global phylogeny22 to include only the 241 societies from the Atlantic–Congo language family. We used this new posterior sample of trees to test for correlated evolution between intensive agriculture and the emergence of states, between grain agriculture and the emergence of states and between non-grain agriculture and the emergence of states. In all three cases, there was no evidence of correlated evolution (Supplementary Table 4 and Supplementary Fig. 6), consistent with a scenario in which these agricultural traits did not influence the emergence of states in societies in the Atlantic–Congo language family.
Non-grain
To check whether other types of agriculture show a similar association with the emergence of states, we tested for correlated evolution between non-grain agriculture (vegetables, tree fruits, roots and tubers) and jurisdictional hierarchy. There was strong evidence of correlated evolution between non-grain agriculture and the emergence of states (BF of 7.51) (Supplementary Table 3 and Supplementary Fig. 7). However, there was no evidence that the emergence of states was more likely in the presence of non-grain agriculture (q12 (0.04) is equal to q34 (0.04)) or that non-grain agriculture was more likely to be gained in states than in non-states (q13 (0.04) is equal to q24 (0.04)). By contrast, non-grain agriculture was much more likely to be lost in states than in non-states (q42 (0.37) is nine times the rate of q31 (0.04)). This result suggests that states may have encouraged the growing of grain, while discouraging other forms of agriculture (Supplementary Fig. 7), supporting hypothesis 2 that grain was ideal for taxation.
Taxation and writing
Next, we tested whether there was correlated evolution between grain agriculture and taxation. Across the SCCS societies, tax was levied in 30 societies (36%) and not levied in 53 societies (64%). In societies where grain was the main crop, tax was levied in 19 (50%) and not levied in 19 (50%). In societies that were not dependent on grain, 10 societies levied taxes (23%), whereas 34 societies did not (77%). The cultivation of grain and taxation show positive evidence for correlated evolution (BF of 3.38) (Supplementary Table 3 and Supplementary Fig. 8). The rate of transition to taxation in the absence of grain agriculture (q12) is zero, whereas the rate of transition to taxation in the presence of grain agriculture (q34) is high (0.09), suggesting that grain agriculture consistently predicts taxation (Supplementary Fig. 8). There is no evidence that the presence of taxation makes the gain or loss of grain agriculture more or less likely (q24 (0.09) and q13 (0.09) are equal, as are q42 (0.09) and q31 (0.09)). This result provides further support for hypothesis 2, showing evidence that taxation is more likely to appear in those societies that rely on cereal grain as their main crop. However, the low BF for correlated evolution and the low coverage of data on taxation in the SCCS (83 societies) mean that this result may be less robust than previous results.
Writing is present in 38 societies (22%) across the SCCS and absent in 138 (78%). In societies that levied tax, 12 had writing (40%) and 18 did not (60%). In societies without tax, 5 had writing (9%), whereas 48 did not (91%). Our results show very strong correlated evolution between the raising of taxes and the adoption of writing (BF of 19.90) (Supplementary Table 3). The rate of transition to writing in the absence of taxation is zero (q12), whereas the rate of transition to writing in the presence of taxation (q34) is very high (0.06), suggesting that taxation consistently predicts the adoption of writing (Supplementary Fig. 9). This result supports hypothesis 3, suggesting that writing is adopted in societies that raise taxes, most likely to record those taxes. However, because of the relatively low coverage of data on taxation, we are less certain of this result.
From SCCS data, states are present in 27 societies (16%) and absent in 147 (84%). Among societies with writing, 21 were states (57%) and 16 were non-states (43%). Among societies without writing, 6 were states (4%) and 131 were non-states (96%). Support for a model of correlated evolution between the adoption of writing and the emergence of states is very strong (BF of 48.40) (Supplementary Table 3). The rate of transition to statehood in the absence of writing (q12) is low but non-zero (0.02), whereas the rate of transition to statehood in the presence of writing (q34) is one of the highest rates we observe (0.17). The rate of transition for the adoption of writing (q24) is high in states (0.15), whereas in non-states (q13) it is low (0.01). Furthermore, a zero rate for the loss of writing in the presence of states (q42) suggests that states keep writing once it is adopted (Supplementary Fig. 10). This result supports hypothesis 3, suggesting that writing and state emergence are highly correlated. We find evidence that the presence of writing encourages the emergence of states and that states encourage the emergence of writing. Furthermore, once states are established, they are unlikely to lose writing, suggesting that it is important for state maintenance.
Additional robustness checks
The above findings are based on an assumption of binary trait evolution along the branches of a language phylogeny. One concern with this approach is that more recent unmodelled borrowing between lineages could bias our rate estimates by causing the origins of several traits to be reconstructed too deep in the global tree. For example, grain, statehood and writing are widespread across many of the Indo-European ethnolinguistic groups in our sample (Fig. 1 and Supplementary Fig. 1), but this probably reflects a mix of vertical inheritance and more recent borrowing, rather than simple inheritance of these traits from a Proto-Indo-European ancestor. Defenders of phylogenetic comparative methods note that such borrowing events can themselves provide insight into the sequence of trait evolution within societies17 and that the methods are robust to realistic levels of borrowing, consistently outperforming conventional regression approaches34. Nevertheless, the reliability of inferences in any given case is likely to depend on a combination of factors, including the rate at which traits are borrowed, their degree of coupling, rates of extinction and sample size.
To incorporate additional historical knowledge into our analyses and to evaluate the robustness of our findings to model misspecification due to more recent borrowing, we repeated each analysis constraining the value of traits at the root of the ten largest language families. When all traits were constrained to be absent at the root of each family, consistent with the assumption that they emerged more recently within each family, we found similar support for correlated evolution to our initial analyses (Supplementary Table 5), with only the correlation for taxation and writing reducing from very strong evidence (BF of 19.90) to strong evidence (BF of 9.15). There was no change in the pattern of transition rate results reported (Supplementary Table 6). Of all the variables we considered, only grain may have already emerged at the root of several of the major global language families35. We therefore also reran the two analyses that included grain as a variable, allowing grain to be absent or present at each language family root, while the other binary variable was constrained to be absent. In the grain/state analysis, there was again no change to the results reported with correlation (BF of 23.73) very similar to the unconstrained model (BF of 22.59). In the grain/tax analysis, the support for correlation between the traits showed only weak evidence (BF of 1.41), lower than the positive evidence in the unconstrained model (BF of 3.38) and the model constrained to both traits absent (BF of 2.79). In both cases, the pattern of transition rates remained the same (Supplementary Table 6).
Overall, these analyses give additional support to our main results, suggesting that, although several of the traits we investigate may reflect more recent borrowing, our findings are relatively robust to uncertainty deeper in the phylogeny.
Discussion
In this study, we have tested a number of hypotheses for the emergence of states across human history. This is part of a wider debate about the movement from small-scale to large-scale complex societies, but here, we focus on the move to centralized bureaucratic states that can have widely differing population sizes4. The hypotheses for the emergence of the state include: the intensification of agriculture, which provides a surplus4,5; the taxation potential of cereal grain8,9,13; and the adoption of writing to record tax and enable and maintain states8,10,11,12,13.
This study uses phylogenetic methods to test these hypotheses for state formation on a global scale. The recent development of global language phylogenies22,23 makes it possible to use large cultural datasets such as the EA15 and the SCCS14 while accounting for the shared histories of cultures in cross-cultural studies16. Supporting the need for such an approach and the use of language phylogenies as a plausible model of trait inheritance, all of the variables in our study show evidence of phylogenetic signal. However, variation among the traits in our study is not perfectly captured by the language phylogeny: the Purvis D scores differed significantly from those expected under a strict Brownian diffusion model of trait evolution along the branches of the tree. We therefore acknowledge that other, as yet unmodelled processes, such as environmental factors, are also at work and stress that our findings should be interpreted as contributing to the wider literature on the origins of statehood rather than offering the last word.
The first hypothesis tested argues that it was the intensification of agriculture, specifically the use of fertilization or crop rotation to reduce fallow periods and irrigation, which enabled a surplus to be produced that was used to form states4,5. Our results indicate that the use of intensive agriculture was indeed tightly coupled with the emergence of states worldwide. However, our findings suggest that although the presence of intensive agriculture makes the emergence of states slightly more likely, it is the inverse causal direction that is the stronger relationship, with the presence of states making the use of intensive agriculture much more likely. This result supports a previous study6 using phylogenetic methods to show that political complexity was more likely to have driven intensive agriculture than to be a result of it across Austronesian societies. We also found that, among societies that had adopted intensive agriculture, states were much less likely to lose the practice than non-states, suggesting that states may nevertheless have played an important role in the trend towards increasing agricultural intensification.
The emergence of states due to the taxable potential of cereal grains was the second hypothesis we tested. It is argued that grain is particularly good for taxation due to the use of fixed fields and because grain grows above ground, ripens at predictable times and can be stored for very long periods8. Our analyses indicate strong support for correlated evolution between the adoption of cereal grains as the main crop by societies and the emergence of states. Furthermore, our analyses suggest that states were very unlikely to emerge in societies without grain production, whereas states were very likely to emerge in societies with cereal grains as their main crop. This result supports those of a previous study9 using similar data, suggesting that it was the appropriability of grain that led to state formation. The advantage of the phylogenetic methods we deployed here is that we were able to both account for the non-independence of societies due to shared language ancestry and also suggest the direction of causation, contingent on evolution along the branches of a posterior distribution of plausible global language relationships17. For comparison, we also tested the hypothesis that it was non-grains, such as vegetables, tree fruit, roots and tubers, that resulted in the formation of states worldwide. The results again indicated strong correlated evolution between non-grain agriculture and state formation but suggest that once formed, states were much more likely to lose non-grain crops than non-states.
In our dataset, there were a few states without grain, of which most had roots or tubers as the main crop. Most of these small states were in tropical Africa. This finding emphasizes the potential for regional variation in state formation, perhaps linked to the role of unmodelled environmental contingencies. For example, environmental factors may have reduced the payoffs of grain for states in Sub-Saharan Africa. Indeed, when restricting our analysis to only the Atlantic–Congo language family of Sub-Saharan Africa, we found no evidence of correlated evolution between grain and the emergence of states, consistent with the suggestion that environmental factors play an important role9. Future work may be able to formally model these effects.
Nevertheless, our findings support a connection between the production of cereal grains and the emergence of states outside Sub-Saharan Africa. The proposed mechanism for this is that grain is ideal for taxation purposes8,9,13. Our results also support this argument, although our analysis of taxation data is less robust because it is based on a much smaller dataset. Grain production and taxation show positively correlated evolution worldwide. Furthermore, our findings suggest taxation was less likely to arise in societies without grain production and more likely in those with grain production.
The third hypothesis we tested was the role of writing in the emergence of states via its importance for recording taxation. Taxation requires a trustworthy method of recording taxes, and once states have emerged, writing has been argued to be essential for their maintenance8,10,11,12,13. Our results indicate that the adoption of writing is, indeed, strongly correlated with both taxation and the emergence of states. We found writing was very unlikely to be adopted in societies that do not raise taxes but very likely in societies that do. States did emerge in societies without writing but were much more likely to emerge in societies with writing. Furthermore, once states have emerged, we found they were very unlikely to lose writing. These results support previous studies12,13 using the same dataset. However, we are able to show that the relationship holds when accounting for the common linguistic ancestry of societies and that the hypothesized relative timing of trait change is also supported under our model. Again, these results are based on the smaller sample size of the SCCS, with taxation being a particularly small dataset.
Drawing strong inferences about human prehistory is an inherently challenging task, particularly when it comes to establishing causal relationships between complex phenomena such as modes of agricultural production and statehood. With this in mind, we want to emphasize that the findings we have presented here are contingent on the reliability of the available cross-cultural data, and on the assumptions of our model, most notably the extent to which binary trait evolution along the branches of a language phylogeny is an accurate description of the processes at work. The model we deploy is, like any model, a simplification and does not, for example, explicitly incorporate horizontal transmission or the role of environmental factors in constraining or canalizing social evolution. Nevertheless, there are good reasons to hold our findings credible. The consistent phylogenetic signal in the data we consider suggests a phylogenetic model is a reasonable approximation and an important extension on prior work that has not sought to model historical dependencies between societies. To the extent that our data reflect a more recent borrowing of traits, such events are themselves a legitimate source of change down cultural lineages17 and the methods we deploy have been shown to be robust to realistic levels of borrowing, outperforming the conventional regression techniques used in prior work34. Moreover, our findings incorporate considerable phylogenetic uncertainty across a posterior distribution of global language trees, suggesting that our inferences are not sensitive to a specific language tree topology, particularly for deeper relationships in the tree, where phylogenetic uncertainty is greatest. In addition, when we constrain ancestral states of major language families to incorporate historical knowledge consistent with a more recent spread of traits, our inferences are also unaffected.
Together, then, the overall pattern of results we present shows a clear concordance with and support for a newly emerging picture of the origin of states across the world: namely, that it was not the surplus production of agricultural intensification but the taxable nature of cereal grains that led to both the emergence of states and the adoption of writing.
Methods
The variables used were as follows:
(1) Jurisdictional hierarchy beyond the local community (EA033 and SCCS237): (1) autonomous bands (no levels), (2) petty chiefdoms (one level), (3) larger chiefdoms (two levels), (4) small states (three levels) and (5) large states (four levels). Here, states include both small and large states, with at least three jurisdictional levels above the local community.
(2) Agriculture: intensity (EA028), with intensive agriculture defined as the use of fertilization, crop rotation or other techniques to shorten or eliminate the fallow period, or the use of irrigation.
(3) Agriculture: major crop type (EA029 and SCCS233), cereal grain. Non-grain was defined as vegetables, tree fruit, roots and tubers.
(4) Taxation paid to community (SCCS784), regular taxes36.
(5) Writing and records (SCCS149), true writing, with or without records27.
The variables were made binary to test the three hypotheses directly (see Supplementary Table 1 for further detail on variables used and the criteria used for binary coding).
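As an illustration of the binary coding, the following minimal Python sketch recodes the five-level jurisdictional hierarchy variable into a state/non-state trait, using the definition given above (codes 4 and 5 count as states). The function name, the culture labels and the handling of missing values are hypothetical and are not taken from the original analysis; the full coding criteria are those in Supplementary Table 1.

```python
from typing import Dict, Optional

# Codes 4 (small states) and 5 (large states) are treated as states,
# following the variable definition in the text.
STATE_MIN_CODE = 4


def is_state(jurisdictional_code: Optional[int]) -> Optional[int]:
    """Recode the 1-5 jurisdictional hierarchy code as 1 (state) or 0 (non-state).

    Missing values (None) are passed through unchanged; this handling is an
    assumption made for the sketch.
    """
    if jurisdictional_code is None:
        return None
    if jurisdictional_code not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected code: {jurisdictional_code}")
    return int(jurisdictional_code >= STATE_MIN_CODE)


# Hypothetical cultures and codes, for illustration only.
codes: Dict[str, Optional[int]] = {
    "Culture A": 1,     # autonomous bands
    "Culture B": 3,     # larger chiefdoms
    "Culture C": 4,     # small states
    "Culture D": 5,     # large states
    "Culture E": None,  # not coded
}

binary = {name: is_state(code) for name, code in codes.items()}
print(binary)
```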
Phylogeny
The relationships between the world’s languages, particularly deeper macro-relationships between established families, and the timing of diversification events, are highly contentious. A common response to this has been to avoid global phylogenetic analyses altogether or to use crude proxies for ancestry, such as language family membership, to attempt to control for non-independence and model change. Here, we take a different approach, using a newly available Bayesian posterior treeset representing a global supertree of the world’s languages developed by Bouckaert and colleagues22. The treeset has been used in a number of studies37,38,39, and the posterior distribution and code used to generate it are available40. The supertree was pruned to the cultures included in the EA and the SCCS using the phytools package in R41. This posterior treeset of 1,000 trees is derived from prior information on what is known about the sequence and timing of the breakup of the world’s languages, including specifying the considerable uncertainty in branch lengths and tree topology. This provides a principled statistical framework for modelling cultural evolution on the global tree, while integrating out phylogenetic uncertainty from our inferences.
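The pruning step can be illustrated with a small, self-contained Python sketch. This is not the phytools code used for the analysis; it is a toy implementation on a hypothetical four-tip tree, showing the general idea of dropping tips that have no trait data and merging the branches left behind so that root-to-tip distances are preserved. The `Node` class, the `prune` function and the tip names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: Optional[str] = None  # tip label; None for internal nodes
    length: float = 0.0         # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, keep: Set[str]) -> Optional[Node]:
    """Return a copy of the tree restricted to the tips named in `keep`."""
    if not node.children:  # a tip: keep it only if it is in the sample
        return Node(node.name, node.length) if node.name in keep else None
    kept = [c for c in (prune(ch, keep) for ch in node.children) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # Only one descendant lineage survives: merge the two branches so
        # that distances to the remaining tips are unchanged.
        only = kept[0]
        return Node(only.name, only.length + node.length, only.children)
    return Node(node.name, node.length, kept)


def tip_names(node: Node) -> List[str]:
    """List tip labels in left-to-right order."""
    if not node.children:
        return [node.name]
    return [t for c in node.children for t in tip_names(c)]


# Hypothetical tree ((A:1,B:1):2,(C:1.5,D:1.5):1.5); only A, C and D have data.
tree = Node(children=[
    Node(length=2.0, children=[Node("A", 1.0), Node("B", 1.0)]),
    Node(length=1.5, children=[Node("C", 1.5), Node("D", 1.5)]),
])
pruned = prune(tree, {"A", "C", "D"})
print(tip_names(pruned))          # tips that remain after pruning
print(pruned.children[0].length)  # branch to A after merging: 2.0 + 1.0
```

In an analysis over a posterior treeset, the same operation would be applied to every tree in the sample, so that each pruned tree contains the same set of cultures.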
Analyses
The correlated evolution analyses were conducted using the Discrete model in BayesTraits V3.0.5 (ref. 42), using a reversible-jump hyperprior approach with an exponential hyperprior (0–1.0), allowing the priors to be estimated from the data30. Models were run for 110,000,000 iterations with the first 10,000,000 iterations discarded as burn-in. Support for correlated evolution was assessed using a stepping-stone sampler32 to estimate the marginal likelihood of each model, from which the log BF for the dependent model over the independent model was calculated. Log BF values were interpreted such that <2 shows weak evidence, >2 positive evidence, 5–10 strong evidence and >10 very strong evidence33.
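The model comparison can be sketched in a few lines of Python. The sketch assumes the convention of taking the log BF as twice the difference between the log marginal likelihoods of the dependent and independent models, and it applies the evidence categories quoted above. The marginal likelihood values are hypothetical, and the treatment of the band boundaries (for example, a value of exactly 2 or exactly 5) is an assumption, because the quoted ranges do not specify it.

```python
def log_bayes_factor(log_ml_dependent: float, log_ml_independent: float) -> float:
    """Log BF as twice the difference in log marginal likelihoods."""
    return 2.0 * (log_ml_dependent - log_ml_independent)


def interpret(log_bf: float) -> str:
    """Map a log BF onto the evidence categories quoted in the text.

    Boundary handling is an assumption of this sketch.
    """
    if log_bf < 2:
        return "weak"
    if log_bf < 5:
        return "positive"
    if log_bf <= 10:
        return "strong"
    return "very strong"


# Hypothetical stepping-stone estimates of the log marginal likelihoods.
log_ml_dependent = -250.0
log_ml_independent = -256.5

log_bf = log_bayes_factor(log_ml_dependent, log_ml_independent)
print(log_bf, interpret(log_bf))
```

With these illustrative values the dependent model is favoured by a log BF of 13, which falls in the very strong category.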
Transition rate matrices
The model results also show the level of support for the transition rates between the two states of the two variables in terms of the mean rate and the percentage of model iterations that show a transition rate of zero30. The rate matrix figures show the eight possible transitions, each of which changes the state of one of the two traits. Each transition rate (for example, q12) is reported with its mean across the model iterations and the percentage of iterations in which the rate was zero (for example, Z = 50%).
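The summary described above can be made concrete with a short Python sketch. It first enumerates the eight transitions in which exactly one of two binary traits changes, and then computes the mean rate and the percentage of zero-rate iterations for one parameter. The numbering of the four combined states (1 = 0,0; 2 = 0,1; 3 = 1,0; 4 = 1,1) is assumed here, and the sampled rates are hypothetical rather than taken from the analysis output.

```python
from itertools import product
from typing import List, Sequence, Tuple

# The four combined states of two binary traits, numbered 1-4 in this order
# (an assumed numbering): (0,0), (0,1), (1,0), (1,1).
STATES = list(product((0, 1), repeat=2))


def single_trait_transitions() -> List[str]:
    """Label every transition in which exactly one trait changes state."""
    labels = []
    for i, a in enumerate(STATES, start=1):
        for j, b in enumerate(STATES, start=1):
            if sum(x != y for x, y in zip(a, b)) == 1:
                labels.append(f"q{i}{j}")
    return labels


def summarise(rates: Sequence[float]) -> Tuple[float, float]:
    """Return (mean rate, percentage of iterations with a rate of zero)."""
    mean_rate = sum(rates) / len(rates)
    zero_pct = 100.0 * sum(1 for r in rates if r == 0.0) / len(rates)
    return mean_rate, zero_pct


labels = single_trait_transitions()
print(labels)

# Hypothetical posterior sample of one rate (q12) across four iterations.
q12_samples = [0.0, 0.0, 0.4, 0.6]
mean_rate, zero_pct = summarise(q12_samples)
print(f"q12: mean = {mean_rate:.2f}, Z = {zero_pct:.0f}%")
```

A rate that is zero in a large share of iterations indicates that the model often does without that transition, which is why the percentage is reported alongside the mean.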
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
References
Morgan, L. H. in Ancient Society (ed White, L. A.) (Harvard Univ. Press, 1877).
Childe, V. G. Man Makes Himself 4th edn (Collins, 1966).
Childe, V. G. The urban revolution. Town Plan. Rev. 21, 3–17 (1950).
Johnson, A. W. & Earle, T. K. The Evolution of Human Societies: from Foraging Group to Agrarian State (Stanford Univ. Press, 2000).
Sanderson, S. K. Evolutionism and its critics. J. World Syst. Res. 3, 94–114 (1997).
Sheehan, O., Watts, J., Gray, R. D. & Atkinson, Q. D. Coevolution of landesque capital intensive agriculture and sociopolitical hierarchy. Proc. Natl Acad. Sci. USA 115, 3628–3633 (2018).
Scott, J. C. The Art of Not Being Governed: an Anarchist History of Upland Southeast Asia (Yale Univ. Press, 2009).
Scott, J. C. Against the Grain (Yale Univ. Press, 2017).
Mayshar, J., Moav, O. & Pascali, L. The origin of the state: land productivity or appropriability? J. Political Econ. 130, 1091–1144 (2022).
Goody, J. The Logic of Writing and the Organization of Society (Cambridge Univ. Press, 1986).
Maynard Smith, J. & Szathmary, E. The Major Transitions in Evolution (Oxford Univ. Press, 1997).
Basu, S., Kirk, M. & Waymire, G. Memory, transaction records, and The Wealth of Nations. Account. Organ. Soc. 34, 895–917 (2009).
Stasavage, D. in The Handbook of Historical Economics (eds Bisin, A. & Federico, G.) 881–902 (Academic Press, 2021).
Murdock, G. P. & White, D. R. Standard cross-cultural sample. Ethnology 8, 329–369 (1969).
Murdock, G. P. Ethnographic atlas—a summary. Ethnology 6, 109–236 (1967).
Dow, M. M. & Eff, E. A. Global, regional, and local network autocorrelation in the standard cross-cultural sample. Cross Cult. Res. 42, 148–171 (2008).
Mace, R. & Pagel, M. The comparative method in anthropology. Curr. Anthropol. 35, 549–564 (1994).
Opie, C., Shultz, S., Atkinson, Q. D., Currie, T. & Mace, R. Phylogenetic reconstruction of Bantu kinship challenges main sequence theory of human social evolution. Proc. Natl Acad. Sci. USA 111, 17414–17419 (2014).
Holden, C. & Mace, R. Spread of cattle led to the loss of matrilineal descent in Africa: a coevolutionary analysis. Proc. R. Soc. Lond. Ser. B 270, 2425–2433 (2003).
Pagel, M. D. & Meade, A. in The Evolution of Cultural Diversity: a Phylogenetic Approach (eds Mace, R., Holden, C. & Shennan, S.) 235–256 (Univ. College London Press, 2005).
Watts, J., Sheehan, O., Atkinson, Q. D., Bulbulia, J. & Gray, R. D. Ritual human sacrifice promoted and sustained the evolution of stratified societies. Nature 532, 228–231 (2016).
Bouckaert, R. et al. Global language diversification is linked to socio-ecology and threat status. Preprint at SocArXiv https://doi.org/10.31235/OSF.IO/F8TR6 (2022).
Duda, P. & Zrzavý, J. Human population history revealed by a supertree approach. Sci. Rep. 6, 29890 (2016).
Minocher, R., Duda, P. & Jaeggi, A. V. Explaining marriage patterns in a globally representative sample through socio-ecology and population history: a Bayesian phylogenetic analysis using a new supertree. Evol. Hum. Behav. 40, 176–187 (2018).
Ringen, E. J., Duda, P. & Jaeggi, A. V. The evolution of daily food sharing: a Bayesian phylogenetic analysis. Evol. Hum. Behav. 40, 375–384 (2019).
Šaffa, G., Zrzavý, J. & Duda, P. Global phylogenetic analysis reveals multiple origins and correlates of genital mutilation/cutting. Nat. Hum. Behav. 6, 635–645 (2022).
Murdock, G. P. & Provost, C. Measurement of cultural complexity. Ethnology 12, 379–392 (1973).
Richerson, P. J. & Boyd, R. Complex societies. Hum. Nat. 10, 253–289 (1999).
Currie, T. E., Greenhill, S. J., Gray, R. D., Hasegawa, T. & Mace, R. Rise and fall of political complexity in island South-East Asia and the Pacific. Nature 467, 801–804 (2010).
Pagel, M. D. & Meade, A. Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. Am. Nat. 167, 808–825 (2006).
Fritz, S. A. & Purvis, A. Selectivity in mammalian extinction risk and threat types: a new measure of phylogenetic signal strength in binary traits. Conserv. Biol. 24, 1042–1051 (2010).
Xie, W., Lewis, P. O., Fan, Y., Kuo, L. & Chen, M.-H. Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst. Biol. 60, 150–160 (2011).
Kass, R. E. & Raftery, A. E. Bayes factors. J. Am. Stat. Assoc. 90, 773–795 (1995).
Currie, T. E., Greenhill, S. J. & Mace, R. Is horizontal transmission really a problem for phylogenetic comparative methods? A simulation study using continuous cultural traits. Philos. Trans. R. Soc. B 365, 3903–3912 (2010).
Diamond, J. & Bellwood, P. Farmers and their languages: the first expansions. Science 300, 597–603 (2003).
Ross, M. H. Political decision making and conflict: additional cross-cultural codes and scales. Ethnology 22, 169–192 (1983).
Skirgård, H. et al. Grambank reveals the importance of genealogical constraints on linguistic diversity and highlights the impact of language loss. Sci. Adv. 9, eadg6175 (2023).
Passmore, S. et al. Global musical diversity is largely independent of linguistic and genetic histories. Nat. Commun. 15, 3964 (2024).
Her, O.-S. et al. Early humans out of Africa had only base-initial numerals. Humanit. Soc. Sci. Commun. 11, 254 (2024).
Global language trees. GitHub https://github.com/rbouckaert/global-language-tree-pipeline/releases/ (2022).
Revell, L. J. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3, 217–223 (2012).
Pagel, M. D. & Meade, A. BayesTraits Version 3.0.5. Reading Evolutionary Biology Group https://www.evolution.reading.ac.uk/BayesTraitsV3.0.5/BayesTraitsV3.0.5.html (2021).
Kirby, K. R. et al. D-PLACE: a global database of cultural, linguistic and environmental diversity. PLoS ONE 11, e0158391 (2016).
Bianchini, G. & Sánchez-Baracaldo, P. TreeViewer: flexible, modular software to visualise and manipulate phylogenetic trees. Ecol. Evol. 14, e10873 (2024).
ArcGIS Pro, version 3.5.3 (ESRI, 2025).
Acknowledgements
C.O. received funding for an Early Career Fellowship (ECF 619) from the Leverhulme Trust for part of the research. Q.D.A. was partly supported by two Royal Society of New Zealand Marsden grants (grant nos. MFP-20-UOA-123 and MFP-24-UOA-126) and a grant from the Templeton Religion Trust (grant no. TRT-2022-30666). We thank T. Currie and S. Montgomery for useful advice on analysis and interpretation and M. Gillings for help producing the maps.
Author information
Contributions
C.O. and Q.D.A. designed and performed the research. C.O. collected data. C.O. analysed data. C.O. and Q.D.A. wrote the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Human Behaviour thanks Pavel Duda, David Stasavage and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Tables 1–6, Figs. 1–10 and Discussion.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Opie, C., Atkinson, Q.D. State formation across cultures and the role of grain, intensive agriculture, taxation and writing. Nat Hum Behav 10, 156–163 (2026). https://doi.org/10.1038/s41562-025-02365-5
DOI: https://doi.org/10.1038/s41562-025-02365-5







