Introduction

Criminology literature consistently highlights substantial disparities in offending rates between males and females, with males being responsible for the majority of crimes1,2. This pattern extends to corruption-related offenses, where experimental research suggests that females typically demonstrate lower tolerance for dishonesty and exhibit more pro-social behavior3,4,5. Controlled experiments further indicate that, on average, females offer smaller bribes than males, and interactions exclusively between females tend to exhibit higher levels of honesty compared to those involving at least one male5,6. Additionally, studies report positive correlations between female participation in government and reduced perceptions of corruption in democratic countries4,7,8,9, suggesting that increased female representation in public offices and civil service also serves as a potential anti-corruption measure3,10,11. Research also indicates reductions in procurement-related corruption in French municipalities led by female mayors, although reelected female mayors exhibit no significant difference in corruption levels compared to their male counterparts12. This finding suggests that, over time, females in leadership positions may integrate into existing political networks or establish new ones that are not necessarily less corrupt12,13.

Despite numerous recent studies on the dynamics and inner workings of political corruption networks14,15,16,17, organized crime18,19, mafia groups20,21, and other offenses committed through associations of individuals22,23,24,25, the role of gender in such networked crimes, particularly political corruption, remains relatively underexplored. A notable exception is the work by Smith26, which examined gender disparities in Chicago’s organized crime before and after the alcohol prohibition of 1920, revealing that the prohibition expanded opportunities for males while largely excluding females. This emerging research field holds significant potential for advancing academic understanding and equipping security forces with better tools for decision-making, particularly when enhanced by collaboration between law enforcement and academia. For instance, an empirical investigation into criminal networks assessed the impact of law enforcement interventions on the structure and functioning of a dark web pedophile network22, showing that the resulting dismantling resembled random node removal and was less effective than more optimized removal strategies. Other studies27,28 showed that leveraging criminal network structures in combination with machine learning approaches yields impressive accuracy in recovering missing criminal partnerships, distinguishing types of criminal associations, predicting money flow among criminals, and even anticipating future partnerships and re-offending behavior. Recent studies analyzing security intelligence data29,30 proposed a novel protocol for network disruption through key-agent identification, showing its effectiveness in enabling law enforcement to identify and assess criminal activity. Another recent study on a dark web pedophile ring31 used network tools to analyze user behavior, revealing crucial insights into temporal engagement patterns, content preferences, and user clusters.

Research on networked crime has thus demonstrated that the structure of criminal networks is highly informative about various nuances of criminal activity32. Therefore, it is critical to analyze possible disparities in the roles occupied by males and females within these networks. This issue parallels other gender-based analyses in fields such as education33,34, scientific careers35,36, representation in literary works37, membership in artistic and scientific academies38,39,40, and online biographies41, to name just a few, and may offer valuable perspectives on gender differences in networked crime. Here, we analyze data from political corruption scandals in Brazil14 and Spain42, spanning from the 1990s to the 2010s, to construct corruption networks in which agents are connected when they collaborate in the same scandal. Gender labels are assigned to each network node based on the identification of individual names, allowing us to examine the structural roles of males and females within these networks. Consistent with the criminology literature, we find that males constitute the vast majority of corrupt agents, with females representing only 10% of nodes in the Brazilian network and 20% of nodes in the Spanish network. These fractions are at least partly explained by the very low participation of females in politics, high-ranking positions in the private sector, and other leadership roles38,43,44,45.

Based on this disparity, we adapt a model capable of reproducing several properties of these corruption networks42 to simulate networks where gender is randomly assigned to nodes while preserving the observed male-to-female ratios. These in silico networks serve as null models for quantifying the significance of differences in network properties and determining whether these differences can be attributed solely to the disparity in gender representation. We examine the role of gender in centrality measures, network resilience, reoffending rates, and collaboration patterns, finding either no significant gender differences or differences that are at least partially explained by our null network model. The only exceptions are an overrepresentation of males among highly central individuals and a higher degree of gender homophily in collaboration patterns observed in the Spanish network, which cannot be accounted for by the null model. Our findings thus indicate a markedly low prevalence of females in political corruption cases; however, once this disparity is taken into account, the structural roles of these networked criminals appear to be largely invariant with respect to gender. Interestingly, these findings align with existing research on gender differences in scientific careers, where female and male faculty members exhibit similar numbers of co-authors35, comparable annual productivity36, and equivalent career-wise impact36, once total publication numbers and career lengths are considered.

In what follows, we present these results in detail, beginning with a description of our data and the construction of the corruption networks, with an emphasis on the disparity in female participation. We then introduce our null network model, followed by an analysis of gender differences in centrality measures and network dismantling. Next, we explore gender differences in collaboration patterns within these networks. Finally, we conclude by contextualizing and summarizing our findings.

Results

Data

We begin by presenting the datasets used in our investigation, which encompass well-documented political corruption scandals in Brazil14 and Spain42, spanning from the late 1980s to the late 2010s. These datasets were compiled from widely circulated magazines and newspapers in both countries and subsequently organized by the authors of Ref.14 for the Brazilian cases, and by a non-profit organization for the Spanish cases46. The data comprise the names of all individuals involved and the year of occurrence for 65 scandals in Brazil and 437 scandals in Spain. Using the list of involved individuals, we manually assign gender to each person based on their names, complemented by web searches to confirm the information. We then construct a complex network representation of these scandals where nodes represent individuals and edges connect pairs who partnered in at least one scandal. In both networks, nodes represent politicians, public servants, or private sector actors engaged in illegal practices that compromise the public good in return for monetary reward or the accrual of greater influence, access, or power. For the Spanish network, we further exclude isolated scandals involving a single individual. This process resulted in two corruption networks comprising 404 individuals in Brazil and 2695 in Spain.
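The network construction described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `build_network` and the toy scandal records are hypothetical and are not taken from the original datasets.

```python
from itertools import combinations


def build_network(scandals):
    """Return an adjacency dict linking agents who share at least one scandal.

    `scandals` maps a scandal identifier to the list of people involved.
    """
    adjacency = {}
    for members in scandals.values():
        people = set(members)
        # Every involved person becomes a node, even if they have no partner.
        for person in people:
            adjacency.setdefault(person, set())
        # Each scandal contributes a fully connected subgraph (a clique).
        for a, b in combinations(sorted(people), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Hypothetical toy records, for illustration only.
scandals = {
    "s1": ["ana", "bruno", "carlos"],
    "s2": ["carlos", "diego"],
    "s3": ["elena"],
}
network = build_network(scandals)
```

In this toy example, "carlos" takes part in two scandals and therefore links the agents of both, whereas "elena" remains isolated, which is the kind of single-individual scandal excluded from the Spanish network.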

Gender prevalence

The Brazilian and Spanish networks are depicted in Fig. 1A and B, with nodes color-coded by gender (purple for males and green for females). These visualizations clearly show that female agents represent a small fraction of the nodes in both networks. Specifically, females account for \(10\%\) and \(20\%\) of the individuals in the Brazilian and Spanish networks, respectively. These proportions remain similar when considering only the largest connected component of the networks, where females constitute \(9.4\%\) and \(20.7\%\) of the Brazilian and Spanish networks, respectively. Moreover, Fig. 1C and D show that the average prevalence of gender does not change across the different levels of connectivity obtained from the k-core decomposition47, which progressively removes nodes with fewer than k connections, creating a hierarchical structure of nodes, from the 1-core (the entire set of nodes) to higher-order cores such as the 2-core (nodes with at least two connections), and so on. We also examine the evolution of female participation as the networks expand over time due to the addition of new corruption cases. Figure 1E and F show that, following a brief initial period with no females in the Brazilian network and a slightly higher fraction in the Spanish network, the proportion of females stabilizes and remains steady over time in both networks. Additionally, we analyze whether these proportions vary with the number of individuals involved in political scandals by estimating the average number of males and females as a function of scandal size. The insets of Fig. 1E and F show that the average numbers of male and female individuals scale proportionally with scandal size, with the proportionality rates corresponding to the overall gender incidence in the networks. These findings confirm that females are underrepresented in corruption networks and that this underrepresentation remains stable across both the hierarchical structures of connectivity and different scandal sizes.
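As a rough illustration of the k-core analysis, the sketch below prunes nodes with fewer than k remaining neighbours until none are left to prune, then reports the female fraction in each core. It assumes an adjacency dictionary of neighbour sets and a gender dictionary with labels "F" and "M"; both names and the toy graph are hypothetical.

```python
def k_core(adjacency, k):
    """Nodes left after repeatedly deleting nodes with fewer than k neighbours."""
    alive = set(adjacency)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            # Count only neighbours that have not been pruned yet.
            if len(adjacency[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


def female_fraction_by_core(adjacency, gender, max_k):
    """Fraction of female nodes in each non-empty k-core, for k = 1..max_k."""
    fractions = {}
    for k in range(1, max_k + 1):
        core = k_core(adjacency, k)
        if not core:
            break
        fractions[k] = sum(gender[n] == "F" for n in core) / len(core)
    return fractions


# Toy graph: a triangle (a, b, c) with one pendant node d attached to c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
gender = {"a": "F", "b": "M", "c": "M", "d": "M"}
fractions = female_fraction_by_core(adjacency, gender, max_k=3)
```

A flat profile of these fractions across k, as reported for both networks, would indicate that females are neither concentrated in nor excluded from the densely connected core.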

Fig. 1

Gender disparities in participation within corruption networks. Panels (A) and (B) present visualizations of corruption networks in Brazil and Spain. Nodes represent male (purple) and female (green) agents involved in corruption scandals, and connections among them indicate partnerships in at least one scandal. Females comprise 10% of agents in Brazil and 20% in Spain. Panels (C) and (D) show the proportions of males (purple line) and females (green line) across various k-core levels (defined as sets of nodes with at least k connections) in the Brazilian and Spanish networks, respectively. In both panels, horizontal dashed lines indicate the average prevalence of males and females within each network. Panels (E) and (F) depict the evolution of the proportion of males (purple lines) and females (green lines) as the corruption networks in Brazil and Spain expand through the inclusion of new scandals. Insets show the average number of male (purple markers) and female (green markers) agents as a function of the scandal size (number of involved individuals). Dashed lines represent linear relationships in which the average number of agents is proportional to scandal size, with proportionality coefficients corresponding to the overall gender prevalence in each network.

Network centrality

In addition to examining gender prevalence within these networks, we assess whether males and females occupy distinct roles and functions by analyzing the degree (k) and betweenness (B) centrality measures48. We estimate the average values of these measures by grouping nodes according to gender, further distinguishing whether individuals are recidivists. Recidivists, or re-offenders, defined as individuals involved in multiple corruption scandals, play a crucial role in shaping the structure and dynamics of corruption networks by linking separate scandals42. Figure 2A–D present the average values of both centrality measures (circle markers with error bars) alongside their distributions (shown as violin plots) separated by gender (purple for males and green for females). On average, male individuals have \(18\pm 1\) partners (with ± denoting the standard error of the mean), while females show a slightly lower average of \(17\pm 2\) partners in the Brazilian network. A similar trend is observed in the Spanish network, where males average \(21\pm 1\) partners, compared to \(19\pm 1\) for females. Among recidivists, males also exhibit a higher average number of partners than females in both networks, with values of \(33\pm 3\) vs. \(19\pm 10\) and \(38\pm 2\) vs. \(34\pm 4\) for recidivist males and females in the Brazilian and Spanish networks, respectively. For betweenness centrality, we calculate the average values only for recidivists, as non-recidivists have null betweenness given they do not intermediate any shortest paths. The average betweenness of female recidivists [\((1.39\pm 1.26)\times 10^{-2}\) and \((2.87\pm 1.65)\times 10^{-3}\)] is also slightly smaller than that of males [\((2.16\pm 0.38)\times 10^{-2}\) and \((2.87\pm 0.50)\times 10^{-3}\)] in both networks (Brazilian and Spanish).

The analysis of centrality measures suggests that, on average, males not only have more partners but also intermediate more shortest paths within these networks. To systematically test these hypotheses, we employ bootstrap tests for the equality of means49, finding that none of the observed gender differences in the average values of either centrality measure are statistically significant at the 95% confidence level (see Supplementary Table S1 for all p-values). Additionally, Mann-Whitney tests for equality in distribution50 do not allow rejection of the null hypothesis that the distributions of these centrality measures are equal between the genders at the same confidence level (see Supplementary Table S2 for all p-values).
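For illustration, one common variant of a bootstrap test for the equality of two means is sketched below. The specific procedure of Ref. 49 is not described in this section, so the recentring step, the two-sided statistic, and the function name `bootstrap_mean_test` are assumptions rather than the authors' implementation.

```python
import random
from statistics import mean


def bootstrap_mean_test(x, y, n_boot=2000, seed=0):
    """Two-sided bootstrap p-value for the null hypothesis mean(x) == mean(y)."""
    x, y = list(x), list(y)
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = mean(x + y)
    # Assumption: shift both samples to the pooled mean so the null holds,
    # while each sample keeps its own spread.
    x0 = [v - mean(x) + pooled for v in x]
    y0 = [v - mean(y) + pooled for v in y]
    hits = 0
    for _ in range(n_boot):
        xb = rng.choices(x0, k=len(x0))  # resample with replacement
        yb = rng.choices(y0, k=len(y0))
        if abs(mean(xb) - mean(yb)) >= observed:
            hits += 1
    return hits / n_boot


# Hypothetical degree samples, for illustration only.
p_same = bootstrap_mean_test([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
p_far = bootstrap_mean_test([1, 2, 3, 4, 5], [101, 102, 103, 104, 105])
```

A p-value above 0.05 corresponds to a difference that is not significant at the 95% confidence level, which is the outcome reported for all comparisons in this section.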

Fig. 2

Network centralities of males and females in corruption networks. Panels (A) and (B) show the degree centrality k of males (purple) and females (green), with nodes further categorized into all agents and recidivist agents for the Brazilian and Spanish networks. Similarly, panels (C) and (D) present the betweenness centrality B of recidivist agents. Circle markers with error bars represent the average values of the centrality measures along with their standard errors of the mean. Violin plots illustrate the data distributions, with individual observations depicted as small dots. Panels (E)–(J) compare the difference between the number of males and females identified as outliers based on degree centrality (\(\delta _k\) and \(\delta _{k}^\textrm{r}\) for recidivists) and betweenness centrality (\(\delta _{B}^\textrm{r}\)) for the Brazilian (first row) and Spanish networks (second row). Vertical lines indicate the empirical differences observed in each network, while the filled curves correspond to the complementary cumulative distribution functions [CCDFs, \(F(\delta )\), with \(\delta \in (\delta _k, \delta _k^\textrm{r}, \delta _B^\textrm{r})\)] of the same differences estimated from the null network model. Legends display the probability of observing the same or a larger excess of male outliers in the null model.

To further corroborate these results, we adapt a model capable of replicating several properties of corruption networks42 to generate synthetic networks where gender is randomly assigned to nodes, maintaining the observed proportions of males and females in each network. As detailed in Methods, the model begins with an empty network and incrementally adds fully connected graphs, simulating the emergence of new scandals. Nodes within these subgraphs have a probability of being assigned as re-offenders (the recidivism rate), and when a re-offender appears, her or his new partners are connected together to that re-offender in the network. This process continues until the number of scandals in the synthetic networks matches those observed in the empirical networks. Additionally, following Ref. 42, we fine-tune the model parameters to best replicate the properties of each empirical network. Once the synthetic networks are generated, gender is randomly assigned to each node according to the empirical proportions of males and females (10% females in the Brazilian and 20% in the Spanish networks). This method thus generates a null model in which gender is not associated with any structural property of the corruption networks, reflecting only the empirical disparity in gender representation. Indeed, simulations of this model mimic the empirical behavior and produce networks where degree and betweenness centrality measures display the same average values and distributions for both genders (Supplementary Figure S1).
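A heavily simplified sketch of such a null model is given below. The full specification is in Methods and Ref. 42 and is not reproduced here, so several choices are assumptions made only for illustration: scandal sizes are passed in explicitly, re-offenders are drawn uniformly from earlier agents, and gender is assigned by sampling a fixed number of female nodes.

```python
import random
from itertools import combinations


def simulate_null_network(scandal_sizes, recidivism_rate, female_fraction, seed=None):
    """Grow a synthetic network scandal by scandal, then assign gender at random."""
    rng = random.Random(seed)
    adjacency = {}
    next_id = 0
    for size in scandal_sizes:
        existing = list(adjacency)  # agents from earlier scandals
        members = []
        for _ in range(size):
            candidates = [n for n in existing if n not in members]
            # Assumption: a re-offender is picked uniformly among earlier agents.
            if candidates and rng.random() < recidivism_rate:
                members.append(rng.choice(candidates))
            else:
                members.append(next_id)
                next_id += 1
        for m in members:
            adjacency.setdefault(m, set())
        # Each scandal is added as a fully connected subgraph.
        for a, b in combinations(members, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    # Gender is independent of structure; only the overall proportion is kept.
    nodes = list(adjacency)
    females = set(rng.sample(nodes, round(female_fraction * len(nodes))))
    gender = {n: ("F" if n in females else "M") for n in nodes}
    return adjacency, gender


adjacency, gender = simulate_null_network([3, 2, 5], 0.0, 0.2, seed=1)
```

Because gender labels are drawn without reference to the network, any gender difference measured on such synthetic networks reflects only the imbalance in prevalence, which is what makes them usable as a baseline.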

Yet, the results of Fig. 2A–D indicate that the centrality distributions are highly skewed, with certain individuals exhibiting centrality values substantially higher than the average. This prompts the question of whether male and female agents are equally represented among these highly central individuals. To investigate this possibility, we calculate the difference between the numbers of males and females identified as outliers based on degree (\(\delta _k\) and \(\delta _k^\textrm{r}\) for recidivists) and betweenness (\(\delta _B^\textrm{r}\), only for recidivists) centralities. Outliers are defined using Tukey’s fences rule51, where any observation exceeding the third quartile by more than 1.5 times the interquartile range is classified as an outlier. In the Brazilian network, we identify an excess of 15 males among degree outliers (\(\delta _k=15\)), no difference among recidivist outliers in degree (\(\delta _k^\textrm{r}=0\)), and an excess of 2 males among outliers in betweenness centrality (\(\delta _B^\textrm{r}=2\)). In the Spanish network, there is an excess of 115 males among degree outliers (\(\delta _k=115\)), an excess of 6 males among recidivist outliers in degree (\(\delta _k^\textrm{r}=6\)), and an excess of 27 males among betweenness outliers (\(\delta _B^\textrm{r}=27\)). These results suggest that males constitute the majority of highly central individuals; however, part of this male overrepresentation may be due to the significantly higher proportion of males in these corruption networks.
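The outlier count can be sketched as follows. Quartile conventions differ between software packages, so the "inclusive" interpolation used here is an assumption, as are the function names and the toy degree values.

```python
from statistics import quantiles


def upper_fence(values):
    """Tukey's upper fence: Q3 + 1.5 * IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def male_excess_among_outliers(centrality, gender):
    """Number of male outliers minus number of female outliers."""
    fence = upper_fence(list(centrality.values()))
    outliers = [node for node, value in centrality.items() if value > fence]
    males = sum(gender[node] == "M" for node in outliers)
    return males - (len(outliers) - males)


# Hypothetical degrees for nine agents; the last two are far above the rest.
degree = {"n1": 1, "n2": 2, "n3": 2, "n4": 3, "n5": 3,
          "n6": 3, "n7": 4, "n8": 20, "n9": 30}
gender = {"n1": "F", "n2": "M", "n3": "M", "n4": "F", "n5": "M",
          "n6": "M", "n7": "M", "n8": "M", "n9": "M"}
delta_k = male_excess_among_outliers(degree, gender)
```

Here the quartiles are 2 and 4, the fence is 7, and both outliers are male, so the difference equals 2.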

To test this hypothesis, we generate one thousand simulated networks from our null model for each country, evaluating the differences in the numbers of males and females identified as outliers. We then calculate the complementary cumulative probability distributions of these differences [\(F(\delta )\), where \(\delta \in (\delta _k, \delta _k^\textrm{r}, \delta _B^\textrm{r})\)], as shown in Fig. 2E–J. Most of the distribution masses lie in the positive range, confirming that the higher prevalence of males is indeed partially responsible for the emergence of more male outliers. These distributions further allow us to quantify how rare the empirical differences between the number of male and female outliers are, analogously to a p-value in hypothesis tests. In the Brazilian network, the empirical differences (represented as vertical lines in Fig. 2E, G, and I) all have high probabilities of randomly occurring in our null model, suggesting that they can be fully explained by the higher proportion of males in the network. Similarly, in the Spanish network, the excess of male outliers in degree for recidivists has a \(27\%\) probability of occurring randomly (vertical line in Fig. 2H). However, for degree and betweenness centrality outliers (vertical lines in Fig. 2F and J), the probabilities of observing the empirical differences in our null models are considerably lower. Specifically, the probability of observing an excess of 115 or more male degree outliers is estimated at \(6\%\), and the probability of finding an excess of 27 or more male betweenness outliers is \(1\%\). Although the \(6\%\) probability for degree outliers is not significant at the \(95\%\) confidence level, these findings support, at least partially, the hypothesis that males are overrepresented among highly connected individuals in the Spanish network, even beyond what numerical differences in gender prevalence would suggest. We also note that the Brazilian network is considerably smaller than the Spanish network, which may explain the lack of significance, as random fluctuations become more pronounced in smaller systems.
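The probability read off the complementary cumulative distribution is simply the share of null-model simulations whose difference is at least as large as the empirical one. A minimal sketch, with a hypothetical function name and made-up simulated values:

```python
def tail_probability(null_values, observed):
    """Fraction of null-model values greater than or equal to the observed one."""
    return sum(value >= observed for value in null_values) / len(null_values)


# Made-up differences from ten hypothetical simulations, for illustration only.
simulated_deltas = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
p = tail_probability(simulated_deltas, observed=8)
```

With 1000 simulations, as used here, the smallest non-zero probability that can be resolved is 0.001, so values such as \(1\%\) correspond to about ten simulations reaching the empirical difference.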

Network resilience

We also examine the potential influence of gender on the resilience of the Brazilian and Spanish corruption networks by performing random dismantling and targeted attacks on their largest components. For random dismantling, we consider three distinct strategies: randomly removing all females, randomly removing the same number of males, and randomly removing the same number of agents regardless of gender. Each strategy is replicated 1000 times per network, and the attack’s impact is assessed by calculating the average fraction of the largest connected component S as a function of the number of removed nodes n, as depicted in Fig. 3A for the Brazilian network and Fig. 3B for the Spanish network. In both cases, the average behavior of S when removing only males (purple circles) or when removing nodes regardless of gender (dashed lines) is similar due to the significantly higher prevalence of males. Interestingly, the random removal of females produces different results between the two networks. In the Brazilian network, removing all females causes \(3\%\) less damage to the giant component than removing the same number of males or nodes regardless of gender. In contrast, in the Spanish network, removing all females causes \(16\%\) more damage to the giant component than the two other strategies. To verify the significance of these findings, we compare the effects of randomly removing females with those of randomly removing nodes regardless of gender in networks generated from our null model, calculating the differences in the final fraction of the largest component (\(\delta _S\)), and their distributions across 1000 simulations. Figure 3C and D show that both the \(3\%\) less damage caused by removing only females in the giant component of the Brazilian network and the \(16\%\) more damage caused by removing all females in the giant component of the Spanish network fall within the 95% confidence intervals derived from our null model. Therefore, randomly removing only females and randomly removing the same number of nodes regardless of gender do not differ significantly in their effect on the resilience of either corruption network. The same conclusions are obtained when evaluating the attack damages by calculating the average value of the inverse of the shortest path lengths in the giant components of these networks (Supplementary Figure S2).
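A random dismantling experiment of this kind can be sketched with a plain graph traversal. The sketch assumes that S is normalized by the size of the initial largest component and that the set of target nodes (for instance, all females) is supplied by the caller; function names and the toy path graph are hypothetical.

```python
import random


def largest_component_size(adjacency, removed=frozenset()):
    """Size of the largest connected component after deleting `removed` nodes."""
    alive = set(adjacency) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour in alive and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def dismantling_curve(adjacency, targets, rng):
    """Fraction S of the initial giant component after each random removal."""
    order = list(targets)
    rng.shuffle(order)
    initial = largest_component_size(adjacency)
    removed, curve = set(), []
    for node in order:
        removed.add(node)
        curve.append(largest_component_size(adjacency, removed) / initial)
    return curve


def average_curve(adjacency, targets, runs=1000, seed=0):
    """Average S(n) over independent random removal orders."""
    rng = random.Random(seed)
    total = [0.0] * len(targets)
    for _ in range(runs):
        for i, s in enumerate(dismantling_curve(adjacency, targets, rng)):
            total[i] += s
    return [t / runs for t in total]


# Toy path graph a-b-c-d-e; removing c splits it into two pairs.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "e"}, "e": {"d"}}
curve = average_curve(path, ["c"], runs=10)
```

Running the same routine with the female nodes, an equally sized random sample of male nodes, and an equally sized random sample of all nodes as targets gives the three curves compared in Fig. 3A and B.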

Fig. 3

Network resilience under the removal of male and female agents. Fraction of the largest connected component S of the (A) Brazilian and (B) Spanish networks as a function of the number of removed nodes n following three random dismantling strategies: randomly removing all females (green crosses), randomly removing the same number of males (purple circles), and randomly removing the same number of agents regardless of gender (dashed lines). The curves represent the average values of S calculated from 1000 independent realizations of each dismantling strategy. Probability distribution functions [PDFs, \(f(\delta _S)\)] comparing the differences \(\delta _S\) in the final fraction of the largest component after randomly removing females and randomly removing nodes regardless of gender in 1000 simulations of networks generated from our null model for the (C) Brazilian and (D) Spanish networks. Fraction of the largest connected component S of the (E) Brazilian and (F) Spanish networks as a function of the number of removed nodes n following the generalized network dismantling (GND) algorithm proposed by Ren et al.52. Green crosses indicate the removal of female agents, and purple circles represent the removal of male agents. The GND algorithm is deterministic with nodes sequentially removed (up to the total number of females initially present in the giant component of each network) in an optimal order designed to cause maximum disruption. Probability distribution functions [PDFs, \(f(x_{\text {GND}})\)] of observing a fraction \(x_{\text {GND}}\) of female agents among the nodes selected for removal by the GND algorithm across 1000 simulations of our null model for the (G) Brazilian and (H) Spanish corruption networks. The empirical fractions are depicted by vertical lines, with shaded regions corresponding to 95% confidence intervals estimated from the null model.

In addition to random dismantling, we further consider targeted attacks using the deterministic dismantling algorithm proposed by Ren et al.52, addressing the generalized network dismantling (GND) problem. This approach aims to identify an optimal set of nodes whose removal (subject to cost constraints) reduces the largest connected component to a specified maximum size. Since the algorithm of Ren et al. relies on a recurrent spectral partitioning approach, it offers complementary insights into the structural roles of males and females that may not be captured by the random dismantling and centrality analysis. Figure 3E and F depict the fraction of the largest components of the Brazilian and Spanish networks as nodes selected by the algorithm are sequentially removed (up to the total number of females initially present) in an optimal order designed to cause maximum disruption. In these figures, the removal of male nodes is indicated by purple circles, while the removal of female nodes is shown by green crosses. As expected, these targeted attacks inflict significantly greater damage on the giant components of both networks compared to random dismantling strategies. Regarding gender differences, in the Brazilian network, 4 females out of the 29 initially present are removed, corresponding to \(13.8\%\), a fraction slightly higher than the overall female prevalence in the giant component (\(9.4\%\)). Similarly, in the Spanish network, the GND algorithm selects 50 females for removal out of the 225 initially present, yielding a fraction slightly higher than the overall female prevalence in the giant component (\(22.2\%\) vs. \(20.7\%\)). Once again, we use our null model to assess the significance of these differences by estimating the probability distributions of observing a fraction \(x_{\text {GND}}\) of females among the nodes selected for removal across 1000 simulations for each corruption network. These distributions are shown in Fig. 3G and H, indicating that both empirical differences are fully accounted for by our null model. The same results are obtained when considering targeted network attacks based on degree and betweenness centralities, as well as when quantifying damage using the average value of the inverse of the shortest path lengths (Supplementary Figure S3). Therefore, as with random dismantling, gender does not play a significant role in targeted network attacks.

Collaboration patterns

In our final analysis, we examine collaboration patterns of both genders in corruption networks. Results so far indicate that females and males display a similar number of collaborators, including among recidivists. However, males are overrepresented among highly central individuals in the Spanish network. This prompts the question of whether there are differences in the recidivism rates (defined here as the proportion of individuals involved in multiple corruption cases) between males and females. In the Brazilian network, \(14.1\%\) of males and \(11.9\%\) of females participate in more than one corruption scandal, while in the Spanish network, \(9.04\%\) of males and \(7.27\%\) of females are recidivists. Thus, the recidivism rate of males is \(\approx 2\) percentage points higher than that of females in both networks. Once again, we assess the significance of these differences by simulating 1000 networks using our null model with parameters tailored to the data of each country. We then calculate the differences \(\delta _r\) in the recidivism rate between genders and estimate their probability distributions. Figure 4A and B display these distributions for Brazilian and Spanish simulated networks, where shaded regions represent \(95\%\) confidence intervals for \(\delta _r\), and the vertical lines indicate the empirical differences. As the empirical differences fall within the confidence intervals, they are not statistically significant, suggesting no clear evidence that males have higher recidivism rates than females.
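The recidivism rate per gender follows directly from counting how many scandals each individual appears in. The sketch below assumes scandal records keyed by identifier and a gender dictionary with labels "M" and "F"; all names and data are hypothetical.

```python
from collections import Counter


def recidivism_rates(scandals, gender):
    """Share of males and females who appear in more than one scandal."""
    appearances = Counter(
        person for members in scandals.values() for person in set(members)
    )
    rates = {}
    for label in ("M", "F"):
        people = [p for p in appearances if gender[p] == label]
        recidivists = sum(appearances[p] > 1 for p in people)
        rates[label] = recidivists / len(people) if people else float("nan")
    return rates


# Hypothetical toy records, for illustration only.
scandals = {
    "s1": ["ana", "bruno", "carla"],
    "s2": ["bruno", "diego"],
    "s3": ["carla", "diego", "eva"],
}
gender = {"ana": "F", "bruno": "M", "carla": "F", "diego": "M", "eva": "F"}
rates = recidivism_rates(scandals, gender)
delta_r = rates["M"] - rates["F"]
```

Applied to the reported values, the same subtraction gives \(0.141 - 0.119 = 0.022\) for Brazil and \(0.0904 - 0.0727 \approx 0.018\) for Spain, which are differences in percentage points rather than relative changes.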

Fig. 4

Collaboration patterns of males and females within corruption networks. Probability distribution functions [PDFs, \(f(\delta _r)\)] of differences in recidivism rates \(\delta _r\) between males and females estimated from simulation of our null model with parameters tailored to the (A) Brazilian and (B) Spanish data. Vertical lines indicate the empirical differences calculated for the Brazilian (\(\delta _r^{\text {BR}}=0.022\)) and Spanish \((\delta _r^{\text {ES}}=0.018)\) networks. The shaded regions represent \(95\%\) confidence intervals estimated from the simulated data for each country. Complementary cumulative distribution functions [CCDFs, \(F(\phi )\)] of the fraction \(\phi\) of females involved in corruption scandals in the (C) Brazilian and (D) Spanish data. Green curves represent the empirical distributions, gray curves show the distributions estimated from each of the 1000 simulations of the null model for each country, and the black curves correspond to the average behavior of the simulations. Green vertical lines indicate the empirical average fractions (\(\bar{\phi }_\textrm{BR} = 0.11\) for Brazilian and \(\bar{\phi }_\textrm{ES} = 0.19\) for Spanish scandals), while black vertical lines show the corresponding averages calculated from all simulations. Percentage of male-male, female-male, and female-female links in the (E) Brazilian and (F) Spanish corruption networks on a logarithmic scale. Colored bars indicate the empirical percentages, while gray bars show the averages obtained from 1000 simulations of the null model for each country. Error bars represent the 95% confidence intervals calculated from the null model simulations, with asterisks above the bars indicating statistically significant differences.

We also explore the complementary cumulative distributions \(F(\phi )\) of the fractions \(\phi\) of females involved in each corruption scandal, as depicted in Fig. 4C for the Brazilian scandals and Fig. 4D for the Spanish scandals (green lines). These figures show that scandals predominantly involving females (\(\phi >0.5\)) are rare (\(3\%\) and \(4\%\) in Brazilian and Spanish data, respectively); however, the average values of \(\phi\) (\(\bar{\phi }_\textrm{BR} = 0.11\) for Brazilian and \(\bar{\phi }_\textrm{ES} = 0.19\) for Spanish scandals) closely align with the overall proportions of females in both networks (0.10 and 0.20, respectively). Using the Mann-Whitney test50, we compare these empirical distributions with those derived from 1000 simulations of our null model (gray curves in Fig. 4C and D). The test fails to reject the null hypothesis of equality between the simulated and empirical distributions for the Brazilian data in 98% of the comparisons (p-value \(<0.05\) in only \(2\%\) of the comparisons, Supplementary Figure S4), and in the Spanish data, the hypothesis is not rejected in 61% of the comparisons (p-value \(<0.05\) in \(39\%\) of the comparisons, Supplementary Figure S4). Furthermore, the average values of \(\phi\) and \(F(\phi )\) across all simulations closely match their empirical counterparts, with only small systematic deviations observed in the Spanish scandals. While the prevalence of females in Spanish scandals appears marginally higher than that predicted by the null model, the fraction of scandals predominantly involving females (\(\phi >0.5\)) is not significantly different from the simulated results (Supplementary Figure S5). These findings confirm that scandals predominantly involving females are rare, and this underrepresentation can be fully explained by their overall lower representation in the networks.

Still analyzing collaboration patterns, we calculate the percentage of links exclusively between males, between females and males, and exclusively between females to quantify gender homophily within the corruption networks. As before, we use 1000 simulated replicas of the Brazilian and Spanish networks generated from our null model as baselines for comparisons with empirical values. The colored bars in Fig. 4E and F show the empirical proportions of each link type, while the gray bars represent the null model simulations, with error bars indicating \(95\%\) confidence intervals. In the Brazilian network, the proportions of male-male, female-male, and female-female links are \(81.2\%\), \(17.9\%\), and \(0.9\%\), respectively. These empirical percentages closely align with the average values obtained from the null model and do not significantly differ from them (Supplementary Figures S6A-S6C), suggesting that the observed degree of homophily in the Brazilian network is fully explained by the null model. In contrast, in the Spanish network, \(66.3\%\) of links are exclusively between males, \(28.8\%\) connect females and males, and \(4.9\%\) are exclusively between females. These percentages deviate more substantially from the null model averages of \(63.3\%\), \(32.5\%\), and \(4.2\%\) for male-male, female-male, and female-female connections, respectively. The surplus of male-male links and the deficit of female-male connections are statistically significant, as the empirical values lie outside the \(95\%\) confidence intervals (Supplementary Figures S6D and S6E). In turn, although the excess of female-female links is not strictly statistically significant, the empirical value (\(4.9\%\)) closely approaches the upper limit of the \(95\%\) confidence interval (\(3.4\%\) to \(4.9\%\), Supplementary Figure S6F). Thus, our data provide evidence for the existence of a higher degree of gender homophily in the Spanish network that cannot be fully accounted for by the null model.
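The link-type comparison can be illustrated with a short sketch. It is a simplification under stated assumptions rather than the procedure used in this work: the baseline below keeps a fixed toy edge list and only redraws Bernoulli gender labels, whereas our null model regenerates the whole network in each simulation. The function names, the toy network, and the rank-based percentile interval are likewise illustrative choices.

```python
import math
import random

KINDS = ("male-male", "female-male", "female-female")


def link_type_percentages(edges, gender):
    """Percentage of links of each gender composition."""
    counts = dict.fromkeys(KINDS, 0)
    for a, b in edges:
        n_female = (gender[a] == "F") + (gender[b] == "F")  # 0, 1, or 2
        counts[KINDS[n_female]] += 1
    total = sum(counts.values())
    return {kind: 100.0 * c / total for kind, c in counts.items()}


def percentile_interval(values, level=0.95):
    """Simple rank-based interval covering the central `level` of the values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    low = ordered[math.floor(tail * (len(ordered) - 1))]
    high = ordered[math.ceil((1.0 - tail) * (len(ordered) - 1))]
    return low, high


def null_link_percentages(edges, nodes, p_f, n_sim=1000, seed=0):
    """ASSUMPTION: fixed structure, gender labels redrawn as Bernoulli(p_f)."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        labels = {v: ("F" if rng.random() < p_f else "M") for v in nodes}
        sims.append(link_type_percentages(edges, labels))
    return sims


# Toy network (hypothetical): four nodes and four links.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
toy_gender = {0: "M", 1: "M", 2: "F", 3: "F"}
empirical = link_type_percentages(toy_edges, toy_gender)
sims = null_link_percentages(toy_edges, sorted(toy_gender), p_f=0.2)

for kind in KINDS:
    low, high = percentile_interval([s[kind] for s in sims])
    outside = empirical[kind] < low or empirical[kind] > high
    print(f"{kind}: empirical={empirical[kind]:.1f}%, "
          f"null 95% interval=({low:.1f}, {high:.1f}), outside={outside}")
```

An empirical percentage falling outside the simulated interval is what the text treats as a statistically significant deviation from the null model.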

Discussion and conclusions

Corruption, broadly defined as “the abuse of entrusted power for private benefits or gains,”16 is a pervasive criminal behavior that remains highly resistant to control53,54. The vast sums of money involved in corruption cases, alongside the high costs of efforts to combat this crime, have driven much of the literature on corruption and organized crime toward economic and sociological analyses53. In parallel, the rise of network science48,55 has recently fostered a growing body of research exploring the dynamics and structure of networked crimes14,16,17,18,19,22,23,27,28,29,31. Our study contributes to this growing field by addressing the underexplored role of gender in corruption networks, thus complementing the extensive literature documenting the significantly higher offending rates of males across various crime types1,2,3,4,5, as well as the suggested lower tolerance of females for dishonest and corrupt behavior3,4,5,6,10,11,12.

Our findings align with the criminology literature, demonstrating that males constitute the vast majority of individuals in political corruption networks, with females representing only \(10\%\) of nodes in the Brazilian network and \(20\%\) of nodes in the Spanish network. Moreover, we observed that the proportion of females involved in corruption scandals remained stable over time, across hierarchical connectivity structures, and irrespective of scandal size. Given that most individuals involved in corruption scandals are politicians and high-ranking employees in both public and private sectors, the underrepresentation of female agents may, at least in part, be attributable to their comparatively low participation in politics, high-ranking positions, and other leadership roles38,43,44,45. Indeed, during the same period covered by our data, females constituted, on average, \(10.6\%\) and \(26.4\%\) of parliamentarians in the upper and lower chambers in Brazil and Spain56,57, respectively—proportions that parallel the female participation in our corruption networks. However, corruption network participants are not exclusively political representatives and the complete networks capturing all relevant interactions between politics and other sectors remain inaccessible, leaving the associated gender distributions unknown. Moreover, politicians are not ordinary individuals45, and political corruption represents a highly specialized form of criminality. Thus, selection bias associated with the occupation of political positions, as well as bias related to the emergence of individuals involved in illicit activities, may also influence the observed gender differences in corruption network participation. Accordingly, it remains a challenge for future studies to determine whether disparities in political participation and leadership roles fully account for the underrepresentation observed.

Interestingly, the level of female participation in these corruption networks is somewhat comparable to that observed in Chicago organized crime prior to alcohol prohibition in 1920, where females constituted 18% of networked individuals26. However, following prohibition, female representation dropped to just 4%, despite the network nearly tripling in size26. A similarly low level of female participation was observed in organized crime groups involved in serious offenses in a region of England, where females accounted for 6% of individuals58. In the case of Chicago, Smith26 argued that the exogenous shock of alcohol prohibition increased profits, violence, and risk, leading to new organizational structures that concentrated power and opportunities among males, largely excluding females from criminal activities. While our findings do not indicate significant variations in female involvement across both corruption networks, the potential impact of similar exogenous shocks—such as anti-corruption measures or large-scale police operations—on these networks warrants further investigation in future research.

To explore how this gender disparity translates into differences in network properties of males and females, we considered a model capable of replicating several aspects of corruption networks, generating simulated networks that reflect the observed gender imbalance but without tying gender to any specific network property. These simulations thus served as null models, providing baselines for statistical comparisons and revealing differences that cannot be attributed solely to the disparity in gender representation. Despite the substantial imbalance in gender representation, our analysis found that females and males exhibit similar averages and distributions in terms of degree and betweenness centralities. However, the number of males emerging as outliers for both centrality measures was significantly greater than that of females, with these differences not being fully explained by the female underrepresentation in our null models, particularly in the Spanish network. The lower occurrence of females among centrality outliers also aligns with the absence of females in the criminal elite during the alcohol prohibition era in Chicago26. Gender also did not significantly affect the resilience of corruption networks, whether through random dismantling or targeted attacks on their largest components. Specifically, we found that randomly removing only females, only males, or removing agents irrespective of gender caused similar levels of damage to these networks. Similarly, the fraction of females selected for removal by a deterministic algorithm addressing the generalized network dismantling problem52 was indistinguishable from the values obtained in the null model simulations.

Regarding collaboration patterns, our findings indicated that males are more likely than females to be involved in multiple scandals and that the proportion of scandals predominantly formed by females is small in both networks. However, these differences are fully explained by the overall lower female representation, as captured by the null network model. Although our simulations fully explain the gap in recidivism rates, it is noteworthy that female politicians are often subjected to higher levels of voter accountability (given the general perception of females as more honest than males5,6), which may further disincentivize their involvement in multiple scandals13,59. We further investigated gender homophily within the corruption networks by estimating the percentage of links exclusively between males, between females and males, and exclusively between females, both from the empirical networks and their null model simulations. In both networks, male-male connections accounted for the vast majority of network connections, followed by female-male connections, while exclusively female connections were the least common. In the Brazilian network, the degree of homophily quantified by the prevalence of these three connection types was fully explained by the null model. Conversely, in the Spanish network, male-male connections exceeded and female-male connections fell below null model expectations, indicating that the degree of gender homophily is not solely explained by female underrepresentation. This pattern aligns with previous research on organized crime, where homophilic tendencies have been observed in relation to ethnicity60, age58,61, and gender26,58,61. Specifically, Campana and Varese identified gender as a determinant of membership in criminal groups, with individuals of the same gender more likely to belong to the same group58. Similarly, Smith26 observed mild assortative mixing by gender, where females tend to connect with males, while males predominantly connect with other males in the context of organized crime in Chicago during the alcohol prohibition.

Taken together, our findings indicate that the participation of female agents in political corruption is markedly low; however, once this disparity is accounted for, the structural roles of these networked criminals appear largely invariant with respect to gender, with notable exceptions concerning the presence of highly central individuals and the degree of homophily in criminal associations. Although not a perfect analogy, these findings align with existing research on gender differences in scientific careers, where female and male faculty members exhibit similar numbers of co-authors35, comparable annual productivity36, and equivalent career-wise impact36, once total publication numbers and career lengths are considered.

Our research is not without limitations. A primary challenge lies in the difficulty of acquiring reliable data on corruption networks and organized crime, due to legal confidentiality or the covert nature of criminal activities. As with many datasets of this type, corruption data remain inherently incomplete, as it is challenging to ensure that all individuals involved in corruption scandals are identified through investigations. The temporal span of our networks (from the late 1980s to the late 2010s) may also influence our results. Future research using more recent data, in light of the rapid developments in female representation and evolving societal norms regarding gender equality, may eventually reveal emerging trends despite the observed stability in gender prevalence. Furthermore, the limited size of our corruption networks may have constrained the detection of statistically significant differences, particularly in the Brazilian context, where random fluctuations in the null model were more pronounced. Moreover, our conclusions are based on political corruption within two Western nations, and although these countries differ in political systems and development levels, future research should expand to other regions to assess the generalizability of our findings. Another limitation is the exclusion of nonbinary identities from our analysis; although such data are undoubtedly challenging to collect, future work should strive to incorporate these identities for a more comprehensive understanding of gender roles in corruption networks. Additional studies could also explore the temporal dynamics of centrality and collaboration within corruption networks, considering their growth and evolution as well as the role of gender in the emergence of isolated scandals and individuals. Equally important is the examination of other networked crimes, such as money laundering, illicit trade, and cybercrime. Despite these constraints, we believe this work improves our understanding of gender disparities and commonalities in the roles individuals occupy within corruption networks.

Methods

We adapt the corruption network model proposed in Ref.42 to generate a null model in which gender labels are randomly assigned based on the empirical gender proportions. The original model was designed to replicate several characteristics observed in corruption networks, including exponential degree distributions, high clustering coefficients, and the small-world property. The model begins with an empty synthetic network that grows by iteratively adding complete graphs representing corruption scandals. To mimic empirical behavior, the size \(s\) (number of people involved) of each scandal is drawn from the exponential distribution, \(P(s) = \frac{1}{s_c} e^{-s/s_c}\), where \(s_c\) is a parameter defining the typical size of corruption scandals. Additionally, to further replicate empirical patterns, the model assumes that the number of recidivist agents (\(R\)) increases linearly with the total number of nodes (\(N\)) according to \(R = \alpha N + \beta\), where \(\alpha\) controls the recidivism rate and \(\beta\) determines the minimal number of individuals in the network required to observe the first recidivists. Each time new recidivists are introduced, the model randomly selects individuals from the network to become recidivists and incorporates them into the next scandal added to the network. Furthermore, individuals who are already recidivists are selected with a small probability \(p_{rr}\). This process continues until the number of corruption scandals in the simulated networks matches that observed in the empirical networks. Once the network is complete, genders are randomly assigned to all nodes based on a Bernoulli distribution, where the probability of a node being female is \(p_f\).
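As a brief check on how the parameters act (a reading of the two expressions above, treating \(s\) as continuous, and not an additional result), the exponential distribution has mean \(s_c\), and for \(\beta < 0\) the linear rule yields a positive number of recidivists only after the network exceeds a threshold size:

\[
\langle s \rangle = \int_0^{\infty} s \, \frac{1}{s_c} e^{-s/s_c} \, ds = s_c,
\qquad
R = \alpha N + \beta > 0 \iff N > -\frac{\beta}{\alpha} \quad (\alpha > 0).
\]

With the values used in this work, this threshold is \(11.5/0.14 \approx 82\) nodes for the Brazilian network (\(\alpha = 0.14\), \(\beta = -11.5\)) and \(3.47/0.09 \approx 39\) nodes for the Spanish network (\(\alpha = 0.09\), \(\beta = -3.47\)).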

Following Ref.42, we set the model parameters to best reproduce the empirical properties of the corruption networks. For the Brazilian network, we use \(\alpha =0.14\), \(\beta =-11.5\), \(s_c=7.51\), and \(p_f=0.1\). For the Spanish network, we set \(\alpha =0.09\), \(\beta =-3.47\), \(s_c=7.33\), and \(p_f=0.2\). We further consider \(p_{rr}=0.025\) for both networks.
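The growth rules described in this section can be sketched in a few lines of Python. This is an illustrative reading of the model and not the implementation used in this work or in Ref.42. In particular, three choices below are assumptions: scandal sizes are rounded up with a minimum of two people, each remaining slot in a scandal reuses an existing recidivist with probability \(p_{rr}\), and the number of scandals in the example call is a hypothetical placeholder.

```python
import math
import random
from itertools import combinations


def simulate_null_network(n_scandals, alpha, beta, s_c, p_f, p_rr, seed=None):
    """Grow a synthetic network scandal by scandal, then assign genders."""
    rng = random.Random(seed)
    scandals = []        # each scandal is a list of node ids
    edges = set()        # undirected links stored as (small_id, large_id)
    recidivists = set()  # nodes selected to take part in more than one scandal
    n_nodes = 0

    for _ in range(n_scandals):
        # Scandal size drawn from P(s) = exp(-s / s_c) / s_c.
        # ASSUMPTION: round up and require at least two people.
        size = max(2, math.ceil(rng.expovariate(1.0 / s_c)))

        # Linear rule R = alpha * N + beta sets the target number of recidivists.
        target = max(0, math.floor(alpha * n_nodes + beta))
        fresh_pool = [v for v in range(n_nodes) if v not in recidivists]
        n_promote = min(max(0, target - len(recidivists)), size, len(fresh_pool))
        members = rng.sample(fresh_pool, n_promote)  # newly promoted recidivists

        # ASSUMPTION: each remaining slot reuses an existing recidivist with
        # probability p_rr; otherwise it is filled by a brand-new node.
        repeat_pool = sorted(recidivists)
        for _slot in range(size - n_promote):
            if repeat_pool and rng.random() < p_rr:
                members.append(repeat_pool.pop(rng.randrange(len(repeat_pool))))
            else:
                members.append(n_nodes)
                n_nodes += 1

        recidivists.update(members[:n_promote])
        edges.update(combinations(sorted(members), 2))  # scandal = complete graph
        scandals.append(members)

    # Gender is independent of structure: Bernoulli(p_f) label for every node.
    gender = {v: ("F" if rng.random() < p_f else "M") for v in range(n_nodes)}
    return scandals, edges, gender


# Example call with the Brazilian parameters; n_scandals=60 is hypothetical.
scandals, edges, gender = simulate_null_network(
    n_scandals=60, alpha=0.14, beta=-11.5, s_c=7.51, p_f=0.1, p_rr=0.025, seed=1
)
```

Because gender labels are drawn only after the network is complete, any structural quantity computed on such replicas reflects the gender imbalance alone, which is what makes them usable as a baseline.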