The BioSUD Biobank as a genomic resource for substance use disorders in Italy

Ribatti, Raffaella Maria; de Gennaro, Luciana; Daponte, Alessia; Cozzoli, Danilo; Quaranta, Maria Rita; Ostuni, Angelo; Casanova, Margherita; Ariano, Vincenza; Leone, Vincenzo; Perrone, Francesco; Bona, Salvatore Della; Lacalamita, Angela; De Fazio, Salvatore; Lorusso, Daniela; Metspalu, Mait; Torroni, Antonio; Olivieri, Anna; Capelli, Cristian; Antonacci, Francesca; Catacchio, Claudia Rita; Ventura, Mario; Montinaro, Francesco

doi:10.1038/s41598-025-05211-w

Article
Open access
Published: 01 July 2025

Scientific Reports volume 15, Article number: 21817 (2025)

Abstract

Substance Use Disorders (SUDs) are a significant public health concern with complex etiologies involving genetic, environmental, and psychological factors. Here, we present BioSUD, a biobank that, by integrating genomic data with comprehensive phenotypic assessments, including sociodemographic, psychosocial, and addiction-related variables, was designed to investigate the etiology of SUDs within the Southern Italian population. We assessed a cohort of 1,806 participants (1,508 controls and 298 individuals with SUD diagnosis). Genomic analyses of the newly generated genotypes showed a predominantly Southern Italian ancestry for the BioSUD cohort. Admixture analysis reveals a complex history of genetic admixture in Southern Italian populations, exhibiting Southern European, African, and other ancestries. This results in significant genetic variation, potentially limiting the applicability of translational studies primarily based on Northern European ancestries. From a social and psychological perspective, individuals with SUDs exhibited lower socioeconomic status, increased exposure to adverse experiences, and compromised familial and peer relationships relative to controls. These results show that the BioSUD cohort is valuable for studying SUD-associated complex behavioral traits.

Introduction

The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, Text Revision (DSM-5-TR) defines substance use disorders (SUDs) as a pattern of substance use resulting in clinically significant impairment or distress¹. This encompasses a range of conditions, including tolerance, withdrawal, and persistent, unsuccessful efforts to control or reduce substance consumption.

SUDs pose a substantial global health challenge, significantly contributing to morbidity and mortality². According to the 2021 United Nations Office on Drugs and Crime (UNODC) report³, an estimated 296 million individuals worldwide engage in drug use, with 29.5% experiencing a SUD, marking a 45% increase in prevalence over the preceding decade. The European Drug Report⁴ indicates that over 29% of Europeans aged 15 to 64 have used illicit drugs at least once. In Italy, substance use among 15-19-year-olds reached 27.9% in 2023, affecting approximately one million students⁵. With a cannabis use rate of 21.5%, Italy ranks second in Europe, exceeded only by Czechia, and substantially surpasses the EU average of 12.2%. Cocaine use, reported at 2.1%, aligns closely with the European average of 2.21%, with higher consumption concentrated in Northern Italy in 2023⁴. In 2022, the European Union recorded an estimated mortality rate of 22.5 deaths per million among individuals aged 15–64⁵. Additionally, data from Italian prefectures and law enforcement point to a slight rise in drug-related deaths in 2022 over 2021⁶.

The etiology of SUDs seems to be multifactorial, encompassing neurological, genetic, and sociocultural components^7,8. From a genetic standpoint, familial patterns found in twin and family studies indicate that SUDs have a heritable component⁹. Estimates of heritability for various SUDs usually fall between 50 and 60%, suggesting a high level of polygenicity^10,11. Genome-wide association studies (GWAS) have identified specific genomic loci associated with various SUDs, encompassing both licit and illicit substances, including nicotine, cannabis, and opioids^12,13,14. While often challenged by limitations in robustness and reproducibility, candidate gene studies have explored genetic variations within dopaminergic, serotonergic, opioid receptor, GABAergic, and nicotinic cholinergic system genes across various SUDs^15,16,17,18. Linkage disequilibrium (LD) score regression methods revealed positive genetic correlations between smoking, cannabis use, major depression, and risk-taking behaviors¹⁹. Hatoum et al.¹⁹ found a shared hereditary risk element for addiction affecting problematic use of opioids, cannabis, tobacco, and alcohol. This risk factor is distinct from general substance use patterns, exhibiting the strongest correlations with opioid and cannabis use disorders and a weaker association with tobacco use. It also correlates with executive functioning, personality traits such as risk-taking and neuroticism, and several non-substance-related mental health conditions. According to the authors, this addiction-specific genetic risk factor remains a significant predictor of addiction even after controlling for typical substance use patterns and general psychopathology, suggesting a unique genetic architecture that drives addiction, at least partially, independently of these contributing elements.

From a psychological perspective, current addiction models suggest that individuals initiate substance use due to positive reinforcement, which can lead to automated processes and inflexible, compulsive behaviors that resist negative consequences^20,21. The prefrontal cortex, which is essential for many executive functions^22,23, including inhibitory control, working memory, and attention^24,25, is significantly affected by persistent substance use. In people with SUDs, these deficits cause more general cognitive problems in domains outside substance-related reward^26,27,28,29.

Northern European ancestry groups are the most frequently studied populations in GWAS, which leaves a significant research gap and a need to investigate more diverse populations. Several studies have shown that Southern European populations have been influenced by many other populations during their history, causing subtle but significant differences in allele frequencies^30,31,32,33. Differences in evolutionary pressures, environmental adaptations, and demographic events have resulted in substantial genetic diversity across human populations. This diversity complicates the identification of universally applicable risk variants for complex traits like SUDs. Therefore, the substantial genetic heterogeneity within the Italian population underscores the importance of its inclusion in SUD research.

Biobank development has increased markedly worldwide in recent decades, with collections preserving biological samples from thousands of participants. Their value is amplified by integrating observational data and in-depth questionnaire responses, thereby significantly boosting research potential^34,35. Even though they are still relatively new, biobanks have already transformed biomedicine, especially in association studies, and are expected to yield further insights³⁶. However, for specific traits, such as SUDs, most biobanks lack sufficient data, particularly for the Italian population, with research primarily focused on alcohol and tobacco addiction or abuse, together with lifetime usage data for other substances.

Due to the high societal costs and complex nature of SUDs, investigating the interplay between genetic and environmental factors is crucial for improving prevention, diagnosis, and personalized interventions³⁷. This article presents BioSUD, a new Biobank project for studying SUDs in Southern Italy. The primary goal is to determine the genetic causes of SUDs and the relationships between treatment outcomes and environmental and genetic factors. Our findings could improve genotype-informed SUD treatments, improving patient outcomes and advancing scientific understanding.

Materials and methods

Recruitment

The BioSUD initiative intends to create a genetic resource for understanding the phenotypic characteristics associated with SUDs. We aim to collect and analyze data from 3,000 people, 1,500 of whom are diagnosed with SUD. As of 1 February 2024, the cohort included 1,806 participants: 1,508 controls (1,046 males and 462 females) and 298 cases (278 males and 20 females). This study defined controls as individuals without a formal SUD diagnosis from either public or private treatment centers. We recruited control participants exclusively from a single blood donor center (Centro Trasfusionale of the University General Hospital, Bari, Italy) between March and October 2021 during their donation process. Before sample collection, we informed them about the project’s aims and rationale and provided an informative document to obtain their written informed consent.

The case group included samples from several private centers and public structures (detailed below). Recruitment is ongoing, aiming to reach the target of 1,500 SUD cases for a balanced case-control ratio. All cases met the standardized diagnostic criteria for SUDs according to the International Classification of Diseases, 11th Revision (ICD-11)³⁸, or the DSM-5-TR¹. Eligible participants under these criteria were recruited from private (N = 71) and public (N = 227) healthcare facilities in Apulia, Southern Italy. Specifically, we collected the private center samples from the Therapeutic Community Emmanuel Onlus - Sector Dependencies (Lecce) and the Therapeutic Community “Fratello Sole” - Social Cooperative (Gioia del Colle, BA). We gathered the samples from public institutions at the SerD of Bari (BA), Bitonto (BA), Brindisi (BR), Campi Salentina (LE), Castellaneta (TA), Casarano (LE), Foggia (FG), Francavilla (TA), Gagliano del Capo (LE), Gallipoli (LE), Grumo Appula (BA), Lecce (LE), Maglie (LE), Manduria (TA), Martina Franca (TA), Nardò (LE), Ostuni (BR), Poggiardo (LE), San Cesario di Lecce (LE), San Pietro Vernotico (BR), Taranto (TA), Ugento (LE) and the SerD in the Brindisi Prison (BR).

We used different engagement strategies to encourage volunteer participation in the case group. In all facilities, the BioSUD members presented the project separately to staff and participants, using multimedia tools such as presentations and short demonstrative videos. After the presentation, we recorded the volunteers’ willingness to participate in the study. During scheduled routine examinations in the following weeks, we collected written consent and blood samples to reduce participant burden. A dedicated medical professional or psychologist was on-site to oversee the process, including obtaining written consent and aiding with the questionnaire (detailed below). Healthcare professionals, such as doctors and specialized nurses on the research team, collected blood samples. After transport to the BioSUD lab facilities, all the blood samples were processed as described in the following sections within 72 hours. We entered the questionnaire data from cases and controls into Excel and processed it with R Studio, version 4.5.0³⁹.

Sampling

Blood samples from controls were collected by a specialized nurse and from cases by healthcare professionals, including physicians and specialized nurses affiliated with the research team. Specifically, after the written consent was returned, venous blood (8 mL) was drawn using a Vacutainer K2 EDTA and kept refrigerated until arrival at the processing laboratory at the University of Bari. Within 72 h from collection, all samples underwent centrifugation at 800 g for 15 min at a 45-degree angle. Following stratification, 1 mL of plasma was stored at -80 °C, while the remaining sample was preserved at -20 °C for subsequent DNA extraction and analyses.

DNA extraction

We extracted DNA from 250 µL of the layer of nucleated blood cells obtained after centrifugation during the initial processing, using Qiagen DNA Blood Mini Kit according to the manufacturer’s protocol. We evaluated the quality and concentration of the extracted DNA using the NanoDrop 1000 UV Thermo Scientific.

Genotyping and dataset

A total of 1,378 DNA samples meeting the quality control thresholds (concentration ≥ 30 ng/µl, 260/280 ratio > 1.6) were genotyped at the Institute of Genomics, University of Tartu (Estonia) using the Illumina Global Screening Array (GSA, Illumina Inc.). Samples with a call rate lower than 97% were then discarded, resulting in 1,279 genotypes with 723,895 SNPs captured. The genotype data were imputed using TOPMed Imputation Server (https://imputation.biodatacatalyst.nhlbi.nih.gov/#!), which leverages Minimac4 and the TOPMed r3 reference panel to infer additional SNPs. The total number of imputed SNPs obtained after applying an Rsq filter ≥ 0.3 was 34,118,504. Although these imputed variants were available, we did not use them in the present study, because the total number of retained SNPs did not increase significantly after merging with other datasets and performing LD pruning. We combined the newly generated genotypes with publicly available datasets comprising 4,551 individuals from 140 different populations, including 107 Eurasian populations^{30,40,41,42,43,44,45,46,47,48,49,50,51,52}, using the --bmerge function in PLINK version 1.9⁵³. Before merging, we removed all markers and individuals with more than 5% of missing data. The resulting dataset is composed of 5,830 individuals and 85,310 SNPs.
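The sample-level thresholds described above can be sketched as a simple filter. This is a minimal illustration rather than the pipeline actually used: the `Sample` record, its field names, and the example values are hypothetical, and only the three thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else is illustrative.
MIN_CONC_NG_UL = 30.0      # concentration >= 30 ng/µl
MIN_RATIO_260_280 = 1.6    # 260/280 ratio must be strictly greater than 1.6
MIN_CALL_RATE = 0.97       # samples below a 97% call rate are discarded


@dataclass
class Sample:
    sample_id: str
    conc_ng_ul: float
    ratio_260_280: float
    call_rate: float


def passes_dna_qc(s: Sample) -> bool:
    """Pre-genotyping check on DNA concentration and purity."""
    return s.conc_ng_ul >= MIN_CONC_NG_UL and s.ratio_260_280 > MIN_RATIO_260_280


def passes_call_rate(s: Sample) -> bool:
    """Post-genotyping check on the per-sample call rate."""
    return s.call_rate >= MIN_CALL_RATE


# Hypothetical records, for illustration only.
samples = [
    Sample("S1", 45.0, 1.85, 0.995),
    Sample("S2", 28.0, 1.90, 0.990),  # fails concentration
    Sample("S3", 60.0, 1.60, 0.990),  # fails 260/280 (not strictly > 1.6)
    Sample("S4", 35.0, 1.80, 0.960),  # fails call rate
]
kept = [s.sample_id for s in samples if passes_dna_qc(s) and passes_call_rate(s)]
print(kept)
```

In practice the two checks happen at different stages (before and after genotyping); they are combined here only to keep the example short.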

Principal component analysis (PCA)

To explore the genetic variation of the BioSUD cohort, we performed a PCA. We retained only the Eurasian populations from the complete dataset and discarded variants and individuals with missingness rates higher than 5%. After pruning SNPs in high linkage disequilibrium (--indep-pairwise 200 50 0.4), 69,359 SNPs and 3,530 samples remained, and the PLINK files were converted to EIGENSTRAT format using convertf (version 5722).
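As an illustration of the filtering and pruning step, the sketch below assembles the corresponding PLINK 1.9 command lines as Python argument lists without running them. The file prefixes are hypothetical, and combining the missingness filters with pruning in a single call is an assumption; the text reports only the thresholds and the `indep-pairwise` parameters.

```python
import shlex
from typing import List


def plink_prune_commands(bfile: str, out: str) -> List[List[str]]:
    """Return two PLINK 1.9 invocations: find the prune set, then apply it."""
    find_prune_set = [
        "plink", "--bfile", bfile,
        "--geno", "0.05",   # drop variants with more than 5% missing calls
        "--mind", "0.05",   # drop individuals with more than 5% missing calls
        # window size (variants), step size (variants), r^2 threshold
        "--indep-pairwise", "200", "50", "0.4",
        "--out", out,
    ]
    apply_prune_set = [
        "plink", "--bfile", bfile,
        "--geno", "0.05", "--mind", "0.05",
        "--extract", f"{out}.prune.in",  # variants kept by the pruning step
        "--make-bed", "--out", f"{out}_pruned",
    ]
    return [find_prune_set, apply_prune_set]


# Hypothetical prefixes, for illustration only.
for cmd in plink_prune_commands("eurasia", "eurasia_ld"):
    print(shlex.join(cmd))
```

Building the commands as lists keeps each flag and its value explicit and makes them easy to pass to `subprocess.run` once the file names are known.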

We performed the PCA using SmartPCA (version 16000) from the EIGENSOFT package⁵⁴. Specifically, we projected the BioSUD samples into the principal component space inferred from all other Eurasian individuals (using the poplistname option). Outliers were automatically removed with the numoutlieriter, numoutlierevec, and outliersigmathresh options set to default parameters. This process led to removing 102 samples, reducing the sample size to 3,428 individuals. After visual inspection of the PCA plot, we identified and removed three additional BioSUD participants falling outside the genetic variability of the cohort (Fig. S1 - Supplementary Materials). The final dataset comprised 3,425 individuals, including samples from the BioSUD cohort and the Eurasian populations from the publicly available datasets (Table S1 - Supplementary Materials).
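The projection set-up can be sketched as a smartpca parameter file generated from Python. The file prefix and population-list name are hypothetical, and the actual parameter file used in the study is not given in the text. The sketch relies on the documented behaviour of `poplistname`: components are inferred from the listed populations only, and all remaining individuals are projected onto them.

```python
def smartpca_parfile(prefix: str, poplist: str) -> str:
    """Build the text of a smartpca parameter file.

    Components are inferred only from populations listed in `poplist`;
    individuals from any other population are projected onto them.
    """
    lines = [
        f"genotypename: {prefix}.geno",
        f"snpname: {prefix}.snp",
        f"indivname: {prefix}.ind",
        f"evecoutname: {prefix}.evec",
        f"evaloutname: {prefix}.eval",
        f"poplistname: {poplist}",
        # Outlier-removal options are deliberately left out so that
        # smartpca falls back to its built-in defaults.
    ]
    return "\n".join(lines) + "\n"


# Hypothetical names: the population list would contain every Eurasian
# reference population and omit the BioSUD label.
print(smartpca_parfile("eurasia_pruned", "reference_pops.txt"))
```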

ROH Estimation

The --homozyg function in PLINK was used to detect Runs Of Homozygosity (ROHs) containing at least 50 SNPs. The minimum ROH length was set to 1,500 kb to exclude short ROHs due to Linkage Disequilibrium (LD). ROHs were detected by scanning the genotypes of each BioSUD individual and of all other individuals sharing the same set of SNPs.
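The per-individual ROH totals produced by this step can be summarized as in the sketch below. It assumes a whitespace-delimited table with the column layout of PLINK's `.hom.indiv` output (`FID IID PHE NSEG KB KBAVG`); the three rows are invented for illustration and are not study data. The thresholds in the text presumably map to the PLINK options `--homozyg-snp 50` and `--homozyg-kb 1500`.

```python
import statistics

# Invented example rows in .hom.indiv layout; not study data.
HOM_INDIV = """\
FID IID PHE NSEG KB KBAVG
F1 I1 -9 6 12000.5 2000.08
F2 I2 -9 8 16000.0 2000.0
F3 I3 -9 30 90000.0 3000.0
"""


def summarize_roh(text: str) -> dict:
    """Summarize segment counts and total ROH length (kb) per individual."""
    rows = [line.split() for line in text.strip().splitlines()]
    header, body = rows[0], rows[1:]
    kb_col = header.index("KB")
    nseg_col = header.index("NSEG")
    totals_kb = [float(r[kb_col]) for r in body]
    nsegs = [int(r[nseg_col]) for r in body]
    return {
        "mean_nseg": statistics.mean(nsegs),
        "mean_kb": statistics.mean(totals_kb),
        "median_kb": statistics.median(totals_kb),
        "sd_kb": statistics.stdev(totals_kb),  # sample standard deviation
    }


summary = summarize_roh(HOM_INDIV)
print(summary)
```

Reporting the median alongside the mean is useful here because a few individuals with very long total ROH can pull the mean and SD upwards.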

Admixture analysis

To increase the number of SNPs analyzed while maintaining a proper sample size for the population, we confined the analysis to data from the 1000 Genomes Program⁴⁰ and Raveane et al.³¹. The resulting dataset comprised 2,680 individuals: 1,401 from European (Finnish: FIN; Central Europeans: CEU; British from England and Scotland: GBR; Iberians from Spain: IBS; Italians: ITA; Tuscans: TSI), African (Luhya from Webuye, Kenya: LWK; Yoruba from Nigeria: YRI), and Asian (Gujarati Indians from Houston: GIH; Dai Chinese: CDX; Japanese from Tokyo: JPT) populations, and 1,279 from the BioSUD sample (Table S1 - Supplementary Materials).

We performed ten independent repetitions (using the current time as the random seed with the --seed option) for each K value ranging from two to ten using the ADMIXTURE software tool (version 1.3)⁵⁵. We inferred the “optimal” number of K using the cross-validation (CV) procedure with the --cv option and observed the lowest CV error at K = 7 and 8 (Fig. S2). We first conducted an ADMIXTURE analysis excluding the BioSUD data. After obtaining the initial results, we projected the BioSUD data onto the resulting ADMIXTURE profiles (-P flag) to integrate and analyze their genetic structure within the established framework.

We have also performed an unsupervised admixture analysis on the same dataset using the same procedure for comparison.
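The choice of K by cross-validation can be sketched as follows. The example assumes that each ADMIXTURE run with CV enabled writes a line of the form `CV error (K=7): 0.51234` to its log; the log lines below are made-up toy values that do not reproduce the study's results, and averaging over replicates is one reasonable convention rather than the procedure stated in the text.

```python
import re
from collections import defaultdict
from statistics import mean

CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def mean_cv_by_k(log_lines):
    """Average the CV error over replicate runs for each value of K."""
    by_k = defaultdict(list)
    for line in log_lines:
        match = CV_LINE.search(line)
        if match:
            by_k[int(match.group(1))].append(float(match.group(2)))
    return {k: mean(errors) for k, errors in sorted(by_k.items())}


# Toy log lines (two replicates per K); not study results.
logs = [
    "CV error (K=2): 0.560", "CV error (K=2): 0.562",
    "CV error (K=3): 0.540", "CV error (K=3): 0.538",
    "CV error (K=4): 0.545", "CV error (K=4): 0.547",
]
cv = mean_cv_by_k(logs)
best_k = min(cv, key=cv.get)
print(cv, best_k)
```

When two neighbouring K values give nearly identical errors, as reported here for K = 7 and 8, both are usually inspected rather than picking a single minimum.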

Questionnaire

Each participant completed a tailored paper-and-pencil questionnaire to assess the frequency, amounts, and patterns of substance use, including nicotine, alcohol, cocaine, heroin, cannabis, and other substances (Tables S1 and S2 - Supplementary Materials), also including questions from the DSM-5 TR checklist¹. This instrument explored drug-related behaviors, with a specific focus on psychosocial factors, family relationships, peer group influences, substance accessibility, and social contexts experienced during adolescence.

The questionnaire given to the ‘Emmanuel’ group and control participants was longer (233 items) than the one given to other patients (165 items) to accommodate the difficulties and time constraints experienced by individuals with addiction outside formal rehabilitation settings. The survey encompasses three main sections: sociodemographic, psychosocial, and substance use (Fig. 1).

The sociodemographic section gathers a wide range of participant data to account for the possible impact of various life factors on the study’s results. We included critical demographic details such as gender, age, education, marital status (including the number of children), place of residence and birth, income, employment status, self-reported health, and family background (parents, siblings, and other caregivers).

The psychosocial section focuses on life experiences that may have influenced the participant’s psychological and social well-being. It investigates situations such as early separations from parents, parental divorce, and relocations (Tables S2 and S3 - Supplementary Materials). In this section, we also investigated aspects closely related to the substance use section, such as substance exposure in social settings, including both family and peers, substance accessibility in the cities where participants lived, and perceived safety level. Furthermore, we explored adverse events across life stages, including grief, accidents, illness, violent crimes, sexual abuse, and other painful events. Participants reported occurrences in age categories (< 14, 14–18, 18–25, > 25 years), and we quantified the overall incidence by calculating the cumulative events across these categories. Moreover, we included questions about the perceived quality of relationships with fathers, mothers, siblings, and peers on a scale ranging from “Very Poor” to “Very Good.” We converted this categorical representation into a 5-point numerical scale (1 to 5). We aggregated cumulative scores across family members to reclassify them into categories: 1–3 as “Very Poor,” 4–6 as “Poor,” 7–9 as “Average,” 10–12 as “Good,” and 13–15 as “Very Good” for analytical purposes.
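The two scoring rules in this section can be sketched as follows. This is a minimal illustration: the intermediate labels of the 5-point item scale ("Poor", "Average", "Good") and the age-band keys are assumptions, and the sketch requires all three family ratings to be present, since the text does not say how missing ratings were handled.

```python
# Assumed 5-point coding; only the endpoints are named in the text.
LABEL_TO_SCORE = {"Very Poor": 1, "Poor": 2, "Average": 3, "Good": 4, "Very Good": 5}

# Upper bound of each cumulative band, as described in the text.
BANDS = ((3, "Very Poor"), (6, "Poor"), (9, "Average"), (12, "Good"), (15, "Very Good"))


def family_relationship_category(father: str, mother: str, siblings: str):
    """Sum the three 1-5 ratings and map the total to a category."""
    total = sum(LABEL_TO_SCORE[r] for r in (father, mother, siblings))
    for upper, label in BANDS:
        if total <= upper:
            return total, label
    raise ValueError("total outside the 1-15 range")


def cumulative_adverse_events(counts_by_age_band: dict) -> int:
    """Add up reported events across the four age categories."""
    return sum(counts_by_age_band.values())


print(family_relationship_category("Good", "Very Good", "Average"))
print(cumulative_adverse_events({"<14": 1, "14-18": 0, "18-25": 2, ">25": 1}))
```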

Lastly, the substance use section explores the consumption of nicotine, alcohol, cannabis, cocaine, heroin, and other substances (e.g., amphetamines, MDMA, ecstasy, hallucinogens, etc.). We tailored questions for each substance, adapting the DSM-5 TR checklist¹ to thoroughly examine substance use across various categories and identify potential use disorders. The exposure subsection explores family, peers, and accessibility of substance use, considering social settings and craving behaviors. Subjective feelings, including relief, reward, and obsession, are measured. We also used a Visual Analogue Scale (VAS) for the craving assessment. Positive responses to substance consumption-related questionnaire items were assigned a value of 1 and summed to create a ‘family substance consumption’ score. We then categorized the score into five ordinal levels: ‘None’ (0 positive responses), ‘Low’ (1–2 positive responses), ‘Average’ (3–4 positive responses), ‘High’ (5–6 positive responses), and ‘Very High’ (seven or more positive responses), based on average and standard deviation.
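The ‘family substance consumption’ score can be sketched as a count of positive answers followed by binning. The function name and the example answers are hypothetical, and the top band is taken here to start at seven positive responses, which is an assumption about where ‘Very High’ begins.

```python
def family_consumption_level(responses) -> tuple:
    """Count positive answers (truthy values) and map the count to a level."""
    score = sum(1 for answer in responses if answer)
    if score == 0:
        level = "None"
    elif score <= 2:
        level = "Low"
    elif score <= 4:
        level = "Average"
    elif score <= 6:
        level = "High"
    else:
        level = "Very High"  # assumed to cover seven or more
    return score, level


# Hypothetical answers to eight yes/no items.
print(family_consumption_level([True, False, True, True, False, False, False, False]))
```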

Results

The genetic variation of the BioSUD cohort

Principal component analysis

To genetically characterize the BioSUD samples, we first projected the BioSUD individuals onto the PCA space inferred from 2,150 Eurasian individuals (Fig. 2A). The PCA showed that BioSUD samples form a cluster largely overlapping with individuals from Southern Italy and partially overlapping with those from Central Italy. In contrast, the cluster was distinct from individuals from Northern Italy and Sardinia. Within Eurasia, the BioSUD samples appeared more similar to Balkan populations and Corsicans than to Iberians. Indeed, on the west side of Europe, the Iberian populations were closer to Northern Italians and Central Europeans than to the BioSUD samples and Southern Italians. Central European populations were likewise closer to Northern Italians than to the BioSUD cohort.

Runs of homozygosity

To infer the homozygosity pattern within the BioSUD cohort compared to worldwide populations, we performed an ROH analysis for genomic segments extending more than 1,500 kb and encompassing at least 50 SNPs. Our results showed that the BioSUD cohort has an average of 7.64 ROH segments for an average total length of 18,132.644 kb (Standard Deviation (SD) = 21,627.211), which is comparable with those inferred for the other European populations (ITA = 21,637.46, IBS = 22,599.718, GBR = 23,427.666). However, we inferred a large SD for the BioSUD population, possibly due to 53 individuals showing a total ROH length ranging from 32,796 kb to 349,267 kb.

Moreover, when comparing the median total ROH length of the BioSUD cohort with those of European populations, the BioSUD cohort showed the lowest value (BioSUD = 14,940.7 kb; ITA = 16,892 kb; IBS = 18,337.8 kb; GBR = 19,915.7 kb; FIN = 30,780.85 kb), suggesting that it is the most heterozygous population among those analyzed (Fig. 2B). When extending this comparison to the CDX and YRI populations, only the latter showed a higher heterozygosity rate than the BioSUD cohort (CDX = 33,066.85 kb, YRI = 9,379.32 kb).

Admixture analysis

To infer population structure, we ran ADMIXTURE for K from 2 to 10, with ten random iterations for each K (Fig. S2 - Supplementary Materials). Cross-validation yielded the lowest error at K = 7 and K = 8, which we therefore considered the optimal K values (Fig. S3 - Supplementary Materials).

Figure 2C shows the barplot summarizing the ancestral component proportions for K = 7. The BioSUD samples (Fig. 2C, lower panel) showed a composition very similar to the Italian populations (Fig. 2C, upper panel), with the main genetic components being those modal in Southern Europeans (yellow in Fig. 2C, Average (A) = 0.764, Median (M) = 0.766, SD = 0.027), Northern Europeans (blue in Fig. 2C, A = 0.100, M = 0.099, SD = 0.024) and South Asians (GIH, red in Fig. 2C, A = 0.090, M = 0.091, SD = 0.012). The components modal in African populations accounted for a minor proportion, with the East African one (LWK) accounting for 1.9% (pink in Fig. 2C, A = 0.019, M = 0.019, SD = 0.009) and the West African one (YRI) for 1.3% (green in Fig. 2C, A = 0.013, M = 0.012, SD = 0.021).

Although the SD values suggested that the BioSUD cohort is highly homogeneous, we identified three outliers that deviate from the ancestral component proportions of the other individuals. Outlier 1 showed 66% of a component modal in YRI (green in Fig. S4 - Supplementary Materials), suggesting a predominantly African ancestry. The questionnaire data confirmed this observation, showing that this individual was born in Nigeria. Outlier 2 exhibited, compared with the rest of the BioSUD cohort, a higher proportion of the component modal in the Japanese population (purple in Fig. S4 - Supplementary Materials, 0.18), together with the Northern European (blue in Fig. S4 - Supplementary Materials, 0.16) and South Asian (red in Fig. S4 - Supplementary Materials, 0.09) components. However, the additional information available for this subject did not explain this profile, which we suspect to be linked to his family history. Outlier 3 had, as expected, the component modal in Southern Europeans as the major one (yellow in Fig. S4 - Supplementary Materials, 0.52) and additionally showed higher-than-expected proportions of the components modal in Western Africans and Eastern Africans, accounting respectively for 29.0% (green in Fig. S4 - Supplementary Materials) and 12.7% (pink in Fig. S4 - Supplementary Materials). From the questionnaire data, we hypothesized that this individual had an Italian and an African parent, which would explain these results.

The results obtained from the unsupervised ADMIXTURE analysis were substantially equivalent, with the main exception of detecting a main “Apulian/BioSUD component” at K = 7, probably due to the disproportionate sample size of the Apulian cohort (Fig. S5 - Supplementary Materials).

Descriptive evaluation of the questionnaire variables

Sociodemographic data and psychosocial factors

All the 1,806 sampled individuals completed the questionnaire. The missingness rate of questions varied from 0 to 39.6% (A = 11.8%).

Both the control (1,046 males vs. 462 females) and case groups (278 males vs. 20 females) included more male than female participants (Fig. 3A). The overall sample had an average age of 40.69 years (SD = 12.31, range 18–72). As summarized in Fig. 3B, the control group had an average age of 40.43 years (SD = 12.66, range 18–72), while the cases had an average age of 42.10 years (SD = 10.15, range 20–71). High school was the most frequent level of education among participants (46.1%), followed by a university degree (25.2%), with the remaining participants distributed across middle school (16.0%), post-graduate studies (10.0%), and primary school (1.9%; Fig. 3C). A comparison of the sample data with the 2020 statistics from the Italian National Institute of Statistics (ISTAT) for Apulia shows differences in educational distribution. Relative to the Apulian ISTAT averages, the control group had higher rates of high school (49.9% vs. 31.9%) and degree-level education (30.4% vs. 12.4%) and a lower rate of middle school education (5.4% vs. 31.3%). In contrast, the cases showed lower rates of high school (31.4%) and degree-level education (3.4%), as well as a higher rate of middle school education (55.6%), than the ISTAT statistics. For employment status, the case group showed a higher proportion of individuals unemployed for over 12 months (37.7% vs. 12.4%) and a lower rate of full-time employment (34.8% vs. 58.8%; Fig. 3D). Furthermore, participants in the case group experienced more adverse events than controls overall (Fig. 3E), although the pattern varied by event type. Controls reported higher rates of bereavement (68.3%) and violent crime victimization (11.3%) than cases (51.2% and 7.3%, respectively). Conversely, serious accidents were more frequent among cases, with a rate of 15.9% compared to 8.7% among controls. Cases also reported higher rates of serious illness (2.4% vs. 0.5%), witnessing violent crimes (8.2% vs. 5.9%), and sexual abuse (4.8% vs. 1.1%) than controls.

Regarding family members’ substance use and behaviors (Fig. 3F), controls were concentrated in the lower categories of family drug consumption, with 23.9% reporting no use, 56.4% reporting low consumption, 16.7% reporting average consumption, and minimal representation in the higher categories (2.1% high and 0.9% very high). Conversely, the case group exhibited a divergent pattern: 6.8% reported no familial substance use, 39.6% reported low use, 26.1% reported average use, and a considerable percentage fell into the higher categories (13.0% high and 14.5% very high).

Most participants in the control group reported having “Good” or “Very Good” relationships with their families (71.7%), with only 2.8% reporting “Poor” or “Very Poor” relationships. In contrast, the case group exhibited a lower proportion of “Good” or “Very Good” family relationships (47.4%) and a markedly higher percentage (21.3%) reporting “Poor” or “Very Poor” quality (Fig. 3G). Similarly, for peer relationships, a substantial majority of controls reported either “Good” or “Very Good” relationships (76.0%), while the combined “Poor” and “Very Poor” categories accounted for a relatively small proportion (1.7%). The case group instead showed a lower prevalence of “Good” and “Very Good” relationships with peers (51.2%) and a higher proportion in the “Poor” or “Very Poor” categories (10.1%) (Fig. 3H).

Substance use

To assess participants’ nicotine usage, we asked individuals about their smoking habits, which are defined as using at least one unit of tobacco or nicotine-containing products each day. Among controls, 54.7% are non-smokers, 20.1% are former smokers (more than six months before the questionnaire date), 3.2% are former smokers (less than six months before the date of the questionnaire), and 22.0% are current smokers. In contrast, the case group has a different profile, with 91.3% being current smokers (Fig. 3I).

The case group showed a different pattern of alcohol consumption compared to the controls (Fig. 3J). While fewer participants in the case group reported drinking alcohol than controls (70.5% vs. 81.3%), drinkers showed heavier use, particularly “4 times a week or more” (25.7% vs. 6.8%).

The prevalence of cannabis usage among exposed controls was low, with only 8.6% reporting 30 or more uses and a combined 29.6% for less frequent use (less than 29 times), with 61.7% of the controls having never used cannabis (Fig. 3K). In contrast, the case group showed a different pattern, with a substantial 78.6% reporting 30 or more times of cannabis usage, a combined 11.7% indicating less frequent consumption (less than 29 times), and just 9.7% reporting never using cannabis.

The prevalence of cocaine usage was extremely low in the control group, with only 0.3% of exposed controls reporting 30 or more times of use and a combined 2.6% for less frequent consumption (less than 29 times). Most controls (97.2%) had never used cocaine (Fig. 3L). The case group, on the other hand, showed a markedly different pattern, with 84.0% of participants reporting 30 or more occasions of cocaine usage. Just 9.7% of cases reported never using cocaine, while 4.5% reported less frequent use.

Heroin use varied significantly between cases and controls. While almost all controls (99.9%) reported no heroin use, only 29.1% of cases never used heroin. Most of the cases (66.02%) reported using heroin 30 times or more, with the remaining 4.9% reporting less than 29 times of usage (Fig. S5E - Supplementary Materials). A similar pattern emerged for “other substances”, with most controls (97.5%) indicating no use, compared to 65.1% of cases. In contrast, 15.5% of cases used other substances 30 times or more, whereas 19.4% indicated less regular usage (Fig. S6 - Supplementary Materials).

Substance use classification - DSM-5-TR checklist

As expected, the prevalence of each SUD was higher among cases than among controls (see Table S4 - Supplementary Materials).

For Cannabis Use Disorder (CaUD), most controls (88.6%) did not meet the diagnostic criteria, while 51.3% of cases had no CaUD. Among the cases, 11.4% had mild CaUD, 9.7% had moderate CaUD, and 20.8% had severe CaUD.

In contrast, a small proportion of controls exhibited probable CaUD, with 1.5%, 0.7%, and 0.4% meeting the criteria for mild, moderate, and severe CaUD, respectively.

For Cocaine Use Disorder (CUD), 93.0% of controls did not meet the diagnostic criteria, whereas only 24.5% of cases were classified as having no CUD. Conversely, 4.4% of cases met the criteria for mild CUD, 6.7% for moderate CUD, and 59.4% for severe CUD, while the prevalence of CUD among controls was negligible.

For Heroin Use Disorder (HUD), 93.0% of controls did not meet the criteria, whereas only 38.6% of cases were classified as having no HUD. Among the cases, 3.0% met the criteria for mild HUD, 3.4% for moderate HUD, and 50.7% for severe HUD. No control participants met the criteria for mild, moderate, or severe HUD.

For Other Substance Use Disorder (OSUD), most controls (94.3%) did not meet the diagnostic criteria, compared to 83.9% of cases. The prevalence of mild, moderate, and severe OSUD was relatively lower than that observed for other substances, with 5.0%, 1.7%, and 3.0% of cases meeting these criteria, respectively. Among controls, only 0.1% met the criteria for mild OSUD, while no control participants met the criteria for moderate or severe OSUD.
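
The mild, moderate, and severe labels used above correspond to counts of endorsed diagnostic criteria. The following is a minimal sketch, assuming the checklist was scored with the standard DSM-5-TR thresholds (2–3 of the 11 criteria for mild, 4–5 for moderate, 6 or more for severe); the function name is hypothetical and is not taken from the BioSUD pipeline.

```python
def classify_sud_severity(n_criteria: int) -> str:
    """Map a count of endorsed DSM-5-TR criteria (0-11) to a severity label.

    Assumption: standard DSM-5-TR thresholds; the scoring actually used
    for the BioSUD checklist may differ in detail.
    """
    if not 0 <= n_criteria <= 11:
        raise ValueError("DSM-5-TR lists 11 criteria per substance")
    if n_criteria >= 6:
        return "severe"
    if n_criteria >= 4:
        return "moderate"
    if n_criteria >= 2:
        return "mild"
    return "none"  # fewer than two criteria: no disorder


# Example: symptom counts for one hypothetical participant, per substance.
example_counts = {"cannabis": 3, "cocaine": 7, "heroin": 0}
example_labels = {s: classify_sud_severity(n) for s, n in example_counts.items()}
print(example_labels)
```

Applied per substance, a classification of this kind yields the CaUD, CUD, HUD, and OSUD categories summarized in the paragraphs above.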

In the control group (N = 1,508), only a small number of participants reported mild substance use disorder involving combinations of substances. Specifically, one participant (< 0.1%) reported a mild SUD involving cocaine and other substances, while another (< 0.1%) reported a combination of cocaine and cannabis. In the case group (N = 298), the most frequent mild SUD combination was cannabis and cocaine, affecting three participants (1%). For moderate SUD, cannabis and cocaine remained the most common combination, with two cases (0.7%). Severe SUD was more prevalent in the case group and showed a strong pattern of polydrug use. The most common SUD combination was in the severe range for CUD and HUD, affecting 86 participants (28.9%), followed by a combination of cocaine and cannabis use, which affected 55 participants (18.5%). Other combinations of substances were less frequent but still contributed to the overall burden of severe SUD. Specifically, 24 cases (8.1%) involved severe use of cannabis, cocaine, and heroin, five cases (1.7%) involved severe use of other substances and heroin, five cases (1.7%) involved severe use of cannabis, cocaine, and other substances, and three cases (1%) involved cannabis, heroin, and other substances.
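
The percentages reported for the severe polydrug combinations can be reproduced from the counts using the case-group denominator (N = 298). The short sketch below is only an arithmetic check; the dictionary keys are descriptive labels chosen here, not variable names from the study.

```python
N_CASES = 298  # case-group size reported in the text

# Number of cases per severe-SUD combination, as reported in the text.
severe_counts = {
    "cocaine + heroin": 86,
    "cocaine + cannabis": 55,
    "cannabis + cocaine + heroin": 24,
    "other substances + heroin": 5,
    "cannabis + cocaine + other substances": 5,
    "cannabis + heroin + other substances": 3,
}


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


severe_percentages = {
    combo: percent(n, N_CASES) for combo, n in severe_counts.items()
}
for combo, pct in severe_percentages.items():
    print(f"{combo}: {pct}%")
```

Each value matches the figure given in the text to one decimal place, which confirms that these severe combinations are expressed as fractions of the 298 cases.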

Discussion

Here, we present the first analysis of the genetic, psychosocial, sociodemographic, and SUD behavior variability within the BioSUD cohort, which comprises 1,806 individuals.

When evaluating the genomic variation of the BioSUD cohort, PCA shows that most of the samples fall within the genetic variability of Southern Italian individuals, with a few samples showing genetic profiles similar to other Italian or Western European regions. Only three samples show genetic profiles compatible with substantial ancestry from different continents. The admixture analysis confirms a shared demographic and evolutionary history, indicating that the BioSUD cohort shares a high percentage of ancestry with people from Iberian and, to a lesser extent, other European groups. This ancestral composition is comparable to that of other Italian groups³¹. However, almost all individuals show a substantial proportion of an ancestry component that is modal in Southern East Asian individuals and absent in Iberians. Moreover, a low proportion of ancestry modal in Sub-Saharan African groups is observed. These two ancestries reflect the complex demographic history of the Italian peninsula and contribute to the high heterogeneity found across the European continent. According to recent studies, Italy exhibits the highest level of genetic variation on the European continent, with significant heterogeneity among its various regional populations^31,32. Here, we confirm this heterogeneity by observing that, on average, Italian individuals have the lowest number of runs of homozygosity (ROHs) in Europe. The BioSUD sample set, mostly of Apulian descent, carries fewer ROHs than other Italian individuals, suggesting that Southern Italians are among the most genetically diverse European populations. However, more research is required to validate these findings.
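
For readers less familiar with ROHs (runs of homozygosity), the idea can be illustrated with a toy example. The sketch below simply scans one individual's genotype vector for stretches of consecutive homozygous calls; it is a deliberately simplified illustration, not the procedure used in this study, and real ROH callers additionally apply sliding windows, physical-length thresholds, and tolerances for missing or heterozygous calls. The function name and the `min_snps` parameter are hypothetical.

```python
from typing import List, Tuple


def find_roh(genotypes: List[int], min_snps: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of homozygous runs.

    genotypes: alternate-allele count per SNP along one chromosome
               (0 or 2 = homozygous, 1 = heterozygous).
    min_snps:  hypothetical minimum number of consecutive homozygous
               SNPs for a stretch to count as a run.
    """
    runs = []
    start = None  # index where the current homozygous stretch began
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # A heterozygous call ends the current stretch.
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    # Close a stretch that reaches the end of the vector.
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((start, len(genotypes) - 1))
    return runs


toy_genotypes = [0, 2, 2, 1, 0, 0, 1, 2, 2, 2, 0]
toy_runs = find_roh(toy_genotypes, min_snps=3)
print(toy_runs)
```

In this toy vector, two stretches meet the three-SNP minimum, while the two-SNP stretch between the heterozygous calls is ignored. Fewer or shorter runs of this kind across the genome are generally read as a sign of higher diversity and less recent shared ancestry between an individual's parents.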

The survey data reveal the demographic details of the population under study, including gender, age, educational attainment, and employment status in the control and case groups. The pronounced male predominance in substance consumption observed in both the control and case groups aligns with established trends in substance consumption studies: men exhibit higher rates of alcohol consumption, alcohol-related issues, and alcohol use disorder diagnoses, as well as greater use of illicit substances and higher prevalence rates of SUDs^55,56,57,58.

The educational backgrounds of the two groups differ considerably. Both high school enrollment and high school graduation rates in the case group are lower than those in the control group. This is consistent with the established correlation between drug use and lower educational achievement⁵⁹. However, a comparison with the 2020 ISTAT demographic data for the Apulia region reveals possible sampling biases in the current study. In particular, the mean educational attainment of the control group is higher than the stated regional average, while that of the case group is lower. Therefore, the observed results require cautious interpretation: their generalizability to the broader Apulian population is limited, and this bias must be accounted for in subsequent GWAS.

Psychosocial factor analysis reveals distinct patterns between case and control groups. Controls primarily report low levels of familial substance use, while cases exhibit a more diverse range, suggesting that SUDs are influenced by both environmental^60,61,62 and genetic^63,64 factors. Additionally, case participants show greater variability in reported relationship quality and are significantly more likely than control participants to describe challenging or strained familial and peer relationships^65,66.

In examining substance use, the control group predominantly consists of non-smokers, whereas the case group exhibits a markedly higher prevalence of current smokers. Individuals with SUDs often face complex challenges related to compulsive substance use, and nicotine, being highly addictive, may become intertwined with other substance use patterns^67,68,69. Shared risk factors, common neural pathways, or coping mechanisms may contribute to the increased nicotine consumption observed among participants with SUDs. Whereas the control group exhibited a pattern of regular, moderate alcohol consumption, the case group demonstrated a more polarized distribution, with a higher prevalence of abstinence (roughly 29% vs. 19%) and a significant proportion engaging in heavy drinking among those who did consume alcohol. The supervised environment of the private and public healthcare facilities where participants were recruited likely explains the higher abstinence among cases relative to controls.

Controls mostly abstained from cannabis and cocaine, except for some exposed controls, while cases showed more frequent and intense substance use, especially cannabis. Heroin and other substance use was almost absent among controls (99.9% and 97.5% reported no use, respectively), while consumption rates were markedly higher in the case group.

The case group exhibited a high prevalence of severe SUDs, predominantly related to cocaine and heroin use, as evidenced by DSM-5-TR symptom response analysis, which supported the clinical classification. The low occurrence of severe SUD cases among exposed controls further reinforced the distinction between the clinical and control groups. Polydrug use was common in severe SUD, with cocaine-heroin as the most frequent combination, followed by cocaine-cannabis and other substance use. These findings align with research on cocaine-heroin co-use in severe addiction⁷⁰. Mild and moderate SUD criteria were less frequent in the clinical group but followed similar patterns, with cannabis and cocaine being the most used substances. Even among controls, a small percentage (≤ 0.1%) met the criteria for mild SUD, indicating that some level of substance use occurs even in individuals not classified as clinical cases, referred to as exposed controls⁷¹.

Limitations and future directions

Although this study provides valuable insights, it has some limitations. The first is the current imbalance in our case-control ratio, stemming from the ongoing recruitment phase. While our target cohort is 3,000 participants, evenly distributed between cases and controls, the present analysis is based on 298 cases and 1,508 controls. This disparity necessitates a cautious interpretation of our findings. To address it, we are actively expanding our recruitment across the Apulia region. Building upon our existing partnerships with initial facilities, we have established new collaborations with centers in the Foggia (FG) and Barletta-Andria-Trani (BAT) provinces. We aim to recruit a further 1,200 cases and achieve our desired 50:50 case-control ratio. To ensure we reach our target sample size, we are also actively pursuing additional agreements with other centers throughout Apulia.
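
The recruitment gap implied by these figures can be worked out directly. The sketch below is a simple arithmetic illustration, assuming the cohort counts reported in the Results (298 cases and 1,508 controls) and the stated target of 3,000 participants split evenly; the variable names are chosen here for illustration.

```python
TARGET_TOTAL = 3000      # stated target cohort size
current_cases = 298      # cases in the present analysis
current_controls = 1508  # controls reported in the Results

target_per_group = TARGET_TOTAL // 2  # 50:50 case-control design
cases_needed = target_per_group - current_cases
controls_needed = max(0, target_per_group - current_controls)
controls_per_case = current_controls / current_cases

print(f"Target per group: {target_per_group}")
print(f"Additional cases needed: {cases_needed}")
print(f"Additional controls needed: {controls_needed}")
print(f"Current controls per case: {controls_per_case:.2f}")
```

Under these assumptions, roughly 1,200 more cases are required, in line with the stated recruitment aim, whereas the control group already meets the per-group target; the current ratio is about five controls per case.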

Another key limitation is the lower representation of female participants. This gender disparity reflects well-established epidemiological patterns of SUD prevalence and aligns with clinical referral trends: the European Drug Report 2024⁴ estimates the average male-to-female ratio among users entering treatment for cannabis, cocaine, heroin, and other drugs to be 4.39:1. However, this imbalance may constrain the generalizability of our findings to female populations. Although the current sample represents existing clinical populations, future phases of our research project will incorporate targeted recruitment strategies to address this sampling bias and enhance the external validity of our findings.

Methodological challenges inherent in control group classification within SUD research also deserve attention. The standard approach of using the absence of a formal SUD diagnosis as the sole criterion for defining controls is susceptible to overlooking subclinical or undiagnosed SUDs. Furthermore, the approach relies heavily on honest self-report, an assumption frequently undermined by social desirability bias⁷² and response biases⁷³. Misclassification is more likely when there is no external validation, such as biological markers, collateral data, or symptom validity tests. Future research should use stricter screening methods to improve control-clinical group differentiation. Moreover, the accurate assessment of drug use is hindered by its sensitive nature and concerns regarding privacy, social stigma, and legal consequences⁷⁴. This often leads to underreporting and non-response bias, especially in control groups. Future research should incorporate methodological refinements, such as standardized structured interviews and indirect questioning techniques, to mitigate these issues and promote more reliable reporting.

This study obtained self-report measures of early adverse experiences, family, and socioeconomic status but did not evaluate the effects of these factors on SUD severity and polydrug use. Future studies examining these associations and adding further psychological measures, such as executive functioning and personality, will help identify possible influencing and confounding factors of SUDs.

Despite these limitations, the current study identifies the BioSUD cohort as an essential resource for studying complex behavioral traits associated with SUDs. This cohort may be used in future studies to investigate the link between genetic and environmental factors in characterizing SUD phenotypes.

Data availability

The datasets generated and analyzed during the current study will be available in the European Genome-phenome Archive (EGA) repository, https://ega-archive.org/. For more information, contact francesco.montinaro@uniba.it.

References

American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders, 5th ed., Text Revision (DSM-5-TR) (APA, 2022).
Patnode, C. D. et al. Screening for unhealthy drug use: updated evidence report and systematic review for the US preventive services task force. JAMA 323, 2310–2328. https://doi.org/10.1001/jama.2019.21381 (2020).
Haar, K. et al. Family united: piloting a new universal UNODC family skills programme to improve child mental health, resilience and parenting skills in Indonesia and Bangladesh. Int. J. Ment Health Syst. 17 https://doi.org/10.1186/s13033-023-00602-w (2023).
European Drug Report 2024: Trends and Developments. https://www.emcdda.europa.eu/publications/european-drug-report/2024_en.
European Drug Report 2023: Trends and Developments. https://www.emcdda.europa.eu/publications/european-drug-report/2023_en.
European School Survey Project on Alcohol and Other Drugs. https://play.google.com/store/books/details?id=fceQAQAACAAJ (Publications Office of the European Union, 2016).
Belfiore, C. I. et al. Multi-level analysis of biological, social, and psychological determinants of substance use disorder and co-occurring mental health outcomes. Psychoactives 3, 194–214. https://doi.org/10.3390/psychoactives3020013 (2024).
Volkow, N. D. & Blanco, C. Substance use disorders: a comprehensive update of classification, epidemiology, neurobiology, clinical aspects, treatment and prevention. World Psychiatry: Official J. World Psychiatric Association (WPA). 22, 203–229. https://doi.org/10.1002/wps.21073 (2023).
Prom-Wormley, E. C., Ebejer, J., Dick, D. M. & Bowers, M. S. The genetic epidemiology of substance use disorder: A review. Drug Alcohol Depend. 180, 241–259. https://doi.org/10.1016/j.drugalcdep.2017.06.040 (2017).
Gelernter, J. & Polimanti, R. Genetics of substance use disorders in the era of big data. Nat. Rev. Genet. 22, 712–729. https://doi.org/10.1038/s41576-021-00377-1 (2021).
Martin, E., Schoeler, T., Pingault, J. B. & Barkhuizen, W. Understanding the relationship between loneliness, substance use traits and psychiatric disorders: A genetically informed approach. Psychiatry Res. 325 https://doi.org/10.1016/j.psychres.2023.115218 (2023).
Levey, D. F. et al. Multi-ancestry genome-wide association study of cannabis use disorder yields insight into disease biology and public health implications. Nat. Genet. 55, 2094–2103. https://doi.org/10.1038/s41588-023-01563-z (2023).
Saunders, G. R. B. et al. Genetic diversity fuels gene discovery for tobacco and alcohol use. Nature 612, 720–724. https://doi.org/10.1038/s41586-022-05477-4 (2022).
Deak, J. D. et al. Genome-wide association study in individuals of European and African ancestry and multi-trait analysis of opioid use disorder identifies 19 independent genome-wide significant risk loci. Mol. Psychiatry. 27, 3970–3979 (2022).
Zhou, H. et al. Genome-wide meta-analysis of problematic alcohol use in 435,563 individuals yields insights into biology and relationships with other traits. Nat. Neurosci. 23, 809–818. https://doi.org/10.1038/s41593-020-0643-5 (2020).
Filip, M., Smaga, I. & Przegaliński, E. The role of serotonin in nicotine abuse and addiction. Handb. Behav. Neurosci. 31, 829–841 (2020).
Kashem, M. A. et al. Long-term daily access to alcohol alters dopamine-related synthesis and signaling proteins in the rat striatum. Neurochem Int. 61, 1280–1288. https://doi.org/10.1016/j.neuint.2012.08.013 (2012).
Buck, K. J. & Finn, D. A. Genetic factors in addiction: QTL mapping and candidate gene studies implicate GABAergic genes in alcohol and barbiturate withdrawal in mice. Addiction 96, 139–149. https://doi.org/10.1046/j.1360-0443.2001.96113910.x (2001).
Hatoum, A. S. et al. The addiction risk factor: A unitary genetic vulnerability characterizes substance use disorders and their associations with common correlates. Neuropsychopharmacology 47, 1739–1745. https://doi.org/10.1038/s41386-021-01209-w (2022).
Xiao, C., Zhou, C. Y., Jiang, J. H. & Yin, C. Neural circuits and nicotinic acetylcholine receptors mediate the cholinergic regulation of midbrain dopaminergic neurons and nicotine dependence. Acta Pharmacol. Sin. 41, 1–9. https://doi.org/10.1038/s41401-019-0299-4 (2020).
Bechara, A. et al. A neurobehavioral approach to addiction: implications for the opioid epidemic and the psychology of addiction. Psychol. Sci. Public. Interest. 20, 96–127. https://doi.org/10.1177/1529100619860513 (2019).
Hardy, L., Mitchell, C., Seabrooke, T. & Hogarth, L. Drug cue reactivity involves hierarchical instrumental learning: evidence from a biconditional Pavlovian to instrumental transfer task. Psychopharmacology 234, 1977–1984. https://doi.org/10.1007/s00213-017-4605-x (2017).
Chanraud, S. et al. Brain morphometry and cognitive performance in detoxified alcohol-dependents with preserved psychosocial functioning. Neuropsychopharmacology 32, 429–438. https://doi.org/10.1038/sj.npp.1301219 (2007).
Miyake, A. & Friedman, N. P. The nature and organization of individual differences in executive functions: four general conclusions. Curr. Dir. Psychol. Sci. 21, 8–14. https://doi.org/10.1177/0963721411429458 (2012).
Day, A. M., Kahler, C. W., Ahern, D. C. & Clark, U. S. Executive functioning in alcohol use studies: A brief review of findings and challenges in assessment. Curr. Drug Abuse Rev. 8, 26–40. https://doi.org/10.2174/1874473708666150416110515 (2015).
Durazzo, T. C., Meyerhoff, D. J. & Nixon, S. J. A comprehensive assessment of neurocognition in middle-aged chronic cigarette smokers. Drug Alcohol Depend. 122, 105–111. https://doi.org/10.1016/j.drugalcdep.2011.09.019 (2012).
Brière, M. et al. Decision-Making measured by the Iowa gambling task in patients with alcohol use disorders choosing harm reduction versus relapse prevention program. Eur. Addict. Res. 25, 182–190. https://doi.org/10.1159/000499709 (2019).
Balconi, M. & Campanella, S. Advances in Substance and Behavioral Addiction: the Role of Executive Functions (Springer Nature, 2021).
Bechara, A., Noel, X. & Crone, E. A. Loss of willpower: abnormal neural mechanisms of impulse control and decision making in addiction. In Handbook of Implicit Cognition and Addiction (eds Wiers, R. W. & Stacy A. W.), 215–232 (2006).
Raveane, A. et al. Population structure of modern-day Italians reveals patterns of ancient and archaic ancestries in Southern Europe. Sci. Adv. 5, eaaw3492. https://doi.org/10.1126/sciadv.aaw3492 (2019).
Raveane, A. et al. Assessing temporal and geographic contacts across the Adriatic sea through the analysis of genome-wide data from Southern Italy. Genomics 114, 110405. https://doi.org/10.1016/j.ygeno.2022.110405 (2022).
Sarno, S. et al. Ancient and recent admixture layers in Sicily and Southern Italy trace multiple migration routes along the Mediterranean. Sci. Rep. 7, 1984 https://doi.org/10.1038/s41598-017-01802-4 (2017).
Sazzini, M. et al. Genomic history of the Italian population recapitulates key evolutionary dynamics of both continental and Southern Europeans. BMC Biol. 18, 1–19. https://doi.org/10.1186/s12915-020-00778-4 (2020).
Bycroft, C. et al. The UK biobank resource with deep phenotyping and genomic data. Nature 562, 203–209. https://doi.org/10.1038/s41586-018-0579-z (2018).
Leitsalu, L. et al. Cohort profile: Estonian biobank of the Estonian genome center, university of Tartu. Int. J. Epidemiol. 44, 1137–1147. https://doi.org/10.1093/ije/dyt268 (2015).
Hood, L. & Price, N. The Age of Scientific Wellness: why the Future of Medicine Is Personalized, Predictive, Data-Rich, and in your Hands (Harvard University Press, 2023).
Cozzoli, D. et al. Genomic and personalized medicine approaches for substance use disorders (SUDs) looking at Genome-Wide association studies. Biomedicines 9 https://doi.org/10.3390/biomedicines9121799 (2021).
International Classification of Diseases, Eleventh Revision (ICD-11), World Health Organization (WHO) 2019/2021 https://icd.who.int/browse11. Licensed under Creative Commons Attribution-NoDerivatives 3.0 IGO license (CC BY-ND 3.0 IGO).
R Development Core Team. The R Reference Manual: Base Package. Network Theory. https://cran.rstudio.com/doc/manuals/r-devel/R-admin.pdf (2024).
1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74. https://doi.org/10.1038/nature15393 (2015).
Behar, D. M. et al. No evidence from genome-wide data of a Khazar origin for the Ashkenazi Jews. Hum. Biol. 85, 859–900. https://doi.org/10.3378/027.085.0604 (2013).
Behar, D. M. et al. The genome-wide structure of the Jewish people. Nature 466, 238–242. https://doi.org/10.1038/nature09103 (2010).
Busby, G. B. J. et al. The role of recent admixture in forming the contemporary West Eurasian genomic landscape. Curr. Biol. 25, 2518–2526 (2015).
Kovacevic, L. et al. Standing at the gateway to Europe–the genetic structure of Western Balkan populations based on autosomal and haploid markers. PLoS One. 9, e105090. https://doi.org/10.1371/journal.pone.0105090 (2014).
Kushniarevich, A. et al. Genetic heritage of the Balto-Slavic speaking populations: A synthesis of autosomal, mitochondrial and Y-chromosomal data. PLoS One. 10, e0135820. https://doi.org/10.1371/journal.pone.0135820 (2015).
Li, J. Z. et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science 319, 1100–1104. https://doi.org/10.1126/science.1153717 (2008).
Ongaro, L. et al. The genomic impact of European colonization of the Americas. Curr. Biol. 29, 3974–3986e4. https://doi.org/10.1016/j.cub.2019.09.076 (2019).
Raghavan, M. et al. Upper palaeolithic Siberian genome reveals dual ancestry of native Americans. Nature 505, 87–91. https://doi.org/10.1038/nature12736 (2014).
Tambets, K. et al. Genes reveal traces of common recent demographic history for most of the Uralic-speaking populations. Genome Biol. 19 https://doi.org/10.1186/s13059-018-1522-1 (2018).
Tamm, E. et al. Genome-wide analysis of Corsican population reveals a close affinity with Northern and central Italy. Sci. Rep. 9, 13581. https://doi.org/10.1038/s41598-019-49901-8 (2019).
Yunusbayev, B. et al. The Caucasus as an asymmetric semipermeable barrier to ancient human migrations. Mol. Biol. Evol. 29, 359–365. https://doi.org/10.1093/molbev/msr221 (2012).
Yunusbayev, B. et al. The genetic legacy of the expansion of Turkic-speaking nomads across Eurasia. PLoS Genet. 11, e1005068. https://doi.org/10.1371/journal.pgen.1005068 (2015).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4 https://doi.org/10.1186/s13742-015-0047-8 (2015).
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190. https://doi.org/10.1371/journal.pgen.0020190 (2006).
Alexander, D. H., Shringarpure, S., Novembre, J. & Lange, K. Admixture 1.3 Software Manual. http://www.vcru.wisc.edu/simonlab/bioinformatics/programs/admixture/admixture-manual.pdf (UCLA, 2015).
Brady, K. T. & Randall, C. L. Gender differences in substance use disorders. Psychiatr Clin. North. Am. 22, 241–252. https://doi.org/10.1016/s0193-953x(05)70074-5 (1999).
Grant, B. F. et al. Epidemiology of DSM-5 drug use disorder: results from the National Epidemiologic Survey on alcohol and related conditions–III. JAMA Psychiatry. 73, 39–47. https://doi.org/10.1001/jamapsychiatry.2015.2132 (2016).
Lev-Ran, S., Le Strat, Y., Imtiaz, S., Rehm, J. & Le Foll, B. Gender differences in prevalence of substance use disorders among individuals with lifetime exposure to substances: results from a large representative sample. Am. J. Addict. 22, 7–13. https://doi.org/10.1111/j.1521-0391.2013.00321.x (2013).
Wilsnack, R. W. et al. Gender differences in alcohol consumption and adverse drinking consequences: cross-cultural patterns. Addiction 95, 251–265. https://doi.org/10.1046/j.1360-0443.2000.95225112.x (2000).
Palmer, R. H. C. et al. Genetic etiology of the common liability to drug dependence: evidence of common and specific mechanisms for DSM-IV dependence symptoms. Drug Alcohol Depend. 123(Suppl 1), 24–32. https://doi.org/10.1016/j.drugalcdep.2011.12.015 (2012).
Charles, N. E. et al. Altered developmental trajectories for impulsivity and sensation seeking among adolescent substance users. Addict. Behav. 60, 235–241. https://doi.org/10.1016/j.addbeh.2016.04.016 (2016).
Silberg, J., Rutter, M., D’Onofrio, B. & Eaves, L. Genetic and environmental risk factors in adolescent substance use. J. Child. Psychol. Psychiatry. 44, 664–676. https://doi.org/10.1111/1469-7610.00153 (2003).
Teixidó-Compañó, E. et al. Differences between men and women in substance use: the role of educational level and employment status. Gac Sanit. 32, 41–47. https://doi.org/10.1016/j.gaceta.2016.12.017 (2018).
Waldron, J. S., Malone, S. M., McGue, M. & Iacono, W. G. A Co-Twin control study of the relationship between adolescent drinking and adult outcomes. J. Stud. Alcohol Drugs. 79, 635–643. https://doi.org/10.15288/jsad.2018.79.635 (2018).
Fairbairn, C. E. & Cranford, J. A. A multimethod examination of negative behaviors during couples interactions and problem drinking trajectories. J. Abnorm. Psychol. 125, 805–810. https://doi.org/10.1037/abn0000186 (2016).
Fairbairn, C. E. & Sayette, M. A. A social-attributional analysis of alcohol response. Psychol. Bull. 140, 1361–1382. https://doi.org/10.1037/a0037563 (2014).
Kelly, P. J. et al. Prevalence of smoking and other health risk factors in people attending residential substance abuse treatment. Drug Alcohol Rev. 31, 638–644. https://doi.org/10.1111/j.1465-3362.2012.00465.x (2012).
Lien, L., Bolstad, I. & Bramness, J. G. Smoking among inpatients in treatment for substance use disorders: prevalence and effect on mental health and quality of life. BMC Psychiatry. 21 https://doi.org/10.1186/s12888-021-03252-9 (2021).
Mendelsohn, C. P. & Wodak, A. Smoking cessation in people with alcohol and other drug problems. Aust Fam Physician. 45, 569–573 (2016).
Leri, F., Bruneau, J. & Stewart, J. Understanding polydrug use: review of heroin and cocaine co-use. Addiction 98, 7–22. https://doi.org/10.1046/j.1360-0443.2003.00236.x (2003).
Polimanti, R. et al. Leveraging genome-wide data to investigate differences between opioid use vs. opioid dependence in 41,176 individuals from the psychiatric genomics consortium. Mol. Psychiatry. 25, 1673–1687. https://doi.org/10.1038/s41380-020-0677-9 (2020).
Latkin, C. A., Edwards, C., Davey-Rothwell, M. A. & Tobin, K. E. The relationship between social desirability bias and self-reports of health, substance use, and social network factors among urban substance users in Baltimore, Maryland. Addict. Behav. 73, 133–136. https://doi.org/10.1016/j.addbeh.2017.05.005 (2017).
Giromini, L., Young, G. & Sellbom, M. Assessing negative response bias using self-report measures: new articles, new issues. Psychol. Injury Law. 15, 1–21. https://doi.org/10.1007/s12207-022-09444-2 (2022).
Tourangeau, R. & Yan, T. Sensitive questions in surveys. Psychol. Bull. 133, 859–883. https://doi.org/10.1037/0033-2909.133.5.859 (2007).

Funding

We acknowledge financial support under the #NEXTGENERATIONEU (NGEU), which is funded by the Ministry of University and Research (MUR), National Recovery and Resilience Plan (NRRP), and project MNESYS (PE0000006) – A Multiscale integrated approach to the study of the nervous system in health and disease (DN. 1553 11.10.2022) (L.D.G. and M.V.). We acknowledge financial support under the National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, Investment 1.1, Call for tender No. 104 published on 2.2.2022 by the Italian Ministry of University and Research (MUR), funded by the European Union – NextGenerationEU – Project Title Telomere-to-telomere sequencing: the new era of Centromere and neocentromere eVolution (CenVolution) – CUP H53D23003260006 - Grant Assignment Decree No. 1015 adopted on 7 July 2023 by the Italian Ministry of University and Research (MUR) (M.V.) and Project Title SUDWAY: Substance Use Disorders through Whole genome, psychological and neuro-endophenotypes AnalYsis – CUP H53D23003310006 - Grant Assignment Decree No. 1015 adopted on 7 July 2023 by the Italian Ministry of University and Research (MUR) (F.M.). F.M. was supported by Fondazione con il Sud (2018-PDR-01136). This work was supported by the Italian Ministry of University and Research (MUR) grant PRIN 2020 (project code 2020J84FAM, CUP H93C20000040001) to F.A.

Author information

Raffaella Maria Ribatti, Luciana de Gennaro and Alessia Daponte contributed equally to this work.

Authors and Affiliations

Department of Bioscience, Biotechnology and Environment, University of Bari “Aldo Moro”, Bari, Italy
Raffaella Maria Ribatti, Luciana de Gennaro, Alessia Daponte, Danilo Cozzoli, Francesco Perrone, Francesca Antonacci, Claudia Rita Catacchio, Mario Ventura & Francesco Montinaro
Associazione Comunità Emmanuel ETS, Lecce, Italy
Danilo Cozzoli & Vincenzo Leone
Servizio Dipendenze (SerD) Martina Franca, Dipartimento dipendenze patologiche ASL Martina Franca (TA), Taranto, Italy
Maria Rita Quaranta
U.O. Medicina Trasfusionale AOU Policlinico Bari, Bari, Italy
Angelo Ostuni & Margherita Casanova
Dipartimento Dipendenze Patologiche ASL Taranto, Taranto, Italy
Vincenza Ariano
Dipartimento Dipendenze Patologiche ASL Lecce, Lecce, Italy
Salvatore Della Bona
Dipartimento Dipendenze Patologiche - Ser.D. Bari, Bari, Italy
Angela Lacalamita
Struttura Sovradistrettuale Dipendenze Patologiche ASL Brindisi, Brindisi, Italy
Salvatore De Fazio & Daniela Lorusso
Institute of Genomics, University of Tartu, Tartu, Estonia
Mait Metspalu & Francesco Montinaro
Department of Biology and Biotechnology “Lazzaro Spallanzani”, University of Pavia, Pavia, Italy
Antonio Torroni & Anna Olivieri
Department of Chemistry, Life Science and Environmental Sustainability, University of Parma, Parma, Italy
Cristian Capelli

Authors

Raffaella Maria Ribatti
Luciana de Gennaro
Alessia Daponte
Danilo Cozzoli
Maria Rita Quaranta
Angelo Ostuni
Margherita Casanova
Vincenza Ariano
Vincenzo Leone
Francesco Perrone
Salvatore Della Bona
Angela Lacalamita
Salvatore De Fazio
Daniela Lorusso
Mait Metspalu
Antonio Torroni
Anna Olivieri
Cristian Capelli
Francesca Antonacci
Claudia Rita Catacchio
Mario Ventura
Francesco Montinaro

Contributions

All authors contributed to the conception, design, data acquisition, analysis, interpretation, drafting, and revision of the manuscript. They also approved the final version and agreed to be accountable for all aspects of the work. Specific contributions: F.M., M.V.: Conceptualization, methodology, formal analysis, data curation, writing – review & editing, visualization. A.T., A.O., C.C.: Data curation. M.Q., A.O., M.C., V.A., V.L., S.D., A.L., S.D., D.L.: Data sampling. R.M.R., L.D.G., A.D., D.C., F.P.P., M.M.: Investigation, sample processing. R.M.R., L.D.G., A.D., F.P.: Writing – original draft preparation. F.A., C.R.C.: Writing – review & editing, visualization. F.M. and M.V. are the corresponding authors.

Corresponding author

Correspondence to Francesco Montinaro.

Ethics declarations

Ethics approval

Ethical approval for this study was obtained from the Ethics Committee of the Department of Brindisi (CE 149/19), Taranto (CE 150/19), Bari (CE 0019544), and Lecce (CE 47/2020), in compliance with the Declaration of Helsinki. Informed consent was secured from all participants before enrolment, with explicit disclosure of their right to opt out without specifying a reason.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Supplementary Material 4

Supplementary Material 5

Supplementary Material 6

Supplementary Material 7

Supplementary Material 8

Supplementary Material 9

Supplementary Material 10

Supplementary Material 11

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Ribatti, R.M., de Gennaro, L., Daponte, A. et al. The BioSUD Biobank as a genomic resource for substance use disorders in Italy. Sci Rep 15, 21817 (2025). https://doi.org/10.1038/s41598-025-05211-w

Received: 19 September 2024
Accepted: 02 June 2025
Published: 01 July 2025
Version of record: 01 July 2025
DOI: https://doi.org/10.1038/s41598-025-05211-w
