Introduction

The advent of artificial intelligence (AI) technology, notably marked by the emergence of ChatGPT in 2022, heralded a paradigm shift in AI development, increasingly emphasizing practical applications (Noy and Zhang, 2023). AI has rapidly ascended to prominence as a cutting-edge research field, with its applications permeating various sectors of the labour market (Ahmed et al. 2023). Its transformative impact has infused new dynamism into the global economy (Shao et al. 2022), with McKinsey & Company projecting the global AI application market to reach $1.27 trillion by 2025, potentially creating 12 million job opportunities (Forbes Technology Council, 2022). By 2030, AI technologies are anticipated to contribute an estimated $13 trillion to the global economy (Coccia, 2016). Thus, AI technology, as a driving force behind a new technological revolution, has demonstrated significant spillover effects on the economic growth of nations and regions, catalysing changes in production methods and corporate structures (Hauer, 2022; Yang, 2022).

The extensive application of AI across the labour market can be attributed to its inherently interdisciplinary nature (Yang, 2022). Transcending traditional academic silos, AI integrates diverse disciplines (Wang et al. 2023). Since its formal inception in 1956 (Sheikh et al. 2023), AI has witnessed computer science emerging as its dominant discipline. However, AI is, in essence, an amalgamation of multiple fields, encompassing not only computer-related disciplines but also extending to mathematics, statistics, physics, chemistry, literature, and beyond (Deming and Noray, 2020). The intersection of these varied disciplines within the AI framework is becoming increasingly evident. A seminal moment in this evolution was the advent of deep learning technologies, exemplified by AlphaGo (Häuselmann, 2022), prompting a rapid expansion and evolution of AI, as it integrates non-traditional disciplines (Rafner et al. 2023). This interdisciplinary expansion is reshaping labour market trends, with industries traditionally aligned with specific disciplines now actively seeking AI expertise. The healthcare industry, for instance, is leveraging AI to enhance remote healthcare systems (Goldfarb et al. 2020), while the automotive sector is competing for AI talent to pioneer autonomous driving technologies. In finance, demand for machine learning and AI-related skills is growing at a rate six times faster than other positions (Frank et al. 2019).

Despite a plethora of research on AI’s impact on the workforce in the past decade (Acemoglu and Restrepo, 2019; Alekseeva et al. 2021; Anon, 2021; Braxton and Taska, 2022), there is a notable gap in studies focusing on AI roles outside computer science. This oversight fails to accurately capture the burgeoning demand for AI talent across various disciplines. Previous studies examining the intersection of AI with traditional disciplines have predominantly remained speculative (Börner et al. 2018; Niemi, 2021; Xu et al. 2021), lacking empirical labour market evidence, and thus falling short in providing concrete guidance for talent development. Critical issues such as the interplay between AI and diverse disciplines, evolving demand for AI talent in the labour market, and the positioning of each discipline within the AI framework warrant comprehensive exploration.

In light of these considerations, our study pivots to statistics, a traditional discipline, to investigate its integration with AI from a labour market perspective. Our choice of statistics is underpinned by three key rationales. First, statistics has been a cornerstone in AI’s foundation, providing essential theoretical underpinnings through tools like probability theory and inferential statistics, particularly during AI’s formative years in the mid-20th century (Goos and Manning, 2007). Additionally, the robust presence of statistics across various industries in the U.S. labour market offers a rich pool of recruitment samples for AI talents. Moreover, the widespread application of statistical methods and theories across disciplines such as mathematics, biology, economics, sociology, etc., positions statistics as a versatile field (Friedrich et al. 2022), making our findings potentially extendable to other disciplines and predicting future AI integration in humanities and social sciences (Larivière and Gingras, 2010).

This study utilizes natural language processing (NLP) techniques to analyze an extensive dataset comprising 280 million recruitment records from the U.S. labour market (2010–2022). We identified positions linking statistics with AI job listings and conducted an in-depth analysis of these specific roles. The primary objective is to delineate the temporal trajectory of AI talent demand in the field of statistics. We have also constructed an interdisciplinary cluster network centered around statistics in AI roles using Gephi. In addition, we synthesized key insights on the requisite skills, certifications, educational backgrounds, and industry experiences for AI talent in statistics. Finally, we added data on over 2 million actual AI positions in the U.S. spanning from 2010–2022 to complement our analysis of job posting data. This addition enables a more comprehensive understanding of the dynamics within the labour market.

Our research aims to elucidate the integration patterns between traditional disciplines and AI within the labour market, providing actionable insights and recommendations for the workforce, employers, educational institutions, and policymakers. Specifically, we address the following questions:

  1. What is the actual demand for statistical AI talents in the labour market?
  2. How are interdisciplinary integration patterns emerging for statistical AI talents in the labour market?
  3. What skills and certifications are becoming increasingly important in AI talent recruitment?
  4. Who is most impacted by AI substitution in the statistics field, in terms of skills and occupations?
  5. In the realm of statistical AI talents, which holds greater significance: academic education or industry experience?
  6. Is the demand for AI talents in statistics influenced by geographical locations and economic sectors?

Methods

Data source

The data employed in this study mainly originates from Emsi Burning Glass Technologies (BG), a comprehensive aggregator of online job postings. BG utilizes advanced algorithmic methods to systematically analyze ~40,000 digital job boards and organizational websites daily. This extensive dataset encompasses a broad array of information, including occupational categories, recruiting entities, geographical locations, as well as detailed skill sets and educational prerequisites specified in each job listing. Furthermore, BG has augmented its database by incorporating the complete textual content of each job posting. This text data undergoes algorithmic processing to extract and categorize key terms and phrases, thus identifying specific qualifications required for the respective positions. This approach enables the coding of a diverse range of skills - for instance, proficiency in Microsoft Excel - across the myriad of job listings gathered. Currently, the dataset comprises 279.87 million complete job records from 2010 to 2022, encapsulating virtually all online job vacancies and ~60–70% of offline vacancies in the United States.

The representativeness and reliability of the BG database have been substantiated over time. It consistently reflects patterns observable in U.S. census data (Hershbein and Kahn, 2018), and previous studies have corroborated the alignment of its occupational and industry compositions with those reported by the U.S. Bureau of Labor Statistics (BLS) (Acemoglu et al. 2022).

We also use an actual hiring database, Revelio Labs, to compare and assess the reliability of our findings in the labour market. Revelio Labs, a leading provider of labour market analytics, continually gathers information from employees’ online profiles and resumes on various job-related and social media websites such as LinkedIn. The dataset includes specific characteristics of each company’s employees and positions, such as the employee’s gender, highest degree, and skills, and the position’s duration, seniority, and salary. We selected all U.S. companies from 2010 to 2022 as a comparison, covering 262,585 companies with 2,090,658 AI positions.

Research process

Step 1: Data matching and cleaning

As presented in Fig. 1, our approach began by consolidating the BG database and the Revelio Labs database, both of which store each variable in a distinct sub-database. Utilizing the unique ID assigned to each job listing, we matched the corresponding skills, certificates, and discipline requirements in the BG database. Given the occurrence of synonymous terminologies in the data (e.g., ‘statistics’ and ‘stats’), we standardized these variations for consistency. We also matched positions and skills from the Revelio Labs database by position ID. Additionally, the process entailed the rectification of missing data and outliers within the database.
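The matching and standardization step described above can be illustrated with a minimal sketch. The field name `job_id`, the synonym table, and the toy records are all hypothetical; the actual BG and Revelio Labs schemas are not given in the text.

```python
from collections import defaultdict

# Hypothetical synonym table; the mapping actually used in the study is not given.
SYNONYMS = {"stats": "statistics", "statistic": "statistics"}

def normalize(term):
    """Lower-case, trim, and map known synonyms to one canonical label."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)

def match_by_id(postings, skills, disciplines):
    """Attach skill and discipline rows to postings through a shared job ID."""
    skills_by_id = defaultdict(set)
    for job_id, skill in skills:
        skills_by_id[job_id].add(normalize(skill))
    disciplines_by_id = defaultdict(set)
    for job_id, discipline in disciplines:
        disciplines_by_id[job_id].add(normalize(discipline))

    merged = []
    for post in postings:
        job_id = post.get("job_id")
        if job_id is None:  # drop records whose matching key is missing
            continue
        merged.append({
            **post,
            "skills": sorted(skills_by_id.get(job_id, ())),
            "disciplines": sorted(disciplines_by_id.get(job_id, ())),
        })
    return merged

# Toy records, not real data.
postings = [{"job_id": 1, "year": 2020}, {"job_id": None, "year": 2021}]
skills = [(1, "Python"), (1, "Machine Learning")]
disciplines = [(1, "Stats"), (1, "statistics")]
merged = match_by_id(postings, skills, disciplines)
```

Here the two spellings of the discipline collapse to a single `statistics` label, and the record without an ID is discarded rather than imputed. This is only one possible way of handling missing keys.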

Fig. 1: Research process.

Step 2: identification of AI positions in statistics

AI-related positions within the realm of statistics were identified based on the skills and disciplines associated with each job. The skill dataset encompasses over 10,000 standardized skills, aggregated into skill clusters. Previous methodologies employed for annotating AI roles, both narrow and broad, have demonstrated robustness (Acemoglu et al. 2022). In this study, we adopted a more stringent AI annotation approach, categorizing positions as AI-related if they encompass core concepts such as artificial intelligence (AI), machine learning (ML), natural language processing (NLP), and computer vision (CV) (Brynjolfsson and Mitchell, 2017) (see Footnote 1).
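As a rough illustration of this annotation rule, the sketch below flags a posting when it lists statistics as a discipline and at least one skill mentions a core AI concept. The keyword list is an assumption derived from the four concepts named above; the exact skill list used in the study is not reproduced here.

```python
# Assumed keyword list, based only on the four core concepts named in the text.
AI_TERMS = (
    "artificial intelligence",
    "machine learning",
    "natural language processing",
    "computer vision",
)

def is_ai_statistics_posting(skills, disciplines):
    """Return True if the posting requires statistics and names a core AI concept."""
    has_statistics = any(d.strip().lower() == "statistics" for d in disciplines)
    has_ai_skill = any(term in skill.lower() for skill in skills for term in AI_TERMS)
    return has_statistics and has_ai_skill

# Toy postings, not real data.
flag_a = is_ai_statistics_posting(["Machine Learning", "SQL"], ["Statistics"])
flag_b = is_ai_statistics_posting(["SQL", "Microsoft Excel"], ["Statistics"])
flag_c = is_ai_statistics_posting(["Computer Vision"], ["Economics"])
```

Matching on full phrases rather than abbreviations such as "AI" or "ML" avoids false positives from unrelated strings, at the cost of missing postings that use only the abbreviation.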

Step 3: data analysis and visualization

Subsequent data analysis was categorized by time and discipline, with ‘discipline’ referring to fields concurrent with statistics in job requirements. The analytical findings were visualized using heatmaps, network graphs, and combination charts. The heatmap employed is a statistical representation indicating the frequency of categories through varying colour tones, while the network graph, delineated by a clustering algorithm, is elucidated in subsequent sections.
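The values behind such a heatmap are yearly category shares. A minimal sketch of that computation follows, assuming each record is a hypothetical `(year, category)` pair.

```python
from collections import Counter, defaultdict

def yearly_shares(records):
    """records: iterable of (year, category). Returns {year: {category: share}}."""
    counts = defaultdict(Counter)
    for year, category in records:
        counts[year][category] += 1
    shares = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {category: n / total for category, n in counter.items()}
    return shares

# Toy records, not real data.
records = [
    (2010, "Statistician"), (2010, "Statistician"), (2010, "Data Analyst"),
    (2022, "Data Analyst"), (2022, "Data Analyst"),
    (2022, "Statistician"), (2022, "Machine Learning Engineer"),
]
shares = yearly_shares(records)
```

Each row of the heatmap would then correspond to one category and each column to one year, with the colour tone set by the share.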

Development of the disciplinary cluster algorithm

In advancing our analytical methodology, this study integrates a suite of big data algorithms, inclusive of natural language processing (NLP) AI techniques. These algorithms are instrumental in refining and structuring the text data, thereby establishing an authentic correlation between recruitment data and academic disciplines. This process facilitates the formation of an artificial intelligence disciplinary cluster network, which integrates statistics with other disciplines. This approach encompasses two primary facets: firstly, the intricate analysis of the disciplinary network, and secondly, the assessment of the integrity and coherence of each identified disciplinary cluster.
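One plausible way to build such a network is to weight each pair of disciplines by the number of postings in which they appear together. The text does not state how the edge weights were defined, so the co-occurrence count below is an assumption.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_weights(postings_disciplines):
    """Count how often each unordered pair of disciplines shares a posting."""
    weights = Counter()
    for disciplines in postings_disciplines:
        # Sorting gives every pair one canonical (a, b) key with a < b.
        for a, b in combinations(sorted(set(disciplines)), 2):
            weights[(a, b)] += 1
    return weights

# Toy postings, not real data.
toy_postings = [
    ["statistics", "mathematics", "computer science"],
    ["statistics", "economics"],
    ["statistics", "mathematics"],
]
weights = cooccurrence_weights(toy_postings)
```

The resulting dictionary can be exported as a weighted edge list for a tool such as Gephi.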

Our methodological framework is particularly influenced by the heuristic algorithm centered on modularity optimization (Blondel et al. 2008). The effectiveness and precision of this dissection are quantified through the modularity of each cluster group. Modularity, in this context, is computed according to the following formula:

$$Q=\frac{1}{2m}\mathop{\sum} \limits_{i,\,j}\left[{A}_{{ij}}-\frac{{k}_{i}{k}_{j}}{2m}\right]\delta \left({c}_{i},{c}_{j}\right)$$
(1)

In the provided formulas, \({A}_{{ij}}\) represents the weight of the connection between discipline nodes i and j, \({k}_{i}\) is the sum of weights of all edges connected to discipline node i, \({c}_{i}\) and \({c}_{j}\) are the combination indices of discipline nodes i and j, δ(\({c}_{i}\), \({c}_{j}\)) indicates whether discipline nodes i and j belong to the same combination (1 if they are the same, 0 otherwise), and m is the sum of connection weights in the entire network.
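To make Eq. (1) concrete, the following sketch evaluates Q for a small undirected weighted graph. The edge-dictionary representation and the toy graph (two triangles joined by one edge) are illustrative choices, not the study's data.

```python
def modularity(edges, community):
    """Evaluate Eq. (1) for an undirected weighted graph without self-loops.

    edges: {(u, v): weight}, each undirected edge listed once.
    community: {node: community label}.
    """
    m = sum(edges.values())  # total edge weight in the network
    degree = {}
    for (u, v), w in edges.items():
        degree[u] = degree.get(u, 0.0) + w
        degree[v] = degree.get(v, 0.0) + w

    # Sum of A_ij over ordered pairs in the same community:
    # every intra-community edge is seen twice, as (i, j) and (j, i).
    intra = sum(2.0 * w for (u, v), w in edges.items()
                if community[u] == community[v])

    # Sum of k_i * k_j over ordered same-community pairs equals the
    # sum over communities of (total degree of the community) squared.
    total_degree = {}
    for node, k in degree.items():
        label = community[node]
        total_degree[label] = total_degree.get(label, 0.0) + k
    expected = sum(t * t for t in total_degree.values()) / (2.0 * m)

    return (intra - expected) / (2.0 * m)

# Toy graph: two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1, ("c", "d"): 1}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}
q_split = modularity(edges, split)
q_single = modularity(edges, single)
```

Splitting the graph into its two triangles gives Q = 5/14, whereas placing every node in one group gives Q = 0, which is why the first partition is preferred.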

The algorithm consists of two phases that are repeated iteratively. Following past researchers, we start from a weighted network with N discipline nodes (Šubelj and Bajec, 2011). Initially, each discipline node is assigned to a distinct group, so the number of groups in the initial partition equals the number of nodes. For each discipline node i, its adjacent discipline nodes j are evaluated, and the modularity gain ΔQ obtained by removing i from its group and placing it in the group of j is calculated. Node i is then placed in the discipline cluster group that maximizes the gain, provided the gain is positive; otherwise it stays in its original group. This process is applied sequentially to all disciplines, and repeated, until no further improvement is possible, completing the first phase. The order in which nodes are visited can affect the computational efficiency of the heuristic algorithm. The gain is computed as follows:

$$\Delta Q=\left[\frac{{\sum }_{{in}}+2{k}_{i,{in}}}{2m}-{\left(\frac{{\sum }_{{tot}}+{k}_{i}}{2m}\right)}^{2}\right]-\left[\frac{{\sum }_{{in}}}{2m}-{\left(\frac{{\sum }_{{tot}}}{2m}\right)}^{2}-{\left(\frac{{k}_{i}}{2m}\right)}^{2}\right]$$
(2)

In this equation, \({\sum }_{{in}}\) represents the sum of weights of internal connections within the group, \({\sum }_{{tot}}\) denotes the sum of weights of connections incident to the discipline nodes within the group, \({k}_{i}\) is the sum of weights of connections related to discipline node i, \({k}_{i,{in}}\) is the sum of weights of connections from i to nodes within the group, and m is the sum of weights of all connections in the entire network. An analogous expression is used to evaluate the change in modularity when i is removed from its group (Ronchi et al. 2008).
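A small numerical check of Eq. (2) may help. The sketch assumes that the internal weight of a group counts each internal edge twice (once per direction), which is the convention under which the 2k term in the numerator is consistent with Eq. (1); the function and argument names are illustrative.

```python
def delta_q(sigma_in, sigma_tot, k_i, k_i_in, m):
    """Modularity gain of Eq. (2) for moving an isolated node i into a group.

    sigma_in:  internal weight of the group, each internal edge counted twice
               (assumed convention).
    sigma_tot: sum of the weighted degrees of the nodes in the group.
    k_i:       weighted degree of node i.
    k_i_in:    weight of the edges between node i and the group.
    m:         total edge weight in the network.
    """
    after = (sigma_in + 2.0 * k_i_in) / (2.0 * m) \
        - ((sigma_tot + k_i) / (2.0 * m)) ** 2
    before = sigma_in / (2.0 * m) \
        - (sigma_tot / (2.0 * m)) ** 2 \
        - (k_i / (2.0 * m)) ** 2
    return after - before

# Toy graph: triangles (a, b, c) and (d, e, f) joined by the edge c-d, so m = 7.
# Node c (degree 3) is considered for the group {a, b}:
# one internal edge a-b (sigma_in = 2), degrees 2 + 2 (sigma_tot = 4),
# and two edges from c into the group (k_i_in = 2).
gain = delta_q(sigma_in=2, sigma_tot=4, k_i=3, k_i_in=2, m=7)
```

The gain equals 8/49, which matches the difference in Q between the partitions {a, b, c}, {d, e, f} and {a, b}, {c}, {d, e, f} when both are evaluated directly with Eq. (1).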

The second phase of the algorithm involves constructing a new network, where the nodes are the disciplinary cluster groups discovered in the first phase. The weights of connections between the new nodes are determined by the sum of connection weights between corresponding nodes in the two groups. In this new network, connections between nodes in the same group result in a self-loop for that group. Once the second phase is complete, the algorithm can be reapplied to the obtained weighted network through iterations. With each round of construction, the number of disciplinary cluster groups decreases, and the majority of computational time is dedicated to the initial iterations. The algorithm continues until there is no further change in modularity or until it reaches the maximum value. This algorithm exhibits a self-similar property akin to the self-affinity in complex networks, naturally incorporating the concept of hierarchy.
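The aggregation step of this second phase can be sketched as follows. Here a self-loop stores the internal weight of a group once; implementations differ on whether that weight is counted once or twice when node degrees are recomputed, so this is an assumed convention.

```python
from collections import defaultdict

def aggregate(edges, community):
    """Collapse every group into a single node of a new weighted network.

    edges: {(u, v): weight}, undirected, each edge listed once.
    community: {node: group label} from the first phase.
    Returns {(cu, cv): weight} with cu <= cv; edges inside one group
    are summed into a self-loop (cu == cv).
    """
    new_edges = defaultdict(float)
    for (u, v), w in edges.items():
        cu, cv = sorted((community[u], community[v]))
        new_edges[(cu, cv)] += w
    return dict(new_edges)

# Toy graph: triangles (a, b, c) and (d, e, f) joined by the edge c-d.
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1, ("c", "d"): 1}
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
coarse = aggregate(edges, community)
```

The coarse network has two nodes, each carrying a self-loop of weight 3, joined by one edge of weight 1, and the total weight of 7 is preserved. Applying the first phase again to this network yields the next level of the hierarchy.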

Research findings

Escalating demand and diversification of jobs in AI-related statistics

Our analysis, spanning from 2010 to 2022, reveals a significant trend: 11.16% of AI position listings necessitate statistical expertise, placing it second in demand only to computer science. As shown in Fig. 2, despite the rapid growth in demand for all positions between 2010 and 2022, AI job postings have experienced significantly accelerated growth. Statistical job postings grew by a factor of 9.73 overall from 2010 to 2022, with non-AI-related statistical job postings growing by a factor of 3.12 and AI-related statistical job postings growing by a factor of 31, a substantial disparity in growth rates over the same period. Notably, the growth rate of AI-related statistical job postings was 35.71% annually before the pivotal advancements in deep learning technologies (2010–2016), with notable spikes in 2013 (50.61%) and 2014 (44.62%). This suggests an anticipatory response by the labour market to impending AI breakthroughs. Post-2016, the growth rate decelerated to 29.64%, but the emergence of ChatGPT renewed the upward trend, with an average annual growth rate of 47.11% in 2021–2022. Each technological leap in AI is paralleled by an accelerated demand for statistical talent, highlighting the rapid assimilation of AI within the statistics discipline.
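For readers who want to relate an overall growth factor to a yearly rate, one common convention is the compound annual growth rate. The text does not state which averaging method underlies the annual rates quoted above, so the sketch below is an illustration of that convention only, under the assumption that "a factor of 31" means the 2022 count divided by the 2010 count.

```python
def compound_annual_growth(factor, years):
    """Constant yearly rate that compounds to `factor` over `years` steps."""
    return factor ** (1.0 / years) - 1.0

# 2010 -> 2022 spans 12 yearly steps.
ai_rate = compound_annual_growth(31.0, 12)
non_ai_rate = compound_annual_growth(3.12, 12)
```

Under these assumptions a 31-fold increase corresponds to roughly 33% per year, against roughly 10% per year for a 3.12-fold increase.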

Fig. 2: Time evolution of the various types of recruitment positions.

a Trend over time in the number of positions across all jobs. b Growth trends in statistics-related jobs, including AI and non-AI jobs.

This trend is also reflected in actual hiring data, which has seen a growth of 2.67 times from 2010 to 2022. On the one hand, the rapid growth of actual AI positions is consistent with the results of the job postings data, supporting the reliability of our findings. On the other hand, the growth rate of AI positions in actual companies is significantly lower than the growth rate of demand for AI jobs in postings data, highlighting a substantial shortage of AI talent in the U.S.

The evolution in AI-related statistical job roles over the past decade has transitioned from traditional positions to a more diverse spectrum. In 2022, the AI sector in statistics boasted 932 distinct job types, offering a plethora of opportunities for statistics graduates. Predominantly, Data Analyst roles (16.12%) emerged as the primary destination for AI talents in statistics, followed by Machine Learning Engineer (5.96%) and Machine Learning Scientist (4.34%). Notably, significant representation of roles in the financial sector, such as Risk Analyst (3.25%), Business Analyst (2.48%), and Marketing Manager (2.29%), highlights the burgeoning recruitment of statistical professionals in social sciences.

Figure 3a–c, based on job postings data, sheds light on the top 30 AI job postings, revealing the dynamic landscape of roles related to statistics. The heatmap, divided into sciences (including pure statistics and IT, shown in red) and social sciences (blue), visually depicts the relative distribution of each role per year. Social science roles outnumber those in sciences, reiterating statistics’ expansive application in this field. A notable trend is the shift from traditional Statistician roles to more market-application-oriented positions like Data Analyst, reflecting an evolution in AI’s role within statistics. In IT, the rise of Machine Learning Engineer and Scientist roles signifies a growing focus on these specializations. Conversely, roles in social sciences such as Actuary and Marketing Analyst indicate fluctuating demand, reflecting the nascent stage of AI application in these fields. The emergence of roles like Sustainable Consultant and the pronounced increase in Risk Analyst positions underscore AI’s growing significance in environmental sustainability and financial risk management, respectively, as corroborated by industry reports (Deloitte, 2019). This trend points towards a future where AI is increasingly embraced across various sectors within social sciences.

Fig. 3: Time evolution of the distribution of AI positions in statistics.

Figure 3d is based on actual hiring data, depicting the top 30 occupations in terms of the number of actual AI positions. The number of positions in the social sciences is not as high as in the job postings data. However, occupations such as Software Engineers and Data Scientists remain prominent in terms of numbers. Interestingly, Historians rank third highest in terms of positions within the social sciences, according to the O*NET code (see Footnote 2). Historians typically analyze historical records using tools such as databases, geographic information systems, and other software. AI technology is already extensively utilized in this role.

Emergence of disciplinary clusters in AI jobs involving statistics

In the BG database, AI roles that incorporate statistics frequently encompass additional disciplines, creating what we define as ‘disciplinary clusters’ within the AI domain. In 2010, statistics in AI recruitment was part of 49 such clusters, which expanded to 190 by 2022. This growth illustrates the increasing tendency for statistics to form interdisciplinary connections within the AI landscape.

Within these clusters, Computer Science emerges as the most predominant discipline, accounting for 24.52% of the total. The close affinity of statistics with mathematical fields is also evident, with Mathematics (16.26%) and Applied Mathematics (5.65%) ranking prominently. Other significant disciplines include Economics (13.35%), Business Administration and Management (5.48%), Physics (3.53%), and Engineering (3.11%).

Observing the evolution over time, the early AI development phase (2010–2016) saw statistics primarily clustered with Computer Science (20.31%), Mathematics (16.16%), Economics (14.50%), and Operations Research (5.40%). During the intermediate phase (2017–2020), the confluence of statistics with computer science intensified (24.7%), while Mathematics (15.68%) and Economics (12.44%) maintained their prominence, albeit with slightly reduced proportions. Notably, Applied Mathematics ascended in relevance, and clusters involving Physics emerged within the top ten.

In the latest AI breakthrough phase (2021–2022), the integration of statistics with Computer Science further solidified (25.79%), underscoring the entrenched position of statistics in AI. The associations with Mathematics (16.83%) and Economics (13.82%) also strengthened, highlighting statistics’ enduring significance in both foundational AI research and practical applications.

The modularity-based analysis, as depicted in Fig. 4, reveals four distinct and stable disciplinary clusters. The foremost cluster, labelled in purple in the diagram, amalgamates statistics, mathematics, and computer science, predominantly focusing on engineering fields like automation and robotics. The second group, the green cluster, orbits around economics and business management, intersecting with medical fields such as public health and pharmaceutical information. The third group, the orange cluster at the bottom left of the figure, encompasses finance, accounting, operations research, and management information systems, which are prevalent in the business and management arena. The smallest cluster, marked in blue in the figure, integrates genetics, bioinformatics, biostatistics, applied mathematics, physics, and other related fields in the biochemical and mathematical domains.

Fig. 4: Discipline cluster networks for statistics in AI recruitment.

A deeper examination of recruitment texts reveals that in these interdisciplinary fusions, statistics contributes primarily its methodological framework, encompassing statistical thinking, methods, and techniques. For instance, the integration of statistics with mathematics is predominantly involved in AI algorithm development, data analysis, and model construction. The convergence with economics facilitates AI market analysis, cost-benefit evaluations, and strategic business planning. When coupled with finance, the focus shifts to AI-driven financial risk assessment, investment strategies, and market forecasting. Similarly, the intersection with operations research is centered on AI optimization, decision analysis, and resource allocation.

Shift towards hard skills in AI recruitment trends

Our analysis of the BG database reveals that AI recruitment for statistics-related positions involves a comprehensive spectrum of 3987 skills, with a total of 6,324,462 instances. We categorize these skills into ‘hard’ and ‘soft’ skills, following the definitions provided by Hendarman and Cantner (2018), Lin (2023), and Peng et al. (2023). Hard skills encompass specific technical competencies, including proficiency in tools, platforms, or computer programs, and the capability to perform job-required tasks. In contrast, soft skills relate to personal attributes and interpersonal skills. Predominantly, AI recruitment in the statistical domain emphasizes hard skills. Among the top 30 skill requirements, eight are directly associated with software tools, occupying prominent ranks, including Python (2nd), SQL (4th), and SAS (8th). Additionally, broader hard skills such as machine learning (1st), predictive modelling (7th), and data analysis (11th) are highly sought after. The most prevalent soft skill is research, constituting 2.25% of the skillset, often complemented by hard skills.

A temporal heatmap of skill demands from job postings data reveals a predominance of hard skills, both in variety and percentage, within the top 30 requirements. Further dissecting hard skills into ‘theoretical mastery’ and ‘tool usage’, it’s clear that AI talent in statistics is increasingly leaning towards computer-related capabilities like machine learning, data science, and data visualization (Fig. 5a, b). These practical applications are rising in prominence, while traditional statistical skills like predictive modelling are gradually diminishing. Interestingly, the big data skill reached its zenith in 2017 (1.48%) and has since stabilized. The evolution of software tool preferences is evident: older tools like SAS and Microsoft Excel are being phased out, while Python and visualization tools like Tableau are gaining traction. SQL, however, has maintained a consistent significance due to its indispensable role in database management. In the realm of soft skills (Fig. 5c), teamwork is the only skill experiencing an upward trend, indicative of AI’s interdisciplinary nature and the increasing emphasis on collaborative efforts across departments.

Fig. 5: Time evolution of the distribution of skill needs for AI positions in statistics.

a Hard skills (theoretical mastery) evolution over time. b Hard skills (tool usage) evolution over time. c Soft skills evolution over time. d The top skills in actual AI positions.

This trend is even more evident in the actual hiring data. Figure 5d shows that only four of the top thirty skills are soft skills, while the rest are hard skills. Similar to the job postings data, companies still regard research as the most important soft skill and place significant importance on hard skills like machine learning and Python. In addition, actual hiring data specifies more detailed requirements for employees’ hard skills, particularly in software usage and programming languages, like C++, C, MATLAB, and HTML. These findings reveal that hard skills are highly valued in actual job positions.

In recent years, AI recruitment has also demonstrated a dynamic trend in certification requirements. The BG database cites 399 distinct certifications, with project management and financial risk domains predominating. The most sought-after certifications include Project Management Certification (8.79%) and Financial Risk Manager Certification (7.96%). Notably, there is a growing demand for certifications indicative of IT technical proficiency, such as CISSP and MCSA. This shift reflects employers’ focus on practical skills and technical expertise. Contrasting with computer science, certifications for statistical AI talent highlight a preference for professional qualifications in project management and finance. Anticipating future technological advancements, it is projected that new certifications will emerge to meet the evolving demands of the AI industry. Consequently, job seekers should strive for a diverse certification portfolio to stay adaptable and competitive in various fields and technologies.

Endangered skills and occupations

We analysed job posting data and examined endangered occupations and skills. Figure 6a presents the top ten fastest-declining jobs. For example, the proportion of Actuaries decreased from 3.27% to 1.44%, reflecting a 56% decline. Computer Programmers saw their representation drop from 1.59% to 0.40%, a reduction of ~75%. The decline in these occupations follows several consistent patterns. Many of these roles entail repetitive tasks that are increasingly being automated by artificial intelligence. Additionally, advancements in technology and changes in industry dynamics are diminishing the demand for these professions. For example, the emergence of risk management software reduces the need for specialists in the field. Moreover, globalization and digitalization are reshaping industries, impacting occupations such as statisticians and financial examiners. Lastly, societal and economic changes can render certain professions obsolete over time. This convergence of factors underscores the evolving landscape of employment and the ongoing transformation of the job market.
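The percentage declines quoted above are relative changes in share, which can be reproduced with a short calculation. The helper name is illustrative.

```python
def relative_decline(start_share, end_share):
    """Fractional drop of a share relative to its starting value."""
    return (start_share - end_share) / start_share

# Shares in percent, as quoted in the text.
actuary_decline = relative_decline(3.27, 1.44)
programmer_decline = relative_decline(1.59, 0.40)
```

The two results are about 0.56 and 0.75, matching the 56% and ~75% figures. Note that these are relative declines; the corresponding absolute drops are 1.83 and 1.19 percentage points.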

Fig. 6: Top 10 endangered skills and occupations.

Figure 6b illustrates the ten skills whose shares are declining fastest in the field of statistics. Notably, traditional statistical analysis skills like SPSS, Microsoft Access, Spreadsheets, SAP BusinessObjects, Database Marketing, CHAID, and OLAP are experiencing substantial declines in demand. For instance, the share of SPSS skills decreased from 1.16% in 2010 to 0.17% in 2022, and the most significant decline was observed in CHAID skills, which dropped from 0.23% to 0.01%. The decline mirrors the broader challenge facing the traditional statistics field, where the demand for conventional data processing and analysis skills is waning due to the rapid advancement of AI technology and the emergence of more sophisticated data analysis tools and automated methods. Therefore, professionals in the field must adapt to stay relevant and leverage newer, more advanced tools and techniques for effective data analysis.

Our findings on these fading skills and jobs demonstrate that the impact of AI on traditional jobs in statistics is multifaceted. This impact extends beyond soft skills, with even traditional hard skills being replaced by newer counterparts.

Educational requirements versus practical experience in AI recruitment

The educational prerequisites for statistical talent in AI recruitment appear modest, with the majority of roles requiring only a bachelor’s degree (66.85%), solidifying it as the primary educational benchmark. As shown in Fig. 7a, a smaller fraction of positions stipulate a master’s degree (25.68%), and the demand for Ph.D. qualifications is notably lower at 6.35%. In instances where educational qualifications were unspecified in the data, we assumed no formal qualifications were required. Despite this, the average working experience requirement over the past 13 years was 3.9 years and has been trending upwards over time (Fig. 7b). This trend suggests that in the AI job market for statistical roles, practical experience is often weighted more heavily than academic degrees.

Fig. 7: Changes in education and experience requirements for AI positions in statistics.

a Trends over time in the distribution of educational requirements. b Trends over time in the average experience needed for various educational qualifications.

During the periods 2010–2016, 2017–2020, and 2021–2022, the average working experience requirements for statistical AI roles fluctuated between 3.9 and 4 years. The proportion of roles requiring a bachelor’s degree has increased over time (59.48%, 61.76%, 74.25%), whereas the demand for master’s and Ph.D. qualifications has declined (master’s: 31.45% to 19.56%, Ph.D.: 8.18% to 4.62%). An analysis of the qualification distribution trends for candidates with at least a bachelor’s degree reveals several key observations. Firstly, the experience required of bachelor’s degree holders has remained relatively stable over the 13-year span. However, from 2014 to 2018, there was a heightened emphasis on more advanced qualifications for master’s degree candidates, averaging above 4 years. In contrast, the fluctuation in the requirements for Ph.D. candidates was more pronounced, possibly reflecting the diverse nature of Ph.D. training programs. Secondly, there seems to be a complementary relationship between the qualifications of bachelor’s and master’s degree candidates, while the requirements for Ph.D. candidates are more variable. Lastly, the demand for bachelor’s degree candidates underwent a decrease followed by an increase, while the demand for master’s and Ph.D. degree holders peaked around 2016–2017 and has been diminishing since. This shift may reveal a saturation of highly educated technical personnel in enterprises and a transition from AI research and development to practical applications. Post-2018, the demand for Ph.D. candidates continued to diminish, accompanied by a concurrent decrease in their qualification requirements.

We analysed the highest degree of individuals who actually hold AI positions in companies. We found that 25% hold a bachelor’s degree, 42% hold a master’s degree, and 33% hold a doctoral degree. This distribution differs from the educational requirements specified in job postings data. It suggests that while educational requirements may be decreasing over these years, individuals with higher degrees still predominantly occupy AI positions in practice.

Heterogeneity in AI recruitment

We calculated the percentage of AI jobs under the field of statistics for each state in the U.S. (Figs. 8a, b). Our analysis revealed significant variation in AI talent demand across different states, with the northwestern part of the U.S. exhibiting a higher ratio compared to other regions. Washington has the highest ratio of AI talent in statistics (32.6%), followed by Idaho (29.3%) and North Dakota (28.7%), while Alaska (17.4%), Maryland (19.4%), and Nebraska (19.5%) rank last. The distribution of AI positions in statistics jobs is closely linked to the prevalence of manufacturing and high-tech industries within each state. For example, Washington is home to Boeing’s production lines, as well as the headquarters of Microsoft and Amazon, leading to a stronger demand for AI talent in the region.

Fig. 8: Heterogeneity in AI recruitment.

a Ratio of AI talent among statistics employees in different states in the U.S. b Ratio of AI talent among statistics employees across different industries in the U.S.

Based on the distribution of NAICS 2-digit codes, we found that the statistics discipline is spread across 20 industries, and there is a significant difference in AI talent ratio between different industries (Fig. 8b). Among these industries, the retail sector emerges as the most impacted by AI, with an AI talent ratio reaching 37%. Additionally, the information (30.8%), management of companies and enterprises (29.2%), and manufacturing (28.1%) sectors exhibit high AI talent ratios. Conversely, AI talent ratios are generally lower in industries such as agriculture, forestry, fishing, and hunting (13.3%), construction (13%), and educational services (10.3%).
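The AI talent ratio used for both the state and the industry comparisons is the share of statistics postings in a group that are AI postings. A minimal sketch of that calculation is given below. The field names (`state`, `naics2`, `is_ai`), the `ai_talent_ratio` helper, and the toy records are hypothetical and chosen for illustration only; how a posting is classified as an AI posting is outside the scope of this sketch.

```python
from collections import defaultdict


def ai_talent_ratio(postings, key):
    """Percentage of postings flagged as AI within each group defined by `key`."""
    totals = defaultdict(int)
    ai_counts = defaultdict(int)
    for posting in postings:
        group = posting[key]
        totals[group] += 1
        if posting["is_ai"]:
            ai_counts[group] += 1
    return {group: 100 * ai_counts[group] / totals[group] for group in totals}


# Hypothetical toy records; NAICS 2-digit codes 44 (retail trade) and
# 61 (educational services) are used only as example group labels.
toy_postings = [
    {"state": "WA", "naics2": "44", "is_ai": True},
    {"state": "WA", "naics2": "44", "is_ai": False},
    {"state": "WA", "naics2": "61", "is_ai": False},
    {"state": "AK", "naics2": "61", "is_ai": False},
    {"state": "AK", "naics2": "44", "is_ai": True},
    {"state": "AK", "naics2": "61", "is_ai": False},
    {"state": "AK", "naics2": "61", "is_ai": False},
]

ratio_by_state = ai_talent_ratio(toy_postings, "state")
ratio_by_industry = ai_talent_ratio(toy_postings, "naics2")
```

Passing the grouping field as a parameter lets the same function serve both panels of Fig. 8, so the state and industry ratios are guaranteed to share one definition.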

To verify that the increase in jobs is directly attributable to AI’s impact on job demand rather than a spillover effect of high-tech industry growth, we categorized all industries into high-tech versus non-high-tech industries based on their NAICS codes. The average annual growth rates of high-tech AI jobs and non-high-tech AI jobs were 73.2% and 72.7%, respectively; the average annual growth rates of high-tech non-AI jobs and non-high-tech non-AI jobs were 14.7% and 18.3%, respectively (further discussed in the supplementary materials). Notably, the growth rate of AI jobs significantly exceeded that of non-AI jobs, whereas whether the industry was high-tech or not did not significantly impact job growth rates. Hence, we infer that the rapid growth of AI jobs is not merely a high-tech spillover effect. We also analysed the distribution of AI talent in the entertainment and arts industry. Between 2010 and 2022, non-AI positions in the entertainment and arts industry under the statistical subject area grew 9.7-fold, from 23,307 to 226,397, while AI jobs grew 22.6-fold, from 30 to 677.

The growth rate of AI positions in the entertainment and arts sector was much higher than that of non-AI jobs, but the absolute number of AI positions remains relatively small. These AI roles, based on the 6-digit NAICS code, are primarily found in the gambling industry, sports teams and clubs, racetracks, casinos (except casino hotels), fitness and recreational sports centers, and other gambling and sports industries. We surmise that because our study focuses on AI positions within statistics, the overall impact of AI on the entertainment and arts industry remains limited in scope.
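The growth multiples quoted for the entertainment and arts industry follow directly from the 2010 and 2022 posting counts, as the short check below shows. The compound annual rate is an added illustration that assumes 12 yearly intervals between 2010 and 2022; it is not necessarily how the average annual growth rates reported elsewhere in this section were computed. Both function names are hypothetical.

```python
def fold_growth(start_count, end_count):
    """Ratio of the final count to the initial count."""
    return end_count / start_count


def implied_annual_rate(start_count, end_count, n_years):
    """Constant yearly growth rate that would turn start_count into end_count."""
    return (end_count / start_count) ** (1 / n_years) - 1


# Posting counts for 2010 and 2022 as reported in the text.
non_ai_fold = fold_growth(23_307, 226_397)  # about 9.7
ai_fold = fold_growth(30, 677)              # about 22.6

# Assumption: 2010 to 2022 spans 12 yearly intervals.
non_ai_rate = implied_annual_rate(23_307, 226_397, 12)
ai_rate = implied_annual_rate(30, 677, 12)

print(f"non-AI: {non_ai_fold:.1f}x overall, {non_ai_rate:.1%} per year (compound)")
print(f"AI:     {ai_fold:.1f}x overall, {ai_rate:.1%} per year (compound)")
```

The AI multiple rests on a base of only 30 postings, so it is far more sensitive to small changes in the counts than the non-AI multiple, which is consistent with the caution above about the small absolute number of AI positions.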

Discussion: the integral role of statistics in the AI-driven industrial revolution

As the vanguard of the fourth industrial revolution, artificial intelligence (AI) has signified a pivotal shift in the integration of traditional disciplines into its expansive disciplinary system. In this transformative landscape, statistics emerges as a cornerstone, seamlessly bridging the natural and social sciences. Its contribution to the establishment, progression, and innovation within AI is undeniable. This study, leveraging extensive recruitment data, provides an empirical lens through which the integration of statistical expertise into AI is observed. It systematically outlines the AI disciplinary system, highlighting significant disciplinary clusters where statistics plays a pivotal role, and delves into the essential academic credentials, skill sets, qualifications, and certifications required in the field of AI statistics. This research not only addresses current debates but also sets a foundational reference for the evolution and talent cultivation strategies in other traditional disciplines.

Key research findings and implications

Evolving demand for AI talent in statistics

The study reveals a substantial increase in the demand for AI talent in statistics over the past 13 years, expanding by a factor of 31. This trend is indicative of the growing significance of statistics in the labour market, driven by AI innovations. Notably, this increase is reflected not only in the creation of new job opportunities but also in the diversification of roles within the field.

Diversification of roles in AI and statistics

The AI sector has witnessed a remarkable diversification in statistical roles, with as many as 932 distinct positions identified in 2022. This diversity offers a plethora of opportunities for statistics graduates, signifying the expansive reach of AI in various categories of the social sciences. The shift towards practical market demands is evident, with emerging roles in sustainable consulting and other fields gaining prominence.

The interdisciplinary nature of statistics in AI

From 2010 to 2022, the number of interdisciplinary clusters involving statistics expanded significantly. The integration of statistics with disciplines such as computer science, mathematics, and economics has notably increased. The study identifies four stable disciplinary clusters, encapsulating key areas like engineering, management, finance, and bioinformatics. These clusters underscore the multifaceted nature of statistics in AI, contributing to various domains including algorithm development, market analysis, and financial risk analysis.

Skill emphasis in AI recruitment

A notable shift towards ‘hard’ skills, particularly in software applications, technology development, and data analysis, is evident in the AI recruitment landscape. This trend highlights the evolving skill requirements in AI, where practical, technology-oriented competencies are increasingly valued over formal academic qualifications.

Fading skills and occupations in AI job postings

Traditional statistical skills such as SPSS, Microsoft Access and spreadsheets have significantly declined under the influence of AI, reflecting the shift to advanced AI-driven tools. Similarly, traditional occupations like actuaries, computer programmers, and statisticians have seen substantial reductions. These trends highlight the necessity of continuously updating skills to adapt to the rapidly changing environment shaped by AI and emerging technologies.

Educational requirements and practical experience

While AI talent tends to be highly educated according to actual AI hiring data, this study finds a decreasing emphasis on higher academic qualifications in AI recruitment for statistical roles. Instead, there is an increasing preference for candidates with bachelor’s degrees coupled with substantial practical experience. This shift reflects the industry’s prioritization of hands-on expertise and real-world application over theoretical knowledge. We speculate that the decreased preference for highly educated AI talent in the labour market arises mainly because, in recent years, industry’s capital investment in and training of AI talent have been ahead of academia in the AI field (Ahmed et al., 2023). Consequently, major companies place greater emphasis on candidates’ practical work experience in the industry when recruiting talent.

Heterogeneity in AI recruitment across geographic locations and industries

Our study reveals significant variation in AI talent demand across U.S. states, which correlates with the density of manufacturing and high-tech industries in each state. In terms of industry, retail, information, management, and manufacturing sectors exhibit higher demand for AI talent, while agriculture, construction, and educational services show lower demand. Irrespective of an industry’s classification as high-tech, the growth rate of AI job postings significantly outstrips that of non-AI job postings, suggesting that the rapid expansion of AI employment is not simply a spillover effect of high-tech industry growth. Furthermore, while the entertainment and arts industry has witnessed a substantial increase in AI positions, the overall impact remains limited.

Strategic recommendations

Adapting to AI for workforce

Individuals in traditional disciplines should proactively adapt to the evolving AI landscape. Enhancing AI skills through specialized training programs and gaining practical experience are key strategies to remain relevant and competitive in the AI-driven job market. According to the World Economic Forum, by 2025, the widespread adoption of AI technology may lead to a reduction of 85 million jobs, but it is also expected to create 97 million new jobs (Russo, 2020). Considering the findings of this study, we suggest that employees in traditional disciplines need not overly worry about the existential crisis that AI might bring. Instead, they could focus on strategies for transitioning their skills to align with AI developments in the future labour market (Agrawal et al. 2019). One effective approach is to enhance one’s AI skills through participation in AI training programs, adapting to the evolving demands of the labour market. Leading technology companies such as LinkedIn, IBM, Google, and Microsoft have introduced online AI courses along with corresponding certifications (Felten et al. 2021). Additionally, many universities, such as Stanford and Arizona State University, now offer AI projects and courses, providing opportunities for continuing education for working professionals. Furthermore, individuals with higher academic qualifications should consider accumulating more practical work experience to enhance their performance in real-world projects and increase their competitiveness in the AI job market. This involves actively participating in AI-related projects in their daily work, continuously learning, and applying new knowledge to practical situations. Through these efforts, they can better showcase their practical abilities, thereby improving their employment prospects in the field of AI.

Transition for companies

Companies are advised to embrace skills-based organizational models, prioritizing specific capabilities required for diverse AI tasks (Autor, 2019). This approach aligns with the evolving nature of AI work, promoting flexibility and efficiency. Our research indicates that the AI talent required by companies is often associated with specific skills. Moreover, the indicators and definitions of AI capabilities have become increasingly clear in recent years. Therefore, the traditional model of talent-based organizations may become a hindrance in the AI era (Sun et al. 2021). A more adaptive approach would be to consider transitioning into what Deloitte defines as skills-based organizations, described as “a new organizational form placing capabilities at the core of talent strategy, creating a new operating model for work and workforce” (Michael and Robin, 2023). In essence, this model involves breaking down work into small projects and tasks, organizing temporary workers based on the skills required for each task (Autor and Dorn, 2013). This approach enhances the flexibility and productivity of employers in the AI era by adapting to the specific capabilities needed for different aspects of work.

Government and educational collaboration

Governments should collaborate with educational institutions and employers to standardize AI talent certification, ensuring a global certification system that fosters international competitiveness in AI talent development. It is crucial to expedite the global certification system associated with AI recruitment, facilitating international certification and development for AI talents in statistics. Simultaneously, partnerships with universities and talent development institutions can be strengthened to establish and enhance standardized AI talent development criteria. This collaboration ensures that students receive unified and high-quality training across various disciplines, empowering them with stronger overall capabilities and international competitiveness.

Government guidance and regulation are also extremely important. Terence Tao, in collaboration with the PCAST Task Force, released a report on AI in 2024 that underscores the necessity for the federal government to enhance both the sharing and regulation of AI (footnote 3). On one hand, there is a call to promote the sharing of AI technologies, including sharing AI models trained on federally funded research data, and allocating sufficient resources to support these endeavours. On the other hand, federal funding agencies should update their codes of conduct for responsible research, mandating researchers to provide plans for the ethical and responsible use of AI. Meanwhile, agencies such as the National Science Foundation (NSF) and the National Institute of Standards and Technology (NIST) should continue to support responsible and trustworthy science-based AI research. This includes establishing standard benchmarks for measuring AI accuracy, repeatability, fairness, and other essential attributes.

Limitations and future directions

This study concentrated on occupations with higher cognitive skill needs, especially those related to statistics, so its results may not be applicable to all occupations, especially those with lower cognitive and educational needs, such as plumbers and drivers. Additionally, the focus on statistics as a discipline could be expanded to include a wider range of traditional disciplines, exploring their integration and performance in AI applications. Future research should consider the intersectionality of different disciplines within AI, aligning job market demands with the evolving requirements of AI talent.