Introduction

In an era characterized by population aging, Baumol’s cost disease, and Wagner effects on welfare states, there is an urgent need to improve the long-term sustainability of the welfare state, for example by increasing labor supply, which both raises the number of taxpayers and reduces the number of individuals dependent on government income transfers (Andersen, 2015). At the same time, a significant proportion of individuals are only marginally attached to the labor market, not only because of long-term unemployment and reliance on social assistance but also because of health-related issues, social barriers, family difficulties, housing instability, and/or personal obstacles that make it difficult for them to establish a connection to the labor market. Many of these “hard-to-place” social assistance recipients share a common aspiration, namely the desire to work. A transition into employment would reduce publicly financed income transfers and increase tax payments, and would presumably also improve the well-being, and possibly even the physical and mental health, of those obtaining employment. However, their journey toward employment is hampered by a labyrinth of personal as well as systemic obstacles. While the Public Employment Services (PES) in many countries employ a comprehensive toolbox of, for example, Active Labor Market Policies (ALMPs) to facilitate reintegration into the labor market, the effectiveness of such tools for this particularly disadvantaged group remains unclear, in part due to very low transition rates into unsubsidized employment and the associated difficulty of measuring causal effects. It is therefore imperative to develop a dynamic tool that transcends the traditional categorical views of employment, unemployment, and non-participation. This tool should be able to measure the progression of individuals on the fringes of the labor market toward unsubsidized employment, effectively capturing the employment readiness of those who are “hard-to-place” but seeking work.

This study introduces and validates a novel survey tool, the Employment Readiness Indicator Questionnaire (ERIQ), in selected job centers in Denmark. ERIQ incorporates measures of employment readiness and job search behavior, which are used to predict unsubsidized employment. The use of machine learning techniques to analyze ERIQ survey responses allows for the identification of key factors that need improvement to enhance clients’ employment readiness and, consequently, their likelihood of securing a job. ERIQ’s performance is compared against a rich set of administrative data variables (henceforth labeled Admin), demonstrating both strong forecasting ability and practical value in guiding caseworkers toward targeted interventions that can improve employment prospects for their clients. This feature is particularly valuable for the group of disadvantaged social assistance recipients who face significant barriers to labor market integration.

Even within Denmark’s comprehensive Scandinavian welfare state, characterized by extensive and relatively generous social assistance and employment services, effectively integrating disadvantaged social assistance recipients remains a persistent and unresolved challenge. Many of these individuals are long-term unemployed and only marginally attached to the labor market due to additional obstacles such as health issues, social barriers, family difficulties, housing instability, and personal challenges. ERIQ stands out by identifying the specific factors that influence this transition from social assistance to employment or education, making it a potentially useful tool for both caseworkers and individuals seeking to overcome labor market barriers.

We present compelling evidence for the validity of ERIQ as a tool for predicting progression toward labor market reintegration. ERIQ has an AUC-ROC (area under the receiver operating characteristic curve) of 83% when predicting employment, compared to an AUC-ROC of 64% using rich data from administrative registers. The AUC-PR (area under the precision-recall curve) is 28% for ERIQ compared to 10% for Admin (baseline of 5.5%). In addition, once ERIQ is included in the predictive model, almost no predictive power is gained by extending the model with data from the administrative registers.
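To make the two performance measures concrete, the following minimal sketch computes AUC-ROC and a step-wise AUC-PR (average precision) on a small invented sample. The function names and the toy data are illustrative assumptions and are not taken from the study. For AUC-PR, a ranking with no predictive content is expected to score roughly the share of positives in the sample, which is presumably what the baseline quoted above reflects; for AUC-ROC, the corresponding reference value is 50%.

```python
from typing import Sequence


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Share of (positive, negative) pairs in which the positive is ranked higher.

    Tied scores count as half a correctly ordered pair.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Assumes no tied scores; with ties the result depends on input order.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def pr_baseline(y_true: Sequence[int]) -> float:
    """Prevalence of positives: the expected AUC-PR of an uninformative ranking."""
    return sum(y_true) / len(y_true)


# Invented toy sample: 2 positives among 8 cases (hypothetical values).
toy_outcomes = [1, 0, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]

toy_auc_roc = auc_roc(toy_outcomes, toy_scores)
toy_auc_pr = average_precision(toy_outcomes, toy_scores)
toy_baseline = pr_baseline(toy_outcomes)
print(toy_auc_roc, toy_auc_pr, toy_baseline)
```

Because the AUC-PR reference value depends on prevalence, AUC-PR figures for a rare outcome such as employment are best read relative to that outcome's own baseline rather than compared directly with AUC-PR figures for a more common outcome.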

To the best of our knowledge, we are the first to develop a freely available survey-based tool that both (i) has very good predictive properties, and (ii) almost directly recommends possible actions for the group of disadvantaged individuals on the edge of or outside the labor market. The latter is achieved by pointing to malleable traits that are associated with employment readiness, such as social skills, everyday coping strategies, goal orientation, and self-efficacy.

By bridging the gap between research and practice, we contribute significantly to the broader effort of promoting inclusive and effective labor market participation for disadvantaged clients.

Navigating employment readiness: indicators, domains, and tools in the literature

We investigate the relationship between distinct indicators of employment readiness included in the ERIQ and the likelihood of disadvantaged social assistance recipients commencing job search and obtaining employment, using machine learning tools to predict these intermediate and final outcomes of the path towards employment readiness. The individual components of the survey were identified based on a literature survey of factors associated with employment and employment readiness (Væksthuset and NewInsight, 2012). This led to a large number of possible questions grouped within 11 overall domains, as shown in Fig. 1. These questions were subsequently reduced to 11 questions for the caseworker and 11 questions for the client, which in combination cover all 11 domains.

Fig. 1

Employment readiness domains.

For a group with such complex and differential problems, it is clear that for some, the path to employment is long, while for others it may be shorter.

McQuaid and Lindsay (2005) discuss different definitions of employment readiness and argue that a proper definition should focus on demand-side factors (e.g., the needs of employers) as well as supply-side factors (e.g., network, personal and work-related competencies, motivation, etc.) and, in particular, their interactions, as important for employability or employment readiness.

Relatedly, Pearson et al. (2023) argue that employment readiness should be viewed through a capabilities approach, focusing on what people can do rather than what they actually do.

As EIC (2020) notes, "Each disadvantaged jobseeker faces a unique set of personal and work-related barriers; Personal circumstances, such as financial hardship, disability, caring responsibilities and substance dependence, can create barriers to employment by limiting access to opportunities and resources that improve jobseekers’ employment prospects and enable them to find and retain work.” It is, however, important to understand how each of these measures is associated with employment readiness, so appropriate measures to improve employment readiness can be identified.

We have only been able to identify a few available tools that try to measure employment readiness in a hard-to-place population. There are commercial tools, such as the Canadian ‘Employment Readiness Scale’ (ERS), which contains 75 questions in total and is claimed to predict correctly in 80% of cases (Footnote 1).

Wittevrongel et al. (2022) investigate two tools and their validity specifically for youth with autism spectrum disorders: the Work Readiness Inventory (WRI) and the Ansell-Casey Life Skills Assessment. The WRI is particularly relevant to our case. They argue that factors such as responsibility, flexibility, competencies, communication, self-efficacy, and hopefulness are valued by potential employers.

Ding et al. (2023) conduct a scoping review of tools that assess the employability of cancer survivors, which is a very specific population. The review focuses mostly on health-related indicators and thus offers little guidance for a more broadly targeted tool.

Dencker-Larsen (2017) investigates whether the Danish Well-being survey, combined with data from administrative registers, can be used to predict employment. The author uses information on health, well-being, self-efficacy, alcohol use, and drug use and concludes that the relation between the proposed measures and subsequent employment is imprecisely estimated. The analytical sample is, however, rather small (N < 1000). The study does find some evidence that a self-efficacy factor (the belief in one’s ability to find employment and to work) is significantly related to subsequent employment.

Hence, the literature on employment readiness measurement for a more general population of individuals on the margin of the labor market is, to the best of our knowledge, very limited. There is a related literature that attempts to predict employment in unemployed (employment-ready) populations, such as the German TrEffeR (which also attempts to point to the potentially most effective intervention) and the Danish Job Barometer (e.g., Stephan et al. 2006 and Rosholm et al. 2006, respectively). However, these models rely entirely on information available in administrative registers, which implies that their predictions are not that useful to caseworkers and can be discouraging to the clients (say, if the model predicts low employment chances due to age, sex, ethnicity, and educational background, then this prediction is not very constructive in terms of how to intervene to improve the likelihood of employment).

Results

We present the ability of four predictive models to predict the likelihood of obtaining employment within a year after responding to the questionnaire, being engaged in active job search, enrolling in education within a year after responding to the questionnaire, and the combination of finding employment or commencing education. For each outcome, we compare the predictive ability of ERIQ to that of a comprehensive set of variables obtained from administrative registers—the Admin feature set. For details on the models and data, see the Methods section. Further, we also unveil the predictive capacity of each feature to identify the factors that significantly contribute to the predictions of clients’ likelihood of securing employment and active job search.

Model performance

Table 1 presents the performance of the four predictive models on each of the two primary and two secondary outcomes. The models are evaluated using three distinct feature sets: Admin, ERIQ, and a combination of both (Admin + ERIQ).

Table 1 Test performance of the predictive models.

The ERIQ feature set demonstrates the strongest predictive performance for the primary outcomes, employment and active job search, across all four predictive models. Notably, for the transition into employment and for starting job search, the XGBoost model achieves AUC-ROC scores of 83.48% and 84.24%, respectively, and AUC-PR scores of 27.51% and 62.76%, respectively, when using the ERIQ feature set. These results suggest that the ERIQ feature set, which encompasses information related to personal experiences, social networks, coping strategies, health management, and knowledge about job market opportunities, plays a vital role in accurately predicting successful transitions into employment as well as active job search.

On the other hand, the Admin feature set, which includes a comprehensive set of baseline characteristics and extensive labor market and health histories, shows comparatively lower predictive performance when predicting employment. For employment within a year and active job search, the XGBoost model achieves AUC-ROC scores of 63.63% and 68.53%, respectively, and AUC-PR scores of 9.87% and 42.46%, respectively. These results highlight that the Admin feature set alone may not fully capture the essential factors that influence successful transitions into employment or engaging in active job search.

When looking at the secondary outcome, transition into education within a year, the Admin feature set outperforms the ERIQ feature set in terms of both performance measures.

The combined feature set (Admin + ERIQ) leverages the strengths of both ERIQ and Admin feature sets. However, even with this combination, the ERIQ feature set remains crucial for improved predictive performance. For employment within a year and active job search, the XGBoost model achieves AUC-ROC scores of 83.73% and 86.32%, respectively, and AUC-PR scores of 24.89% and 66.05%, respectively. These results, compared to those using ERIQ or Admin alone, indicate that when predicting future employment and active job search, very little is gained from adding Admin to the ERIQ feature set.

We divided the data into two sub-groups based on the institutional setting, as caseworkers at the PES focus primarily on helping young individuals below 30 into education and on finding employment opportunities for those aged 30 or above. Consequently, it is interesting to assess whether ERIQ indicators serve as the best feature set for predicting the transition into employment and job search for both age groups. Supplementary Table A.3 reveals that ERIQ indicators have a significant impact on the model’s performance across both age groups. Moreover, the noteworthy predictive performance of the full models when predicting enrollment into the secondary outcome, education, using the Admin feature set, is largely influenced by age. Thus, the ERIQ feature set proves vital as its adaptable features provide crucial information on employment and education for both age groups. The results reaffirm the importance of the ERIQ feature set in accurately predicting successful transitions out of social assistance and engaging in active job search for individuals in different age groups, reinforcing its value for policy decisions aimed at enhancing reemployment and other outcomes for disadvantaged social assistance recipients.

Another way to illustrate the ability of the different feature sets to predict the primary outcomes is to plot the true positive rate within prediction deciles, with decile 1 being the 10% of individuals in the sample with the lowest predicted probability of a given outcome, and decile 10, similarly, being the 10% with the highest predicted probability of a given outcome.

We perform this analysis for the two primary outcomes, employment and active job search, in Figs. 2 and 3, respectively. The figures only show the result from our preferred XGBoost machine learning model. Figure 2 shows that, for the Admin feature set in panel (a), the true positive rate tends to increase by prediction decile, but the increase is not very steep, nor is it monotonic. In the lowest prediction decile, the true positive rate is around 2%, while it is 13% in the highest prediction decile. For the ERIQ feature set, in panel (b), the relation is, on the other hand, monotonic and much steeper. Namely, the true positive rate is below 1% in deciles 1–4 and 26% in decile 10, and the relation is convex with a large increase, especially from decile 9 to decile 10. Combining the two feature sets in panel (c), we gain a tiny bit of precision in deciles 1 and 10, but at the cost of the monotonic relationship across deciles.
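The decile analysis described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy data are invented, and cases are assigned to deciles purely by rank of their predicted probability, which may differ from how the figures were produced.

```python
from typing import List, Sequence


def positive_rate_by_decile(
    y_true: Sequence[int], scores: Sequence[float], n_bins: int = 10
) -> List[float]:
    """Share of observed positives within each prediction decile.

    Decile 1 holds the cases with the lowest predicted probabilities,
    decile 10 those with the highest.
    """
    n = len(scores)
    if n < n_bins:
        raise ValueError("need at least one case per bin")
    order = sorted(range(n), key=lambda i: scores[i])  # ascending by score
    rates = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        rates.append(sum(y_true[i] for i in members) / len(members))
    return rates


def is_monotonic_increasing(values: Sequence[float]) -> bool:
    """True if the rate never falls from one decile to the next."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Invented toy data: 20 cases, positives concentrated at the top scores.
toy_scores = [i / 20 for i in range(20)]
toy_outcomes = [0] * 17 + [1, 1, 1]

toy_rates = positive_rate_by_decile(toy_outcomes, toy_scores)
print(toy_rates, is_monotonic_increasing(toy_rates))
```

A well-ranked model concentrates the positives in the top deciles, so a steep and monotonic profile is a sign that the predicted probabilities order the clients sensibly.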

Fig. 2: Fraction of true positives by prediction decile: employed within a year.

a Displays results from a model exclusively utilizing Admin feature data, b displays results from a model exclusively utilizing ERIQ feature data, while c encompasses a model incorporating both ERIQ and Admin data.

For active job search, shown in Fig. 3, the overall picture is much the same, with gains in precision as well as monotonicity when going from Admin to ERIQ feature sets. In the Admin feature set, the true positive rate is 8% and 44% in deciles 1 and 10, respectively, while the same numbers for the ERIQ feature set are 2% and 72%.

Fig. 3: Fraction of true positives by prediction decile: applying for a job.

a Displays results from a model exclusively utilizing Admin feature data, b displays results from a model exclusively utilizing ERIQ feature data, while c encompasses a model incorporating both ERIQ and Admin data.

In sum, the ERIQ feature set emerges as the most predictive feature set. Its ability to capture various aspects of an individual’s life, including personal experiences, social networks, coping strategies, health management, and job market awareness, proves essential in accurately predicting the transition into employment and the initiation of active job search. The superior performance of XGBoost, particularly when using the ERIQ feature set, makes it the preferred model for our analysis, allowing us to gain in-depth insights into the factors influencing successful transitions.

Client vs. caseworker questionnaire

We now investigate to what extent the client and caseworker questionnaires separately contribute to predicting the outcomes of interest. Supplementary Table A.4 sheds light on this by presenting AUC-ROC and AUC-PR scores for the complete ERIQ feature set, and split into client and caseworker indicators. The confidence intervals overlap, with a marginal discrepancy in the explanatory power of the two feature subsets. Notably, client indicators slightly outperform caseworker indicators, both when it comes to predicting transitions out of unemployment and when predicting job search. This distinction emphasizes that in scenarios characterized by limited resources for caseworkers to fill in the questionnaire, prioritizing the collection of client indicators would be a possibility.

Subset of ERIQ indicators implemented in practice

Another possibility would be to use a subset of both questionnaires. Gathering 22 indicators after each meeting between a caseworker and a client is a potentially resource-intensive endeavor, which could pose challenges for practical implementation. Drawing on insights from Rosholm et al. (2017), who performed a preliminary exploration of the correlation between ERIQ indicators and employment, a subset of these indicators has been adopted by some PES offices in Sweden (Footnote 2). We, therefore, conducted an exploratory analysis to compare the performance of the full set of indicators with this restricted subset, bearing in mind that answering only a subset of the full questionnaire may slightly alter the answers to the questions. Supplementary Table A.5 presents the predictive model’s performance using only the aforementioned subset of ERIQ indicators. The AUC-ROC and AUC-PR scores are marginally lower compared to utilizing the complete ERIQ set, yet the confidence intervals overlap. This suggests that, within resource-constrained environments, opting for these eight indicators may be a relevant strategy.

Unveiling predictive factors: exploring key variables in the ERIQ

Finally, we investigate which of the specific questions in ERIQ contribute most to the predictive model for employment-related outcomes by employing SHAP values. This approach allows us to identify the most influential factors for predicting clients’ prospects of securing a job as well as their job search activity.

In the figures below, we combined the global variable importance and local variable importance information into one main plot. The plot displays the mean of the absolute SHAP values for the ten most important variables, giving an overview of their overall impact on the model predictions. Additionally, we show the distribution of the SHAP values for the same variables using color coding to indicate the values of each variable.
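As an illustration of how local attributions are aggregated into a global importance ranking, the sketch below computes exact Shapley values for a tiny hand-written scoring function and then averages their absolute values across cases. It is not the procedure used in the study, which applies SHAP to an XGBoost model: here the model, the three feature names, the data, and the use of a single all-zero reference case (instead of a background sample) are all simplifying assumptions made for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    model: Callable[[List[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially in the number of features, so toy sizes only.
    """
    n = len(x)

    def value(coalition: set) -> float:
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def mean_abs_importance(all_phi: Sequence[Sequence[float]]) -> List[float]:
    """Global importance: mean absolute Shapley value per feature."""
    n_cases = len(all_phi)
    return [sum(abs(row[j]) for row in all_phi) / n_cases
            for j in range(len(all_phi[0]))]


# Hypothetical feature names and scoring function (with one interaction term).
feature_names = ["job_prospect_cw", "job_search_c", "health_c"]


def toy_model(v: List[float]) -> float:
    return 0.5 * v[0] + 0.2 * v[1] + 0.1 * v[0] * v[2]


toy_baseline = [0.0, 0.0, 0.0]
toy_cases = [[1.0, 1.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]

toy_phi = [shapley_values(toy_model, case, toy_baseline) for case in toy_cases]
toy_importance = mean_abs_importance(toy_phi)
toy_ranking = sorted(feature_names,
                     key=lambda name: -toy_importance[feature_names.index(name)])
print(toy_ranking)
```

For each case the attributions sum to the difference between the model output for that case and the output for the reference case, which is why the local values can be read as a decomposition of an individual prediction.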

Finding employment

The ERIQ indicators offer valuable insights by highlighting specific dimensions of employment readiness that directly relate to employment attainment and the initiation of job search. Employing SHAP value analysis on our preferred XGBoost model provides valuable insights into the indicators that most effectively predict clients’ likelihood of obtaining employment.

The SHAP analysis identifies the most important indicators for predicting employment. The ten most important indicators are shown in Fig. 4 and are presented both for the model using the ERIQ feature set only (panel (a)) and the model using the full feature set that combines ERIQ and administrative data (panel (b)). The analysis in panel (a) shows that the caseworker’s belief in the client’s ability to find a job (job prospect) has the largest impact on the predicted likelihood of acquiring a job, as its mean absolute SHAP value is more than twice as large as that of the second-most important variable, which is the indicator for actively searching for a job. It is striking that the caseworkers’ subjective assessments of the client’s abilities to obtain employment trump actual job search behavior. Note, however, that the applied job search channels also appear in the figure and contribute to predicting employment. The analysis also shows that clients who are goal-oriented (goal orientation), who themselves believe that they can handle a job (job performance), and who improve their ability to cope with any health challenges (health) have a higher predicted likelihood of acquiring a job.

Fig. 4: SHAP values for predicting the transition into employment.

This figure illustrates the SHAP values for the ten most influential features. a displays results from a model exclusively utilizing ERIQ features, while b encompasses a model incorporating both ERIQ and administrative data. The variables are ordered by descending variable importance as measured by the average absolute SHAP value. The feature values are color-coded, with dark purple indicating high values and yellow indicating low values. (C) denotes client questions, and (CW) denotes caseworker questions.

When looking at Fig. 4b, which combines the two feature sets, we find that the caseworker’s belief is still the single most important predictor, followed by some long-term employment history and sex. Five of the ten most important indicators are from ERIQ, and, more importantly, these indicators are malleable, at least to some extent, in contrast to sex, age, and employment history.

Actively searching for a job

Recognizing the crucial importance of active job search as a necessary step towards obtaining employment, this section identifies key indicators that affect the probability of initiating job search.

Figure 5a underscores the strong connection between clients’ job search activities and their responses across the measured ERIQ indicators. The belief among clients in their capacity to handle a job (Job Performance) emerges as the most predictive factor, while the beliefs of the caseworker are also on the list of the most important factors. Moreover, the caseworker’s assessment of the degree of goal orientation of the client is important for predicting job search. Clients who improve their ability to cope with any health challenges are also more likely to start looking for a job. Furthermore, clients who are more aware of the opportunities available in the labor market in relation to their personal resources and challenges have a higher likelihood of searching for a job. Finally, factors such as everyday coping skills, work ideation, initiative, and ability to concentrate are good predictors of job search activity.

Fig. 5: SHAP values for predicting job search.

This figure illustrates the SHAP values for the ten most influential features. a displays results from a model exclusively utilizing ERIQ features, while b encompasses a model incorporating both ERIQ and administrative data. The variables are ordered by descending variable importance as measured by the average absolute SHAP value. The feature values are color-coded, with dark purple indicating high values and yellow indicating low values. (C) denotes client questions, and (CW) denotes caseworker questions.

In Fig. 5b it is observed that six of the ERIQ indicators are among the ten most important predictors when combining ERIQ and administrative data in the predictive model. The important predictors from register data are a specific geographic location, being female, age, and taking (or not) antidepressant medication.

Discussion

This study introduces a new tool, the Employment Readiness Indicator Questionnaire, ERIQ, which has very strong predictive properties when it comes to predicting crucial measures of employment readiness—such as job search and obtaining employment—for social assistance recipients who are not assessed to be immediately ready for work.

The insights obtained have potentially strong implications for caseworkers’ ability to assist clients further away from employment in their progression towards employment. Thus, it offers an intermediate target outcome that can be used for measuring progression towards employment, and it points to specific challenges experienced by the client or assessed by the caseworker, which—in contrast to information from administrative registers, such as age, sex, ethnicity, and labor market history—are to some extent malleable either by the client, the caseworker, or through appropriately tailored interventions. The insights gained from using ERIQ thus enable the tailoring of interventions to address the specific needs and challenges of individuals outside the labor market, rather than a one-size-fits-all approach, ultimately increasing the efficacy of such programs.

This tool also offers an intermediate target outcome on which to measure the impact/effectiveness of active interventions (labor market and other types of interventions) targeted at overcoming specific challenges. The knowledge gained from this study can thus serve as a compass for quality assurance and evaluation efforts, guiding the selection of indicators that are most important for job search initiation and successful job acquisition for a given client with a certain combination of challenges or disadvantages.

Moreover, the results underscore the instrumental role of caseworkers in guiding disadvantaged clients’ progress toward the labor market. Caseworkers serve as catalysts for clients’ successful transition out of long-term unemployment, as evidenced by the high importance of both client and caseworker indicators on predicted probabilities. This confirms the importance of promoting collaboration between caseworkers and clients to maximize the impact of reintegration efforts into employment.

In conclusion, ERIQ enables a deeper dive into the dynamics governing job search activities and employment outcomes among disadvantaged social assistance recipients and is directly applicable in PES offices.

The applicability of ERIQ’s predictive capabilities beyond the Danish context is yet to be established. Nevertheless, its active implementation in several PES offices in Sweden is an encouraging sign of potential cross-cultural utility. Considering the extensive volume of data within Denmark’s administrative registers, the noteworthy superiority of a tool utilizing ERIQ over one relying solely on administrative registers implies that this enhanced predictive performance may extend to other nations as well. This suggests a potential superiority of ERIQ over administrative data-based predictive tools in diverse international contexts.

Methods

This study utilizes a unique data set comprising self-reported progression surveys collected every three months from disadvantaged social assistance recipients, that is, social assistance recipients defined (by their caseworkers) to be not immediately ready for work but ready for activation, in 10 job centers across Denmark, along with their approximately 300 attached caseworkers. The project and data collection were conceived and organized by Væksthuset (the Greenhouse) and Væksthusets Research Centre (Footnote 3). The data spans a four-year period, from 2013 to 2016, with almost all clients entering the project in 2013. An essential aspect of this study is the ability to merge these survey responses with comprehensive data from administrative registers, which include detailed geographic and demographic information as well as very detailed weekly information on labor market status (employment, unemployment, and other income transfers, etc.), detailed educational information, and historical health and criminal records, for each recipient.

The integration of self-reported surveys with administrative data facilitates a comprehensive understanding of the recipients’ progression and the intricate factors that impact their transition into employment. By combining these two data sources, this study gains valuable insights into the dynamics shaping recipients’ trajectories and elucidates the correlates of successful employment outcomes. This comprehensive approach enables a more holistic exploration of the multifaceted factors that contribute to recipients’ transition from social assistance to employment, providing a robust foundation for evidence-based policy recommendations and interventions.

Sample description

The predictive model was analyzed using data collected from social assistance recipients assessed to be not ready for work in 10 municipalities across Denmark. The initial data set encompassed 15,818 unique responses from 5512 clients. Each response ideally consisted of both 11 questions posed to the client (the client questionnaire) and 11 questions posed to the caseworker (the caseworker questionnaire). These questionnaires were answered in connection with compulsory meetings held between caseworkers and clients at the PES. To ensure the reliability and accuracy of the data, we carefully filtered out responses where either the client or the caseworker had not answered the questionnaire. Additionally, we excluded observations with a gap of more than 6 months between the completion of two questionnaires. These data-cleaning steps resulted in a data set comprising 11,268 unique responses from 3697 clients.

To focus specifically on the study of progression, we further eliminated 1105 responses from clients who had only completed the survey once. As a result, the population included in the statistical analysis consisted of 10,163 observations from 2599 clients. It is important to note that whenever we analyze progression towards employment, the client’s initial answers as well as the change in the answers from the first survey to the current one are included in our assessments. Each client’s first response therefore serves as a baseline rather than as a separate observation, so that the data set for the final analyses comprised 7564 unique responses (10,163 − 2599) from 2599 clients.
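The construction of progression features can be sketched as follows. The record layout, the indicator names, and the 1–5 answer scale are hypothetical, and the sketch assumes that every response answers every indicator. It also assumes one particular reading of the counts above, namely that each client's first response is used only as the baseline, which is consistent with 10,163 − 2599 = 7564 but is not spelled out in the text.

```python
from typing import Dict, List


def progression_features(responses: List[dict]) -> List[dict]:
    """Pair each follow-up response with the client's first response.

    Each output row holds the initial answer and the change since the
    initial answer for every indicator. Clients with a single response
    contribute no rows.
    """
    by_client: Dict[str, List[dict]] = {}
    for r in sorted(responses, key=lambda r: (r["client_id"], r["wave"])):
        by_client.setdefault(r["client_id"], []).append(r)

    rows = []
    for client_id, history in by_client.items():
        if len(history) < 2:
            continue  # no progression can be measured from one response
        first = history[0]["answers"]
        for later in history[1:]:
            row = {"client_id": client_id, "wave": later["wave"]}
            for name, start in first.items():
                row[f"{name}_initial"] = start
                row[f"{name}_change"] = later["answers"][name] - start
            rows.append(row)
    return rows


# Invented toy responses (hypothetical indicators on a 1-5 scale).
toy_responses = [
    {"client_id": "A", "wave": 1, "answers": {"job_performance": 2, "health": 3}},
    {"client_id": "A", "wave": 2, "answers": {"job_performance": 3, "health": 3}},
    {"client_id": "A", "wave": 3, "answers": {"job_performance": 4, "health": 2}},
    {"client_id": "B", "wave": 1, "answers": {"job_performance": 1, "health": 1}},
    {"client_id": "C", "wave": 1, "answers": {"job_performance": 5, "health": 4}},
    {"client_id": "C", "wave": 2, "answers": {"job_performance": 4, "health": 5}},
]

toy_rows = progression_features(toy_responses)
print(len(toy_rows))
```

In the toy data, clients A and C supply five responses between them and yield three analysis rows, one fewer per client, which mirrors the reduction in the number of observations under the stated assumption.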

Supplementary Table A.1 presents information on the 2599 clients included in the analysis and their characteristics, measured at the time of the first meeting between the client and the caseworker. The clients were on average 39 years old, there was a small over-representation of women, and only 20 percent were married or stably living together with their partner. In general, the clients hold a low educational level, with 71 percent having high school or less as their highest educational degree. Their employment history over the previous 5 years was very unfavorable, i.e., they were employed on average 2% of the past two years and 10% of the past five years. They had a substantial use of social assistance, which they also received at the time of measurement in order to be included in the study. Interestingly, and in line with our expectations, clients also had high usage of prescription medication (especially painkillers, lifestyle medication, and antidepressants) and generally many contacts with the healthcare sector in terms of somatic and mental health diagnoses.

As is common in the literature, we further divide the complete set of ERIQ responses into two separate samples for modeling purposes: a training sample representing 75% of the data, utilized for model development, and a test sample encompassing the remaining 25%, used to evaluate model performance. To avoid overfitting issues, we randomize individuals based on their (anonymized) personal ID numbers, guaranteeing that no individuals appear in both the training and test samples. This method yields a training sample comprising 5675 responses from 1930 distinct participants, while the test sample contains 1889 responses from 672 different participants.
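A split of this kind, in which all responses from one person land on the same side, can be sketched as below. The function name, the seed, and the toy records are assumptions made for illustration. The sketch assigns 25% of clients (rather than exactly 25% of responses) to the test sample, so the response shares are only approximately 75/25.

```python
import random
from typing import List, Tuple


def split_by_client(
    responses: List[dict], test_share: float = 0.25, seed: int = 0
) -> Tuple[List[dict], List[dict]]:
    """Randomly assign whole clients to the training or the test sample."""
    client_ids = sorted({r["client_id"] for r in responses})
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(client_ids)
    n_test = round(len(client_ids) * test_share)
    test_ids = set(client_ids[:n_test])
    train = [r for r in responses if r["client_id"] not in test_ids]
    test = [r for r in responses if r["client_id"] in test_ids]
    return train, test


# Invented toy data: 8 clients with 1 to 3 responses each (15 in total).
toy_responses = [
    {"client_id": c, "wave": w + 1}
    for c in range(8)
    for w in range(1 + c % 3)
]

toy_train, toy_test = split_by_client(toy_responses)
toy_train_ids = {r["client_id"] for r in toy_train}
toy_test_ids = {r["client_id"] for r in toy_test}
print(len(toy_train), len(toy_test), toy_train_ids.isdisjoint(toy_test_ids))
```

Splitting on the person rather than on the response matters because repeated responses from one client are highly correlated; if they were spread across both samples, test performance would tend to be overstated.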

Outcomes and feature sets

Outcomes

The primary objective of this project is to predict two outcomes: the transition into employment within a year after answering the questionnaire, and the initiation of active job search. As secondary outcomes, we also consider transitions into educational programs in the ordinary educational system within a year (although ERIQ was not developed with this transition in mind), since for social assistance recipients below 30 without qualifying education, it is a major aim for the PES to help them into the educational system. We also consider the transition into either employment or education within a year as a secondary outcome.

Table 2 illustrates that only 9% of the sample successfully made a transition into either employment or education, with 6% entering employment and the remainder entering education. These figures underscore the main challenge faced when investigating the progression from social assistance towards employment, as a significant portion of the recipients are very distant from the labor market.

Table 2 Descriptive statistics for outcomes.

Job search is captured by a dummy variable, taken from ERIQ, that indicates whether the client is actively applying for jobs. Table 2 demonstrates that job application prevalence is substantially higher than the transition rates, with 27% of the sample actively searching for jobs. We approach this measure from two perspectives. First, we examine whether applying for jobs serves as a viable intermediate goal toward the long-term objective of leaving unemployment altogether by including this dummy in the model for the transition into employment (and education). Second, we explore whether ERIQ can predict the likelihood of applying for jobs and, consequently, enhance the probability of successful reemployment in the long run. By investigating both of these angles, we hope to uncover valuable insights to support individuals in their progression from social assistance towards employment.

Feature sets

We construct two distinct feature sets for our analysis. The first feature set (referred to as the “Admin” feature set) comprises a comprehensive range of characteristics of social assistance recipients, extracted from Statistics Denmark’s administrative registers, which integrate population-wide data from all public databases. This data set includes demographic variables such as sex, age, ethnicity, cohabitation status, municipality of residence, and educational level, alongside detailed records on employment history, income, health status, medical diagnoses, and criminal history. Additionally, it captures information on social benefits, disability support, and housing conditions, offering a robust foundation for analyzing labor market trajectories. An exhaustive list of variables included in the Admin feature set is provided in Panel B of Supplementary Table A.1 in the Supplementary Information. The Admin feature set provides information on the participants receiving social assistance, representing characteristics that are often challenging, if not impossible, to change.

The second feature set is the ERIQ (referred to as the “ERIQ” feature set). It contains all the information obtained from the two questionnaires (one for clients and one for caseworkers). Social assistance recipients participating in ERIQ are queried approximately every three months during compulsory meetings at the PES, where they respond to a set of questions about their personal experiences. These questions cover various aspects, including social networks, coping strategies in daily life, health management, and knowledge about opportunities in the labor market, as well as job search strategies. Additionally, the caseworkers are asked to evaluate the same social assistance recipients at the same meetings, using a set of indicators, some of which overlap with the participant’s indicators, while others explore additional dimensions, such as concentration ability and the caseworker’s belief in the participant’s potential for employment. The selection of questions for both the participants and caseworkers was based on a comprehensive literature review (Væksthuset and NewInsight, 2012), aiming to identify employment readiness indicators that are malleable. The selected indicators are summarized in Table 3, while the full set of questions is available in Supplementary Table A.2. For descriptive statistics, please refer to Panel A in Supplementary Table A.1.

Table 3 ERIQ indicators.

Finally, we combine the two feature sets into a third feature set (“Admin + ERIQ”) to investigate whether the information contained in both sets complements each other, resulting in improved predictions. Alternatively, if no significant improvement is observed, it may suggest that one of the sets is more influential in the prediction process.

Prediction models

Following Rosholm et al. (2024), we employ four different machine learning methods of varying complexity to predict the primary and secondary outcomes. Importantly, all four models are implemented using the same sample splits and data, ensuring the model predictions are directly comparable.

Linear probability model

First, we consider a linear probability model (LPM) estimated using ordinary least squares. This model offers the advantage of being straightforward and interpretable, allowing us to determine the influence of each variable by examining the regression coefficients. However, the disadvantage of the LPM lies in its simplicity, as it only captures linear relationships in the data and assigns non-zero weight to all variables in the feature set, which increases the risk of overfitting.

Logistic regression model with LASSO

The second model combines the Least Absolute Shrinkage and Selection Operator (LASSO) (Tibshirani, 1996) with a logistic regression framework. This hybrid approach is well-suited for handling binary outcome variables and provides both variable selection and regularization, enhancing the precision of the predictions. To determine the optimal size of the regularization parameter λ, we employ five-fold cross-validation. Specifically, we select the value of λ that maximizes the cross-validated AUC-ROC (see below).
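For reference, the LASSO-penalized logistic regression can be written as the following minimization problem. The notation is introduced here for illustration and is not taken from the study: N is the number of training responses, y_i ∈ {0, 1} the outcome, x_i the vector of p features, β_0 the intercept, and β the coefficient vector.

```latex
\min_{\beta_0,\,\beta}\;
  -\frac{1}{N}\sum_{i=1}^{N}
    \Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
          - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

The first term is the negative average log-likelihood of the logistic model; the second is the L1 penalty, which shrinks coefficients and sets some of them exactly to zero as λ grows.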

To implement this model, we utilize the glmnet R package, and following the package authors’ recommendations, we standardize all variables to have a mean of zero and a standard deviation of one. This standardization ensures that the penalty treats all variables on a comparable scale and stabilizes the model’s performance.

Random forest model

The third model is a random forest model, initially introduced by Breiman (2001), which employs bagging as an ensemble learning technique. Bagging trains many individual decision trees, each on a different bootstrap sample of the data, and the trees can be grown in parallel. Additionally, random forest models consider only a random subset of the explanatory variables at each split, which decorrelates the trees and significantly reduces the risk of overfitting.

For the implementation of the random forest algorithm, we utilize the ranger R package (Wright and Ziegler, 2017). To optimize the model’s predictive performance, we tune two critical hyperparameters: the number of variables considered at each node (mtry) and the minimal node size (min.node.size). We employ Bayesian optimization to identify the hyperparameter configuration that maximizes the AUC-ROC under five-fold cross-validation. The forest consists of 1,000 trees.

Extreme gradient boosting model

The final and most complex predictive model is the extreme gradient boosting (XGBoost) model (Chen and Guestrin, 2016). This method uses boosting as an ensemble learning technique. Boosting combines weak models iteratively, focusing on correcting errors made by previous models, to create a strong predictive model. The XGBoost algorithm effectively handles nonlinear relationships in the data and mitigates overfitting through regularization and pruning.

To estimate the XGBoost model, we utilize the xgboost R package and fine-tune its performance by optimizing seven hyperparameters through Bayesian optimization. In accordance with the xgboost package’s terminology, we explore the following hyperparameters: max.depth, eta, gamma, subsample, colsample_bytree, colsample_bynode, and min_child_weight. Specifically, we search for the hyperparameter configurations that yield the highest AUC-ROC in the training sample.

Performance metrics

The predictive models we consider yield the probability of the transition into employment (or one of the other outcomes). To assess their performance across the different feature sets, we employ AUC-ROC and AUC-PR as performance metrics.

The ROC curve plots the true positive rate of the predictive model against its false positive rate for each decision threshold from 0 to 1. The AUC-ROC equals the probability that the model assigns a higher predicted probability of transition into employment to a randomly chosen positive case (i.e., an individual actually finding employment) than to a randomly chosen negative case (i.e., an individual not finding employment). It is essential to note that a fully random prediction would yield an AUC-ROC of 50%.
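The pairwise interpretation of the AUC-ROC can be made concrete with a minimal Python sketch. The function `auc_roc` and the toy labels and scores are hypothetical and serve only as an illustration; the function counts, over all (positive, negative) pairs, how often the positive case receives the higher score, with ties counted as one half.

```python
def auc_roc(y_true, scores):
    """Share of (positive, negative) pairs in which the positive case is ranked higher."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0   # ties count as half a win
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 2 individuals who found employment, 4 who did not.
y_true = [1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]

auc = auc_roc(y_true, scores)                       # 7 of 8 pairs ranked correctly
auc_uninformative = auc_roc(y_true, [0.5] * 6)      # identical scores for everyone
print(auc, auc_uninformative)
```

In the toy data, seven of the eight pairs are ordered correctly, while a model that gives everyone the same score obtains exactly one half.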

In binary classification, the precision of a classifier is the ratio of true positives to the total number of predicted positives (true positives plus false positives), while recall corresponds to the true positive rate (true positives divided by the sum of true positives and false negatives). By adjusting the threshold between zero and one for a given prediction model, we can plot the precision-recall curve, and the area under this curve (AUC-PR) can be calculated. An optimal model would have an AUC-PR value of one, indicating perfect precision and recall, while random guessing yields a score equal to the proportion of positives in the data (in our case, 5.8% for employment). Higher AUC-PR values indicate better model performance for a specific data set, but it is crucial to compare them to the prevalence of the outcome in the data. Therefore, direct comparison of AUC-PRs between different data sets or outcomes should be avoided, as their interpretation is specific to the characteristics of each data set. Comparisons between different feature sets and model specifications for the same outcome and data set, however, remain valid.
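A minimal Python sketch illustrates how an area under the precision-recall curve can be computed and compared with the prevalence baseline. It uses average precision, one common step-wise estimator of the AUC-PR; this choice is an assumption of the sketch, and the function name and toy data are hypothetical.

```python
def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive case.

    This is a step-wise estimate of the area under the precision-recall curve.
    Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank   # precision when recall increases
    return total / n_pos

# Hypothetical toy data: 2 individuals who found employment, 4 who did not.
y_true = [1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]

ap = average_precision(y_true, scores)
prevalence = sum(y_true) / len(y_true)   # reference level for random guessing
print(f"average precision = {ap:.3f}, prevalence baseline = {prevalence:.3f}")
```

The gap between the estimated area and the prevalence, rather than the area itself, is what indicates how much the model improves on random guessing for a given outcome.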

The AUC-PR has a particular advantage in the context of highly imbalanced data, as in the present case, where the fraction of negatives is significantly larger than the fraction of positives (Saito and Rehmsmeier, 2015). In the ROC approach, equal importance is given to predicting both negative and positive instances correctly, which might lead to a high AUC-ROC score even when the model exhibits a significant number of false positive predictions. This is more likely to happen in severely imbalanced data sets, where true negatives vastly outnumber false positives, so that the false positive rate remains low. However, because the AUC-PR focuses on how well the model predicts the positives (i.e., movement into employment), the number of correctly predicted negatives becomes irrelevant.
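A small worked example in Python shows why the two metrics can diverge under imbalance. All counts are hypothetical: 1,000 individuals, of whom 58 find employment (matching the 5.8% prevalence reported for employment), and a model that flags 150 individuals at some threshold.

```python
# Hypothetical confusion matrix at one decision threshold.
tp, fp = 29, 121   # flagged individuals: correctly / incorrectly flagged
fn, tn = 29, 821   # non-flagged individuals: missed positives / correct negatives

true_positive_rate = tp / (tp + fn)    # recall, the vertical axis of the ROC curve
false_positive_rate = fp / (fp + tn)   # the horizontal axis of the ROC curve
precision = tp / (tp + fp)             # the vertical axis of the PR curve

print(
    f"TPR = {true_positive_rate:.2f}, "
    f"FPR = {false_positive_rate:.2f}, "
    f"precision = {precision:.2f}"
)
```

With these counts, half of the positives are found at a false positive rate of roughly 0.13, which looks favorable on a ROC curve, yet fewer than one in five flagged individuals actually finds employment. The large number of true negatives keeps the false positive rate low but does not enter the precision at all.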

In the context of transitions from social assistance to employment, it is crucial to study how well a predictive model can identify positive outcomes. Therefore, focusing on precision and recall allows us to address this aspect effectively, ensuring that the model’s performance is assessed based on its ability to predict positive outcomes accurately.

Explaining predictions

To elucidate the influence of different variables, including interactions between them, on the outcomes of interest, we employ Shapley additive explanation (SHAP) values (Lundberg et al., 2020; Lundberg and Lee, 2017). SHAP values offer a model-agnostic approach to unravel the underlying factors shaping the predicted probabilities of the transition out of unemployment.

By utilizing SHAP values, we can gain insights into how predictive models make specific predictions for each individual in the dataset. These values provide a measure of the contribution of each variable in each feature set to the final prediction. A SHAP value for a variable expresses how much the information in that variable shifts the model’s prediction. In other words, SHAP values illustrate how the values of individual variables move the prediction away from the average prediction of the outcome while accounting for correlations between variables. For comparison, in a linear regression model where variables are uncorrelated and there are no interactions, the SHAP value of a variable equals its regression coefficient multiplied by the deviation of the variable from its mean.
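The link to linear regression can be checked numerically with a minimal Python sketch. It computes exact Shapley values by averaging marginal contributions over all orderings of the variables, holding variables that have not yet been "revealed" at their mean, which is appropriate only under the independence assumption made here. The coefficients, means, and the individual's values are hypothetical, and the brute-force enumeration is for illustration only; it is not how SHAP libraries compute the values in practice.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values via enumeration of all feature orderings.

    Features not yet revealed are held at their baseline (mean) value,
    which assumes independent features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]                 # reveal feature i
            updated = predict(current)
            phi[i] += updated - previous      # marginal contribution of feature i
            previous = updated
    return [value / len(orderings) for value in phi]

# Hypothetical linear model with three features.
beta0, beta = 0.05, [0.10, -0.04, 0.02]
means = [0.5, 2.0, 1.0]        # hypothetical feature means
x = [1.0, 3.0, 0.0]            # feature values of one hypothetical individual

def linear_model(values):
    return beta0 + sum(b * v for b, v in zip(beta, values))

phi = shapley_values(linear_model, x, means)
coefficient_times_deviation = [b * (xi - m) for b, xi, m in zip(beta, x, means)]
print(phi)
print(coefficient_times_deviation)
```

The two printed lists agree up to floating-point error, and the Shapley values sum to the difference between the individual's prediction and the prediction at the mean, which is the additivity property that gives SHAP its name.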

The adoption of SHAP values enhances the interpretability and transparency of predictive models, enabling a deeper understanding of the factors influencing the outcome of interest. The insights gleaned from SHAP values facilitate tailored interventions and evidence-based policy decisions aimed at adaptable variables, thus potentially contributing to higher employment rates in the long run and increasing the well-being of social assistance recipients not ready for work.