Abstract
Personality is predictive of many behaviors, but personality questionnaires cannot be readily administered to persons of interest. The language people use to express themselves can often predict personality and so should, in theory, provide a surrogate marker for predicting behavior. We used social media (Twitter) language from a sample of 252 NBA players to estimate their Five Factor personality scores, and then used these scores to try to predict on-court transgressive behavior. A machine learning model was able to predict players’ tendency to commit technical fouls (predictive performance: r = .18), with the most important contributors to the model including neuroticism, extraversion, and conscientiousness. These findings show that personality can predict individual choices and behaviors in specific contexts; furthermore, by assessing the degree to which our digital footprint can be used to derive actionable predictions of behavior, the current findings could inform discussions concerning regulations of data privacy.
Introduction
The prediction of human behavior has long been a central goal in psychology. Early research relied on controlled experiments and self-report measures to identify stable traits and cognitive mechanisms. While these methods offered foundational insights, they often struggled to capture how behavior unfolds in real-world settings. The rise of large-scale data collection has transformed this landscape, enabling researchers to infer behavioral patterns from digital footprints—ranging from online interactions to biometric signals. Advances in machine learning (ML) have further accelerated this shift, allowing for the prediction of human behavior from vast and diverse datasets.
Today, behavior prediction models draw on an expanding range of raw data. Social media activity has been used to predict mental health conditions1, political preferences2, and individual-level income3. Wearable sensors have helped track mood fluctuations4, and deep learning methods have enabled sophisticated forecasts in domains such as economics and financial decision-making5,6. These approaches highlight the growing power of data-driven modeling in capturing human behavior.
However, many of these models function as black boxes, identifying statistical patterns without clarifying the psychological mechanisms that underlie them. This raises concerns about causal validity and interpretability. For example, a model might predict depression from social media posts based on surface-level language features rather than meaningful cognitive or emotional markers. Additionally, models trained on narrow or biased datasets often fail to generalize across populations or contexts. For instance, criminal recidivism algorithms have been shown to reproduce historical biases7, and it is possible that different psychological inferences from online text reflect platform-specific norms instead of general individual characteristics. More fundamentally, accurate prediction is not equivalent to understanding. In behavioral science, prediction alone is not enough. Explanatory insight—understanding why someone behaves as they do—is equally important. Addressing this requires models that are not only statistically robust but also theoretically grounded.
One way to enhance interpretability is to treat psychological constructs as intermediary layers between raw data and behavioral outcomes. Instead of predicting behavior directly from digital traces, models can first infer traits such as personality and then use those traits to explain and forecast behavior. Among such constructs, the Big Five personality model8 is especially well-suited for this role. Personality shapes decision-making, emotional regulation, and social behavior, and its relative stability across time and contexts makes it a compelling bridge between raw data and psychological theory.
The big five and behavior prediction
There is a significant body of research demonstrating that the richness of people’s personalities can be effectively captured using five dimensions, the so-called Big Five factors: Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness to Experience8. The Big Five model is widely accepted and has been applied in a variety of contexts, such as clinical settings9, decision making10, vocational psychology11, and health psychology12,13 among others.
There has been a longstanding debate concerning the extent to which the Big Five model can accurately predict behavior. The dispositional perspective argues behavior stems from stable traits14,15, while the situationist view emphasizes contextual influence16,17,18,19,20,21. For example, while people typically construe helping behavior as dependent on people’s enduring personal traits, research in social psychology has demonstrated that context-specific behaviors such as giving money to charity or helping a stranger are dependent on specific situational properties such as whether the potential helper is late for work22. In light of this, much research has attempted to empirically evaluate the extent to which personality models are indeed predictive of behavior23,24,25,26,27,28,29.
Ample research has established that personality is indeed a reliable predictor of feelings, beliefs, and attitudes9,30,31. For instance, previous studies have shown that personality predicts political ideology32, happiness and subjective well-being33,34, the quality of romantic relationships35,36, job satisfaction37, and numerous other phenomena38. However, measures of political attitudes, well-being, job satisfaction, and others, similar to personality questionnaires, rely on self-report and are therefore prone to biases such as the social desirability bias39, self-knowledge limitation40, and other biases41,42, which may compromise their relation to real-world behavior.
Nonetheless, there is also research that indicates that personality is predictive of real-world significant life outcomes, which are not subject to self-report related biases38,43,44. For example, personality has been shown to predict objective measures of career success43. Personality has also been reported to predict other directly measured significant life outcomes such as academic performance45, divorce44, and mortality or longevity46,47.
Can personality predict specific behaviors in specific contexts?
While it is well-established that personality is a significant factor in human life, perhaps the most effective way to evaluate the dispositionist versus situationist perspective is to assess the extent to which personality can predict individual choices and behaviors in specific contexts, rather than examining the overall effects of personality on broad life outcomes.
Life outcomes result from the accumulation and integration of many behaviors over time. For example, a successful career is built on an array of behaviors including excellence in the workplace, networking, seizing opportunities, and continuous learning and development. While research on general life outcomes resulting from multiple behaviors23 can provide valuable insights, it does not necessarily inform our understanding of whether personality is a predictor of specific responses to specific situations in the real world.
Several studies demonstrate the potential for predicting situation-specific behavior. For instance, research has shown that a person’s personality can predict their likelihood of clicking on advertisements48 and that measures of narcissism can predict a tendency for specific behaviors such as skipping class49. However, relatively little research has demonstrated the ability of personality measures to predict behaviors that (i) are concrete and context-specific, (ii) occur in the real world, and (iii) are consequential.
As a result, our understanding of the magnitude and boundaries of the effects of the situation and personality is somewhat limited. It is possible, for instance, that in situations where the stakes are high (e.g., investment decisions), the power of the situation overshadows any variability in behavior caused by personality. Similarly, it is possible that in highly professional and practiced settings (e.g., military), individual differences do not prove meaningful in predicting concrete behaviors. In light of this, the present study aims to focus on a unique case of real-world, context-specific behavior among highly trained individuals in high-stakes situations.
Prediction of transgressive behavior
One of the most consequential (and potentially problematic) contexts that can serve as a test case for the power of personality is the domain of predicting transgressive behavior. Previous research has demonstrated that various factorial models of personality, such as the Big Five or HEXACO, can explain some forms of transgressive behavior.
For example, Agreeableness and Conscientiousness, as measured by different five-dimension models, and the Honesty-Humility dimension in the HEXACO scale, were negatively associated with self-report measures of workplace delinquency, such as vandalism and theft, in the Workplace Behavior Questionnaire50. Similarly, studies using various factorial models of personality, including the Big Five or HEXACO, have found a negative association between Agreeableness and Conscientiousness and anti-social behavior (ASB), while a small but significant positive association has been found between Neuroticism and ASB51. However, these studies may be subject to biases due to their reliance on self-report measures; moreover, they did not investigate concrete and context-specific behaviors.
The potential for predicting transgressive behavior presents a complex ethical quandary, as the benefits of early prediction of such behaviors must be weighed against potential drawbacks. For instance, the use of predictive capabilities may result in discrimination against minorities, as demonstrated by machine learning algorithms that have consistently predicted that African-Americans are more likely to exhibit future criminal behavior7. Moreover, the use of predictive algorithms in these contexts raises concerns regarding the possibility of self-fulfilling prophecies, where individuals labeled as at-risk may be treated differently and consequently engage in predicted behaviors.
These ethical concerns are exacerbated in light of recent research showing that people’s behavior and personality are reflected in our digital footprint and can be predicted; for example, personality can be predicted from Facebook likes2 or online language52,53,54,55. There is a growing apprehension that such models could be used for the prediction of such sensitive outcomes as delinquency, thereby compromising individuals’ freedoms. At the same time, there are claims that given the limited predictive validity of personality, the fears may be overblown56.
Professional sports as a test case for prediction of transgressive behavior
When seeking real-world examples in which the context is expected to shape individuals’ behavior, such as high-stakes situations in which individuals display a high degree of professionalism, a valuable case study is that of high-profile professional athletes. These individuals invest years honing the necessary skills for their sport and are highly practiced in how to respond to specific scenarios during competition. Professional athletes have a lot at stake in terms of prestige, social status, and potentially millions of dollars in salary. The intersection of situational pressure and training to respond to specific scenarios in professional athletics presents a prime example of how the context can supersede personality influences on behavior and serves as a test case for examining the extent to which personality can predict behavior despite these constraints.
In the context of sports, explicit, well-established rules dictate both acceptable and unacceptable behavior. Some actions, such as overly aggressive or disrespectful behavior, are prohibited in most sports, and these prohibitions are strictly enforced. Violating these rules often results in significant penalties for the athlete and their team, including loss of points, advantages for the opposing team, and even exclusion from participation. These offenses are commonly registered as technical fouls. Despite the high cost to the athlete and their team, technical fouls are not uncommon in the heat of high-stakes competition, particularly in team sports.
The occurrence of technical fouls in professional basketball players thus offers a tangible measure of “transgressive” behavior likely influenced by lapses in anger management. Given the sensitivity of predicting delinquency in the real-world, technical fouls offer us a suitable “toy model”; namely, a benign context that can nonetheless inform the debate on the role of personality in predicting a specific behavior in a specific real-world context.
The current research
In the current work, we focused on the context of high-profile professional basketball players from the National Basketball Association (NBA). Specifically, we used the language those players used on a social media platform (i.e., Twitter) to predict their five personality dimensions according to the Big Five approach. Then, we utilized those scores to predict “transgressive” behavior of the players (i.e., technical fouls). To ensure we provide an unbiased assessment of predictive accuracy, we adhered to a pre-registered analysis plan.
Method
Case selection
To study the extent to which personality scores inferred from social media language can predict a consequential, context-specific, real-world behavior, we chose to focus on technical fouls in professional basketball games. The case of technical fouls in professional basketball provides an especially informative test case. In most team sports, rule violations related to player conduct vary in their nature and intent. In basketball, technical fouls are predominantly the outcome of momentary loss of composure, such as arguing with referees, excessive celebration, or unsportsmanlike conduct; these infractions do not typically affect the flow of play but reflect lapses in self-regulation. In contrast, other sports, such as football (soccer), exhibit disciplinary actions that are often directly tied to gameplay. While red cards can be given for unsporting behavior (e.g., dissent toward officials), they are more frequently issued for dangerous tackles, denial of a goal-scoring opportunity, or violent conduct, all infractions integral to the physical and strategic nature of the game. Similarly, in hockey and American football, penalties for excessive aggression or unsportsmanlike behavior exist but are often intertwined with the physical demands of the sport. Given these distinctions, we focused on basketball as a test case due to the clearer separation between technical fouls and personal fouls. This delineation allows for a more precise examination of personality-driven transgressive behaviors, as technical fouls provide a distinct measure of impulsive reactions rather than strategic or gameplay-related actions.
Participants
The final sample included 252 NBA players who played in the league in 2019–2020 and/or 2018–2019. We excluded players who were not English speakers and/or did not have enough tweets on their Twitter accounts. We also removed players who did not play at least ten games during a given season and at least 15 min per game.
Tools
We used Twitter API access, which allowed us to inspect and download the NBA players’ tweets. Using this tool, we could access each player’s last 3200 published tweets using his username. We used the widely used “rtweet” package in R to extract data about tweets. We collected the NBA players’ performance statistics from the official NBA website.
Technical fouls
To measure how a given player can control himself, remain calm during the game, and follow the rules, we collected the average number of Technical Fouls a player committed from 2012 to 2021 in the regular season. A Technical Foul is a violation of the game’s rules committed by a player representing unsportsmanlike conduct (e.g., arguing with the referee, profanity, and behaving in an overly aggressive manner).
Big-five personality scores
To prepare the data for the NLP model, we took several steps: we erased repetitive sentences; by checking each tweet’s published source, we eliminated tweets posted by automatic applications, which were likely for public relations; and we removed phrases that came after the symbols “#” or “@” because most of the time they involved “catchphrases” used by the players. After this cleaning process, we removed players with fewer than 500 English words in all their available tweets or with insufficient/non-English data. To estimate the personality dimensions for the NBA players from the tweets we collected, we relied on a model57 trained on Facebook data that has been shown to successfully predict personality dimensions (average predictive performance across dimensions of r ~ .38). We estimated the Big Five personality scores for each player by the above procedure and used them as our independent measures.
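The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline (which used R): the tweet layout (`text` and `source` keys), the `HUMAN_SOURCES` allow-list, and both function names are assumptions, and "phrases that came after # or @" is read here as the single hashtag or mention token. Only the 500-word threshold comes from the text.

```python
import re

# Assumed allow-list of posting sources treated as typed by the player;
# the actual criteria for "automatic applications" are not given in the text.
HUMAN_SOURCES = {"Twitter for iPhone", "Twitter for Android", "Twitter Web App"}
MIN_WORDS = 500  # threshold stated in the text

# Assumption: a "#..." or "@..." phrase is a single hashtag/mention token.
TAG_PATTERN = re.compile(r"[#@]\w+")


def clean_tweets(tweets):
    """Return cleaned tweet texts from a list of {'text', 'source'} dicts."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        if tweet["source"] not in HUMAN_SOURCES:
            continue  # drop tweets posted by automated tools
        text = TAG_PATTERN.sub("", tweet["text"])
        text = " ".join(text.split())  # collapse leftover whitespace
        key = text.lower()
        if not text or key in seen:
            continue  # drop empty and repeated texts
        seen.add(key)
        cleaned.append(text)
    return cleaned


def has_enough_words(cleaned, min_words=MIN_WORDS):
    """True if a player's cleaned tweets contain at least min_words words."""
    return sum(len(text.split()) for text in cleaned) >= min_words
```

A player would be retained only if `has_enough_words(clean_tweets(player_tweets))` is true; language detection for the non-English filter is left out of this sketch.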
Pre-registered analysis plan
Under our estimate that four out of five personality dimensions could contribute to technical foul behavior, we examined whether personality traits as a whole can predict technical fouls. We used two models to predict Technical Fouls based on personality. To avoid procedural overfitting, we pre-registered two specific models and used Bonferroni correction to examine the statistical significance of the prediction.
Model 1—Zero-inflated negative binomial count model
Because a large proportion of the players did not receive any technical fouls, the data is somewhat zero-inflated. Therefore, we fitted a zero-inflated count model with total minutes as an offset variable. This mixture model included a negative-binomial count model - accounting for any possible over/under-dispersion in the data - and a binomial model accounting for excess of 0s. Both the binomial and negative-binomial components were modeled with the same predictors.
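For reference, a zero-inflated negative binomial model of this kind is commonly written as below. The notation is an assumption introduced here rather than taken from the source: $Y_i$ is player $i$'s technical foul count, $t_i$ his total minutes (the offset), $\mathbf{x}_i$ his predictor vector including an intercept, $\pi_i$ the excess-zero probability, $\mu_i$ the negative-binomial mean, and $\theta$ the dispersion parameter. Placing the offset only in the count component is also an assumption.

```latex
\begin{aligned}
P(Y_i = 0) &= \pi_i + (1-\pi_i)\left(\frac{\theta}{\theta+\mu_i}\right)^{\theta},\\
P(Y_i = y) &= (1-\pi_i)\,\frac{\Gamma(y+\theta)}{\Gamma(\theta)\,y!}
  \left(\frac{\theta}{\theta+\mu_i}\right)^{\theta}
  \left(\frac{\mu_i}{\theta+\mu_i}\right)^{y},\qquad y = 1, 2, \dots,\\
\log \mu_i &= \log t_i + \mathbf{x}_i^{\top}\boldsymbol{\beta},\qquad
\operatorname{logit}\pi_i = \mathbf{x}_i^{\top}\boldsymbol{\gamma}.
\end{aligned}
```

Using the same predictors in both components, as the text describes, corresponds to the same $\mathbf{x}_i$ appearing in both linear predictors with separate coefficient vectors $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}$.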
Model 2—SVM with radial basis function model
Model 1’s advantages are that it is less prone to overfitting due to the simplicity of its linear form and that its coefficients are more interpretable. However, it cannot detect potential non-linear regularities in the data (e.g., perhaps both low and high, but not middling, Neuroticism might predict an increase in technical fouls; alternatively, high Neuroticism might predict technical fouls only when Agreeableness is low). As such, we also attempted to fit a non-linear model. We used support vector regression with a radial basis function. For this model, we normalized the dependent variable to reflect Technical Fouls per minute. We applied a nested Cross-Validation (CV) procedure for optimizing model parameters and hyperparameters: selection of the C (cost) and Sigma hyperparameters was made with a six-fold CV on the training set (N-1 observations). The selected hyperparameters were used to train a model on the entire training set, which was then used to generate a Leave-One-Out (LOO) prediction for each of the N held-out observations.
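The nested procedure described above can be sketched as follows, assuming Python with NumPy and scikit-learn rather than the R tooling the study used. The function name, the hyperparameter grid, the scoring metric, and the synthetic data are all illustrative assumptions; scikit-learn's `gamma` is used here in the role of the kernel-width hyperparameter the text calls Sigma.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, LeaveOneOut
from sklearn.svm import SVR


def nested_loo_predictions(X, y, param_grid, inner_folds=6, seed=0):
    """Leave-one-out predictions with hyperparameters tuned on each training set."""
    preds = np.empty(len(y), dtype=float)
    for train_idx, test_idx in LeaveOneOut().split(X):
        # Inner loop: six-fold CV over the N-1 training observations only.
        inner_cv = KFold(n_splits=inner_folds, shuffle=True, random_state=seed)
        search = GridSearchCV(
            SVR(kernel="rbf"),
            param_grid,
            cv=inner_cv,
            scoring="neg_mean_squared_error",  # assumed selection criterion
        )
        # refit=True (the default) retrains the best setting on the full training set.
        search.fit(X[train_idx], y[train_idx])
        # Outer loop: predict the single held-out observation.
        preds[test_idx] = search.predict(X[test_idx])
    return preds


# Synthetic stand-in data: 30 "players", 5 trait scores, a non-linear outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = 0.5 * X[:, 0] ** 2 + rng.normal(scale=0.1, size=30)
grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}  # illustrative grid
preds = nested_loo_predictions(X, y, grid)
```

Because the held-out observation never enters the inner search, the hyperparameter choice cannot leak information about it into its own prediction.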
Model evaluation
We used a leave-one-participant-out cross-validation scheme (LOO-CV) to evaluate our models. Each model was trained on N-1 players, and a prediction was generated for the held-out observation (repeated for all N players). The models thereby produced an unbiased estimate of predictive performance. Each model’s predictive performance was assessed by the significance of a simple correlation between the predicted and actual values for players, using a corrected threshold of p < .05/2 = 0.025, one-tailed.
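One common way to obtain a one-tailed p-value for a Pearson correlation is the t-test with N-2 degrees of freedom; whether the authors used exactly this test is an assumption. The sketch below uses only the standard library, integrating the t density numerically with Simpson's rule. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def one_tailed_p(r, n, steps=2000):
    """P(T >= t) for H1: rho > 0, where t = r * sqrt((n - 2) / (1 - r^2))."""
    df = n - 2
    t_stat = r * math.sqrt(df / (1 - r * r))
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t_stat >= 0 else 0.5 + area


ALPHA = 0.05 / 2  # Bonferroni-corrected threshold for two pre-registered models
p_value = one_tailed_p(0.184, 252)  # r and N as reported in this paper
is_significant = p_value < ALPHA
```

With the values reported in this paper (r = .184, N = 252), this yields a p-value of roughly .002, below the corrected threshold of .025.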
Ethics approval and consent to participate
This study was conducted in accordance with the relevant guidelines and regulations. Ethical approval for the study was obtained from the Ethics Committee of Tel Aviv University. As the dataset generated and analyzed during the current study is not publicly available due to privacy concerns regarding the participants, and no personal information was collected, the Ethics Committee of Tel Aviv University waived the need to obtain informed consent.
Results
Pre-registered analysis
The zero-inflated negative binomial count model was unable to yield accurate predictions, r = .05, p = .21. However, the results of the SVM model were significant: the correlation between a player’s predicted technical fouls per minute and his actual data was in the small-to-medium range, r = .184, p = .002, which translates to a non-negligible Binomial Effect Size Display (BESD) of 59.2:40.8 (see Table 1).
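The Binomial Effect Size Display re-expresses a correlation r as a pair of hypothetical "success rates", 0.5 + r/2 and 0.5 - r/2. A minimal sketch (the function name `besd` is introduced here for illustration):

```python
def besd(r):
    """Return the (higher, lower) success rates implied by correlation r."""
    return 0.5 + r / 2, 0.5 - r / 2


high, low = besd(0.184)
print(f"{high:.1%} : {low:.1%}")  # prints "59.2% : 40.8%"
```

Note that the two numbers are rates that sum to 100%, not an odds ratio.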
Exploratory data analysis
We further investigated the contribution of each personality factor. To do so, we created five SVM models with solely each one of the factors and five models with all but one of the factors. The results of these analyses are presented in Table 1. As can be seen, the personality dimensions that contributed to the prediction of technical fouls were Agreeableness, Neuroticism, and Conscientiousness.
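The ten predictor sets used in this exploratory analysis (five single-factor sets and five leave-one-factor-out sets) can be enumerated as in the sketch below. The function name and dictionary keys are hypothetical; each set would then be passed through the same SVM procedure as the full model.

```python
FACTORS = [
    "Extraversion",
    "Agreeableness",
    "Conscientiousness",
    "Neuroticism",
    "Openness",
]


def ablation_feature_sets(factors):
    """Map a label to the list of predictors used in each exploratory model."""
    single = {f"only_{f}": [f] for f in factors}
    leave_one_out = {
        f"without_{f}": [g for g in factors if g != f] for f in factors
    }
    return {**single, **leave_one_out}


feature_sets = ablation_feature_sets(FACTORS)
```

A factor that matters should predict on its own and should reduce performance when omitted, which is why both views are examined.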
Unlike a linear regression model, radial support vector regression does not yield easily interpretable coefficients. To get a qualitative sense of the directionality of the effect of personality dimensions, we looked at the Technical Fouls per minute for the top and bottom 25% of each personality dimension. The observed trends were consistent with theory, such that individuals with high Agreeableness had nominally fewer technical fouls (M = 0.8907, SD = 0.9049) than individuals with low Agreeableness (M = 1.4072, SD = 1.4042); individuals with high Conscientiousness had nominally fewer technical fouls (M = 1.1365, SD = 1.3742) than individuals with low Conscientiousness (M = 1.2194, SD = 1.2719); and individuals with high Neuroticism had nominally more technical fouls (M = 1.2413, SD = 1.1395) than individuals with low Neuroticism (M = 1.1417, SD = 1.2776).
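The top-versus-bottom quartile comparison can be sketched as below. The function name and the toy numbers are illustrative assumptions, and ties in trait scores are broken by input order, which may differ from the authors' handling.

```python
import statistics


def quartile_group_means(trait_scores, outcome):
    """Mean outcome for the bottom and top 25% of players on one trait."""
    order = sorted(range(len(trait_scores)), key=lambda i: trait_scores[i])
    k = len(order) // 4  # group size; requires at least four players
    low = [outcome[i] for i in order[:k]]
    high = [outcome[i] for i in order[-k:]]
    return statistics.mean(low), statistics.mean(high)


# Toy data: eight players, trait score and technical-foul rate.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
fouls = [4, 3, 5, 5, 5, 5, 1, 2]
low_mean, high_mean = quartile_group_means(scores, fouls)
```

In this toy example the low-trait group averages 3.5 and the high-trait group 1.5, the pattern one would expect for a trait such as Agreeableness.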
Discussion
The current study aimed to explore the ability of personality to predict context-specific behavior in a high-stakes, real-world context. By using a natural language processing (NLP) model, we inferred the personalities of professional basketball players from their tweets; these personality traits were then used to predict a given player’s tendency to commit technical fouls. In order to arrive at an unbiased and reliable estimate of predictive accuracy, we pre-registered our analysis and applied cross-validation and null hypothesis significance testing. Our results revealed that the predictions of a non-linear machine learning model were significantly correlated with actual behavior.
Our findings suggest that personality can indeed predict real-world context-dependent behavior, even in a high-stakes situation. While the level of predictive accuracy obtained in this study (3.38% explained variance) may not be optimal, it still translates to a 60% “accuracy” when viewed in terms of a Binomial Effect Size Display. This level of accuracy is far from negligible and can provide actionable insights58. These results emphasize the importance of personality and the role it plays in real-world behavior.
The findings are perhaps especially informative given that the current study investigated the behavior of NBA players that are highly skilled professionals. Such a context of elite performers was likely to generate “range restriction,” limiting the predictive ability of the model; attempts to predict the behavior of the general public could yield more actionable predictions.
Although we utilized the current gold-standard model for predicting personality from text in this study, it should be noted that this model still only offers a rough estimate of individuals’ personality scores, with an average prediction performance of r = .38 57. While it is possible that the correlation between machine learning personality assessment tools and self-report questionnaires is due to the uncovering of previously unseen aspects of personality not captured by the questionnaires59, it remains possible that more accurate estimates of behavior (e.g., by using digital footprints from additional social media platforms) could further improve the overall predictive accuracy. A possible future direction would be to compare the predictive power of these language-based personality estimates to traditional self-report measures. While the current dataset does not allow for such a comparison, future research could explore whether self-reports, language-based models, or a combination of both yields more accurate or complementary behavioral predictions. Finally, it is possible that models that can estimate specific personality facets60 or prediction-focused alternatives to the Big Five61 may increase predictive accuracy even further.
The approach utilized in this study—inferring personality traits from digital footprints and subsequently leveraging them to predict behavior—demonstrates a versatile methodological framework that can be applied across various domains. By first extracting individual differences through digital markers and then employing these traits to forecast specific actions, this methodology provides a scalable, generalizable model for behavioral prediction. This framework is not limited to our specific use case but instead offers a foundation that can be adapted with different input sources (e.g., video interviews, written texts, or audio conversations), via diverse theoretical constructs (e.g., Dark Triad traits, impulsivity, or moral foundations), to predict a range of consequential behaviors (e.g., professional misconduct, leadership effectiveness, or law enforcement decision-making).
Our findings also speak to the discussion concerning ethical concerns associated with the possibility of predicting behavior from people’s digital footprints. One can envision a world where corporations, governments, and other institutions have the power of understanding our psychology superbly well62. For example, we should be wary of a “Minority Report”-like dystopia where predictive algorithms are used to predict transgressive behaviors7. While such a possibility is highly concerning, it has been suggested that such fears are overblown56. Specifically, while it appears evident that our online presence can be used to detect our personality as discussed earlier63,64,65, there is considerable controversy over whether the predictive validity of our current personality models is sufficient to generate actionable predictions26. The findings of this study, which used a toy model to predict “transgressive” behavior, provide insight into the potential risks associated with using digital footprints to predict consequential behavior using psychological knowledge and machine learning algorithms. Importantly, despite the limitations of the predictions, inferences of personality from digital footprints were significant, even at the most reductive level of situational responses, and therefore should not be underestimated.
The current work joins previous research in showing that the five dimensions of the Big Five are indeed related to people’s behavior44. One unique aspect of this research is that it demonstrates that using five personality dimensions as inputs to a machine learning algorithm can greatly improve the predictive performance of these dimensions. While the use of non-linear machine learning algorithms complicates attempts to draw theoretically relevant insights (as compared to simple regression models), we were able to estimate the importance of each dimension to prediction by examining the effect of omitting each of the five dimensions, and by running the models on a single dimension. As in previous studies, and perhaps unsurprisingly, the results showed that the most crucial dimensions herein were Agreeableness, Neuroticism, and to a certain extent, Conscientiousness66.
Committing a technical foul is a form of rule-breaking that involves aggressive behavior and a momentary failure in self-regulation. Previous research has shown that people who are higher in Agreeableness were lower in delinquency, confrontational attitudes, aggressiveness, and antisocial behavior measures66,67, which is consistent with a qualitative analysis of the directionality of the data. Our results are also in line with research showing that individuals who are higher in Conscientiousness were less angry and less likely to express aggression when angry, indicating that Conscientiousness may play a role in self-control over behavior in frustrating situations68. Other evidence shows that people who are higher in Neuroticism are more likely to be higher in the trait of anger and demonstrate antisocial behavior and aggression66,67, which is again consistent with the trends observed in the current research.
To conclude, the current work examined the extent to which personality can be used to make actionable predictions of consequential behavior in a specific real-world context. It also demonstrated that meaningful behavior can be predicted from digital footprints, providing further support for the predictive validity of personality. By adhering to a pre-registered analysis plan, we sought to provide a relatively unbiased estimate of the degree to which we may be knowable from our online presence. The findings show that the use of machine learning and personality models does indeed allow us to generate predictions of specific real-world behavior that reliably differ from chance; but while they are potentially actionable, they are still limited. Nonetheless, these results likely describe a lower bound, as it is plausible that future research with larger data sets, more sophisticated psychological models, and more advanced algorithms could render behavior more predictable27,50,61. As such, the findings could inform discussions concerning regulations of data privacy.
Data availability
The dataset generated and analyzed during the current study is not publicly available due to privacy concerns regarding the participants. However, it may be made available by the corresponding author, Maor Daniel Levitin, upon reasonable request, contingent on maintaining participant confidentiality.
References
Eichstaedt, J. C. et al. Facebook Language predicts depression in medical records. Proc. Natl. Acad. Sci. 115, 11203–11208 (2018).
Kosinski, M., Stillwell, D. & Graepel, T. Private traits and attributes are predictable from digital records of human behavior. Proc. Natl. Acad. Sci. U.S.A. 110, 5802–5805 (2013).
Matz, S. C., Menges, J. I., Stillwell, D. J. & Schwartz, H. A. Predicting individual-level income from Facebook profiles. PLoS ONE. 14, e0214369 (2019).
Saeb, S. et al. Mobile phone sensor correlates of depressive symptom severity in Daily-Life behavior: an exploratory study. J. Med. Internet Res. 17, e175 (2015).
Vo, N. N. Y., He, X., Liu, S. & Xu, G. Deep learning for decision making and the optimization of socially responsible investments and portfolio. Decis. Support Syst. 124, (2019).
Nosratabadi, S. et al. Data science in economics: comprehensive review of advanced machine learning and deep learning methods. Preprint at https://doi.org/10.21203/rs.3.rs-91905/v1 (2020).
Angwin, J., Larson, J., Kirchner, L. & Mattu, S. Machine bias. ProPublica (2016).
McCrae, R. R. & John, O. P. An introduction to the five-factor model and its applications. J. Pers. 60, 175–215 (1992).
Kotov, R., Gamez, W., Schmidt, F. & Watson, D. Linking big personality traits to anxiety, depressive, and substance use disorders: A meta-analysis. Psychol. Bull. 136, 768–821 (2010).
Lauriola, M. & Levin, I. P. Personality traits and risky decision-making in a controlled experimental task: an exploratory study. Pers. Individ. Dif. 31, 215–226 (2001).
Zhao, H. & Seibert, S. E. The big five personality dimensions and entrepreneurial status: a meta-analytical review. J. Appl. Psychol. 91, 259–271 (2006).
Bahat, E. The big five personality traits and adherence to breast cancer early detection and prevention. Pers. Individ. Dif. 172, 110574 (2021).
Kornør, H. & Nordvik, H. Five-factor model personality traits in opioid dependence. BMC Psychiatry. 7, 37 (2007).
Alker, H. A. Is personality situationally specific or intrapsychically consistent? J. Pers. 40, 1–16 (1972).
Wachtel, P. L. Psychodynamics, behavior therapy, and the implacable experimenter: an inquiry into the consistency of personality. J. Abnorm. Psychol. 82, 324–334 (1973).
Anderson, C. A. Temperature and aggression: effects on quarterly, yearly, and city rates of violent and nonviolent crime. J. Pers. Soc. Psychol. 52, 1161–1173 (1987).
Asch, S. E. Effects of group pressure upon the modification and distortion of judgments. in Groups, Leadership and Men; Research in Human Relations 177–190 (Carnegie, Oxford, England, 1951).
Ekehammar, B. Interactionism in personality from a historical perspective. Psychol. Bull. 81, 1026–1048 (1974).
Milgram, S. Behavioral study of obedience. J. Abnorm. Psychol. 67, 371–378 (1963).
Mischel, W. Personality and Assessment (John Wiley and Sons (WIE), Brisbane, QLD, Australia, 1971).
Sherif, M. The Psychology of Social Norms. xii, 210 (Harper, Oxford, England, 1936).
Darley, J. M. & Batson, C. D. ‘From Jerusalem to Jericho’: A study of situational and dispositional variables in helping behavior. J. Pers. Soc. Psychol. 27, 100–108 (1973).
Jaccard, J. J. Predicting social behavior from personality traits. J. Res. Pers. 7, 358–367 (1974).
Mischel, W. From personality and assessment (1968) to personality science, 2009. J. Res. Pers. 43, 282–290 (2009).
Morgeson, F. P. et al. Reconsidering the use of personality tests in personnel selection contexts. Pers. Psychol. 60, 683–729 (2007).
Morgeson, F. P. et al. Are we getting fooled again? Coming to terms with limitations in the use of personality tests for personnel selection. Pers. Psychol. 60, 1029–1049 (2007).
Paunonen, S. V. Hierarchical organization of personality and prediction of behavior. J. Pers. Soc. Psychol. 74, 538–556 (1998).
Paunonen, S. V. Big five factors of personality and replicated predictions of behavior. J. Pers. Soc. Psychol. 84, 411–424 (2003).
West, S. G. Personality and prediction: an introduction. J. Pers. 51, 275–285 (1983).
Campbell, J. D. et al. Self-concept clarity: measurement, personality correlates, and cultural boundaries. J. Pers. Soc. Psychol. 70, 141–156 (1996).
Carney, D. R., Jost, J. T., Gosling, S. D. & Potter, J. The secret lives of Liberals and Conservatives: personality profiles, interaction styles, and the things they leave behind. Polit. Psychol. 29, 807–840 (2008).
Mondak, J. J. & Halperin, K. D. A framework for the study of personality and political behaviour. Br. J. Polit. Sci. 38, 335–362 (2008).
DeNeve, K. M. & Cooper, H. The happy personality: a meta-analysis of 137 personality traits and subjective well-being. Psychol. Bull. 124, 197–229 (1998).
Diener, E., Oishi, S. & Lucas, R. E. Personality, culture, and subjective well-being: emotional and cognitive evaluations of life. Annu. Rev. Psychol. 54, 403–425 (2003).
Donnellan, M. B., Larsen-Rife, D. & Conger, R. D. Personality, family history, and competence in early adult romantic relationships. J. Pers. Soc. Psychol. 88, 562–576 (2005).
Karney, B. R. & Bradbury, T. N. The longitudinal course of marital quality and stability: A review of theory, methods, and research. Psychol. Bull. 118, 3–34 (1995).
Judge, T. A., Heller, D. & Mount, M. K. Five-factor model of personality and job satisfaction: A meta-analysis. J. Appl. Psychol. 87, 530–541 (2002).
Ozer, D. J. & Benet-Martínez, V. Personality and the prediction of consequential outcomes. Annu. Rev. Psychol. 57, 401–421 (2006).
Fisher, R. J. Social desirability bias and the validity of indirect questioning. J. Consum. Res. 20, 303 (1993).
Paulhus, D. L. & Vazire, S. The self-report method. in Handbook of Research Methods in Personality Psychology (Guilford, 2007).
Bowman, N. A. & Hill, P. L. Measuring how college affects students: Social desirability and other potential biases in college student self-reported gains. New Dir. Inst. Res. 73–85 (2011).
Hofmann, W., Gawronski, B., Gschwendner, T., Le, H. & Schmitt, M. A meta-analysis on the correlation between the implicit association test and explicit self-report measures. Pers. Soc. Psychol. Bull. 31, 1369–1385 (2005).
Judge, T. A., Higgins, C. A., Thoresen, C. J. & Barrick, M. R. The big five personality traits, general mental ability, and career success across the life span. Pers. Psychol. 52, 621–652 (1999).
Roberts, B. W., Kuncel, N. R., Shiner, R., Caspi, A. & Goldberg, L. R. The power of personality: the comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspect. Psychol. Sci. 2, 313–345 (2007).
Poropat, A. E. A meta-analysis of the five-factor model of personality and academic performance. Psychol. Bull. 135, 322–338 (2009).
Caspi, A., Roberts, B. W. & Shiner, R. L. Personality development: stability and change. Annu. Rev. Psychol. 56, 453–484 (2005).
Friedman, H. S. et al. Psychosocial and behavioral predictors of longevity: the aging and death of the termites. Am. Psychol. 50, 69–78 (1995).
Matz, S. C., Kosinski, M., Nave, G. & Stillwell, D. J. Psychological targeting as an effective approach to digital mass persuasion. Proc. Natl. Acad. Sci. U.S.A. 114, 12714–12719 (2017).
Holtzman, N. S., Vazire, S. & Mehl, M. R. Sounds like a narcissist: behavioral manifestations of narcissism in everyday life. J. Res. Pers. 44, 478–484 (2010).
Lee, K., Ashton, M. C. & de Vries, R. E. Predicting workplace delinquency and integrity with the HEXACO and five-factor models of personality structure. Hum. Perform. 18, 179–197 (2005).
Miller, J. D. & Lynam, D. Structural models of personality and their relation to antisocial behavior: A meta-analytic review. Criminology 39, 765–798 (2001).
Boyd, R. L. & Pennebaker, J. W. Language-based personality: a new approach to personality in a digital world. Curr. Opin. Behav. Sci. 18, 63–68 (2017).
Kern, M. L. et al. The online social self: an open vocabulary approach to personality. Assessment 21, 158–169 (2014).
Park, G. et al. Automatic personality assessment through social media language. J. Pers. Soc. Psychol. 108, 934–952 (2015).
Sap, M. et al. Developing age and gender predictive lexica over social media. in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) (Association for Computational Linguistics, Stroudsburg, PA, USA, 2014).
Gibney, E. The scant science behind Cambridge Analytica’s controversial marketing techniques. Nature (2018).
Schwartz, H. A. et al. Personality, gender, and age in the language of social media: the open-vocabulary approach. PLoS ONE. 8, e73791 (2013).
Funder, D. C. & Ozer, D. J. Evaluating effect size in psychological research: sense and nonsense. Adv. Methods Pract. Psychol. Sci. 2, 156–168 (2019).
Bleidorn, W. & Hopwood, C. J. Using machine learning to advance personality assessment and theory. Pers. Soc. Psychol. Rev. 23, 190–203 (2019).
Anglim, J. & Grant, S. L. Incremental criterion prediction of personality facets over factors: obtaining unbiased estimates and confidence intervals. J. Res. Pers. 53, 148–157 (2014).
Lavi, G., Rosenblatt, J. & Gilead, M. A prediction-focused approach to personality modeling. Sci. Rep. 12, 12650 (2022).
Harari, Y. N. Homo Deus: A Brief History of Tomorrow (Random House, 2016).
Kosinski, M., Bachrach, Y., Kohli, P., Stillwell, D. & Graepel, T. Manifestations of user personality in website choice and behaviour on online social networks. Mach. Learn. 95, 357–380 (2014).
Markovikj, D., Gievska, S., Kosinski, M. & Stillwell, D. Mining Facebook data for predictive personality modeling. ICWSM 7, 23–26 (2013).
Youyou, W., Kosinski, M. & Stillwell, D. Computer-based personality judgments are more accurate than those made by humans. Proc. Natl. Acad. Sci. U.S.A. 112, 1036–1040 (2015).
Jones, S. E., Miller, J. D. & Lynam, D. R. Personality, antisocial behavior, and aggression: A meta-analytic review. J. Crim. Justice. 39, 329–337 (2011).
Sanz, J., García-Vera, M. P. & Magán, I. Anger and hostility from the perspective of the big five personality model. Scand. J. Psychol. 51, 262–270 (2010).
Jensen-Campbell, L. A., Knack, J. M., Waldrip, A. M. & Campbell, S. D. Do big five personality traits associated with self-control influence the regulation of anger and aggression? J. Res. Pers. 41, 403–424 (2007).
Author information
Contributions
M.D.L. analyzed the data, developed methodologies, wrote and revised the manuscript, and reviewed and approved the final manuscript. I.Z.G. contributed to the study design, data collection, methodology development, and manuscript preparation and review. Z.S. and A.T. contributed to the study design, data collection, and methodology development. L.U. contributed to the study design, data collection, data analysis, methodology development, project supervision, and review. M.G. conceived the study, analyzed the data, developed methodologies, revised the manuscript, supervised the project, and reviewed and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Levitin, M.D., Ger, I.Z., Sovik, Z. et al. Using the Language of elite athletes to predict their personality and on court transgressions. Sci Rep 15, 17002 (2025). https://doi.org/10.1038/s41598-025-99667-5