Abstract
The analysis of basketball strategies has traditionally relied on manual observation and limited data. As tracking technology advances, Artificial Intelligence can be applied directly to strategy, giving teams insight into defensive techniques. This research develops a hybrid machine learning model that combines Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN) to classify and analyze defensive basketball strategies, specifically switch and trap plays. The study uses large-scale tracking data to better understand team defensive behavior. The proposed hybrid model uses the LSTM to capture temporal features of players’ motion and the CNN to recognize spatial patterns across the court. The model was trained and evaluated on the National Basketball Association’s (NBA) SportVU tracking data, which includes over 32,000 possessions. Preprocessing included standardizing player positions and reformatting half-court representations into grids. Automatic annotations were compared with manual annotations and evaluated on accuracy, precision, recall, and F1 score. When evaluated independently, the hybrid model identified switches and traps with an overall accuracy of 91.4%, higher than the standalone LSTM and CNN baselines. The spatial density and temporal sequence features revealed deformations in defensive structures, and the hybrid model was particularly effective at distinguishing the situations that called for switches and traps. The proposed hybrid model successfully categorizes and identifies defensive basketball moves and gives coaches and players a strong tool for evaluating defensive tactics. This investigation demonstrates the importance of supervised machine learning models for real-time tactical analysis in sports and lays the groundwork for future research in automated strategy evaluation and game planning.
Introduction
The use of analytics in sports has developed rapidly, and statistical support for decision-making is now a staple of professional sports. Basketball has seen particularly strong adoption of analytics because of its fast pace and the growth of player tracking data. SportVU is one of the most widely used technologies for monitoring basketball movement; it records the positions of the players and the ball 25 times per second, producing a large amount of space-time data1. This detail helps coaches and analysts understand players’ behaviors and movements in particular phases of the game, such as offense and defense2.
In particular, identifying defensive team tactics such as ‘switches’ and ‘traps’ has received considerable attention. A switch occurs when two defenders exchange the players they are guarding in the middle of a play, whereas a trap occurs when two defenders pressure the player in possession of the ball to reduce his or her options3. Identifying and analyzing these strategies gives more insight into a team’s defensive efficiency and variety and into where its tactics could be improved4. However, analyzing each play manually is arduous and time-consuming, which motivates automated methods that categorize these strategies from tracking data.
Recent improvements in machine learning create new possibilities for automatic strategy detection and evaluation in basketball data. The traditional approach depended on manual annotation, which is slow, susceptible to human judgment, and difficult to scale to larger datasets5. Machine learning algorithms applied to SportVU data systematize the discovery of motion patterns and recognize tactical behaviors efficiently. Automated approaches reduce human effort, improve accuracy, and analyze complex defensive strategies consistently, which suits large-scale and real-time applications6. With machine learning, analysts can identify, categorize, and count tactics such as switches and traps across thousands of plays, increasing the accuracy and richness of tactical analysis. The need for such automation has grown with the increasing focus on speed in sports analysis. If defensive strategies are classified accurately, coaches can more easily judge the effectiveness of their approach and obtain real-time information for making adjustments7. Finally, strategy classification based on machine learning is versatile and adapts to the rapidly changing requirements of professional basketball analytics.
Basketball analytics has advanced through SportVU, a high-frequency tracking technology from STATS LLC. The system uses multiple cameras to capture player positions at 25 frames per second throughout the game6. The SportVU dataset stores each possession as a separate file, where a possession is the continuous sequence of game events during which one team controls the ball. This possession-level organization lets the analyst focus on individual plays and obtain information about offense, defense, and player efficiency8. The data supports several lines of analysis, including the organization and positioning of players in the half-court and their spacing and movements. The layout also standardizes the data for comparison: the court is represented as a grid, and player positions are recorded relative to a fixed half-court view9. Analysts and researchers can use this framework to study team strategies such as switches and traps, because the analysis takes place in a defined spatial area, which is critical for machine learning models that rely on consistent spatial data formats. However, SportVU data does not include full-court mapping, so its applicability is limited to half-court studies. Nevertheless, the detailed space-time information about possessions can also support other kinds of models, such as those that classify offensive play features or recognize key defensive positions10. For these reasons, SportVU data plays a vital role in refining basketball tactics and in team research. Although extensive work has applied machine learning to sports, some areas remain underexplored, especially the automated recognition of switches and traps in defensive strategies.
Current work is generally concerned with broader measures, such as shooting likelihood or overall team formation, rather than with specific tactical actions that can shift the momentum of a game11. Moreover, reliance on manually annotated datasets for play classification remains a weakness: identifying such defensive strategies demands time and professional expertise, which prevents wide-scale investigation and real-time use12.
Most existing models also focus on individual actions rather than on switches and traps, which depend on contextual decision-making. Not all defensive plays are alike; the play type, the opponent’s movement, and the general game situation determine which defensive actions are available to a given player13. In addition, such research often provides only snapshots of these defensive strategies and fails to assess how they develop throughout a game, which could be crucial for building game analysis tools.
This study focuses on creating a machine learning framework to classify defensive basketball strategies, particularly identifying “switch” and “trap” plays using SportVU tracking data. Key objectives are as follows:
- To develop an analytical model capable of detecting switches and traps by analyzing player movement patterns, providing an alternative to manual strategy annotation.
- To create an efficient and consistent approach to recognize defensive plays across thousands of possessions, reducing the need for labor-intensive manual analysis.
- To systematically evaluate the model’s accuracy in identifying switch and trap plays, highlighting its effectiveness and reliability for practical use in team strategy development.
- To provide real-time and post-game insights that help coaches analyze defensive strategies and identify tactical patterns.
This study contributes to sports analytics by introducing a hybrid machine learning model that combines Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN), with Random Forest (RF) as a baseline, to classify defensive basketball strategies, particularly switches and traps. The model captures both spatial and temporal patterns in the National Basketball Association’s (NBA) SportVU tracking data, which gives a detailed view of defensive movements. The methodology offers practical value for coaches and analysts and improves the evaluation of defensive strategies in sports research. Coaches can use the model to detect their opponents’ patterns and respond by modifying their defensive plans, improving performance through data-driven decision-making. The research also provides groundwork for future AI-powered sports applications that analyze intricate strategic actions in different game situations.
The rest of the paper is organized as follows. Section 2 reviews relevant literature on machine learning and player tracking in sports, setting the foundation for the proposed hybrid model. The dataset section describes the NBA SportVU tracking data, followed by the methodology, where data preprocessing, feature engineering, and model selection are detailed. Results are analyzed through performance metrics and tactical insights. Section 5 presents the discussion. Section 6 concludes with key findings, contributions, and directions for future research. This structure provides a comprehensive yet focused approach to understanding and improving defensive strategies in basketball through machine learning.
Literature review
Machine learning has become crucial to sports analytics, providing sophisticated means of analyzing and forecasting player and team performance. Over the past few years, new techniques have been used to uncover structure in large datasets and make them easier to interpret14. A significant strength of machine learning in this field is its handling of many features at once, such as players’ motion and game events, which conventional statistics fail to capture well15. Machine learning has substantially influenced basketball applications because of the SportVU player-tracking data15,16. These models typically use supervised learning for classification tasks, such as classifying defensive formations, and unsupervised learning for discovering hidden structure, such as clusters of players with similar movement17. Deep learning, a branch of machine learning that has been widely adopted in recent years, is especially useful for video and image data. For example, CNNs have been used to examine specific play types, with methods that automatically detect, bound, and label them to give a clearer view of team strategies and individual contributions18. Such models do not become obsolete, because their predictions improve as more data becomes available, and they are rapidly becoming necessary in professional sporting organizations. With these developments, machine learning complements and improves traditional sports research and enables strategic adjustments during play. Coaches and analysts now use these tools to make game decisions, which reflects the positive effect of machine learning on team performance and strategic development19.
Player tracking data has shifted basketball analytics from outcome-based metrics to the real-time collection of player movements on the court. Technologies like SportVU record the position of each player and the ball at high frequency, capturing the role of every player in a game20. This data gives a fairer measure of how each player and team in a tournament or league is performing, because it extends beyond conventional counting statistics such as points, rebounds, and assists to provide insights such as the relative positioning and speed of the players. Defensive and offensive strategy analysis is one of the main uses of player tracking data. For instance, tracking data helps analysts evaluate the viability of defensive tactics such as switches, traps, and rotations and gives coaches information on team defensive layout21. Likewise, on offense, tracking data improves a team’s ability to manage shot selection, player positioning, and ball movement, which creates scoring opportunities22. Player tracking data is also important for assessing player performance. Measures such as distance covered, average velocity, and time spent in ‘hot’ areas indicate a player’s physical effort, fitness level, and energy expenditure during the game, all of which help teams plan rotations and substitutions effectively23. Advanced statistics derived from tracking data, such as effective field goal percentage (eFG%) under pressure or defensive impact rating, quantify aspects of play that are difficult to measure with conventional statistics24. Tracking data also has considerable value for predictive analytics in sports.
ML models can use player trajectories and interactions to infer possession outcomes, identify probable scoring zones, and indicate how players will behave under different defensive actions25. These predictive insights inform team strategy and improve preparation for opponents. Overall, player tracking technology has become instrumental in basketball. It has changed the game at the micro and macro levels by providing detailed information about players’ behavior, informing in-game tactics, and supporting long-term player development. Two defensive moves in basketball, the switch and the trap, are used to slow down the opposing team. A switch occurs when two defenders exchange their defensive assignments, usually after one of them has been screened, so that the ball handler remains guarded and cannot shoot freely26. A trap, on the other hand, occurs when two defenders close in on the ball handler to force a turnover or block passing lanes, especially in crucial or high-pressure situations27. Player tracking data has made the analysis of these strategies far more detailed. For instance, tracking data enables experts to determine when switches and traps occur and in which frames28. By analyzing positional data, analysts can learn when teams decide to switch or trap in a game and which spatial arrangement is most suitable for each move.
Machine learning applied to player movement and tracking data has also made switches and traps automatically detectable. For example, spatial-temporal clustering can find moments when two defenders are positioned within a certain distance of the ball handler, which indicates a trap, or when the defenders’ assignments change, which indicates a switch29. Such models help show how often teams employ these tactics, in what context, and how successfully. Switches and traps are also influenced by factors such as positioning, timing, and execution. Previous findings suggest that well-timed switches can decrease an opponent’s scoring rate, whereas poorly executed traps can leave other offensive players unguarded and lead to scores30. Breaking these strategies down in more detail can therefore improve the effectiveness of defensive actions and support correct decisions about when to switch or trap given the opponent’s behavior. Overall, by identifying switch and trap patterns through tracking-based analysis, this paper provides quantitative evidence about basketball defense that can be applied to improve defensive performance and counter opponents’ offensive styles.
Methodology
This work builds a novel analysis model that detects two key defensive actions in basketball, switches and traps, from player movement data. The model reconstructs the spatial configuration of each possession and tracks player trajectories over time to identify the transition points where defenders change assignments. This section describes how these strategies are identified and presents the procedure for constructing the model.
Analytical model for identifying switch and trap plays
A central function of the model is identifying and categorizing defensive actions by tracking players and recognizing switch and trap strategies. By examining movement and positioning data, the model can determine when a defender overcommits or shifts to guard a particular ball handler. Essentially, it highlights periods of high defensive pressure.
Definition of switch and trap
A switch occurs when two defenders exchange the offensive players they are guarding, typically after the offense has set a screen. It aims to deny the ball handler an opportunity to shoot or drive past the defense, because each offensive player remains closely guarded after the screen. To detect a switch, the model looks for cases where two defenders exchange assignments because of the offensive players’ close positions during the screen, with no delay in coverage and no open pass or shot left for the offensive players.
A trap, by contrast, is a defensive formation in which two defenders converge on the ball handler to make plays difficult, steal the ball, or disrupt passes. Traps are usually set in particular court zones, for example near the sideline or in areas where the ball handler has little room to maneuver. To identify a trap, the model measures the movement of two defenders toward the ball handler and the decreasing distance to that player, which produces pressure coverage that restricts the ball handler’s options. An area around the ball handler is predefined in the model to detect trap situations.
Analytical model for strategy identification
The model begins with data preprocessing, where the dataset is divided by possession and each player’s spatial coordinates \(\:(x,\:y)\) and corresponding timestamps are normalized to a half-court representation. Here, \(\:(x,\:y)\) represents the spatial position of a player on the court at a given timestamp, and velocity is derived from the time intervals between frames. Each possession is accompanied by a detailed record of events, including players’ movements and timestamps. To identify switches, the model first determines whether a screen exists by calculating the distance between two offensive players. If the offensive players are within a predefined proximity indicative of a potential screen, the model evaluates the defenders’ states and relative positions. A switch is recorded if the defenders adjust their orientation, cross paths, or change their defensive assignments immediately following the screen. Defensive alignments are validated through positional checks, which establish the new player coverages once defenders move within preset distance thresholds. This check reduces classification errors in assigning defensive responsibilities.
The system flags a potential screen when two offensive players remain within 1.5 m of each other for three consecutive video frames. To verify the switch, the defenders must be positioned inside a 2-meter zone around the offensive players. The defensive adjustment must occur within 0.8 s, that is, during the first 20 frames after the screen starts. Additionally, defenders must maintain proper positional alignment by rotating toward their new assignments with an angular deviation of less than 45 degrees, ensuring adequate coverage of the offensive players. For trap detection, the model looks for situations where two defenders approach the ball handler within a circular area defined by a predefined distance from that player. The criterion is the Euclidean distance between each defender and the ball handler: a trap is present only when both defenders fall inside this area. Positional thresholds ensure that the defenders are close enough to qualify as a trap, and a timing constraint isolates traps, which are executed quickly, before the ball handler can pass. The model produces a label for each possession: plays involving switches or traps are labeled as such, and plays in which the model finds neither are labeled as standard defensive action.
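The distance rules above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, positions are assumed to be `(x, y)` tuples in meters, and the 3 m trap radius is borrowed from the density feature described later, since this passage only speaks of a predefined distance.

```python
import math

SCREEN_DIST_M = 1.5     # proximity threshold for a potential screen (from the text)
SCREEN_MIN_FRAMES = 3   # consecutive frames required (from the text)
TRAP_RADIUS_M = 3.0     # assumed trap radius; the text only says "predefined"


def dist(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def find_screen_start(off_a, off_b):
    """Return the first frame of the first run of SCREEN_MIN_FRAMES consecutive
    frames in which two offensive players are within SCREEN_DIST_M, else None."""
    run = 0
    for t, (pa, pb) in enumerate(zip(off_a, off_b)):
        run = run + 1 if dist(pa, pb) <= SCREEN_DIST_M else 0
        if run == SCREEN_MIN_FRAMES:
            return t - SCREEN_MIN_FRAMES + 1
    return None


def is_trap_frame(ball_handler, defenders, radius=TRAP_RADIUS_M):
    """True when at least two defenders lie within `radius` of the ball handler."""
    close = sum(1 for d in defenders if dist(d, ball_handler) <= radius)
    return close >= 2
```

A full detector would additionally apply the 0.8 s timing window and the 45-degree alignment check before labeling a switch; those steps are omitted here for brevity.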
Figure 1 illustrates the switch and trap detection processes within the analytical model. It presents the data input from player tracking, screens leading to switches, and defensive convergences leading to traps.
Data collection
The dataset used in this study was sourced from the NBA’s SportVU tracking system, a high-resolution player tracking technology that captures the positions of all players and the ball at a frequency of 25 Hz. The dataset comprises over 32,000 possessions from the 2012–2013 NBA season, representing approximately 630 games. Each possession is recorded as a sequence of frames containing \(\:(x,\:y)\) coordinates for all ten players and the ball, along with a timestamp. This granular data allows for detailed analysis of player movements’ spatial and temporal aspects. The data also includes possession-level metadata, such as the shot clock, ball handler’s position, and game context, which were leveraged to provide additional insights into defensive strategies. All data was mapped to a standard half-court representation to ensure consistency and alignment with the SportVU framework.
The analysis uses SportVU tracking data because it records actual player positions and allows defensive behaviors to be compared. This section describes the dataset and its possession structure and explains how a grid layout is used to represent court locations.
SportVU tracking data characteristics
The SportVU database provides comprehensive player trajectory insights through ball tracking and constant tracking data for all players throughout each possession. The 25 Hz speed of the data collection enables detailed observation of spatial-temporal events by recording both player and ball positions every 0.04 s in multiple game scenarios. Each game is segmented into a series of possessions, where possession is defined as the duration one team maintains control of the ball until a shot is attempted, a turnover occurs, or the ball changes teams. This high-frequency data allows researchers to analyze offensive and defensive plays, focusing on player movements, formations, and strategy executions.
Possession data overview and structure
The play data within SportVU tracks each player’s \(\:(x,\:y)\) coordinates and the ball, capturing position changes and movement patterns over time. Each play is structured as a sequence of frames, where each frame records the positions of all ten players (five from each team) and the ball, formatted as sequential coordinates. The length of each play varies, typically ranging from 50 to 300 frames, reflecting segments of gameplay such as specific offensive or defensive sequences rather than complete possessions. The dataset comprises over 32,000 plays from the 2012–2013 NBA season, extracted from approximately 630 games, providing a robust resource for machine learning applications and strategic basketball analysis. Table 1 presents an overview of several prominent neural style transfer techniques, outlining their computational efficiency, style fidelity, content preservation, and flexibility in handling multiple styles.
Court representation in grid format
The half-court is represented as a 400 × 360 grid to streamline analysis. Each coordinate in the tracking data is mapped to a cell within this grid, effectively reducing the dimensional complexity and facilitating spatial pattern recognition. The top-left corner of the grid is designated as the origin point \(\:(0,\:0)\), with grid cells indexed row-major. Each cell represents an area on the court, and player positions are translated into grid indices based on their real-time locations. This way, the defensive zone can be broken down into simple grids that demarcate the half-court, making it easy to identify a player’s position during defensive plays. The half-court representation in this study follows a structured coordinate system to ensure spatial consistency in defensive analysis. The basket is positioned at (100, 50), a key reference point for spatial calculations. The half-court line is at x = 200, dividing the playing area into two halves. The three-point line is represented as a semicircle with a radius of 237.5 cm from the basket, following standard NBA court dimensions. These coordinates enable precise mapping of player movements and defensive strategies, ensuring accurate spatial pattern recognition in the dataset.
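As an illustration of the row-major indexing described above, the following Python sketch maps a position to a grid cell. It assumes that coordinates are already expressed in grid units, that x spans the 400 columns, and that y spans the 360 rows; the text does not state which axis has which extent, so the orientation and the function names are assumptions.

```python
GRID_COLS = 400  # assumed: x axis spans 400 columns
GRID_ROWS = 360  # assumed: y axis spans 360 rows


def to_cell(x, y):
    """Map a position in grid units to a (row, col) cell, clamped to the grid."""
    col = min(max(int(x), 0), GRID_COLS - 1)
    row = min(max(int(y), 0), GRID_ROWS - 1)
    return row, col


def cell_index(x, y):
    """Row-major index of the cell containing (x, y); the origin cell is 0."""
    row, col = to_cell(x, y)
    return row * GRID_COLS + col
```

Under these assumptions, the basket reference point (100, 50) falls in row 50, column 100.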
Machine learning approach for classification
The classification model for defensive strategies, particularly switches and traps, uses engineered features that capture the spatial-temporal characteristics of player movement. This section details the feature engineering specific to defensive analysis and explains the rationale for model selection.
Feature engineering for defensive strategy analysis
Features used to classify defensive strategies must be designed carefully to capture the subtle movements and actions of the players. SportVU delivers the raw (x, y) positions of the players and the ball, which on their own are inadequate for detecting elaborate defensive actions. Hence, derived features are introduced to measure distances, relative locations, velocities, spatial organization, and switch and trap characteristics. Distances between defenders and their assignments are critical in detecting defensive adjustments. Let \(\:{d}_{ij}^{t}\) represent the Euclidean distance between two players \(\:i\) and \(\:j\) at time \(\:t\):
\[ d_{ij}^{t}=\sqrt{\left(x_{i}^{t}-x_{j}^{t}\right)^{2}+\left(y_{i}^{t}-y_{j}^{t}\right)^{2}} \]
This metric is useful for identifying traps, where two defenders converge on the ball handler, indicated by a decreasing \(\:{d}_{ij}^{t}\) over several frames. The angle between a defender’s path and the ball handler’s direction helps distinguish between a regular approach and a trap attempt. For each frame, we compute the angle \(\:{\theta\:}_{i}^{t}\) using:
\[ \theta_{i}^{t}=\arctan\left(\frac{y_{ball}^{t}-y_{i}^{t}}{x_{ball}^{t}-x_{i}^{t}}\right) \]
where (\(\:{x}_{i}^{t},{y}_{i}^{t})\) is the position of defender \(\:i\) and (\(\:{x}_{ball}^{t},{y}_{ball}^{t})\) is the ball’s position at time \(\:t\). A sharp angle change, particularly when two defenders turn toward the ball handler, often indicates a trap.
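The distance and angle features can be sketched in a few lines of Python. This is an illustrative sketch with hypothetical function names; it uses `math.atan2` rather than a plain arctangent so that the angle stays well defined in all four quadrants, which is an assumption about how the angle would be computed in practice.

```python
import math


def pair_distance(p_i, p_j):
    """Euclidean distance d_ij between two (x, y) positions in one frame."""
    return math.hypot(p_i[0] - p_j[0], p_i[1] - p_j[1])


def is_closing(distances):
    """True if the distance strictly decreases across every consecutive frame."""
    return len(distances) > 1 and all(
        later < earlier for earlier, later in zip(distances, distances[1:])
    )


def bearing_to_ball(defender, ball):
    """Angle in degrees of the defender-to-ball vector, measured from the +x axis."""
    return math.degrees(math.atan2(ball[1] - defender[1], ball[0] - defender[0]))
```

A decreasing distance series for two defenders at once, combined with bearings that both turn toward the ball, is the pattern the text associates with a trap.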
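Velocity and acceleration provide insights into the intensity of a defender’s movement. Let \(\:{v}_{i}^{t}\) denote the velocity of the player \(\:i\) at time \(\:t\), calculated as:
\[ v_{i}^{t}=\frac{\sqrt{\left(x_{i}^{t}-x_{i}^{t-1}\right)^{2}+\left(y_{i}^{t}-y_{i}^{t-1}\right)^{2}}}{\Delta t} \]
where \(\Delta t\) is the interval between consecutive frames (0.04 s at 25 Hz).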
Sudden increases in defender velocity toward an offensive player signal a potential switch or trap event. Acceleration, defined as the change in velocity over time, further aids in identifying aggressive defensive actions.
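A minimal Python sketch of these two features follows, assuming a track is a list of `(x, y)` positions sampled at the 25 Hz frame rate stated in the dataset description. The function names are hypothetical, and acceleration is taken as a simple finite difference of consecutive speeds.

```python
import math

FRAME_DT = 1 / 25  # seconds between frames at 25 Hz


def speeds(track, dt=FRAME_DT):
    """Speed between consecutive frames of one player's (x, y) track."""
    return [
        math.hypot(x1 - x0, y1 - y0) / dt
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]


def accelerations(speed_series, dt=FRAME_DT):
    """Finite-difference acceleration from consecutive speed values."""
    return [(v1 - v0) / dt for v0, v1 in zip(speed_series, speed_series[1:])]
```

Because raw tracking coordinates are noisy, a practical pipeline would likely smooth the track before differencing; that step is outside what the text describes.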
Relative positioning, such as whether a defender is on the left or right of the ball handler, helps differentiate between switches and natural defensive rotations. The model gains additional context by calculating relative distances and angles among the players and encoding them as categorical variables.
Defensive clustering around the ball handler is a key feature of traps. We compute a spatial density function, \(\:{\rho\:}_{ball}^{t}\), around the ball handler, defined as the count of defenders within a radius \(\:r\) at time \(\:t\):
\[ \rho_{ball}^{t}=\sum_{i\,\in\,\text{defenders}}\mathbb{1}\left(d_{i,ball}^{t}\le r\right) \]
where \(\:{d}_{i,ball}^{t}\) is the distance between the defender \(\:i\) and the ball handler, and \(\mathbb{1}(\cdot)\) equals 1 when its condition holds and 0 otherwise. A high-density score is characteristic of a trap scenario.
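The density count is straightforward to express in code. The sketch below uses a hypothetical function name and assumes positions are `(x, y)` tuples in meters; the default radius of 3 m is taken from the feature summary later in this section.

```python
import math


def defender_density(ball_handler, defenders, r=3.0):
    """Count defenders whose distance to the ball handler is at most r."""
    bx, by = ball_handler
    return sum(1 for (x, y) in defenders if math.hypot(x - bx, y - by) <= r)
```

For a single frame, a value of 2 or more would correspond to the two-defender convergence that characterizes a trap.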
The selected features in this study play a crucial role in accurately classifying defensive strategies. Spatial features, such as player (x, y) coordinates and inter-player distances, help distinguish between switches, where defenders maintain equal coverage, and traps, where two defenders converge on the ball handler. Velocity and acceleration provide insight into movement intensity, with high acceleration toward the ball handler indicating a trap and controlled lateral movement signaling a switch. Angular and rotational movements (a sharp angle change > 45° within 0.4 s) help differentiate structured switches from natural defensive rotations. Additionally, spatial density around the ball handler, measured by the number of defenders within a 3-meter radius, enables the model to detect traps more accurately.
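The angular criterion (a change of more than 45° within 0.4 s) can be checked as in the Python sketch below. It assumes a per-frame list of heading angles in degrees and treats 0.4 s as 10 frame intervals at 25 Hz; the function names and the wrap-around handling are illustrative choices, not details given in the text.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d


def has_sharp_turn(headings, threshold_deg=45.0, window_frames=10):
    """True if the heading changes by more than threshold_deg between any two
    frames that are at most window_frames apart (10 frames = 0.4 s at 25 Hz)."""
    n = len(headings)
    for start in range(n):
        last = min(start + window_frames, n - 1)
        for end in range(start + 1, last + 1):
            if angle_diff(headings[start], headings[end]) > threshold_deg:
                return True
    return False
```

The wrap-around in `angle_diff` matters because a heading that moves from 350° to 10° has changed by 20°, not 340°.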
These features collectively enable the model to recognize patterns associated with defensive strategies and enhance its ability to classify switches and traps accurately.
Model selection and rationale
The models are selected for their capability to capture the sequential spatial and temporal dependencies of basketball plays. For this task, we include reference models that are low in complexity while delivering good classification performance. RF is chosen as a baseline model due to its simplicity, interpretability, and effectiveness in handling tabular data with engineered features. RF is useful for evaluating the importance of individual features, such as distances, velocities, and spatial densities, in classifying defensive strategies like switches and traps. Each decision tree within the forest learns distinct patterns from subsets of the data, and their aggregated predictions provide a robust overall classification. Additionally, RF is resistant to overfitting, especially when working with a limited number of features, making it a reliable starting point for comparison against more complex models incorporating temporal (LSTM) and spatial (CNN) dependencies. Because basketball plays are sequential, LSTM is well suited to modeling their time dependencies. The LSTM processes frames in sequence, which lets it model temporal dependencies such as a defender’s motion between frames. Because the LSTM network uses a memory cell, the model can identify patterns corresponding to switches and traps over both short- and long-term intervals. Because the data is arranged in grid structures, CNNs are well suited to identifying spatial relations in every frame. Through convolutional layers, the model can identify the spatial clustering around the ball handler that characterizes traps and the position shifts around screens that characterize switches. As a non-sequential classifier, RF also allows us to evaluate the benefit of incorporating temporal (LSTM) and spatial (CNN) dependencies in the hybrid model.
Its feature importance analysis helps validate which spatial and movement-based features are most influential in classification. Furthermore, RF is computationally efficient, enabling faster experimentation and evaluation before transitioning to more complex deep-learning models. These factors justify its inclusion in the methodology.
Proposed hybrid model
The proposed model combines LSTM and CNN to classify defensive basketball maneuvers, namely switches and traps. Combining LSTM, which captures the temporal characteristics of players’ movement, with CNN, which captures their spatial characteristics, yields a model suited to the dynamic nature of basketball strategy analysis.
The SportVU tracking data, which includes player and ball positions for each possession, is first preprocessed and standardized. Each possession is converted into a sequence of frames, with each frame containing positional information for all players. This data is then divided into temporal and spatial representations for input into the LSTM and CNN components. The sequential data for each possession, formatted as a series of positional vectors over time, is fed into the LSTM. Let \(\:X=\{{x}_{1},{x}_{2},…,{x}_{n}\}\) represent the sequence of frames, where \(\:{x}_{i}\) is the feature vector for frame \(\:i\). The LSTM processes \(\:X\) to capture movement patterns across time, learning shifts in positioning and defensive adjustments that are indicative of switches and traps. The spatial grid representation for each frame is used as input to the CNN. The court’s half-court grid structure (e.g., 400 × 360) is populated with player positions, allowing the CNN to recognize spatial formations across frames. This grid serves as the CNN’s input matrix, where each cell reflects player presence, density, and formation for that frame. By analyzing these spatial configurations, the CNN can detect high-density zones that often indicate trap formations or key defensive alignments. Algorithm 1 presents the step-by-step procedure of the hybrid model for defensive strategy classification.
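The grid construction described above can be sketched as follows. This is a minimal illustration, not the study’s preprocessing code: it assumes positions are given in feet on a 47 ft × 50 ft half-court, that the 400 columns run along the court length and the 360 rows along its width, and that each cell simply counts the players falling inside it. The names `to_cell` and `occupancy_grid` are hypothetical.

```python
COLS, ROWS = 400, 360             # grid size quoted in the text
COURT_LEN, COURT_WID = 47.0, 50.0  # NBA half-court in feet (assumed axis mapping)


def to_cell(x, y):
    """Map a court position in feet to a (row, col) cell, or None if off-court."""
    if not (0.0 <= x <= COURT_LEN and 0.0 <= y <= COURT_WID):
        return None
    col = min(int(x / COURT_LEN * COLS), COLS - 1)
    row = min(int(y / COURT_WID * ROWS), ROWS - 1)
    return row, col


def occupancy_grid(positions):
    """Build a ROWS x COLS grid whose cells count the players located in them."""
    grid = [[0] * COLS for _ in range(ROWS)]
    for x, y in positions:
        cell = to_cell(x, y)
        if cell is not None:  # positions outside the half-court are skipped
            row, col = cell
            grid[row][col] += 1
    return grid


# Hypothetical frame: two players at mid half-court, one in the corner.
frame = [(23.5, 25.0), (23.5, 25.0), (0.0, 0.0), (-3.0, 10.0)]
grid = occupancy_grid(frame)
```

A stack of such grids, one per frame, would form the spatial input; a real pipeline would more likely store them as arrays and might use separate channels for offense, defense, and the ball.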
The temporal features are obtained from the LSTM component, and the spatial features are obtained from the CNN component. These two feature vectors are then concatenated into a single combined representation, which is passed through a dense layer and used for final classification of the defensive action as a switch, a trap, or neither.
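The fusion step can be illustrated with a small sketch. It assumes the dense layer is a single linear map followed by a softmax over the three classes, which the text does not specify; the weights, feature values, and the name `fuse_and_classify` are hypothetical placeholders rather than trained parameters.

```python
import math

LABELS = ("switch", "trap", "neither")


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def fuse_and_classify(temporal, spatial, weights, bias):
    """Concatenate LSTM and CNN feature vectors and apply one dense layer."""
    fused = list(temporal) + list(spatial)  # concatenation step
    logits = [
        sum(w * f for w, f in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))], probs


# Hypothetical 2-d temporal and 2-d spatial features with toy weights.
temporal = [1.0, 0.0]
spatial = [0.0, 2.0]
weights = [
    [1.0, 0.0, 0.0, 0.0],  # switch
    [0.0, 0.0, 0.0, 1.0],  # trap
    [0.0, 0.0, 0.0, 0.0],  # neither
]
bias = [0.0, 0.0, 0.0]
label, probs = fuse_and_classify(temporal, spatial, weights, bias)
```

In practice the two feature vectors would come from trained LSTM and CNN branches, and the dense-layer weights would be learned jointly with them.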
The features used to construct the machine learning model must emphasize players’ interactions, distances, and spatio-temporal relationships. By combining CNN and LSTM, the proposed approach is designed to achieve high switch and trap classification accuracy in a setting that can support real-time use.
Model evaluation
Model assessment is crucial to confirm that the machine learning model correctly identifies defensive moves such as switches and traps. This section describes the classification accuracy metrics and the manual data annotation used to assess model reliability.
Evaluation metrics for classification accuracy
We use several metrics to assess how accurately the proposed model classifies switches and traps from basketball tracking data. The main measures are accuracy, precision, recall, F1 score, and the confusion matrix, each of which captures a different aspect of performance. Accuracy measures the proportion of correctly predicted instances among all predictions. For a binary classification of switches and traps, accuracy \(\:Acc\) is defined as:
$$Acc=\frac{TP+TN}{TP+TN+FP+FN}$$
where \(\:TP\) is the count of true positives, \(\:TN\) is true negatives, \(\:FP\) is false positives, and \(\:FN\) is false negatives.
Precision quantifies how many instances predicted as a specific class (e.g., trap) are correct. Precision for the trap class \(\:{P}_{trap}\) is given by:
$${P}_{trap}=\frac{{TP}_{trap}}{{TP}_{trap}+{FP}_{trap}}$$
where \(\:{TP}_{trap}\) is the count of true positives for traps and \(\:{FP}_{trap}\) is the count of false positives.
Recall measures the model’s ability to identify all actual instances of a particular class. Recall for the switch class \(\:{R}_{switch}\) is defined as:
$${R}_{switch}=\frac{{TP}_{switch}}{{TP}_{switch}+{FN}_{switch}}$$
where \(\:{TP}_{switch}\) is the count of true positives for switches and \(\:{FN}_{switch}\) is the count of false negatives.
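The F1-score is the harmonic mean of precision and recall, balancing both metrics. For the trap class, the F1-score \(\:{F1}_{trap}\) is calculated as:
$${F1}_{trap}=\frac{2\,{P}_{trap}\,{R}_{trap}}{{P}_{trap}+{R}_{trap}}$$
where \(\:{R}_{trap}\) is the recall for the trap class, defined in the same way as \(\:{R}_{switch}\).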
This score is particularly useful when classes are imbalanced. Taken together, precision measures the correctness of predictions, recall measures their completeness, and the F1-score summarizes both in a single reliability measure for identifying switches and traps in the dataset.
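The metrics defined above can be computed per class with a few lines of code. The sketch below treats each class one-vs-rest in the three-class setting (switch, trap, neither); the labels in the example are made up for illustration and are not results from the study, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, cls):
    """Return (TP, FP, FN, TN) for class `cls`, treating it one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Proportion of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, cls):
    """Precision, recall, and F1 for one class; 0.0 when a ratio is undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up annotations and predictions for six possessions.
y_true = ["switch", "trap", "neither", "trap", "switch", "neither"]
y_pred = ["switch", "trap", "trap", "neither", "switch", "neither"]

acc = accuracy(y_true, y_pred)
trap_metrics = precision_recall_f1(y_true, y_pred, "trap")
switch_metrics = precision_recall_f1(y_true, y_pred, "switch")
```

Returning 0.0 when a denominator is zero is one common convention; a production evaluation might instead report such cases as undefined.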
Manual annotation and model reliability testing
The reliability of the model’s predictions across the switch, trap, and neither categories was measured with Cohen’s Kappa statistic computed over these three categories. This statistic quantifies the agreement between automated predictions and manual annotations while correcting for the agreement expected by chance. The probability of chance agreement is computed as:
$${P}_{e}=\frac{1}{{N}^{2}}\sum_{i=1}^{k}{n}_{i}^{model}\,{n}_{i}^{manual}$$
where \(\:k\) represents the number of categories (switch, trap, and neither), \(\:N\) is the total number of instances, and \(\:{n}_{i}^{model}\) and \(\:{n}_{i}^{manual}\) are the numbers of instances assigned to category \(\:i\) by the model and by the manual annotation, respectively. The Cohen’s Kappa statistic is then calculated as:
$$\kappa=\frac{{P}_{o}-{P}_{e}}{1-{P}_{e}}$$
where \(\:{P}_{o}\) is the observed agreement.
Based on our annotated dataset of 144 possessions, the calculated Cohen’s Kappa value is 0.86, indicating strong agreement between the model predictions and the manual annotations beyond chance. This suggests high reliability in the model’s ability to replicate expert classification of switch, trap, and neither plays.
Cross-validation was used to verify the stability of these results by partitioning the annotated data and computing the Kappa statistic on each partition. The 144 manually annotated possessions were divided into training and validation sets with an 80:20 ratio, giving 115 possessions for training and 29 for validation, which provides sufficient data for model learning while retaining an appropriate validation set. Additionally, k-fold cross-validation (\(\:k=5\)) was implemented to enhance robustness, and the reported metrics reflect the average performance across all folds. This procedure checks that the approach performs consistently regardless of which portion of the data is used for training, which reduces the likelihood of overfitting and makes performance on unseen data more dependable. Combining manual annotation, Cohen’s Kappa, and cross-validation therefore supports the stability of the model for identifying switches and traps on both training and test data.
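To make the Kappa calculation concrete, the sketch below computes the observed agreement, the chance agreement, and \(\kappa\) for two label sequences using the standard definition of Cohen’s Kappa. The ten labels are invented for illustration and do not come from the 144 annotated possessions; the function name `cohens_kappa` is hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa between two equally long label sequences."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label sequences must be non-empty and equally long")
    # Observed agreement: share of instances given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal counts, summed over categories.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    if p_e == 1.0:
        return float("nan")  # undefined when both use one identical category
    return (p_o - p_e) / (1 - p_e)


# Invented manual annotations and model predictions for ten possessions.
manual = ["switch"] * 4 + ["trap"] * 3 + ["neither"] * 3
model = ["switch"] * 3 + ["trap"] * 4 + ["neither"] * 2 + ["switch"]

kappa = cohens_kappa(manual, model)
```

Here the observed agreement is 0.8 and the chance agreement is 0.34, so \(\kappa = 0.46/0.66 \approx 0.70\).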
Results and analysis
In this section, we report the results of classifying switches and traps in basketball with the proposed machine learning model. We also examine how often and where these plays occur in order to discuss their strategic use and positioning. Patterns in switches and traps help explain the situational context and the strategic reasons behind defensive actions. Based on the model predictions and the processed SportVU data, the present study reports the frequency and spatial distribution of switches and traps. Their use varies mainly with the game scenario, the position of the ball handler, and the time remaining on the shot clock. Frequency analysis then measures how often each strategy is used, which reveals the contexts in which these maneuvers are most common.
Table 2 illustrates the frequency of switches and traps observed in the analyzed possessions, segmented by game context factors such as shot clock time, ball-handler position (e.g., on the wings, top of the key), and possession length.
Figure 2 provides a visual representation of the distribution of switches and traps across different segments of the shot clock. It shows that teams are more likely to deploy traps to force turnovers as time on the shot clock decreases.
From the analysis in Table 2 and Fig. 2, it can be observed that switches are generally more frequent than traps. Traps are more common in shorter possession windows, indicating their use in high-pressure scenarios, particularly when forcing quick shots or turnovers is crucial. Defensive strategies also vary with the ball’s location on the court, with traps being favored when the ball is at the top of the key, allowing defenders to force the ball handler into a specific area.
To analyze the spatial tendencies in defensive maneuvers, we plot heatmaps showing the positional density of switches and traps across a half-court. Each heatmap illustrates the concentration of defensive actions within a 400 × 360 grid, where the origin represents the top-left of the half-court.
Figure 3 illustrates the spatial density of defensive strategies, specifically switches and traps, on a basketball half-court. The blue points represent areas of high switch density, while the orange points indicate areas with a higher concentration of traps. The red marker highlights the basket’s location, and the dashed line represents the half-court line. This visualization provides insights into where defensive actions are most likely to occur, with traps concentrated near the lower key area and switches distributed across key zones near the top of the arc.
Further spatial analysis provides additional insight into the positional behavior of defenders when executing switches and traps.
Table 3 highlights the proximity differences between switches and traps. On average, defenders are closer to the ball handler during a trap, aiming to reduce the handler’s movement options and create pressure.
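A proximity measure of this kind could be computed per frame as sketched below. The choice of averaging over the two nearest defenders is an assumption made for illustration (a trap typically involves two defenders converging on the ball handler); the study does not state how its proximity values were derived, and the names and coordinates are hypothetical.

```python
import math


def nearest_defender_distances(ball_handler, defenders, k=2):
    """Distances from the ball handler to the k closest defenders, ascending."""
    distances = sorted(math.dist(ball_handler, d) for d in defenders)
    return distances[:k]


def mean_proximity(ball_handler, defenders, k=2):
    """Average distance from the ball handler to the k closest defenders."""
    nearest = nearest_defender_distances(ball_handler, defenders, k)
    return sum(nearest) / len(nearest)


# Hypothetical single frame: ball handler and five defenders (x, y).
ball_handler = (10.0, 10.0)
defenders = [(13.0, 14.0), (10.0, 12.0), (30.0, 30.0), (10.0, 4.0), (25.0, 10.0)]

nearest_two = nearest_defender_distances(ball_handler, defenders)
proximity = mean_proximity(ball_handler, defenders)
```

Averaging this value over the frames labeled as traps and over those labeled as switches would give a comparison of the type summarized in Table 3.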
To provide deeper insights into defensive strategies, we analyzed spatial patterns and frequencies segmented by game quarter, with a focus on clutch time (the final 5 min of games with a score margin of 5 points or less). This analysis revealed key differences in how defensive strategies evolve throughout a game. Switches were more evenly distributed across the court during the first and second quarters, reflecting a general defensive approach. However, in the later quarters, particularly the fourth, switches became more concentrated near the three-point arc, where offensive plays are more critical. During clutch time, trap density increased markedly near high-pressure zones, such as the top of the key and the baseline corners. This indicates a tactical focus on pressuring ball handlers and limiting passing lanes in crucial moments. Switches, meanwhile, displayed a more strategic distribution aimed at neutralizing key offensive threats. These findings are visualized in Fig. 4.
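The clutch-time definition used above translates directly into a filter. The sketch below assumes each possession record carries a period number, the seconds left in that period, the score margin, and a predicted label; these field names are hypothetical, and treating overtime periods as clutch-eligible is an additional assumption not stated in the text.

```python
from collections import Counter

CLUTCH_WINDOW_S = 5 * 60  # final 5 min, in seconds
CLUTCH_MARGIN = 5         # maximum absolute score margin, in points


def is_clutch(period, seconds_left, margin):
    """True for the last 5 min of the 4th quarter (or overtime) in a close game."""
    return (
        period >= 4
        and seconds_left <= CLUTCH_WINDOW_S
        and abs(margin) <= CLUTCH_MARGIN
    )


def clutch_label_counts(possessions):
    """Count predicted labels over the possessions that fall in clutch time."""
    return Counter(
        p["label"]
        for p in possessions
        if is_clutch(p["period"], p["seconds_left"], p["margin"])
    )


# Hypothetical possession records, for illustration only.
possessions = [
    {"period": 4, "seconds_left": 120, "margin": -3, "label": "trap"},
    {"period": 4, "seconds_left": 280, "margin": 5, "label": "switch"},
    {"period": 4, "seconds_left": 301, "margin": 0, "label": "trap"},
    {"period": 2, "seconds_left": 60, "margin": 1, "label": "switch"},
    {"period": 4, "seconds_left": 30, "margin": 9, "label": "trap"},
]
counts = clutch_label_counts(possessions)
```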
Data trends indicate clear patterns in when teams should employ traps versus switches for optimal defensive effectiveness. Traps are most effective near the sidelines and baseline, where offensive players have limited passing or escape options. They are frequently used in clutch situations (final 5 min of close games) against high-usage ball handlers, aiming to force turnovers or hurried offensive decisions. On the other hand, switches are primarily deployed around the three-point arc and pick-and-roll situations, where maintaining defensive flexibility is crucial. Teams rely on switches to neutralize screen actions and prevent mismatches, particularly against perimeter shooters. These insights align with strategic defensive principles and provide a data-backed approach to optimizing in-game decision-making.
This section details the results of our machine learning approach for classifying switch and trap strategies, including comparisons between four models: RF, LSTM, CNN, and the proposed hybrid model. Each model is evaluated on accuracy, precision, recall, and F1-score, and its reliability is checked against manually validated data.
We first assess each model’s general performance on a separate set of held-out, manually annotated possessions. This comparison identifies the model best suited to automated switch and trap classification. Table 4 compares the performance metrics (accuracy, precision, recall, F1-score) across the different models.
The hybrid model achieves the highest overall accuracy and F1-score, outperforming the individual Random Forest, LSTM, and CNN models. The CNN model also shows strong performance, particularly in precision and recall, but is surpassed by the hybrid model in consistency.
Figure 5 visualizes these metrics for each model, allowing for a more precise performance comparison.
This section calculates each model’s success rate in classifying switches and traps across the dataset. We evaluate each model’s performance on switch, trap, and “neither” classes. Table 5 presents detailed class-specific metrics for each model, showing accuracy, precision, recall, and F1-score for the switch, trap, and neither classification.
Figure 6 illustrates the confusion matrix of the hybrid model, showing classification accuracy for each class.
The hybrid model consistently performs well across all classes, with the highest accuracy in the “Neither” class, followed closely by “Trap” and “Switch”.
Analyzing the reliability of each model in different game contexts (e.g., shot clock time, possession length) provides insights into situational performance. The following table illustrates success rates in high-pressure contexts. Table 6 shows the hybrid model’s superiority in high-pressure situations and indicates its robustness in challenging contexts.
The hybrid model demonstrates the highest success rates and accuracy across various game contexts and defensive strategies. Consistent accuracy in switch, trap, and neither class, combined with high performance in high-pressure situations, suggests that the hybrid model is well-suited for practical use in automated defensive strategy analysis.
The insights generated by the proposed model offer valuable applications in real-world basketball scenarios. In-game adjustments can be made by identifying defensive tendencies, such as frequent traps near the baseline, allowing coaches to adjust offensive spacing and implement quicker ball movement strategies. Scouting and opponent analysis benefit from the model by detecting patterns in defensive strategies, such as frequent switching against pick-and-roll plays, enabling teams to design offensive plays that exploit mismatches. Additionally, player development and tactical training can be enhanced by using model outputs to assess individual defenders’ reactions to screens and traps, allowing coaches to tailor training drills to improve reaction time and positioning. These applications demonstrate the practical impact of the model’s insights in optimizing defensive and offensive strategies.
To contextualize our findings within the broader field, we compare our results on switch and trap effectiveness, model performance, and tactical insights with several existing studies in sports analytics. The focus of this comparison is on (1) defensive strategy classification, (2) accuracy of player tracking analysis, and (3) model performance in sports-based machine learning applications.
Table 7 summarizes our approach alongside key basketball player-tracking and strategy-analysis studies.
Table 7 provides a comparative overview of defensive strategy analysis across various studies, focusing on methodologies, datasets, strategy types, classification accuracy, and key findings. Early studies, such as Kovalchik31, utilized player tracking data from the 2013 NBA season to analyze ball-screen defenses, achieving 80% accuracy and noting the strategic value of isolating plays through ball screens. Miller and Bornn29 applied deep learning using an LSTM model on NCAA data, identifying defensive switches with 82% accuracy and highlighting the model’s ability to capture temporal defensive patterns. A CNN-based approach in ref. 32 reached 85% accuracy on NBA tracking data, effectively identifying zone defenses. A hybrid LSTM and CNN model in ref. 33 further improved accuracy (86%) in analyzing defensive traps during the 2018 NBA playoffs, showcasing enhanced context sensitivity. The present study evaluates RF, LSTM, and CNN models alongside a hybrid LSTM-CNN model on the NBA SportVU dataset; the hybrid model achieves the highest classification accuracy of 91.4% and excels in detecting defensive switches, traps, and situational contexts. This advancement demonstrates the effectiveness of hybrid models in complex defensive strategy analysis.
While the proposed model demonstrates high accuracy in identifying defensive strategies, several methodological limitations must be acknowledged. The model assigns a single category of defensive action to each possession, even though multiple defensive actions can occur within the same possession. As a result, the current framework cannot detect combined defensive shifts, for example when a trap leads to a switch during a single possession. The spatial and temporal thresholds used are also problematic for schemes such as drop coverage, because defenders who remain near the basket while others switch can confound detection, and the model does not classify such scenarios accurately. It is also hard to determine which factors, such as proximity or velocity, dominate a given classification, which limits the interpretability of the model’s predictions. Because the system considers each possession in isolation, it cannot analyze defensive strategies across multiple plays, which could improve classification accuracy in complex game situations.
The use of SportVU data brings further restrictions. The analysis mainly covers actions inside the half-court, which limits its applicability to full-court defense, fast breaks, and transition defense. Analysis of fast breaks and full-court pressure is essential for comprehensive defensive strategy research but lies beyond the scope of the available dataset. The results are therefore most relevant to organized half-court defense and less so to dynamic situations such as transitions or extended full-court play. Analysis of misclassification patterns reveals key areas for improvement in future model iterations.
One common issue is the misclassification of hybrid defensive actions, particularly distinguishing between switches and hedging (“show”) plays, as both involve temporary defender shifts. Refining feature engineering by incorporating defender recovery time and angle deviation thresholds could improve differentiation. Another challenge arises in fast-paced possessions, where rapid player movements in transition defense lead to classification errors. Integrating additional contextual data, such as possession speed and court zone-based decision modeling, could enhance accuracy in these scenarios. Additionally, some plays involve a trap and a switch, which the current model treats as distinct classifications rather than co-occurring actions. A multi-label classification approach would allow the model to recognize overlapping defensive maneuvers more effectively. Addressing these challenges will enhance the model’s robustness and applicability in future research.
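The multi-label idea mentioned above can be sketched by replacing the single three-way decision with one independent yes/no decision per defensive action. The sketch assumes the model outputs one logit each for "switch" and "trap", applies a sigmoid to each, and reports "neither" when no label passes the threshold; the 0.5 threshold and all names are hypothetical choices, not part of the published model.

```python
import math

ACTIONS = ("switch", "trap")


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def multi_label_predict(logits, threshold=0.5):
    """Return every action whose independent probability meets the threshold."""
    active = [a for a, z in zip(ACTIONS, logits) if sigmoid(z) >= threshold]
    return active or ["neither"]


# Hypothetical logits for three possessions.
both = multi_label_predict([2.0, 1.5])      # trap that turns into a switch
trap_only = multi_label_predict([-3.0, 0.2])
nothing = multi_label_predict([-1.0, -1.0])
```

Because each label is scored independently, a possession containing both a trap and a switch can be reported as such instead of being forced into a single class.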
Discussion
This study presents a novel hybrid machine-learning model for classifying defensive basketball strategies, explicitly focusing on switches and traps. The findings demonstrate the model’s ability to capture spatial and temporal patterns of player movements, providing valuable insights into team defensive behavior. In this section, we discuss the key implications of the study, address its limitations, and propose directions for future research. The results suggest that the model can effectively distinguish between defensive strategies, highlighting patterns in player positioning, movement dynamics, and tactical adjustments. The ability to classify defensive maneuvers with high accuracy has essential applications for coaches, analysts, and teams. Real-time analysis of defensive strategies can assist the coaching staff in making in-game tactical decisions, such as adjusting defensive schemes based on opponent tendencies. Additionally, post-game analysis enables a deeper understanding of defensive efficiency, allowing teams to refine their strategies over time. The findings also support player development by identifying areas where defenders may struggle with switches or traps, helping coaches design targeted training programs. Despite its strong performance, the model has some limitations that must be considered. First, the dataset is limited to the half-court, meaning that full-court defensive strategies, such as transition defense and full-court presses, are not accounted for. Future research should aim to incorporate full-court tracking data to provide a more comprehensive analysis of defensive play. Another limitation is the misclassification of hybrid defensive actions, such as a trap that transitions into a switch. The current model treats defensive strategies as distinct classes, whereas in real gameplay, multiple defensive actions can occur within a single possession. 
A potential solution is to develop a multi-label classification approach that can recognize overlapping defensive strategies. Additionally, the model relies primarily on spatial and movement-based features without incorporating contextual factors such as player fatigue, opponent tendencies, or game context (e.g., clutch time scenarios). Future research should explore including these contextual elements to enhance the model’s predictive power. Building upon the findings of this study, several key areas for future research emerge. One important direction is the real-time implementation of the model, enabling its use in live games for immediate tactical insights. Optimization techniques, such as model pruning and edge computing, could help reduce inference time, making the model more suitable for real-time coaching applications. Furthermore, the model can be adapted for other team sports, such as soccer, hockey, and rugby, where defensive formations and transitions play a crucial role. By modifying spatial configurations and movement dynamics, similar machine-learning approaches could be used to classify defensive strategies in different sports contexts. Lastly, integrating performance-based outcomes, such as forced turnovers, contested shots, or defensive stops, could improve the practical applicability of the model. By linking defensive strategy classification to tangible game outcomes, teams can make more data-driven strategic decisions. Overall, this study provides a data-driven approach to defensive strategy classification, offering valuable insights for basketball coaches, analysts, and researchers. While the model demonstrates strong classification performance, addressing its limitations and expanding its applications can further enhance its impact on sports strategy optimization.
Future research should continue refining defensive modeling techniques to better capture the complexity of real-game scenarios, making machine learning a more powerful tool for modern basketball analysis.
Conclusion and future research directions
This research develops a hybrid machine-learning model that combines LSTM and CNN, benchmarked against an RF baseline, to detect defensive basketball strategies from NBA SportVU tracking data. The proposed hybrid model achieved a 91.4% classification accuracy, outperforming the standalone LSTM, CNN, and RF models in detecting complex defensive approaches. Analyzing defensive strategies effectively requires spatial representations combined with the temporal dependencies in player trajectory data. The proposed methodology analyzed player movement data to reveal details of switch and trap scenarios and to provide valuable strategic information to coaches and analysts. The findings of this study hold practical implications for basketball stakeholders. Coaching staff can use the model to identify defensive patterns, which helps them make in-game decisions about positional adjustments and scheme changes. A real-time implementation of the model would let coaches provide instant tactical feedback and improve decision-making at crucial moments of a game. Through integration within video analysis platforms, analysts can use the model’s insights to improve team strategy evaluation and defensive preparation for upcoming matches.
Despite these encouraging findings, several research questions and model constraints remain unaddressed. The SportVU data used here covers only half-court defense, which limits the research to set defensive structures and excludes full-court man-to-man defenses and transition coverage. Future research should incorporate full-court position tracking to understand team defense completely. Real-time implementation also deserves special emphasis, since this study focuses on the analysis of historical data. Faster data processing systems would enable coaches and analysts to make immediate strategic changes during games. The current analysis uses only location data and ignores how players’ skills, performance history, and fatigue influence team play. Including player-specific attributes could improve accuracy and allow basketball games to be analyzed in finer detail. Future work should also extend the model to defensive strategies beyond switches and traps, expanding its applicability and delivering more comprehensive assessments of basketball defense. This study contributes to basketball analytics by advancing the classification of defensive strategies. Addressing the outlined limitations and expanding the model’s scope will enhance its practical utility. These steps will make the model more adaptable for coaching and decision-making and pave the way for broader applications in sports analytics, contributing to the evolution of data-driven strategies in basketball and beyond.
Data availability
The data in this study are publicly available. The SportVU tracking data used in the analysis can be accessed at the following link: https://github.com/sealneaward/nba-movement-data.
References
Miller, A., Bornn, L., Adams, R. & Goldsberry, K. Factorized point process intensities: A spatial analysis of professional basketball. In International Conference on Machine Learning 235–243 (PMLR, 2014).
Nikolaidis, Y. Building a basketball game strategy through statistical data analysis. Ann. Oper. Res. 227, 137–159 (2015).
Sansone, P. et al. Game schedule, travel demands, and contextual factors influence key game-related statistics among the top European male basketball teams. J. Sports Sci. 1, 1–8 (2024).
Cervone, D., D’Amour, A., Bornn, L. & Goldsberry, K. A multiresolution stochastic process model for predicting basketball possession outcomes. J. Am. Stat. Assoc. 111, 585–599 (2016).
Yan, W., Jiang, X. & Liu, P. A review of basketball shooting analysis based on artificial intelligence. IEEE Access. 1, 1 (2023).
Sarlis, V. & Tjortjis, C. Sports analytics—Evaluation of basketball players and team performance. Inf. Syst. 93, 101562 (2020).
Santomier, J., Dolles, H. & Kunz, R. The National Basketball Association’s (NBA) Digital Transformation: An Explanatory Case Study (2023).
Terner, Z. & Franks, A. Modeling player and team performance in basketball. Annu. Rev. Stat. Its Appl. 8, 1–23 (2021).
Chessa, A. et al. Complex networks for community detection of basketball players. Ann. Oper. Res. 325, 363–389 (2023).
Wang, S. Visual Analytics for Basketball Shot Quality Analysis.
Tian, X. Y. & Shi, J. Modeling the complexity of basketball games using marked mutually exciting point processes. In Communications in Statistics-Simulation and Computation 1–24 (2024).
Qi, K., Hu, G. & Wu, W. Are made and missed different? An analysis of field goal attempts of professional basketball players via depth based testing procedure. Preprint at http://arXiv.org/2403.17221 (2024).
South, C. A basketball paradox: exploring NBA team defensive efficiency in a positionless game. J. Quant. Anal. Sports. 1, 1 (2024).
Bunker, R. P. & Thabtah, F. A machine learning framework for sports result prediction. Appl. Comput. Inf. 15, 27–33 (2019).
Haghighat, M., Rastegari, H., Nourafza, N., Branch, N. & Esfahan, I. A review of data mining techniques for result prediction in sports. Adv. Comput. Sci. Int. J. 2, 7–12 (2013).
Davis, J. et al. Methodology and evaluation in sports analytics: challenges, approaches, and lessons learned. Mach. Learn. 113, 6977–7010 (2024).
Ullah, F., Shah, D., Shah, S., Salam, A. & Ali, S. A novel approach for human face detection in color images using skin color and golden ratio. J. Appl. Environ. Biol. Sci. 7, 159–164 (2017).
Van Zandycke, G. et al. Deepsportradar-v1: Computer vision dataset for sports understanding with high-quality annotations. In Proceedings of the 5th International ACM Workshop on Multimedia Content Analysis in Sports 1–8 (2022).
Gudmundsson, J. & Horton, M. Spatio-temporal analysis of team sports. ACM Comput. Surveys. 50, 1–34 (2017).
Passos, P., Araújo, D. & Volossovitch, A. Performance Analysis in Team Sports (Routledge, Taylor & Francis Group, 2017).
Franks, A., Miller, A., Bornn, L. & Goldsberry, K. Characterizing the Spatial Structure of Defensive Skill in Professional Basketball (2015).
Goldsberry, K. Courtvision: New visual and spatial analytics for the NBA. In 2012 MIT Sloan Sports Analytics Conference, vol. 9, 12–15 (2012).
Sarlis, V., Gerakas, D. & Tjortjis, C. A data science and sports analytics approach to Decode clutch dynamics in the last minutes of NBA games. Mach. Learn. Knowl. Extr. 6, 2074–2095 (2024).
Guo, T. et al. From core to peripheral: A network analysis of lineup types in NBA playoff teams. Chaos Solitons Fractals X. 1, 100115 (2024).
Skinner, B. & Guy, S. J. A method for using player tracking data in basketball to learn player skills and predict team performance. PLoS ONE. 10, e0136393 (2015).
De Bacco, C., Wang, Y. & Blei, D. A causality-inspired plus-minus model for player evaluation in team sports. In Causal Learning and Reasoning 769–792 (PMLR, 2024).
Gkotsis, P. The Value of the Three-Point Shot Throughout the 2010–2020 Era of NBA Basketball (Lietuvos Sporto Universities, 2024).
Zhou, Y. & Li, T. Quantitative analysis of professional basketball: A qualitative discussion. J. Sports Anal. 1, 1–15.
Miller, A. C. & Bornn, L. Possession sketches: Mapping NBA strategies. In Proceedings of the 2017 MIT Sloan Sports Analytics Conference, vol. 1 (IEEE, 2017).
Ángel, G. M., Evangelos, T. & Alberto, L. Defensive systems in basketball ball possessions. Int. J. Perform. Anal. Sport. 6, 98–107 (2006).
Kovalchik, S. A. Player tracking data in sports. Annu. Rev. Stat. Its Appl. 10, 677–697 (2023).
Hobbs, W. S. Unpredictability in Basketball: An Exploration of the Effects of Ball Movement Entropy on Performance in International Women’s Basketball (2019).
Tian, C., De Silva, V., Caine, M. & Swanson, S. Use of machine learning to automate the identification of basketball strategies using whole team player tracking data. Appl. Sci. 10, 24 (2019).
Author information
Contributions
Jianlong Li wrote the main manuscript text and prepared all figures.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Li, J. Machine learning-based analysis of defensive strategies in basketball using player movement data. Sci Rep 15, 13887 (2025). https://doi.org/10.1038/s41598-025-98877-1
DOI: https://doi.org/10.1038/s41598-025-98877-1









