Introduction

The widespread implementation of low-dose computed tomography (LDCT) screening has significantly increased the detection of early-stage non-small cell lung cancer (NSCLC), particularly stage IA tumors1,2. This shift has sparked renewed interest in parenchyma-sparing procedures such as segmentectomy, which has shown non-inferior or even superior survival outcomes compared to lobectomy in recent randomized trials3,4,5. Consistent with this evidence, the most recent NCCN® guidelines advocate for sublobar resection, preferably segmentectomy, in patients with peripheral T1a-bN0 NSCLC (clinical stage IA1–IA2, ≤ 2 cm)6.

Simultaneously, video-assisted thoracoscopic surgery (VATS) has become a preferred minimally invasive technique for anatomical resections. As VATS segmentectomy gains broader clinical adoption, ensuring adequate surgical training and procedural standardization has become increasingly important7.

Digital platforms like YouTube have emerged as widely used, accessible resources for surgical education. However, the educational quality of user-uploaded content remains largely unregulated and frequently lacks critical surgical details, peer review, or adherence to validated reporting guidelines8,9,10,11.

To our knowledge, no previous study has systematically evaluated the educational quality of YouTube videos on VATS segmentectomy using the LAP-VEGaS criteria. This study therefore aims to evaluate the quality and educational value of these videos using an established assessment framework, and to highlight current strengths, limitations, and opportunities for improvement.

Materials and methods

To identify the most accessible and widely viewed educational content on VATS segmentectomy, a structured search was performed on YouTube® (http://www.youtube.com) on June 12, 2025, using the keyword “VATS segmentectomy”. The search was performed once and was not repeated at later time points, as our goal was to capture a cross-sectional snapshot of the content available on that date. All video characteristics (URL, title, views, likes, duration, and other metadata) were extracted and documented at the time of the search. In line with previous studies on surgical video quality, the results were sorted by view count, as users are more likely to engage with highly viewed videos. We applied a minimum threshold of 2500 views as a pragmatic visibility cut-off. Although prior studies in thoracic and general surgery did not use the same numeric value, they employed popularity-based inclusion criteria (e.g., selecting the most-viewed videos), and our approach follows this principle12,13. We acknowledge that this threshold may have excluded recently uploaded but potentially high-quality videos. This approach aimed to capture a realistic and representative sample of the most frequently accessed YouTube videos relevant to video-assisted thoracoscopic segmentectomy. Four videos were excluded from the final analysis: one because it was a patient information video, and three because they depicted robotic segmentectomy, which fell outside the study’s focus on thoracoscopic techniques. After full review, a final cohort of 34 educational videos was included for analysis.
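
The selection logic described above can be illustrated with a minimal Python sketch. It is not the authors' workflow (screening was done manually by two surgeons); the `Video` record, the two reviewer flags, and the sample entries are hypothetical and serve only to show how the 2500-view cut-off and the two exclusion criteria combine.

```python
from dataclasses import dataclass

MIN_VIEWS = 2500  # visibility cut-off stated in the text


@dataclass
class Video:
    title: str
    views: int
    is_robotic: bool       # hypothetical flag set by a human reviewer
    is_patient_info: bool  # hypothetical flag set by a human reviewer


def select_videos(candidates):
    """Apply the view threshold and exclusions, then sort by views (descending)."""
    eligible = [
        v for v in candidates
        if v.views >= MIN_VIEWS and not v.is_robotic and not v.is_patient_info
    ]
    return sorted(eligible, key=lambda v: v.views, reverse=True)


# Made-up records for illustration only (not study data).
sample = [
    Video("A", 5000, False, False),
    Video("B", 2400, False, False),  # below the view threshold
    Video("C", 9000, True, False),   # robotic procedure, excluded
    Video("D", 3000, False, True),   # patient information video, excluded
    Video("E", 2500, False, False),
]

selected_titles = [v.title for v in select_videos(sample)]
print(selected_titles)
```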

Each video was independently screened by two experienced thoracic surgeons to determine eligibility and relevance. The following metadata were systematically extracted for each video: title, URL, number of views, upload duration (in days), video length (in seconds), number of subscribers, image quality (e.g., 1080p, 720p), country of origin, and number of likes. Publisher identity was categorized as individual or institutional, based on channel information and video presentation. The resected segment was classified according to the anatomical label provided by the video title or operative footage.

Given the absence of a universally accepted, segmentectomy-specific video evaluation tool, we selected the LAParoscopic surgery Video Educational GuidelineS (LAP‑VEGaS) criteria to systematically assess video quality. This validated framework has been widely applied in previous studies to evaluate technical accuracy, anatomical clarity, and procedural coherence in laparoscopic and thoracoscopic surgical videos. LAP‑VEGaS was considered particularly appropriate for this study due to its structured emphasis on stepwise intraoperative education, which aligns with the core instructional goals of thoracic surgical training. Its standardized nature also allows for reproducibility and comparability across studies assessing the educational value of online surgical content.

This study exclusively analyzed publicly accessible surgical videos that did not involve identifiable human subjects or patient data. In accordance with established ethical standards for research involving open-source content, institutional review board (IRB) approval was not required.

Software and statistical analysis

All statistical analyses were performed using SPSS version 27 (Statistical Package for the Social Sciences). A two-tailed p value of < 0.05 was considered statistically significant. All videos were independently evaluated by two experienced thoracic surgeons with extensive practice in minimally invasive anatomic resections, both of whom are actively engaged in resident teaching. The evaluators were blinded to the identity and institutional affiliation of the video uploaders and used the LAP-VEGaS scoring system. Inter-rater agreement for categorical assessments was measured using Cohen’s kappa (κ) coefficient, and any discrepancies were resolved through consensus-based discussion. Descriptive statistics were generated for all video characteristics and scoring variables. To assess the assumption of normality, both the Kolmogorov–Smirnov and Shapiro–Wilk tests were conducted. Non-parametric correlations between the educational quality score (LAP-VEGaS) and video popularity metrics (e.g., number of views, number of likes, video duration) were assessed using Spearman’s rank correlation coefficient (ρ).
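
For readers unfamiliar with Cohen’s kappa, the following Python sketch shows how the coefficient is computed from two raters’ categorical scores. It is an illustrative re-implementation, not the SPSS procedure used in this study, and the two rating lists are invented for demonstration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Invented item-level scores (0, 1, or 2) from two hypothetical raters.
scores_rater_1 = [0, 1, 2, 2, 1, 0, 2, 1]
scores_rater_2 = [0, 1, 2, 1, 1, 0, 2, 2]
kappa = cohens_kappa(scores_rater_1, scores_rater_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement for categorical items of this kind.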

Results

This study aimed to evaluate the educational quality of widely viewed YouTube videos on VATS segmentectomy using the LAP-VEGaS criteria. The complete list of videos selected through the structured selection process described above, ranked by number of views, is presented in Table 1. For each video, the table provides data on the number of views, upload duration (in days), video length (in seconds), image quality, number of likes, number of subscribers, country of origin, and type of YouTube channel.

Table 1 Videos analyzed and main characteristics.

A total of 34 videos were included in the final analysis (Fig. 1). The earliest video was uploaded in November 2011, while the most recent was published in November 2021. Among these, 24 videos (70.6%) were uploaded by personal YouTube channels, whereas 10 videos (29.4%) originated from institutional sources.

Fig. 1

Flow chart of videos included in the current study.

In terms of geographic distribution, the largest share of videos originated from China (n = 10, 29.4%), followed by Spain and the United States (each n = 8, 23.5%). Additional contributing countries included Australia and Italy (each n = 2, 5.9%), as well as the United Kingdom, Germany, India, and Israel (each n = 1, 2.9%). These findings indicate that most of the content was produced by individual users and was predominantly concentrated in China, Spain, and the United States (Table 2).

Regarding the type of commentary, 20 videos (58.8%) lacked any form of audio or written narration, 7 (20.6%) included only audio narration, 2 (5.9%) provided only written commentary, and 5 (14.7%) offered both audio and written explanations. Notably, the surgeon contributing the highest number of videos had a YouTube channel with 17,300 subscribers, and 15 videos from this channel were included in the study.

Table 2 Detail of the videos.

The most frequently resected segments were the right upper anterior segment (n = 5, 14.7%), right apical segment (n = 4, 11.8%), right posterior segment of the upper lobe (n = 4, 11.8%), and right superior segment of the lower lobe (n = 3, 8.8%) (Fig. 2).

Fig. 2

The distribution of the most frequently resected segments.

The median number of views was 3615 (standard deviation 2026.01; range 2551 to 10,453). The median time the videos had been available online was 3199 days (standard deviation 837.03; range 1315 to 4921 days). The median video length was 468 s (standard deviation 586.83; range 240 to 2418 s). The median number of likes was 19 (standard deviation 20.33; range 7 to 92) (Table 3).

Table 3 Video features.

To evaluate the relationship between video characteristics and the number of views, statistical analyses were performed (Table 4). The interpretation of the correlation coefficients in this study was based on the classification proposed by Schober et al.14. According to Spearman correlation analysis, a weak positive correlation was observed between video length (in seconds) and the number of views, and this relationship was statistically significant (rs = 0.340; p = 0.049). This finding suggests a slight tendency for longer videos to receive more views, although the strength of the association remains limited. A moderate positive correlation was found between the number of likes and view counts, and this association was highly statistically significant (rs = 0.612; p < 0.001). This indicates that videos with more likes are generally associated with higher numbers of views. A weak positive correlation was also identified between the number of subscribers and the number of views; however, this association did not reach statistical significance (rs = 0.291; p = 0.095). Similarly, a negligible positive correlation was detected between the duration of online availability (in days) and view counts, but this relationship was not statistically significant (rs = 0.011; p = 0.951). Regarding image quality, a weak positive correlation with view counts was observed; however, this association was also not statistically significant (rs = 0.104; p = 0.560). In addition, other variables—such as channel type (p = 0.821), country of origin (p = 0.797), and resected segment (p = 0.893)—were not significantly associated with the number of views.
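
As an illustration of the analysis above, the sketch below computes Spearman’s rank correlation (with average ranks for ties) and maps a coefficient to the descriptive labels of Schober et al. It is a plain-Python approximation of what a statistics package does, not the SPSS output; p-values are not computed, and the handling of values falling exactly on a category boundary is our assumption.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def schober_label(rho):
    """Descriptive label after Schober et al.; boundary handling is assumed."""
    r = abs(rho)
    if r < 0.10:
        return "negligible"
    if r < 0.40:
        return "weak"
    if r < 0.70:
        return "moderate"
    if r < 0.90:
        return "strong"
    return "very strong"


# Coefficients reported in the text, relabelled with the helper above.
for rho in (0.340, 0.612, 0.291, 0.011, 0.104):
    print(rho, schober_label(rho))
```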

Table 4 Statistical analysis of the association between video characteristics and number of views.

In the evaluation conducted according to the LAP-VEGaS guideline, a total of 9 criteria were considered. Each criterion was scored as “Not presented (0),” “Partially presented (+ 1),” or “Fully presented (+ 2),” resulting in a total possible score ranging from 0 to 18 (Table 5).

Table 5 LAP-VEGaS criteria.

Among the 34 videos analyzed in our study, total scores ranged between 2.00 and 14.00, with a mean score of 6.56 ± 3.96. The median score was calculated as 4.00 (Table 6). The fact that the median is lower than the mean indicates a positively skewed distribution, suggesting that the majority of videos demonstrated low compliance with LAP-VEGaS criteria.

In the validity analysis of the LAP-VEGaS video assessment tool, a total score of ≥ 11 has been recommended as the threshold for sufficient educational quality for publication15. In our study, only 8 of 34 videos (23.5%) reached this threshold. This finding indicates that the vast majority of the videos are educationally inadequate according to the LAP-VEGaS standards.
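
The scoring arithmetic described above (nine items, each scored 0, 1, or 2, with a total of ≥ 11 taken as adequate) can be sketched as follows. The function names and the example item scores are hypothetical and are shown only to make the calculation explicit.

```python
N_ITEMS = 9                # number of LAP-VEGaS criteria used in this study
MAX_SCORE = 2 * N_ITEMS    # each item scores 0, 1, or 2, so the maximum is 18
ADEQUACY_THRESHOLD = 11    # total score recommended as sufficient quality


def total_score(item_scores):
    """Sum the nine item scores after validating their number and range."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores")
    if any(s not in (0, 1, 2) for s in item_scores):
        raise ValueError("each item must be scored 0, 1, or 2")
    return sum(item_scores)


def is_adequate(total):
    """True when the total meets or exceeds the adequacy threshold."""
    return total >= ADEQUACY_THRESHOLD


def percent_of_max(total):
    """Total expressed as a percentage of the maximum possible score."""
    return 100 * total / MAX_SCORE


# Hypothetical item scores for one video (not taken from the study data).
example_items = [2, 2, 1, 2, 1, 2, 2, 1, 1]
example_total = total_score(example_items)
print(example_total, is_adequate(example_total), round(percent_of_max(example_total), 1))

# Proportion of videos reaching the threshold, using the counts reported above.
print(round(100 * 8 / 34, 1))
```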

Specifically, the standardized step-by-step presentation of the surgical procedure (LAP-VEGaS Item 4) was included in only 41.2% of the videos. The highest level of compliance with the LAP-VEGaS criteria was observed in video number 6, which achieved 77% of the total possible score. In contrast, videos numbered 11, 13, and 14 demonstrated the lowest compliance, each with only 11% of the total score.

Table 6 Descriptive statistics of LAP-VEGaS scores for evaluated videos.

The relationships between the LAP-VEGaS score and video characteristics were evaluated using Spearman correlation analysis (Table 7). A negligible positive correlation was found between the number of views and the LAP-VEGaS score, and it was not statistically significant (rs = 0.031; p = 0.860). A negligible negative correlation was observed between the number of likes and the LAP-VEGaS score, which was also not statistically significant (rs = − 0.069; p = 0.700). A negligible positive correlation was likewise found between video duration (in seconds) and the LAP-VEGaS score, again without statistical significance (rs = 0.051; p = 0.773). These results indicate that quantitative characteristics such as number of views, number of likes, and video duration are not significantly associated with the educational quality of the videos as assessed by the LAP-VEGaS criteria. In contrast, narration was significantly associated with higher LAP-VEGaS scores (Kruskal–Wallis, p < 0.001). Pairwise comparisons showed that videos with voice narration (p < 0.001), text narration (p = 0.008), and combined narration (p < 0.001) all had higher scores than non-narrated videos, with no significant differences among the narrated groups.
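
The Kruskal–Wallis comparison across narration groups can be illustrated with the sketch below, which computes the tie-corrected H statistic in plain Python. It is not the SPSS analysis reported here: the group scores are invented, and the p-value (obtained by comparing H with a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups) is deliberately left to a statistics package.

```python
def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    # Average 1-based rank for each distinct value, plus the tie-correction sum.
    rank_of = {}
    tie_sum = 0
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_sum += t ** 3 - t
        i = j + 1
    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    correction = 1 - tie_sum / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("H is undefined when all values are identical")
    return h / correction


# Invented LAP-VEGaS totals per narration group (not study data).
scores_by_narration = {
    "none": [2, 3, 4, 4],
    "audio": [10, 12, 13],
    "text": [9, 11],
    "both": [12, 14, 13],
}
h_stat = kruskal_wallis_h(list(scores_by_narration.values()))
print(f"H = {h_stat:.2f}")
```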

Table 7 Correlation between LAP-VEGaS score and video characteristics.

Discussion

While online video platforms provide valuable supplementary resources for surgical education, they cannot replace the structured, supervised, and hands-on training that remains fundamental to formal fellowship programs. Nevertheless, surgical videos—particularly those on YouTube—are now widely used by surgeons and trainees worldwide as accessible educational tools. Our study therefore aimed to evaluate whether such freely available resources align with established standards of surgical education. In contrast to most previous studies—which either lacked a standardized selection protocol or relied on randomly selected video samples—we adopted a reproducible strategy based on the analysis of the top 100 most-viewed videos returned by the YouTube search engine8,13,16. As YouTube does not disclose the total number of search results for any given query, we prioritized relevance based on view counts and applied predefined inclusion and exclusion criteria to ensure methodological rigor and comparability.

In the literature, various assessment tools have been used to evaluate the educational quality of surgical videos on YouTube, each with distinct strengths and intended purposes. In addition to the LAParoscopic surgery Video Educational GuidelineS (LAP-VEGaS)15, alternative frameworks include the Critical View of Safety (CVS)17, the Journal of the American Medical Association (JAMA) Benchmark Criteria18 and the Global Quality Score (GQS)19. While JAMA and GQS mainly assess general reliability, readability, and patient-centered information, CVS is procedure-specific for laparoscopic cholecystectomy. If these instruments had been applied, the results would likely have favored videos with polished presentation or general reliability rather than intraoperative didactic quality, and CVS is not directly transferable to thoracic surgery. By contrast, LAP-VEGaS focuses on intraoperative anatomy, stepwise education, and technical detail, aligning more closely with the instructional goals of thoracic surgical training. In the original validation of the LAP-VEGaS tool, ROC analysis demonstrated that a score of ≥ 11 correlated strongly with expert recommendations for acceptance of a video for publication or conference presentation (sensitivity 94%, specificity 73%). This validated threshold represents adequate educational quality and was therefore adopted in our study to benchmark the performance of VATS segmentectomy videos15.

A recent systematic review by Gorgy et al. (2023) further reinforces this choice by highlighting the widespread application of LAP-VEGaS in assessing video-based surgical education9. Of the 29 studies included in the review, nine specifically applied the LAP-VEGaS criteria, all of which reported that the majority of YouTube videos failed to meet acceptable educational standards. These studies consistently identified critical deficiencies such as inadequate demonstration of segmental anatomy, omission of key procedural steps, lack of pre- and postoperative context, and insufficient didactic narration. For instance, Balta et al. found low LAP-VEGaS and CVS adherence in thoracoscopic lobectomy videos, mirroring broader concerns about the unregulated nature of publicly available surgical content12.

Taken together, these findings underscore the importance of applying structured, validated tools like LAP‑VEGaS not only to evaluate but also to guide the production of high-quality surgical videos that meet the expectations of formal training environments.

With the increasing adoption of VATS segmentectomy as a parenchyma-sparing approach for early-stage NSCLC, the need for high-quality educational content has never been greater. In this study, we evaluated publicly available YouTube videos on VATS segmentectomy using the standardized LAP-VEGaS educational assessment tool. Our findings reveal that the majority of these videos lack essential components needed to support effective surgical training, raising significant concerns about their pedagogical value.

While YouTube provides global accessibility and a vast repository of surgical content, our analysis echoes previous studies suggesting that popularity does not equate to educational quality. Similar to the findings of Ferhatoglu et al.20 and Coşgun et al.21, we observed no significant correlation between view count or likes and LAP-VEGaS scores. This discrepancy highlights a fundamental limitation of using unfiltered platforms for professional education. Popularity metrics such as views and likes are likely driven by factors independent of pedagogical rigor, including editing style, production quality, uploader or institutional reputation, attention-grabbing titles and thumbnails, language accessibility (e.g., English narration or subtitles), and algorithmic exposure. In some cases, prominent surgeons may attract large audiences regardless of adherence to structured educational standards. These dynamics explain why highly viewed videos may not necessarily represent high-quality educational resources, a pattern consistently reported across other surgical specialties22,23.

Our findings align with prior research in other specialties, including general surgery and urology, where YouTube videos have also been found to be deficient in both content completeness and safety representation8,10,22,23. In thoracic surgery, where procedures often involve nuanced 3D anatomy, the absence of structured narration, clear visual aids, and postoperative outcomes further limits the educational potential of such videos. Our analysis shows that narration, regardless of format, is associated with superior educational quality. This practical insight suggests that including clear narration should be considered an essential element when producing surgical educational videos.

The use of the LAP-VEGaS framework allowed for a detailed evaluation of educational quality, yet even this tool may not fully capture procedural accuracy. A technically flawed video may score high on structure alone. Therefore, we propose that future frameworks incorporate dual-layered evaluation—assessing both educational formatting and procedural correctness, perhaps through peer-review by specialty societies.

To improve the educational value of surgical videos, we propose a concise checklist for content creators derived from LAP-VEGaS criteria and our findings. This checklist is designed for individual creators and includes: (i) providing case context while ensuring patient anonymity, (ii) presenting procedures in a standardized step-by-step fashion, (iii) continuous narration or on-screen annotation of anatomical landmarks, (iv) integration of pre- and postoperative context and outcomes, and (v) the use of diagrams or graphic aids. Collaboration with professional societies to establish peer review and endorsement mechanisms would further support creators in aligning their videos with formal training standards. A full version of this checklist is provided in Supplementary Table S1.

Given these findings, we strongly advocate for the development of centralized, peer-reviewed surgical video repositories curated by academic institutions or professional societies. YouTube remains an open-access platform without standardized submission criteria or quality control, making its content heterogeneous in educational value. In contrast, peer-reviewed surgical video repositories such as WebSurg, the Journal of Medical Insight (JOMI), and CTSNet Video Library provide curated content with expert peer review, structured didactic presentation, and disclosure standards. These platforms emphasize accuracy, reproducibility, and instructional clarity, which enhances their value as reliable educational resources compared with user-uploaded content on YouTube. To further maximize their educational value for trainees, curated repositories should incorporate specific features such as a standardized stepwise structure aligned with consensus checklists, mandatory narration or annotation of anatomical landmarks, provision of pre- and postoperative context, reporting of outcomes, and structured metadata (e.g., patient positioning, port placement, instruments used). Additional elements such as multilingual captions, conflict-of-interest disclosure, and visible endorsement by professional societies would enhance credibility and help learners readily identify trustworthy content. Platforms such as the AATS Video Library and the ESTS Learning Portal represent promising steps in this direction and may serve as models for future educational ecosystems.

Limitations

This study has several limitations. The analysis was restricted to English-language videos, which may have excluded high-quality content in other languages. The scoring process was inherently subjective, although this was mitigated by dual-rater review. The ≥ 2500 views threshold, while improving ecological validity for frequently accessed content, may have excluded newly uploaded but potentially high-quality videos. Because our analysis was cross-sectional and based on a single search at one time point, reproducibility may be affected: repeated searches could yield different results owing to the continually evolving nature of YouTube content. To mitigate this, all video characteristics and metrics were documented at the time of the search. Finally, although the LAP-VEGaS tool is validated for assessing educational quality, it does not directly measure procedural accuracy, and technically flawed videos may still achieve high structural scores.

Conclusion

While YouTube offers an accessible and popular platform for surgical learning, most videos on VATS segmentectomy do not meet minimal standards for structured surgical education. A shift toward validated, peer-reviewed educational content is necessary to ensure safe dissemination of operative knowledge in thoracic surgery.