Fig. 2: Reliability-informed thresholding. | npj Digital Medicine

From: Robust language-based mental health assessments in time and space through social media

Spatiotemporal reliability of language-based mental health assessments (LBMHA) of depression across different granularities of space and time in the New York metropolitan area. The heatmap in (a) shows the 1 − Cohen’s d reliability of select New York metropolitan depression data; at each space and time unit, ≥20 unique users were required. From this heatmap, we target the smallest time unit within the smallest space unit whose reliability is greater than 0.9, which is county-week. The plot in (b) shows how the reliability of a county-week measurement of depression increases with the minimum number of unique users required to consider that county-week (the user threshold, UT). In the case of Gallup data, beyond a UT of 100 none of the county measurements can meet the minimum criteria to be reported. Horizontal lines are drawn at 0.8 and 0.9 reliability, which were used to select county user thresholds of 50 and 200. The standard error of the reliability is shown with red shading, and the 95% confidence interval is shown with error bars. The county-year intraclass correlations, test-length corrected (ICC2), at a UT of 50 are ICC2 = 0.33 for Gallup Sadness and ICC2 = 0.97 for LBMHA depression, while at a UT of 200 they are ICC2 = 0.87 for Gallup and ICC2 = 0.99 for LBMHA. Panel (c) shows data descriptives for the county-week dataset after applying user thresholds of 50 and 200, as per the reliability findings, and applying all other thresholds.
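
The thresholding idea described in this caption can be illustrated with a short sketch. This is not the authors' implementation: it assumes that reliability is computed as 1 − Cohen's d between two random halves of the users in a single region-time unit (for example, one county-week), and that units with fewer unique users than the user threshold are simply not reported. The split-half design, the function names `cohens_d` and `split_half_reliability`, and the flat list of per-user scores are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def cohens_d(a, b):
    """Absolute Cohen's d between two samples, using a pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    diff = abs(mean(a) - mean(b))
    if pooled_var == 0:
        # Both halves are constant: identical means give d = 0, otherwise d is unbounded.
        return 0.0 if diff == 0 else math.inf
    return diff / math.sqrt(pooled_var)


def split_half_reliability(user_scores, user_threshold, seed=0):
    """Return 1 - Cohen's d between two random halves of the users.

    Returns None when the unit has fewer unique users than `user_threshold`
    (hypothetical parameter name), mirroring the idea that such units are not reported.
    """
    if len(user_scores) < user_threshold:
        return None
    shuffled = list(user_scores)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    if half < 2:
        # Each half needs at least two users to estimate a variance.
        return None
    return 1.0 - cohens_d(shuffled[:half], shuffled[half:])


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic per-user depression scores for one county-week (illustrative data only).
    scores = [rng.gauss(0.0, 1.0) for _ in range(200)]
    for ut in (20, 50, 200, 500):
        print(ut, split_half_reliability(scores, user_threshold=ut))
```

Under these assumptions, raising the user threshold trades coverage for stability: fewer units pass the filter, but the two halves of each remaining unit tend to agree more closely, so 1 − Cohen's d moves toward 1.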
