Table 1 Results detailing the performance of the SVM, SSAST and BNN models on the nine evaluation tasks for each of the four audio modalities: sentence, three coughs, cough and exhalation

From: Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers

Train			Standard (9,379⁺ 16,518⁻)				Match (2,599⁺ 2,599⁻)				Random (20,000⁺ 37,665⁻)
Test			Standard (3,820⁺ 7,301⁻)	Match (907⁺ 907⁻)	Long (10,315⁺ 20,509⁻)	Long match (2,098⁺ 2,098⁻)	Standard (3,820⁺ 7,301⁻)	Match (907⁺ 907⁻)	Long (10,315⁺ 20,509⁻)	Long match (2,098⁺ 2,098⁻)	Random (3,514⁺ 6,663⁻)
Sentence	SVM	UAR	0.669	0.566	0.699	0.570	0.658	0.567	0.646	0.579	0.721
		ROC	0.732	0.596	0.766	0.591	0.714	0.600	0.693	0.597	0.796
		PR	0.578	0.574	0.625	0.580	0.553	0.583	0.515	0.576	0.686
	SSAST	UAR	0.733	0.594	0.739	0.583	0.692	0.602	0.666	0.572	0.763
		ROC	0.800	0.619	0.818	0.621	0.760	0.635	0.732	0.604	0.846
		PR	0.684	0.594	0.715	0.594	0.631	0.626	0.590	0.579	0.774
	BNN	UAR	0.685	0.586	0.702	0.566	0.703	0.604	0.687	0.581	0.702
		ROC	0.776	0.623	0.804	0.614	0.767	0.634	0.749	0.610	0.834
		PR	0.645	0.613	0.689	0.593	0.634	0.629	0.619	0.593	0.752
Three coughs	SVM	UAR	0.669	0.555	0.694	0.541	0.635	0.539	0.639	0.550	0.713
		ROC	0.727	0.568	0.759	0.558	0.684	0.560	0.688	0.568	0.782
		PR	0.570	0.550	0.605	0.538	0.523	0.553	0.510	0.546	0.647
	SSAST	UAR	0.681	0.555	0.696	0.551	0.652	0.546	0.662	0.555	0.725
		ROC	0.750	0.577	0.781	0.569	0.714	0.571	0.723	0.568	0.809
		PR	0.607	0.553	0.648	0.552	0.563	0.557	0.561	0.557	0.701
	BNN	UAR	0.678	0.558	0.696	0.551	0.657	0.558	0.660	0.535	0.716
		ROC	0.751	0.578	0.786	0.578	0.713	0.578	0.720	0.558	0.807
		PR	0.601	0.550	0.647	0.556	0.551	0.554	0.563	0.551	0.691
Cough	SVM	UAR	0.648	0.536	0.685	0.540	0.633	0.541	0.638	0.538	0.695
		ROC	0.712	0.544	0.748	0.550	0.687	0.559	0.692	0.559	0.763
		PR	0.559	0.526	0.594	0.535	0.533	0.550	0.521	0.545	0.625
	SSAST	UAR	0.681	0.545	0.690	0.541	0.638	0.528	0.640	0.543	0.702
		ROC	0.742	0.561	0.768	0.559	0.692	0.552	0.692	0.560	0.790
		PR	0.603	0.540	0.631	0.548	0.535	0.545	0.532	0.550	0.675
	BNN	UAR	0.647	0.540	0.661	0.534	0.618	0.532	0.638	0.541	0.672
		ROC	0.732	0.570	0.765	0.563	0.682	0.542	0.698	0.556	0.786
		PR	0.581	0.556	0.621	0.549	0.511	0.526	0.522	0.541	0.678
Exhalation	SVM	UAR	0.600	0.523	0.639	0.544	0.587	0.528	0.585	0.529	0.653
		ROC	0.646	0.555	0.690	0.559	0.618	0.541	0.621	0.550	0.712
		PR	0.477	0.560	0.513	0.547	0.444	0.536	0.431	0.543	0.566
	SSAST	UAR	0.649	0.553	0.663	0.558	0.593	0.531	0.588	0.531	0.660
		ROC	0.701	0.581	0.725	0.580	0.653	0.552	0.644	0.556	0.750
		PR	0.563	0.578	0.575	0.561	0.496	0.548	0.473	0.549	0.634
	BNN	UAR	0.576	0.529	0.581	0.526	0.603	0.525	0.601	0.541	0.608
		ROC	0.683	0.569	0.722	0.578	0.679	0.570	0.675	0.567	0.744
		PR	0.539	0.581	0.573	0.563	0.519	0.573	0.507	0.551	0.620

The highest value for each of the 15 (evaluation metric, test set) pairs, that is, for each pair in {UAR, ROC, PR} × {standard, match, long, long match, random}, taken across all modalities and models, is shown in bold. Each training and test set is listed with its support, the number of individuals who are COVID⁺ and COVID⁻. ROC, ROC–AUC; PR, PR–AUC; UAR, unweighted average recall.
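
As a rough illustration of the UAR metric defined in the note above, the following sketch computes unweighted average recall as the mean of the per-class recalls for a binary COVID⁺/COVID⁻ task. The function name `unweighted_average_recall` and the confusion counts are hypothetical and are not taken from Table 1; this is a minimal sketch under those assumptions, not the authors' evaluation code.

```python
def unweighted_average_recall(tp: int, fn: int, tn: int, fp: int) -> float:
    """Return the mean of the positive-class and negative-class recall."""
    recall_pos = tp / (tp + fn)  # recall on COVID-positive cases (sensitivity)
    recall_neg = tn / (tn + fp)  # recall on COVID-negative cases (specificity)
    return (recall_pos + recall_neg) / 2


# Hypothetical confusion counts for illustration only (not from Table 1).
uar = unweighted_average_recall(tp=60, fn=40, tn=160, fp=40)
print(f"UAR = {uar:.3f}")
```

Because each class contributes equally to the average, a classifier that always predicts a single class scores 0.5 whatever the class balance, which makes UAR comparable across the imbalanced and the matched test sets.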
