Table 9. Comparison of model complexity, inference speed, and segmentation accuracy on the ISPRS Vaihingen dataset at \(256\times 256\) input resolution. Params and Size denote the number of learnable parameters (in millions) and the storage footprint (in MB), respectively, while FPS indicates inference throughput in frames per second. OA is overall accuracy and mIoU is mean intersection over union. Best results are shown in bold, and second-best results are underlined.

Type	Model	Params (M)	Size (MB)	FPS	OA	F1	mIoU
CNN-based	MANet	35.86	137.05	55.55	90.05	88.55	79.91
	ABCNet	13.67	52.19	163.27	90.43	87.90	78.96
	PSPNet	49.07	187.42	35.38	89.31	86.23	76.39
Transformer-based	FTransUNet	203.40	775.93	10.96	88.45	85.68	75.55
	ASMFNet	83.48	321.60	32.43	88.14	78.51	67.82
	CMFNet	104.07	397.13	10.92	90.14	88.45	79.76
	UNetFormer	11.72	44.87	171.36	90.33	89.03	80.68
Diffusion-based	SegDiff	157.69	601.55	1.77	74.77	74.75	52.52
	RNDiff	6.29	24.09	7.17	90.89	88.70	80.19
	PKDiff	6.43	24.54	4.08	91.19	89.51	81.46
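
The Params and Size columns are closely related. As a rough check, the sketch below estimates the storage footprint from the parameter count. It assumes 32-bit floating-point weights and 1 MB = 1024 × 1024 bytes, neither of which the table states. The function name `estimated_size_mb` is hypothetical. The small gaps between estimated and reported values may come from non-learnable buffers stored alongside the weights, though that is also an assumption.

```python
def estimated_size_mb(params_millions: float, bytes_per_param: int = 4) -> float:
    """Approximate storage footprint in MB for a given parameter count.

    Assumes every parameter occupies `bytes_per_param` bytes (4 for float32)
    and that 1 MB = 1024 * 1024 bytes. Both are assumptions, not table facts.
    """
    return params_millions * 1e6 * bytes_per_param / (1024 ** 2)


# (Params in millions, reported Size in MB), copied from three rows of the table.
rows = {
    "MANet": (35.86, 137.05),
    "UNetFormer": (11.72, 44.87),
    "PKDiff": (6.43, 24.54),
}

for name, (params_m, reported_mb) in rows.items():
    est = estimated_size_mb(params_m)
    rel_gap = abs(est - reported_mb) / reported_mb
    print(f"{name}: estimated {est:.2f} MB, reported {reported_mb:.2f} MB, "
          f"relative gap {rel_gap:.2%}")
```

Under these assumptions the estimate stays within about 1% of the reported Size for the three rows checked, which suggests that Size largely reflects the parameter count rather than an independent measure of complexity.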
