Table 8. Effect of different deep learning models on the evaluation criteria values for QuickBird images.

From: Remote sensing image description based on word embedding and end-to-end deep learning

| Model | Abbreviation | Fusion P | Fusion R | Fusion F | Word P | Word R | Word F |
|---|---|---|---|---|---|---|---|
| CNN [25] | C | 0.8732 | 0.8430 | 0.8593 | 0.8809 | 0.8471 | 0.8632 |
| Bi-LSTM [28] | B-L | 0.8647 | 0.8388 | 0.8516 | 0.8799 | 0.8452 | 0.8517 |
| Attention [29] | A | 0.8707 | 0.8399 | 0.8550 | 0.8801 | 0.8465 | 0.8629 |
| DenseNet [31] | D-N | 0.8788 | 0.8451 | 0.8666 | 0.8822 | 0.8496 | 0.8656 |
| Attention-CNN-Bi-LSTM | A-C-B | 0.8820 | 0.8483 | 0.8698 | 0.8926 | 0.8531 | 0.8724 |
| Attention-CNN-IndRNN | A-C-I | 0.8811 | 0.8456 | 0.8630 | 0.8919 | 0.8524 | 0.8707 |
| CNN-LSTM | C-L | 0.8761 | 0.8439 | 0.8597 | 0.8872 | 0.8503 | 0.8684 |
| CNN-Bi-LSTM | C-B | 0.8817 | 0.8479 | 0.8645 | 0.8891 | 0.8510 | 0.8696 |
| CNN-IndRNN | C-I | 0.8752 | 0.8446 | 0.8596 | 0.8835 | 0.8502 | 0.8665 |
| End-to-end | E2E | 0.8905 | 0.8593 | 0.8778 | 0.9057 | 0.8741 | 0.8896 |