Table 8 Multimodal versus unimodal comparative analysis for CCCS-CIC-AndMal-2020 and Blended malware image datasets.

From: Multimodal malware classification using proposed ensemble deep neural network framework

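Multimodal implementation (Late fusion)

| Ensemble models | RUS-Boost (%) | Random forest (%) | Subspace (%) | AdaBoost-M2 (%) | BagTree (%) | Proposed NNW (Numeric + Visual) (%) | RUSBoost (Numeric) and proposed NNW (Visual) (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Majority Vote | 94.34 | 94.22 | 55.52 | 79.34 | 92.13 | 94.02 | 95.36 |
| Stacked Ensemble | 74.95 | 74.38 | 24.05 | 53.94 | 74.83 | 82.55 | 82.64 |
| Boosted Ensemble | 94.34 | 76.13 | 46.84 | 63.11 | 73.83 | 94.02 | 95.04 |
| Bagged Ensemble | 94.34 | 76.13 | 50.56 | 64.68 | 73.83 | 94.02 | 95.36 |

Existing Mixed Dataset (ref. 68): 92.3

Unimodal accuracies

| Dataset | RUS-Boost (%) | Random forest (%) | Subspace (%) | AdaBoost-M2 (%) | BagTree (%) | Proposed NNW (Numeric + Visual) (%) | RUSBoost (Numeric) and proposed NNW (Visual) (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Numeric Dataset | 94.2 | 94.42 | 39.70 | 71.93 | 89.04 | 91.40 | 94.2 |
| Imagery Dataset | 93.45 | 93.35 | 73.50 | 87.42 | 94.23 | 96.06 | |

| Existing work | Reported value |
| --- | --- |
| Existing Numeric Dataset (ref. 69) | -; 93.36 |
| Existing Imagery Dataset (ref. 70) | 95 (precision) |
| Existing Pure Text (ref. 68) | 96.2 |
| Existing Words-changing Text (ref. 68) | 87.3 |
| Existing Words-missing Text (ref. 68) | 92.4 |
| Existing Pure Image (ref. 68) | 91.2 |
| Existing multimodal (API call + Visual) (ref. 43) | 93.5 |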
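
The Majority Vote row refers to late fusion, in which each model predicts a class label independently and the labels are combined afterwards. The snippet below is a minimal sketch of that idea only, not the authors' implementation. The function names, the three example models, the class labels ("Adware", "Trojan", "Benign") and the tie-breaking rule are hypothetical, chosen for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label among one sample's per-model predictions.

    Assumption: ties go to the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def late_fusion(per_model_predictions):
    """Fuse label predictions sample by sample.

    `per_model_predictions` holds one list of labels per model,
    all of the same length (one label per sample).
    """
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical predictions from three models on three samples.
numeric_model = ["Adware", "Trojan", "Benign"]
visual_model = ["Adware", "Benign", "Benign"]
third_model = ["Trojan", "Trojan", "Benign"]

fused = late_fusion([numeric_model, visual_model, third_model])
print(fused)  # ['Adware', 'Trojan', 'Benign']
```

With only two models, as in the last column of the table, a plain vote ties whenever the models disagree, so a practical system would need a rule such as weighting by validation accuracy. The first-label rule used here is just a placeholder.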