Table 2 NLP-ML models trained on microarray sample descriptions can accurately infer annotations for samples from five different genomics experiment types

From: Systematic tissue annotations of genomics samples by modeling unstructured metadata

 

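| Tissue model | RNA-seq | ChIP-seq | Methylation array | CGHA | Microarray | Total |
|---|---|---|---|---|---|---|
| Adipose tissue | 7 | 8 | 10 | 3 | 10 | 38 |
| Brain | 10 | 10 | 10 | 9 | 10 | 49 |
| Colon | 6 | 10 | 10 | 5 | 9 | 40 |
| Neural tube | 10 | 10 | 10 | 7 | 9 | 46 |
| Muscle tissue | 9 | 0 | 10 | 0 | 10 | 29 |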

  1. Each row corresponds to one of five top-performing NLP-ML tissue models. The last column shows the total out of 50 that each of these models annotated correctly. Columns 2–6 show the number of samples (out of 10) from each experiment type that each tissue model annotated correctly.
  2. Abbreviations: RNA-seq, RNA-seq of coding RNA; Methylation array, methylation profiling by array; CGHA, comparative genomic hybridization by array; Microarray, transcription profiling by array.
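
The row totals described in the first footnote can be recomputed from the per-experiment counts. The following is a minimal Python sketch, not part of the original article: the counts are transcribed from Table 2, while the names `correct`, `row_summary`, and `summary` are hypothetical and introduced here only for illustration.

```python
# Counts transcribed from Table 2: correct annotations (out of 10) per experiment type.
# All variable and function names below are illustrative assumptions.
EXPERIMENT_TYPES = ["RNA-seq", "ChIP-seq", "Methylation array", "CGHA", "Microarray"]
SAMPLES_PER_TYPE = 10

correct = {
    "Adipose tissue": [7, 8, 10, 3, 10],
    "Brain": [10, 10, 10, 9, 10],
    "Colon": [6, 10, 10, 5, 9],
    "Neural tube": [10, 10, 10, 7, 9],
    "Muscle tissue": [9, 0, 10, 0, 10],
}


def row_summary(counts):
    """Return (total correct, fraction correct) for one tissue model."""
    total = sum(counts)
    return total, total / (SAMPLES_PER_TYPE * len(counts))


summary = {tissue: row_summary(counts) for tissue, counts in correct.items()}

for tissue, (total, fraction) in summary.items():
    denominator = SAMPLES_PER_TYPE * len(EXPERIMENT_TYPES)
    print(f"{tissue}: {total}/{denominator} correct ({fraction:.0%})")
```

Each total is the sum of the five per-experiment counts, so the denominator is 5 × 10 = 50 samples per tissue model.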