Table 2 Levels of HTAN data
Level | Single-cell RNA-seq | Multiplex imaging | Spatial transcriptomics |
---|---|---|---|
1 | Unaligned sequencing reads, usually in the FASTQ file format. | Raw imaging tiles that require preprocessing, such as stitching, registration or background subtraction. Typically in TIFF or a proprietary format. | Unaligned sequencing reads, usually in the FASTQ file format. |
2 | Aligned sequencing reads, usually in the BAM file format. | Multichannel image. Usually in the OME-TIFF file format, accompanied by a CSV file containing channel metadata. | Aligned sequencing reads, usually in the BAM file format. |
3 | Gene expression matrix. For example, a matrix of all cells by all genes, with expression counts. Multiple file formats are supported, including CSV, MTX and h5ad. | Segmentation masks denoting nuclei, cytoplasm, whole cells or regions of interest. Multiple file formats are supported although TIFF and OME-TIFF are recommended. | Gene expression matrix. For example, a matrix of all cells by all genes, with expression counts. Multiple file formats are supported, including CSV, MTX and h5ad. |
4 | Feature matrix. For example, a matrix of cluster assignments or imputed cell types across all sequenced cells. Multiple file formats are supported, including CSV and h5ad. | Feature matrix. For example, a matrix of mean intensity values per cell and channel. Multiple file formats are supported, including CSV and h5ad. | Feature matrix. For example, a matrix of cluster assignments or imputed cell types across all sequenced cells. Multiple file formats are supported, including CSV and h5ad. |
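
To make the relationship between the imaging levels concrete, the following is a minimal sketch of how a Level 4 feature matrix (mean intensity per cell and channel) could be derived from a Level 2 multichannel image and a Level 3 segmentation mask. It is not HTAN's actual pipeline. The function name `mean_intensity_per_cell`, the array layout (channels, height, width), and the convention that mask label 0 denotes background are all assumptions made for illustration, and the toy arrays stand in for data that would normally be read from OME-TIFF files.

```python
import numpy as np


def mean_intensity_per_cell(image, mask):
    """Hypothetical helper: summarize a multichannel image per segmented cell.

    image: float array of shape (channels, height, width)  (assumed layout)
    mask:  integer array of shape (height, width); 0 is assumed to be background
    Returns (cell_ids, features), where features has shape (n_cells, n_channels).
    """
    if image.shape[1:] != mask.shape:
        raise ValueError("image and mask must share the same height and width")

    # Every non-zero label in the mask is treated as one cell.
    cell_ids = np.unique(mask)
    cell_ids = cell_ids[cell_ids != 0]

    features = np.empty((cell_ids.size, image.shape[0]), dtype=float)
    for row, cell_id in enumerate(cell_ids):
        # Boolean indexing selects this cell's pixels in every channel,
        # giving an array of shape (channels, n_pixels).
        pixels = image[:, mask == cell_id]
        features[row] = pixels.mean(axis=1)
    return cell_ids, features


# Toy data: 2 channels, a 2 x 3 image, and two segmented cells.
image = np.array(
    [
        [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]],  # channel 0
        [[2.0, 2.0, 0.0], [2.0, 2.0, 0.0]],  # channel 1
    ]
)
mask = np.array([[1, 1, 2], [1, 1, 2]])

cell_ids, features = mean_intensity_per_cell(image, mask)
```

Each row of `features` corresponds to one cell and each column to one channel, which is the shape that would typically be written out as a CSV or h5ad file at Level 4.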