Table 9. Preprocessing scripts used to prepare the datasets ingested into Petagraph.
Preprocess | Script |
---|---|
Creates Concept node and edge files for 4DN human chromosome loops based on 4DN dot call files and connects to HSCLO Concept nodes | 4DN_LOOP.R |
Creates Concept node and edge files for 4DN human chromosome Q values based on 4DN dot call files | 4DN_Q.R |
Creates node and edge files for the human embryonic heart single-cell marker data from Asp2019 | ASP2019.ipynb |
Creates edge files for ClinVar links between human genes, diseases and phenotypes | CLINVAR.R |
Creates edge files for CMAP relationships between compounds and human genes | CMAP.R |
Creates Concept node and edge files for TPM gene expression and eQTL data from the GTEx project | GTEX.ipynb |
Creates edge files for GTEXCOEXP relationships between human genes | GTEXCOEXP.R |
Creates edge files between HPO and HGNC from the HPO project | HGNCHPO.ipynb |
Creates edge files between HPO and MP Concept nodes based on PheKnowLator output | HPOMP.ipynb |
Creates edges between ENSEMBL and HSCLO | HSCLO_GENCODE.R |
Workflows to process Kids First phenotype and genotype count data | KF_main.ipynb |
Creates edge files for L1000 relationships between compounds and human genes | L1000.R |
Creates node and edge files for the mouse data from the IMPC | MPMGI.ipynb |
Creates node and edge files for MSigDB linking genes to pathways | MSIGDB.R |
Creates edge files for STRING relationships between human proteins (UniProt IDs) | STRING.R |