Extended Data Fig. 3: Data visualization tools in the G2P portal.

(a) Protein sequence viewer. This viewer displays residue-wise variants and protein features for the selected gene and transcript. Variants can be filtered by protein consequence and by database-specific filters. Data displayed in the viewer can be exported in tabular format ("View as table" button) and downloaded in CSV or PDF format ("Download" button). The figure shows gnomAD missense variants (singletons; blue) and ClinVar missense variants (pathogenic/likely pathogenic; orange) for gene CBS and transcript NM_000071, along with residue-wise physicochemical properties and UniProt sequence annotations. (b) Protein structure viewer. In the G2P portal, the structure viewer is coupled with the sequence viewer so that variants and protein features can be mapped interactively from the sequence onto the structure. Users can click a track in the sequence viewer to select the variants or features to visualize in the structure viewer, and can download the customized mapping results as a PyMOL-compatible file. The figure displays the concurrent mapping of gnomAD synonymous singleton variants (green spheres), ClinVar missense pathogenic/likely pathogenic variants (orange spheres), and the Domain annotation from UniProtKB (light blue) onto the structure (PDB: 7QGT). (c) Variant information and protein feature cards. These cards provide a per-variant summary of variant details and of the protein features at the variant position (see Methods, "Data visualization tools in the G2P portal", for details). The example in this figure shows the details of the CBS variant Gly116Arg from ClinVar and the physicochemical, structural, and functional features of the variant position, Gly116. The variant and features are linked to their sources whenever available. (d) Mutagenesis output viewer. This viewer shows the mutagenesis readouts for a gene as a heatmap, when available in MaveDB (ref. 28).
By hovering over the heatmap, users can view the readouts from the assay, and they can download the entire score set by clicking the download icon. The figure highlights residues 90-390, whose mutagenesis readouts differ from those of the rest of the protein.