Fig. 6: Common data transformations.

Common data transformations applied to meet the data product specifications of a publishable schema. All data transformations are documented and versioned to establish each schema’s data lineage, which is provided as part of the published schema’s data dictionary. For structured data, this includes the methodologies applied to manage outliers, handle missing data, and engineer features. For unstructured data, this includes the transformations applied to prepare raw data, select target data regions (e.g., segmentation), extract features, and perform dimensionality reduction.
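
As an illustration of how documented, versioned transformations can yield a lineage record, the following is a minimal Python sketch, not the pipeline used to produce the published schemas. All names here (`impute_median`, `clip_outliers`, `run_pipeline`, the version strings, and the clipping bounds) are hypothetical and chosen for the example. It covers only two structured-data steps, missing-data imputation and outlier handling.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, lower, upper):
    """Clamp every value into the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def run_pipeline(values, steps):
    """Apply each step in order and record what was done.

    Each step is a tuple (name, version, function, params). The returned
    lineage list holds one entry per applied step, which is the kind of
    record a data dictionary could carry (an assumption of this sketch).
    """
    lineage = []
    for name, version, func, params in steps:
        values = func(values, **params)
        lineage.append({"step": name, "version": version, "params": params})
    return values, lineage


# Hypothetical raw column with one missing value and one extreme value.
raw = [1.0, None, 3.0, 250.0]

# Hypothetical step names, versions, and bounds.
steps = [
    ("impute_median", "1.0.0", impute_median, {}),
    ("clip_outliers", "1.0.0", clip_outliers, {"lower": 0.0, "upper": 10.0}),
]

clean, lineage = run_pipeline(raw, steps)
print(clean)
for entry in lineage:
    print(entry)
```

Recording the step name, version, and parameters together with the output means the same transformation sequence can be replayed or audited later. Unstructured-data steps such as segmentation or dimensionality reduction could be logged in the same way.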