Figure 1

Selection of samples for Affymetrix GeneChip arrays with validation of groups. (A) A scattergram used to select different subsets from MDA-MB-231 cells stably transfected with pEGFP1-Oct3/4 (left panel). The relative GFP intensity of the three subsets is also depicted in a histogram (right panel). (B) Principal component analysis (PCA) plot generated in Partek, showing the distribution of conditions and the clustering of sample replicates. (C) Hierarchical clustering of the samples using 2227 genes differentially expressed at p < 0.001 and FDR < 1.12%. (D) Heat map with hierarchical clustering, generated in Partek, comparing all three conditions and representing 6300 genes differentially expressed (±1.2-fold) at p < 0.1 and 10% FDR (false discovery rate). For each significant gene, the p value was calculated with a two-sample t-test, or an F-test for more than two classes, using the random variance model. (E) Molecular interaction network of BC overrepresented genes associated with OCT4-expressing BCCs, generated from the information in the Ingenuity knowledge base. Top upstream regulators, identified by IPA core analysis, were incorporated in the network. The Molecule Activity Predictor (MAP) illustrates upstream/downstream activation or inhibition of molecules in the network. Solid lines represent direct relationships and dashed lines represent indirect relationships between nodes. The prediction legend and the functional class of each gene product are shown in the legend key. (F) A scatter plot developed in GeneSpring using normalized signal values on a log10 scale. Statistical significance was assessed with an unpaired t-test between Oct4hi and Oct4low; the cut-off points were p value < 0.05 and fold change > 2.0. Specific genes of interest are annotated on the plot.
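
As a rough illustration of the panel F cut-offs, the sketch below shows how a fold-change threshold translates to log10-scaled signal values: a fold change above 2.0 corresponds to a difference of more than log10(2), about 0.301, between the two group means. The function name, the gene names, and all numeric values are hypothetical and are not taken from the study. The p value is assumed to be precomputed by an unpaired t-test, and the fold-change cut-off is assumed to apply in both directions.

```python
import math

# Thresholds mirroring the cut-offs stated for panel F.
P_CUTOFF = 0.05
FOLD_CUTOFF = 2.0


def passes_cutoffs(log10_hi, log10_low, p_value):
    """Return True if a gene meets both cut-offs (hypothetical helper).

    log10_hi, log10_low: mean normalized signal of each group on a log10 scale.
    p_value: precomputed unpaired t-test p value for that gene.
    """
    # A difference of log10 values equals the log10 of the fold change;
    # abs() treats up- and down-regulation alike (an assumption).
    log10_fold = abs(log10_hi - log10_low)
    return p_value < P_CUTOFF and log10_fold > math.log10(FOLD_CUTOFF)


# Made-up values for illustration only: (log10 hi, log10 low, p value).
genes = {
    "geneA": (3.0, 2.5, 0.01),  # about 3.2-fold, significant
    "geneB": (3.0, 2.8, 0.01),  # about 1.6-fold, below the fold cut-off
    "geneC": (3.0, 2.0, 0.20),  # 10-fold, but p value too high
}

selected = [name for name, (hi, low, p) in genes.items()
            if passes_cutoffs(hi, low, p)]
print(selected)
```

On a log10 scatter plot, this fold-change criterion corresponds to points lying more than about 0.301 units above or below the diagonal.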