A basic problem in multivariate systems is finding predictive relations among the measurable variables of the system. The extent to which a target variable can be predicted from measurements of a set of predictor variables reveals the level of interaction among the variables. We want to quantify the degree to which the target variable is statistically determined by the predictor variables, where statistical determination does not imply physical causality. In signal processing systems, stochastic determination provides a window into the overall control structure. In the simple case of two variables possessing a bivariate normal distribution, the best predictive relation is linear and the correlation coefficient provides the desired coefficient of determination. For multivariate linear systems the problem can still be framed in terms of correlation, albeit in a less straightforward form; for nonlinear systems it cannot.
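The bivariate linear case can be illustrated numerically. The sketch below is not taken from the work itself: it assumes that "determination" is read as the relative reduction in mean-square error when moving from the best constant predictor (the mean of the target) to the best linear predictor, and the function name `linear_determination` and the simulated data are hypothetical. Under that reading, the in-sample reduction equals the squared sample correlation.

```python
import random
import statistics


def linear_determination(x, y):
    """Return (r, reduction): the sample correlation and the relative
    mean-square-error reduction of the least-squares line over the mean."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

    r = sxy / (sxx * syy) ** 0.5

    # Least-squares line y ~ slope * x + intercept.
    slope = sxy / sxx
    intercept = my - slope * mx

    # Error of the best constant predictor (the mean) vs. the fitted line.
    err_const = syy / len(y)
    err_lin = sum((b - (slope * a + intercept)) ** 2
                  for a, b in zip(x, y)) / len(y)
    return r, (err_const - err_lin) / err_const


# Simulated jointly normal pair with population correlation 0.8 (assumed data).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x]

r, reduction = linear_determination(x, y)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, error reduction = {reduction:.3f}")
```

The two printed quantities `r^2` and `error reduction` agree up to rounding, which is why a single correlation coefficient suffices in the linear case; no comparably simple summary is available once the predictor is nonlinear.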
We examine the quantification of nonlinear multivariate stochastic determination among gene expression levels measured with cDNA microarrays. The method allows knowledge of other relevant conditions, such as the application of particular stimuli or the presence of inactivating gene mutations, to be incorporated as predictive elements affecting the target expression level. The approach is general and can be applied to any class of nonlinear predictor functions. In this talk, prediction is based on a ternary perceptron, which receives one of three values for each predictor gene: +1 [up-regulated], −1 [down-regulated] or 0 [invariant]. External conditions are quantified as +1 [present] or 0 [not present]. Our reason for choosing a perceptron is twofold: first, it is intuitive; second, for n predictor variables, it has only n+1 parameters to estimate and therefore requires much less data than more general nonlinear predictors.
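A minimal sketch of such a ternary perceptron follows. The abstract specifies only the input coding and the n+1 parameter count; the reading of those parameters as n weights plus one bias, the symmetric output cutoff of 0.5, and the example weights are all assumptions made for illustration, and the names `ternary_threshold` and `ternary_perceptron` are hypothetical.

```python
def ternary_threshold(z, cutoff=0.5):
    """Map a real-valued activation to +1, 0 or -1.
    The symmetric cutoff of 0.5 is an assumed value, not from the source."""
    if z > cutoff:
        return 1
    if z < -cutoff:
        return -1
    return 0


def ternary_perceptron(weights, bias, inputs):
    """Predict the target's ternary expression state from the predictors.
    Parameters: one weight per predictor plus one bias, i.e. n + 1 in total."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return ternary_threshold(z)


# Hypothetical predictors: gene A, gene B, and one external condition.
weights = [1.0, -1.0, 0.5]
bias = 0.0

# Gene A up-regulated, gene B invariant, stimulus present.
up = ternary_perceptron(weights, bias, [1, 0, 1])
# Gene A invariant, gene B up-regulated, stimulus absent.
down = ternary_perceptron(weights, bias, [0, 1, 0])
# Both genes invariant, stimulus present: activation sits at the cutoff.
flat = ternary_perceptron(weights, bias, [0, 0, 1])

print(up, down, flat)
print("parameters to estimate:", len(weights) + 1)
```

With three predictors only four numbers have to be estimated, whereas an unconstrained ternary predictor on the same inputs would need a separate output for each of the 3 × 3 × 2 = 18 input combinations.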