The information that regulates a gene's transcription
is encoded in conserved sequence patterns (motifs) in the
upstream or downstream DNA sequence (the promoter region) close to the
target gene. By combining gene expression measurements with the
genes' promoter sequences, I proposed a novel procedure for
identifying motifs that are functionally active under
certain stimuli. A nonlinear regression model, the single index model,
was used to relate a gene's promoter sequence information to
its mRNA expression measurements. Single index models postulate
that the response variable y depends on the predictors X only
through a single linear combination Xβ, via an unknown link
function f: y = f(Xβ, ε), where β is the index vector and ε
represents measurement error. In this talk, I will describe computationally
efficient variable selection procedures and criteria that we
developed within a profile likelihood framework for the single
index model. I will also demonstrate the advantages of these
methods both theoretically and empirically. Compared with existing
methods, our proposed procedures can greatly improve the sensitivity
and specificity of variable selection.
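
As a rough illustration of the single index model y = f(Xβ, ε) described above, the following minimal sketch simulates data with additive noise and recovers the direction of β by profiling out the unknown link f. It is not the procedure presented in the talk: the additive-noise form, the sine link, the two-predictor setup, the Gaussian kernel smoother, the bandwidth, the grid search, and all function names are assumptions made purely for illustration, and no variable selection step is included.

```python
import math
import random


def simulate(n, beta, link, noise_sd, rng):
    """Draw (X, y) from y = link(X beta) + noise, with standard normal predictors."""
    X = [[rng.gauss(0.0, 1.0) for _ in beta] for _ in range(n)]
    y = [
        link(sum(x_j * b_j for x_j, b_j in zip(row, beta))) + rng.gauss(0.0, noise_sd)
        for row in X
    ]
    return X, y


def profile_rss(X, y, beta, bandwidth):
    """Residual sum of squares after replacing the unknown link f by a
    leave-one-out Gaussian kernel smoother of y on the index X beta."""
    index = [sum(x_j * b_j for x_j, b_j in zip(row, beta)) for row in X]
    rss = 0.0
    for i, (u_i, y_i) in enumerate(zip(index, y)):
        num = 0.0
        den = 0.0
        for j, (u_j, y_j) in enumerate(zip(index, y)):
            if j == i:
                continue  # leave observation i out of its own fit
            w = math.exp(-0.5 * ((u_i - u_j) / bandwidth) ** 2)
            num += w * y_j
            den += w
        fitted = num / den if den > 0.0 else 0.0
        rss += (y_i - fitted) ** 2
    return rss


def estimate_angle(X, y, bandwidth=0.3, grid_size=36):
    """Grid search over unit vectors (cos t, sin t), t in [0, pi).
    Because f is unknown, beta is identified only up to sign and scale."""
    angles = [math.pi * k / grid_size for k in range(grid_size)]
    return min(
        angles,
        key=lambda t: profile_rss(X, y, (math.cos(t), math.sin(t)), bandwidth),
    )


rng = random.Random(0)
true_beta = (0.8, 0.6)  # hypothetical index vector of unit length
X, y = simulate(150, true_beta, math.sin, 0.2, rng)
theta_hat = estimate_angle(X, y)
beta_hat = (math.cos(theta_hat), math.sin(theta_hat))
print("estimated index direction:", beta_hat)
```

The grid search is only practical for two predictors. With many candidate motifs, as in the promoter sequence setting, a numerical optimizer combined with a selection criterion would be needed in its place.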