Stat Appl Genet Mol
-
Stat Appl Genet Mol · Jun 2014
Applying shrinkage variance estimators to the TOST test in high dimensional settings.
Identifying differentially expressed genes is an important and widely used approach to investigating gene function and molecular mechanisms. A related problem that has drawn much less attention but is equally important is identifying genes that are constantly expressed across different conditions. A common practice is to treat genes that are not significantly differentially expressed as significantly equivalently expressed. This naive practice often leads to a large false discovery rate and low power. A more appropriate way to identify constantly expressed genes is to conduct high-dimensional statistical equivalence tests. A well-known equivalence test, the two one-sided tests (TOST) procedure, can be used for this purpose. However, because genomics data often come with small sample sizes, the variance estimator in the TOST test can be unstable. It is therefore natural to examine the application of shrinkage variance estimators to the TOST test in high-dimensional settings. ⋯ In this paper, we study the effect of shrinking the variance estimators in the TOST test in high-dimensional settings through simulation studies. In addition, we derive analytic formulas for the p-value of the resulting shrinkage-variance TOST test and apply the test to a real data set.
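To make the TOST idea concrete, the following is a minimal Python sketch for a single gene measured under two conditions. It is not the paper's method. The equivalence margin `delta`, the linear shrinkage rule, and the names `shrink_target` and `weight` are illustrative assumptions, and the sketch keeps the ordinary t reference distribution with n1 + n2 - 2 degrees of freedom rather than the analytic p-value formulas the paper derives.

```python
import math
from statistics import mean, variance


def t_cdf(t, df, steps=2000):
    """Student-t CDF via Simpson integration of the density from 0 to |t|."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(a)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    area *= h / 3
    return 0.5 + math.copysign(min(area, 0.5), t)


def tost_pvalue(x, y, delta, shrink_target=None, weight=0.0):
    """TOST p-value for H0: |mean(x) - mean(y)| >= delta.

    shrink_target and weight are hypothetical: when a target is given, the
    pooled variance is replaced by
    weight * shrink_target + (1 - weight) * pooled.
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    if shrink_target is not None:
        pooled = weight * shrink_target + (1 - weight) * pooled
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    diff = mean(x) - mean(y)
    p_lower = 1 - t_cdf((diff + delta) / se, df)  # tests H0: diff <= -delta
    p_upper = t_cdf((diff - delta) / se, df)      # tests H0: diff >= +delta
    return max(p_lower, p_upper)                  # both must be rejected


# Illustrative data only: one gene, three replicates per condition.
gene_a = [5.0, 5.1, 4.9]
gene_b = [5.05, 4.95, 5.0]
p_plain = tost_pvalue(gene_a, gene_b, delta=1.0)
p_shrunk = tost_pvalue(gene_a, gene_b, delta=1.0, shrink_target=1.0, weight=0.5)
```

In a high-dimensional setting one plausible choice of `shrink_target` is the average of the per-gene pooled variances, so that genes with few replicates borrow strength from the rest. That choice is an assumption of this sketch, not something stated in the abstract.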
-
Stat Appl Genet Mol · Jan 2005
Computing asymptotic power and sample size for case-control genetic association studies in the presence of phenotype and/or genotype misclassification errors.
It is well established that phenotype and genotype misclassification errors reduce the power to detect genetic association. Re-measuring the genotype and/or phenotype of a subset of the data with a gold-standard measurement (e.g., double-sampling) is one method to address this issue. We derive the non-centrality parameter (NCP) for the recently published Likelihood Ratio Test Allowing for Error (LRTae) in the presence of random phenotype and genotype errors. ⋯ In our cost/benefit analysis, double-sampling for genotype errors appears most beneficial (in terms of power gain) when the cost of double-sampling is relatively low, irrespective of the proportion of individuals double-sampled. In the presence of phenotype error, there is always a power gain from using the LRTae method for the parameter settings considered. We provide freely available software that performs power and sample size calculations for the LRTae method and cost/benefit analyses comparing power for the LRTae and LRTstd methods assuming equal costs.
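Once an NCP is available, asymptotic power follows from the non-central chi-square distribution. The Python sketch below illustrates that generic step only. The abstract does not give the NCP formula for the LRTae, so the NCP is taken as an input. The choice of 2 degrees of freedom, the restriction to even degrees of freedom, and the names `ncp_per_subject` and `required_sample_size` are assumptions of this sketch, which is unrelated to the authors' software.

```python
import math


def chi2_sf_even(x, df):
    """P(chi2_df > x) for positive even df, using the closed-form series."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_critical_even(alpha, df):
    """Critical value c with P(chi2_df > c) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def asymptotic_power(ncp, alpha=0.05, df=2, terms=1000):
    """Power of a chi-square test with the given NCP (moderate NCP only).

    Uses the Poisson-mixture form of the non-central chi-square:
    P(X > c) = sum_i Poisson(i; ncp/2) * P(chi2_{df+2i} > c).
    """
    crit = chi2_critical_even(alpha, df)
    half_c, lam = crit / 2.0, ncp / 2.0
    sf = chi2_sf_even(crit, df)       # central tail for df + 2*0
    k = df // 2                       # index of the next series term
    series_term = math.exp(-half_c) * half_c ** k / math.factorial(k)
    weight = math.exp(-lam)           # Poisson mass at i = 0
    power = 0.0
    for i in range(terms):
        power += weight * sf
        sf += series_term             # now the tail for df + 2*(i + 1)
        k += 1
        series_term *= half_c / k
        weight *= lam / (i + 1)       # Poisson mass at i + 1
    return power


def required_sample_size(ncp_per_subject, target_power=0.80, alpha=0.05, df=2):
    """Smallest N reaching target power, assuming the NCP grows linearly in N."""
    n = 1
    while asymptotic_power(n * ncp_per_subject, alpha, df) < target_power:
        n += 1
    return n
```

The sample-size search relies on the NCP being proportional to the total sample size for fixed design proportions. Here `ncp_per_subject` is a hypothetical quantity that would come from the error model and genetic parameters.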