[Academic Seminar] Knockoffs: A New Paradigm Powering Reproducible Scientific Research

Published: 2018-07-06 14:53:11   Source: 十大网赌网址

Topic: Knockoffs: A New Paradigm Powering Reproducible Scientific Research

Abstract: Many contemporary large-scale applications involve building interpretable models linking a large set of potential covariates to a response in a nonlinear fashion, such as when the response is binary. Although this modeling problem has been extensively studied, it remains unclear how to effectively control the fraction of false discoveries even in high-dimensional logistic regression, not to mention general high-dimensional nonlinear models. To address such a practical problem, we propose a new framework of model-X knockoffs, which reads from a different perspective the knockoff procedure (Barber and Candès, 2015) originally designed for controlling the false discovery rate in linear models. Whereas the knockoffs procedure is constrained to homoscedastic linear models with n ≥ p, the key innovation here is that model-X knockoffs provide valid inference from finite samples in settings in which the conditional distribution of the response is arbitrary and completely unknown. Furthermore, this holds no matter the number of covariates. Correct inference in such a broad setting is achieved by constructing knockoff variables probabilistically instead of geometrically. To do this, our approach requires that the covariates be random (independent and identically distributed rows) with a distribution that is known, although we provide preliminary experimental evidence that our procedure is robust to unknown/estimated distributions. To our knowledge, no other procedure solves the controlled variable selection problem in such generality, but in the restricted settings where competitors exist, we demonstrate the superior power of knockoffs through simulations. Finally, we apply our procedure to data from a case-control study of Crohn’s disease in the United Kingdom, making twice as many discoveries as the original analysis of the same data.
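As a rough illustration of what "constructing knockoff variables probabilistically" can look like, the sketch below samples knockoffs in the special case where the covariate rows are assumed to be i.i.d. N(0, Sigma) with a known correlation matrix Sigma. The Gaussian assumption, the equicorrelated choice of s, the 0.99 shrinkage factor, and the function name `gaussian_knockoffs` are illustrative choices made here, not details stated in the abstract; this is a minimal sketch, not the speaker's implementation.

```python
import numpy as np


def gaussian_knockoffs(X, Sigma, rng):
    """Sample knockoff copies for rows of X, assuming X_i ~ N(0, Sigma).

    Sigma is assumed to be a known, positive-definite correlation matrix.
    Uses the conditional Gaussian law
        X_tilde | X ~ N(X - X Sigma^{-1} D, 2D - D Sigma^{-1} D),
    with D = s * I (equicorrelated choice, an assumption of this sketch).
    """
    p = Sigma.shape[0]
    # Smallest eigenvalue of Sigma (eigvalsh returns ascending order).
    lam_min = np.linalg.eigvalsh(Sigma)[0]
    # Shrink slightly below the boundary so the conditional covariance
    # stays strictly positive definite for the Cholesky factorization.
    s = 0.99 * min(1.0, 2.0 * lam_min)
    D = s * np.eye(p)

    Sigma_inv_D = np.linalg.solve(Sigma, D)   # Sigma^{-1} D
    cond_mean = X - X @ Sigma_inv_D           # row-wise conditional mean
    cond_cov = 2.0 * D - D @ Sigma_inv_D      # conditional covariance

    # Draw Gaussian noise with covariance cond_cov and add it to the mean.
    L = np.linalg.cholesky(cond_cov)
    Z = rng.standard_normal(X.shape)
    return cond_mean + Z @ L.T
```

Under these assumptions, the knockoffs have the same covariance Sigma as the original covariates, and the cross-covariance between X and its knockoff equals Sigma off the diagonal and 1 - s on it. A call might look like `Xk = gaussian_knockoffs(X, Sigma, np.random.default_rng(0))`.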

Speaker: Yingying Fan (http://www-bcf.usc.edu/~fanyingy/)

Time: July 9, 2018, 2 p.m.

Location: 1308

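Speaker biography:

Assistant professor at the University of Southern California; Ph.D. from Princeton University, where she was a student of Jianqing Fan. She has published numerous research papers in international statistics journals, including The Annals of Statistics and the Journal of the Royal Statistical Society, and serves on the editorial boards of several economics and statistics journals. Her research interests include high-dimensional statistics, big data problems, high-dimensional classification, financial econometrics, and business applications.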

 

 
