Speaker: Jianqing Fan (Princeton)

Title: Distributed estimation and inference with statistical guarantees

Abstract: This talk concerns hypothesis testing and parameter estimation in the context of the divide-and-conquer algorithm. In a unified likelihood-based framework, we propose new test statistics and point estimators obtained by aggregating various statistics from k subsamples of size n/k. In both low-dimensional and high-dimensional settings, we address the important question of how to choose k as n grows large, providing a theoretical upper bound on the number of subsamples that guarantees the errors due to the divide-and-conquer algorithm's incomplete use of the full sample are statistically negligible. In other words, the resulting estimators have the same inferential efficiencies and l_2 estimation rates as a practically infeasible oracle with access to the full sample. For parameter estimation, we show that the error incurred by the divide-and-conquer estimator is negligible relative to the minimax estimation rate of the full-sample procedure. Thorough numerical results are provided to back up the theory. (Joint work with Heather Battey, Han Liu, Junwei Lu and Ziwei Zhu)
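The aggregation scheme described in the abstract can be sketched in a minimal form. The snippet below uses ordinary least squares on simulated data as a simple stand-in for the likelihood-based estimators discussed in the talk; it splits the sample into k subsamples, fits each separately, and averages the k point estimates. All variable names, the choice of k, and the data-generating setup are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative simulated data: n observations, d features,
# true coefficient vector beta (all choices are for demonstration only).
n, d, k = 10000, 5, 10
beta = np.arange(1.0, d + 1.0)
X = rng.standard_normal((n, d))
y = X @ beta + rng.standard_normal(n)

def dac_ols(X, y, k):
    """Divide-and-conquer OLS: fit each of k subsamples of size n/k,
    then average the k point estimates."""
    estimates = []
    for Xs, ys in zip(np.array_split(X, k), np.array_split(y, k)):
        b, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
        estimates.append(b)
    return np.mean(estimates, axis=0)

beta_dac = dac_ols(X, y, k)                          # aggregated estimator
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)    # infeasible "oracle" fit

# When k is small relative to n, the averaged estimator stays close
# to the full-sample estimator, echoing the negligible-error guarantee.
print(np.linalg.norm(beta_dac - beta_full))
```

In this toy linear-model setting the averaged estimator is unbiased, so the key question, as in the talk, is how large k can grow before the aggregation error stops being negligible next to the full-sample error.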