In the era of big data, high-dimensional data always arrive in streams, making timely and accurate decisions necessary. It has become particularly important to rapidly and sequentially identify individuals whose behavior deviates from the norm. Aiming at identifying as many irregular behavioral patterns as possible, the authors develop a large-scale dynamic testing system in the framework of false discovery rate (FDR) control. By fully exploiting the sequential feature of data streams, the authors propose a screening-assisted procedure that filters streams and then tests only the streams that pass the filter at each time point. A data-driven optimal screening threshold is derived, giving the new method an edge over existing methods. Under some mild conditions on the dependence structure of the data streams, the FDR is shown to be strongly controlled, and the suggested approach for determining screening thresholds is shown to be asymptotically optimal. Simulation studies show that the proposed method is both accurate and powerful, and a real-data example is used for illustration.
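The filter-then-test idea at a single time point can be sketched as follows. This is a minimal illustration, not the authors' procedure: it swaps in the standard Benjamini-Hochberg step-up rule for the testing stage, and the names `screen_stats` and `threshold` are hypothetical. The screening statistic is assumed to be computed from earlier observations, so that it does not reuse the data behind the current p-values, and the threshold is supplied by hand rather than derived from data.

```python
def benjamini_hochberg(p_values, alpha):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    if m == 0:
        return []
    # Sort stream indices by p-value, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies under the BH line.
        if p_values[idx] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


def screen_then_test(p_values, screen_stats, threshold, alpha=0.05):
    """Test only the streams whose screening statistic reaches the threshold.

    screen_stats and threshold are illustrative assumptions; the abstract
    does not specify how either is computed.
    """
    passed = [i for i, s in enumerate(screen_stats) if s >= threshold]
    rejected = benjamini_hochberg([p_values[i] for i in passed], alpha)
    # Map positions in the filtered list back to the original stream indices.
    return [passed[j] for j in rejected]


p_values = [0.001, 0.2, 0.004, 0.035, 0.8]
screen_stats = [3.0, 0.1, 2.5, 2.2, 0.0]
flagged = screen_then_test(p_values, screen_stats, threshold=2.0)
unscreened = benjamini_hochberg(p_values, 0.05)
```

In this toy input, filtering shrinks the number of hypotheses from five to three, which loosens the step-up thresholds enough for the stream with p-value 0.035 to be flagged; without the filter it is not.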
This paper considers the problem of detecting structural changes in a high-dimensional regression setting. The structural parameters are subject to abrupt changes of unknown magnitudes at unknown locations. The authors propose a new procedure that minimizes a penalized least-squares loss function via a dynamic programming algorithm to estimate the locations of the change points. To alleviate the computational burden, the authors adopt a prescreening procedure that eliminates a large number of irrelevant points before the estimation procedure is run. The number of change points is determined via Schwarz’s information criterion. Under mild assumptions, the authors establish the consistency of the proposed estimators and provide error bounds for the estimated parameters that achieve an almost-optimal rate. Simulation studies show that the proposed method performs reasonably well in terms of estimation accuracy, and a real data example is used for illustration.
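To make the dynamic programming step concrete, here is a minimal sketch of penalized least-squares segmentation. It is a deliberate simplification and not the authors' estimator: it fits a piecewise-constant mean to a one-dimensional series instead of a high-dimensional regression, omits the prescreening step, and takes the per-change `penalty` as a user-supplied value rather than choosing the number of change points by Schwarz's information criterion. All function names are hypothetical.

```python
def segment_cost(prefix, prefix_sq, start, end):
    """Sum of squared deviations from the mean over y[start:end]."""
    n = end - start
    total = prefix[end] - prefix[start]
    total_sq = prefix_sq[end] - prefix_sq[start]
    return total_sq - total * total / n


def detect_change_points(y, penalty):
    """Minimize (sum of segment costs) + penalty * (number of changes)."""
    n = len(y)
    # Prefix sums give each segment cost in constant time.
    prefix, prefix_sq = [0.0], [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    best = [float("inf")] * (n + 1)
    best[0] = -penalty  # so the first segment carries no penalty
    last = [0] * (n + 1)
    for end in range(1, n + 1):
        for start in range(end):
            cand = best[start] + segment_cost(prefix, prefix_sq, start, end) + penalty
            if cand < best[end]:
                best[end] = cand
                last[end] = start

    # Backtrack through the stored segment starts to recover the change points.
    change_points = []
    t = n
    while t > 0:
        s = last[t]
        if s > 0:
            change_points.append(s)
        t = s
    return sorted(change_points)


series = [0.0] * 10 + [5.0] * 10
found = detect_change_points(series, penalty=3.0)
flat = detect_change_points([1.0] * 8, penalty=1.0)
```

The double loop costs O(n²) segment evaluations, which is the kind of computational burden that motivates discarding irrelevant candidate points before the search.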
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11771332, 11771220, 11671178, 11925106, 11971247; the Natural Science Foundation of Tianjin under Grant Nos. 18JCJQJC46000, 18ZXZNGX00140; and the 111 Project B20016. Mushtaq was also supported by the Fundamental Research Funds for the Central Universities.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 11771332, 11771220, 11671178, 11925106, 11971247; the Natural Science Foundation of Tianjin under Grant No. 18JCJQJC46000; and the Fundamental Research Funds for the Central Universities.