基于滑动窗口的方法,结合机器学习分类技术,可以判定文本的作者归属。但是此类方法需要精心挑选对应的文本特征,不同的文本特征选取可能会影响判定结果。针对以上问题,提出了一种基于快速文本分类(fastText)的文本作者归属判定模型。该...基于滑动窗口的方法,结合机器学习分类技术,可以判定文本的作者归属。但是此类方法需要精心挑选对应的文本特征,不同的文本特征选取可能会影响判定结果。针对以上问题,提出了一种基于快速文本分类(fastText)的文本作者归属判定模型。该模型融合滑动窗口的思想,引入词(字)向量、数据增强技术,从而充分利用文本信息、自动提取文本特征,并且以可视化的方式将结果呈现出来。使用该模型来检测《红楼梦》、《Roman de la Rose》的作者归属,实验结果表明《红楼梦》的前八十回与后四十回为不同作者所著、《Roman de la Rose》开篇4 058行(约50 000字)与后面17 724行(约218 000字)为不同作者所著。证明了Rolling-fastText模型判定文本作者归属的有效性。展开更多
A method of fast data processing has been developed to rapidly obtain evolution of the electron density profile for a multichannel polarimeter-interferometer system(POLARIS)on J-TEXT. Compared with the Abel inversio...A method of fast data processing has been developed to rapidly obtain evolution of the electron density profile for a multichannel polarimeter-interferometer system(POLARIS)on J-TEXT. Compared with the Abel inversion method, evolution of the density profile analyzed by this method can quickly offer important information. This method has the advantage of fast calculation speed with the order of ten milliseconds per normal shot and it is capable of processing up to 1 MHz sampled data, which is helpful for studying density sawtooth instability and the disruption between shots. In the duration of a flat-top plasma current of usual ohmic discharges on J-TEXT, shape factor u is ranged from 4 to 5. When the disruption of discharge happens, the density profile becomes peaked and the shape factor u typically decreases to 1.展开更多
文摘基于滑动窗口的方法,结合机器学习分类技术,可以判定文本的作者归属。但是此类方法需要精心挑选对应的文本特征,不同的文本特征选取可能会影响判定结果。针对以上问题,提出了一种基于快速文本分类(fastText)的文本作者归属判定模型。该模型融合滑动窗口的思想,引入词(字)向量、数据增强技术,从而充分利用文本信息、自动提取文本特征,并且以可视化的方式将结果呈现出来。使用该模型来检测《红楼梦》、《Roman de la Rose》的作者归属,实验结果表明《红楼梦》的前八十回与后四十回为不同作者所著、《Roman de la Rose》开篇4 058行(约50 000字)与后面17 724行(约218 000字)为不同作者所著。证明了Rolling-fastText模型判定文本作者归属的有效性。
基金supported by the National Magnetic Confinement Fusion Science Program of China(Nos.2014GB106000,2014GB106002,and2014GB106003)National Natural Science Foundation of China(Nos.11275234,11375237 and 11505238)Scientific Research Grant of Hefei Science Center of CAS(No.2015SRG-HSC010)
文摘A method of fast data processing has been developed to rapidly obtain evolution of the electron density profile for a multichannel polarimeter-interferometer system(POLARIS)on J-TEXT. Compared with the Abel inversion method, evolution of the density profile analyzed by this method can quickly offer important information. This method has the advantage of fast calculation speed with the order of ten milliseconds per normal shot and it is capable of processing up to 1 MHz sampled data, which is helpful for studying density sawtooth instability and the disruption between shots. In the duration of a flat-top plasma current of usual ohmic discharges on J-TEXT, shape factor u is ranged from 4 to 5. When the disruption of discharge happens, the density profile becomes peaked and the shape factor u typically decreases to 1.