7 Yule G U. On sentence length as a statistical characteristic of style in prose with application to two cases of disputed authorship[J]. Biometrika, 1938, 30: 363-390.
8 Gani J. Literature and statistics[M]//Kotz S, Johnson N L. Encyclopedia of Statistics. [S.l.]: Wiley, 1985: 90-95.
9 Valenza R J. Are the Thisted-Efron authorship tests valid?[J]. Computers and the Humanities, 1991, 25: 27-46.
10 Khmelev D, Tweedie F J. Using Markov chains for identification of Writers[J]. Literary and Linguistic Computing, 2001, 16(4): 299-307.
5 Zeng D, Wei Dong-hua, Chau M, et al. Domain-specific Chinese word segmentation using suffix tree and mutual information[J]. Information Systems Frontiers, 2011, 13: 115-125.
6 CCL Corpus[OL]. http://ccl.pku.edu.cn:8080/cclcorpus.
7 Nagao M, Mori S. A New Method of N-gram Statistics for Large Number of n and Automatic Extraction of Words and Phrases from Large Text Data of Japanese[C]//Proceedings of the 15th International Conference on Computational Linguistics. 1994: 611-615.
8 Yamamoto M, Church K W. Using Suffix Arrays to Compute Term Frequency and Document Frequency for All Substrings in a Corpus[J]. Computational Linguistics, 2001, 27(1): 1-40.
9 Banerjee S, Pedersen T. The Design, Implementation, and Use of the N-gram Statistics Package[C]//Proceedings of CICLing 2003. 2003: 370-381.