In corpus-based language research, computer software with corpus analysis functions (i.e., corpus analysis tools) is as essential as the corpora themselves. In most cases, the corpus analysis tool determines how a corpus can be used. To maximize the value of corpora, therefore, corpus analysis tools need complete query, statistics, and comparison functions. In this article, the core functions of corpus analysis tools are categorized as concordance, collocation, frequency statistics, and comparison. Concordance refers to the results returned by querying the corpus for texts that match certain query conditions, and includes three types: basic query, multi-conditional query, and corpus-specific query. Collocation refers to the results returned by querying for combinations of words that frequently occur in the corpus and have syntactic or semantic relationships. Frequency statistics show the frequency of words or other linguistic units in the corpus. The comparison function compares language usage, including comparing different linguistic phenomena in the same corpus and comparing the same linguistic phenomenon in different corpora (or sub-corpora). We select six representative fourth-generation corpus analysis tools, three from abroad and three from China, and review the four core functions in these tools. Both domestic and foreign corpus analysis tools have implemented diverse concordance functions, and Chinese corpus analysis tools also provide some concordance functions adapted to the characteristics of the Chinese language. In addition, the three foreign corpus analysis tools provide various collocation, frequency statistics, and comparison functions, while the three Chinese corpus analysis tools offer only simple frequency statistics and comparison functions, and none of them supports collocation yet. Finally, based on this comparison of domestic and foreign corpus analysis tools, the article makes suggestions for these tools. Different domestic corpus analysis tools should provide distinctive Chinese corpus analysis functions that complement each other. Domestic corpus analysis tools should: 1) further improve or add functions such as collocation, frequency statistics, and comparison; 2) sustain investment to accelerate software iteration; 3) strengthen research on Chinese corpus analysis methods; and 4) promote their transformation into practical tools. Both domestic and foreign corpus analysis tools should enhance corpus analysis functions based on deep learning and big data technologies, and add functions for multimedia and multimodal corpora. We hope this article will help readers gain a comprehensive understanding of the current state of the core functions of domestic and foreign fourth-generation corpus analysis tools and help further improve these tools, especially Chinese corpus analysis tools.
ZHANG Yongwei;WU Bingxin
Contemporary Linguistics