Abstract: Guangxi tourism texts are a tool for presenting China's image. However, despite certain achievements in recent years, many problems remain in the Chinese-to-English (C-E) translation of tourism texts, so improving the quality of tourism materials is of great practical significance. This study adopts Guo Moruo's "Creation" thought, which emphasizes creation, charming translation, and empathy with the source language and experience, with the aim of discovering proper and feasible translation standards and strategies so that translation better serves tourism development between Guangxi and ASEAN countries.
Abstract: The present study is a contrastive study of inter-sentence conjunctions in Chinese/English legal parallel texts. Conjunction is one of the five cohesive devices put forward by Halliday and Hasan (1976). Many scholars have applied their model of cohesion to the study of English and Chinese. As for the use of conjunction in the two languages, most scholars believe that conjunctions occur more often in English legal texts than in Chinese ones, because Chinese is generally considered predominantly paratactic and English mainly hypotactic. Moreover, little detailed contrastive study has so far been done on conjunctions in Chinese/English non-literary texts.
Abstract: The author compares Chinese and Western language features in tourist texts, shares four strategies for translating Chinese tourist texts, and calls for building a corpus for Chinese-English translation of tourist texts.
Abstract: Purpose: We present an analytical, open source and flexible natural language processing and text mining method for topic evolution, emerging topic detection and research trend forecasting for all kinds of data-tagged text. Design/methodology/approach: We make full use of the functions provided by the open source VOSviewer and Microsoft Office, including a thesaurus for data clean-up and a LOOKUP function for comparative analysis. Findings: Through application and verification in the domain of perovskite solar cells research, this method proves to be effective. Research limitations: A certain amount of manual data processing and a specific research domain background are required for better, more illustrative analysis results. Adequate time for analysis is also necessary. Practical implications: We try to set up an easy, useful, and flexible interdisciplinary text analysis procedure for researchers, especially those without solid computer programming skills or who cannot easily access complex software. This procedure can also serve as a good example for teaching information literacy. Originality/value: This text analysis approach has not been reported before.
Funding: Supported in part by the National Natural Science Foundation of China [62102136], the 2020 Opening Fund for Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering [2020SDSJ06], and the Construction Fund for Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering [2019ZYYD007].
Abstract: Generation-based linguistic steganography is a popular research area of information hiding. Text generative steganography based on conditional probability coding is a direction that researchers have recently paid attention to. However, in the course of our experiments, we found that hiding secret information in the text tends to destroy the statistical distribution characteristics of the original text. This indicates that text quality drops noticeably as the embedding rate increases and that the topic of the generated texts is uncontrollable, so there is still room for improvement in concealment. In this paper, we propose a topic-controlled steganography method guided by graph-to-text generation. The proposed model can automatically generate steganographic texts carrying secret messages from knowledge graphs, and the topic of the generated texts is controllable. We also provide a graph path coding method with corresponding detailed algorithms for graph-to-text generation. Unlike traditional linguistic steganography methods, we encode the secret information during graph path coding rather than using conditional probability. We test our method in different aspects and compare it with other text generative steganographic methods. The experimental results show that the proposed model can effectively improve the quality of the generated text and significantly improve the concealment of steganographic text.
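The abstract above does not spell out its graph path coding algorithm, so the following is only a toy sketch of the general idea of hiding bits in the choice of path through a graph. Everything in it is an assumption made for illustration: the five-node graph, the function names `encode_path` and `decode_path`, and the rule that a node with d outgoing edges carries floor(log2 d) bits. The graph-to-text generation step that would turn the path into a sentence is not shown.

```python
# Toy graph: each node maps to an ordered list of distinct neighbours.
# Hypothetical data, not taken from the paper.
GRAPH = {
    "A": ["B", "C", "D", "E"],  # 4 edges -> 2 bits per visit
    "B": ["A", "C"],            # 2 edges -> 1 bit
    "C": ["A", "D"],            # 2 edges -> 1 bit
    "D": ["A"],                 # 1 edge  -> 0 bits (forced move)
    "E": ["A", "B"],            # 2 edges -> 1 bit
}


def encode_path(graph, start, bits, max_steps=1000):
    """Walk the graph, letting the secret bit string choose each edge."""
    path, node, pos, steps = [start], start, 0, 0
    while pos < len(bits):
        if steps >= max_steps:
            raise ValueError("max_steps reached before all bits were embedded")
        edges = graph.get(node, [])
        if not edges:
            raise ValueError(f"dead end at {node!r} with bits left to embed")
        width = len(edges).bit_length() - 1  # floor(log2(number of edges))
        # Pad the final chunk with zeros if the message runs out mid-chunk.
        chunk = bits[pos:pos + width].ljust(width, "0")
        index = int(chunk, 2) if width else 0
        node = edges[index]
        path.append(node)
        pos += width
        steps += 1
    return path


def decode_path(graph, path, n_bits):
    """Recover the bit string from the sequence of edges that was taken."""
    bits = []
    for node, nxt in zip(path, path[1:]):
        edges = graph[node]
        width = len(edges).bit_length() - 1
        if width:
            bits.append(format(edges.index(nxt), f"0{width}b"))
    return "".join(bits)[:n_bits]  # drop any zero padding


secret = "10110"
path = encode_path(GRAPH, "A", secret)
recovered = decode_path(GRAPH, path, len(secret))
```

In this sketch the receiver needs the same graph, the same edge ordering, and the message length; how the published method handles those details is not stated in the abstract.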
Funding: Supported by the National Natural Science Foundation of China (52274205) and the Project of Education Department of Liaoning Province (LJKZ0338).
Abstract: Automatic text summarization (ATS) plays a significant role in Natural Language Processing (NLP). Abstractive summarization produces summaries by identifying and compressing the most important information in a document. However, relatively few comprehensively evaluated abstractive summarization models work well for specific types of reports, owing to the unstructured and colloquial character of their text. In particular, Chinese complaint reports, generated by urban complainers and collected by government employees, describe problems residents face in daily life, and the problems they report require a speedy response. Automatic summarization tasks for these reports have therefore been developed. However, as with traditional summarization models, the generated summaries still suffer from problems of informativeness and conciseness. To address these issues and generate suitably informative and less redundant summaries, a topic-based abstractive summarization method is proposed to obtain global and local features. Additionally, a heterogeneous graph of the original document is constructed using word-level and topic-level features. Experiments and analyses on public review datasets (Yelp and Amazon) and our constructed dataset (Chinese complaint reports) show that the proposed framework effectively improves the performance of the abstractive summarization model for Chinese complaint reports.
Funding: Supported by the National Natural Science Foundation of China [Grant Number: 92067106] and the Ministry of Education of the People's Republic of China [Grant Number: E-GCCRC20200309].
Abstract: As the COVID-19 pandemic swept the globe, social media platforms became an essential source of information and communication for many. International students, in particular, turned to Twitter to express their struggles and hardships during this difficult time. To better understand the sentiments and experiences of these international students, we developed the Situational Aspect-Based Annotation and Classification (SABAC) text mining framework. This framework uses a three-layer approach, combining baseline Deep Learning (DL) models with Machine Learning (ML) models as meta-classifiers to accurately predict the sentiments and aspects expressed in tweets from our collected Student-COVID-19 dataset. Using the proposed aspect2class annotation algorithm, we labeled bulk unlabeled tweets according to the aspect terms they contain. However, we also recognized the challenge of reducing the data's high dimensionality and sparsity to improve performance and annotation on unlabeled datasets. To address this issue, we proposed the Volatile Stopwords Filtering (VSF) technique to reduce sparsity and enhance classifier performance. On the resulting Student-COVID Twitter dataset, the framework achieved an accuracy of 93.21% when using random forest as the meta-classifier. Through testing on three benchmark datasets, we found that the SABAC ensemble framework performed exceptionally well. Our findings showed that international students during the pandemic faced various issues, including stress, uncertainty, health concerns, financial stress, and difficulties with online classes and returning to school. By analyzing and summarizing these annotated tweets, decision-makers can better understand and address the real-time problems international students face during the ongoing pandemic.
Abstract: Automation has recently come to be considered vital in most fields, since computing methods play a significant role in facilitating work such as automatic text summarization. Most of the computing methods used in real systems are based on graph models, which are characterized by their simplicity and stability. This paper therefore proposes an improved extractive text summarization algorithm based on both topic and graph models. The methodology of this work consists of two stages. First, the well-known TextRank algorithm is analyzed and its shortcomings are investigated. Then, an improved method is proposed with a new computational model of sentence weights. Experiments were carried out on the standard DUC2004 and DUC2006 datasets, where the proposed improved graph model algorithm, TG-SMR (Topic Graph-Summarizer), was compared with four other text summarization methods. The experimental results show that the proposed TG-SMR algorithm achieves higher ROUGE scores. The authors expect TG-SMR to open new ground in performance on ROUGE evaluation indicators.
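For readers unfamiliar with the baseline that the abstract above analyzes, here is a minimal sketch of plain TextRank sentence ranking. It is not TG-SMR: the topic model and the new sentence-weight formula are not described in the abstract and are not reproduced. The naive sentence splitter, the function names, and the use of word sets in the overlap similarity are simplifying assumptions.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]


def similarity(tokens_a, tokens_b):
    """Word overlap normalised by log sentence lengths (TextRank-style)."""
    wa, wb = set(tokens_a), set(tokens_b)
    if not wa or not wb:
        return 0.0
    denom = math.log(len(wa)) + math.log(len(wb))
    return len(wa & wb) / denom if denom > 0 else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    tokens = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    n = len(sentences)
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


text = ("The cat sat on the mat. The cat chased a mouse on the mat. "
        "Stock markets fell sharply today. A mouse feared the cat.")
summary = summarize(text, k=2)
```

A sentence that shares no words with any other receives only the base score of 1 - damping, which is why the unrelated third sentence is never selected here.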
Abstract: This article discusses Chinese topics and English marked themes. It is divided into four parts. Part I is the introduction, which presents the concepts of Chinese topic and English marked theme. Part II describes the different types of Chinese topics and English marked themes. Part III discusses the functions of Chinese topics and English themes. Part IV is the conclusion.
Funding: The author extends his appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (R.G.P.2/55/40/2019), received by Fahd N. Al-Wesabi. www.kku.edu.sa.
Abstract: Due to the rapid increase in the exchange of text information via internet networks, the security and reliability of digital content have become a major research issue. The main challenges faced by researchers are authentication, integrity verification, and tampering detection of digital content. In this paper, a text zero-watermarking and text feature-based approach is proposed to improve the tampering detection accuracy of English text content. The proposed approach embeds and detects the watermark logically without altering the original English text document. Based on a hidden Markov model (HMM), the fourth-level order of the word mechanism is used to analyze the contents of the given English text and find the interrelationship between the contexts. The extracted features are used as watermark information and integrated with digital zero-watermarking techniques. To detect eventual tampering, the proposed approach has been implemented and validated with attacked English text. Experiments were performed using four standard datasets of varying lengths under multiple random locations of insertion, reorder, and deletion attacks. The experimental and simulation results demonstrate the tampering detection accuracy of our method against all of these tampering attacks. Comparison results show that the proposed approach outperforms all the other baseline approaches in terms of tampering detection accuracy.
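The zero-watermarking idea in the abstract above, deriving the watermark from the text rather than embedding anything in it, can be illustrated with a small sketch. This is an assumption-laden approximation and not the authors' algorithm: it treats the "fourth-level order of the word mechanism" as a table mapping each run of four words to the words that follow it, hashes each entry with SHA-256, and scores a suspect text by Jaccard overlap of the hash sets. All function names are hypothetical.

```python
import hashlib
import re
from collections import defaultdict

ORDER = 4  # assumed reading of "fourth-level order" in the abstract


def tokenize(text):
    """Lower-case word tokens; punctuation is ignored."""
    return re.findall(r"[A-Za-z0-9']+", text.lower())


def transition_table(text, order=ORDER):
    """Map each run of `order` words to the sorted words that follow it."""
    words = tokenize(text)
    table = defaultdict(set)
    for i in range(len(words) - order):
        table[tuple(words[i:i + order])].add(words[i + order])
    return {state: sorted(nxt) for state, nxt in table.items()}


def generate_watermark(text, order=ORDER):
    """Hash every state/next-word pattern; the text itself is not modified."""
    return {
        hashlib.sha256(
            (" ".join(state) + "->" + ",".join(nxt)).encode("utf-8")
        ).hexdigest()
        for state, nxt in transition_table(text, order).items()
    }


def match_rate(original_watermark, suspect_text, order=ORDER):
    """Jaccard overlap between the stored watermark and the suspect's patterns."""
    suspect = generate_watermark(suspect_text, order)
    union = original_watermark | suspect
    if not union:
        return 1.0
    return len(original_watermark & suspect) / len(union)


original = "the quick brown fox jumps over the lazy dog near the river bank"
watermark = generate_watermark(original)  # stored separately from the text
tampered = "the quick brown fox never jumps over the lazy dog near the river bank"
```

A match rate of 1.0 means every registered pattern survives unchanged; insertion, deletion, or reordering of words breaks the patterns around the edit and lowers the rate.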
Abstract: The Analects, Mengzi and Xunzi are the three foremost classical works of pre-Qin Confucianism, epitomizing the thoughts and ideas of Confucius, Mencius and Xun Kuang. There have been many spirited and in-depth discussions among scholars of all kinds on their ideological inheritance and development. This paper tries to cast new light on these discussions through "machine reading".
Funding: National Social Science Project (2013) of School of Foreign Languages of Southwest University of Political Science and Law: Comparative Studies on English Versions of Laws in the Qing Dynasty, Project Number: 13BYY030.
Abstract: Like any other legal language, legal English features a wealth of complex legal concepts as well as plenty of unique, highly professional terms and complicated syntax. Proper translation of legal English texts into legal Chinese has a great social and economic impact. Accordingly, "accuracy" has always been regarded as the primary principle of legal English translation. However, as legal English falls within the ambit of the common law system and legal Chinese within the civil law system, obtaining real accuracy, or even the effect of "equivalence", remains an ideal pursuit rather than an attained standard of legal English translation. Having probed into Skopostheorie, which "boasts itself of one of the deconstructive translation studies" [1] and focuses on the target text's function and practicability, the author of this article finds that, for legal English translation, skopos theory offers, in terms of text typology, a reasonable elucidation of several translation strategies adopted in the target text.
Abstract: Business English is a variety of English used in international business activities for business purposes. This paper elaborates on the linguistic features of business English from the aspects of phonetics, lexicology, syntax and text linguistics. It aims to help learners and business people understand business English and improve their language capacities in business practice.