Abstract: The present study examined the effects of text structure, structure awareness, and proficiency level on EFL learners' reading test performance. A total of 112 college-level students at distinct levels of English proficiency participated in the experiment. The subjects' performance on the recall of two passages written in different types of structure was examined. Statistical results indicate that text structure, structure awareness, and proficiency level all have main effects on the subjects' reading performance. More specifically, two major findings emerged from the investigation. On the one hand, text structure significantly affected the quantity but not the quality of the information recalled, while proficiency level and structure awareness had a significant impact on both the quantity and the quality of the information recalled. On the other hand, structure awareness was unrelated to either text structure or proficiency level. Implications of the findings for teaching L2/FL reading are suggested.
Abstract: Huangdi's Internal Classics (Neijing) is one of the most important ancient medical classics and has had a far-reaching influence on the medical field. A growing number of domestic and overseas scholars have published translations of the Neijing. Owing to the diversity of editions and differences in understanding, the translation styles and contents differ widely. This study focuses on the different translation styles for culture-specific lexicon, figures of speech, and four-Chinese-character structures in the Neijing.
Abstract: The newspaper is, to some extent, a mirror of our society, reflecting its latest changes and developments. News text is a linguistic representation of the world. This paper briefly introduces the structure, writing, and linguistic styles of news texts in order to increase readers' awareness of their distinctive features.
Abstract: Translational discourse requires at least three participants. It is therefore suggested to consider the universal model of the picture of the world, according to which it is much easier for a translator to combine the pictures of the world of an addressee and an author. An addressee is a mental image existing in the mind of an addresser during the creative process. Having defined its parameters, a translator can deliver the thought of the addresser to the addressee as accurately as possible and select means of expression that are clear to the addressee. The type of addressee correlates with "the relation to the new".
Funding: Supported by the Fund for Philosophy and Social Science of Anhui Province and the Fund for Human and Art Social Science of the Education Department of Anhui Province (Grant Nos. AHSKF0708D13 and 2009sk038).
Abstract: The probability-based covering algorithm (PBCA) is a new algorithm based on probability distribution. It decides, by voting, the class of test samples on the border of the coverage area, based on the probability of the training samples. With the original covering algorithm (CA), many test samples located on the border of the coverage cannot be classified by the spherical neighborhoods obtained. The network structure of PBCA is a mixed structure composed of both a feed-forward network and a feedback network. By adding some heterogeneous samples and enlarging the coverage radius, it is possible to decrease the number of rejected samples and improve recognition accuracy. Computer experiments indicate that the algorithm improves learning precision and achieves reasonably good results in text classification.
Abstract: With the remarkable growth of textual data sources in recent years, easy, fast, and accurate text processing has become a challenge with significant payoffs. Automatic text summarization is the process of compressing text documents into shorter summaries for easier review of their core contents, which must be done without losing important features and information. This paper introduces a new hybrid method for extractive text summarization with feature selection based on text structure. The major advantage of the proposed method over previous systems is the modeling of text structure and of the relationships between entities in the input text, which improves the sentence feature selection process and leads to the generation of unambiguous, concise, consistent, and coherent summaries. The paper also presents an evaluation of the proposed method based on precision and recall criteria. It is shown that the method produces summaries consisting of chains of sentences from the original text with the aforementioned characteristics.
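The abstract above does not say how precision and recall were computed for the summaries. As a minimal sketch, assuming an extractive summary is scored by comparing the set of selected sentence indices with a human reference selection (the function name and the sample indices are hypothetical, not taken from the paper), the two criteria could be computed like this:

```python
def precision_recall(selected, reference):
    """Score an extractive summary against a reference selection.

    Both arguments are iterables of sentence indices from the source text.
    Precision: share of selected sentences that are also in the reference.
    Recall: share of reference sentences that were selected.
    """
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: the system picks 4 sentences, the reference has 5.
system_summary = [0, 2, 5, 7]
reference_summary = [0, 2, 3, 7, 9]
p, r = precision_recall(system_summary, reference_summary)
print(f"precision={p:.2f} recall={r:.2f}")
```

Sentence-level set overlap is only one possible choice. Word- or n-gram-level overlap measures are also common for summarization, and the paper may have used one of those instead.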
Funding: Supported by the National Natural Science Foundation of China (Nos. 62007014 and 62177024), the Humanities and Social Sciences Youth Fund of the Ministry of Education (No. 20YJC880024), the China Post Doctoral Science Foundation (No. 2019M652678), and the Fundamental Research Funds for the Central Universities (No. CCNU20ZT019).
Abstract: Auto-grading, as an instructional tool, can reduce teachers' workload, provide students with instant feedback, and support highly personalized learning. The topic has therefore attracted considerable attention from researchers recently. To realize automatic grading of handwritten chemistry assignments, the problem of chemical notation recognition must be solved first. Recent handwritten chemical notation recognition solutions in the end-to-end trainable category lack accurate alignment information between input and output. They serve the aim of reading notations into electronic devices to better prepare relevant e-documents rather than auto-grading handwritten assignments. To tackle this limitation and enable the auto-grading of handwritten chemistry assignments at a fine-grained level, we propose a component-detection-based approach for recognizing off-line handwritten Organic Cyclic Compound Structure Formulas (OCCSFs). Specifically, we define the different components of OCCSFs as objects (including graphical objects and text objects) and adopt a deep learning detector to detect them. Then, for the detected text objects, we introduce an improved attention-based encoder-decoder model for text recognition. Finally, using these detection results and the geometric relationships among the detected objects, we design a holistic algorithm for interpreting the spatial structure of handwritten OCCSFs. The proposed method is evaluated on a self-collected data set of 3000 samples and achieves promising results.
Funding: Supported by the Yunnan Provincial Education Department Science Foundation of China under the grant for construction of the seventh batch of key engineering research centers in colleges and universities (Grant Project: Yunnan College and University Edge Computing Network Engineering Research Center).
Abstract: Sentiment analysis, commonly called opinion mining or emotion artificial intelligence (AI), employs biometrics, computational linguistics, natural language processing, and text analysis to systematically identify, extract, measure, and investigate affective states and subjective data. Sentiment analysis algorithms include emotion lexicons, traditional machine learning, and deep learning. In neural-network-based text sentiment analysis, the multi-layer bi-directional long short-term memory (LSTM) network is widely used, but the number of parameters in this model is very large. Hence, this paper proposes a bi-directional LSTM model with a trapezoidal structure. The design of the trapezoidal structure is derived from classic neural networks such as LeNet-5 and AlexNet. These classic models have trapezoid-like structures, which have achieved success in the field of deep learning. Using the bi-directional LSTM with a trapezoidal structure has two benefits. One is that, compared with a single-layer configuration, the multi-layer structure can better extract the high-dimensional features of the text. The other is that the trapezoidal structure reduces the number of model parameters. This paper introduces the bi-directional LSTM with a trapezoidal structure in detail and uses Stanford Sentiment Treebank 2 (STS-2) for experiments. The experimental results show that the trapezoidal structure model and the normal structure model have similar performance, while the trapezoidal structure model has 35.75% fewer parameters than the normal structure model.
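To see why shrinking the hidden size from layer to layer reduces the parameter count, the sketch below counts the weights of a stacked bi-directional LSTM. It assumes the textbook LSTM cell (four gates, each with input weights, recurrent weights, and one bias vector). The embedding size and the hidden sizes are hypothetical and are not the configuration used in the paper, so the resulting saving is illustrative and is not a reproduction of the reported 35.75%.

```python
def lstm_params(input_size, hidden_size):
    """Parameters of one unidirectional LSTM layer (textbook formulation).

    Each of the four gates has an input weight matrix, a recurrent weight
    matrix, and a single bias vector.
    """
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def stacked_bilstm_params(input_size, hidden_sizes):
    """Total parameters of a stack of bi-directional LSTM layers.

    hidden_sizes lists the per-direction hidden size of each layer. The input
    of every layer after the first is the concatenation of the forward and
    backward outputs of the layer below, i.e. twice that layer's hidden size.
    """
    total = 0
    layer_input = input_size
    for hidden in hidden_sizes:
        total += 2 * lstm_params(layer_input, hidden)  # forward + backward
        layer_input = 2 * hidden
    return total


# Hypothetical sizes: 300-dimensional word embeddings, three layers.
uniform = stacked_bilstm_params(300, [256, 256, 256])    # "normal" structure
trapezoid = stacked_bilstm_params(300, [256, 192, 128])  # shrinking layers
print(f"uniform:   {uniform:,} parameters")
print(f"trapezoid: {trapezoid:,} parameters")
print(f"saving:    {1 - trapezoid / uniform:.2%}")
```

Some libraries count slightly more parameters than this formula. PyTorch's `nn.LSTM`, for instance, stores two bias vectors per gate (`bias_ih` and `bias_hh`), which adds `4 * hidden_size` parameters per direction and layer.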
Funding: Supported by the National Natural Science Foundation of China (41431177 and 41601413), the National Basic Research Program of China (2015CB954102), the Natural Science Research Program of Jiangsu Province, China (BK20150975 and 14KJA170001), and the Outstanding Innovation Team in Colleges and Universities in Jiangsu Province, China.
Abstract: In addition to soil samples, conventional soil maps, and experienced soil surveyors, text about soils (e.g., soil survey reports) is an important potential data source for extracting soil–environment relationships. Because the words describing soil–environment relationships are often mixed with unrelated words, the first step is to extract the needed words and organize them in a structured way. This paper applies natural language processing (NLP) techniques to automatically extract and structure information on soil–environment relationships from soil survey reports. The method includes two steps: (1) construction of a knowledge frame and (2) information extraction using either a rule-based method or a statistics-based method, depending on the type of information. For uniformly written text, the rule-based approach was used. The variables of this type include slope, elevation, accumulated temperature, annual mean temperature, annual precipitation, and frost-free period. For information contained in text written in diverse styles, the statistics-based method was adopted. The variables of this type include landform and parent material. The soil species of China soil survey reports were selected as the experimental dataset. Precision (P), recall (R), and F1-measure (F1) were used to evaluate the performance of the method. For the rule-based method, the P values were 1, the R values were above 92%, and the F1 values were above 96% for all the variables involved. For the method based on conditional random fields (CRFs), the P, R, and F1 values for parent material were 84.15%, 83.13%, and 83.64%, respectively, and the values for landform were 88.33%, 76.81%, and 82.17%, respectively.
To explore the impact of text type on the performance of the CRFs-based method, CRFs models were trained and validated separately on the descriptive texts of soil types and of typical profiles. For parent material, the maximum F1 value for the descriptive text of soil types was 90.7%, while the maximum F1 value for the descriptive text of soil profiles was only 75%. For landform, the maximum F1 value for the descriptive text of soil types was 85.33%, which was similar to that for the descriptive text of soil profiles (85.71%). These results suggest that NLP techniques are effective for extracting and structuring soil–environment relationship information from a text data source.
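The F1-measure quoted above is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes F1 from the P and R values reported for the CRFs-based method. The function name is our own, and the standard F1 definition is assumed since the abstract does not spell it out.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) in percent, as given in the abstract.
reported = {
    "parent material": (84.15, 83.13, 83.64),
    "landform": (88.33, 76.81, 82.17),
}

for variable, (p, r, f1) in reported.items():
    print(f"{variable}: recomputed F1 = {f1_score(p, r):.2f}, reported = {f1}")
```

Both recomputed values round to the reported figures. The same formula shows that with P = 1, recall has to be slightly above 92% (roughly 92.3% or more) for F1 to reach 96%.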
Funding: Financially supported by the National Science and Technology Project (2008BAC39B02–11), the National Undergraduate Innovation and Entrepreneurship Training Program (201310351015), and the Zhejiang Province "Xinmiao" Project (2012R 424021).
Abstract: Acoustic communication is the most important form of communication in anuran amphibians. To understand the acoustic characteristics of male Babina adenopleura, we recorded advertisement calls during the breeding season and analyzed their acoustic parameters. Male B. adenopleura produced calls with a variable number of notes (1–5), and each note contained harmonics. Although 6% of call notes did not exhibit frequency modulation (FM), two call note FM patterns were observed: (1) upward FM and (2) upward–downward FM. With the exception of 1- and 5-note calls, the duration of successive notes decreased monotonically. With the exception of 1-note calls, the fundamental frequency of the first note was lowest and then increased; the greatest change in the fundamental frequency was always between notes 1 and 2. The dominant frequency varied between calls. For example, for the first call note, the dominant frequency occurred in some cases in the first harmonic (605.320 ± 64.533 Hz band), the second harmonic (918 ± 9 Hz band), the fourth harmonic (1712 ± 333 Hz band), the sixth harmonic (2165 ± 152 Hz band), the seventh harmonic (2269 ± 140 Hz band), the eighth harmonic (2466 ± 15 Hz band), or the ninth harmonic (2636 ± 21 Hz band). Although male B. adenopleura advertisement calls have a distinctive structure, they share characteristics with the calls of the music frog, B. daunchina.
Funding: This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R113), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: In recent research, deep learning algorithms have provided effective representation learning models for natural languages. Deep learning-based models create better data representations than classical models and are capable of automatically extracting distributed representations of texts. In this research, we introduce a new tree-based extractive text summarization model that fits the text structure representation in the knowledge base training module and also addresses memory issues that were not addressed before. The proposed model employs a tree-structured mechanism to generate the phrase and text embeddings. The proposed architecture mimics the tree configuration of the texts and provides better feature representation. It also incorporates an attention mechanism that offers an additional information source for better summary extraction. The model treats text summarization as a classification process in which it calculates the probabilities of phrase and text-summary association. The classification is divided into the recognition of multiple features, such as information entropy, significance, redundancy, and position. The model was assessed on two datasets, the Multi-Doc Composition Query (MCQ) dataset and the Dual Attention Composition (DAC) dataset. The experimental results indicate that the proposed model achieves better summarization precision than other models by a considerable margin.