Funding: The National Natural Science Foundation of China (No. 50674086); Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508); the Postdoctoral Scientific Program of Jiangsu Province (No. 0701045B).
Abstract: To mine production and security information from security supervising data and to ensure security and safety in production and decision-making, a clustering analysis algorithm for security supervising data in coal mines, based on a semantic description, is studied. First, a hybrid semantic and numerical description method for security supervising data in coal mines is described. Second, similarity measurement methods for semantic and numerical data are given separately, and a weight-based hybrid similarity measurement method for the semantically described security supervising data is presented. Third, taking the hybrid similarity measure as the distance criterion and borrowing from grid methodology, an improved grid-based CURE clustering algorithm is presented. Finally, simulation results on a coal-mine security supervising data set validate the efficiency of the algorithm.
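A weight-based hybrid similarity of the kind described above can be sketched as follows. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: Jaccard overlap stands in for the semantic measure, range-normalised absolute difference stands in for the numerical measure, and the record layout (`labels`, `values`), the `ranges` argument, and the default `weight` are hypothetical.

```python
def numeric_similarity(a, b, ranges):
    """Mean of 1 - |a_i - b_i| / range_i over the numeric attributes (assumed form)."""
    parts = [1.0 - abs(x - y) / r for x, y, r in zip(a, b, ranges)]
    return sum(parts) / len(parts)


def semantic_similarity(a, b):
    """Jaccard overlap of two sets of semantic labels (assumed stand-in measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty descriptions are treated as identical
    return len(a & b) / len(a | b)


def hybrid_similarity(rec_a, rec_b, ranges, weight=0.5):
    """Weighted mix: weight * semantic + (1 - weight) * numeric."""
    sem = semantic_similarity(rec_a["labels"], rec_b["labels"])
    num = numeric_similarity(rec_a["values"], rec_b["values"], ranges)
    return weight * sem + (1.0 - weight) * num


# Hypothetical records: semantic labels plus two numeric sensor readings.
rec_a = {"labels": {"gas", "high"}, "values": [0.8, 20.0]}
rec_b = {"labels": {"gas", "low"}, "values": [0.4, 25.0]}
score = hybrid_similarity(rec_a, rec_b, ranges=[1.0, 50.0], weight=0.5)
```

A clustering algorithm that expects a distance can use `1 - score`; how the paper turns the similarity into the CURE distance criterion is not stated in the abstract.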
Abstract: This paper discusses the gender of English nouns and related issues. In other Indo-European languages gender is an abstract grammatical notion, whereas in English it is a concrete semantic one. English nouns can be divided into four categories: masculine, feminine, common, and neuter. The gender of an English noun determines the choice of the pronoun used to substitute for it, and the gender of the pronoun should agree with that of its referent. However, the rule may be broken under special conditions. English has lost most word-final inflections, including the grammatical gender of nouns.
Abstract: To deal with the complex association relationships between classes in an object-oriented software system, a novel approach for identifying refactoring opportunities is proposed. The approach can detect complex and duplicated many-to-many association relationships in source code and provide guidance for further refactoring. Source code is first transformed into an abstract syntax tree, from which all data members of each class are extracted; each class is then characterized by a set of association classes corresponding to its data members. Next, classes with common associations are obtained by comparing the association class sets of different classes in an integrated analysis. Finally, subject to pre-defined thresholds, all candidate class sets for refactoring and their common association classes are saved and exported. The approach was tested on 4 projects. The results show that the precision is over 96% when the threshold is 3, and 100% when the threshold is 4. The approach also executes efficiently: the execution time for a project with more than 500 classes is less than 4 s, which indicates that it can be applied to projects of different scales to identify refactoring opportunities effectively.
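The comparison step described above can be illustrated with a minimal sketch. It assumes the association class sets have already been extracted from the syntax tree and are given as a plain dictionary, that candidates are pairs of classes, and that the threshold is a minimum count of shared association classes; the abstract does not confirm any of these details, and the class names are invented.

```python
from itertools import combinations


def common_association_candidates(associations, threshold):
    """Return (class pair, shared association classes) for every pair of
    classes that shares at least `threshold` association classes."""
    candidates = []
    for a, b in combinations(sorted(associations), 2):
        shared = associations[a] & associations[b]
        if len(shared) >= threshold:
            candidates.append(((a, b), shared))
    return candidates


# Hypothetical input: class name -> set of classes its data members refer to.
associations = {
    "Order": {"Customer", "Product", "Invoice"},
    "Cart": {"Customer", "Product", "Invoice", "Coupon"},
    "Report": {"Customer"},
}
result = common_association_candidates(associations, threshold=3)
```

With a threshold of 3, only `Cart` and `Order` are reported, together with the three association classes they share. Raising the threshold reports fewer candidates, which is in line with the higher precision the abstract reports at a threshold of 4.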
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2011AA120300, 2011AA120302) and the National Key Technology Support Program of China (No. 2013BAH66F02).
Abstract: Data aggregation from various web sources is very significant for the web data analysis domain. In addition, the recognition of coherent micro-clusters is one of the most interesting issues in the field of data aggregation. Many algorithms have been proposed to address this issue, but their shared deficiency is that they cannot recognize micro-clusters in a data stream accurately. A semantic-based coherent micro-cluster recognition algorithm for hybrid web data streams is proposed. First, an objective function is proposed to recognize coherent micro-clusters, and then the semantic-based coherent micro-cluster recognition algorithm for hybrid web data streams is presented. Fi-
Abstract: The concept of word classes (parts of speech) has always generated controversy among linguists. The earlier Prescriptive and Descriptive Schools might have set the pace for this controversy, but the present dilemma is much deeper. Learners and even teachers are sometimes in a quandary as to how to prove that a particular word belongs to a particular class. This is because a word may belong to several classes depending on context, as with the word "watch". This paper therefore tries to provide answers to the problem of word class classification by using morphological and syntactic evidence to show that English words follow a particular range of inflections, belong to strictly ordered categories, and do not change their class arbitrarily. This is in line with the natural perfect order of homogeneity in creation, which precludes one species from merging effectively with another without undergoing some fundamental changes. Other variables were also examined, and it was concluded that teachers and learners alike can rely on this sub-categorization approach as a reliable paradigm for their assumptions concerning word classes.
Abstract: Sample entropy can reflect both the change in the level of new information in a signal sequence and the amount of that new information. Using sample entropy as the feature for speech classification, the paper first extracts the sample entropy of the mixed signal, then uses the mean and variance to calculate the sample entropy of each signal, and finally applies K-means clustering for recognition. The simulation results show that the recognition rate can be increased to 89.2% with sample entropy.
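For readers unfamiliar with the feature, here is a minimal sketch of sample entropy in its common form, SampEn(m, r) = -ln(A / B), where B counts pairs of length-m templates within tolerance r (Chebyshev distance) and A counts the same for length m + 1. The abstract does not state the embedding dimension, the tolerance, or the matching rule, so `m`, `r` (an absolute tolerance here), and the `<=` comparison are assumptions; the K-means step is not shown.

```python
import math


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy -ln(A / B) of a 1-D sequence, excluding self-matches."""
    n = len(signal)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(x - y) for x, y in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no template pairs match
    return -math.log(a / b)


regular = sample_entropy([0, 1] * 5, m=2, r=0.1)          # perfectly periodic
irregular = sample_entropy([0, 0, 1, 0, 0, 1, 1], m=1, r=0)
```

A perfectly periodic sequence yields zero entropy because every match of length m extends to length m + 1, whereas the short irregular sequence yields ln(3.5). This brute-force version is quadratic in the signal length and is meant only to show the definition.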
Abstract: Wano, spoken by about 7,000 native speakers, is a Papuan language of the Trans-New Guinea Phylum, Dani-Kwerba Stock, found in the interior of Papua in the regency of Puncak Jaya. The language is closely related to Dani, Walak, and Nggem. Typologically, it is an SOV language with a complex morphological system. Four spatial dimensions are morphosyntactically coded in elevational deixis: the steepness/non-steepness distinction, the proximity/distality distinction, adverbial/attributive expressions, and the vertical/horizontal plane. This paper discusses the grammatical operation of a two-term system, ei "up" and ou "down", that serves as the basic forms for elevational deixis in Wano.
Funding: This work is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 60573139 and the National Science & Technology Pillar Program of China under Grant No. 2008BAH221303.
Abstract: We presented a novel framework for automatic behavior clustering and unsupervised anomaly detection in a large video set. The framework consisted of the following key components: 1) Drawing from natural language processing, we introduced a compact and effective behavior representation method as a stochastic sequence of spatiotemporal events, where we analyzed the global structural information of behaviors using their local action statistics. 2) The natural grouping of behavior patterns was discovered through a novel clustering algorithm. 3) A run-time accumulative anomaly measure was introduced to detect abnormal behavior, whereas normal behavior patterns were recognized when sufficient visual evidence had become available, based on an online Likelihood Ratio Test (LRT) method. This ensured robust and reliable anomaly detection and normal behavior recognition in the shortest possible time. Experimental results demonstrated the effectiveness and robustness of our approach using noisy and sparse data sets collected from a real surveillance scenario.
Abstract: This article uses questionnaires to study international students' ultimate attainment in acquiring the Chinese structural auxiliary word "De" (的). We found that whether international students can use all types of attributive "De" (的) correctly is mainly affected by two factors: (1) the frequency with which the attribute type appears and (2) the clarity of its grammatical rules. We also noticed that international students discriminate between "De" (的) and "Di" (地) better in predicate position than in object position.
Funding: Project supported by an Inha University Research Grant; Project (10031764) supported by the Strategic Technology Development Program of the Ministry of Knowledge Economy, Korea.
Abstract: An improved speech absence probability estimation was proposed using environmental noise classification for speech enhancement. A relevant noise estimation approach, known as the speech presence uncertainty tracking method, requires seeking the "a priori" probability of speech absence, which is derived by applying the microphone input signal and the noise signal based on the estimated value of the "a posteriori" signal-to-noise ratio (SNR). To overcome this problem, first, the optimal values in terms of perceived speech quality are derived for a variety of noise types. Second, the estimated optimal values are assigned according to the determined noise type, which is classified by a real-time noise classification algorithm based on the Gaussian mixture model (GMM). The proposed algorithm estimates the speech absence probability using the GMM-based noise classification algorithm to apply the optimal parameter of each noise type, unlike the conventional approach, which uses a fixed threshold and smoothing parameter. The performance of the proposed method was evaluated by objective tests, such as the perceptual evaluation of speech quality (PESQ) and a composite measure. Performance was then evaluated by a subjective test, namely mean opinion scores (MOS), under various noise environments. The proposed method shows better results than existing methods.
Abstract: The opposition between the terms carcasse (carcass), conceptualized by Auguste Perret, and ossature (frame), proposed as an alternative by Le Corbusier, gives rise to the exploration of the capital contribution of the "Dom-ino" prototype as the basic and inescapable condition for an aesthetic operation. Some issues addressed are: the importance of the question of the structure, which remains implicit in Toward an Architecture, as key to a quest for the specificity of architecture; Le Corbusier's troublesome relationship with Perret and the debates between them, which convey two different ways of understanding the potential contributions of concrete to the redefinition of architectural vocabulary; the "Dom-ino" system considered as a new structural type in the sense ascribed to this category by Viollet-le-Duc; the topic of the abri souverain (sovereign shelter) fit for all programs, which triggered typological invention; the ways in which Le Corbusier plays with Gottfried Semper's Urformen; and, finally, how this new structural type anchors Le Corbusier's radical redefinition of the elements of the discipline, the making of a new grammar.
Abstract: A virtual community for foreign language learning is a comprehensive learning environment that integrates foreign language vocabulary database construction and vocabulary retrieval, and uses virtual reality technology to construct the language environment for foreign language learning. Such a virtual community can improve the sense of language authenticity in foreign language learning and the quality of foreign language teaching. A method of building a virtual community for foreign language learning based on data mining technology is proposed. A data acquisition and feature preprocessing model for building the semantic vocabulary of foreign language learning is constructed, the linguistic environment characteristics of the semantic vocabulary data are analyzed, and the semantic ontology structure model is obtained. A fuzzy clustering method is used for vocabulary clustering and comprehensive retrieval in the virtual community, which improves the performance of vocabulary classification. An adaptive semantic information fusion method is used to realize vocabulary data mining in the virtual community, and information retrieval and access scheduling for the virtual community are realized based on the data mining results. The simulation results show that the accuracy of foreign language vocabulary retrieval is good and that the efficiency of foreign language learning is improved.
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 60496326 and No. 10671045).
Abstract: Both a general domain-independent bottom-up multi-level model and an algorithm for establishing the taxonomic relation of a Chinese ontology are proposed. The model consists of extracting domain vocabularies and establishing the taxonomic relation, taking into account characteristics unique to Chinese natural language. By establishing the semantic forests of domain vocabularies and then using an existing semantic dictionary or machine-readable dictionary (MRD), the proposed algorithm can integrate these semantic forests to establish the taxonomic relation. Experimental results show that the proposed algorithm is feasible and effective in establishing the integrated taxonomic relation among domain vocabularies and concepts.
Abstract: In the second half of the last century the problem of categories became less and less prominent in philosophical debates. This twilight of categorial discourse did not go unnoticed, and some authors offered different solutions for the revival of categorial theorizing in contemporary philosophy's repertoire. One of these authors is the American philosopher Stephen Pepper. The purpose of the present discussion is to offer yet another explanation for the decline of categorial theory, and to explore Pepper's view and its role in the transformation of categorial discourse. The main thesis which I will argue for is that traditional categories did not disappear altogether, but have been replaced, gradually, by key empirical concepts from natural science. Even if such concepts do not satisfy the traditional requirements for a categorial scheme, they are, nonetheless, fulfilling the same role as traditional categories in shaping our worldviews.
Funding: Project (60763001) supported by the National Natural Science Foundation of China; Project (2010GZS0072) supported by the Natural Science Foundation of Jiangxi Province, China; Project (GJJ12271) supported by the Science and Technology Foundation of Provincial Education Department of Jiangxi Province, China.
Abstract: The category-based statistical language model is an important method for solving the problem of sparse data, but it has two bottlenecks: 1) word clustering, since it is hard to find a suitable clustering method with good performance and low computational cost; and 2) class-based methods always lose the prediction ability needed to adapt to text in different domains. To solve these problems, a definition of word similarity based on mutual information was presented, and a definition of word set similarity was given on top of it. Experiments show that the word clustering algorithm based on similarity is better than the conventional greedy clustering method in speed and performance, and the perplexity is reduced from 283 to 218. At the same time, an absolute weighted difference method was presented and used to construct a vari-gram language model with good prediction ability. Compared with the category-based model, the perplexity of the vari-gram model is reduced from 234.65 to 219.14 on Chinese corpora and from 195.56 to 184.25 on English corpora.
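To put the perplexity figures above in perspective, the short sketch below computes perplexity from per-word probabilities using the standard definition (the exponential of the average negative log-probability) and the relative reduction for each pair of numbers quoted in the abstract. The function names and row labels are illustrative only, and the relative reductions are derived here rather than reported by the authors.

```python
import math


def perplexity(word_probs):
    """exp of the average negative log-probability assigned to each word."""
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)


def relative_reduction(before, after):
    """Fraction by which perplexity dropped, e.g. 0.10 for a 10% drop."""
    return (before - after) / before


# Perplexity pairs quoted in the abstract (before, after).
reported = [
    ("similarity-based clustering vs greedy clustering", 283.0, 218.0),
    ("vari-gram vs category-based, Chinese corpora", 234.65, 219.14),
    ("vari-gram vs category-based, English corpora", 195.56, 184.25),
]
for label, before, after in reported:
    print(f"{label}: {relative_reduction(before, after):.1%}")
```

The loop prints relative reductions of 23.0%, 6.6%, and 5.8% respectively. As a sanity check on the definition, a model that assigns probability 0.25 to every word has a perplexity of 4.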