Abstract: Thesaurus-based query expansion is a useful technique in modern information retrieval (IR). This paper proposes and implements a query expansion method for Chinese IR that uses a decaying co-occurrence model. The model extends the traditional co-occurrence model with a decaying factor that reduces the mutual information as the distance between terms increases. Experimental results on the TREC-9 collections show that this query expansion method yields significant improvements over IR without query expansion.
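As a rough illustration of the decaying factor, the following Python sketch accumulates co-occurrence weights that shrink with term distance and uses them to rank expansion terms. The abstract does not give the model's formula, so the exponential decay, the `window` and `alpha` parameters, the use of raw decayed weights in place of mutual information, and both function names are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def decayed_cooccurrence(tokens, window=5, alpha=0.5):
    """Accumulate co-occurrence weights that shrink as term distance grows.

    Assumption: weight = exp(-alpha * (distance - 1)). The decay function
    actually used in the paper is not stated in the abstract.
    """
    weights = defaultdict(float)
    for i, left in enumerate(tokens):
        for d in range(1, window + 1):
            j = i + d
            if j >= len(tokens):
                break
            # Store each unordered pair under one canonical key.
            pair = tuple(sorted((left, tokens[j])))
            weights[pair] += math.exp(-alpha * (d - 1))
    return weights


def expand_query(query_terms, weights, top_k=2):
    """Rank non-query terms by their summed decayed weight with query terms."""
    scores = defaultdict(float)
    for (a, b), w in weights.items():
        if a in query_terms and b not in query_terms:
            scores[b] += w
        elif b in query_terms and a not in query_terms:
            scores[a] += w
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]
```

With `alpha = 0` the sketch reduces to plain windowed co-occurrence counting, which makes the effect of the decay easy to compare on the same text.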
Abstract: Cell-phone short messages in English have distinctive styles: some are literary and others non-literary. This paper illustrates the stylistic features of English short messages by applying stylistic theory and the "Model of Analyzing Textual Function". The features of the two styles are described and analyzed in terms of pronunciation, graphology, vocabulary, grammar, discourse, rhetoric, etc., and these stylistic features serve as the basis for discussing the principles and methods of translating English short messages.
Funding: This work was supported by the National Natural Science Foundation of China (No. 20173023 and No. 90203012) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20020730006).
Abstract: On the basis of information theory and statistical methods, we use mutual information, n-tuple entropy and conditional entropy, combined with biological characteristics, to analyze the long-range and short-range correlations in human Y chromosome palindromes. The magnitude of the long-range correlation, as reflected by the mutual information, is ordered P5 > P5a > P5b (P5a and P5b are the sequences obtained by replacing only the Alu repeats and all interspersed repeats, respectively, with random uncorrelated sequences in human Y chromosome palindrome 5). The magnitude of the short-range correlation, as reflected by the n-tuple entropy and the conditional entropy, is ordered P5 > P5a > P5b > random uncorrelated sequence. In other words, when the Alu repeats and then all interspersed repeats are replaced with random uncorrelated sequences, the long-range and short-range correlations decrease gradually, while the random uncorrelated sequence has no correlation. This research indicates that more repeat sequences result in stronger correlation between bases in the human Y chromosome. The analyses may help in understanding the special structures of human Y chromosome palindromes more deeply.
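The three measures named in this abstract can be sketched with their usual textbook definitions. The abstract does not state which estimators the authors used, so the plug-in frequency estimates, the use of overlapping n-tuples, and the function names below are assumptions.

```python
import math
from collections import Counter


def ntuple_entropy(seq, n):
    """Shannon entropy (bits) of the overlapping n-tuples in seq."""
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def conditional_entropy(seq, n):
    """Entropy of the next base given the previous n bases: H(n+1) - H(n)."""
    return ntuple_entropy(seq, n + 1) - ntuple_entropy(seq, n)


def mutual_information(seq, k):
    """Mutual information (bits) between bases separated by k positions."""
    pairs = Counter((seq[i], seq[i + k]) for i in range(len(seq) - k))
    total = sum(pairs.values())
    left, right = Counter(), Counter()
    for (a, b), c in pairs.items():
        left[a] += c
        right[b] += c
    # sum over pairs of p(a,b) * log2(p(a,b) / (p(a) * p(b)))
    return sum(
        c / total * math.log2(c * total / (left[a] * right[b]))
        for (a, b), c in pairs.items()
    )
```

On short sequences these plug-in estimates are biased, so a comparison such as the one in the abstract would need sequences long enough for the frequency estimates to be stable.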
Funding: The Young Teachers Scientific Research Foundation (YTSRF) of Nanjing University of Science and Technology, 2005-2006.
Abstract: A method that combines category-based and keyword-based concepts to improve information retrieval is introduced. To improve document clustering, a document similarity measure is proposed that is based on the cosine vector measure and keyword frequency in documents and also takes an ontology as input. The ontology is domain specific and includes a list of keywords organized by degree of importance to the categories of the ontology. By means of this semantic knowledge, the ontology can improve the document similarity measure and the feedback of information retrieval systems. Two approaches to evaluating the performance of this similarity measure, and a comparison with the standard cosine vector similarity measure, are also described.
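One way to picture a cosine measure that takes keyword importance into account is sketched below. The abstract does not say how the ontology enters the measure, so multiplying each term frequency by a per-keyword importance weight, the default weight of 1.0 for terms outside the ontology, and the function names are all assumptions.

```python
import math
from collections import Counter


def weighted_vector(tokens, keyword_weights):
    """Term-frequency vector with ontology keywords scaled by importance.

    Assumption: terms missing from keyword_weights keep a weight of 1.0.
    """
    counts = Counter(tokens)
    return {t: c * keyword_weights.get(t, 1.0) for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Passing an empty `keyword_weights` dict gives the standard cosine measure, so the same two functions can serve as both the baseline and the weighted variant in a comparison.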
Funding: Supported by the National Natural Science Foundation of China (60603010, 60970120).
Abstract: Tag key encapsulation mechanism (Tag-KEM)/data encapsulation mechanism (DEM) is a hybrid encryption framework proposed in 2005. Tag-KEM is the part that uses a public-key encryption (PKE) technique to encapsulate a symmetric key. In hybrid encryption, long-message PKE is undesirable because it is slow. A general method is presented for constructing Tag-KEM schemes from short-message PKEs. Chosen-ciphertext security is proved in the random oracle model. In this method, the treatment of the tag part introduces no additional ciphertext redundancy. Among all methods for constructing Tag-KEM, it is the first that requires no validity checking on the tag part, thus showing that the Tag-KEM/DEM framework is superior to the KEM+DEM one.
Abstract: Text embedded in images is an important cue for indexing and retrieving images and videos. In this paper, we present a novel method for detecting horizontally or vertically aligned text, in which a pyramid structure is used to represent an image and text features are extracted using the SUSAN edge detector. Text regions at each level of the pyramid are identified by autocorrelation analysis. New techniques are introduced to split the text regions into basic ones and merge them into text lines. Evaluated on a set of images, the method achieves very good text detection performance.
Funding: Supported by grants from the Chinese NSF (No. 30870297 to J.X.Z. and No. 30370196 to M.X.Z.) and the International Partnership Project of CAS (CXTD2005-4 to L.S. and J.X.Z.).
Abstract: Overshadowed by eye-catching vocal and visual signals, chemical communication has long been overlooked in birds. This study explored whether the volatile composition of the uropygial gland secretion (UGS) of birds is associated with information about sex, individual and species. Using dichloromethane extraction and gas chromatography-mass spectrometry (GC-MS), we analyzed the UGS volatiles of domesticated Bengalese finches (Lonchura striata, Estrildidae), also known as white-rumped munias. We characterized 16 volatile molecules from the UGS, including eight n-alkanols, five diesters, an ester, an aldehyde and a fatty acid, and quantified them in terms of GC peak area percentages (relative abundances). Among these compounds, hexadecanol and octadecanol were major components in both sexes. The former was richer in males than in females and the latter richer in females than in males, suggesting that they might be male and female pheromone candidates, respectively. The high inter-individual variation in the relative abundance of the UGS volatiles implied that these compounds might carry information about individuality. The similarity between the GC profiles of the UGS and wing feathers from the same individuals indicates that the birds might preen the secretion onto their feathers to transmit chemical cues. Additionally, by comparison with three sympatric passerine species, i.e., zebra finches Taeniopygia guttata, yellow-browed buntings Emberiza chrysophrys and rooks Corvus frugilegus, we found that the composition of C13-C18 alkanols in the UGS might code for information about species. Our study also showed that quantitative differences (degree) in the same UGS volatiles might be the key for the Bengalese finch to code for information about sex and individuality, whereas both the kind and degree of UGS constituents could be used to code for information about species [Current Zoology 55 (5): 357-365, 2009].
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 61136002), the Key Project of the Chinese Ministry of Education (Grant No. 211180), and the Shaanxi Provincial Industrial and Technological Project (Grant No. 2011K06-47).
Abstract: One crucial issue in particle filtering is the selection of the proposal distribution. A good proposal can effectively alleviate particle degeneracy and thus improve filtering accuracy. In this paper, we propose a new type of proposal distribution for the particle filter, called the R-IEKF proposal. By combining the iterated extended Kalman filter with the Rauch-Tung-Striebel optimal smoother, the new proposal integrates the latest observation into the system and approximates the true posterior distribution reasonably well, hence generating particles that are more precise and more stable against measurement noise. Simulation results indicate that the improved particle filter with the R-IEKF proposal outperforms PF-EKF and UPF in both tracking accuracy and filtering stability. Consequently, PF-RIEKF is a competitive choice in noisy measurement environments.
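Particle degeneracy, which the proposal distribution is meant to alleviate, is commonly quantified by the effective sample size of the importance weights. The sketch below shows that standard diagnostic only. It is not the R-IEKF proposal, and the abstract does not say which degeneracy measure the authors used, so the choice of diagnostic and the function name are assumptions.

```python
def effective_sample_size(weights):
    """Effective sample size 1 / sum(w_i^2) of normalized importance weights.

    Values near len(weights) mean the weights are spread evenly; values
    near 1 mean a single particle dominates (severe degeneracy).
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)
```

A proposal that tracks the true posterior more closely tends to keep this value high between resampling steps, which is one way to compare proposals on the same simulated data.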
Funding: Supported by the Funds of Heilongjiang Outstanding Young Teacher (1151G037).
Abstract: The major problem with most current approaches to information models is that individual words provide unreliable evidence about the content of texts. When the document is short, e.g. when only the abstract is available, the word-use variability problem has a substantial impact on Information Retrieval (IR) performance. To solve this problem, a new technique for short document retrieval named the Reference Document Model (RDM) is put forward in this letter. RDM obtains the statistical semantics of the query and the document by pseudo feedback from reference documents. The contributions of this model are three-fold: (1) pseudo feedback for both the query and the document; (2) building the query model and the document model from reference documents; (3) flexible indexing units, which can be any linguistic elements such as documents, paragraphs, sentences, n-grams, terms or characters. For short document retrieval, RDM achieves significant improvements over the classical probabilistic models on the ad hoc retrieval task on Text REtrieval Conference (TREC) test sets. Results also show that the shorter the document, the better RDM performs.
Funding: Supported by the National Natural Science Foundation of China under Grants No. 61173100 and No. 61173101, and the Fundamental Research Funds for the Central Universities under Grant No. DUT10RW202.
Abstract: A new joint decoding strategy that combines character-based and word-based conditional random field models is proposed. In this segmentation framework, fragments are used to generate candidate out-of-vocabulary (OOV) words. After the initial segmentation, the segmentation fragments are divided into two classes: "combination" (combining several fragments into an unknown word) and "segregation" (splitting a fragment into several words). In this way, more OOV words can be recalled. Moreover, to handle the characteristics of cross-domain segmentation, context information is used to guide Chinese Word Segmentation (CWS). Several experiments on the test data from the SIGHAN Bakeoffs 2007 and 2010 show that the method is effective: OOV recall improves, and overall segmentation performance is good.
Abstract: The paper sets out to examine the influence of e-commerce on marketing practitioners and consumers. The study found that e-commerce brings a new experience to both consumers and marketing practitioners, as the two groups pursue different goals that end in an online relationship between them. This poses many challenges for marketers, who have to adapt and modify their offline marketing strategies to suit and meet the demands of e-commerce, giving rise to the whole concept and execution of e-marketing. The benefits of online transactions, as well as trust in them given consumers' fear of insecurity, are also discussed. In all, the paper concludes that it is important for organizations engaging in e-commerce to develop proper strategies to address these issues and build consumer trust in e-commerce, helping it to adapt further to the ever-changing needs of the business world.
Funding: Project (2007CB714407) supported by the Major State Basic Research and Development Program of China; Project (2004DFA06300) supported by the Key International Collaboration Project in Science and Technology; Projects (40571107, 40701102) supported by the National Natural Science Foundation of China.
Abstract: To improve the accuracy of biophysical parameters retrieved from remotely sensed data, a new algorithm is presented that uses spatial contextual information to estimate canopy variables from high-resolution remote sensing images. The algorithm was used to invert leaf area index (LAI) from Enhanced Thematic Mapper Plus (ETM+) data, in combination with an optimization method that minimizes cost functions. The results show that the distribution of LAI is spatially consistent with the false-color composite imagery from ETM+ and that the accuracy of LAI is significantly improved over the results of conventional pixelwise retrieval methods, demonstrating that this method can be reliably used to integrate spatial contextual information for inverting LAI from high-resolution remote sensing images.
Funding: Project (51678075) supported by the National Natural Science Foundation of China; Project (2017GK2271) supported by the Hunan Provincial Science and Technology Department, China.
Abstract: Human-object interaction (HOI) detection is a new branch of visual relationship detection that plays an important role in image understanding. Because of the complexity and diversity of image content, HOI detection remains a difficult challenge. Unlike most current work on HOI detection, which relies only on the pairwise information of a human and an object, we propose a graph-based HOI detection method that models context and global structure information. First, to make better use of the relations between humans and objects, the detected humans and objects are treated as nodes of a fully connected undirected graph, and the graph is pruned to obtain an HOI graph that preserves only the edges connecting human and object nodes. Then, to obtain more robust features for human and object nodes, two different attention-based feature extraction networks are proposed, which model global and local contexts respectively. Finally, a graph attention network is introduced to pass messages iteratively between nodes in the HOI graph and detect potential HOIs. Experiments on the V-COCO and HICO-DET datasets verify the effectiveness of the proposed method and show that it is superior to many existing methods.
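The graph construction and pruning step described in this abstract can be sketched in a few lines. This covers only the edge selection, not the attention networks or message passing. Representing detections as a list of class labels, using the label `"human"` to mark person nodes, and the function name are assumptions made for illustration.

```python
from itertools import combinations


def build_hoi_graph(labels):
    """Return (full_edges, hoi_edges) for a list of detection labels.

    full_edges: every undirected pair of detected nodes.
    hoi_edges: only the pairs joining one human node and one object node.
    """
    full_edges = list(combinations(range(len(labels)), 2))
    hoi_edges = [
        (i, j)
        for i, j in full_edges
        # Keep an edge only when exactly one endpoint is a human.
        if (labels[i] == "human") != (labels[j] == "human")
    ]
    return full_edges, hoi_edges
```

For two humans and two objects the fully connected graph has six edges and the pruned HOI graph keeps four, dropping the human-human and object-object pairs.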
Funding: National Natural Science Foundation of China (No. 60903129); National High Technology Research and Development Program of China (No. 2006AA010107, No. 2006AA010108); Foundation of Fujian Province of China (No. 2008F3105).
Abstract: Identifying out-of-vocabulary words is an urgent and difficult task in Chinese word segmentation. To avoid the defects caused by offline training in the traditional method, the paper proposes an improved prediction by partial match (PPM) segmentation algorithm for Chinese words based on extracting local context information. The algorithm adds the context information of the test text to the local PPM statistical model to guide the detection of new words. It focuses on online segmentation and new word detection, achieves good results in both closed and open tests, and outperforms some well-known Chinese segmentation systems to a certain extent.
Abstract: In Chinese, dependency analysis has been shown to be a powerful approach to syntactic parsing because the order of phrases in a sentence is relatively free compared with English. Conventional dependency parsers require a number of sophisticated rules that have to be handcrafted by linguists and are too cumbersome to maintain. To solve this problem, a parser using a Support Vector Machine (SVM) is introduced. First, a new strategy for dependency analysis is proposed. Then selected feature types are used for learning and for creating the modification matrix using the SVM. Finally, the dependencies between phrases in the sentence are generated. Experiments conducted to analyze how each type of feature affects parsing accuracy showed that the model can increase the accuracy of the dependency parser by 9.2%.