Abstract: This paper presents the semantic analysis of queries written in natural language (French) and addressed to object-oriented databases. The queries studied include one or two nominal groups (NG) articulated around a verb. An NG consists of one or several keywords (an application-dependent noun or a value). Simple semantic filters are defined to identify these keywords, each of which can take one of the following semantic values: class, simple attribute, composed attribute, key value, or non-key value. Coherence rules and coherence constraints are introduced to check the validity of the co-occurrence of two consecutive nouns in a complex NG. If a query consists of a single NG, no further analysis is required. Otherwise, if a query covers two valid NGs, the semantic coherence between the verb and the two NGs attached to it must also be checked.
Funding: the Specialized Research Program Fund; the Doctoral Program of Higher Education of China (20050007023); the Natural Science Foundation of Shandong Province (Y2004G04)
Abstract: In Chinese question answering systems, questions carry more semantic relations than bare query words, so precision can be improved by expanding the query when natural language questions are used to retrieve documents. This paper proposes a new approach to query expansion based on semantics and statistics. First, an automatic relevance feedback method is used to generate a set of candidate expansion words. Then the expansion words are selected from this set according to the semantic similarity and semantic relevancy between the candidate words and the original words. Experiments show that the new approach is effective for Web retrieval and outperforms conventional expansion approaches.
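The two-stage expansion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenization, the threshold, and the toy `similarity` function are all assumptions made for illustration, and the abstract does not specify how semantic similarity or relevancy is computed.

```python
from collections import Counter

def candidate_terms(query_terms, top_docs, limit=10):
    """Stage 1 (relevance feedback): collect frequent terms from top-ranked documents."""
    counts = Counter()
    for doc in top_docs:
        for term in doc.lower().split():
            if term not in query_terms:
                counts[term] += 1
    return [term for term, _ in counts.most_common(limit)]

def expand_query(query_terms, top_docs, similarity, threshold=0.5, limit=10):
    """Stage 2: keep only candidates that are similar enough to some original term."""
    kept = []
    for cand in candidate_terms(query_terms, top_docs, limit):
        best = max((similarity(cand, q) for q in query_terms), default=0.0)
        if best >= threshold:
            kept.append(cand)
    return list(query_terms) + kept

# Hypothetical similarity: a real system would consult a thesaurus or embeddings.
def toy_similarity(a, b):
    return 1.0 if {a, b} == {"car", "automobile"} else 0.0

top_docs = ["car automobile engine", "automobile price", "car weather"]
expanded = expand_query(["car"], top_docs, toy_similarity)
```

Here `expanded` keeps the original term and adds only the candidate that passes the similarity filter; frequent but unrelated feedback terms are dropped.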
Abstract: In many database applications, ranking queries may reference both text and numeric attributes, and the ranking functions are based on semantic distances or similarities for text attributes and on numeric distances for numeric attributes. In this paper, we propose a new method for evaluating this type of ranking query over a relational database. Using statistics and training, the method builds a mechanism that combines the semantic and numeric distances and can be used to balance the effects of text attributes and numeric attributes when matching a given query against tuples in a database search. The basic idea of the method is to create an index that uses WordNet to expand the tuple words semantically for text attributes and that also incorporates the information of numeric attributes. The candidate results for a query are retrieved through the index and a simple SQL selection statement, and the top-N answers are then obtained. The results of extensive experiments indicate that the new strategy is efficient and effective.
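As a rough illustration of combining a text distance with a numeric distance into one ranking score, consider the sketch below. Every detail is an assumption: the abstract says the balance is obtained by statistics and training and that the text side relies on WordNet, whereas this sketch uses a fixed `weight` and a plain Jaccard distance over words as a stand-in.

```python
import heapq

def jaccard_distance(a, b):
    """Stand-in text distance (assumption): 1 - |A ∩ B| / |A ∪ B| over word sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)

def combined_distance(query, row, weight=0.5, price_range=100.0):
    """Weighted sum of the text distance and a normalized numeric distance."""
    text_d = jaccard_distance(query["text"], row["text"])
    num_d = min(abs(query["price"] - row["price"]) / price_range, 1.0)
    return weight * text_d + (1.0 - weight) * num_d

def top_n(query, rows, n=2, weight=0.5):
    """Return the n rows with the smallest combined distance to the query."""
    return heapq.nsmallest(n, rows, key=lambda r: combined_distance(query, r, weight))

# Hypothetical tuples with one text attribute and one numeric attribute.
rows = [
    {"id": 1, "text": "red sports car", "price": 90.0},
    {"id": 2, "text": "red family car", "price": 40.0},
    {"id": 3, "text": "blue bicycle", "price": 45.0},
]
query = {"text": "red car", "price": 50.0}
best = top_n(query, rows, n=2)
```

With `weight=0.5`, row 3 is numerically close to the query but textually unrelated, so it ranks behind both car tuples; raising or lowering `weight` shifts that balance.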
Abstract: This paper describes an approach to making inferences on Chinese information using first-order predicate logic, which can be used for semantic queries in Chinese. The predicates of the method are derived from natural language using rule-based LFT, the axiom set is generated by extracting lexical knowledge from HowNet, and the first-order predicate inferences are made through symbol connection of center words. An evaluation of the method and possible improvements are then provided. The experimental results show a higher precision rate than traditional methods achieve.
Abstract: Semantic query optimization (SQO) is a comparatively recent approach that transforms a given query into equivalent alternative queries using matching rules, and then selects an optimal query based on the costs of executing the alternatives. The key aspect of the algorithm proposed here is that previously proposed SQO techniques can be considered equally under a uniform cost model, so that optimization opportunities are not missed. At the same time, the authors use the implication closure to guarantee that no matching rule is lost. The authors implemented their algorithm for optimizing decomposed sub-queries in local databases in the Multi-Database Integrator (MDBI), a multidatabase project. The experimental results verify that the algorithm is effective for SQO.
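Abstract: To test the security of existing dummy location generation algorithms, a Multiple Query Request Attack algorithm (MQRA) is designed. To protect users' location privacy effectively, a Dummy Location Generation Algorithm against Side Information Inference Attack (DLG_SIA) is proposed. The algorithm jointly considers side information such as query probability, time distribution, location semantics, and physical dispersion to generate an effective set of dummy locations that resists probability distribution attacks, location semantic attacks, and location homogeneity attacks, so that an attacker cannot filter out the dummy locations by combining side information. On a user's first request, DLG_SIA first uses location entropy and time entropy to select locations with similar query probabilities at the current request time to form the dummy location set, and uses adjusted cosine similarity to generate locations that satisfy semantic diversity. It then uses distance entropy to ensure that the selected locations cover a larger anonymity range, and caches the best dummy location set for the current request location. Security analysis and simulation results show that MQRA can identify the user's real location in a dummy location set with high probability, and that, compared with existing dummy location generation algorithms, DLG_SIA effectively resists side information inference attacks and protects users' location privacy.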
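The role of location entropy in this kind of scheme can be shown with a small sketch. It covers only the query-probability aspect; the time entropy, semantic diversity, and distance entropy steps named in the abstract are omitted. The function names, the greedy selection rule, and the probability values are assumptions for illustration, not the DLG_SIA algorithm itself.

```python
import math

def entropy(probabilities):
    """Shannon entropy (bits) of the probabilities after normalizing them to sum to 1."""
    total = sum(probabilities)
    return -sum((p / total) * math.log2(p / total) for p in probabilities if p > 0)

def pick_dummies(real, candidates, query_prob, k):
    """Pick k - 1 dummies whose query probability is closest to the real location's."""
    ranked = sorted(candidates, key=lambda c: abs(query_prob[c] - query_prob[real]))
    return ranked[: k - 1]

# Hypothetical historical query probabilities for five map cells.
query_prob = {"A": 0.20, "B": 0.19, "C": 0.02, "D": 0.21, "E": 0.50}
dummies = pick_dummies("A", ["B", "C", "D", "E"], query_prob, k=3)
anonymity_set = ["A"] + dummies
h = entropy([query_prob[c] for c in anonymity_set])
```

The closer `h` is to its maximum of `log2(k)`, the harder it is for an attacker who knows the query probabilities to single out the real location; cells such as C or E, whose probabilities differ sharply from A's, would lower the entropy and be easier to filter out.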
Funding: Supported by the National High Technology Research and Development Program of China (863 Program 2012AA011004), the National Natural Science Foundation of China (61232002, 61202033), and the Natural Science Foundation of Hubei Province (2011CDB448)
Abstract: Many studies and semantics have been proposed for answering top-k queries on uncertain data in various applications. However, most of these semantics spend much of their time computing position probabilities. Our approach to supporting various top-k queries is based on sharing the position probability distribution (PPD). In this paper, a PPD-tree structure and several basic operations on it are proposed to support various top-k queries. In addition, we propose an approximation method to improve the efficiency of PPD generation. We also verify the effectiveness and efficiency of our approach through both theoretical analysis and experiments.
Funding: the National Natural Science Foundation of China under Grant Nos. 61972338, 62025206 and 62102351.
Abstract: Due to the widespread use of geo-positioning technologies and geo-social networks, the reverse top-k geo-social keyword query has attracted considerable attention from both industry and research communities. A reverse top-k geo-social keyword (RkGSK) query finds the users who are spatially near, textually similar, and socially relevant to a specified point of interest. RkGSK queries are useful in many real-life applications. For example, they can help the query issuer identify potential customers in marketing decisions. However, the query constraints can sometimes be too strict, making it hard to find any result for the RkGSK query. Query issuers may wonder how to modify their original queries to get a certain number of query results. In this paper, we study non-answer questions on reverse top-k geo-social keyword queries (NARGSK). Given an RkGSK query and the required number M of query results, NARGSK aims to find the refined RkGSK query having M users in its result set. To answer NARGSK efficiently, we propose two algorithms (ERQ and NRG) based on query relaxation. As this is, to the best of our knowledge, the first work to address NARGSK, ERQ is the baseline extended from the state-of-the-art method, while NRG further improves the efficiency of ERQ. Extensive experiments using real-life datasets demonstrate the efficiency of our proposed algorithms, and the performance of NRG is improved by a factor of 1–2 on average compared with ERQ.