5 articles found
1. Relation-Aware Entity Matching Using Sentence-BERT (cited by: 1)
Authors: Huchen Zhou, Wenfeng Huang, Mohan Li, Yulin Lai. Computers, Materials & Continua (SCIE, EI), 2022, Issue 4, pp. 1581-1595 (15 pages)
A key aspect of knowledge fusion is entity matching. This study investigates how to identify heterogeneous expressions of the same real-world entity. In recent years, some representative works have used deep learning methods for entity matching and have achieved good results. However, a common limitation of these methods is that they assume that different attribute columns of the same entity are independent, and feeding the model paired entity records causes repeated calculations. In fact, there are often potential relations between different attribute columns of different entities. These relations can help improve entity matching, and feature extraction can be performed on a single entity record to avoid repeated calculations. To use attribute relations to assist entity matching, this paper proposes the Relation-aware Entity Matching method, which embeds attribute relations into the original entity description to form sentences, so that entity matching is transformed into a sentence-level similarity determination task, with Sentence-BERT performing the sentence similarity calculation. We conducted experiments on structured, dirty, and textual data and compared the method with recent baselines. Experimental results show that relational embedding is helpful for entity matching on structured and dirty data. Our method achieves good results on most data sets and reduces repeated calculations.
Keywords: knowledge fusion, entity matching, Sentence-BERT, relation aware
2. Similar physical entity matching strategy for mobile edge search
Authors: Puning Zhang, Xuyuan Kang. Digital Communications and Networks (SCIE), 2020, Issue 2, pp. 203-209 (7 pages)
In recent years, a large number of intelligent sensing devices have been deployed in the physical world, which brings great difficulties to existing entity search. As the number of intelligent sensing devices increases, the accuracy of the search system in finding entities that match the user’s request decreases, and the delay of entity search increases. We use mobile edge technology to alleviate this problem by processing user requests on the edge side, and we propose a similar physical entity matching strategy for mobile edge search. First, the raw data collected by the sensor is given a lightweight representation to reduce the storage overhead of the observed data. Furthermore, a physical entity matching degree estimation method is proposed, in which the similarity between each sensor in the network and a given sensor is estimated, and the matching search for the user request is performed according to that similarity. Simulation results show that the proposed method can effectively reduce the data storage overhead and improve the precision of the sensor search system.
Keywords: mobile edge computing, Internet of Things search, entity matching, similarity calculation
3. MapReduce-based entity matching with multiple blocking functions (cited by: 1)
Authors: Cheqing JIN, Jie CHEN, Huiping LIU. Frontiers of Computer Science (SCIE, EI, CSCD), 2017, Issue 5, pp. 895-911 (17 pages)
Entity matching, which aims at finding records that belong to the same real-world object, has been studied for decades. To avoid verifying every pair of records in a massive data set, a common method, known as the blocking-based method, selects a small proportion of record pairs for verification at a far lower cost than O(n^2), where n is the size of the data set. Furthermore, executing multiple blocking functions independently is critical, since many more matching records can be found in this way, so the quality of the query result can be improved significantly. It is popular to use the MapReduce (MR) framework to improve the performance and scalability of complicated queries by running many map (or reduce) tasks in parallel. However, entity matching on the MapReduce framework is non-trivial due to two inevitable challenges: load balancing and pair deduplication. In this paper, we propose a novel solution, called MrEin, to handle these challenges with the support of multiple blocking functions. Although existing work can deal with load balancing and pair deduplication separately, it cannot deal with both challenges at the same time. Theoretical analysis and experimental results on real and synthetic data sets illustrate the high effectiveness and efficiency of our proposed solution.
Keywords: entity matching, MapReduce, load balancing, pair deduplication
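The blocking idea summarized in this abstract can be illustrated with a minimal single-machine sketch. This is not the MrEin algorithm and does not use MapReduce; the function name `candidate_pairs`, the toy records, and the two blocking functions are all hypothetical. It only shows how several blocking functions generate candidate pairs and why a pair produced by more than one function has to be deduplicated.

```python
from itertools import combinations


def candidate_pairs(records, blocking_functions):
    """Return record-id pairs that share a block under at least one blocking function."""
    pairs = set()  # a set keeps each pair once, even if several functions emit it
    for block_key in blocking_functions:
        blocks = {}
        for rec_id, rec in records.items():
            blocks.setdefault(block_key(rec), []).append(rec_id)
        for ids in blocks.values():
            # Only records inside the same block are compared with each other.
            for a, b in combinations(sorted(ids), 2):
                pairs.add((a, b))
    return pairs


def by_zip(rec):
    # Hypothetical blocking function: group by postal code.
    return rec["zip"]


def by_name_prefix(rec):
    # Hypothetical blocking function: group by the first two letters of the name.
    return rec["name"][:2]


# Hypothetical toy data, not taken from the paper.
records = {
    1: {"name": "John Smith", "zip": "10001"},
    2: {"name": "Jon Smith", "zip": "10001"},
    3: {"name": "Jane Doe", "zip": "94105"},
    4: {"name": "Jane Do", "zip": "10001"},
}

pairs = candidate_pairs(records, [by_zip, by_name_prefix])
```

In this toy data, both blocking functions place records 1 and 2 in the same block, so the pair is generated twice and stored once; the pair of records 1 and 3 is never generated, so fewer than all six possible pairs reach verification.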
4. Mixed Hierarchical Networks for Deep Entity Matching
Authors: Chen-Chen Sun, De-Rong Shen. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, Issue 4, pp. 822-838 (17 pages)
Entity matching is a fundamental problem of data integration. It groups records according to their underlying real-world entities. There is a growing trend of entity matching via deep learning techniques. We design mixed hierarchical deep neural networks (MHN) for entity matching, exploiting semantics from different abstraction levels in the internal hierarchy of a record. A family of attention mechanisms is used at different stages of entity matching: self-attention focuses on internal dependency, inter-attention targets alignments, and multi-perspective weight attention is devoted to importance discrimination. In particular, hybrid soft token alignment is proposed to address corrupted data. Attribute order is considered for the first time in deep entity matching. Then, to reduce the use of labeled training data, we propose an adversarial domain adaption approach (DA-MHN) that transfers matching knowledge between different entity matching tasks by maximizing classifier discrepancy. Finally, we conduct comprehensive experimental evaluations on 10 datasets (seven for MHN and three for DA-MHN), which illustrate the superiority of our two proposed approaches. MHN clearly outperforms previous studies in accuracy, and each component of MHN is also tested. DA-MHN greatly surpasses existing studies in transferability.
Keywords: entity matching, attention mechanism, mixed hierarchical neural network (MHN), domain adaption, data integration
5. Crowd-Guided Entity Matching with Consolidated Textual Data
Authors: Zhi-Xu Li, Qiang Yang, An Liu, Guan-Feng Liu, Jia Zhu, Jia-Jie Xu, Kai Zheng, Min Zhang. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2017, Issue 5, pp. 858-876 (19 pages)
Entity matching (EM) identifies records referring to the same entity within or across databases. Existing methods using structured attribute values (such as digital, date, or short string values) may fail when the structured information is not enough to reflect the matching relationships between records. Nowadays more and more databases have an unstructured textual attribute containing extra consolidated textual information (CText) about the record, but little work has been done on using CText for EM. Conventional string similarity metrics such as edit distance or bag-of-words are unsuitable for measuring the similarity between CText, since each piece of CText contains hundreds or thousands of words, while existing topic models also cannot work well, since there are no obvious gaps between topics in CText. In this paper, we propose a novel cooccurrence-based topic model to identify various sub-topics from each piece of CText, and then measure the similarity between CText on the multiple sub-topic dimensions. To avoid ignoring hidden but important sub-topics, we let the crowd help us decide the weights of different sub-topics in EM. Our empirical study on two real-world datasets, based on the Amazon Mechanical Turk crowdsourcing platform, shows that our method outperforms state-of-the-art EM methods and text understanding models.
Keywords: entity matching, consolidated textual data, crowdsourcing