EntityManager: Managing Dirty Data Based on Entity Resolution (cited by: 2)
Authors: Xue-Li Liu, Hong-Zhi Wang, Jian-Zhong Li, Hong Gao. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2017, No. 3, pp. 644-662 (19 pages)
Data quality is important in many data-driven applications, such as decision making, data analysis, and data mining. Recent studies focus on data cleaning techniques that delete or repair dirty data, which may cause information loss and introduce new inconsistencies. To avoid these problems, we propose EntityManager, a general system to manage dirty data without data cleaning. This system takes the real-world entity as the basic storage unit and retrieves query results according to the quality requirement of users. The system is able to handle all kinds of inconsistencies recognized by entity resolution. We elaborate the EntityManager system, covering its architecture, data model, and query processing techniques. To process queries efficiently, our system adopts novel indices, a similarity operator, and query optimization techniques. Finally, we verify the efficiency and effectiveness of this system and present future research challenges.
Keywords: dirty data; entity resolution; uncertain attribute; query processing; query optimization
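The abstract's central idea, storing one real-world entity per unit and answering queries according to a user's quality requirement, can be illustrated with a small sketch. This is not the EntityManager data model or API. The function names `build_entities` and `query`, the record layout, and the choice to score each candidate value by its share of the entity's records are all assumptions made for illustration. Entity resolution itself is assumed to have run already, so each record arrives tagged with an entity id.

```python
from collections import Counter, defaultdict


def build_entities(records):
    """Group (entity_id, attribute, value) records into one unit per entity.

    Conflicting values are kept rather than cleaned away. Each candidate
    value is weighted by the fraction of the entity's records that carry it
    (a hypothetical scoring rule, not taken from the paper).
    """
    counts = defaultdict(lambda: defaultdict(Counter))
    for entity_id, attr, value in records:
        counts[entity_id][attr][value] += 1

    entities = {}
    for entity_id, attrs in counts.items():
        entities[entity_id] = {
            attr: {v: c / sum(ctr.values()) for v, c in ctr.items()}
            for attr, ctr in attrs.items()
        }
    return entities


def query(entities, attr, value, min_confidence):
    """Return ids of entities whose `attr` equals `value` with enough weight."""
    return sorted(
        entity_id
        for entity_id, attrs in entities.items()
        if attrs.get(attr, {}).get(value, 0.0) >= min_confidence
    )


# Toy input: entity e1 has conflicting city values, e2 is consistent.
records = [
    ("e1", "city", "Harbin"),
    ("e1", "city", "Harbin"),
    ("e1", "city", "Harbin"),
    ("e1", "city", "Beijing"),
    ("e2", "city", "Beijing"),
]
entities = build_entities(records)

loose = query(entities, "city", "Beijing", 0.2)   # tolerant quality requirement
strict = query(entities, "city", "Beijing", 0.5)  # stricter quality requirement
```

With the tolerant threshold both entities match, because e1 still carries "Beijing" as a minority value. With the stricter threshold only e2 matches. No record is deleted or repaired in either case, which is the contrast with data cleaning that the abstract draws.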