Abstract
This paper considers the problem of efficiently storing and querying clustering results. We propose three schemas for storing clustering results in a relational database: the full schema (f-schema), the partial schema (p-schema), and the compressed schema (c-schema). We also present a classification of the queries issued against clustering results. Finally, we empirically study the performance of these queries on the different storage schemas. To our knowledge, this is the first work to address this problem.
Funding
National High Technology Research and Development Program of China (863 Program); the IBM SUR Research Fund