3 articles found
A machine learning approach to query generation in plagiarism source retrieval
1
Authors: Lei-lei KONG, Zhi-mao LU, Hao-liang QI, Zhong-yuan HAN. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2017, Issue 10, pp. 1556-1572 (17 pages)
Plagiarism source retrieval is the core task of plagiarism detection. It has become the standard for plagiarism detection to use the queries extracted from suspicious documents to retrieve the plagiarism sources. Generating queries from a suspicious document is one of the most important steps in plagiarism source retrieval. Heuristic-based query generation methods are widely used in the current research. Each heuristic-based method has its own advantages, and no one statistically outperforms the others on all suspicious document segments when generating queries for source retrieval. Further improvements on heuristic methods for source retrieval rely mainly on the experience of experts. This leads to difficulties in putting forward new heuristic methods that can overcome the shortcomings of the existing ones. This paper paves the way for a new statistical machine learning approach to select the best queries from the candidates. The statistical machine learning approach to query generation for source retrieval is formulated as a ranking framework. Specifically, it aims to achieve the optimal source retrieval performance for each suspicious document segment. The proposed method exploits learning to rank to generate queries from the candidates. To our knowledge, our work is the first research to apply machine learning methods to resolve the problem of query generation for source retrieval. To solve the essential problem of an absence of training data for learning to rank, the building of training samples for source retrieval is also conducted. We rigorously evaluate various aspects of the proposed method on the publicly available PAN source retrieval corpus. The experimental results show that the proposed machine-learning-based query generation method yields statistically significant improvements over the established baselines in source retrieval effectiveness.
Keywords: Plagiarism detection; Source retrieval; Query generation; Machine learning; Learning to rank
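To make the ranking formulation in this abstract concrete, here is a minimal sketch of selecting queries by a learned score. It is not the authors' method: the two-dimensional feature vectors, the retrieval-quality labels, the perceptron-style pairwise update, and every function name are assumptions chosen for illustration, since the abstract does not specify the features or the ranking model.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def score(weights: Vector, features: Vector) -> float:
    """Linear ranking score of one candidate query."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(groups: List[List[Tuple[Vector, float]]],
                   epochs: int = 20, lr: float = 0.1) -> List[float]:
    """Learn weights from (features, retrieval_quality) pairs.

    Each group holds the candidate queries of one suspicious-document
    segment. Whenever a better candidate is not scored above a worse
    one, the weights move toward the better candidate's features.
    """
    weights = [0.0] * len(groups[0][0][0])
    for _ in range(epochs):
        for group in groups:
            for feats_a, quality_a in group:
                for feats_b, quality_b in group:
                    misordered = score(weights, feats_a) <= score(weights, feats_b)
                    if quality_a > quality_b and misordered:
                        weights = [w + lr * (a - b)
                                   for w, a, b in zip(weights, feats_a, feats_b)]
    return weights


def select_queries(weights: Vector,
                   candidates: List[Tuple[str, Vector]],
                   k: int = 1) -> List[str]:
    """Return the k candidate queries with the highest learned score."""
    ranked = sorted(candidates, key=lambda c: score(weights, c[1]), reverse=True)
    return [query for query, _ in ranked[:k]]


# Hypothetical toy data: two segments, two candidate queries each.
toy_groups = [
    [([0.9, 0.2], 1.0), ([0.1, 0.8], 0.0)],
    [([0.8, 0.3], 1.0), ([0.2, 0.9], 0.0)],
]
toy_weights = train_pairwise(toy_groups)
best = select_queries(toy_weights, [("q_high", [0.7, 0.1]), ("q_low", [0.1, 0.7])])
```

The training labels stand in for the source retrieval performance that each candidate query achieves on its segment, which is the quantity the abstract says the ranking framework optimizes.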
Predicate Oriented Pattern Analysis for Biomedical Knowledge Discovery (cited by: 2)
2
Authors: Feichen Shen, Hongfang Liu, Sunghwan Sohn, David W. Larson, Yugyung Lee. Intelligent Information Management, 2016, Issue 3, pp. 66-85 (20 pages)
In the current biomedical data movement, numerous efforts have been made to convert and normalize a large amount of traditional structured and unstructured data (e.g., EHRs, reports) to semi-structured data (e.g., RDF, OWL). With the increasing amount of semi-structured data coming into the biomedical community, data integration and knowledge discovery from heterogeneous domains have become an important research problem. At the application level, detection of related concepts among medical ontologies is an important goal of life science research. It is more crucial to figure out how different concepts are related within a single ontology or across multiple ontologies by analysing predicates in different knowledge bases. However, the world today is one of information explosion, and it is extremely difficult for biomedical researchers to find existing or potential predicates to perform linking among cross-domain concepts without any support from schema pattern analysis. Therefore, there is a need for a mechanism to do predicate-oriented pattern analysis to partition heterogeneous ontologies into smaller, closer topics and do query generation to discover cross-domain knowledge from each topic. In this paper, we present such a model, which performs predicate-oriented pattern analysis based on the close relationships among predicates and generates a similarity matrix. Based on this similarity matrix, we apply an innovative unsupervised learning algorithm to partition large data sets into smaller and closer topics and generate meaningful queries to fully discover knowledge over a set of interlinked data sources. We have implemented a prototype system named BmQGen and evaluated the proposed model with a colorectal surgical cohort from the Mayo Clinic.
Keywords: Biomedical Knowledge Discovery; Pattern Analysis; Predicate; Query generation
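The similarity-matrix-then-partition idea in this abstract can be sketched as follows. This is an illustration under stated assumptions, not BmQGen: each predicate is represented by a hypothetical set of concept types it links, similarity is plain Jaccard overlap, and the partition step is a thresholded connected-components grouping that stands in for the unsupervised algorithm the abstract does not describe.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(predicates: Dict[str, Set[str]]) -> List[List[float]]:
    """Pairwise similarity between predicates, in dict insertion order."""
    names = list(predicates)
    return [[jaccard(predicates[p], predicates[q]) for q in names] for p in names]


def partition(predicates: Dict[str, Set[str]], threshold: float = 0.5) -> List[List[str]]:
    """Group predicates whose similarity reaches the threshold, transitively."""
    names = list(predicates)
    matrix = similarity_matrix(predicates)
    seen: Set[int] = set()
    topics: List[List[str]] = []
    for start in range(len(names)):
        if start in seen:
            continue
        seen.add(start)
        stack, topic = [start], []
        while stack:
            i = stack.pop()
            topic.append(names[i])
            for j in range(len(names)):
                if j not in seen and matrix[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        topics.append(sorted(topic))
    return topics


# Hypothetical predicates and the concept types each one connects.
toy_predicates = {
    "treats": {"Drug", "Disease"},
    "prevents": {"Drug", "Disease"},
    "locatedIn": {"Organ", "Body"},
    "partOf": {"Organ", "Body", "Tissue"},
}
toy_topics = partition(toy_predicates)
```

Each resulting topic could then seed query generation over only the predicates it contains, which mirrors the abstract's goal of querying smaller, closer topics rather than the whole set of ontologies.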
Automated Service Search Model for the Social Internet of Things
3
Authors: Farhan Amin, Seong Oun Hwang. Computers, Materials & Continua (SCIE, EI), 2022, Issue 9, pp. 5871-5888 (18 pages)
The social internet of things (SIoT) is one of the emerging paradigms that was proposed to solve the problems of network service discovery, navigability, and service composition. The SIoT aims to socialize the IoT devices and shape the interconnection between them into social interaction just like human beings. In IoT, an object can offer multiple services and different objects can offer the same services with different parameters and interest factors. The proliferation of offered services led to difficulties during service customization and service filtering. This problem is known as service explosion. The selection of a suitable service that fits the requirements of applications and objects is a challenging task. To address these issues, we propose an efficient automated query-based service search model based on the local network navigability concept for the SIoT. In the proposed model, objects can use information from their friends or friends of their friends while searching for the desired services, rather than exploring a global network. We employ a centrality metric that computes the degree of importance for each object in the social IoT, which helps in selecting neighboring objects with high centrality scores. The distributed nature of our navigation model results in high scalability and short navigation times. We verified the efficacy of our model on a real-world SIoT-related dataset. The experimental results confirm the validity of our model in terms of scalability and navigability: the desired objects that provide services are determined quickly via the shortest path, which in turn improves the service search process in the SIoT.
Keywords: Social internet of things; service discovery; local navigability; object discovery; query generation model
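The local search described in this abstract (friends first, then friends of friends, preferring high-centrality neighbors) might look roughly like the sketch below. The abstract does not name its centrality metric or search procedure, so degree centrality, the two-hop breadth-first search, the toy graph, and all identifiers here are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Fraction of the other objects that each object is directly linked to."""
    n = len(graph)
    return {node: (len(friends) / (n - 1) if n > 1 else 0.0)
            for node, friends in graph.items()}


def local_service_search(graph: Dict[str, Set[str]],
                         services: Dict[str, Set[str]],
                         requester: str,
                         wanted: str,
                         max_hops: int = 2) -> Optional[List[str]]:
    """Breadth-first search limited to max_hops from the requester.

    Neighbors are expanded in descending centrality order, so among
    equally distant providers the one reached through better-connected
    objects is returned first. Returns the path, or None if no provider
    lies within the local neighborhood.
    """
    centrality = degree_centrality(graph)
    queue = deque([[requester]])
    visited = {requester}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if wanted in services.get(node, set()):
            return path
        if len(path) - 1 >= max_hops:
            continue
        for friend in sorted(graph[node], key=lambda f: (-centrality[f], f)):
            if friend not in visited:
                visited.add(friend)
                queue.append(path + [friend])
    return None


# Hypothetical friendship graph and service offers.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}
toy_services = {"d": {"temperature"}, "e": {"humidity"}}
found = local_service_search(toy_graph, toy_services, "a", "temperature")
missed = local_service_search(toy_graph, toy_services, "a", "humidity")
```

With the default two-hop limit, the search from `a` reaches `d` through `b` but never visits `e`, which shows the trade-off of exploring only the local neighborhood instead of the global network.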