Test Case Generation Evaluator for the Implementation of Test Case Generation Algorithms Based on Learning to Rank
1
Authors: Zhonghao Guo, Xinyue Xu, Xiangxian Chen. Computer Systems Science & Engineering, 2024, No. 2, pp. 479-509 (31 pages)
In software testing, the quality of test cases is crucial, but manual generation is time-consuming. Various automatic test case generation methods exist, and they must be selected carefully according to program features. Current evaluation methods compare a limited set of metrics; they neither support a larger number of metrics nor consider the relative importance of each metric to the final assessment. To address this, we propose an evaluation tool, the Test Case Generation Evaluator (TCGE), based on the learning to rank (L2R) algorithm. Unlike previous approaches, our method comprehensively evaluates algorithms by considering multiple metrics, resulting in a more reasoned assessment. The main principle of the TCGE is the formation of feature vectors that are of concern to the tester. Through training, the feature vectors are sorted to generate a list, with the order of the methods on the list determined by their effectiveness on the tested assembly. We implement TCGE using three L2R algorithms: ListNet, LambdaMART, and RFLambdaMART. Evaluation employs a dataset with features of classical test case generation algorithms and three metrics: Normalized Discounted Cumulative Gain (NDCG), Mean Average Precision (MAP), and Mean Reciprocal Rank (MRR). Results demonstrate the TCGE's superior effectiveness in evaluating test case generation algorithms compared to other methods. Among the three L2R algorithms, RFLambdaMART proves the most effective, achieving an accuracy above 96.5%, surpassing LambdaMART by 2% and ListNet by 1.5%. Consequently, the TCGE framework exhibits significant application value in the evaluation of test case generation algorithms.
Keywords: Test case generation evaluator; learning to rank; RFLambdaMART
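The abstract above names NDCG, MAP, and MRR as its evaluation metrics without defining them. The following is a minimal sketch of their standard textbook definitions for a single ranked list, not the TCGE implementation. The function names are illustrative, and the DCG uses the linear-gain form (`rel / log2(rank + 1)`); some libraries use the exponential gain `2**rel - 1` instead, which is an assumption to check against whichever tool is in use.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of relevance labels listed in ranked order."""
    rels = relevances[:k] if k is not None else relevances
    # Position i (0-based) is discounted by log2(i + 2), i.e. log2(rank + 1).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))


def ndcg(relevances, k=None):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def average_precision(binary_rels):
    """Mean of precision@i taken at each relevant position i."""
    hits, total = 0, 0.0
    for i, rel in enumerate(binary_rels, start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0


def reciprocal_rank(binary_rels):
    """1 / rank of the first relevant item, or 0.0 if there is none."""
    for i, rel in enumerate(binary_rels, start=1):
        if rel:
            return 1.0 / i
    return 0.0
```

MAP and MRR are then the means of `average_precision` and `reciprocal_rank` over all evaluated lists (queries).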
A Simple yet Effective Framework for Active Learning to Rank
2
Authors: Qingzhong Wang, Haifang Li, Haoyi Xiong, Wen Wang, Jiang Bian, Yu Lu, Shuaiqiang Wang, Zhicong Cheng, Dejing Dou, Dawei Yin. Machine Intelligence Research (EI, CSCD), 2024, No. 1, pp. 169-183 (15 pages)
While China has become the largest online market in the world, with approximately 1 billion internet users, Baidu runs the world's largest Chinese search engine, serving hundreds of millions of daily active users and responding to billions of queries per day. To handle the diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users' queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the results. Among the components used in Baidu search, learning to rank (LTR) plays a critical role, and an extremely large number of queries, together with their relevant webpages, must be labelled in a timely manner to train and update the online LTR models. To reduce the cost and time of query/webpage labelling, this work studies the problem of active learning to rank (active LTR), which selects unlabeled queries for annotation and training. Specifically, we first investigate the ranking entropy (RE) criterion, which characterizes the entropy of relevant webpages under a query as produced by a sequence of online LTR models updated by different checkpoints, using a query-by-committee (QBC) method. Then, we explore a new criterion, prediction variance (PV), which measures the variance of the prediction results for all relevant webpages under a query. Our empirical studies find that RE may favor low-frequency queries from the pool for labelling, while PV gives more priority to high-frequency queries. Finally, we combine these two complementary criteria as the sample selection strategy for active learning. Extensive experiments with comparisons to baseline algorithms show that the proposed approach can train LTR models to achieve higher discounted cumulative gain (i.e., the relative improvement DCG4 = 1.38%) with the same budgeted labelling effort.
Keywords: search; information retrieval; learning to rank; active learning; query by committee
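The abstract above describes two query-selection criteria but gives no formulas. The sketch below shows one plausible reading, under stated assumptions: `ranking_entropy` uses a generic query-by-committee vote entropy over document pairs, and `prediction_variance` is the population variance of one model's scores over a query's webpages. Both function names and both formulations are illustrative assumptions, not the paper's definitions.

```python
import math
from itertools import combinations
from statistics import pvariance


def prediction_variance(scores):
    """Population variance of one model's scores over a query's webpages."""
    return pvariance(scores)


def ranking_entropy(committee_scores):
    """Mean binary vote entropy over document pairs (assumed QBC form).

    committee_scores[m][d] is the score that committee member m
    (e.g. one model checkpoint) assigns to document d of the query.
    """
    n_docs = len(committee_scores[0])
    pairs = list(combinations(range(n_docs), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        # Fraction of committee members that rank document i above document j.
        votes = sum(1 for scores in committee_scores if scores[i] > scores[j])
        p = votes / len(committee_scores)
        if 0.0 < p < 1.0:
            total -= p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)
    return total / len(pairs)
```

Under this reading, a query on which the checkpoints disagree about pairwise order gets a high entropy and would be a candidate for labelling; a query on which they all agree gets zero.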
Event-Driven Non-Intrusive Load Monitoring Algorithm Based on Targeted Mining Multidimensional Load Characteristics
3
Authors: Gang Xie, Hongpeng Wang. China Communications (SCIE, CSCD), 2023, No. 5, pp. 40-56 (17 pages)
The advancement of non-intrusive load monitoring (NILM) has been hastened by ever-increasing requirements for reasonable electricity use and demand-side management. Although existing studies have extracted a wide variety of load features based on the transient or steady state of electrical appliances, it is still very difficult for their algorithms to model the load decomposition problem of different appliance types in a targeted manner and to jointly mine the proposed features. This paper presents an event-driven NILM solution that models different appliance types separately, mining the unique characteristics of each appliance from multi-dimensional features so that all electrical appliances can achieve the best classification performance. First, we convert the multi-classification problem into a series of binary classification problems through a pre-sort model to simplify the original problem. Then, a ConTrastive Loss K-Nearest Neighbour (CTLKNN) model with trainable weights is proposed to mine appliance load characteristics in a targeted manner. The simulation results show the effectiveness and stability of the proposed algorithm. Compared with existing algorithms, the proposed algorithm improves the identification performance for all electrical appliance types.
Keywords: non-intrusive load monitoring; learning to ranking; smart grid; electrical characteristics
Towards intelligent geospatial data discovery:a machine learning framework for search ranking
4
Authors: Yongyao Jiang, Yun Li, Chaowei Yang, Fei Hu, Edward M. Armstrong, Thomas Huang, David Moroni, Lewis J. McGibbney, Christopher J. Finch. International Journal of Digital Earth (SCIE, EI), 2018, No. 9, pp. 956-971 (16 pages)
Current search engines in most geospatial data portals tend to induce users to focus on a single data characteristic dimension (e.g. popularity or release date). This approach largely fails to take account of users' multidimensional preferences for geospatial data, and hence is likely to result in a less than optimal user experience in discovering the most applicable dataset. This study reports a machine learning framework to address the ranking challenge, the fundamental obstacle in geospatial data discovery, by (1) identifying a number of ranking features of geospatial data to represent users' multidimensional preferences by considering semantics, user behavior, spatial similarity, and static dataset metadata attributes; (2) applying a machine learning method to automatically learn a ranking function; and (3) proposing a system architecture that combines existing search-oriented open source software, a semantic knowledge base, ranking feature extraction, and a machine learning algorithm. Results show that the machine learning approach outperforms other methods in terms of both precision at K and normalized discounted cumulative gain. As an early attempt at using machine learning to improve search ranking in the geospatial domain, we expect this work to set an example for further research and open the door towards intelligent geospatial data discovery.
Keywords: learning to rank; semantic search; user behavior; search engine; big data; metadata; data relevancy; data portal
Visual Entity Linking via Multi-modal Learning (cited by: 2)
5
Authors: Qiushuo Zheng, Hao Wen, Meng Wang, Guilin Qi. Data Intelligence (EI), 2022, No. 1, pp. 1-19 (19 pages)
Existing visual scene understanding methods mainly focus on identifying coarse-grained concepts about visual objects and their relationships, largely neglecting fine-grained scene understanding. In fact, many data-driven applications on the Web (e.g., news-reading and e-shopping) require accurately recognizing much finer-grained concepts as entities and properly linking them to a knowledge graph (KG), which can take their performance to the next level. In light of this, we identify a new research task: visual entity linking for fine-grained scene understanding. To accomplish the task, we first extract features of candidate entities from different modalities, i.e., visual features, textual features, and KG features. Then, we design a deep modal-attention neural network-based learning-to-rank method that aggregates all features and maps visual objects to the entities in the KG. Extensive experimental results on the newly constructed dataset show that our proposed method is effective, improving accuracy from 66.46% to 83.16% compared with baselines.
Keywords: knowledge graph; multi-modal learning; entity linking; learning to rank; knowledge graph representation
Recommender systems based on ranking performance optimization (cited by: 1)
6
Authors: Richong ZHANG, Han BAO, Hailong SUN, Yanghao WANG, Xudong LIU. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, No. 2, pp. 270-280 (11 pages)
The rapid development of online services and information overload has inspired the fast development of recommender systems, among which collaborative filtering algorithms and model-based recommendation approaches are widely exploited. For instance, matrix factorization (MF) has demonstrated successful achievements and advantages in assisting internet users in finding information of interest. These existing models focus on predicting users' ratings of unknown items, and their performance is usually evaluated by the root mean square error (RMSE). However, achieving good performance in terms of RMSE does not always guarantee good ranking performance. Therefore, in this paper, we advocate treating recommendation as a ranking problem. Normalized discounted cumulative gain (NDCG) is chosen as the optimization target when evaluating the ranking accuracy. Specifically, we present three ranking-oriented recommender algorithms: NSMF, AdaMF, and AdaNSMF. NSMF builds an NDCG-approximated loss function for matrix factorization. AdaMF adaptively combines component MF recommenders using a boosting method. To combine the advantages of both algorithms, we propose AdaNSMF, a hybrid of NSMF and AdaMF, and show its superiority in both ranking accuracy and model generalization. In addition, we compare our proposed approaches with state-of-the-art recommendation algorithms. The comparison studies confirm the advantage of our proposed approaches.
Keywords: recommender system; matrix factorization; learning to rank
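The claim above that a good RMSE does not guarantee a good ranking can be checked with a small numeric example. The sketch below is illustrative only: the ratings and both prediction vectors are made-up values, the helper names are hypothetical, and the NDCG uses the linear-gain form, which is an assumption rather than the paper's exact definition.

```python
import math


def rmse(true, pred):
    """Root mean square error between true and predicted ratings."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true))


def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg_of_prediction(true, pred):
    """NDCG of the ranking obtained by sorting items by predicted rating."""
    order = sorted(range(len(pred)), key=lambda i: -pred[i])
    ranked_true = [true[i] for i in order]
    return dcg(ranked_true) / dcg(sorted(true, reverse=True))


# Made-up example: three items with true ratings 5, 4, 3.
true_ratings = [5.0, 4.0, 3.0]
pred_a = [4.0, 4.1, 3.0]  # close in value, but swaps the top two items
pred_b = [3.0, 2.0, 1.0]  # every rating off by 2, but the order is perfect

rmse_a, rmse_b = rmse(true_ratings, pred_a), rmse(true_ratings, pred_b)
ndcg_a = ndcg_of_prediction(true_ratings, pred_a)
ndcg_b = ndcg_of_prediction(true_ratings, pred_b)
```

Here `pred_a` has the lower RMSE yet the lower NDCG, while `pred_b` has a large RMSE and a perfect ranking, which is the gap that motivates optimizing a ranking metric directly.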
A machine learning approach to query generation in plagiarism source retrieval
7
Authors: Lei-lei KONG, Zhi-mao LU, Hao-liang QI, Zhong-yuan HAN. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2017, No. 10, pp. 1556-1572 (17 pages)
Plagiarism source retrieval is the core task of plagiarism detection. It has become standard practice in plagiarism detection to use queries extracted from suspicious documents to retrieve the plagiarism sources. Generating queries from a suspicious document is one of the most important steps in plagiarism source retrieval. Heuristic-based query generation methods are widely used in current research. Each heuristic-based method has its own advantages, and no single method statistically outperforms the others on all suspicious document segments when generating queries for source retrieval. Further improvements on heuristic methods for source retrieval rely mainly on the experience of experts, which makes it difficult to put forward new heuristic methods that overcome the shortcomings of the existing ones. This paper paves the way for a new statistical machine learning approach to select the best queries from the candidates. The statistical machine learning approach to query generation for source retrieval is formulated as a ranking framework. Specifically, it aims to achieve the optimal source retrieval performance for each suspicious document segment. The proposed method exploits learning to rank to generate queries from the candidates. To our knowledge, our work is the first to apply machine learning methods to the problem of query generation for source retrieval. To solve the essential problem of an absence of training data for learning to rank, we also build training samples for source retrieval. We rigorously evaluate various aspects of the proposed method on the publicly available PAN source retrieval corpus. The experimental results show that our proposed machine learning-based query generation method yields statistically significant improvements over the established baselines in source retrieval effectiveness.
Keywords: plagiarism detection; source retrieval; query generation; machine learning; learning to rank
Listwise approaches based on feature ranking discovery
8
Authors: Yongqing WANG, Wenji MAO, Daniel ZENG, Fen XIA. Frontiers of Computer Science (SCIE, EI, CSCD), 2012, No. 6, pp. 647-659 (13 pages)
Listwise approaches are an important class of learning to rank methods, which use automatic learning techniques to discover useful information. Most previous research on listwise approaches has focused on optimizing ranking models using weights and has used imprecisely labeled training data; optimizing ranking models using features was largely ignored, which hindered the continuous performance improvement of these approaches. To address the limitations of previous listwise work, we propose a quasi-KNN model to discover the ranking of features and employ a rank addition rule to calculate the weight of combination. On this basis, we propose three listwise algorithms: FeatureRank, BL-FeatureRank, and DiffRank. The experimental results show that our proposed algorithms can be applied to a strictly ordered ranking training set and achieve better performance than state-of-the-art listwise algorithms.
Keywords: learning to rank; listwise approach; feature's ranking discovery
NetGO 3.0:Protein Language Model Improves Large-scale Functional Annotations
9
Authors: Shaojun Wang, Ronghui You, Yunjia Liu, Yi Xiong, Shanfeng Zhu. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2023, No. 2, pp. 349-358 (10 pages)
As one of the state-of-the-art automated function prediction (AFP) methods, NetGO 2.0 integrates multi-source information to improve performance. However, it mainly uses proteins with experimentally supported functional annotations, without leveraging valuable information from the vast number of unannotated proteins. Recently, protein language models have been proposed to learn informative representations [e.g., Evolutionary Scale Modeling (ESM)-1b embedding] from protein sequences based on self-supervision. Here, we represented each protein by ESM-1b and used logistic regression (LR) to train a new model, LR-ESM, for AFP. The experimental results showed that LR-ESM achieved performance comparable to that of the best-performing component of NetGO 2.0. Therefore, by incorporating LR-ESM into NetGO 2.0, we developed NetGO 3.0 to improve the performance of AFP extensively.
Keywords: protein function prediction; web service; protein language model; learning to rank; large-scale multi-label learning
Ordinal-Class Core Vector Machine (cited by: 1)
10
Authors: 顾彬, 王建东, 李涛. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2010, No. 4, pp. 699-708 (10 pages)
Ordinal regression is one of the most important tasks of relation learning, and several techniques based on support vector machines (SVMs) have been proposed for tackling it, but the scalability of these approaches to large datasets still needs much exploration. In this paper, we extend the recently proposed Core Vector Machine (CVM) algorithm to ordinal-class data and propose a new algorithm named the Ordinal-Class Core Vector Machine (OCVM). Similar to CVM, its asymptotic time complexity is linear in the number of training samples, while its space complexity is independent of the number of training samples. We also give an analysis of OCVM in two parts: the first shows that OCVM can guarantee that the biases are unique and properly ordered under some situations; the second illustrates the approximate convergence of the solution from the viewpoints of the objective function and the KKT conditions. Experiments on several synthetic and real-world datasets demonstrate that OCVM scales well with the size of the dataset and can achieve generalization performance comparable to that of existing SVM implementations.
Keywords: support vector machine; ordinal regression; ranking learning; core vector machine; minimum enclosing ball