1. D-Ocean: an unstructured data management system for data ocean environment (cited by: 2)
Authors: Yueting ZHUANG, Yaoguang WANG, Jian SHAO, Ling CHEN, Weiming LU, Jianling SUN, Baogang WEI, Jiangqin WU. Frontiers of Computer Science (SCIE, EI, CSCD), 2016, Issue 2, pp. 353-369 (17 pages).
Together with the big data movement, many organizations collect their own big data and build distinctive applications. To provide smart services on top of big data, massive and varied data should be well linked and organized to form a Data Ocean, which emphasizes deep exploration of the relationships among unstructured data to support smart services. Currently, almost all of these applications have to deal with unstructured data by integrating various analysis and search techniques on top of massive storage and processing infrastructure at the application level, which greatly increases the difficulty and cost of application development. This paper presents D-Ocean, an unstructured data management system for the data ocean environment. D-Ocean has an open and scalable architecture, which consists of a core platform, pluggable components, and auxiliary tools. It exploits a unified storage framework to store data in different kinds of data stores, integrates batch and incremental processing mechanisms to process unstructured data, and provides a combined search engine to conduct compound queries. Furthermore, a process modeling approach called RAISE is proposed to support the whole process of Repository, Analysis, Index, Search, and Environment modeling, which can greatly simplify application development. Experiments and production use cases demonstrate the efficiency and usability of D-Ocean.
Keywords: unstructured data, storage, analysis, index, search, RAISE process modeling
2. Combined classifier for cross-project defect prediction: an extended empirical study (cited by: 2)
Authors: Yun ZHANG, David LO, Xin XIA, Jianling SUN. Frontiers of Computer Science (SCIE, EI, CSCD), 2018, Issue 2, pp. 280-296 (17 pages).
To help developers allocate their testing and debugging efforts effectively, many software defect prediction techniques have been proposed in the literature. These techniques can be used to predict classes that are more likely to be buggy based on the past history of classes, methods, or certain other code elements. These techniques are effective provided that a sufficient amount of data is available to train a prediction model. However, sufficient training data are rarely available for new software projects. To resolve this problem, cross-project defect prediction, which transfers a prediction model trained using data from one project to another, was proposed and is regarded as a new challenge in the area of defect prediction. Thus far, only a few cross-project defect prediction techniques have been proposed. To advance the state of the art, in this study, we investigated seven composite algorithms that integrate multiple machine learning classifiers to improve cross-project defect prediction. To evaluate the performance of the composite algorithms, we performed experiments on 10 open-source software systems from the PROMISE repository, which contain a total of 5,305 instances labeled as defective or clean. We compared the composite algorithms with the combined defect predictor in which logistic regression is used as the meta classification algorithm (CODEPLogistic), which is the most recent cross-project defect prediction algorithm, in terms of two standard evaluation metrics: cost effectiveness and F-measure. Our experimental results show that several algorithms outperform CODEPLogistic: Maximum voting shows the best performance in terms of F-measure, and its average F-measure is superior to that of CODEPLogistic by 36.88%. Bootstrap aggregation (Bagging J48) shows the best performance in terms of cost effectiveness, and its average cost effectiveness is superior to that of CODEPLogistic by 15.34%.
Keywords: defect prediction, cross-project, classifier combination
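
The abstract names maximum voting and F-measure but does not spell out either computation. As a rough illustration only, the sketch below assumes that maximum voting takes, for each instance, the highest defect probability reported by any base classifier and labels the instance defective when that value reaches a threshold. The function names, the 0.5 threshold, and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def max_vote(scores_per_classifier: Sequence[Sequence[float]],
             threshold: float = 0.5) -> List[int]:
    """Combine classifiers by keeping the highest defect score per instance.

    scores_per_classifier[i][j] is the (assumed) defect probability that
    classifier i assigns to instance j. Returns 1 (defective) or 0 (clean).
    """
    combined = [max(instance_scores)
                for instance_scores in zip(*scores_per_classifier)]
    return [1 if score >= threshold else 0 for score in combined]


def f_measure(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the defective class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: two hypothetical classifiers scoring three instances.
scores = [
    [0.2, 0.7, 0.1],
    [0.6, 0.4, 0.3],
]
labels = [1, 0, 0]
predictions = max_vote(scores)
score = f_measure(labels, predictions)
```

Taking the maximum makes the ensemble flag an instance as soon as any single classifier is confident, which tends to raise recall at the cost of precision; whether this matches the paper's exact rule is an assumption.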