Abstract
Building on earlier work on mining clustering patterns in software data, this paper proposes a data mining method that operates at the level of whole software systems. The method offers a useful reference for estimating and evaluating project cost in software engineering. First, data are collected from different types of software. Next, features are extracted from these data according to Halstead's software science and used to identify each software system. The systems are then grouped into classes. Software in the same class can be regarded as having similar cost or similar structure, which can be used to detect and predict virus features; for software in different classes, the method looks for the decisive "difference factors" that set them apart. Finally, experimental results from mining 5414 real software systems are reported. The results show that this software-level data mining method is feasible and effective.
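The abstract does not say which Halstead measures or which clustering algorithm the paper uses. As a minimal sketch of the pipeline it describes (extract Halstead features, then group software by similarity), the following assumes the standard Halstead definitions of vocabulary, length, volume, difficulty and effort, and swaps in plain k-means as a stand-in for the grouping step. The function names, the choice of log-scaled volume and effort as features, and the sample counts are all hypothetical.

```python
import math


def halstead(n1, n2, N1, N2):
    """Standard Halstead measures.

    n1, n2: number of distinct operators / operands
    N1, N2: total occurrences of operators / operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }


def features(counts):
    """Assumed feature vector: log2 of volume and effort (not from the paper)."""
    m = halstead(*counts)
    return [math.log2(m["volume"]), math.log2(m["effort"])]


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical (n1, n2, N1, N2) counts for four programs: two small, two large.
programs = [
    (5, 4, 20, 15),
    (30, 60, 900, 700),
    (6, 5, 22, 18),
    (32, 64, 950, 720),
]
labels = kmeans([features(p) for p in programs], k=2)
print(labels)
```

Programs that receive the same label play the role of "software in the same class" in the abstract; comparing the feature values of programs with different labels is one simple way to look for the factors that separate them.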
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2005, Issue 2, pp. 202-205 (4 pages)
Funding
Australian Research Council (ARC) grant
National Natural Science Foundation of China (60075016)