
Taobao Shop Territorial Judgment Based on Immune Clonal Algorithm
Abstract Determining the territory of individual Taobao shops (hereafter "online shops") is a prerequisite for local industry and commerce administrative departments to supervise those shops effectively. Online shop pages are semi-structured text: the importance of a term feature varies greatly with its position on the page, page content differs widely from shop to shop, and the pages carry a large amount of redundant information. To address these problems, an immune clonal selection algorithm (ICSA) is used to search for an effective feature subset, which reduces the feature dimension, and to obtain the corresponding perceptron model, which serves as the classifier. The weight vector of the perceptron reflects how much the feature terms at different page positions contribute to the territory decision, so the approach handles both high-dimensional redundant features and the classification of semi-structured text. In the experiments, shop pages were crawled in real time, a subset of feature values was extracted for each shop, and the model was used to judge the shop's territory. The results were compared with those obtained without feature selection and with feature selection based on rough set theory and on a genetic algorithm. The comparison shows that the proposed method converges faster and classifies better: it judges the territory of Taobao shops with an accuracy of 95%, which basically meets the requirements of administrative departments for territorial supervision.
Authors CHENG Xindang (程新党); ZHANG Xingang (张新刚); ZHAO Xuewu (赵学武) (College of Software, Nanyang Normal University, Nanyang 473061, China; College of Computer and Information, Nanyang Normal University, Nanyang 473061, China)
Source Journal of Xinxiang University (《新乡学院学报》), 2018, No. 3, pp. 17-25 (9 pages)
Funding National Natural Science Foundation of China (61401242); Henan Province Basic and Frontier Technology Research Project (142300410396)
Keywords immune clonal selection algorithm; Taobao shop; perceptron; web page classification; feature selection
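
The combination summarized in the abstract, a clonal search over feature subsets scored by a perceptron, can be illustrated with a minimal sketch. The abstract does not give the antibody encoding, the affinity function, or the cloning and mutation schedules, so everything below is an assumption made for illustration: binary feature masks as antibodies, validation accuracy minus a small size penalty as affinity, rank-based cloning and mutation rates, and a synthetic data set in place of crawled shop pages. All function names are hypothetical.

```python
import random

def make_data(rng, n_samples, n_features, informative):
    """Synthetic stand-in for page-term features (assumption: not the paper's data).
    Only the columns listed in `informative` influence the label."""
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        X.append(row)
        y.append(1 if sum(row[j] for j in informative) > 0 else -1)
    return X, y

def train_perceptron(X, y, mask, epochs=10):
    """Classic perceptron restricted to the features where mask[j] == 1."""
    idx = [j for j, bit in enumerate(mask) if bit]
    w, b = [0.0] * len(idx), 0.0
    for _ in range(epochs):
        for row, label in zip(X, y):
            act = b + sum(wj * row[j] for wj, j in zip(w, idx))
            if label * act <= 0:  # misclassified sample: update weights and bias
                w = [wj + label * row[j] for wj, j in zip(w, idx)]
                b += label
    return idx, w, b

def accuracy(model, X, y):
    idx, w, b = model
    hits = 0
    for row, label in zip(X, y):
        act = b + sum(wj * row[j] for wj, j in zip(w, idx))
        hits += (1 if act > 0 else -1) == label
    return hits / len(y)

def affinity(mask, train, valid, penalty=0.002):
    """Assumed affinity: validation accuracy minus a small cost per selected feature."""
    if not any(mask):
        return 0.0
    model = train_perceptron(train[0], train[1], mask)
    return accuracy(model, valid[0], valid[1]) - penalty * sum(mask)

def mutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ int(rng.random() < rate) for bit in mask]

def clonal_selection(train, valid, n_features, rng, pop_size=12, n_select=4,
                     clone_factor=2, generations=10):
    def score(mask):
        return affinity(mask, train, valid)

    def random_mask():
        return [rng.randint(0, 1) for _ in range(n_features)]

    pop = [random_mask() for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        history.append(score(ranked[0]))
        next_pop = []
        for rank, parent in enumerate(ranked[:n_select]):
            n_clones = clone_factor * (n_select - rank)  # better rank: more clones
            rate = (rank + 1) / n_features               # better rank: gentler mutation
            clones = [mutate(parent, rate, rng) for _ in range(n_clones)]
            # Keep the parent unless one of its clones has higher affinity.
            next_pop.append(max(clones + [parent], key=score))
        while len(next_pop) < pop_size:                  # refill with fresh random antibodies
            next_pop.append(random_mask())
        pop = next_pop
    best = max(pop, key=score)
    history.append(score(best))
    return best, history

rng = random.Random(0)
X, y = make_data(rng, n_samples=160, n_features=20, informative=(0, 1, 2))
train, valid = (X[:100], y[:100]), (X[100:], y[100:])
best_mask, history = clonal_selection(train, valid, 20, rng)
print("selected feature indices:", [j for j, bit in enumerate(best_mask) if bit])
print("best affinity per generation:", [round(a, 3) for a in history])
```

Because each selected parent competes with its own clones, the best affinity in the population never decreases from one generation to the next. In a setting closer to the one described in the abstract, the weights of the perceptron trained on the final mask would be inspected to see which page positions contribute most to the decision.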
