Feature Selection for Cross-Domain Sentiment Classification (面向跨领域情感分类的特征选择方法)

Cited by: 3

Abstract: Labeled data are often difficult to obtain, which makes cross-domain adaptation an effective approach. However, sentiment classification is strongly domain-dependent: a feature space built in the source domain with traditional feature selection methods does not capture what the two domains have in common and transfers poorly to the target domain. To address this, a feature selection method for cross-domain sentiment classification, Log-Likelihood Ratio-Term Frequency (LLRTF), is proposed. The log-likelihood ratio (LLR) of each feature is computed in the source domain to select discriminative features. The term-frequency statistics of both domains are then compared to select, among those features, the ones that carry more weight in the target domain. The common feature space built in this way reduces the difference in data distribution between the domains. Experimental results show that LLRTF outperforms the baseline algorithms.
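The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact LLRTF formula, so the LLR here is the standard G² statistic on a 2x2 term/class table, and the way it is combined with term frequency (multiplying by the term's relative frequency in the target domain, and keeping only terms seen in both domains) is an assumption. The names `llr` and `select_features` are hypothetical.

```python
import math
from collections import Counter


def llr(k11, k12, k21, k22):
    """G^2 log-likelihood ratio for a 2x2 contingency table of counts."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    obs = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = obs[i][j]
            if o > 0:  # empty cells contribute nothing
                expected = rows[i] * cols[j] / n
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def select_features(src_docs, src_labels, tgt_docs, k):
    """Rank terms by source-domain LLR weighted by target-domain frequency.

    src_docs / tgt_docs: lists of token lists.
    src_labels: 1 for positive, 0 for negative (source domain only).
    """
    n_pos = sum(src_labels)
    n_neg = len(src_labels) - n_pos

    # Document frequency of each term per sentiment class (source domain).
    df_pos, df_neg = Counter(), Counter()
    for doc, label in zip(src_docs, src_labels):
        (df_pos if label == 1 else df_neg).update(set(doc))

    # Raw term frequency in the unlabeled target domain.
    tf_tgt = Counter(tok for doc in tgt_docs for tok in doc)
    n_tgt = sum(tf_tgt.values())

    scores = {}
    for term in set(df_pos) | set(df_neg):
        if term not in tf_tgt:
            continue  # assumption: keep only terms shared by both domains
        a, b = df_pos[term], df_neg[term]
        discriminative = llr(a, b, n_pos - a, n_neg - b)
        # Assumed combination: scale LLR by relative target-domain frequency.
        scores[term] = discriminative * tf_tgt[term] / n_tgt

    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Under this assumed weighting, a term that separates the classes well in the source domain but never appears in the target domain is dropped, which mirrors the abstract's goal of a feature space shared by both domains.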

Authors: Zhang Yuhong (张玉红), Zhou Quan (周全), Hu Xuegang (胡学钢)

Affiliation: School of Computer and Information, Hefei University of Technology (合肥工业大学计算机与信息学院), Hefei

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, Issue 11, pp. 1068-1072 (5 pages)

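Funding: Supported by the National Natural Science Foundation of China (No. 61273292, 61273297), the National 863 Program of China (No. 2012AA011005), and the Natural Science Foundation of Anhui Province (No. 1208085QF122)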

Keywords: Feature Selection, Cross-Domain, Sentiment Classification

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
