12 articles found
Sentiment Classification Based on Piecewise Pooling Convolutional Neural Network (Cited by: 2)
1
Authors: Yuhong Zhang, Qinqin Wang, Yuling Li, Xindong Wu. Computers, Materials & Continua (SCIE, EI), 2018, No. 8, pp. 285-297 (13 pages)
Recently, the effectiveness of neural networks, especially convolutional neural networks, has been validated in the field of natural language processing, in which sentiment classification for online reviews is an important and challenging task. Existing convolutional neural networks extract important features of sentences without local features or the feature sequence. Thus, these models do not perform well, especially for transition sentences. To this end, we propose a Piecewise Pooling Convolutional Neural Network (PPCNN) for sentiment classification. Firstly, with a sentence represented by word vectors, a convolution operation is introduced to obtain the convolution feature map vectors. Secondly, these vectors are segmented according to the positions of transition words in sentences. Thirdly, the most significant feature of each local segment is extracted using a max pooling mechanism, so that different aspects of features can be extracted. Specifically, the relative sequence of these features is preserved. Finally, after the features are processed by the dropout algorithm, a softmax classifier is trained for sentiment classification. Experimental results show that the proposed method PPCNN is effective and superior to other baseline methods, especially for datasets with transition sentences.
Keywords: sentiment classification; convolutional neural network; piecewise pooling; feature extract
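The segment-wise pooling step described in the PPCNN abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `piecewise_max_pool` and the `split_points` parameter are hypothetical, and the sketch assumes that transition-word positions have already been mapped to indices of a single filter's feature map.

```python
from typing import List, Sequence


def piecewise_max_pool(feature_map: Sequence[float],
                       split_points: Sequence[int]) -> List[float]:
    """Max-pool each segment of one convolution feature map, keeping segment order.

    split_points are hypothetical indices (in feature-map coordinates) at which
    a transition word such as "but" starts a new segment.
    """
    if not feature_map:
        raise ValueError("feature_map must not be empty")
    # Keep only cut positions strictly inside the vector, so no segment is empty.
    cuts = sorted({p for p in split_points if 0 < p < len(feature_map)})
    bounds = [0] + cuts + [len(feature_map)]
    # One maximum per segment, emitted left to right so relative order is preserved.
    return [max(feature_map[start:end]) for start, end in zip(bounds, bounds[1:])]


# Example: one filter's responses over a sentence with a transition word at index 3.
pooled = piecewise_max_pool([0.1, 0.9, 0.3, 0.2, 0.7, 0.4], [3])
```

Here `pooled` holds one value per segment (`[0.9, 0.7]`), whereas ordinary global max pooling would keep only the single value 0.9 and lose what follows the transition word.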
W-Index: A Weighted Index for Evaluating Research Impact
2
Author: Xindong Wu. Open Journal of Applied Sciences, 2021, No. 1, pp. 149-156 (8 pages)
Academic evaluations such as tenure/promotion applications and society fellowship nominations rely heavily on bibliometric measures of each candidate's research impact, including their research citations. This article first reviews existing evaluation criteria such as the h-index and q-most-citations, and then proposes a weighted w-index which minimizes shortcomings in existing single-number measures. The w-index consists of three factors: the 3 most cited first-author publications, the 3 most cited publications as the corresponding/last author, and 3 additional most cited publications as a co-author. It does not allow double counting of these publications.
Keywords: Research Impact; Bibliometric Measures; Citations; Publications; Author Rank
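The selection rule in the w-index abstract (three groups of three publications, with no double counting) can be sketched as follows. This is an illustration under stated assumptions, not the article's definition: the record fields, the function names, and the rule that the third group draws from all remaining publications are hypothetical, and the abstract does not give the weights, so `weighted_citation_total` takes them as a placeholder parameter.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

Publication = Dict[str, object]  # hypothetical record: id, citations, role flags


def select_w_index_groups(pubs: Sequence[Publication], k: int = 3
                          ) -> Tuple[List[Publication], List[Publication], List[Publication]]:
    """Pick the k most cited publications per role without reusing any publication."""
    used = set()

    def take(candidates: Iterable[Publication]) -> List[Publication]:
        # Rank the not-yet-used candidates by citation count, highest first.
        ranked = sorted((p for p in candidates if p["id"] not in used),
                        key=lambda p: p["citations"], reverse=True)
        chosen = ranked[:k]
        used.update(p["id"] for p in chosen)
        return chosen

    first = take(p for p in pubs if p["first_author"])
    last = take(p for p in pubs if p["corresponding_or_last"])
    # Assumption: any remaining publication qualifies for the co-author group.
    other = take(pubs)
    return first, last, other


def weighted_citation_total(groups, weights=(1.0, 1.0, 1.0)) -> float:
    """Hypothetical aggregation; the real w-index weights are not in the abstract."""
    return sum(w * sum(p["citations"] for p in g) for w, g in zip(weights, groups))
```

Processing the groups in a fixed order and tracking used identifiers is what prevents a paper that is both first-author and corresponding-author from being counted twice.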
Feature Selection: Algorithms and Challenges
3
Authors: Xindong Wu, Yanglan Gan, Hao Wang, Xuegang Hu. 南昌工程学院学报 (CAS), 2006, No. 2, pp. 28-34 (7 pages)
Feature selection is an active area in data mining research and development. It consists of efforts and contributions from a wide variety of communities, including statistics, machine learning, and pattern recognition. The diversity, on one hand, equips us with many methods and tools. On the other hand, the profusion of options causes confusion. This paper reviews various feature selection methods and identifies research challenges that are at the forefront of this exciting area.
Keywords: feature selection; data mining research; algorithms; informative attributes of dataset; challenge
Joint user profiling with hierarchical attention networks (Cited by: 1)
4
Authors: Xiaojian LIU, Yi ZHU, Xindong Wu. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 3, pp. 133-143 (11 pages)
User profiling by inferring user personality traits, such as age and gender, plays an increasingly important role in many real-world applications. Most existing methods for user profiling either use only one type of data or ignore the noisy information in the data. Moreover, they usually consider this problem from only one perspective. In this paper, we propose a joint user profiling model with hierarchical attention networks (JUHA) to learn informative user representations for user profiling. Our JUHA method performs user profiling based on both inner-user and inter-user features. We explore inner-user features from user behaviors (e.g., purchased items and posted blogs), and inter-user features from a user-user graph (where similar users could be connected to each other). JUHA learns basic sentence and bag representations from multiple separate sources of data (user behaviors) as the first round of data preparation. In this module, convolutional neural networks (CNNs) are introduced to capture word and sentence features of age and gender, while the self-attention mechanism is exploited to weaken the noisy data. Following this, we build another bag which contains a user-user graph. Inter-user features are learned from this bag using information propagated between linked users in the graph. To acquire more robust data, inter-user features and other inner-user bag representations are joined into each sentence in the current bag to learn the final bag representation. Subsequently, all of the bag representations are integrated to learn a comprehensive user representation by the self-attention mechanism. Our experimental results demonstrate that our approach outperforms several state-of-the-art methods and improves prediction performance.
Keywords: user profiling; hierarchical attention; joint learning; inner-user feature; inter-user feature
Representation learning via an integrated autoencoder for unsupervised domain adaptation (Cited by: 1)
5
Authors: Yi ZHU, Xindong Wu, Jipeng QIANG, Yunhao YUAN, Yun LI. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 5, pp. 75-87 (13 pages)
The purpose of unsupervised domain adaptation is to use the knowledge of the source domain, whose data distribution is different from that of the target domain, to promote the learning task in the target domain. The key bottleneck in unsupervised domain adaptation is how to obtain higher-level and more abstract feature representations between source and target domains which can bridge the chasm of domain discrepancy. Recently, deep learning methods based on autoencoders have achieved sound performance in representation learning, and many dual or serial autoencoder-based methods take different characteristics of data into consideration for improving the effectiveness of unsupervised domain adaptation. However, most existing autoencoder methods just serially connect the features generated by different autoencoders, which poses challenges for discriminative representation learning and fails to find the real cross-domain features. To address this problem, we propose a novel representation learning method based on an integrated autoencoder for unsupervised domain adaptation, called IAUDA. To capture the inter- and inner-domain features of the raw data, two different autoencoders, namely the marginalized autoencoder with maximum mean discrepancy (mAE) and the convolutional autoencoder (CAE), are proposed to learn different feature representations. After higher-level features are obtained by these two different autoencoders, a sparse autoencoder is introduced to compact these inter- and inner-domain representations. In addition, a whitening layer is embedded to process features before the mAE, reducing redundant features inside a local area. Experimental results demonstrate the effectiveness of our proposed method compared with several state-of-the-art baseline methods.
Keywords: unsupervised domain adaptation; representation learning; marginalized autoencoder; convolutional autoencoder; sparse autoencoder
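The maximum mean discrepancy (MMD) mentioned in the IAUDA abstract measures how far apart two feature distributions are. The sketch below is not the paper's mAE: it assumes a linear kernel, for which the squared MMD estimate reduces to the squared Euclidean distance between the source and target feature means. The abstract does not say which kernel the authors use, and the function names are hypothetical.

```python
from typing import List, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def linear_mmd_squared(source: Sequence[Sequence[float]],
                       target: Sequence[Sequence[float]]) -> float:
    """Squared MMD under a linear kernel: ||mean(source) - mean(target)||^2."""
    if not source or not target:
        raise ValueError("both domains need at least one sample")
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Example: two small domains whose feature means differ by (1, 2).
gap = linear_mmd_squared([[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [3.0, 5.0]])
```

A value of zero means the two domains have identical feature means under this kernel; a domain adaptation objective would typically add such a term to the reconstruction loss so that the learned representation shrinks it.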
Unsupervised statistical text simplification using pre-trained language modeling for initialization
6
Authors: Jipeng QIANG, Feng ZHANG, Yun LI, Yunhao YUAN, Yi ZHU, Xindong Wu. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 1, pp. 81-90 (10 pages)
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recently, an unsupervised statistical text simplification method based on a phrase-based machine translation system (UnsupPBMT) achieved good performance; it initializes the phrase tables using similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase table in UnsupPBMT contains a lot of dissimilar words. In this paper, we propose an unsupervised statistical text simplification method that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms the state-of-the-art unsupervised text simplification methods on three benchmarks, and even outperforms some supervised baselines.
Keywords: text simplification; pre-trained language modeling; BERT; word embeddings
HUSS: A Heuristic Method for Understanding the Semantic Structure of Spreadsheets
7
Authors: Xindong Wu, Hao Chen, Chenyang Bu, Shengwei Ji, Zan Zhang, Victor S. Sheng. Data Intelligence (EI), 2023, No. 3, pp. 537-559 (23 pages)
Spreadsheets contain a lot of valuable data and have many practical applications. The key technology behind these applications is making machines understand the semantic structure of spreadsheets, e.g., identifying cell function types and discovering relationships between cell pairs. Most existing methods for understanding the semantic structure of spreadsheets do not make use of the semantic information of cells. A few studies do, but they ignore the layout structure information of spreadsheets, which affects the performance of cell function classification and the discovery of different relationship types of cell pairs. In this paper, we propose a Heuristic algorithm for Understanding the Semantic Structure of spreadsheets (HUSS). Specifically, to improve cell function classification, we propose an error correction mechanism (ECM) based on an existing cell function classification model [11] and the layout features of spreadsheets. To improve table structure analysis, we propose five types of heuristic rules to extract four different types of cell pairs, based on the cell style and spatial location information. Our experimental results on five real-world datasets demonstrate that HUSS can effectively understand the semantic structure of spreadsheets and outperforms corresponding baselines.
Keywords: Spreadsheet semantic structure; Information extraction; Heuristics; Cell function analysis; Table structure analysis
Web News Extraction via Tag Path Feature Fusion Using DS Theory (Cited by: 4)
8
Authors: Gong-Qing Wu, Lei Li, Xindong Wu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, No. 4, pp. 661-672 (12 pages)
Contents, layout styles, and parse structures of web news pages differ greatly from one page to another. In addition, the layout style and the parse structure of a web news page may change from time to time. For these reasons, designing features with excellent extraction performance for massive and heterogeneous web news pages is a challenging issue. Our extensive case studies indicate that there is potential relevancy between web content layouts and their tag paths. Inspired by this observation, we design a series of tag path extraction features to extract web news. Because each feature has its own strength, we fuse all those features with the DS (Dempster-Shafer) evidence theory, and then design a content extraction method, CEDS. Experimental results on both CleanEval datasets and web news pages selected randomly from well-known websites show that the F1-score with CEDS is 8.08% and 3.08% higher than that of the existing popular content extraction methods CETR and CEPR-TPR, respectively.
Keywords: content extraction; web news; tag path extraction feature; Dempster-Shafer (DS) theory
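The fusion step named in the CEDS abstract relies on Dempster's rule of combination. The sketch below shows the standard rule for two mass functions. It is not the CEDS implementation: the two-hypothesis frame (`content`, `noise`) and the example masses are assumptions made for illustration, and the abstract does not describe how each tag path feature is converted into masses.

```python
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine_dempster(m1: Mass, m2: Mass) -> Mass:
    """Fuse two mass functions over the same frame with Dempster's rule."""
    combined: Mass = {}
    conflict = 0.0
    for a, weight_a in m1.items():
        for b, weight_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + weight_a * weight_b
            else:
                # Mass assigned to contradictory hypotheses is tracked as conflict.
                conflict += weight_a * weight_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the remaining masses sum to one.
    return {subset: value / (1.0 - conflict) for subset, value in combined.items()}


# Hypothetical evidence from two features about one text node.
CONTENT = frozenset({"content"})
NOISE = frozenset({"noise"})
EITHER = CONTENT | NOISE  # mass left uncommitted by a feature
feature_1 = {CONTENT: 0.6, NOISE: 0.1, EITHER: 0.3}
feature_2 = {CONTENT: 0.5, NOISE: 0.2, EITHER: 0.3}
fused = combine_dempster(feature_1, feature_2)
```

Because the rule is associative and commutative, more than two features can be fused by applying `combine_dempster` repeatedly.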
A survey on online feature selection with streaming features (Cited by: 4)
9
Authors: Xuegang HU, Peng ZHOU, Peipei LI, Jing WANG, Xindong Wu. Frontiers of Computer Science (SCIE, EI, CSCD), 2018, No. 3, pp. 479-493 (15 pages)
In the era of big data, the dimensionality of data is increasing dramatically in many domains. To deal with high dimensionality, online feature selection becomes critical in big data mining. Recently, online selection of dynamic features has received much attention. In situations where features arrive sequentially over time, we need to perform online feature selection upon feature arrivals. Meanwhile, considering grouped features, it is necessary to deal with features arriving by groups. To handle these challenges, some state-of-the-art methods for online feature selection have been proposed. In this paper, we first give a brief review of traditional feature selection approaches. Then we discuss specific problems of online feature selection with feature streams in detail. A comprehensive review of existing online feature selection methods is presented by comparing them with each other. Finally, we discuss several open issues in online feature selection.
Keywords: big data; feature selection; online feature selection; feature stream
A multiscale transform denoising method of the bionic polarized light compass for improving the unmanned aerial vehicle navigation accuracy (Cited by: 1)
10
Authors: Donghua ZHAO, Jun TANG, Xindong Wu, Jing ZHAO, Chenguang WANG, Chong SHEN, Jun LIU. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2022, No. 4, pp. 400-414 (15 pages)
In recent years, the bionic polarized light compass has been widely studied for unmanned aerial vehicle navigation. However, existing investigation results show that a polarized light compass with a sensitive, high-dynamic-range polarimeter still provides inferior output precision of the heading angle because of noise generated by the compass. The noise exists not only in the angle of polarization image acquired by polarimeters but also in the output heading data, which leads to a sharp reduction in the accuracy of a polarized light compass. Herein, we present a noise analysis and a novel multiscale transform denoising method for a polarized light compass used for unmanned aerial vehicle navigation. Specifically, a multiscale principal component analysis utilizing one-dimensional image entropy as the classification criterion is directly implemented to suppress the noise in the acquired polarization image. Subsequently, a multiscale time-frequency peak filtering method using the sample entropy as the classification criterion is applied to the output heading data so as to further increase the accuracy of the heading measured from the denoised image. These two approaches are combined to significantly reduce the heading error caused by different types of noise. Our experimental results indicate that the proposed multiscale transform denoising method exhibits high performance in suppressing the noise of a polarized light compass used for unmanned aerial vehicle navigation compared with existing methods.
Keywords: Denoising; Multi-scale transform; Orientation; Polarized light compass; UAV navigation
Certainty-based Preference Completion
11
Authors: Lei Li, Minghe Xue, Zan Zhang, Huanhuan Chen, Xindong Wu. Data Intelligence (EI), 2022, No. 1, pp. 112-133 (22 pages)
Because it is sometimes impractical to ask agents to provide linear orders over all alternatives, preference completion is necessary for the resulting partial rankings. Specifically, the personalized preference of each agent over all the alternatives can be estimated with partial rankings from neighboring agents over subsets of alternatives. However, since the agents' rankings are nondeterministic and may contain noise, it is necessary and important to conduct certainty-based preference completion. Hence, in this paper, for alternative pairs with the obtained ranking set, a bijection is first built from the ranking space to the preference space, and the certainty and conflict of alternative pairs are each evaluated with a well-built statistical measurement, the Probability-Certainty Density Function on subjective probability. Then, a voting algorithm based on certainty and conflict is used to conduct the certainty-based preference completion. Moreover, the properties of the proposed certainty and conflict have been studied empirically, and the proposed approach to certainty-based preference completion for partial rankings has been experimentally validated against state-of-the-art approaches on several datasets.
Keywords: Preference completion; Nondeterministic; Certainty; Subjective probability; Conflict
Representation learning: serial-autoencoder for personalized recommendation
12
Authors: Yi ZHU, Yishuai GENG, Yun LI, Jipeng QIANG, Xindong Wu. Frontiers of Computer Science (SCIE, EI), 2024, No. 4, pp. 61-72 (12 pages)
Nowadays, personalized recommendation has become a research hotspot for addressing information overload. Despite this, generating effective recommendations from sparse data remains a challenge. Recently, auxiliary information has been widely used to address data sparsity, but most models using auxiliary information are linear and have limited expressiveness. Due to the advantages of feature extraction and no-label requirements, autoencoder-based methods have become quite popular. However, most existing autoencoder-based methods discard the reconstruction of auxiliary information, which poses huge challenges for better representation learning and model scalability. To address these problems, we propose Serial-Autoencoder for Personalized Recommendation (SAPR), which aims to reduce the loss of critical information and enhance the learning of feature representations. Specifically, we first combine the original rating matrix and item attribute features and feed them into the first autoencoder to generate a higher-level representation of the input. Second, we use a second autoencoder to enhance the reconstruction of the data representation of the prediction rating matrix. The output rating information is used for recommendation prediction. Extensive experiments on the MovieTweetings and MovieLens datasets have verified the effectiveness of SAPR compared to state-of-the-art models.
Keywords: personalized recommendation; autoencoder; representation learning; collaborative filtering