Funding: Supported in part by the Beijing Natural Science Foundation under grants M21032 and 19L2029, in part by the National Natural Science Foundation of China under grants U1836106 and 81961138010, and in part by the Scientific and Technological Innovation Foundation of Foshan under grants BK21BF001 and BK20BF010.
Abstract: Short texts are now widespread in social data related to the 5G-enabled Internet of Things (IoT). Short text classification is challenging because such texts are sparse and lack context. Previous studies mainly tackle these problems by enhancing either the semantic information or the statistical information individually. However, the improvement achieved with a single type of information is limited, while fusing several types of information may improve classification accuracy more effectively. To this end, this article proposes a feature fusion method that integrates the statistical feature and the comprehensive semantic feature by using a weighting mechanism and deep learning models. In the proposed method, Bidirectional Encoder Representations from Transformers (BERT) is applied to generate word vectors at the sentence level automatically; the statistical feature, the local semantic feature and the overall semantic feature are then obtained using the Term Frequency-Inverse Document Frequency (TF-IDF) weighting approach, a Convolutional Neural Network (CNN) and a Bidirectional Gated Recurrent Unit (BiGRU), respectively. The fused feature is then used for classification. Experiments are conducted on five popular short text classification datasets and a 5G-enabled IoT social dataset, and the results show that the proposed method effectively improves classification performance.
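As a rough illustration of how a TF-IDF statistical feature might be combined with word vectors, the sketch below computes TF-IDF weights, uses them to scale per-token vectors into one sentence-level vector, and concatenates the result with a semantic vector. It is not the authors' implementation. The toy `embeddings` stand in for BERT outputs, `semantic_vec` stands in for CNN/BiGRU features, and the TF-IDF variant (relative term frequency times the natural log of N/df) and fusion by concatenation are all assumptions made for the example.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

def weighted_sentence_vector(term_weights, embeddings):
    """Sum the vector of each distinct term, scaled by its TF-IDF weight."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for term, weight in term_weights.items():
        for d in range(dim):
            vec[d] += weight * embeddings[term][d]
    return vec

# Toy corpus and hypothetical 2-d embeddings (placeholders for BERT vectors).
docs = [["5g", "iot", "text"], ["short", "text"], ["iot", "sensor"]]
embeddings = {
    "5g": [1.0, 0.0], "iot": [0.0, 1.0], "text": [1.0, 1.0],
    "short": [0.5, 0.5], "sensor": [0.0, 0.5],
}
weights = tfidf(docs)
stat_vec = weighted_sentence_vector(weights[0], embeddings)
semantic_vec = [0.1, 0.2, 0.3]   # placeholder for CNN/BiGRU features
fused = stat_vec + semantic_vec  # fusion by concatenation (an assumption)
```

In this toy corpus, "5g" occurs in only one document and so receives a larger weight than "text", which occurs in two; a term present in every document would receive weight zero under this idf variant.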
Abstract: Data Mining (DM) methods are increasingly used for prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction, an area that has attracted a great deal of attention from researchers. The main contribution of this paper is an outline of the use of DM with time series data, drawing mainly on examples related to short-term stock prediction, which supports a better understanding of the field. Some of the main trends and open issues are also introduced.
Funding: Supported by the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 51205283).
Abstract: This paper introduces the basic theory and algorithm of the surrogate data method, which provides a rigorous way to detect random and seemingly stochastic characteristics in a system. Gaussian data and Rössler data are used to demonstrate the applicability and effectiveness of the method. The method is then applied to short-circuiting current signals recorded at the same voltage and different wire feed speeds. The analysis shows that the electrical signal time series appear random when the welding parameters do not match, whereas they are deterministic when the parameters match. The stability of the short-circuiting transfer process can therefore be judged by the surrogate data method.
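The general idea of a surrogate data test can be sketched as follows: compute a discriminating statistic on the original series, compute the same statistic on surrogates that satisfy a null hypothesis, and reject the null if the original value lies outside the surrogate range. The sketch below is a simplified stand-in, not the paper's algorithm. It assumes shuffled surrogates (null hypothesis: independent, identically distributed noise), a nearest-neighbour one-step prediction error as the statistic, and a logistic map in place of the Rössler system to keep the example short; all function names are hypothetical.

```python
import random

def logistic_series(n, r=3.99, x0=0.3):
    """Deterministic test signal (stand-in for the Rössler data)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def nn_prediction_error(xs):
    """Mean absolute error when x[i+1] is predicted by the successor
    of the value closest to x[i] elsewhere in the series."""
    n = len(xs) - 1  # indices that have a successor
    total = 0.0
    for i in range(n):
        j = min((k for k in range(n) if k != i),
                key=lambda k: abs(xs[k] - xs[i]))
        total += abs(xs[j + 1] - xs[i + 1])
    return total / n

def surrogate_test(xs, n_surrogates=19, seed=0):
    """Compare the statistic on xs with its values on shuffled copies."""
    rng = random.Random(seed)
    original = nn_prediction_error(xs)
    surrogate_errors = []
    for _ in range(n_surrogates):
        shuffled = xs[:]
        rng.shuffle(shuffled)  # destroys temporal order, keeps the values
        surrogate_errors.append(nn_prediction_error(shuffled))
    # One-sided rank test: with 19 surrogates, the original being the
    # smallest of all 20 values corresponds to a 5% significance level.
    rejected = original < min(surrogate_errors)
    return original, surrogate_errors, rejected

original, surrogate_errors, rejected = surrogate_test(logistic_series(200))
```

For the deterministic series, the prediction error of the original is far below that of every shuffled copy, so the noise hypothesis is rejected. For a truly random series, the original and its surrogates would be expected to give similar errors.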
Funding: The Project of Research on Technology and Devices for Traffic Guidance (Vehicle Navigation) System of Beijing Municipal Commission of Science and Technology (No. H030630340320); the Project of Research on the Intelligence Traffic Information Platform of Beijing Education Committee.
Abstract: A K-nearest neighbor (K-NN) based nonparametric regression model is proposed to predict travel speed on Beijing expressways. Using historical traffic data collected from detectors on Beijing expressways, a specifically designed database was developed through data filtering, wavelet analysis and clustering. A relativity-based weighted Euclidean distance was used as the distance metric to identify the K groups of nearest data series. A K-NN nonparametric regression model was then built to predict average travel speeds up to 6 min into the future. Several randomly selected travel speed data series, collected from the floating car data (FCD) system, were used to validate the model. The results indicate that, using the FCD, the model can predict average travel speeds with an accuracy above 90%, and hence is feasible and effective.
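A minimal sketch of K-NN nonparametric regression with a weighted Euclidean distance is given below. It only illustrates the general technique. The weights, the record layout (a short pattern of recent speeds paired with the speed that followed it) and the plain averaging of the K neighbours are assumptions, since the abstract does not specify the relativity-based weighting or the prediction function.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two speed patterns."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def knn_predict(history, current, weights, k=3):
    """Average the follow-up speeds of the k most similar historical patterns.

    history: list of (pattern, next_speed) pairs.
    """
    ranked = sorted(history,
                    key=lambda rec: weighted_distance(rec[0], current, weights))
    nearest = ranked[:k]
    return sum(next_speed for _, next_speed in nearest) / len(nearest)

# Hypothetical historical records: three recent speeds (km/h) -> next speed.
history = [
    ([60, 58, 55], 52),
    ([61, 59, 56], 53),
    ([30, 28, 25], 22),
    ([62, 60, 57], 54),
    ([80, 82, 85], 86),
]
weights = [0.2, 0.3, 0.5]  # assumed: more recent observations count more
prediction = knn_predict(history, [60.5, 58.5, 55.5], weights, k=3)
```

Here the three closest patterns are the ones followed by 52, 53 and 54 km/h, so the predicted speed is their mean, 53.0 km/h.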
Abstract: Common forms of short text include microblogs, Twitter posts, short product reviews, short movie reviews and instant messages, and their sentiment analysis has been a hot topic. This paper proposes a highly accurate model for short-text sentiment analysis, targeting microblogs, product reviews and movie reviews. Based on massive user data, words, symbols and sentences with emotional tendencies are shown to be important indicators in short-text sentiment analysis, and using these features is an effective way to predict the emotional tendency of short text. The model accounts for polysemy in single-character emotional words in Chinese and treats single-character and multi-character emotional words separately. The idea behind the model can be applied to various kinds of short-text data. Experiments show that the model performs well in most cases.
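Abstract: To address the small target size and the difficulty of detecting inshore targets in ship detection from SAR images, this article proposes a ship detection network called the Long and Short path Feature Fusion Network (LSFF-Net). Through its long and short path feature fusion module, the network balances the detection of large and small targets and avoids losing the feature information of small targets. Structural re-parameterization is applied in the network to improve the learning ability of the module. To support multi-scale target detection, a feature pyramid network is added to fuse multi-scale features. To handle inshore targets, a data redistribution algorithm is designed, which improves detection accuracy for inshore samples. Experimental results show that on a public dataset the algorithm reaches an Average Precision (AP) of 97.50%, outperforming mainstream object detection algorithms. The method offers a new way to improve the detection accuracy of small targets and inshore targets in SAR images.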
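Abstract: An improved initial orbit determination (IOD) algorithm is proposed, based on a range search method with a dynamic threshold, to improve on the IOD success rate and IOD error of traditional algorithms. By dynamically adjusting the search threshold, the algorithm aims at more accurate and efficient initial orbit determination to meet current needs for space objects. The implementation of the dynamic-threshold range search algorithm is described: drawing on data-processing experience, the dynamic threshold is used to screen orbits in the quality control stage for the initial orbit parameters, and a detailed implementation flow is given. The errors of the determined initial orbit parameters are evaluated against TLE (Two Line Elements) data. The algorithm is tested with measured angle data for low and medium orbit objects from the "Zhulong" (烛龙) observation network and for high orbit objects from the Changchun Satellite Observatory of the Chinese Academy of Sciences. The results show that the short-arc IOD success rates for LEO, MEO and GEO objects are about 94%, 75% and 89%, and the mean semi-major axis errors are about 9, 12 and 50 km, respectively. The algorithm is widely applicable, with a high success rate and high orbit determination accuracy, and the results also attest to the quality of the monitoring data.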