6 articles found
A hybrid data-mining framework for train rescheduling strategy pattern discovery
1
Authors: Ruirui Chen, Xuekai Ge (+1 more author), Ping Huang, Chao Wen. Transportation Safety and Environment (EI), 2024, Issue 1, pp. 56-67 (12 pages)
This study presents a hybrid data-mining framework based on feature selection algorithms and clustering methods to perform the pattern discovery of high-speed railway train rescheduling strategies (RSs). The proposed model is composed of two states. In the first state, decision tree, random forest, gradient boosting decision tree (GBDT) and extreme gradient boosting (XGBoost) models are used to investigate the importance of features. The features that have a high influence on RSs are first selected. In the second state, a K-means clustering method is used to uncover the interdependences between RSs and the influencing features, based on the results in the first state. The proposed method can determine the quantitative relationships between RSs and influencing factors. The results clearly show the influences of the factors on RSs, the possibilities of different train operation RSs under different situations, as well as some key time periods and key trains that the controllers should pay more attention to. The research in this paper can help train traffic controllers better understand the train operation patterns and provides direction for optimizing rail traffic RSs.
Keywords: high-speed railway; rescheduling strategy; pattern discovery; feature identification; clustering algorithm
An Algorithm for Mining Gradual Moving Object Clusters Pattern From Trajectory Streams
2
Authors: Yujie Zhang, Genlin Ji (+1 more author), Bin Zhao, Bo Sheng. Computers, Materials & Continua (SCIE, EI), 2019, Issue 6, pp. 885-901 (17 pages)
The discovery of gradual moving object clusters pattern from trajectory streams allows characterizing movement behavior in real time environment, which leverages new applications and services. Since the trajectory streams is rapidly evolving, continuously created and cannot be stored indefinitely in memory, the existing approaches designed on static trajectory datasets are not suitable for discovering gradual moving object clusters pattern from trajectory streams. This paper proposes a novel algorithm of gradual moving object clusters pattern discovery from trajectory streams using sliding window models. By processing the trajectory data in current window, the mining algorithm can capture the trend and evolution of moving object clusters pattern. Firstly, the density peaks clustering algorithm is exploited to identify clusters of different snapshots. The stable relationship between relatively few moving objects is used to improve the clustering efficiency. Then, by intersecting clusters from different snapshots, the gradual moving object clusters pattern is updated. The relationship of clusters between adjacent snapshots and the gradual property are utilized to accelerate updating process. Finally, experiment results on two real datasets demonstrate that our algorithm is effective and efficient.
Keywords: Trajectory streams; pattern mining; moving object clusters pattern; discovery of moving clusters pattern
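The cluster-intersection step over a sliding window, as described in the abstract above, might be sketched roughly as follows. This is only an illustration under stated assumptions: the clusters for each snapshot are taken as given (the density peaks clustering step is not shown), the function names `persistent_groups` and `sliding_window_groups` and the `min_objects` threshold are hypothetical, and the sketch recomputes each window from scratch rather than using the paper's accelerated update or its exact definition of a "gradual" pattern.

```python
from collections import deque


def persistent_groups(snapshots, min_objects=2):
    """Intersect clusters across all snapshots and keep the groups of
    objects that stay together, with at least min_objects members."""
    groups = None
    for clusters in snapshots:
        clusters = [frozenset(c) for c in clusters]
        if groups is None:
            # The first snapshot seeds the candidate groups.
            groups = set(clusters)
        else:
            # Intersect every surviving group with every current cluster.
            groups = {g & c for g in groups for c in clusters}
        # Drop groups that have become too small.
        groups = {g for g in groups if len(g) >= min_objects}
    return groups or set()


def sliding_window_groups(stream, window_size, min_objects=2):
    """Yield the persistent groups for each full window over the stream."""
    window = deque(maxlen=window_size)  # old snapshots fall out automatically
    for clusters in stream:
        window.append(clusters)
        if len(window) == window_size:
            yield persistent_groups(window, min_objects)


# Three snapshots, each a list of clusters of (hypothetical) object ids.
stream = [
    [{1, 2, 3}, {4, 5}],
    [{1, 2}, {3, 4, 5}],
    [{1, 2, 5}, {3, 4}],
]
results = list(sliding_window_groups(stream, window_size=2))
```

With a window of two snapshots, objects 1 and 2 stay together in both windows, while the pairing of 4 with 5 is replaced by 3 with 4 in the second window.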
A theoretical model for pattern discovery in visual analytics (cited by: 1)
3
Authors: Natalia Andrienko, Gennady Andrienko (+2 more authors), Silvia Miksch, Heidrun Schumann, Stefan Wrobel. Visual Informatics (EI), 2021, Issue 1, pp. 23-42 (20 pages)
The word ‘pattern’ frequently appears in the visualisation and visual analytics literature, but what do we mean when we talk about patterns? We propose a practicable definition of the concept of a pattern in a data distribution as a combination of multiple interrelated elements of two or more data components that can be represented and treated as a unified whole. Our theoretical model describes how patterns are made by relationships existing between data elements. Knowing the types of these relationships, it is possible to predict what kinds of patterns may exist. We demonstrate how our model underpins and refines the established fundamental principles of visualisation. The model also suggests a range of interactive analytical operations that can support visual analytics workflows where patterns, once discovered, are explicitly involved in further data analysis.
Keywords: Visual analytics; Data distribution; Pattern; Abstraction; Data organisation; Data arrangement; Data variation; Pattern discovery
Verbumculus and the Discovery of Unusual Words (cited by: 1)
4
Authors: Alberto Apostolico, Fang-Cheng Gong, Stefano Lonardi. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, Issue 1, pp. 22-41 (20 pages)
Measures relating word frequencies and expectations have been constantly of interest in Bioinformatics studies. With sequence data becoming massively available, exhaustive enumeration of such measures have become conceivable, and yet pose significant computational burden even when limited to words of bounded maximum length. In addition, the display of the huge tables possibly resulting from these counts poses practical problems of visualization and inference. VERBUMCULUS is a suite of software tools for the efficient and fast detection of over- or under-represented words in nucleotide sequences. The inner core of VERBUMCULUS rests on subtly interwoven properties of statistics, pattern matching and combinatorics on words, that enable one to limit drastically and a priori the set of over- or under-represented candidate words of all lengths in a given sequence, thereby rendering it more feasible both to detect and visualize such words in a fast and practically useful way. This paper is devoted to the description of the facility at the outset and to report experimental results, ranging from simulations on synthetic data to the discovery of regulatory elements on the upstream regions of a set of genes of the yeast.
Keywords: verbumculus; unusual words; subword statistics; pattern discovery; regulatory elements; suffix trees
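To make the idea of over- or under-represented words concrete, here is a minimal sketch that compares each word's observed count with its expected count. It is not the VERBUMCULUS implementation: it assumes an i.i.d. letter model estimated from the sequence itself, uses a simple binomial variance that ignores word self-overlap, and enumerates words of a single length naively instead of using suffix trees. The function name `word_z_scores` is hypothetical.

```python
from collections import Counter
from math import prod, sqrt


def word_z_scores(seq, k):
    """Score every length-k word occurring in seq against an i.i.d. letter model.

    Positive scores suggest over-representation, negative scores
    under-representation (assumption: binomial approximation, overlaps ignored).
    """
    n_positions = len(seq) - k + 1
    if n_positions <= 0:
        return {}
    letter_freq = Counter(seq)
    total = len(seq)
    # Observed counts of every length-k window.
    observed = Counter(seq[i:i + k] for i in range(n_positions))
    scores = {}
    for word, count in observed.items():
        # Probability of the word if letters were drawn independently.
        p = prod(letter_freq[c] / total for c in word)
        expected = n_positions * p
        variance = n_positions * p * (1 - p)
        scores[word] = (count - expected) / sqrt(variance) if variance > 0 else 0.0
    return scores


scores = word_z_scores("AAAAAAAACGTCGTCGTCGT", 3)
```

Words that never occur in the sequence get no score here; a fuller treatment would also score absent words, which are the most extreme under-represented candidates.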
An Algorithm for Finding Conserved Secondary Structure Motifs in Unaligned RNA Sequences
5
Authors: Giulio Pavesi, Giancarlo Mauri, Graziano Pesole. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, Issue 1, pp. 2-12 (11 pages)
Several experiments and observations have revealed the fact that small local distinct structural features in RNA molecules are correlated with their biological function, for example, in post-transcriptional regulation of gene expression. Thus, finding similar structural features in a set of RNA sequences known to play the same biological function could provide substantial information concerning which parts of the sequences are responsible for the function itself. Unfortunately, finding common structural elements in RNA molecules is a very challenging task, even if limited to secondary structure. The main difficulty lies in the fact that in nearly all the cases the structure of the molecules is unknown, has to be somehow predicted, and that sequences with little or no similarity can fold into similar structures. Although they differ in some details, the approaches proposed so far are usually based on the preliminary alignment of the sequences and attempt to predict common structures (either local or global, or for some selected regions) for the aligned sequences. These methods give good results when sequence and structure similarity are very high, but function less well when similarity is limited to small and local elements, like single stem-loop motifs. Instead of aligning the sequences, the algorithm we present directly searches for regions of the sequences that can fold into similar structures, where the degree of similarity can be defined by the user. Any information concerning sequence similarity in the motifs can be used either as a search constraint, or a posteriori, by post-processing the output. The search for the regions sharing structural similarity is implemented with the affix tree, a novel text-indexing structure that significantly accelerates the search for patterns having a symmetric layout, such as those forming stem-loop structures. Tests based on experimentally known structures have shown that the algorithm is able to identify functional motifs in the secondary structure of non coding RNA, such as Iron Responsive Elements (IRE) in the untranslated regions of ferritin mRNA, and the domain IV stem-loop structure in SRP RNA.
Keywords: pattern discovery; RNA secondary structure; affix trees
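The symmetric layout of a stem-loop motif can be illustrated with a naive scan for regions whose two ends can base-pair around an unpaired loop. This is a brute-force sketch, not the affix-tree search the abstract describes: it assumes a perfect stem of fixed length, allows Watson-Crick and G-U wobble pairs, and works on a single sequence. The function name `find_stem_loops` and its parameters are hypothetical.

```python
# Allowed base pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stem_loops(rna, stem_len=4, min_loop=3, max_loop=8):
    """Return (start, stem_len, loop_len) for every perfect hairpin in rna.

    A hairpin is stem_len bases, then loop_len unpaired bases, then
    stem_len bases that pair with the first arm in reverse order.
    """
    hits = []
    n = len(rna)
    for start in range(n):
        for loop_len in range(min_loop, max_loop + 1):
            end = start + 2 * stem_len + loop_len  # exclusive end of the motif
            if end > n:
                break  # longer loops cannot fit either
            # Base i of the left arm must pair with base i from the right end.
            if all((rna[start + i], rna[end - 1 - i]) in PAIRS
                   for i in range(stem_len)):
                hits.append((start, stem_len, loop_len))
    return hits


hits = find_stem_loops("GGGCAAAAGCCC")
```

The scan costs time proportional to sequence length times the number of loop lengths times the stem length, which is the kind of repeated symmetric matching that an index structure is meant to speed up.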
Incremental expectation maximization principal component analysis for missing value imputation for coevolving EEG data
6
Authors: Sun Hee KIM, Hyung Jeong YANG, Kam Swee NG. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2011, Issue 8, pp. 687-697 (11 pages)
Missing values occur in bio-signal processing for various reasons, including technical problems or biological characteristics. These missing values are then either simply excluded or substituted with estimated values for further processing. When the missing signal values are estimated for electroencephalography (EEG) signals, an example where electrical signals arrive quickly and successively, rapid processing of high-speed data is required for immediate decision making. In this study, we propose an incremental expectation maximization principal component analysis (iEMPCA) method that automatically estimates missing values from multivariable EEG time series data without requiring a whole and complete data set. The proposed method solves the problem of a biased model, which inevitably results from simply removing incomplete data rather than estimating them, and thus reduces the loss of information by incorporating missing values in real time. By using an incremental approach, the proposed method also minimizes memory usage and processing time of continuously arriving data. Experimental results show that the proposed method assigns more accurate missing values than previous methods.
Keywords: Electroencephalography (EEG); Missing value imputation; Hidden pattern discovery; Expectation maximization; Principal component analysis