3 articles found
1. Classification and quantification of timestamp data quality issues and its impact on data quality outcome
Author: Rex Ambe. Data Intelligence (EI), 2024, No. 3, pp. 812-833 (22 pages)
Timestamps play a key role in process mining because they determine the chronology in which events occurred and, subsequently, how they are ordered in process modelling. The timestamp in process mining gives insight into process performance, conformance, and modelling. Problems with the timestamp will therefore result in misrepresentations of the mined process. A few articles have been published on the quantification of data quality problems, but only one of them, at the time of this paper, addresses the quantification of timestamp quality problems. This article evaluates the quality of timestamps in event logs across two axes, using eleven quality dimensions and four levels of potential data quality problems. The eleven data quality dimensions were obtained through a thorough literature review of more than fifty process mining articles that focus on quality dimensions. This evaluation resulted in twelve data quality quantification metrics, which were applied to the MIMIC-II dataset as an illustration. The outcome of the timestamp quality quantification using the proposed typology enables the user to appreciate the quality of the event log and thus makes it possible to evaluate the risk of carrying out specific data cleaning measures to improve the process mining outcome.
Keywords: timestamp; process mining; data quality dimensions; event log; quality metrics; business process
2. A Comparative Study on Two Techniques of Reducing the Dimension of Text Feature Space
Authors: Yin Zhonghang, Wang Yongcheng, Cai Wei & Diao Qian (School of Electronic & Information Technology, Shanghai Jiaotong University, Shanghai 200030, P.R. China). Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2002, No. 1, pp. 87-92 (6 pages)
With the development of large-scale text processing, the dimension of the text feature space has grown larger and larger, which has added considerable difficulty to natural language processing. How to reduce the dimension has become a practical problem in the field. Here we present two clustering methods, concept association and concept abstract, to achieve this goal. The first refers to keyword clustering based on the co-occurrence of keywords in the same text, and the second refers to that in the same category. We then compare the differences between them. Our experimental results show that both are effective in reducing the dimension of the text feature space.
Keywords: text data mining
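The abstract above describes clustering keywords by their co-occurrence in the same text so that each cluster can stand in for several original features. The following is only a rough sketch of that general idea, not the authors' algorithm: the function names, the `min_cooccurrence` threshold, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each keyword pair, the number of texts containing both."""
    counts = Counter()
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_keywords(docs, min_cooccurrence=2):
    """Merge keywords that share at least `min_cooccurrence` texts (union-find).

    The threshold is a hypothetical parameter, not taken from the paper.
    """
    parent = {kw: kw for doc in docs for kw in doc}

    def find(kw):
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]  # path halving
            kw = parent[kw]
        return kw

    for (a, b), n in cooccurrence_counts(docs).items():
        if n >= min_cooccurrence:
            parent[find(a)] = find(b)

    clusters = {}
    for kw in parent:
        clusters.setdefault(find(kw), set()).add(kw)
    return sorted(sorted(c) for c in clusters.values())


# Toy corpus (invented): each text is a list of its keywords.
docs = [
    ["stock", "market", "price"],
    ["stock", "market", "bank"],
    ["gene", "protein"],
    ["gene", "protein", "price"],
]
clusters = cluster_keywords(docs, min_cooccurrence=2)
print(clusters)
```

In this toy corpus, six keywords collapse into four clusters, which is the sense in which such clustering reduces the dimension of the feature space. Replacing the per-text counts with per-category counts would give a rough analogue of the second method the abstract mentions.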
3. Comparison of dimension reduction methods for DEA under big data via Monte Carlo simulation
Authors: Zikang Chen, Song Han. Journal of Management Science and Engineering, 2021, No. 4, pp. 363-376 (14 pages)
Data with large dimensions bring various problems to the application of data envelopment analysis (DEA). In this study, we focus on a "big data" problem related to the considerably large dimensions of the input-output data. The four most widely used approaches to guide dimension reduction in DEA are compared via Monte Carlo simulation: principal component analysis (PCA-DEA), which is based on the idea of aggregating inputs and outputs, and efficiency contribution measurement (ECM), average efficiency measure (AEC), and regression-based detection (RB), which are based on the idea of variable selection. We compare the performance of these methods under different scenarios and with a brand-new comparison benchmark for the simulation test. In addition, we discuss the effect of initial variable selection in RB for the first time. Based on the results, we offer more reliable guidelines on how to choose an appropriate method.
Keywords: data envelopment analysis; big data; data dimension reduction method