To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logical design was achieved in multi-dimensional database mode. The optimal data model was confirmed by identifying data objects, defining relations and reviewing entities. The conversion of relations among entities to foreign keys, and of entities and physical attributes to tables and fields, was explained in full. On this basis, a multi-dimensional database that reflects how a dam safety monitoring system manages and analyzes monitoring data has been established, for which fact tables and dimension tables have been designed. Finally, based on service design and user interface design, the dam safety monitoring system has been developed with Delphi as the development tool. This development project shows that the multi-dimensional database can simplify the development process and minimize hidden dangers in the database structure design. It is superior to other dam safety monitoring system development models and can provide a new research direction for system developers.
The question of how to choose a copula model that best fits a given dataset is a predominant limitation of the copula approach, and the present study aims to investigate the techniques of goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was mathematically expanded from two dimensions to three dimensions, and procedures of a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application of historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. The goodness-of-fit tests for multi-dimensional copulas can provide further support for applying a wider range of copulas to describe the associations of correlated hydrological variables. However, for the application of copulas with more than three dimensions, more complicated computational efforts as well as exploration and parameterization of corresponding copulas are required.
Since its inception in the 1970s, multi-dimensional magnetic resonance (MR) has emerged as a powerful tool for non-invasive investigations of structures and molecular interactions. MR spectroscopy beyond one dimension allows the study of the correlation, exchange processes, and separation of overlapping spectral information. The multi-dimensional concept has been re-implemented over the last two decades to explore molecular motion and spin dynamics in porous media. Apart from Fourier transform, methods have been developed for processing the multi-dimensional time-domain data, identifying the fluid components, and estimating pore surface permeability via joint relaxation and diffusion spectra. Through the resolution of spectroscopic signals with spatial encoding gradients, multi-dimensional MR imaging has been widely used to investigate the microscopic environment of living tissues and distinguish diseases. Signals in each voxel are usually expressed as multi-exponential decay, representing microstructures or environments along multiple pore scales. The separation of contributions from different environments is a common ill-posed problem, which can be resolved numerically. Moreover, the inversion methods and experimental parameters determine the resolution of multi-dimensional spectra. This paper reviews the algorithms that have been proposed to process multi-dimensional MR datasets in different scenarios. Detailed information at the microscopic level, such as tissue components, fluid types and food structures in multi-disciplinary sciences, could be revealed through multi-dimensional MR.
To discover the main causes of elevator group accidents in an edge computing environment, a multi-dimensional data model of elevator accident data is established using data cube technology. A method that combines the classical Apriori algorithm with this model is proposed and implemented; it mines the frequent items of elevator accident data to explore the main reasons why elevator accidents occur. In addition, a collaborative edge model of elevator accidents is set up to achieve data sharing, making it possible to check the details of each cause and confirm the causes of elevator accidents. Lastly, association rules are applied to find the regularities of elevator accidents.
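The frequent-item mining step named in this abstract can be illustrated with a minimal Apriori sketch. The abstract does not give the data cube dimensions or the implementation, so the `dimension=value` items, the record set, and the support threshold below are hypothetical; this is an illustration of the classical algorithm, not the authors' code.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical accident records; each item is one cell coordinate of the cube.
records = [
    {"cause=door_fault", "building=residential", "season=summer"},
    {"cause=door_fault", "building=residential", "season=winter"},
    {"cause=overload", "building=office", "season=summer"},
    {"cause=door_fault", "building=residential", "season=summer"},
]
frequent_itemsets = apriori(records, min_support=3)
```

With these made-up records, the only frequent pair is door faults in residential buildings; association rules would then be derived from such itemsets.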
A similarity measure design for discrete data groups was proposed, and a similarity measure design for continuous membership functions was also carried out. The proposed similarity measures were designed based on fuzzy numbers and distance measures, and were proved. To calculate the degree of similarity of discrete data, the relative degree between the data and the total distribution was obtained. The discrete data similarity measure was completed by combining these relative degrees. A power interconnected system with multiple characteristics was considered as an application of the discrete similarity measure. The similarity measure was then extended to the multi-dimensional case and applied to a bus clustering problem.
Outlier detection is an important task in data mining. In fact, it is difficult to find the clustering centers in some sophisticated multidimensional datasets and to measure the deviation degree of each potential outlier. In this work, an effective outlier detection method based on multi-dimensional clustering and local density (ODBMCLD) is proposed. ODBMCLD first identifies the center objects by the local density peak of data objects, and clusters the whole dataset based on the center objects. Then, outlier objects belonging to different clusters are marked as candidates of abnormal data. Finally, the top N points among these abnormal candidates are chosen as final anomaly objects with high outlier factors. The feasibility and effectiveness of the method are verified by experiments.
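The three stages described here (density-peak centers, clustering around them, top-N selection by outlier factor) can be sketched as follows. The abstract does not define the density estimate or the outlier factor, so the cutoff-count density, the nearest-denser-neighbor assignment, and the factor "distance to cluster center divided by (density + 1)" are all assumptions made for illustration, not the published ODBMCLD definitions.

```python
import math


def density_peak_outliers(points, cutoff, n_centers, top_n):
    """Cluster by local density peaks, then return (labels, top_n outlier indices)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cutoff (assumed form).
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    # Visit points from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank_of = {i: r for r, i in enumerate(order)}
    delta = [0.0] * n   # distance to the nearest point visited earlier
    parent = [-1] * n   # that nearest earlier (denser) point
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i], parent[i] = dist[i][j], j
    # Centers: points that are both dense and far from any denser point.
    centers = sorted(range(n),
                     key=lambda i: (-(rho[i] * delta[i]), rank_of[i]))[:n_centers]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Every other point inherits the cluster of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    # Assumed outlier factor: far from the cluster center and locally sparse.
    score = [dist[i][centers[labels[i]]] / (rho[i] + 1) for i in range(n)]
    outliers = sorted(range(n), key=lambda i: -score[i])[:top_n]
    return labels, outliers


# Hypothetical 2-D data: two tight groups plus two isolated points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 20), (20, 5)]
labels, outliers = density_peak_outliers(points, cutoff=2.0, n_centers=2, top_n=2)
```

On this toy data the two isolated points (indices 8 and 9) receive the highest factors.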
Mill vibration is a common problem in rolling production, which directly affects the thickness accuracy of the strip and may even lead to strip fracture accidents in serious cases. The existing vibration prediction models do not consider the features contained in the data, resulting in limited improvement of model accuracy. To address these challenges, this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model (MDMMVPM) based on the deep fusion of multi-level networks. In the model, the long-term and short-term modal features of multi-dimensional data are considered, and the appropriate prediction algorithms are selected for different data features. Based on the established prediction model, the effects of tension and rolling force on mill vibration are analyzed. Taking the 5th stand of a cold mill in a steel mill as the research object, the innovative model is applied to predict the mill vibration for the first time. The experimental results show that the correlation coefficient (R^2) of the model proposed in this paper is 92.5%, and the root-mean-square error (RMSE) is 0.0011, which significantly improves the modeling accuracy compared with the existing models. The proposed model is also suitable for the hot rolling process, which provides a new method for the prediction of strip rolling vibration.
We propose an approach to learning sample embedding for analyzing multi-dimensional datasets. The basic idea is to extract rules from the given dataset and learn the embedding for each sample based on the rules it satisfies. The approach can filter out pattern-irrelevant attributes, leading to significant visual structures of samples satisfying the same rules in the projection. In addition, analysts can understand a visual structure based on the rules that the involved samples satisfy, which improves the projection's pattern interpretability. Our research involves two methods for achieving and applying the approach. First, we give a method to learn rule-based embedding for each sample. Second, we integrate the method into a system to achieve an analytical workflow. Cases on real-world datasets and quantitative experiment results show the usability and effectiveness of our approach.
Exploration of artworks is enjoyable but often time-consuming. For example, it is not always easy to discover one's favorite types among unknown paintings. Nor is it always easy to explore less popular paintings that look similar to paintings created by famous artists. This paper presents a painting image browser which assists the explorative discovery of paintings that interest the user. The presented browser applies a new multidimensional data visualization technique that highlights particular ranges of particular numeric values based on association rules, suggesting cues for finding favorite painting images. This study assumes that a large number of painting images are provided and that categorical information (e.g., names of artists, year of creation) is assigned to the images. The presented system first calculates the feature values of the images as a preprocessing step. The browser then visualizes the multidimensional feature values as a heatmap and highlights association rules discovered from the relationships between the feature values and the categorical information. This mechanism enables users to explore favorite painting images or painting images that look similar to famous paintings. Our case study and user evaluation demonstrate the effectiveness of the presented image browser.
Multidimensional data query has been gaining much interest in database research communities in recent years, yet many of the existing studies focus mainly on centralized systems. A solution to querying in a Peer-to-Peer (P2P) environment was proposed to achieve both low processing cost, in terms of the number of peers accessed and search messages, and balanced query loads among peers. The system is based on a balanced tree structured P2P network. By partitioning the query space intelligently, the amount of query forwarding is effectively controlled, and the number of peers involved and search messages are also limited. Dynamic load balancing can be achieved during space partitioning and query resolving. Extensive experiments confirm the effectiveness and scalability of our algorithms on P2P networks.
Recently, sequence anomaly detection has been widely used in many fields. Sequence data in these fields are usually multi-dimensional over the data stream. It is a challenge to design an anomaly detection method for a multi-dimensional sequence over the data stream that satisfies the requirements of accuracy and high speed, for two reasons: (1) redundant dimensions in sequence data and a large state space lead to a poor ability for sequence modeling; (2) anomaly detection cannot adapt to the high-speed nature of the data stream, especially when concept drift occurs, which reduces the detection rate. On one hand, most existing methods of sequence anomaly detection focus on the single-dimension sequence. On the other hand, some studies concerning multi-dimensional sequences concentrate mainly on the static database rather than the data stream. To improve the performance of anomaly detection for a multi-dimensional sequence over the data stream, we propose a novel unsupervised fast and accurate anomaly detection (FAAD) method which includes three algorithms. First, a method called "information calculation and minimum spanning tree cluster" is adopted to reduce redundant dimensions. Second, to speed up model construction and ensure the detection rate for the sequence over the data stream, we propose a method called "random sampling and subsequence partitioning based on the index probabilistic suffix tree." Last, the method called "anomaly buffer based on model dynamic adjustment" dramatically reduces the effects of concept drift in the data stream. FAAD is implemented on the streaming platform Storm to detect multi-dimensional log audit data. Compared with the existing anomaly detection methods, FAAD has a good performance in detection rate and speed without being affected by concept drift.
To perform multi-dimensional data aggregation operations efficiently in edge computing-based Internet of Things (IoT) systems, a new efficient privacy-preserving multi-dimensional data aggregation (EPMDA) scheme is proposed in this paper. The EPMDA scheme is characterized by employing homomorphic Paillier encryption and the SM9 signature algorithm. To improve the computation efficiency of the Paillier encryption operation, the EPMDA scheme generates a pre-computed modular exponentiation table for the data of each dimension, so that the Paillier encryption operation can be implemented using only several modular multiplications. For multi-dimensional data, the scheme concatenates zeros between the data of two adjacent dimensions to avoid data overflow in the sum operation of ciphertexts. To enhance security, the EPMDA scheme sets a random number at the high address of the exponent. Moreover, the scheme utilizes the SM9 signature scheme to guarantee device authentication and data integrity. The performance evaluation and comparison show that the EPMDA scheme is more efficient than the existing multi-dimensional data aggregation schemes.
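The zero-padding idea in this abstract can be illustrated on plaintexts alone. The sketch below packs one reading per dimension into a single integer and reserves extra zero bits in each slot so that summing the packed values from several devices never carries into the neighboring dimension. Encryption, the pre-computed exponentiation table, the random high-order field, and SM9 signing are omitted; the slot layout, function names, and sample readings are hypothetical and not taken from the paper.

```python
def slot_width(data_bits, n_devices):
    """Bits per slot: the data bits plus enough zero padding for n_devices sums."""
    # The sum of n values below 2**data_bits is below 2**(data_bits + ceil(log2 n)).
    return data_bits + (n_devices - 1).bit_length()


def pack(values, data_bits, slot_bits):
    """Concatenate per-dimension values into one integer, first value highest."""
    packed = 0
    for v in values:
        if not 0 <= v < (1 << data_bits):
            raise ValueError("value does not fit in data_bits")
        packed = (packed << slot_bits) | v
    return packed


def unpack(packed, slot_bits, dims):
    """Split an (aggregated) packed integer back into per-dimension totals."""
    mask = (1 << slot_bits) - 1
    out = []
    for _ in range(dims):
        out.append(packed & mask)
        packed >>= slot_bits
    return out[::-1]


# Hypothetical 8-bit readings from three devices, three dimensions each.
readings = [[12, 250, 7], [30, 255, 1], [5, 255, 9]]
bits = slot_width(data_bits=8, n_devices=len(readings))
# Plain integer addition stands in for the additively homomorphic ciphertext sum.
aggregate = sum(pack(r, 8, bits) for r in readings)
totals = unpack(aggregate, bits, dims=3)
```

Here the second dimension sums to 760, which exceeds 8 bits; the two padding bits per slot absorb that carry so the other totals stay intact.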
Fuzzy entropy and a similarity measure on intuitionistic fuzzy sets (IFSs) were proposed and analyzed. Unlike fuzzy sets, IFSs contain an uncertainty named hesitance, which is contained in the fuzzy membership function itself. Hence, designing fuzzy entropy is not easy because there are many entropy definitions. By considering different fuzzy entropy definitions, fuzzy entropy on IFSs is designed and discussed. A similarity measure was also presented, and its usefulness for evaluating the degree of similarity was verified.
This paper starts with the untime-diversification of the time-diversification deformation model, gives the displacement distribution model of untime-diversification, and further simplifies the study of the deformation model. The paper discusses the problem of least squares fitting of the coordinate parameter model, that is, the parameters of the deformation model. In the discussion, cubic B-splines and a two-step fitting of multidimensional disordered data are adopted, which makes the fitting function closely approximate the coordinate parameter model and makes the calculation easier.
This paper gives a brief introduction to a few new indexes and methods published in recent issues of seismological literature, which have been explored especially by the authors and many of their collaborators for application in earthquake prediction research. The new indexes include the statistical indexes of seismicity (Morishita index Iδ, the parameters C and b-value spectrum derived from the magnitude-frequency relation, etc.) and indexes describing the dynamical characteristics of seismic waves obtained from digitized seismological records (waveform linearities, spectral characteristics, etc.). The new methods fall into two categories, namely the methods of non-linear sciences (fractal analysis, self-similarity and self-organization structure, neural network) and graphical analysis methods of multi-dimensional data (face analysis, projection pursuit, chronogeometric analysis).
Holistic understanding of wind behaviour over space, time and height is essential for wind energy harvesting applications. This study presents a novel approach for mapping frequent wind profile patterns using multidimensional sequential pattern mining (MDSPM). This study is illustrated with a time series of 24 years of European Centre for Medium-Range Weather Forecasts European Reanalysis-Interim gridded (0.125°×0.125°) wind data for the Netherlands every 6 h and at six height levels. The wind data were first transformed into two spatio-temporal sequence databases (for speed and direction, respectively). Then, the Linear time Closed Itemset Miner Sequence algorithm was used to extract the multidimensional sequential patterns, which were then visualized using a 3D wind rose, a circular histogram and a geographical map. These patterns were further analysed to determine their wind shear coefficients and turbulence intensities as well as their spatial overlap with current areas with wind turbines. Our analysis identified four frequent wind profile patterns. One of them is highly suitable for harvesting wind energy at a height of 128 m, and 68.97% of the geographical area covered by this pattern already contains wind turbines. This study shows that the proposed approach is capable of efficiently extracting meaningful patterns from complex spatio-temporal datasets.
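The two derived quantities mentioned in this abstract can be computed as in the sketch below. The abstract does not state which definitions the study uses, so the power-law shear exponent and the ratio of standard deviation to mean speed are assumed here as the commonly used forms; the speeds and heights are made-up values.

```python
import math
from statistics import mean, pstdev


def wind_shear_coefficient(v_low, v_high, h_low, h_high):
    """Power-law exponent alpha from v_high / v_low = (h_high / h_low) ** alpha."""
    return math.log(v_high / v_low) / math.log(h_high / h_low)


def turbulence_intensity(speeds):
    """Standard deviation of the wind speed divided by its mean."""
    return pstdev(speeds) / mean(speeds)


# Hypothetical values: 5 m/s at 10 m height, 10 m/s at 40 m height.
alpha = wind_shear_coefficient(5.0, 10.0, 10.0, 40.0)
# Hypothetical speed samples (m/s) at a single height.
ti = turbulence_intensity([8.0, 10.0, 12.0])
```

A larger exponent means the wind speed grows faster with height, which matters when comparing candidate hub heights.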
Earth observations and model simulations are generating big multidimensional array-based raster data. However, it is difficult to efficiently query these big raster data due to the inconsistency among the geospatial raster data model, the distributed physical data storage model, and the data pipeline in distributed computing frameworks. To efficiently process big geospatial data, this paper proposes a three-layer hierarchical indexing strategy to optimize Apache Spark with Hadoop Distributed File System (HDFS) from the following aspects: (1) improve I/O efficiency by adopting the chunking data structure; (2) keep the workload balance and high data locality by building the global index (k-d tree); (3) enable Spark and HDFS to natively support geospatial raster data formats (e.g., HDF4, NetCDF4, GeoTiff) by building the local index (hash table); (4) index the in-memory data to further improve geospatial data queries; (5) develop a data repartition strategy to tune the query parallelism while keeping high data locality. The above strategies are implemented by developing customized RDDs, and evaluated by comparing the performance with that of Spark SQL and SciSpark. The proposed indexing strategy can be applied to other distributed frameworks or cloud-based computing systems to natively support big geospatial data query with high efficiency.
Traditional spatial clustering methods have the disadvantage of "hard division" and cannot describe the physical characteristics of spatial entities effectively. In view of this, this paper sets forth a general multi-dimensional cloud model, which describes the characteristics of spatial objects more reasonably according to the idea of non-homogeneity and non-symmetry. Based on the classification and demarcation of infrastructure in Zhanjiang, a detailed interpretation of the clustering results is made from the spatial distribution of the membership degree of clustering, a comparative study with Fuzzy C-means, and a coupled analysis of residential land prices. The general multi-dimensional cloud model better reflects the integrated characteristics of spatial objects, reveals the spatial distribution of potential information, and realizes spatial division more accurately in complex circumstances. However, due to the complexity of spatial interactions between geographical entities, the generation of the cloud model is a specific and challenging task.
The synthetic lethality (SL) relationship arises when a combination of deficiencies in two genes leads to cell death, whereas a deficiency in either one of the two genes does not. The survival of mutant tumor cells depends on the SL partners of the mutant gene, so cancer cells can be selectively killed by inhibiting the SL partners of the oncogenic genes while normal cells are spared. Therefore, there is an urgent need to develop more efficient computational methods of SL pair identification for targeted cancer therapy. In this paper, we propose a new approach based on similarity fusion to predict SL pairs. Multiple types of gene similarity measures are integrated, and the k-nearest neighbors algorithm (k-NN) is applied to achieve the similarity-based classification task between gene pairs. As a similarity-based method, our method demonstrated excellent performance in multiple experiments. Besides its effectiveness, the ease of use and extensibility of our method can also make it more widely used in practice.
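A toy version of similarity fusion followed by k-NN classification of gene pairs might look like the sketch below. The abstract does not specify the fusion rule or how a similarity between two gene pairs is derived from gene-level similarities, so the simple averaging, the "best matching orientation" pair similarity, the gene names, and all scores are assumptions chosen only to make the idea concrete.

```python
from collections import Counter


def fuse(measures):
    """Average several gene-gene similarity tables (missing entries count as 0)."""
    keys = set().union(*measures)
    return {k: sum(m.get(k, 0.0) for m in measures) / len(measures) for k in keys}


def gene_sim(fused, a, b):
    """Fused similarity of two genes; a gene is fully similar to itself."""
    return 1.0 if a == b else fused.get(frozenset((a, b)), 0.0)


def pair_sim(fused, p, q):
    """Assumed pair similarity: the better of the two ways to align the pairs."""
    (a, b), (c, d) = p, q
    return max(gene_sim(fused, a, c) * gene_sim(fused, b, d),
               gene_sim(fused, a, d) * gene_sim(fused, b, c))


def knn_predict(fused, labeled, query, k):
    """Majority label among the k labeled pairs most similar to the query pair."""
    ranked = sorted(labeled, key=lambda p: pair_sim(fused, p, query), reverse=True)
    votes = Counter(labeled[p] for p in ranked[:k])
    return votes.most_common(1)[0][0]


def key(a, b):
    return frozenset((a, b))


# Two hypothetical similarity measures over made-up genes A..J.
expression = {key("A", "C"): 0.9, key("B", "D"): 0.8, key("G", "C"): 0.6,
              key("H", "D"): 0.5, key("E", "C"): 0.2, key("F", "D"): 0.3,
              key("I", "C"): 0.1, key("J", "D"): 0.1}
function = {key("A", "C"): 0.7, key("B", "D"): 0.6, key("G", "C"): 0.4,
            key("H", "D"): 0.7, key("E", "C"): 0.4, key("F", "D"): 0.1,
            key("I", "C"): 0.1, key("J", "D"): 0.1}
fused = fuse([expression, function])
# Label 1 marks a known SL pair, 0 a known non-SL pair (made-up labels).
labeled = {("A", "B"): 1, ("G", "H"): 1, ("E", "F"): 0, ("I", "J"): 0}
prediction = knn_predict(fused, labeled, query=("C", "D"), k=3)
```

In this example the three nearest labeled pairs are (A, B), (G, H) and (E, F), so the query pair (C, D) is predicted to be an SL pair by a two-to-one vote.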
Funding (dam safety monitoring database study): supported by the National Natural Science Foundation of China (Grant Nos. 50539010, 50539110, 50579010, 50539030 and 50809025).
Funding (copula goodness-of-fit study): supported by the Program of Introducing Talents of Disciplines to Universities of the Ministry of Education and State Administration of Foreign Experts Affairs of China (the 111 Project, Grant No. B08048) and the Special Basic Research Fund for Methodology in Hydrology of the Ministry of Science and Technology of China (Grant No. 2011IM011000).
Funding (multi-dimensional MR review): supported by the National Natural Science Foundation of China (Nos. 61901465, 82222032, 82172050).
Funding (discrete data similarity measure study): Project (2010-0020163) supported by the Key Research Institute Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology, Korea.
Funding (ODBMCLD outlier detection study): Project (61362021) supported by the National Natural Science Foundation of China; Project (2016GXNSFAA380149) supported by the Natural Science Foundation of Guangxi Province, China; Projects (2016YJCXB02, 2017YJCX34) supported by the Innovation Project of GUET Graduate Education, China; Project (2011KF11) supported by the Key Laboratory of Cognitive Radio and Information Processing, Ministry of Education, China.
Funding (cold rolling vibration prediction study): Project (2023JH26-10100002) supported by the Liaoning Science and Technology Major Project, China; Projects (U21A20117, 52074085) supported by the National Natural Science Foundation of China; Project (2022JH2/101300008) supported by the Liaoning Applied Basic Research Program Project, China; Project (22567612H) supported by the Hebei Provincial Key Laboratory Performance Subsidy Project, China.
Abstract: Mill vibration is a common problem in rolling production, which directly affects the thickness accuracy of the strip and may even lead to strip fracture accidents in serious cases. The existing vibration prediction models do not consider the features contained in the data, resulting in limited improvement of model accuracy. To address these challenges, this paper proposes a multi-dimensional multi-modal cold rolling vibration time series prediction model (MDMMVPM) based on the deep fusion of multi-level networks. In the model, the long-term and short-term modal features of multi-dimensional data are considered, and appropriate prediction algorithms are selected for different data features. Based on the established prediction model, the effects of tension and rolling force on mill vibration are analyzed. Taking the 5th stand of a cold mill in a steel mill as the research object, the model is applied to predict the mill vibration for the first time. The experimental results show that the correlation coefficient (R^2) of the proposed model is 92.5% and the root-mean-square error (RMSE) is 0.0011, which significantly improves the modeling accuracy compared with the existing models. The proposed model is also suitable for the hot rolling process, which provides a new method for the prediction of strip rolling vibration.
Abstract: We propose an approach to learning sample embeddings for analyzing multi-dimensional datasets. The basic idea is to extract rules from the given dataset and learn the embedding of each sample based on the rules it satisfies. The approach can filter out pattern-irrelevant attributes, so that samples satisfying the same rules form salient visual structures in the projection. In addition, analysts can understand a visual structure through the rules that the involved samples satisfy, which improves the pattern interpretability of the projection. Our research involves two methods for achieving and applying the approach. First, we give a method to learn a rule-based embedding for each sample. Second, we integrate the method into a system to support an analytical workflow. Case studies on real-world datasets and quantitative experimental results show the usability and effectiveness of our approach.
Abstract: Exploring artworks is enjoyable but often time-consuming. For example, it is not always easy to discover favorite types among unknown paintings, nor is it always easy to find less popular paintings that look similar to works created by famous artists. This paper presents a painting image browser that assists the exploratory discovery of paintings of interest to the user. The browser applies a new multidimensional data visualization technique that highlights particular ranges of particular numeric values based on association rules, suggesting cues for finding favorite painting images. This study assumes that a large number of painting images are provided and that categorical information (e.g., artist names, year of creation) is assigned to the images. The system first calculates the feature values of the images as a preprocessing step. The browser then visualizes the multidimensional feature values as a heatmap and highlights association rules discovered from the relationships between the feature values and the categorical information. This mechanism enables users to explore favorite painting images or painting images that look similar to famous paintings. Our case study and user evaluation demonstrate the effectiveness of the presented image browser.
Funding: Supported by the Natural Science Foundation of Jiangsu Province (BG2004034)
Abstract: Multidimensional data querying has attracted much interest in the database research community in recent years, yet many of the existing studies focus mainly on centralized systems. A solution for querying in a Peer-to-Peer (P2P) environment is proposed to achieve both low processing cost, in terms of the number of peers accessed and search messages sent, and balanced query loads among peers. The system is based on a balanced tree-structured P2P network. By partitioning the query space intelligently, the amount of query forwarding is effectively controlled, and the number of peers involved and search messages sent is also limited. Dynamic load balancing can be achieved during space partitioning and query resolution. Extensive experiments confirm the effectiveness and scalability of our algorithms on P2P networks.
Funding: Project supported by the National Key R&D Program of China (No. 2016YFB1000101), the National Natural Science Foundation of China (Nos. 61379052 and 61502513), the Natural Science Foundation for Distinguished Young Scholars of Hunan Province, China (No. 14JJ1026), and the Specialized Research Fund for the Doctoral Program of Higher Education, China (No. 20124307110015)
Abstract: Recently, sequence anomaly detection has been widely used in many fields. Sequence data in these fields are usually multi-dimensional and arrive over a data stream. It is a challenge to design an anomaly detection method for a multi-dimensional sequence over the data stream that satisfies the requirements of both accuracy and high speed. This is because (1) redundant dimensions in sequence data and a large state space lead to poor sequence modeling ability, and (2) anomaly detection cannot adapt to the high-speed nature of the data stream, especially when concept drift occurs, which reduces the detection rate. On the one hand, most existing methods of sequence anomaly detection focus on single-dimension sequences. On the other hand, the studies that do address multi-dimensional sequences concentrate mainly on static databases rather than data streams. To improve the performance of anomaly detection for a multi-dimensional sequence over the data stream, we propose a novel unsupervised fast and accurate anomaly detection (FAAD) method, which includes three algorithms. First, a method called "information calculation and minimum spanning tree cluster" is adopted to reduce redundant dimensions. Second, to speed up model construction and ensure the detection rate for the sequence over the data stream, we propose a method called "random sampling and subsequence partitioning based on the index probabilistic suffix tree." Last, a method called "anomaly buffer based on model dynamic adjustment" dramatically reduces the effects of concept drift in the data stream. FAAD is implemented on the streaming platform Storm to detect anomalies in multi-dimensional log audit data. Compared with existing anomaly detection methods, FAAD performs well in detection rate and speed without being affected by concept drift.
Funding: Supported by the Key Research and Development Program of Shandong Province (the Major Scientific and Technological Innovation Project of Shandong Province) (2020CXGC010114).
Abstract: To perform multi-dimensional data aggregation operations efficiently in edge computing-based Internet of Things (IoT) systems, a new efficient privacy-preserving multi-dimensional data aggregation (EPMDA) scheme is proposed in this paper. The EPMDA scheme is characterized by its use of homomorphic Paillier encryption and the SM9 signature algorithm. To improve the computational efficiency of the Paillier encryption operation, the EPMDA scheme generates a pre-computed modular exponentiation table for each dimension of the data, so that the Paillier encryption operation can be implemented using only several modular multiplications. For multi-dimensional data, the scheme concatenates zeros between two adjacent data dimensions to avoid data overflow in the sum operation on ciphertexts. To enhance security, the EPMDA scheme sets a random number at the high address of the exponent. Moreover, the scheme uses the SM9 signature scheme to guarantee device authentication and data integrity. The performance evaluation and comparison show that the EPMDA scheme is more efficient than existing multi-dimensional data aggregation schemes.
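The zero-padding idea can be sketched on plain integers. The example below is only an illustration under stated assumptions: the bit widths `DATA_BITS` and `PAD_BITS` and the helper names `pack` and `unpack` are hypothetical, and ordinary integer addition stands in for the homomorphic addition of Paillier ciphertexts. The pre-computed exponentiation tables, the random number in the high bits, and the SM9 signatures are not shown.

```python
DATA_BITS = 16                    # assumed width of one reading per dimension
PAD_BITS = 8                      # assumed zero padding; room for sums of up to 2**8 readings
SLOT_BITS = DATA_BITS + PAD_BITS  # each dimension occupies one slot


def pack(values):
    """Place each reading in its own slot, leaving PAD_BITS zeros above it."""
    packed = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << DATA_BITS):
            raise ValueError("reading does not fit in DATA_BITS")
        packed |= v << (i * SLOT_BITS)
    return packed


def unpack(packed, dims):
    """Recover the per-dimension totals from an aggregated integer."""
    mask = (1 << SLOT_BITS) - 1
    return [(packed >> (i * SLOT_BITS)) & mask for i in range(dims)]


# Hypothetical three-dimensional readings from three devices.
readings = [[65535, 10, 300], [65535, 20, 5], [1, 30, 7]]
aggregate = sum(pack(r) for r in readings)  # stand-in for the ciphertext sum
totals = unpack(aggregate, dims=3)          # [131071, 60, 312]
```

Without the padding, the first dimension's sum (131071) would exceed 16 bits and carry into the second dimension's slot; the zeros absorb that carry.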
Funding: Project (ER120001) supported by Development of Application Technology BioNano Super Composites, Korea
Abstract: Fuzzy entropy and a similarity measure on intuitionistic fuzzy sets (IFSs) are proposed and analyzed. Unlike fuzzy sets, IFSs contain an uncertainty called hesitance, which is contained in the fuzzy membership function itself. Hence, designing fuzzy entropy is not easy because many entropy definitions exist. By considering different fuzzy entropy definitions, fuzzy entropy on IFSs is designed and discussed. A similarity measure is also presented, and its usefulness for evaluating the degree of similarity is verified.
Abstract: This paper starts from the untime-diversification of the time-diversification deformation model, gives a displacement distribution model under untime-diversification, and thereby further simplifies the study of the deformation model. The paper discusses the least squares fitting of the coordinate parameter model, that is, of the parameters of the deformation model. In the discussion, cubic B-splines and a two-step fitting of multidimensional disorder data are adopted, which brings the fitted function as close as possible to the coordinate parameter model and makes the calculation easier.
Abstract: This paper gives a brief introduction to a few new indexes and methods published in recent issues of the seismological literature, which have been explored by the authors and many of their collaborators for application in earthquake prediction research. The new indexes include statistical indexes of seismicity (the Morishita index Iδ, the parameter C and the b-value spectrum derived from the magnitude-frequency relation, etc.) and indexes describing the dynamical characteristics of seismic waves obtained from digitized seismological records (waveform linearities, spectral characteristics, etc.). The new methods fall into two categories: methods from the non-linear sciences (fractal analysis, self-similarity and self-organization structure, neural networks) and graphical analysis methods for multi-dimensional data (face analysis, projection pursuit, chronogeometric analysis).
Funding: This work was supported by the Malaysian Ministry of Education (SLAI) and Universiti Teknologi Malaysia (UTM).
Abstract: A holistic understanding of wind behaviour over space, time and height is essential for wind energy harvesting applications. This study presents a novel approach for mapping frequent wind profile patterns using multidimensional sequential pattern mining (MDSPM). The approach is illustrated with a 24-year time series of European Centre for Medium-Range Weather Forecasts European Reanalysis-Interim gridded (0.125°×0.125°) wind data for the Netherlands, sampled every 6 h at six height levels. The wind data were first transformed into two spatio-temporal sequence databases (for speed and direction, respectively). Then, the Linear time Closed Itemset Miner Sequence algorithm was used to extract the multidimensional sequential patterns, which were then visualized using a 3D wind rose, a circular histogram and a geographical map. These patterns were further analysed to determine their wind shear coefficients and turbulence intensities as well as their spatial overlap with areas that currently have wind turbines. Our analysis identified four frequent wind profile patterns. One of them is highly suitable for harvesting wind energy at a height of 128 m, and 68.97% of the geographical area covered by this pattern already contains wind turbines. This study shows that the proposed approach is capable of efficiently extracting meaningful patterns from complex spatio-temporal datasets.
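The abstract does not say how the wind shear coefficient and turbulence intensity were computed. The sketch below assumes the widely used definitions: the power-law shear exponent between two heights, and turbulence intensity as the standard deviation of wind speed divided by its mean. Function names and sample values are hypothetical.

```python
import math
import statistics


def wind_shear_coefficient(v_low, v_high, z_low, z_high):
    """Power-law exponent alpha in v_high / v_low = (z_high / z_low) ** alpha."""
    return math.log(v_high / v_low) / math.log(z_high / z_low)


def turbulence_intensity(speeds):
    """Population standard deviation of wind speed divided by its mean."""
    return statistics.pstdev(speeds) / statistics.fmean(speeds)


# Hypothetical speeds (m/s) at 10 m and 128 m, generated with alpha = 0.2.
v10 = 6.0
v128 = v10 * (128 / 10) ** 0.2
alpha = wind_shear_coefficient(v10, v128, 10, 128)

# Hypothetical short series of speeds at one height.
ti = turbulence_intensity([7.5, 8.0, 8.5, 8.0])
```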
Funding: This research is funded by NASA (National Aeronautics and Space Administration) NCCS and AIST (NNX15AM85G) and the NSF I/UCRC, CSSI, and EarthCube Programs (1338925 and 1835507).
Abstract: Earth observations and model simulations are generating big multidimensional array-based raster data. However, it is difficult to efficiently query these big raster data due to the inconsistency among the geospatial raster data model, the distributed physical data storage model, and the data pipeline in distributed computing frameworks. To efficiently process big geospatial data, this paper proposes a three-layer hierarchical indexing strategy to optimize Apache Spark with the Hadoop Distributed File System (HDFS) in the following aspects: (1) improve I/O efficiency by adopting the chunking data structure; (2) keep the workload balanced and data locality high by building the global index (k-d tree); (3) enable Spark and HDFS to natively support geospatial raster data formats (e.g., HDF4, NetCDF4, GeoTiff) by building the local index (hash table); (4) index the in-memory data to further improve geospatial data queries; (5) develop a data repartition strategy to tune the query parallelism while keeping high data locality. The above strategies are implemented by developing customized RDDs and are evaluated by comparing their performance with that of Spark SQL and SciSpark. The proposed indexing strategy can be applied to other distributed frameworks or cloud-based computing systems to natively support big geospatial data queries with high efficiency.
Funding: National Natural Science Foundation of China, No. 40971102; Knowledge Innovation Project of the Chinese Academy of Sciences, No. KZCX2-YW-322; Special Grant for Postgraduates' Scientific Innovation and Social Practice in 2008
Abstract: Traditional spatial clustering methods suffer from "hard partitioning" and cannot describe the physical characteristics of spatial entities effectively. In view of this, the paper sets forth a general multi-dimensional cloud model, which describes the characteristics of spatial objects more reasonably by following the ideas of non-homogeneity and asymmetry. Based on the classification and demarcation of infrastructure in Zhanjiang, a detailed interpretation of the clustering results is given from three angles: the spatial distribution of the clustering membership degree, a comparative study with Fuzzy C-means, and a coupled analysis with residential land prices. The general multi-dimensional cloud model better reflects the integrated characteristics of spatial objects, reveals the spatial distribution of potential information, and achieves more accurate spatial division in complex circumstances. However, due to the complexity of spatial interactions between geographical entities, the generation of the cloud model is a specific and challenging task.
Abstract: A synthetic lethality (SL) relationship arises when a combination of deficiencies in two genes leads to cell death, whereas a deficiency in either one of the two genes does not. The survival of mutant tumor cells depends on the SL partners of the mutant gene, so cancer cells can be selectively killed by inhibiting the SL partners of the oncogenic genes, whereas normal cells are not killed. Therefore, there is an urgent need to develop more efficient computational methods for identifying SL pairs for targeted cancer therapy. In this paper, we propose a new approach based on similarity fusion to predict SL pairs. Multiple types of gene similarity measures are integrated, and the k-nearest neighbors (k-NN) algorithm is applied to perform the similarity-based classification of gene pairs. As a similarity-based method, our method demonstrated excellent performance in multiple experiments. Besides its effectiveness, the ease of use and extensibility of our method can also make it more widely used in practice.