A large body of research has recently applied statistical and machine learning techniques to vibration-based damage detection. However, the global character inherent to the limited number of modal properties obtained from operational modal analysis may not be appropriate for early damage, which generally has a local character. The present paper aims to detect this type of damage by using static SHM data and by assuming that early damage produces dead load redistribution. To achieve this objective, a data-driven strategy is proposed that combines advanced statistical and machine learning methods such as principal component analysis, symbolic data analysis and cluster analysis. The analysis showed that, under the noise levels measured on site, the proposed strategy is able to automatically detect stiffness reductions in stay cables of at least 1%.
It is well known that the values of symbolic variables may take various forms, such as an interval, a set of stochastic measurements of some underlying pattern, or qualitative multi-values. However, the majority of existing work in symbolic data analysis still focuses on interval values. Although some pioneering work has explored stochastic-pattern-based symbolic data and mixtures of symbolic variables, it still lacks the flexibility and computational efficiency needed to make full use of the distinctive individual symbolic variables. Therefore, we propose a novel hierarchical clustering method with a weighted general Jaccard distance and an effective global pruning strategy for complex symbolic data, and apply it to emitter identification. Extensive experiments indicate that our method outperforms its peers in both computational efficiency and emitter identification accuracy.
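The abstract does not define its weighted general Jaccard distance, so the following is only a generic sketch of the idea, under the assumption that each symbolic variable is either a closed interval or a set of categories and that per-variable distances are combined as a weighted average. The function names (`interval_jaccard`, `set_jaccard`, `weighted_jaccard`) and the weighting scheme are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Sequence, Tuple, Union

Interval = Tuple[float, float]           # closed interval (low, high), low <= high
SymbolicValue = Union[Interval, FrozenSet[str]]


def interval_jaccard(a: Interval, b: Interval) -> float:
    """Jaccard distance between two intervals: 1 - |overlap| / |union|."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - overlap
    if union == 0:                        # both intervals are single points
        return 0.0 if a == b else 1.0
    return 1.0 - overlap / union


def set_jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard distance between two category sets."""
    if not a and not b:                   # two empty sets are treated as identical
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def weighted_jaccard(x: Sequence[SymbolicValue],
                     y: Sequence[SymbolicValue],
                     weights: Sequence[float]) -> float:
    """Weighted average of per-variable Jaccard distances (illustrative only)."""
    total = 0.0
    for xi, yi, w in zip(x, y, weights):
        if isinstance(xi, frozenset):
            total += w * set_jaccard(xi, yi)
        else:
            total += w * interval_jaccard(xi, yi)
    return total / sum(weights)


# Two hypothetical symbolic objects: one interval variable, one categorical variable.
obj_a = [(0.0, 2.0), frozenset({"a", "b"})]
obj_b = [(1.0, 3.0), frozenset({"b", "c"})]
distance = weighted_jaccard(obj_a, obj_b, weights=[1.0, 1.0])
```

A pairwise distance of this kind could be passed to any standard agglomerative clustering routine; the pruning strategy mentioned in the abstract is not sketched here.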
This paper examines the visualization of symbolic data and considers the challenges arising from its complex structure. Symbolic data is usually aggregated from large data sets and is used to hide entry-specific details and to transform huge amounts of data (such as big data) into analyzable quantities. It is also used to offer an overview where general trends are more important than individual details. Symbolic data comes in many forms, such as intervals, histograms, categories and modal multi-valued objects. Symbolic data can also be considered as a distribution. Currently, the de facto visualization approach for symbolic data is zoomstars, which has many limitations. The biggest limitation is that the default distributions (histograms) are not supported in 2D, as an additional dimension is required. This paper proposes several improvements to zoomstars that would enable it to visualize histograms in 2D by using a quantile or an equivalent interval approach. In addition, several improvements for categorical and modal variables are proposed to indicate the presented categories more clearly. Recommendations for different approaches to zoomstars are offered depending on the data type and the desired goal. Furthermore, an alternative approach called shape encoding, which allows the whole data set to be visualized in a comprehensive table-like graph, is proposed. These visualizations and their usefulness are verified with three symbolic data sets in the exploratory data mining phase to identify trends, similar objects and important features, and to detect outliers and discrepancies in the data.
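One way to read the quantile approach is that a histogram-valued variable is reduced to an interval between two quantiles, which can then be drawn on a single 2D axis. The sketch below illustrates that reduction only; it assumes values are uniformly distributed within each bin, and the helper names (`histogram_quantile`, `quantile_interval`) and the default quartile choice are hypothetical rather than taken from the paper.

```python
from typing import Sequence, Tuple


def histogram_quantile(edges: Sequence[float],
                       counts: Sequence[float],
                       q: float) -> float:
    """Estimate the q-quantile of a histogram by linear interpolation in a bin."""
    if len(edges) != len(counts) + 1:
        raise ValueError("edges must have exactly one more entry than counts")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one observation")

    target = q * total                    # cumulative count to reach
    cumulative = 0.0
    for left, right, count in zip(edges, edges[1:], counts):
        if count > 0 and cumulative + count >= target:
            # Assumption: observations are spread uniformly inside the bin.
            return left + (target - cumulative) / count * (right - left)
        cumulative += count
    return edges[-1]


def quantile_interval(edges: Sequence[float],
                      counts: Sequence[float],
                      lower: float = 0.25,
                      upper: float = 0.75) -> Tuple[float, float]:
    """Reduce a histogram to the interval between two quantiles."""
    return (histogram_quantile(edges, counts, lower),
            histogram_quantile(edges, counts, upper))


# Hypothetical histogram with three bins: [0, 10), [10, 20), [20, 30].
edges = [0.0, 10.0, 20.0, 30.0]
counts = [1, 2, 1]
median = histogram_quantile(edges, counts, 0.5)
interquartile = quantile_interval(edges, counts)
```

The resulting interval can be plotted like any interval-valued variable, at the cost of discarding the shape of the distribution inside it.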
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61771177 and 61701454, the Natural Science Foundation of Jiangsu Province of China under Grant Nos. BK20160147 and BK20160148, and the Academy Project of Finland under Grant No. 310321.
Funding: Supported by the National Natural Science Foundation of China (NSFC) under grant Nos. 60673044 and 60633010, and the National High-Tech Program (863) under grant No. 2006AA01Z402.