Funding: supported by the New Century Excellent Talents in University (NCET-09-0396), the National Science & Technology Key Projects of Numerical Control (2012ZX04014-031), the Natural Science Foundation of Hubei Province (2011CDB279), and the Foundation for Innovative Research Groups of the Natural Science Foundation of Hubei Province, China (2010CDA067).
Abstract: As castings become more complicated and the precision demanded of numerical simulation increases, the data produced by casting numerical simulation grow ever more massive. On a general personal computer, these massive data may exceed the available memory, causing rendering to fail. Based on the out-of-core technique, this paper proposes a method that effectively utilizes external storage and dramatically reduces memory usage, so as to solve the problem of insufficient memory for massive-data rendering on general personal computers. A new post-processor is developed on this basis. It can illustrate the filling and solidification processes of a casting as well as thermal stress, and it provides fast interaction with the simulation results. Theoretical analysis and several practical examples show that the memory usage and loading time of the post-processor depend not on the total size of the result files but on the proportion of cells lying on the surface. Meanwhile, the speed of rendering and of fetching values at the mouse position is satisfactory, and the requirements of real-time interaction are met.
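The out-of-core idea described above can be illustrated with a minimal sketch: instead of loading an entire result file into memory, only the records for surface cells (the cells that are actually rendered) are pulled in, chunk by chunk, through a memory map. The file layout, record type, and chunk size below are assumptions made for illustration only, not the paper's file format.

```python
import numpy as np

# Hypothetical layout: one float32 value per cell, stored flat on disk.
def load_surface_values(result_path, surface_ids, chunk=1_000_000):
    """Stream a large simulation result from disk, keeping only surface-cell values."""
    values = np.memmap(result_path, dtype=np.float32, mode="r")   # no full load into RAM
    surface_ids = np.asarray(surface_ids)
    out = np.empty(surface_ids.size, dtype=np.float32)
    for start in range(0, values.size, chunk):
        stop = min(start + chunk, values.size)
        mask = (surface_ids >= start) & (surface_ids < stop)
        ids = surface_ids[mask]
        out[mask] = values[start:stop][ids - start]               # copy only the needed records
    return out

# Demo with a small temporary file standing in for a multi-gigabyte result file.
import tempfile, os
tmp = os.path.join(tempfile.mkdtemp(), "temperature.bin")
np.arange(10_000, dtype=np.float32).tofile(tmp)
print(load_surface_values(tmp, [0, 123, 9_999], chunk=4_096))
```

Because only the surface-cell records are ever materialized, peak memory scales with the number of surface cells rather than with the size of the whole result file, which matches the behavior reported in the abstract.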
Abstract: The Chang'e-3 (CE-3) mission is China's first exploration mission on the surface of the Moon that uses a lander and a rover. The eight instruments that form the scientific payloads have the following objectives: (1) investigate the morphological features and geological structures at the landing site; (2) perform integrated in-situ analysis of minerals and chemical compositions; (3) carry out integrated exploration of the structure of the lunar interior; (4) explore the lunar-terrestrial space environment and the lunar surface environment, and acquire Moon-based ultraviolet astronomical observations. The Ground Research and Application System (GRAS) is in charge of data acquisition and pre-processing, management of the payload in orbit, and managing the data products and their applications. The Data Pre-processing Subsystem (DPS) is a part of GRAS. The task of the DPS is the pre-processing of raw data from the eight CE-3 instruments, including channel processing, unpacking, package sorting, calibration and correction, identification of geographical location, calculation of the probe azimuth angle, probe zenith angle, solar azimuth angle and solar zenith angle, and quality checks. These processes produce Level 0, Level 1 and Level 2 data. The computing platform of this subsystem is a high-performance computing cluster, comprising a real-time subsystem used for processing Level 0 data and a post-time subsystem for generating Level 1 and Level 2 data. This paper describes the CE-3 data pre-processing method, the data pre-processing subsystem, data classification, data validity and the data products that are used for scientific studies.
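As one concrete example of the geometric quantities listed above, the solar zenith angle at a surface point follows from the latitude, the solar declination, and the hour angle via the standard spherical-astronomy relation cos θz = sin φ sin δ + cos φ cos δ cos h. The sketch below assumes the declination and hour angle are already available from an ephemeris; it is an illustration of the formula, not the DPS implementation.

```python
import math

def solar_zenith_azimuth(lat_deg, decl_deg, hour_angle_deg):
    """Solar zenith and azimuth angles (degrees) from latitude, solar declination,
    and hour angle, using the standard spherical-astronomy formulas."""
    phi, delta, h = map(math.radians, (lat_deg, decl_deg, hour_angle_deg))
    cos_z = math.sin(phi) * math.sin(delta) + math.cos(phi) * math.cos(delta) * math.cos(h)
    zenith = math.acos(max(-1.0, min(1.0, cos_z)))
    # Azimuth measured from north, positive toward east.
    az = math.atan2(-math.sin(h),
                    math.tan(delta) * math.cos(phi) - math.sin(phi) * math.cos(h))
    return math.degrees(zenith), math.degrees(az) % 360.0

# Illustrative values: a site at roughly 44 deg N, declination -10 deg, hour angle 30 deg.
print(solar_zenith_azimuth(44.0, -10.0, 30.0))
```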
Abstract: In order to obtain high-precision GPS control point results and provide high-precision known points for various projects, this study uses several mature GPS post-processing software packages to process the observation data of the GPS control network of Guanyinge Reservoir and compares the results they produce. Based on the test results, the reasons for the accuracy differences between the packages are analyzed, and the optimal results are identified through this comparison. The purpose of this paper is to provide a useful reference for GPS software users when processing data.
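A comparison of this kind typically reduces to differencing the coordinate solutions produced by each package against a chosen reference solution and summarizing the discrepancies. The sketch below assumes each solution is a simple point-name to (X, Y, Z) mapping; the point names and coordinate values are illustrative only, not results from the Guanyinge network.

```python
import math

def compare_solutions(reference, candidate):
    """RMS of 3-D coordinate differences (same units as input) over common points."""
    common = sorted(set(reference) & set(candidate))
    sq = [sum((reference[p][i] - candidate[p][i]) ** 2 for i in range(3)) for p in common]
    return math.sqrt(sum(sq) / len(sq))

# Illustrative solutions from two hypothetical software packages (metres).
sol_a = {"GY01": (-2148390.412, 4426541.233, 4044650.901)}
sol_b = {"GY01": (-2148390.418, 4426541.229, 4044650.907)}
print(f"RMS difference: {compare_solutions(sol_a, sol_b) * 1000:.1f} mm")
```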
Abstract: The present study aims to improve the efficiency of typical procedures used for post-processing flow-field data by applying neural-network technology. Taking an aircraft-design problem as the workhorse, an FCN-VGG19 regression model for processing aircraft flow data is elaborated based on VGGNet (Visual Geometry Group Net) and FCN (Fully Convolutional Network) techniques. As shown by the results, the model displays a strong fitting ability, with almost no over-fitting during training, and it has good accuracy and convergence. For different input data and different grids, the model essentially achieves convergence and performs well. It is shown that the proposed FCN-based simulation regression model has great potential for typical problems of computational fluid dynamics (CFD) and the related data processing.
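The FCN-VGG19 architecture itself is not specified in the abstract, but the general pattern of a fully convolutional regression network — a convolutional encoder, no dense layers, and a per-pixel output of a flow quantity — can be sketched as follows. The layer sizes and channel counts are placeholders, not the authors' configuration.

```python
import torch
import torch.nn as nn

class TinyFCNRegressor(nn.Module):
    """Minimal fully convolutional regressor: image-like input -> per-pixel field."""
    def __init__(self, in_ch=3, out_ch=1):
        super().__init__()
        self.encoder = nn.Sequential(                     # VGG-style stacked 3x3 convolutions
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(                     # upsample back to the input size
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.Conv2d(32, out_ch, 1),                     # regression head, no activation
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyFCNRegressor()
x = torch.randn(2, 3, 64, 64)           # e.g. geometry / flow-condition channels
pred = model(x)                          # predicted field, shape (2, 1, 64, 64)
print(pred.shape)
```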
Funding: supported in part by NIH grants R01NS39600, U01MH114829, and RF1MH128693 (to GAA).
Abstract: Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters, based on the fundamental principle that cells must differ more between clusters than within them. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as a two-dimensional matrix of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
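The core idea — keep splitting a cluster only while cells differ more between the candidate subclusters than within them — can be sketched with standard tools. The test statistic and significance threshold below are illustrative choices, not the authors' published criterion.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform
from scipy.stats import mannwhitneyu

def split_is_justified(data, alpha=0.01):
    """Split a cluster in two and test whether between-subcluster distances
    exceed within-subcluster distances (one-sided Mann-Whitney U)."""
    labels = fcluster(linkage(data, method="ward"), t=2, criterion="maxclust")
    d = squareform(pdist(data))
    same = labels[:, None] == labels[None, :]
    iu = np.triu_indices(len(data), k=1)
    within, between = d[iu][same[iu]], d[iu][~same[iu]]
    if within.size == 0 or between.size == 0:
        return False
    _, p = mannwhitneyu(between, within, alternative="greater")
    return p < alpha

rng = np.random.default_rng(0)
two_groups = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(4, 1, (30, 5))])
print(split_is_justified(two_groups))   # True: the two populations really do differ
```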
Funding: Supported by the National Natural Science Foundation of China (60173051), the Teaching and Research Award Program for Outstanding Young Teachers in Higher Education Institutions of the Ministry of Education of China, and the Liaoning Province Higher Education Research Foundation (20040206).
Abstract: Visual data mining is one of the important approaches among data mining techniques. Most existing methods are based on computer graphics techniques, but few exploit image-processing techniques. This paper proposes an image-processing method, named RNAM (resemble neighborhood averaging method), to facilitate visual data mining; it is used to post-process the data mining result image and helps users discover significant features and useful patterns effectively. The experiments show that the method is intuitive, easy to understand and effective. It provides a new approach for visual data mining.
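The details of RNAM are not reproduced in the abstract, but the underlying operation — smoothing a result image by averaging each pixel with its neighborhood so that coherent regions stand out from noise — can be sketched as follows; the window size is an assumed parameter.

```python
import numpy as np

def neighborhood_average(img, size=3):
    """Replace each pixel by the mean of its size x size neighborhood (edge-padded)."""
    pad = size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

# A noisy "result image" with a bright block; averaging makes the block easier to spot.
rng = np.random.default_rng(1)
img = rng.random((64, 64))
img[20:40, 20:40] += 1.0
smoothed = neighborhood_average(img, size=5)
print(img.std(), smoothed.std())
```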
Funding: financial support from the Joint Research Fund in Astronomy under a cooperative agreement between the National Natural Science Foundation of China and the Chinese Academy of Sciences (Grant No. U1631242), the National Natural Science Foundation of China (Grant Nos. 11503028 and 11403028), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB23040400), and the National Basic Research Program (973 Program) of China (2014CB845800).
Abstract: POLAR is a compact space-borne detector initially designed to measure the polarization of hard X-rays emitted from Gamma-Ray Bursts in the energy range 50–500 keV. The instrument was launched successfully onboard the Chinese space laboratory Tiangong-2 (TG-2) on 2016 September 15. After it was switched on a few days later, tens of gigabytes of raw detection data were produced in orbit by POLAR and transferred to the ground every day. Before the launch date, a full pipeline and the related software were designed and developed to quickly pre-process all the raw data from POLAR, which include both science data and engineering data, and then to generate the high-level scientific data products that are suitable for later science analysis. This pipeline has been successfully used by the POLAR Science Data Center at the Institute of High Energy Physics (IHEP) since POLAR was launched and switched on. A detailed introduction to the pipeline and some of the core relevant algorithms is presented in this paper.
Funding: supported by the Sharing and Diffusion of National R&D Outcome program funded by the Korea Institute of Science and Technology Information.
Abstract: A distributed/parallel-processing system such as Sun Grid Engine (SGE), which utilizes multiple nodes/cores, is proposed for faster processing of large satellite image data. After verification, the pre-processing performance in the distributed processing environment improved by up to 560.65% compared with a single-processing system. In this way, analysis performance in various fields can be improved and, moreover, near-real-time service can be achieved in the near future.
Abstract: There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the worldwide epidemiology of chronic kidney disease, renal oncology, and hypertension. However, there remains a need to create a framework to guide how synthetic data can be better utilized as a practical tool in this research.
Funding: Key Science and Technology Project of the Shanghai Committee of Science and Technology, China (No. 06dz1200921); Major Basic Research Project of the Shanghai Committee of Science and Technology (No. 08JC1400100); Shanghai Talent Developing Foundation, China (No. 001); and the Specialized Foundation for Excellent Talent of Shanghai, China.
Abstract: There are a number of dirty data in the observation data sets derived from an integrated ocean observing network system. Thus, the data must be carefully and reasonably processed before they are used for forecasting or analysis. This paper proposes a data pre-processing model based on intelligent algorithms. Firstly, we introduce the integrated network platform of ocean observation. Next, the pre-processing model of the data is presented, and an intelligent data cleaning model is proposed. Based on fuzzy clustering, the Kohonen clustering network is improved to fulfill the parallel calculation of fuzzy c-means clustering. The proposed dynamic algorithm can automatically find the new clustering center with the updated sample data. The rapid and dynamic performance of the model makes it suitable for real-time calculation, and the efficiency and accuracy of the model are proved by test results from observation data analysis.
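The fuzzy c-means step at the heart of the cleaning model alternates between updating fuzzy memberships and cluster centers until the centers stabilize. A minimal sketch of the standard algorithm (not the paper's parallel Kohonen variant) is shown below, with the fuzzifier m and the cluster count as assumed parameters.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, iters=100, tol=1e-5, seed=0):
    """Plain fuzzy c-means: returns cluster centers and the membership matrix U (n x c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                       # memberships sum to 1 per sample
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]        # fuzzily weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        ratios = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
        U_new = 1.0 / ratios.sum(axis=2)                    # u_ik = 1 / sum_j (d_ik/d_ij)^(2/(m-1))
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

# Toy "observation" data with two regimes; outliers end up with ambiguous memberships.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(5, 0.5, (100, 2))])
centers, U = fuzzy_c_means(X, c=2)
print(centers)
```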
Funding: This paper is supported by the National Natural Science Foundation of China (No. 40371107).
Abstract: In order to evaluate radiometric normalization techniques, two image normalization algorithms for absolute radiometric correction of Landsat imagery were quantitatively compared in this paper: the Illumination Correction Model proposed by Markham and Irish, and the Illumination and Atmospheric Correction Model developed by the Remote Sensing and GIS Laboratory of Utah State University. Relative noise, correlation coefficient and slope value were used as the criteria for the evaluation and comparison; they were derived from pseudo-invariant features identified in multitemporal Landsat image pairs of the Xiamen (厦门) and Fuzhou (福州) areas, both located in eastern Fujian (福建) Province, China. Compared with the unnormalized images, the radiometric differences between the normalized multitemporal images were significantly reduced when the seasons of the multitemporal images were different. However, there was no significant difference between the normalized and unnormalized images under similar seasonal conditions. Furthermore, the correction results of the two algorithms are similar when the images are relatively clear with a uniform atmospheric condition. Therefore, the radiometric normalization procedures should be carried out if the multitemporal images have a significant seasonal difference.
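The evaluation criteria named above (slope and correlation between the two dates over pseudo-invariant features) can be computed directly from the PIF pixel values. A minimal sketch follows, under the assumption that the PIF digital numbers for one band have already been extracted from both dates; the "relative noise" measure is one plausible definition, not necessarily the paper's.

```python
import numpy as np

def pif_comparison(dn_date1, dn_date2):
    """Slope, intercept, correlation, and a relative-noise measure for a PIF regression (one band)."""
    slope, intercept = np.polyfit(dn_date1, dn_date2, deg=1)
    r = np.corrcoef(dn_date1, dn_date2)[0, 1]
    residual = dn_date2 - (slope * dn_date1 + intercept)
    relative_noise = residual.std() / dn_date2.mean()
    return slope, intercept, r, relative_noise

# Synthetic pseudo-invariant-feature values, for illustration only.
rng = np.random.default_rng(2)
d1 = rng.uniform(40, 200, 300)
d2 = 0.92 * d1 + 6 + rng.normal(0, 2, 300)
print(pif_comparison(d1, d2))
```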
Funding: supported by China's National Natural Science Foundation (Nos. 62072249, 62072056). This work is also funded by the National Science Foundation of Hunan Province (2020JJ2029).
Abstract: With the development of Industry 4.0 and big data technology, the Industrial Internet of Things (IIoT) is hampered by inherent issues such as privacy, security, and fault tolerance, which pose certain challenges to its rapid development. Blockchain technology offers immutability, decentralization, and autonomy, which can greatly remedy the inherent defects of the IIoT. In a traditional blockchain, data are stored in a Merkle tree. As the data continue to grow, the proofs used to validate them grow as well, threatening the efficiency, security, and reliability of blockchain-based IIoT. Accordingly, this paper first analyzes the inefficiency of the traditional blockchain structure in verifying the integrity and correctness of data. To solve this problem, a new Vector Commitment (VC) structure, Partition Vector Commitment (PVC), is proposed by improving the traditional VC structure. Secondly, this paper uses PVC instead of the Merkle tree to store the big data generated by IIoT; PVC improves on the efficiency of traditional VC in the commitment and opening processes. Finally, this paper uses PVC to build a blockchain-based IIoT data security storage mechanism and carries out a comparative experimental analysis. The mechanism can greatly reduce communication loss and maximize the rational use of storage space, which is of great significance for maintaining the security and stability of blockchain-based IIoT.
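For context on why proof size matters here: in the Merkle-tree baseline, proving that one data block belongs to the committed set requires a path of sibling hashes whose length grows logarithmically with the number of blocks. The sketch below shows that baseline (not the proposed PVC scheme), so the growth the paper targets is concrete.

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root_and_proof(leaves, index):
    """Root of a Merkle tree over `leaves` and the sibling-hash proof for leaves[index]."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:                      # duplicate the last node on odd-sized levels
            level.append(level[-1])
        proof.append(level[index ^ 1])          # sibling of the current node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

blocks = [f"iiot-record-{i}".encode() for i in range(1024)]
root, proof = merkle_root_and_proof(blocks, 37)
print(len(proof))    # 10 sibling hashes for 1024 leaves: proof size grows as log2(n)
```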
Abstract: In order to address the problems of a single encryption algorithm, such as low encryption efficiency and unreliable metadata, for static data storage on big data platforms in the cloud computing environment, we propose a Hadoop-based big data secure storage scheme. Firstly, in order to disperse the NameNode service from a single server to multiple servers, we combine the HDFS federation and HDFS high-availability mechanisms, and use the Zookeeper distributed coordination mechanism to coordinate the nodes and achieve dual-channel storage. Then, we improve the ECC encryption algorithm for the encryption of ordinary data and adopt a homomorphic encryption algorithm to encrypt data that need to be computed on. To accelerate encryption, we adopt a dual-thread encryption mode. Finally, the HDFS control module is designed to combine the encryption algorithms with the storage model. Experimental results show that the proposed solution solves the problem of a single point of failure of metadata, performs well in terms of metadata reliability, and can realize server fault tolerance. The improved encryption algorithm, integrated with the dual-channel storage mode, improves encryption storage efficiency by 27.6% on average.
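The dual-thread idea — routing ordinary files to one cipher and computation-bound files to a homomorphic cipher, with the two paths encrypted concurrently — can be sketched with a thread pool and placeholder encryption functions. Both `encrypt_ecc` and `encrypt_homomorphic` below are stand-ins for illustration, not the paper's algorithms.

```python
from concurrent.futures import ThreadPoolExecutor

def encrypt_ecc(name: str, data: bytes) -> bytes:
    """Placeholder for the improved ECC cipher applied to ordinary data."""
    return bytes(b ^ 0x5A for b in data)          # stand-in transformation only

def encrypt_homomorphic(name: str, data: bytes) -> bytes:
    """Placeholder for the homomorphic cipher applied to data that must be computed on."""
    return bytes(b ^ 0xA5 for b in data)          # stand-in transformation only

def encrypt_batch(files):
    """files: list of (name, data, needs_computation); two worker threads, one per channel."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            name: pool.submit(encrypt_homomorphic if comp else encrypt_ecc, name, data)
            for name, data, comp in files
        }
        return {name: fut.result() for name, fut in futures.items()}

out = encrypt_batch([("log.bin", b"ordinary", False), ("meter.bin", b"sum-me", True)])
print({k: len(v) for k, v in out.items()})
```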
Funding: This research was financially supported by the Ministry of Trade, Industry, and Energy (MOTIE), Korea, under the "Project for Research and Development with Middle Markets Enterprises and DNA (Data, Network, AI) Universities" (AI-based Safety Assessment and Management System for Concrete Structures) (Reference Number P0024559), supervised by the Korea Institute for Advancement of Technology (KIAT).
Abstract: Time-series data provide important information in many fields, and their processing and analysis have been the focus of much research. However, detecting anomalies is very difficult due to data imbalance, temporal dependence, and noise. Therefore, methodologies for data augmentation and for converting time-series data into images for analysis have been studied. This paper proposes a fault detection model that uses time-series data augmentation and transformation to address the problems of data imbalance, temporal dependence, and robustness to noise. Data augmentation is performed by adding noise: Gaussian noise with the noise level set to 0.002 is added to maximize the generalization performance of the model. In addition, we use the Markov Transition Field (MTF) method to effectively visualize the dynamic transitions of the data while converting the time-series data into images. This enables the identification of patterns in time-series data and assists in capturing their sequential dependencies. For anomaly detection, the PatchCore model is applied and shows excellent performance, and the detected anomalous areas are represented as heat maps. By applying an anomaly map to the original image, it is possible to capture the areas where anomalies occur. The performance evaluation shows that both F1-score and accuracy are high when the time-series data are converted to images. Additionally, when the data are processed as images rather than as time series, there is a significant reduction in both the data size and the training time. The proposed method can provide an important springboard for research on anomaly detection using time-series data, and it helps make the analysis of complex data patterns lightweight.
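The two pre-processing steps described above are straightforward to sketch: Gaussian-noise augmentation at the stated level of 0.002, and a basic Markov Transition Field that bins the series into quantile states, estimates the state-transition matrix, and spreads those probabilities over all pairs of time indices. The bin count and series length are assumed values.

```python
import numpy as np

def augment_with_noise(series, level=0.002, seed=0):
    """Additive Gaussian noise, std = level, as in the described augmentation step."""
    rng = np.random.default_rng(seed)
    return series + rng.normal(0.0, level, size=series.shape)

def markov_transition_field(series, n_bins=8):
    """Basic MTF: quantile-bin the series, estimate transition probabilities,
    then MTF[i, j] = P(bin(x_i) -> bin(x_j))."""
    edges = np.quantile(series, np.linspace(0, 1, n_bins + 1)[1:-1])
    states = np.digitize(series, edges)                        # state of every time step
    trans = np.zeros((n_bins, n_bins))
    for a, b in zip(states[:-1], states[1:]):
        trans[a, b] += 1
    trans /= np.maximum(trans.sum(axis=1, keepdims=True), 1)   # row-normalise
    return trans[states[:, None], states[None, :]]             # (T, T) image

x = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.05 * np.random.default_rng(1).normal(size=256)
img = markov_transition_field(augment_with_noise(x))
print(img.shape)    # (256, 256) image ready for an image-based detector such as PatchCore
```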
Funding: Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government (Grant No. 20214000000140, Graduate School of Convergence for Clean Energy Integrated Power Generation); Korea Basic Science Institute (National Research Facilities and Equipment Center) grant funded by the Ministry of Education (2021R1A6C101A449); and the National Research Foundation of Korea grant funded by the Ministry of Science and ICT (2021R1A2C1095139), Republic of Korea.
Abstract: Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition. This characteristic results in a diverse range of flow curves that vary with the deformation condition. This study proposes a novel approach for accurately predicting the anisotropic deformation behavior of wrought Mg alloys using machine learning (ML) with data augmentation. The developed model combines four key strategies from data science: learning the entire flow curves, generative adversarial networks (GAN), algorithm-driven hyperparameter tuning, and a gated recurrent unit (GRU) architecture. The proposed model, namely GAN-aided GRU, was extensively evaluated for various predictive scenarios, such as interpolation, extrapolation, and a limited dataset size. The model exhibited significant predictability and improved generalizability for estimating the anisotropic compressive behavior of ZK60 Mg alloys under 11 annealing conditions and for three loading directions. The GAN-aided GRU results were superior to those of previous ML models and constitutive equations. The superior performance was attributed to hyperparameter optimization, GAN-based data augmentation, and the inherent predictivity of the GRU for extrapolation. As a first attempt to employ ML techniques other than artificial neural networks, this study proposes a novel perspective on predicting the anisotropic deformation behaviors of wrought Mg alloys.
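The GRU part of such a model maps a sequence of deformation conditions and strain increments to the corresponding stress values along a flow curve. A minimal GRU regressor of that shape is sketched below; the input features, layer sizes, and training details are placeholders, not the GAN-aided GRU configuration from the study.

```python
import torch
import torch.nn as nn

class FlowCurveGRU(nn.Module):
    """Sequence-to-sequence regressor: condition features per strain step -> stress per step."""
    def __init__(self, n_features=4, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, steps, n_features)
        out, _ = self.gru(x)
        return self.head(out).squeeze(-1)  # predicted stress: (batch, steps)

model = FlowCurveGRU()
# Assumed features, e.g. [strain, annealing temperature, annealing time, loading-direction code].
x = torch.randn(8, 50, 4)
stress = model(x)
loss = nn.functional.mse_loss(stress, torch.randn(8, 50))   # placeholder target flow curves
loss.backward()
print(stress.shape)
```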
Abstract: There are challenges in the reliability evaluation of insulated gate bipolar transistors (IGBT) on electric vehicles, such as junction temperature measurement and limited computational and storage resources. In this paper, a junction temperature estimation approach based on a neural network, requiring no additional hardware cost, is proposed, and the lifetime calculation for the IGBT using electric vehicle big data is performed. The direct current (DC) voltage, operating current, switching frequency, negative temperature coefficient (NTC) thermistor temperature and IGBT lifetime are the inputs, and the junction temperature (T_j) is the output. With the rainflow counting method, the classified irregular temperature cycles are fed into the lifetime model to obtain the numbers of cycles to failure. The fatigue accumulation method is then used to calculate the IGBT lifetime. To work around the limited computational and storage resources of electric vehicle controllers, the IGBT lifetime calculation runs on a big data platform; the lifetime is then transmitted wirelessly to the electric vehicles as an input for the neural network. Thus the junction temperature of the IGBT under long-term operating conditions can be accurately estimated. A test platform combining the motor controller with the vehicle big data server is built for the IGBT accelerated aging test. Subsequently, IGBT lifetime predictions are derived from the junction temperature estimated by the neural network method and by the thermal network method. The experiment shows that the lifetime prediction based on a neural network with big data achieves higher accuracy than that of the thermal network, which improves the reliability evaluation of the system.
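The lifetime step described above chains three standard pieces: rainflow counting of the junction-temperature history into thermal cycles, a cycles-to-failure model for each cycle amplitude, and linear fatigue (Miner's rule) accumulation. The sketch below assumes a Coffin-Manson-type life model with made-up coefficients and takes already-counted cycles as input; it illustrates the accumulation step, not the paper's fitted model.

```python
def cycles_to_failure(delta_T, a=3.0e14, n=5.0):
    """Coffin-Manson-type life model: N_f = a * (delta_T)^(-n). Coefficients are illustrative."""
    return a * delta_T ** (-n)

def accumulated_damage(counted_cycles):
    """Miner's rule: damage D = sum(n_i / N_f_i); D >= 1 indicates end of life."""
    return sum(n_i / cycles_to_failure(dT) for dT, n_i in counted_cycles)

# (delta_T in K, cycle count) pairs, e.g. produced by rainflow counting of the T_j history.
history = [(20.0, 5.0e5), (40.0, 8.0e4), (60.0, 1.0e4)]
D = accumulated_damage(history)
print(f"accumulated damage: {D:.3f}  ->  remaining-life fraction: {max(0.0, 1.0 - D):.3f}")
```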
Funding: supported by the Meteorological Soft Science Project (Grant No. 2023ZZXM29), the Natural Science Fund Project of Tianjin, China (Grant No. 21JCYBJC00740), and the Key Research and Development-Social Development Program of Jiangsu Province, China (Grant No. BE2021685).
Abstract: As the risks associated with air turbulence are intensified by climate change and the growth of the aviation industry, it has become imperative to monitor and mitigate these threats to ensure civil aviation safety. The eddy dissipation rate (EDR) has been established as the standard metric for quantifying turbulence in civil aviation. This study aims to explore a universally applicable symbolic classification approach based on genetic programming to detect turbulence anomalies using quick access recorder (QAR) data. The detection of atmospheric turbulence is approached as an anomaly detection problem. Comparative evaluations demonstrate that this approach performs on par with direct EDR calculation methods in identifying turbulence events. Moreover, comparisons with alternative machine learning techniques indicate that the proposed technique is the optimal methodology currently available. In summary, the use of symbolic classification via genetic programming enables accurate turbulence detection from QAR data, comparable to that with established EDR approaches and surpassing that achieved with machine learning algorithms. This finding highlights the potential of integrating symbolic classifiers into turbulence monitoring systems to enhance civil aviation safety amidst rising environmental and operational hazards.
Funding: supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021QD032).
Abstract: Since the impoundment of the Three Gorges Reservoir (TGR) in 2003, numerous slopes have experienced noticeable movement or destabilization owing to reservoir level changes and seasonal rainfall. One case is the Outang landslide, a large-scale and active landslide on the south bank of the Yangtze River. The latest monitoring data and site investigations available are analyzed to establish the spatial and temporal deformation characteristics of the landslide. Data mining technology, including two-step clustering and the Apriori algorithm, is then used to identify the dominant triggers of landslide movement. In the data mining process, the two-step clustering method groups the candidate triggers and the displacement rate into several clusters, and the Apriori algorithm generates correlation criteria for cause and effect. The analysis considers multiple locations on the landslide and incorporates two time scales: long-term deformation on a monthly basis and short-term deformation on a daily basis. This analysis shows that the deformation of the Outang landslide is driven by both rainfall and reservoir water, and that it varies spatiotemporally mainly because of differences in local responses to hydrological factors. The data mining results reveal different dominant triggering factors depending on the monitoring frequency: the monthly and bi-monthly cumulative rainfall control the monthly deformation, while the 10-day cumulative rainfall and the 5-day cumulative drop of the reservoir water level dominate the daily deformation of the landslide. It is concluded that the spatiotemporal deformation pattern and the data mining rules associated with precipitation and reservoir water level have the potential to be broadly implemented for improving landslide prevention and control in dam reservoirs and other landslide-prone areas.
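The Apriori step amounts to counting how often discretized trigger states (for example, "high 10-day rainfall") co-occur with discretized displacement states (for example, "fast movement") and keeping the rules whose support and confidence clear chosen thresholds. A minimal single-antecedent version is sketched below, with invented category labels and thresholds rather than the study's monitoring data.

```python
from collections import Counter

# Each record: set of discretized states observed in the same monitoring period (labels invented).
records = [
    {"rain10d=high", "drop5d=high", "rate=fast"},
    {"rain10d=high", "drop5d=low",  "rate=fast"},
    {"rain10d=low",  "drop5d=low",  "rate=slow"},
    {"rain10d=high", "drop5d=high", "rate=fast"},
    {"rain10d=low",  "drop5d=high", "rate=slow"},
]

def single_antecedent_rules(records, target="rate=fast", min_support=0.3, min_conf=0.6):
    """Rules of the form {trigger} -> target that meet support and confidence thresholds."""
    n = len(records)
    item_counts = Counter(item for r in records for item in r if item != target)
    rules = []
    for item, count in item_counts.items():
        both = sum(1 for r in records if item in r and target in r)
        support, confidence = both / n, both / count
        if support >= min_support and confidence >= min_conf:
            rules.append((item, target, round(support, 2), round(confidence, 2)))
    return rules

print(single_antecedent_rules(records))
```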
Funding: This work was supported by the general program (No. 1177531) and joint funding (No. U2067205) from the National Natural Science Foundation of China.
Abstract: A benchmark experiment on ^(238)U slab samples was conducted using a deuterium-tritium neutron source at the China Institute of Atomic Energy. The leakage neutron spectra in the 0.8-16 MeV energy range at 60° and 120° were measured using the time-of-flight method. The samples were prepared as rectangular slabs with a 30 cm square base and thicknesses of 3, 6, and 9 cm. The leakage neutron spectra were also calculated with the MCNP-4C program, based on the latest evaluated ^(238)U neutron data files from CENDL-3.2, ENDF/B-VIII.0, JENDL-5.0, and JEFF-3.3. Based on the comparison, the deficiencies in, and possible improvements to, the ^(238)U evaluated nuclear data were analyzed. The results showed the following. (1) The calculated results for CENDL-3.2 significantly overestimated the measurements in the energy interval of elastic scattering at 60° and 120°. (2) The calculated results for CENDL-3.2 overestimated the measurements in the energy interval of inelastic scattering at 120°. (3) The calculated results for CENDL-3.2 significantly overestimated the measurements in the 3-8.5 MeV energy interval at 60° and 120°. (4) The calculated results with JENDL-5.0 were generally consistent with the measurements.
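For reference, the time-of-flight method named above converts a measured flight time over a known path length into neutron energy. A relativistically correct conversion is sketched below; the flight path length and flight time are assumed example values, not those used in the experiment.

```python
# Neutron kinetic energy from time of flight: E = m_n c^2 * (1/sqrt(1 - (L/(t*c))^2) - 1).
import math

M_N_C2_MEV = 939.565           # neutron rest energy (MeV)
C_M_PER_NS = 0.299792458       # speed of light (m/ns)

def tof_to_energy_mev(flight_path_m, flight_time_ns):
    beta = flight_path_m / (flight_time_ns * C_M_PER_NS)
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return M_N_C2_MEV * (gamma - 1.0)

# Example: an assumed 3 m flight path; 14.1 MeV d-t source neutrons arrive after about 58.4 ns.
print(tof_to_energy_mev(3.0, 58.4))
```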