Hydrocarbon micro-seepage can cause oxidation-reduction reactions and produce altered minerals in surface sediments and soils. Mapping the typical altered minerals by their diagnostic spectral features in hyper-spectral images is an important tool for the petroleum exploration industry. In this study, airborne hyper-spectral data were used to investigate the altered minerals induced by hydrocarbon micro-seepage, using spectral feature fitting (SFF), in the loess-covered area of the Xifeng Oilfield. The results reveal that the distribution region of the altered minerals induced by hydrocarbon micro-seepage is larger than the known oilfield exploration area. A potential hydrocarbon micro-seepage region outside the known hydrocarbon area was also revealed by the distribution of altered minerals. A fast index based on the absorption depths of clay and carbonate minerals was proposed for assessing hydrocarbon micro-seepage; it gave much clearer boundaries for the micro-seepage in the loess-covered area than the altered mineral mapping did. In addition, some field samples were analyzed by X-ray diffraction (XRD) and atomic absorption spectrophotometry to validate the results. Within the extent of hydrocarbon micro-seepage, the samples have lower contents of ferric iron and higher contents of carbonate minerals. Therefore, airborne hyper-spectral data are suitable for outlining the extent of hydrocarbon micro-seepage for further hydrocarbon exploration in the loess-covered area.
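The abstract does not say how the fast index combines the clay and carbonate absorption depths, so the following is only a minimal sketch of the underlying quantity: a continuum-removed absorption depth, D = 1 - R_b / R_c, computed between two shoulder wavelengths. The function name, the straight-line continuum, and the shoulder positions in the example are assumptions for illustration, not the authors' procedure.

```python
def absorption_depth(wavelengths, reflectance, left, right):
    """Largest continuum-removed band depth between two shoulder wavelengths.

    The continuum R_c is assumed to be a straight line joining the first and
    last samples inside [left, right]; the depth is 1 - R_b / R_c.
    """
    pts = [(w, r) for w, r in zip(wavelengths, reflectance) if left <= w <= right]
    (w0, r0), (w1, r1) = pts[0], pts[-1]
    depth = 0.0
    for w, r in pts:
        continuum = r0 + (r1 - r0) * (w - w0) / (w1 - w0)
        depth = max(depth, 1.0 - r / continuum)
    return depth


# Illustrative spectrum with an absorption feature near 2200 nm (values made up).
wl = [2100, 2150, 2200, 2250, 2300]
refl = [0.50, 0.45, 0.30, 0.45, 0.50]
clay_depth = absorption_depth(wl, refl, 2100, 2300)
```

A per-pixel index could then be formed from two such depths (one per mineral group), but how they are combined is left open here because the source does not specify it.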
Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters, based on the fundamental principle that cells must differ more between clusters than within them. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need for a guiding framework on how to better utilize synthetic data as a practical application in this research.
Haze is mainly caused by suspended particulate matter in the air, and particulate matter pollution harms leaf vegetables. In this paper, oilseed rape plants at four different growth periods were investigated in a simulated particulate pollution environment. By combining hyper-spectral technology with microscopic examination, the response of the hyper-spectral characteristics of the leaf to particulate matter was investigated in depth. The hyper-spectral data, chlorophyll content, net photosynthetic rate, and stomatal conductance of the leaves were obtained. The deposition and adsorption of particulate matter on the leaf were observed with an Environmental Scanning Electron Microscope (ESEM). The normalized difference vegetation index (NDVI), the modified red edge normalized difference vegetation index (mNDVI705), and the modified red edge simple ratio index (mSR705) were selected as characteristic parameters, and the range of 510 nm to 620 nm as the sensitive band. Sixteen methods were used to establish the physiological information inversion model. The main results were as follows. Under the influence of particulate matter, the spectral reflectance decreased as a whole. With increasing leaf age, the blue shift became more pronounced. The amplitudes of the yellow and blue edges decreased, and the vegetation indices decreased overall. The furrows and irregular band protrusions in leaves favored the retention of particulate matter. With longer exposure and more deposition of particulate matter on the leaf, the stomatal opening became smaller. After comparison, principal component regression (PCR) + multiple scatter correction (MSC) + second derivative (SD) + Savitzky-Golay smoothing (SG), and partial least squares (PLS) + multiple scatter correction (MSC) + first derivative (FD) + Savitzky-Golay smoothing (SG), were determined to be the best methods for establishing the inversion models of chlorophyll content and net photosynthetic rate, respectively. This study may bring novel ideas for the diagnosis and analysis of the physiological response of leaf vegetables under particulate matter pollution using hyper-spectral technology.
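The three indices named above have widely used closed forms, sketched here as plain functions. The band choices (reflectance near 750, 705, and 445 nm for the red-edge indices) follow the common definitions of these indices; the abstract itself does not list the exact bands the authors used, so treat the parameter names and the example reflectances as assumptions.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def mndvi705(r750, r705, r445):
    """Modified red edge normalized difference vegetation index."""
    return (r750 - r705) / (r750 + r705 - 2.0 * r445)


def msr705(r750, r705, r445):
    """Modified red edge simple ratio index."""
    return (r750 - r445) / (r705 - r445)


# Made-up leaf reflectances, for illustration only.
clean_leaf = ndvi(nir=0.50, red=0.10)
red_edge_nd = mndvi705(r750=0.50, r705=0.30, r445=0.05)
red_edge_sr = msr705(r750=0.50, r705=0.30, r445=0.05)
```

Subtracting the 445 nm reflectance in the two modified indices is intended to compensate for specular reflection from the leaf surface, which is why they are often preferred over plain ratios for leaf-level work.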
With the development of Industry 4.0 and big data technology, the Industrial Internet of Things (IIoT) is hampered by inherent issues such as privacy, security, and fault tolerance, which pose challenges to its rapid development. Blockchain technology offers immutability, decentralization, and autonomy, which can greatly mitigate the inherent defects of the IIoT. In the traditional blockchain, data is stored in a Merkle tree. As data continues to grow, the size of the proofs used to validate it grows as well, threatening the efficiency, security, and reliability of blockchain-based IIoT. Accordingly, this paper first analyzes the inefficiency of the traditional blockchain structure in verifying the integrity and correctness of data. To solve this problem, a new Vector Commitment (VC) structure, Partition Vector Commitment (PVC), is proposed by improving the traditional VC structure. Secondly, this paper uses PVC instead of the Merkle tree to store big data generated by the IIoT. PVC can improve the efficiency of traditional VC in the commitment and opening processes. Finally, this paper uses PVC to build a blockchain-based IIoT data security storage mechanism and carries out a comparative experimental analysis. This mechanism can greatly reduce communication loss and maximize the rational use of storage space, which is of great significance for maintaining the security and stability of blockchain-based IIoT.
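To make the Merkle-tree baseline concrete, here is a minimal sketch of building a root and a membership proof with SHA-256. It illustrates why proof size grows with the data (one sibling hash per tree level); it is not the paper's PVC construction, and the odd-level padding rule (duplicating the last node) is an assumption chosen for simplicity.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_proof(leaves, index):
    """Return (root, proof) where proof is a list of (sibling_hash, sibling_is_left)."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by duplicating the last node
        sibling = index ^ 1          # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf, proof, root):
    """Recompute the path from a leaf to the root using the sibling hashes."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


readings = [b"sensor-a", b"sensor-b", b"sensor-c", b"sensor-d", b"sensor-e"]
root, proof = merkle_proof(readings, 4)
ok = verify(b"sensor-e", proof, root)
```

The proof carries one hash per level, so its length is about log2 of the number of leaves; this is the growth that vector-commitment schemes aim to avoid.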
To address the problems of a single encryption algorithm, such as low encryption efficiency, and of unreliable metadata for static data storage on big data platforms in the cloud computing environment, we propose a Hadoop-based big data secure storage scheme. Firstly, to disperse the NameNode service from a single server to multiple servers, we combine the HDFS federation and HDFS high-availability mechanisms, and use the Zookeeper distributed coordination mechanism to coordinate each node to achieve dual-channel storage. Then, we improve the ECC encryption algorithm for the encryption of ordinary data, and adopt a homomorphic encryption algorithm to encrypt data that needs to be computed on. To accelerate the encryption, we adopt a dual-thread encryption mode. Finally, the HDFS control module is designed to combine the encryption algorithm with the storage model. Experimental results show that the proposed solution solves the problem of a single point of failure for metadata, performs well in terms of metadata reliability, and can realize server fault tolerance. The improved encryption algorithm integrates the dual-channel storage mode, and the encryption storage efficiency improves by 27.6% on average.
Time-series data provide important information in many fields, and their processing and analysis have been the focus of much research. However, detecting anomalies is very difficult due to data imbalance, temporal dependence, and noise. Therefore, methodologies for data augmentation and for converting time series data into images for analysis have been studied. This paper proposes a fault detection model that uses time series data augmentation and transformation to address the problems of data imbalance, temporal dependence, and robustness to noise. Data augmentation is performed by adding Gaussian noise, with the noise level set to 0.002, to maximize the generalization performance of the model. In addition, we use the Markov Transition Field (MTF) method to effectively visualize the dynamic transitions of the data while converting the time series data into images. This enables the identification of patterns in time series data and assists in capturing the sequential dependencies of the data. For anomaly detection, the PatchCore model is applied and shows excellent performance, and the detected anomaly areas are represented as heat maps. By applying an anomaly map to the original image, it is possible to capture the areas where anomalies occur. The performance evaluation shows that both F1-score and accuracy are high when time series data is converted to images. Additionally, when the data were processed as images rather than as time series, there was a significant reduction in both the size of the data and the training time. The proposed method can provide an important springboard for research on anomaly detection using time series data. It also helps solve problems such as analyzing complex patterns in data in a lightweight manner.
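The two preprocessing steps described above can be sketched in a few lines of standard-library Python. This is a hedged illustration, not the authors' pipeline: it assumes the "noise level" of 0.002 is the standard deviation of zero-mean Gaussian noise, and it uses quantile binning with a hypothetical default of four bins for the Markov Transition Field.

```python
import random
from bisect import bisect_right


def add_gaussian_noise(series, sigma=0.002, seed=0):
    """Augment a series with zero-mean Gaussian noise (sigma is an assumption)."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, sigma) for x in series]


def markov_transition_field(series, n_bins=4):
    """Return an n x n image whose (i, j) entry is P(bin of x_i -> bin of x_j)."""
    n = len(series)
    ordered = sorted(series)
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]  # quantile edges
    bins = [bisect_right(edges, x) for x in series]
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):  # first-order transitions along time
        counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])
    return [[probs[bins[i]][bins[j]] for j in range(n)] for i in range(n)]


signal = [0, 1, 2, 3, 0, 1, 2, 3]
augmented = add_gaussian_noise(signal)
image = markov_transition_field(signal)
```

The resulting matrix can be rendered as a grayscale image and passed to an image-based anomaly detector; the detector itself is outside the scope of this sketch.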
This paper introduces a new technique for red tide recognition in remotely sensed hyper-spectral images based on empirical mode decomposition (EMD), using data from an artificial red tide experiment in the East China Sea in 2002. A set of characteristic parameters that describe the absorbing crest and reflecting crest of the red tide, together with the corresponding recognition methods, is put forward based on general picture data; with these, the spectral information of certain non-dominant alga species in a red tide occurrence is analyzed to establish a foundation for estimating the species. Comparative experiments have shown that the method is effective. Meanwhile, the transitional area between the red-tide zone and the non-red-tide zone can be detected from information on the thickness of algae influence, with which a red tide can be forecast.
Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition. This characteristic results in a diverse range of flow curves that vary with the deformation condition. This study proposes a novel approach for accurately predicting the anisotropic deformation behavior of wrought Mg alloys using machine learning (ML) with data augmentation. The developed model combines four key strategies from data science: learning the entire flow curves, generative adversarial networks (GAN), algorithm-driven hyperparameter tuning, and a gated recurrent unit (GRU) architecture. The proposed model, namely GAN-aided GRU, was extensively evaluated for various predictive scenarios, such as interpolation, extrapolation, and a limited dataset size. The model exhibited significant predictability and improved generalizability for estimating the anisotropic compressive behavior of ZK60 Mg alloys under 11 annealing conditions and for three loading directions. The GAN-aided GRU results were superior to those of previous ML models and constitutive equations. The superior performance was attributed to hyperparameter optimization, GAN-based data augmentation, and the inherent predictivity of the GRU for extrapolation. As a first attempt to employ ML techniques other than artificial neural networks, this study proposes a novel perspective on predicting the anisotropic deformation behaviors of wrought Mg alloys.
Reliability evaluation for insulated gate bipolar transistors (IGBTs) on electric vehicles faces challenges such as junction temperature measurement and limited computational and storage resources. In this paper, a junction temperature estimation approach based on a neural network, requiring no additional cost, is proposed, and the lifetime calculation for IGBTs is performed using electric vehicle big data. The direct current (DC) voltage, operation current, switching frequency, negative thermal coefficient (NTC) thermistor temperature, and IGBT lifetime are the inputs, and the junction temperature (T_j) is the output. With the rain flow counting method, the classified irregular temperatures are fed into the life model to obtain the cycles to failure. The fatigue accumulation method is then used to calculate the IGBT lifetime. To overcome the limited computational and storage resources of electric vehicle controllers, the IGBT lifetime calculation runs on a big data platform. The lifetime is then transmitted wirelessly to the electric vehicle as an input to the neural network. Thus the junction temperature of the IGBT under long-term operating conditions can be accurately estimated. A test platform combining the motor controller with the vehicle big data server is built for the IGBT accelerated aging test. Subsequently, the IGBT lifetime predictions are derived from the junction temperature estimates of the neural network method and the thermal network method. The experiment shows that the lifetime prediction based on a neural network with big data is more accurate than that of the thermal network, which improves the reliability evaluation of the system.
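The fatigue accumulation step can be illustrated with linear damage accumulation (Miner's rule), assuming that is the method meant: each class of counted temperature cycles contributes count / cycles-to-failure, and the part is considered worn out when the sum reaches 1. The life model below is a hypothetical Coffin-Manson-type power law with made-up coefficients `a` and `n`; the abstract does not give the authors' life model or its parameters.

```python
def cycles_to_failure(delta_t, a=1.0e5, n=2.0):
    """Hypothetical life model N_f = a * delta_t**(-n); a and n are placeholders."""
    return a * delta_t ** (-n)


def miner_damage(cycle_counts, life_model=cycles_to_failure):
    """Accumulated damage for {temperature swing in K: counted cycles}."""
    return sum(count / life_model(dt) for dt, count in cycle_counts.items())


def remaining_life_fraction(cycle_counts, life_model=cycles_to_failure):
    """Fraction of life left under linear accumulation, floored at zero."""
    return max(0.0, 1.0 - miner_damage(cycle_counts, life_model))


# Cycle classes as they might come out of a rainflow count (values made up).
counted = {10.0: 200, 20.0: 50}
damage = miner_damage(counted)
```

In the workflow described above, the rainflow count would supply `cycle_counts`, and the resulting lifetime figure would be fed back to the vehicle as a network input.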
As the risks associated with air turbulence are intensified by climate change and the growth of the aviation industry, it has become imperative to monitor and mitigate these threats to ensure civil aviation safety. The eddy dissipation rate (EDR) has been established as the standard metric for quantifying turbulence in civil aviation. This study aims to explore a universally applicable symbolic classification approach based on genetic programming to detect turbulence anomalies using quick access recorder (QAR) data. The detection of atmospheric turbulence is approached as an anomaly detection problem. Comparative evaluations demonstrate that this approach performs on par with direct EDR calculation methods in identifying turbulence events. Moreover, comparisons with alternative machine learning techniques indicate that the proposed technique is the optimal methodology currently available. In summary, the use of symbolic classification via genetic programming enables accurate turbulence detection from QAR data, comparable to that with established EDR approaches and surpassing that achieved with machine learning algorithms. This finding highlights the potential of integrating symbolic classifiers into turbulence monitoring systems to enhance civil aviation safety amidst rising environmental and operational hazards.
A benchmark experiment on ²³⁸U slab samples was conducted using a deuterium-tritium neutron source at the China Institute of Atomic Energy. The leakage neutron spectra in the energy range of 0.8-16 MeV at 60° and 120° were measured using the time-of-flight method. The samples were prepared as rectangular slabs with a 30 cm square base and thicknesses of 3, 6, and 9 cm. The leakage neutron spectra were also calculated using the MCNP-4C program based on the latest evaluated ²³⁸U neutron data files from CENDL-3.2, ENDF/B-VIII.0, JENDL-5.0, and JEFF-3.3. Based on the comparison, the deficiencies and improvements in the ²³⁸U evaluated nuclear data were analyzed. The results showed the following. (1) The calculated results for CENDL-3.2 significantly overestimated the measurements in the energy interval of elastic scattering at 60° and 120°. (2) The calculated results for CENDL-3.2 overestimated the measurements in the energy interval of inelastic scattering at 120°. (3) The calculated results for CENDL-3.2 significantly overestimated the measurements in the 3-8.5 MeV energy interval at 60° and 120°. (4) The calculated results with JENDL-5.0 were generally consistent with the measurement results.
When building a classification model, the scenario where the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm the accuracy on the minority class (usually defined as the positive class) and thus lead to poor overall performance of the model. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, which is based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated from the Mahalanobis distance between samples. On this basis, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with newly generated positive samples one by one to clear up the inter-class overlap on the borderline, without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It considers the influences of both the imbalance in sample numbers and the class distribution on classification simultaneously, and finely tunes the class cost weights with the RIME algorithm, an efficient optimization algorithm based on the physical phenomenon of rime ice, using cross-validation accuracy as the fitness function, to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments is carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced datasets. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
Accurate prediction of formation pore pressure is essential to predict fluid flow and manage hydrocarbon production in petroleum engineering. Deep learning techniques have recently received growing interest due to their great potential for pore pressure prediction. However, most traditional deep learning models generalize poorly. To fill this technical gap, in this work we developed a new adaptive physics-informed deep learning model with high generalization capability to predict pore pressure values directly from seismic data. Specifically, the new model, named CGP-NN, consists of a novel parametric feature extraction approach (1DCPP), a stacked multilayer gated recurrent model (multilayer GRU), and an adaptive physics-informed loss function. Through training, the developed model can automatically select the optimal physical model to constrain the results for each pore pressure prediction. The CGP-NN model generalizes best when the physics-related metric λ = 0.5. A hybrid approach combining the Eaton and Bowers methods is also proposed to build machine-learnable labels, addressing the problem of scarce labels. To validate the developed model and methodology, a case study on a complex reservoir in the Tarim Basin was performed, demonstrating high accuracy in the pore pressure prediction of new wells along with strong generalization ability. The adaptive physics-informed deep learning approach presented here has potential application in the prediction of pore pressures coupled with multiple genesis mechanisms using seismic data.
Gaining insight into the spatiotemporal distribution patterns of knowledge innovation is receiving increasing attention from policymakers and economic research organizations. Many studies use bibliometric data to analyze the popularity of certain research topics, well-adopted methodologies, influential authors, and the interrelationships among research disciplines. However, the visual exploration of the patterns of research topics, with an emphasis on their spatial and temporal distribution, remains challenging. This study combined a Space-Time Cube (STC) and a 3D glyph to represent complex multivariate bibliographic data. We further implemented the visual design by developing an interactive interface. The effectiveness, understandability, and engagement of ST-Map were evaluated by seven experts in geovisualization. The results suggest that it is promising to use three-dimensional visualization to show the overview and on-demand details on a single screen.
For the goals of security and privacy preservation, we propose a blind batch encryption- and public ledger-based data sharing protocol that allows the integrity of sensitive data to be audited by a public ledger and allows private information to be preserved. Data owners can tightly manage their data with efficient revocation and grant only one-time adaptive access to fulfill the requester's need. We prove that our protocol is semantically secure, blind, and secure against oblivious requesters and malicious file keepers. We also provide a security analysis in the context of four typical attacks.
Ratoon rice, which refers to a second harvest of rice obtained from the regenerated tillers originating from the stubble of the first harvested crop, plays an important role in both food security and agroecology while requiring minimal agricultural inputs. However, accurately identifying ratoon rice crops is challenging due to the similarity of their spectral features to those of other rice cropping systems (e.g., double rice). Moreover, images with a high spatiotemporal resolution are essential, since ratoon rice is generally cultivated in fragmented croplands within regions that frequently exhibit cloudy and rainy weather. In this study, taking Qichun County in Hubei Province, China as an example, we developed a new phenology-based ratoon rice vegetation index (PRVI) for ratoon rice mapping at a 30 m spatial resolution using a robust time series generated from Harmonized Landsat and Sentinel-2 (HLS) images. The PRVI, which incorporates the red, near-infrared, and shortwave infrared 1 bands, was developed based on the analysis of spectro-phenological separability and feature selection. Based on actual field samples, the performance of the PRVI for ratoon rice mapping was carefully evaluated by comparing it to several vegetation indices, including the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and land surface water index (LSWI). The results suggested that the PRVI could sufficiently capture the specific characteristics of ratoon rice, leading to a favorable separability between ratoon rice and other land cover types. Furthermore, the PRVI showed the best performance for identifying ratoon rice in the phenological phases characterized by grain filling and harvesting to tillering of the ratoon crop (GHS-TS2), indicating that only several images are required to obtain an accurate ratoon rice map. Finally, the PRVI performed better than NDVI, EVI, LSWI, and their combination at the GHS-TS2 stages, with a producer's accuracy and user's accuracy of 92.22% and 89.30%, respectively. These results demonstrate that the proposed PRVI based on HLS data can effectively identify ratoon rice in fragmented croplands at crucial phenological stages, which is promising for identifying the earliest timing of ratoon rice planting and can provide a fundamental dataset for crop management activities.
A comprehensive assessment of spatial-aware supervised learning algorithms for hyper-spectral image (HSI) classification is presented. For this purpose, standard support vector machines (SVMs), multinomial logistic regression (MLR), and sparse representation (SR) based supervised learning algorithms were compared both theoretically and experimentally. The performance of the discussed techniques was evaluated in terms of overall accuracy, average accuracy, kappa statistic coefficients, and sparsity of the solutions. Execution time, the computational burden, and the capability of the methods were investigated by using probabilistic analysis. To validate the accuracy, the classical benchmark AVIRIS Indian Pines data set was used. Experiments show that integrating spectral-spatial context can further improve the accuracy and reduce the misclassification error, although the computational time increases.
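The three accuracy measures named above are all derived from a confusion matrix. The following sketch shows their standard definitions; the function name and the row/column convention (rows are reference classes, columns are predicted classes) are choices made for this illustration.

```python
def classification_scores(confusion):
    """Return (overall accuracy, average accuracy, Cohen's kappa).

    confusion[i][j] = number of samples of reference class i predicted as class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(row[j] for row in confusion) for j in range(k)]

    overall = sum(confusion[i][i] for i in range(k)) / n
    average = sum(confusion[i][i] / row_totals[i] for i in range(k)) / k
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return overall, average, kappa


# Two-class example with made-up counts.
oa, aa, kappa = classification_scores([[40, 10], [20, 30]])
```

Average accuracy weights every class equally, so it diverges from overall accuracy when class sizes are unbalanced; kappa additionally discounts the agreement expected by chance.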
The increasing dependence on data highlights the need for a detailed understanding of its behavior, encompassing the challenges involved in processing and evaluating it. However, current research lacks a comprehensive structure for measuring the worth of data elements, hindering effective navigation of the changing digital environment. This paper aims to fill this research gap by introducing the innovative concept of "data components." It proposes a graph-theoretic representation model that presents a clear mathematical definition and demonstrates the superiority of data components over traditional processing methods. Additionally, the paper introduces an information measurement model that provides a way to calculate the information entropy of data components and establish their increased informational value. The paper also assesses the value of information, suggesting a pricing mechanism based on its significance. In conclusion, this paper establishes a robust framework for understanding and quantifying the value of implicit information in data, laying the groundwork for future research and practical applications.
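The abstract does not define its information measurement model, so as a point of reference only, here is the standard Shannon entropy of a collection of discrete values, measured in bits. Treating a "data component" as a plain sequence of symbols is an assumption made for this sketch.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy H = -sum(p * log2(p)) over the empirical distribution."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


uniform_bits = shannon_entropy(["a", "b", "c", "d"])  # four equally likely symbols
constant_bits = shannon_entropy(["a", "a", "a", "a"])  # no uncertainty
```

Entropy is highest when all values are equally likely and zero when the data are constant, which is the sense in which it can serve as a rough measure of informational content.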
Funding: Supported by the National High Technology Research and Development Program of China (No. 2012AA12A308) and the China Geological Survey (No. 1212011087112).
Funding: Supported in part by NIH grants R01NS39600, U01MH114829, and RF1MH128693 (to GAA).
Abstract: There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need for a framework that guides how to better use synthetic data as a practical application in this research.
Funding: This work was funded under the auspices of the National Natural Science Foundation for Young Scientists Fund (31801259), the National Natural Science Foundation for Young Scientists Fund (32001418), and the Science and Technology Development Project of Jilin Province (20200402015NC).
Abstract: Haze is mainly caused by suspended particulate matter in the air, and particulate matter pollution harms leaf vegetables. In this paper, oilseed rape plants at four different growing periods were investigated in a simulated particulate pollution environment. Combining hyper-spectral technology with microscopic examination, the response of the hyper-spectral characteristics of the leaf to particulate matter was investigated in depth. The hyper-spectra, chlorophyll content, net photosynthetic rate, and stomatal conductance of the leaf were obtained. The deposition and adsorption of particulate matter on the leaf were observed by Environmental Scanning Electron Microscope (ESEM). The normalized difference vegetation index (NDVI), modified red edge normalized difference vegetation index (mNDVI705), and modified red edge simple ratio index (mSR705) were selected as characteristic parameters, and the range of 510 nm to 620 nm as the sensitive band. Sixteen methods were used to establish the physiological information inversion model. The main results were as follows. Under the influence of particulate matter, the spectral reflectance decreased as a whole. With increasing leaf age, the blue shift became more pronounced. The amplitudes of the yellow and blue edges decreased, and the vegetation indices decreased overall. The furrows and irregular band protrusions in leaves favored the retention of particulate matter. With longer exposure time and more deposition of particulate matter on the leaf, the stomatal opening became smaller. After comparison, principal component regression (PCR) + multiplicative scatter correction (MSC) + second derivative (SD) + Savitzky-Golay smoothing (SG) and partial least squares (PLS) + multiplicative scatter correction (MSC) + first derivative (FD) + Savitzky-Golay smoothing (SG) were determined to be the best methods for establishing the inversion models of chlorophyll content and net photosynthetic rate, respectively. This study may bring novel ideas for the diagnosis and analysis of the physiological response of leaf vegetables under particulate matter pollution using hyper-spectral technology.
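The three vegetation indices named in the abstract above can be sketched in a few lines. This is only an illustration: the formulas below are the commonly cited definitions of NDVI, mNDVI705, and mSR705 (reflectance at 445, 705, and 750 nm), and it is an assumption that the paper uses exactly these band positions. The function and parameter names are hypothetical.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def mndvi705(r750: float, r705: float, r445: float) -> float:
    """Modified red edge normalized difference index (commonly cited form).

    Assumption: reflectance at 750, 705 and 445 nm; the 445 nm term
    corrects for specular leaf surface reflectance.
    """
    return (r750 - r705) / (r750 + r705 - 2.0 * r445)


def msr705(r750: float, r705: float, r445: float) -> float:
    """Modified red edge simple ratio index (commonly cited form)."""
    return (r750 - r445) / (r705 - r445)


if __name__ == "__main__":
    # Made-up reflectance values, for illustration only.
    print(ndvi(0.5, 0.1))
    print(mndvi705(0.5, 0.3, 0.05))
    print(msr705(0.5, 0.3, 0.05))
```

A drop in these indices for the same leaf over time would be consistent with the overall decrease the abstract reports under particulate deposition.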
Funding: Supported by China's National Natural Science Foundation (Nos. 62072249, 62072056). This work is also funded by the National Science Foundation of Hunan Province (2020JJ2029).
Abstract: With the development of Industry 4.0 and big data technology, the Industrial Internet of Things (IIoT) is hampered by inherent issues such as privacy, security, and fault tolerance, which pose challenges to its rapid development. Blockchain technology offers immutability, decentralization, and autonomy, which can greatly mitigate the inherent defects of the IIoT. In the traditional blockchain, data is stored in a Merkle tree. As data continues to grow, the scale of the proofs used to validate it grows, threatening the efficiency, security, and reliability of blockchain-based IIoT. Accordingly, this paper first analyzes the inefficiency of the traditional blockchain structure in verifying the integrity and correctness of data. To solve this problem, a new Vector Commitment (VC) structure, Partition Vector Commitment (PVC), is proposed by improving the traditional VC structure. Secondly, this paper uses PVC instead of the Merkle tree to store big data generated by IIoT. PVC can improve the efficiency of traditional VC in the process of commitment and opening. Finally, this paper uses PVC to build a blockchain-based IIoT data security storage mechanism and carries out a comparative experimental analysis. This mechanism can greatly reduce communication loss and maximize the rational use of storage space, which is of great significance for maintaining the security and stability of blockchain-based IIoT.
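To make the baseline concrete, here is a minimal sketch of a Merkle tree with inclusion proofs, the structure the abstract says PVC replaces. It is not the paper's PVC construction; it only illustrates why Merkle proofs grow with the data (one sibling hash per tree level). SHA-256 and the duplicate-last-node padding rule are assumptions chosen for the example, and all function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Return all tree levels, from hashed leaves up to the root level."""
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate last node
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:                 # same padding rule as above
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof."""
    node = _h(leaf)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


if __name__ == "__main__":
    data = [f"sensor-reading-{i}".encode() for i in range(5)]
    levels = merkle_levels(data)
    root = levels[-1][0]
    proof = merkle_proof(levels, 3)
    print(len(proof), verify(data[3], proof, root))
```

The proof length equals the tree height, so it grows logarithmically with the number of stored records; this is the proof-size growth the abstract refers to.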
Abstract: To address the problems of a single encryption algorithm, such as low encryption efficiency and unreliable metadata for static data storage on big data platforms in the cloud computing environment, we propose a Hadoop-based big data secure storage scheme. Firstly, to disperse the NameNode service from a single server to multiple servers, we combine the HDFS federation and HDFS high-availability mechanisms, and use the Zookeeper distributed coordination mechanism to coordinate each node to achieve dual-channel storage. Then, we improve the ECC encryption algorithm for the encryption of ordinary data, and adopt a homomorphic encryption algorithm to encrypt data that needs to be calculated. To accelerate the encryption, we adopt a dual-thread encryption mode. Finally, the HDFS control module is designed to combine the encryption algorithm with the storage model. Experimental results show that the proposed solution solves the problem of a single point of failure of metadata, performs well in terms of metadata reliability, and can realize the fault tolerance of the server. The improved encryption algorithm integrates the dual-channel storage mode, and the encryption storage efficiency improves by 27.6% on average.
Funding: This research was financially supported by the Ministry of Trade, Industry, and Energy (MOTIE), Korea, under the "Project for Research and Development with Middle Markets Enterprises and DNA (Data, Network, AI) Universities" (AI-based Safety Assessment and Management System for Concrete Structures) (Reference Number P0024559), supervised by the Korea Institute for Advancement of Technology (KIAT).
Abstract: Time-series data provide important information in many fields, and their processing and analysis have been the focus of much research. However, detecting anomalies is very difficult due to data imbalance, temporal dependence, and noise. Therefore, methodologies for data augmentation and for converting time series data into images for analysis have been studied. This paper proposes a fault detection model that uses time series data augmentation and transformation to address the problems of data imbalance, temporal dependence, and robustness to noise. Data augmentation is performed by adding Gaussian noise, with the noise level set to 0.002, to maximize the generalization performance of the model. In addition, we use the Markov Transition Field (MTF) method to visualize the dynamic transitions of the data while converting the time series data into images. This enables the identification of patterns in time series data and helps capture the sequential dependencies of the data. For anomaly detection, the PatchCore model is applied and shows excellent performance, and the detected anomaly areas are represented as heat maps. Applying the anomaly map to the original image makes it possible to locate the areas where anomalies occur. The performance evaluation shows that both F1-score and accuracy are high when time series data is converted to images. Additionally, processing the data as images rather than as time series significantly reduced both the size of the data and the training time. The proposed method can provide an important springboard for research in the field of anomaly detection using time series data. It also helps solve problems such as analyzing complex patterns in data in a lightweight manner.
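The two preprocessing steps described above, Gaussian-noise augmentation and the Markov Transition Field, can be sketched with NumPy as follows. This is a minimal illustration rather than the authors' pipeline: treating the stated noise level 0.002 as the standard deviation, using quantile bins, and the bin count of 4 are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def add_gaussian_noise(x, sigma=0.002, seed=0):
    """Augment a series with zero-mean Gaussian noise.

    Assumption: the abstract's "noise level" is taken as the standard deviation.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    return x + rng.normal(0.0, sigma, size=x.shape)


def markov_transition_field(x, n_bins=4):
    """Convert a 1-D series into an (n, n) Markov Transition Field image."""
    x = np.asarray(x, dtype=float)
    # Interior quantile edges split the values into n_bins states.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    states = np.digitize(x, edges)              # integers in 0 .. n_bins - 1
    # First-order transition counts between consecutive time steps.
    counts = np.zeros((n_bins, n_bins))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    probs = np.divide(counts, row_sums,
                      out=np.zeros_like(counts), where=row_sums > 0)
    # Pixel (i, j) holds the transition probability from state(i) to state(j).
    return probs[states[:, None], states[None, :]]


if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 64)
    series = add_gaussian_noise(np.sin(t))
    image = markov_transition_field(series)
    print(image.shape)
```

The resulting square array can be saved as a grayscale image and passed to an image-based anomaly detector; the abstract names PatchCore for that stage, which is not reproduced here.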
Funding: Shandong Natural Science Fund (No. Y2007G32); the Doctoral Fund of Qingdao University of Science & Technology (No. 0022143).
Abstract: This paper introduces a new technique for red tide recognition with remotely sensed hyper-spectral images based on empirical mode decomposition (EMD), using data from an artificial red tide experiment in the East China Sea in 2002. A set of characteristic parameters that describe the absorbing crest and reflecting crest of the red tide, together with its recognition methods, is put forward based on general picture data. With these, the spectral information of certain non-dominant alga species in a red tide occurrence is analyzed to establish a foundation for estimating the species. Comparative experiments have proved that the method is effective. Meanwhile, the transitional area between the red-tide zone and the non-red-tide zone can be detected from the thickness of algae influence, with which a red tide can be forecast.
Funding: Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government (Grant No. 20214000000140, Graduate School of Convergence for Clean Energy Integrated Power Generation); Korea Basic Science Institute (National Research Facilities and Equipment Center) grant funded by the Ministry of Education (2021R1A6C101A449); the National Research Foundation of Korea grant funded by the Ministry of Science and ICT (2021R1A2C1095139), Republic of Korea.
Abstract: Mg alloys possess an inherent plastic anisotropy owing to the selective activation of deformation mechanisms depending on the loading condition. This characteristic results in a diverse range of flow curves that vary with the deformation condition. This study proposes a novel approach for accurately predicting the anisotropic deformation behavior of wrought Mg alloys using machine learning (ML) with data augmentation. The developed model combines four key strategies from data science: learning the entire flow curves, generative adversarial networks (GAN), algorithm-driven hyperparameter tuning, and gated recurrent unit (GRU) architecture. The proposed model, namely GAN-aided GRU, was extensively evaluated for various predictive scenarios, such as interpolation, extrapolation, and a limited dataset size. The model exhibited significant predictability and improved generalizability for estimating the anisotropic compressive behavior of ZK60 Mg alloys under 11 annealing conditions and for three loading directions. The GAN-aided GRU results were superior to those of previous ML models and constitutive equations. The superior performance was attributed to hyperparameter optimization, GAN-based data augmentation, and the inherent predictivity of the GRU for extrapolation. As a first attempt to employ ML techniques other than artificial neural networks, this study proposes a novel perspective on predicting the anisotropic deformation behaviors of wrought Mg alloys.
Abstract: Reliability evaluation for insulated gate bipolar transistors (IGBT) on electric vehicles faces challenges such as junction temperature measurement and limited computational and storage resources. In this paper, a junction temperature estimation approach based on a neural network without additional cost is proposed, and the lifetime calculation for IGBT using electric vehicle big data is performed. The direct current (DC) voltage, operation current, switching frequency, negative temperature coefficient (NTC) thermistor temperature, and IGBT lifetime are the inputs, and the junction temperature (T_j) is the output. With the rainflow counting method, the classified irregular temperatures are brought into the life model to obtain the failure cycles. The fatigue accumulation method is then used to calculate the IGBT lifetime. To overcome the limited computational and storage resources of electric vehicle controllers, the IGBT lifetime calculation runs on a big data platform. The lifetime is then transmitted wirelessly to electric vehicles as an input for the neural network. Thus the junction temperature of the IGBT under long-term operating conditions can be accurately estimated. A test platform of the motor controller combined with the vehicle big data server is built for the IGBT accelerated aging test. Subsequently, the IGBT lifetime predictions are derived from the junction temperature estimation by the neural network method and the thermal network method. The experiment shows that the lifetime prediction based on a neural network with big data demonstrates a higher accuracy than that of the thermal network, which improves the reliability evaluation of the system.
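The lifetime step described above (counted temperature swings fed into a life model, then accumulated as fatigue) can be sketched as below. This is a hedged illustration only: the abstract does not name its life model or accumulation rule, so a Coffin-Manson-style power law and Miner's linear damage rule are assumed here, and the constants `a` and `n` are placeholders rather than fitted values. Rainflow counting itself is not implemented; its output is taken as given.

```python
def cycles_to_failure(delta_t: float, a: float = 3.0e14, n: float = 5.0) -> float:
    """Assumed Coffin-Manson-style life model: N_f = a * delta_t ** (-n).

    delta_t is the junction temperature swing in kelvin; a and n are
    hypothetical placeholder constants, not values from the paper.
    """
    return a * delta_t ** (-n)


def accumulated_damage(cycle_counts: dict, a: float = 3.0e14, n: float = 5.0) -> float:
    """Miner's rule (assumed): damage = sum(n_i / N_f,i) over swing classes.

    cycle_counts maps each temperature-swing class (K) to the number of
    cycles counted in that class, e.g. as produced by rainflow counting.
    """
    return sum(count / cycles_to_failure(dt, a, n)
               for dt, count in cycle_counts.items())


def remaining_life_fraction(cycle_counts: dict, a: float = 3.0e14, n: float = 5.0) -> float:
    """Fraction of life left; failure is predicted when damage reaches 1."""
    return max(0.0, 1.0 - accumulated_damage(cycle_counts, a, n))


if __name__ == "__main__":
    # Made-up rainflow histogram: swing in K -> counted cycles.
    histogram = {20.0: 1.0e6, 40.0: 2.0e5, 60.0: 1.0e4}
    print(accumulated_damage(histogram))
    print(remaining_life_fraction(histogram))
```

Because the power law weights large swings heavily, a small number of high-amplitude cycles can dominate the accumulated damage, which is why the classification of irregular temperature profiles matters.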
Funding: Supported by the Meteorological Soft Science Project (Grant No. 2023ZZXM29), the Natural Science Fund Project of Tianjin, China (Grant No. 21JCYBJC00740), and the Key Research and Development-Social Development Program of Jiangsu Province, China (Grant No. BE2021685).
Abstract: As the risks associated with air turbulence are intensified by climate change and the growth of the aviation industry, it has become imperative to monitor and mitigate these threats to ensure civil aviation safety. The eddy dissipation rate (EDR) has been established as the standard metric for quantifying turbulence in civil aviation. This study aims to explore a universally applicable symbolic classification approach based on genetic programming to detect turbulence anomalies using quick access recorder (QAR) data. The detection of atmospheric turbulence is approached as an anomaly detection problem. Comparative evaluations demonstrate that this approach performs on par with direct EDR calculation methods in identifying turbulence events. Moreover, comparisons with alternative machine learning techniques indicate that the proposed technique is the optimal methodology currently available. In summary, the use of symbolic classification via genetic programming enables accurate turbulence detection from QAR data, comparable to that with established EDR approaches and surpassing that achieved with machine learning algorithms. This finding highlights the potential of integrating symbolic classifiers into turbulence monitoring systems to enhance civil aviation safety amidst rising environmental and operational hazards.
Funding: This work was supported by the general program (No. 1177531) and joint funding (No. U2067205) from the National Natural Science Foundation of China.
Abstract: A benchmark experiment on ²³⁸U slab samples was conducted using a deuterium-tritium neutron source at the China Institute of Atomic Energy. The leakage neutron spectra in the energy range of 0.8-16 MeV at 60° and 120° were measured using the time-of-flight method. The samples were prepared as rectangular slabs with a 30 cm square base and thicknesses of 3, 6, and 9 cm. The leakage neutron spectra were also calculated using the MCNP-4C program based on the latest evaluated neutron data files for ²³⁸U from CENDL-3.2, ENDF/B-VIII.0, JENDL-5.0, and JEFF-3.3. Based on the comparison, the deficiencies and improvements in the ²³⁸U evaluated nuclear data were analyzed. The results showed the following. (1) The calculated results for CENDL-3.2 significantly overestimated the measurements in the energy interval of elastic scattering at 60° and 120°. (2) The calculated results for CENDL-3.2 overestimated the measurements in the energy interval of inelastic scattering at 120°. (3) The calculated results for CENDL-3.2 significantly overestimated the measurements in the 3-8.5 MeV energy interval at 60° and 120°. (4) The calculated results with JENDL-5.0 were generally consistent with the measurement results.
Funding: Supported by the Yunnan Major Scientific and Technological Projects (Grant No. 202302AD080001) and the National Natural Science Foundation, China (No. 52065033).
Abstract: When building a classification model, the scenario where the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm the accuracy on the minority class (usually defined as the positive class) and lead to poor overall performance of the model. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, which is based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated from the Mahalanobis distance between samples. Based on this value, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with newly generated positive samples one by one to clear up the inter-class overlap on the borderline, without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It considers the influences of both the imbalance in sample number and the class distribution on classification simultaneously. It finely tunes the class cost weights using the RIME optimization algorithm, an efficient algorithm based on the physical phenomenon of rime-ice, with cross-validation accuracy as the fitness function, to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments was carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced datasets. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
Funding: Funded by the National Natural Science Foundation of China (General Program: No. 52074314, No. U19B6003-05) and the National Key Research and Development Program of China (2019YFA0708303-05).
Abstract: Accurate prediction of formation pore pressure is essential for predicting fluid flow and managing hydrocarbon production in petroleum engineering. Deep learning techniques have recently received growing interest because of their potential for pore pressure prediction. However, most traditional deep learning models generalize poorly. To fill this technical gap, in this work, we developed a new adaptive physics-informed deep learning model with high generalization capability to predict pore pressure values directly from seismic data. Specifically, the new model, named CGP-NN, consists of a novel parametric feature extraction approach (1DCPP), a stacked multilayer gated recurrent model (multilayer GRU), and an adaptive physics-informed loss function. Through training, the developed model can automatically select the optimal physical model to constrain the results for each pore pressure prediction. The CGP-NN model has the best generalization when the physics-related metric λ = 0.5. A hybrid approach combining the Eaton and Bowers methods is also proposed to build machine-learnable labels, addressing the scarcity of labels. To validate the developed model and methodology, a case study on a complex reservoir in the Tarim Basin was performed, demonstrating high accuracy in the pore pressure prediction of new wells along with strong generalization ability. The adaptive physics-informed deep learning approach presented here has potential application in the prediction of pore pressures coupled with multiple genesis mechanisms using seismic data.
Abstract: Gaining insight into the spatiotemporal distribution patterns of knowledge innovation is receiving increasing attention from policymakers and economic research organizations. Many studies use bibliometric data to analyze the popularity of certain research topics, well-adopted methodologies, influential authors, and the interrelationships among research disciplines. However, the visual exploration of the patterns of research topics with an emphasis on their spatial and temporal distribution remains challenging. This study combined a Space-Time Cube (STC) and a 3D glyph to represent the complex multivariate bibliographic data. We further implemented a visual design by developing an interactive interface. The effectiveness, understandability, and engagement of ST-Map were evaluated by seven experts in geovisualization. The results suggest that it is promising to use three-dimensional visualization to show the overview and on-demand details on a single screen.
Funding: Partially supported by the National Natural Science Foundation of China under grant no. 62372245, the Foundation of Yunnan Key Laboratory of Blockchain Application Technology under Grant 202105AG070005, in part by the Foundation of State Key Laboratory of Public Big Data, and in part by the Foundation of Key Laboratory of Computational Science and Application of Hainan Province under Grant JSKX202202.
Abstract: For the goals of security and privacy preservation, we propose a blind batch encryption- and public ledger-based data sharing protocol that allows the integrity of sensitive data to be audited by a public ledger and allows private information to be preserved. Data owners can tightly manage their data with efficient revocation and grant only one-time adaptive access to fulfill a requester's request. We prove that our protocol is semantically secure, blind, and secure against oblivious requesters and malicious file keepers. We also provide a security analysis in the context of four typical attacks.
Funding: Supported by the National Natural Science Foundation of China (42271360 and 42271399), the Young Elite Scientists Sponsorship Program by China Association for Science and Technology (CAST) (2020QNRC001), and the Fundamental Research Funds for the Central Universities, China (2662021JC013, CCNU22QN018).
Abstract: Ratoon rice, which refers to a second harvest of rice obtained from the regenerated tillers originating from the stubble of the first harvested crop, plays an important role in both food security and agroecology while requiring minimal agricultural inputs. However, accurately identifying ratoon rice crops is challenging due to the similarity of its spectral features with other rice cropping systems (e.g., double rice). Moreover, images with a high spatiotemporal resolution are essential since ratoon rice is generally cultivated in fragmented croplands within regions that frequently exhibit cloudy and rainy weather. In this study, taking Qichun County in Hubei Province, China as an example, we developed a new phenology-based ratoon rice vegetation index (PRVI) for ratoon rice mapping at a 30 m spatial resolution using a robust time series generated from Harmonized Landsat and Sentinel-2 (HLS) images. The PRVI, which incorporates the red, near-infrared, and shortwave infrared 1 bands, was developed based on the analysis of spectro-phenological separability and feature selection. Based on actual field samples, the performance of the PRVI for ratoon rice mapping was carefully evaluated by comparing it to several vegetation indices, including the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and land surface water index (LSWI). The results suggested that the PRVI could sufficiently capture the specific characteristics of ratoon rice, leading to a favorable separability between ratoon rice and other land cover types. Furthermore, the PRVI showed the best performance for identifying ratoon rice in the phenological phases characterized by grain filling and harvesting to tillering of the ratoon crop (GHS-TS2), indicating that only several images are required to obtain an accurate ratoon rice map. Finally, the PRVI performed better than NDVI, EVI, LSWI, and their combination at the GHS-TS2 stages, with producer's accuracy and user's accuracy of 92.22% and 89.30%, respectively. These results demonstrate that the proposed PRVI based on HLS data can effectively identify ratoon rice in fragmented croplands at crucial phenological stages, which is promising for identifying the earliest timing of ratoon rice planting and can provide a fundamental dataset for crop management activities.
Funding: National Key Research and Development Program of China (No. 2016YFF0103604); National Natural Science Foundations of China (Nos. 61171165, 11431015, 61571230); National Scientific Equipment Developing Project of China (No. 2012YQ050250); Natural Science Foundation of Jiangsu Province, China (No. BK20161500).
Abstract: A comprehensive assessment of spatial-aware supervised learning algorithms for hyper-spectral image (HSI) classification was presented. For this purpose, standard support vector machines (SVMs), multinomial logistic regression (MLR), and sparse representation (SR) based supervised learning algorithms were compared both theoretically and experimentally. The performance of the discussed techniques was evaluated in terms of overall accuracy, average accuracy, kappa statistic coefficients, and sparsity of the solutions. Execution time, computational burden, and the capability of the methods were investigated using probabilistic analysis. To validate the accuracy, the classical benchmark AVIRIS Indian Pines data set was used. Experiments show that integrating spectral-spatial context can further improve the accuracy and reduce the misclassification error, although the computational time increases.
Funding: Supported by the EU H2020 Research and Innovation Program under the Marie Sklodowska-Curie Grant Agreement (Project-DEEP, Grant number: 101109045); National Key R&D Program of China with Grant number 2018YFB1800804; the National Natural Science Foundation of China (Nos. NSFC 61925105 and 62171257); Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute; the Fundamental Research Funds for the Central Universities, China (No. FRF-NP-20-03).
Abstract: The increasing dependence on data highlights the need for a detailed understanding of its behavior, encompassing the challenges involved in processing and evaluating it. However, current research lacks a comprehensive structure for measuring the worth of data elements, hindering effective navigation of the changing digital environment. This paper aims to fill this research gap by introducing the innovative concept of "data components." It proposes a graph-theoretic representation model that presents a clear mathematical definition and demonstrates the superiority of data components over traditional processing methods. Additionally, the paper introduces an information measurement model that provides a way to calculate the information entropy of data components and establish their increased informational value. The paper also assesses the value of information, suggesting a pricing mechanism based on its significance. In conclusion, this paper establishes a robust framework for understanding and quantifying the value of implicit information in data, laying the groundwork for future research and practical applications.
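As a small illustration of the entropy calculation mentioned above, the sketch below computes the Shannon entropy of a collection of discrete values. It shows only the standard entropy formula; the paper's own measurement model for "data components" is not described in the abstract and is not reproduced here. The function name and the sample data are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values) -> float:
    """Shannon entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    uniform_field = ["a", "b", "c", "d"]      # four equally likely symbols
    skewed_field = ["a", "a", "a", "b"]       # mostly one symbol
    print(shannon_entropy(uniform_field))
    print(shannon_entropy(skewed_field))
```

A field whose values are spread evenly carries more bits per record than a field dominated by one value, which is the kind of difference an entropy-based valuation would price.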