The increasing dependence on data highlights the need for a detailed understanding of its behavior, encompassing the challenges involved in processing and evaluating it. However, current research lacks a comprehensive structure for measuring the worth of data elements, hindering effective navigation of the changing digital environment. This paper aims to fill this research gap by introducing the innovative concept of "data components." It proposes a graph-theoretic representation model that presents a clear mathematical definition and demonstrates the superiority of data components over traditional processing methods. Additionally, the paper introduces an information measurement model that provides a way to calculate the information entropy of data components and establish their increased informational value. The paper also assesses the value of information, suggesting a pricing mechanism based on its significance. In conclusion, this paper establishes a robust framework for understanding and quantifying the value of implicit information in data, laying the groundwork for future research and practical applications.
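The abstract does not spell out how the entropy of a data component is computed; as a minimal sketch, assuming a component is a collection of discrete values and the measure is plain Shannon entropy over its empirical distribution:

```python
import math
from collections import Counter

def component_entropy(values):
    """Shannon entropy (in bits) of a data component, estimated from
    the empirical distribution of its observed values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A component whose values spread over more states carries more
# information under this measure.
print(component_entropy(["a", "a", "a", "b"]))  # ~0.811 bits
print(component_entropy(["a", "b", "c", "d"]))  # 2.0 bits
```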
Mobile networks possess significant information and thus are considered a gold mine for the research community. The call detail records (CDR) of a mobile network are used to identify the network's efficacy and the mobile users' behavior. It is evident from the recent literature that cyber-physical systems (CPS) have been used in the analytics and modeling of telecom data. In addition, CPS is used to provide valuable services in smart cities. In general, a typical telecom company has millions of subscribers and thus generates massive amounts of data. From this aspect, data storage, analysis, and processing are the key concerns. To solve these issues, herein we propose a multilevel cyber-physical social system (CPSS) for the analysis and modeling of large internet data. Our proposed multilevel system has three levels, and each level has a specific functionality. At the first level, raw CDR data were collected, and data preprocessing, cleaning, and error-removal operations were performed. At the second level, data reduction, integration, processing, and storage were performed, and the suggested internet activity record measures were applied. Our proposed system then constructs a graph and performs network analysis. The proposed CPSS thus accurately identifies the areas of internet peak usage in a city (Milan). Our research helps network operators plan effective network configuration, management, and optimization of resources.
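The graph construction step lends itself to a short sketch. Assuming (the paper does not specify the schema) that aggregated CDR activity is reduced to weighted links between spatial grid cells, peak-usage areas can be flagged by node strength:

```python
import networkx as nx

# Hypothetical CDR-derived records: (grid_cell_a, grid_cell_b, traffic_volume).
# Field names and the grid framing are assumptions for illustration.
records = [(1, 2, 120.0), (1, 3, 30.5), (2, 3, 75.2), (2, 4, 210.0)]

G = nx.Graph()
for a, b, volume in records:
    # Accumulate traffic between grid cells as weighted edges.
    if G.has_edge(a, b):
        G[a][b]["weight"] += volume
    else:
        G.add_edge(a, b, weight=volume)

# Rank cells by total incident traffic to flag peak-usage areas.
strength = {n: sum(d["weight"] for _, _, d in G.edges(n, data=True)) for n in G}
print(sorted(strength.items(), key=lambda kv: -kv[1]))
```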
A modified multiple-component scattering power decomposition for analyzing polarimetric synthetic aperture radar (PolSAR) data is proposed. The modified decomposition involves two distinct steps. Firstly, eigenvectors of the coherency matrix are used to modify the scattering models. Secondly, the entropy and anisotropy of targets are used to improve the volume scattering power. While guaranteeing high double-bounce scattering power in urban areas, the proposed algorithm effectively improves the volume scattering power of vegetated areas. The efficacy of the modified multiple-component scattering power decomposition is validated using actual AIRSAR PolSAR data. The scattering powers obtained by decomposing the original coherency matrix and the coherency matrix after orientation angle compensation are compared with those of three other algorithms. The experimental results demonstrate that the proposed decomposition yields more effective scattering powers for different PolSAR data sets.
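The entropy and anisotropy used in the second step are presumably the standard eigenvalue-based polarimetric quantities: for eigenvalues λ1 ≥ λ2 ≥ λ3 of the 3×3 coherency matrix,

```latex
p_i = \frac{\lambda_i}{\lambda_1 + \lambda_2 + \lambda_3}, \qquad
H = -\sum_{i=1}^{3} p_i \log_3 p_i, \qquad
A = \frac{\lambda_2 - \lambda_3}{\lambda_2 + \lambda_3}.
```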
This article introduces a novel variant of the generalized linear exponential (GLE) distribution, known as the sine generalized linear exponential (SGLE) distribution. The SGLE distribution utilizes the sine transformation to enhance its capabilities. The updated distribution is very adaptable and may be efficiently used in modeling survival data and reliability problems. The suggested model incorporates a hazard rate function (HRF) that may display a rising, J-shaped, or bathtub form, depending on its parameter values. This model includes many well-known lifespan distributions as sub-models. The suggested model is accompanied by a range of statistical properties. The model parameters are estimated using maximum likelihood and Bayesian estimation under progressively censored data. To evaluate the effectiveness of these techniques, we provide a set of simulated data for testing purposes. The relevance of the newly presented model is shown via two real-world dataset applications, highlighting its superiority over other well-regarded similar models.
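The sine transformation is not written out in the abstract; in the usual sine-G construction, which the SGLE name suggests, the new CDF is obtained from the baseline GLE CDF G(x) as

```latex
F_{\mathrm{SGLE}}(x) = \sin\!\Big(\frac{\pi}{2}\, G_{\mathrm{GLE}}(x)\Big), \qquad x > 0,
```

which is again a valid CDF because sin((π/2)u) increases monotonically from 0 to 1 as u runs over [0, 1].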
To solve the problems of restoring sedimentary facies and predicting reservoirs in loose gas-bearing sediments, a fourth-order isochronous stratigraphic framework was set up based on seismic sedimentologic analysis of the first 9-component S-wave 3D seismic dataset acquired in China, and the sedimentary facies and reservoirs of the Pleistocene Qigequan Formation in the Taidong area of the Qaidam Basin were then studied by seismic geomorphology and seismic lithology. The method and workflow are as follows. Firstly, techniques of phase rotation, frequency decomposition and fusion, and stratal slicing were applied to the 9-component S-wave seismic data to restore the sedimentary facies of major marker beds, based on sedimentary models reflected by satellite images. Then, techniques of seismic attribute extraction, principal component analysis, and random fitting were applied to calculate the reservoir thickness and physical parameters of a key sandbody; the results are satisfactory and were confirmed by blind-test wells. The results reveal that the dominant sedimentary facies in the Qigequan Formation within the study area are delta front and shallow lake. The RGB-fused slices indicate two cycles with three sets of underwater distributary channel systems in one period. Among them, the sandstones in the distributary channels of the middle-lower Qigequan Formation are thick and broad with superior physical properties, making them favorable reservoirs. Reservoir permeability is also affected by diagenesis. The distributary channel sandstone reservoirs extend farther to the west of the Sebei-1 gas field, which provides a basis for expanding exploration to the western peripheral area.
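A minimal sketch of the frequency decomposition and RGB fusion step, assuming the common implementation in which three frequency bands of a stratal slice are mapped to red, green, and blue channels; the band edges and sampling rate below are illustrative, not the paper's values:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(traces, low, high, fs):
    """Zero-phase band-pass along the time axis (axis 0)."""
    sos = butter(4, [low, high], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, traces, axis=0)

# data: (time, x, y) seismic volume; fs: sampling frequency in Hz.
def rgb_fused_slice(data, t_index, fs=500.0):
    bands = [(10, 20), (20, 35), (35, 60)]  # low/mid/high bands (assumed)
    channels = []
    for low, high in bands:
        amp = np.abs(bandpass(data, low, high, fs))[t_index]
        channels.append(amp / (amp.max() + 1e-12))  # normalize each band
    return np.stack(channels, axis=-1)  # (x, y, 3) RGB image
```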
Based on actual data collected from the tight sandstone development zone, correlation analysis using the Spearman method was conducted to determine the main factors influencing the gas production rate of tight sandstone fracturing. An integrated model combining geological engineering and numerical simulation of fracture propagation and production was completed. Based on data analysis, the hydraulic fracture parameters were optimized to develop a differentiated fracturing treatment adjustment plan. The results indicate that the influence of geological and engineering factors in the X1 and X2 development zones in the study area differs significantly. Therefore, it is challenging to adopt a uniform development strategy to achieve a rapid production increase. The data analysis reveals that the variation in gas production rate is primarily affected by reservoir thickness and permeability as geological factors. On the other hand, the amount of treatment fluid and proppant addition significantly impact the gas production rate as engineering factors. Among these factors, the influence of geological factors is more pronounced in block X1. Therefore, the main focus should be on further optimizing the fracturing interval and adjusting the geological development well locations; given the existing well locations, there is limited potential for further optimizing fracture parameters to increase production. For block X2, the fracturing parameters should be optimized. Data screening was conducted to identify outliers in the entire dataset, and a data-driven fracturing parameter optimization method was employed to determine the basic adjustment direction for reservoir stimulation in the target block. This approach provides insights into the influence of geological, stimulation, and completion parameters on gas production rate. Consequently, the subsequent fracturing parameter optimization design can significantly reduce the modeling and simulation workload and guide field operations to improve and optimize hydraulic fracturing efficiency.
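As an illustration of the screening step, a minimal sketch with scipy's Spearman rank correlation; the factor names and values are fabricated stand-ins, not field data:

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical well-level table: each column is a candidate factor,
# gas_rate is the response. Values are illustrative only.
rng = np.random.default_rng(0)
thickness = rng.uniform(5, 30, 50)          # m
permeability = rng.uniform(0.01, 1.0, 50)   # mD
fluid_volume = rng.uniform(500, 3000, 50)   # m^3
gas_rate = (0.6 * thickness + 20 * permeability
            + 0.002 * fluid_volume + rng.normal(0, 2, 50))

for name, factor in [("thickness", thickness),
                     ("permeability", permeability),
                     ("fluid volume", fluid_volume)]:
    rho, p = spearmanr(factor, gas_rate)
    print(f"{name:12s} rho={rho:+.2f} p={p:.3g}")
```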
Every day, an NDT (non-destructive testing) report will govern key decisions and inform inspection strategies that could affect the flow of millions of dollars, which ultimately affects local environments and potential risk to life. There is a direct correlation between report quality and equipment capability: the more capable the equipment is, in terms of efficient data gathering, signal-to-noise ratio, positioning, and coverage, the more actionable the report is. This results in optimal maintenance and repair strategies, provided the report is clear and well presented. Furthermore, when considering storage tank floor inspection, it is essential that asset owners have total confidence in inspection findings and the ensuing reports. Tank floor inspection equipment must not only be efficient and highly capable, but data sets should be traceable and their integrity maintained throughout. Corrosion mapping of large surface areas such as storage tank bottoms is an inherently arduous and time-consuming process. MFL (magnetic flux leakage) tank bottom scanners present a well-established and highly rated method of inspection. There are many benefits of using modern MFL technology to generate actionable reports, chief among them efficiency of coverage while gaining valuable information regarding defect location, severity, surface origin, and extent. More recent advancements in modern MFL tank bottom scanners afford the ability to scan and record data sets in areas of the tank bottom that were previously classed as dead zones, i.e., areas not scanned due to physical restraints. An example is scanning the CZ (critical zone), the area close to the annular-to-shell junction weld. Inclusion of these former dead zones increases overall inspection coverage, quality, and traceability, and inspection of the CZ allows engineers to quickly determine the integrity of arguably the most important area of the tank bottom. Herein we discuss notable developments in CZ coverage, inspection efficiency, and data integrity that combine to deliver an actionable report, which the asset owner can interrogate to develop pertinent and accurate maintenance and repair strategies.
Many fields, such as neuroscience, are experiencing a vast proliferation of cellular data, underscoring the need for organizing and interpreting large datasets. A popular approach partitions data into manageable subsets via hierarchical clustering, but objective methods to determine the appropriate classification granularity are missing. We recently introduced a technique to systematically identify when to stop subdividing clusters based on the fundamental principle that cells must differ more between than within clusters. Here we present the corresponding protocol to classify cellular datasets by combining data-driven unsupervised hierarchical clustering with statistical testing. These general-purpose functions are applicable to any cellular dataset that can be organized as two-dimensional matrices of numerical values, including molecular, physiological, and anatomical datasets. We demonstrate the protocol using cellular data from the Janelia MouseLight project to characterize morphological aspects of neurons.
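A minimal sketch of the core idea, combining hierarchical clustering with a between-versus-within distance test; the Mann-Whitney test here is a simplified stand-in for the paper's statistical criterion:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform
from scipy.stats import mannwhitneyu

# data: rows = cells, columns = numerical features (any cellular dataset).
def split_is_justified(data, alpha=0.05):
    """Test whether splitting `data` into two clusters is warranted:
    between-cluster distances must significantly exceed within-cluster
    distances; otherwise stop subdividing."""
    labels = fcluster(linkage(data, method="ward"), t=2, criterion="maxclust")
    dist = squareform(pdist(data))
    i, j = np.triu_indices(len(data), k=1)
    between = dist[i, j][labels[i] != labels[j]]
    within = dist[i, j][labels[i] == labels[j]]
    _, p = mannwhitneyu(between, within, alternative="greater")
    return p < alpha, labels

rng = np.random.default_rng(1)
two_groups = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(4, 1, (30, 5))])
print(split_is_justified(two_groups)[0])  # True: the clusters genuinely differ
```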
The limited amount of data in the healthcare domain and the necessity of training samples for increased performance of deep learning models is a recurrent challenge, especially in medical imaging. Newborn Solutions aims to enhance its non-invasive white blood cell counting device, Neosonics, by creating synthetic in vitro ultrasound images to facilitate a more efficient image generation process. This study addresses the data scarcity issue by designing and evaluating a continuous scalar conditional generative adversarial network (GAN) to augment in vitro peritoneal dialysis ultrasound images, increasing both the volume and variability of training samples. The developed GAN architecture incorporates novel design features: varying kernel sizes in the generator's transposed convolutional layers and a latent intermediate space, projecting noise and condition values for enhanced image resolution and specificity. The experimental results show that the GAN successfully generated diverse images of high visual quality, closely resembling real ultrasound samples. While the visual results were promising, GAN-based data augmentation did not consistently improve the performance of an image regressor in distinguishing features specific to varied white blood cell concentrations. Ultimately, while this continuous scalar conditional GAN model made strides in generating realistic images, further work is needed to achieve consistent gains in regression tasks, aiming for robust model generalization.
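A compact PyTorch sketch of the conditioning mechanism described (noise and a continuous scalar condition projected into a shared latent intermediate space, then upsampled by transposed convolutions of varying kernel sizes); all layer sizes and names are illustrative assumptions, not the published architecture:

```python
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    """Sketch: project (noise, scalar condition) into a latent map,
    then upsample with transposed convolutions of varying kernel sizes."""
    def __init__(self, z_dim=64):
        super().__init__()
        self.project = nn.Linear(z_dim + 1, 128 * 8 * 8)  # latent intermediate space
        self.upsample = nn.Sequential(
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=6, stride=2, padding=2),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, z, concentration):
        # concentration: (batch, 1) continuous scalar condition (scaled WBC count)
        h = self.project(torch.cat([z, concentration], dim=1))
        return self.upsample(h.view(-1, 128, 8, 8))

g = CondGenerator()
img = g(torch.randn(2, 64), torch.tensor([[0.3], [0.8]]))
print(img.shape)  # torch.Size([2, 1, 64, 64])
```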
The Internet of things (IoT) is a wireless network designed to perform specific tasks and plays a crucial role in various fields such as environmental monitoring, surveillance, and healthcare. To address the limitations imposed by inadequate resources, energy, and network scalability, this type of network relies heavily on data aggregation and clustering algorithms. Although various conventional studies have aimed to enhance the lifespan of a network through robust systems, they do not always provide optimal efficiency for real-time applications. This paper presents an approach based on state-of-the-art machine-learning methods. In this study, we employed a novel approach that combines an extended version of principal component analysis (PCA) and a reinforcement learning algorithm to achieve efficient clustering and data reduction. The primary objectives of this study are to enhance the service life of a network, reduce energy usage, and improve data aggregation efficiency. We evaluated the proposed methodology using data collected from sensors deployed in agricultural fields for crop monitoring. Our proposed approach (PQL) was compared to previous studies that utilized adaptive Q-learning (AQL) and regional energy-aware clustering (REAC). Our approach outperformed both in terms of network longevity and energy consumption, and established a fault-tolerant network.
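A toy sketch of the two-stage idea, PCA for data reduction followed by a Q-learning loop for cluster-head selection; the reward model and constants are assumptions for illustration, not the paper's PQL formulation:

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical sensor readings: rows = nodes, columns = measurements.
rng = np.random.default_rng(2)
readings = rng.normal(size=(20, 8))

# Step 1 (data reduction): keep the principal components that explain
# most variance, shrinking what each node must transmit.
reduced = PCA(n_components=2).fit_transform(readings)

# Step 2 (bandit-style Q-learning): learn which node to use as cluster
# head, rewarding central placement and remaining battery energy.
energy = np.ones(20)
q = np.zeros(20)
alpha, eps = 0.1, 0.2
for step in range(500):
    head = rng.integers(20) if rng.random() < eps else int(np.argmax(q))
    dist = np.linalg.norm(reduced - reduced[head], axis=1).sum()
    energy[head] -= 0.001 * dist              # heads pay a relaying cost
    reward = energy[head] - 0.01 * dist       # prefer central, healthy nodes
    q[head] += alpha * (reward - q[head])
print("learned cluster head:", int(np.argmax(q)))
```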
There is a growing body of clinical research on the utility of synthetic data derivatives, an emerging research tool in medicine. In nephrology, clinicians can use machine learning and artificial intelligence as powerful aids in their clinical decision-making while also preserving patient privacy. This is especially important given the epidemiology of chronic kidney disease, renal oncology, and hypertension worldwide. However, there remains a need to create a framework for guidance regarding how to better utilize synthetic data as a practical application in this research.
Irregular seismic data causes problems with multi-trace processing algorithms and degrades processing quality. We introduce the Projection onto Convex Sets (POCS) based image restoration method into the seismic data reconstruction field to interpolate irregularly missing traces. For entirely dead traces, we transfer the POCS iterative reconstruction process from the time domain to the frequency domain to save computational cost, because forward and inverse Fourier time transforms are not needed. In each iteration, the selection of the threshold parameter is important for reconstruction efficiency. In this paper, we designed two types of threshold models to reconstruct irregularly missing seismic data. The experimental results show that, for the same reconstruction result, an exponential threshold can greatly reduce the number of iterations and improve reconstruction efficiency compared to a linear threshold. We also analyze the anti-noise and anti-alias ability of the POCS reconstruction method. Finally, theoretical model tests and real data examples indicate that the proposed method is efficient and applicable.
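A minimal sketch of the POCS iteration with the exponential threshold schedule the abstract favors; for brevity this version thresholds in the 2D Fourier domain rather than handling dead traces trace-by-trace in the frequency domain as the paper describes:

```python
import numpy as np

def pocs_reconstruct(data, mask, n_iter=50, p_max=None, p_min=1e-3):
    """POCS interpolation of irregularly missing traces.
    data: 2D section (time x traces) with zeros at missing traces.
    mask: 1D boolean array over traces, True where traces were recorded."""
    if p_max is None:
        p_max = np.abs(np.fft.fft2(data)).max()
    rec = data.copy()
    for k in range(n_iter):
        # Exponential threshold decay from p_max down to p_min.
        tau = p_max * (p_min / p_max) ** (k / (n_iter - 1))
        spec = np.fft.fft2(rec)
        spec[np.abs(spec) < tau] = 0.0          # project onto the sparse set
        rec = np.real(np.fft.ifft2(spec))
        rec[:, mask] = data[:, mask]            # re-insert observed traces
    return rec
```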
JCOMM has a strategy to establish the network of WMO-IOC Centres for Marine-Meteorological and Oceanographic Climate Data (CMOCs) under the new Marine Climate Data System (MCDS) in 2012, to improve the quality and timeliness of the marine-meteorological and oceanographic data, metadata, and products available to end users. China's candidate centre, CMOC China, was approved to run on a trial basis after the 4th Meeting of the Joint IOC/WMO Technical Commission for Oceanography and Marine Meteorology (JCOMM). This article states the development intentions of CMOC China for the next few years through a brief introduction to its critical marine data, products, and service systems and to cooperation projects around the world.
The absence of low-frequency information in seismic data is one of the most difficult problems in elastic full waveform inversion. Without low-frequency data, it is difficult to recover the long-wavelength components of subsurface models, and the inversion converges to local minima. To solve this problem, the elastic envelope inversion method is introduced. Based on the elastic envelope operator, which is capable of retrieving low-frequency signals hidden in multicomponent data, the proposed method uses the envelope of multicomponent seismic signals to construct a misfit function and then recover the long-wavelength components of the subsurface model. Numerical tests verify that the elastic envelope method reduces the inversion nonlinearity and provides better starting models for the subsequent conventional elastic full waveform inversion and elastic depth migration, even when low frequencies are missing in the multicomponent data and the starting model is far from the true model. Numerical tests also suggest that the proposed method is more effective in reconstructing the long-wavelength components of the S-wave velocity model. The inversion of synthetic data based on the Marmousi-2 model shows that the resolution of conventional elastic full waveform inversion improves after using the starting model obtained with the elastic envelope method. Finally, the limitations of the elastic envelope inversion method are discussed.
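The envelope operator is not defined in the abstract; assuming it builds on the standard instantaneous-amplitude envelope e(t) = |s(t) + iH[s](t)|, with H the Hilbert transform, a minimal sketch:

```python
import numpy as np
from scipy.signal import hilbert

def envelope(trace):
    """Instantaneous-amplitude envelope of a seismic trace:
    e(t) = |s(t) + i*H[s](t)|, which retains low-frequency content
    even when the trace itself lacks it."""
    return np.abs(hilbert(trace))

t = np.linspace(0, 1, 1000)
s = np.sin(2 * np.pi * 30 * t) * np.exp(-((t - 0.5) ** 2) / 0.005)
env = envelope(s)  # smooth, low-frequency-rich envelope of a 30 Hz wavelet
```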
A new algorithm for clustering multiple data streams is proposed. The algorithm can effectively cluster data streams that show similar behavior with some unknown time delays. The algorithm uses the autoregressive (AR) modeling technique to measure correlations between data streams. It exploits the estimated frequency spectra to extract the essential features of streams. Each stream is represented as the sum of spectral components, and the correlation is measured component-wise. Each spectral component is described by four parameters, namely, amplitude, phase, damping rate, and frequency. The ε-lag correlation between two spectral components is calculated, and the algorithm uses this information as the similarity measure in clustering data streams. Based on a sliding window model, the algorithm can continuously report the most recent clustering results and adjust the number of clusters. Experiments on real and synthetic streams show that the proposed clustering method has higher speed and clustering quality than other similar methods.
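A minimal sketch of the AR-spectrum feature extraction step (the ε-lag correlation itself is omitted); yule_walker fits the AR coefficients, from which the power spectrum and its dominant components follow:

```python
import numpy as np
from statsmodels.regression.linear_model import yule_walker

def ar_spectrum(stream, order=8, n_freq=256):
    """Fit an AR(p) model to a stream window and return its power spectrum,
    from which dominant spectral components (amplitude, frequency) can be
    extracted as clustering features."""
    ar, sigma = yule_walker(stream, order=order)
    w = np.linspace(0, np.pi, n_freq)
    denom = np.abs(1 - sum(ar[k] * np.exp(-1j * w * (k + 1))
                           for k in range(order)))
    return w, (sigma ** 2) / denom ** 2

rng = np.random.default_rng(3)
t = np.arange(512)
x = np.sin(0.3 * t) + 0.5 * rng.normal(size=512)
w, p = ar_spectrum(x)
print("dominant frequency (rad/sample):", w[int(np.argmax(p))])  # ~0.3
```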
In multi-component seismic exploration, the horizontal and vertical components both contain P- and SV-waves. The P- and SV-wavefields in a seismic record can be separated by their horizontal and vertical displacements when the upgoing P- and SV-waves arrive at the sea floor. If the sea-floor P-wave velocity, S-wave velocity, and density are known, the separation can be achieved in the τ-p domain, and the separated wavefields are then transformed back to the time domain. A method of separating P- and SV-wavefields is presented in this paper and used to effectively separate P- and SV-wavefields in synthetic and real data. The application to real data shows that this method is feasible and effective. It can also be used for free-surface data.
A large number of autonomous profiling floats deployed in the global oceans have provided abundant temperature and salinity profiles of the upper ocean. Many floats happen to record profiles during the passage of tropical cyclones. These in-situ observations are valuable and useful in studying the ocean's response to tropical cyclones, which is rarely observed due to harsh weather conditions. In this paper, the upper ocean response to tropical cyclones in the northwestern Pacific during 2000–2005 is analyzed and discussed based on data from Argo profiling floats. The results suggest that the passage of tropical cyclones caused a deepening of the mixed layer depth (MLD), cooling of the mixed layer temperature (MLT), and freshening of the mixed layer salinity (MLS). The change in MLT is negatively correlated with wind speed. The cooling of the MLT extended for 50–150 km on the right side of the cyclone track. The change in MLS is almost symmetrically distributed on both sides of the track, and the change in MLD is negatively correlated with the pre-cyclone initial MLD.
Seismic data structure characteristics refer to the waveform character arranged in time sequence at discrete data points in each 2-D or 3-D seismic trace. Hydrocarbon prediction using seismic data structure characteristics is a new reservoir prediction technique. When the main pay interval lies in carbonate fracture and fissure-cavern type reservoirs with very strong inhomogeneity, hydrocarbon prediction faces some difficulties. Because of the special geological conditions of the eighth zone of the Tahe oil field, we applied seismic data structure characteristics to hydrocarbon prediction for the Ordovician reservoir in this zone and divided the oil zone of the area into favorable and unfavorable blocks. Eighteen well locations were proposed in the favorable oil block and drilled, and they recovered high output of oil and gas.
In industrial process settings, principal component analysis (PCA) is a general method for data reconciliation. However, PCA is sometimes unsuited to nonlinear feature analysis and thus limited in its application to nonlinear industrial processes. Kernel PCA (KPCA) is an extension of PCA that can be used for nonlinear feature analysis. A nonlinear data reconciliation method based on KPCA is proposed. The basic idea of this method is that the original data are first mapped to a high-dimensional feature space by a nonlinear function, and PCA is implemented in that feature space. Nonlinear feature analysis is then performed, and the data are reconstructed using the kernel. The KPCA-based data reconciliation method is applied to a ternary distillation column. Simulation results show that this method can filter the noise in measurements of a nonlinear process, and the reconciled data represent the true information of the nonlinear process.
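A minimal sketch of the reconcile-by-reconstruction idea using scikit-learn's KernelPCA, whose inverse_transform computes an approximate pre-image in input space; the surrogate process data are fabricated for illustration:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Hypothetical noisy process measurements: rows = samples, columns = variables
# (stand-ins for distillation column flows/compositions; values illustrative).
rng = np.random.default_rng(4)
t = rng.uniform(0, 1, (200, 1))
clean = np.hstack([t, t ** 2, np.sin(3 * t)])   # nonlinear process surrogate
noisy = clean + rng.normal(0, 0.05, clean.shape)

# Map to feature space with an RBF kernel, do PCA there, then reconstruct
# a pre-image in input space: the reconstruction is the reconciled data.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=2.0,
                 fit_inverse_transform=True, alpha=1e-3)
reconciled = kpca.inverse_transform(kpca.fit_transform(noisy))

print("noise RMS before:", np.sqrt(((noisy - clean) ** 2).mean()))
print("noise RMS after: ", np.sqrt(((reconciled - clean) ** 2).mean()))
```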