Due to the restricted satellite payloads in LEO mega-constellation networks (LMCNs), remote sensing image analysis, online learning, and other big data services call for onboard distributed processing (OBDP). In existing technologies, the efficiency of big data applications (BDAs) in distributed systems hinges on stable, low-latency links between worker nodes. However, LMCNs with highly dynamic nodes and long-distance links cannot provide these conditions, which makes the performance of OBDP hard to measure directly. To bridge this gap, a multidimensional simulation platform is indispensable: one that can simulate the network environment of LMCNs and run BDAs inside it for performance testing. Using STK's APIs and a parallel computing framework, we achieve real-time simulation of thousands of satellite nodes, which are mapped to application nodes through software defined networking (SDN) and container technologies. We elaborate the architecture and mechanism of the simulation platform, and take Starlink and Hadoop as realistic examples for simulations. The results indicate that LMCNs have dynamic end-to-end latency which fluctuates periodically with the constellation movement. Compared to ground data center networks (GDCNs), LMCNs degrade computing and storage job throughput, which can be alleviated by the use of erasure codes and data flow scheduling among worker nodes.
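As a back-of-the-envelope illustration of why end-to-end latency in an LMCN fluctuates with constellation geometry, the sketch below sums free-space propagation delays along a chain of inter-satellite links; the hop distances are hypothetical placeholders, not real Starlink geometry or part of the paper's platform.

```python
# Minimal sketch: one-way propagation delay over a chain of inter-satellite links.
# The hop distances below are made-up placeholders, not measured Starlink geometry.
C = 299_792_458.0  # speed of light in vacuum, m/s

def path_latency_ms(hop_distances_km):
    """One-way propagation delay (ms) over a sequence of link hops given in km."""
    return sum(d * 1e3 / C for d in hop_distances_km) * 1e3

# Example: ground uplink, three ISL hops, downlink, at two constellation snapshots.
snapshot_a = [550, 1940, 1940, 1940, 550]   # hypothetical geometry at time A
snapshot_b = [620, 2300, 2300, 2300, 620]   # hypothetical geometry after the satellites drift
print(f"snapshot A: {path_latency_ms(snapshot_a):.2f} ms one-way")
print(f"snapshot B: {path_latency_ms(snapshot_b):.2f} ms one-way")
```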
The convergence of Internet of Things (IoT), 5G, and cloud collaboration offers tailored solutions to the rigorous demands of multi-flow integrated energy aggregation dispatch data processing. While generative adversarial networks (GANs) are instrumental in resource scheduling, their application in this domain is impeded by challenges such as slow convergence, inferior optimality-searching capability, and the inability to learn from feedback on failed decisions. Therefore, a cloud-edge collaborative federated GAN-based communication and computing resource scheduling algorithm with long-term constraint violation sensitiveness is proposed to address these challenges. The proposed algorithm facilitates real-time, energy-efficient data processing by optimizing transmission power control, data migration, and computing resource allocation. It employs federated learning for global parameter aggregation to enhance GAN parameter updating, and dynamically adjusts GAN learning rates and global aggregation weights based on energy consumption constraint violations. Simulation results indicate that the proposed algorithm effectively reduces data processing latency, energy consumption, and convergence time.
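One common way to make a learner sensitive to long-term constraint violations is a virtual-queue (Lyapunov-style) backlog that grows whenever the per-slot energy budget is exceeded and then damps the update step. The paper does not publish its exact update rule, so the scaling law and numbers below are illustrative assumptions, not the proposed algorithm.

```python
# Illustrative sketch (not the paper's algorithm): track energy-budget violations with a
# virtual queue and shrink the GAN learning rate as the accumulated backlog grows.
def update_virtual_queue(q, energy_used, energy_budget):
    """Lyapunov-style backlog: increases when the per-slot energy budget is exceeded."""
    return max(q + energy_used - energy_budget, 0.0)

def adjusted_learning_rate(base_lr, q, sensitivity=0.05):
    """Assumed scaling: larger accumulated violation -> more conservative updates."""
    return base_lr / (1.0 + sensitivity * q)

q, base_lr = 0.0, 1e-3
for slot, energy_used in enumerate([1.2, 0.8, 1.5, 1.1, 0.9]):   # toy per-slot consumption (J)
    q = update_virtual_queue(q, energy_used, energy_budget=1.0)
    lr = adjusted_learning_rate(base_lr, q)
    print(f"slot {slot}: backlog={q:.2f} J, learning rate={lr:.2e}")
```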
Attitude is one of the crucial parameters for space objects and plays a vital role in collision prediction and debris removal. Analyzing light curves to determine attitude is the most commonly used method. In photometric observations, outliers may exist in the obtained light curves for various reasons, so preprocessing is required to remove them and obtain high-quality light curves. Through statistical analysis, the causes of outliers can be categorized into two main types: first, the brightness of the object significantly increases due to the passage of a nearby star, referred to as "stellar contamination," and second, the brightness markedly decreases due to cloud cover, referred to as "cloudy contamination." The traditional approach of manually inspecting images for contamination is time-consuming and labor-intensive, so we propose machine learning methods as a substitute. Convolutional neural networks and SVMs are employed to identify cases of stellar contamination and cloudy contamination, achieving F1 scores of 1.00 and 0.98 on a test set, respectively. We also explore other machine learning methods such as ResNet-18 and Light Gradient Boosting Machine, and conduct comparative analyses of the results.
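A minimal sketch of the SVM branch of this kind of pipeline is shown below: classify light-curve segments as clean or contaminated from simple brightness statistics. The features and the synthetic data are assumptions made for illustration, not the paper's actual feature set or observations.

```python
# Minimal sketch: SVM classification of light-curve segments as clean vs. contaminated.
# Feature choice (segment brightness statistics) and data are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

def features(curve):
    """Per-segment summary statistics of a light curve."""
    return [curve.mean(), curve.std(), curve.max() - curve.min()]

clean = [rng.normal(10.0, 0.05, 200) for _ in range(300)]
cloudy = [rng.normal(10.0, 0.05, 200) + np.where(rng.random(200) < 0.1, -1.5, 0.0)
          for _ in range(300)]                      # random brightness dips from cloud cover
X = np.array([features(c) for c in clean + cloudy])
y = np.array([0] * 300 + [1] * 300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("F1 on toy data:", f1_score(y_te, clf.predict(X_te)))
```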
To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (Simple Protocol and RDF Query Language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. Minimal connectable units are joined according to the semantic queries to produce semantically correct query plans. Algorithms for query rewriting and transforming are presented, and their computational complexity is discussed. In the worst case, the query decomposing algorithm finishes in O(n²) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments; the results show that when the length of a query is less than 8, the query processing algorithms provide satisfactory performance.
A novel method for noise removal for the rotating accelerometer gravity gradiometer (MAGG) is presented. It introduces a head-to-tail data expansion technique based on the zero-phase filtering principle. A scheme for determining band-pass filter parameters based on signal-to-noise ratio gain, smoothness index, and cross-correlation coefficient is designed using the Chebyshev optimal consistent approximation theory. Additionally, a wavelet denoising evaluation function is constructed, with the dmey wavelet basis function identified as the most effective for processing gravity gradient data. The results of hardware-in-the-loop simulation and prototype experiments show that, compared with other commonly used methods, the proposed processing method achieves a 14% improvement in the measurement variance of the gravity gradient signals, with the measurement accuracy reaching within 4 E. This verifies that the proposed method effectively removes noise from the gradient signals, improves gravity gradiometry accuracy, and offers technical insight for high-precision airborne gravity gradiometry.
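A simplified illustration of the two ingredients named above, head-to-tail expansion and zero-phase band-pass filtering, is sketched below. The band edges, filter order, expansion length, and the synthetic signal are placeholders, not the paper's Chebyshev-derived parameters.

```python
# Simplified sketch (placeholder band edges and signal): extend the record head-to-tail,
# apply a zero-phase band-pass filter, then cut the expansion off again.
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 200.0                                   # assumed sampling rate, Hz
t = np.arange(0, 60, 1 / fs)
signal = np.sin(2 * np.pi * 0.25 * t)        # toy "gradient" tone inside the pass band
noisy = signal + 0.5 * np.random.default_rng(1).normal(size=t.size)

pad = 2000                                   # head-to-tail expansion length (samples)
extended = np.concatenate([noisy[pad:0:-1], noisy, noisy[-2:-pad - 2:-1]])  # mirror both ends

sos = butter(4, [0.05, 1.0], btype="bandpass", fs=fs, output="sos")  # placeholder 0.05-1 Hz band
filtered = sosfiltfilt(sos, extended)[pad:pad + noisy.size]          # zero-phase, trim the padding

print("residual std after filtering:", np.std(filtered - signal))
```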
In order to obtain high-precision GPS control point results and provide high-precision known points for various projects, this study uses several mature GPS post-processing software packages to process the observation data of the GPS control network of Guanyinge Reservoir, and compares the results obtained by the different packages. According to the test results, the reasons for the accuracy differences between the packages are analyzed, and the optimal results are obtained through analysis and comparison. The purpose of this paper is to provide a useful reference for GPS software users when processing data.
In this study, we delve into the realm of efficient Big Data Engineering and Extract, Transform, Load (ETL) processes within the healthcare sector, leveraging the robust foundation provided by the MIMIC-III Clinical Database. Our investigation entails a comprehensive exploration of various methodologies aimed at enhancing the efficiency of ETL processes, with a primary emphasis on optimizing time and resource utilization. Through meticulous experimentation utilizing a representative dataset, we shed light on the advantages associated with the incorporation of PySpark and Docker containerized applications. Our research illuminates significant advancements in time efficiency, process streamlining, and resource optimization attained through the utilization of PySpark for distributed computing within Big Data Engineering workflows. Additionally, we underscore the strategic integration of Docker containers, delineating their pivotal role in augmenting scalability and reproducibility within the ETL pipeline. This paper encapsulates the pivotal insights gleaned from our experimental journey, accentuating the practical implications and benefits entailed in the adoption of PySpark and Docker. By streamlining Big Data Engineering and ETL processes in the context of clinical big data, our study contributes to the ongoing discourse on optimizing data processing efficiency in healthcare applications. The source code is available on request.
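A minimal PySpark ETL sketch in the spirit described above is shown here. The file paths and column names are hypothetical placeholders, not the study's actual MIMIC-III pipeline or schema.

```python
# Minimal PySpark ETL sketch: extract a raw CSV export, clean and derive a feature,
# and load a partitioned Parquet copy. Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clinical-etl-sketch").getOrCreate()

# Extract: read a raw CSV export (schema inferred for brevity).
admissions = spark.read.csv("data/admissions.csv", header=True, inferSchema=True)

# Transform: drop incomplete rows and compute length of stay per admission.
cleaned = (admissions
           .dropna(subset=["admit_time", "discharge_time"])
           .withColumn("los_days",
                       F.datediff(F.to_date("discharge_time"), F.to_date("admit_time"))))

# Load: write a columnar, partitioned copy for downstream analytics.
cleaned.write.mode("overwrite").partitionBy("admission_type").parquet("out/admissions_clean")

spark.stop()
```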
Gravitational wave detection is one of the most cutting-edge research areas in modern physics, with its success relying on advanced data analysis and signal processing techniques. This study provides a comprehensive review of data analysis methods and signal processing techniques in gravitational wave detection. The research begins by introducing the characteristics of gravitational wave signals and the challenges faced in their detection, such as extremely low signal-to-noise ratios and complex noise backgrounds. It then systematically analyzes the application of time-frequency analysis methods in extracting transient gravitational wave signals, including wavelet transforms and Hilbert-Huang transforms. The study focuses on discussing the crucial role of matched filtering techniques in improving signal detection sensitivity and explores strategies for template bank optimization. Additionally, the research evaluates the potential of machine learning algorithms, especially deep learning networks, in rapidly identifying and classifying gravitational wave events. The study also analyzes the application of Bayesian inference methods in parameter estimation and model selection, as well as their advantages in handling uncertainties. However, the research also points out the challenges faced by current technologies, such as dealing with non-Gaussian noise and improving computational efficiency. To address these issues, the study proposes a hybrid analysis framework combining physical models and data-driven methods. Finally, the research looks ahead to the potential applications of quantum computing in future gravitational wave data analysis. This study provides a comprehensive theoretical foundation for the optimization and innovation of gravitational wave data analysis methods, contributing to the advancement of gravitational wave astronomy.
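The matched-filtering idea discussed above reduces, for white noise, to sliding a known template along the data and looking for the correlation peak. The toy example below uses that white-noise simplification and an invented chirp-like template; real pipelines whiten the data against the detector noise spectrum and work in the frequency domain.

```python
# Toy matched-filter sketch: slide a known waveform template along noisy data and
# find the peak of the correlation. White noise is assumed to keep the example short.
import numpy as np

rng = np.random.default_rng(7)
fs = 4096                                     # samples per second (assumed)
t = np.arange(0, 1.0, 1 / fs)                 # 1 s of toy "strain" data
tt = np.arange(0, 0.25, 1 / fs)               # 0.25 s template

# Chirp-like template: frequency sweeps upward under a Gaussian envelope.
template = np.sin(2 * np.pi * (50 + 300 * tt) * tt) * np.exp(-((tt - 0.125) / 0.06) ** 2)
template /= np.linalg.norm(template)

data = rng.normal(0.0, 1.0, t.size)           # white-noise stand-in for detector output
inject_at = 1500
data[inject_at:inject_at + template.size] += 8.0 * template   # buried signal

# Matched filtering against white noise reduces to a sliding inner product.
stat = np.correlate(data, template, mode="valid")
print("injected at sample", inject_at, "- recovered at", int(np.argmax(stat)))
```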
The High Precision Magnetometer (HPM) on board the China Seismo-Electromagnetic Satellite (CSES) allows highly accurate measurement of the geomagnetic field; it includes FGM (Fluxgate Magnetometer) and CDSM (Coupled Dark State Magnetometer) probes. This article introduces the main processing method, algorithm, and processing procedure of the HPM data. First, the FGM and CDSM probes are calibrated according to ground sensor data. Then the FGM linear parameters can be corrected in orbit, by applying the absolute vector magnetic field correction algorithm from CDSM data. At the same time, the magnetic interference of the satellite is eliminated according to ground-satellite magnetic test results. Finally, according to the characteristics of the magnetic field direction in the low latitude region, the transformation matrix between FGM probe and star sensor is calibrated in orbit to determine the correct direction of the magnetic field. Comparing the magnetic field data of CSES and SWARM satellites in five continuous geomagnetic quiet days, the difference in measurements of the vector magnetic field is about 10 nT, which is within the uncertainty interval of geomagnetic disturbance.
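The vector-against-scalar correction step can be illustrated, in very reduced form, as a least-squares fit of fluxgate scale factors and offsets so that the magnitude of the corrected vector matches the absolute scalar reading. The 6-parameter diagonal model and the synthetic data below are simplifying assumptions; the real in-orbit procedure also handles misalignment, temperature effects, and spacecraft interference.

```python
# Toy sketch of scalar calibration: fit fluxgate scale factors and offsets so that
# |corrected vector| matches the absolute (CDSM-like) scalar reading. A 6-parameter
# diagonal model and synthetic data are assumptions made for illustration.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
n = 500
true_scale = np.array([1.02, 0.98, 1.01])
true_offset = np.array([15.0, -8.0, 4.0])                              # nT

field = rng.normal(0, 20000, (n, 3)) + np.array([20000, 0, 30000])     # synthetic ambient field, nT
raw = field / true_scale + true_offset                                 # uncalibrated fluxgate output
scalar = np.linalg.norm(field, axis=1)                                 # absolute scalar readings

def residual(p):
    scale, offset = p[:3], p[3:]
    corrected = (raw - offset) * scale
    return np.linalg.norm(corrected, axis=1) - scalar

fit = least_squares(residual, x0=np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]))
print("recovered scale ", fit.x[:3])
print("recovered offset", fit.x[3:])
```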
The data processing mode is vital to the performance of an entire coalmine gas early-warning system, especially its real-time performance. Our objective was to present the structural features of coalmine gas data, so that the data could be processed at different priority levels in C language. Two different data processing models, one with priority and the other without priority, were built based on queuing theory. Their theoretical formulas were derived via an M/M/1 model in order to calculate the average occupation time of each measuring point in an early-warning program. We validated the model with the gas early-warning system of the Huaibei Coalmine Group Corp. The results indicate that the average occupation time for gas data processing using the queuing model with priority is nearly 1/30 of that of the model without priority.
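The standard M/M/1 waiting-time formulas behind this kind of comparison are worked through below, contrasting a plain first-come-first-served queue with a two-class non-preemptive priority queue in which gas data is the high-priority class. The arrival and service rates are illustrative, not the mine's measured values.

```python
# Worked comparison (illustrative rates): mean waiting time in a plain M/M/1 queue
# versus a two-class non-preemptive priority M/M/1 where gas data has high priority.
lam_gas, lam_other, mu = 6.0, 4.0, 12.0       # arrival rates and service rate, jobs/second
lam = lam_gas + lam_other
rho_gas, rho_other = lam_gas / mu, lam_other / mu

# Plain M/M/1 (no priority): every job waits W = lambda / (mu * (mu - lambda)) on average.
w_fcfs = lam / (mu * (mu - lam))

# Non-preemptive priority: mean residual service time R = sum(lam_i * E[S^2]) / 2,
# with E[S^2] = 2 / mu^2 for exponential service.
R = lam * (2 / mu**2) / 2
w_high = R / (1 - rho_gas)
w_low = R / ((1 - rho_gas) * (1 - rho_gas - rho_other))

print(f"FCFS wait            : {w_fcfs * 1e3:.1f} ms")
print(f"high-priority (gas)  : {w_high * 1e3:.1f} ms")
print(f"low-priority (other) : {w_low * 1e3:.1f} ms")
```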
Low-field nuclear magnetic resonance (NMR) has been widely used in the petroleum industry, for example in well logging and laboratory rock core analysis. However, the signal-to-noise ratio is low due to the low magnetic field strength of NMR tools and the complex petrophysical properties of the detected samples. Suppressing the noise and highlighting the available NMR signals is very important for subsequent data processing. Most denoising methods are based on fixed mathematical transformations or hand-designed feature selectors to suppress noise characteristics, which may not perform well because they do not adapt to different noisy signals. In this paper, we propose a data processing framework to improve the quality of low-field NMR echo data based on dictionary learning. Dictionary learning is a machine learning method based on redundancy and sparse representation theory. Available information in noisy NMR echo data can be adaptively extracted and reconstructed by dictionary learning. The advantages and effectiveness of the proposed method were verified with a number of numerical simulations, NMR core data analyses, and NMR logging data processing. The results show that dictionary learning can significantly improve the quality of NMR echo data with high noise levels and effectively improve the accuracy and reliability of inversion results.
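A sketch of dictionary-learning denoising on a one-dimensional echo-like decay is shown below: learn a redundant dictionary from overlapping patches, sparse-code each patch, and overlap-average the reconstructions. The synthetic decay and the hyperparameters are generic assumptions, not the paper's framework.

```python
# Sketch of dictionary-learning denoising on a 1D echo-like decay (synthetic data,
# generic hyperparameters; not the paper's exact framework).
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
t = np.linspace(0, 1.0, 2048)
clean = 0.7 * np.exp(-t / 0.08) + 0.3 * np.exp(-t / 0.4)    # toy multi-exponential echo decay
noisy = clean + 0.05 * rng.normal(size=t.size)

# Slice the record into overlapping patches and learn a redundant dictionary from them.
w, step = 64, 4
starts = range(0, t.size - w + 1, step)
patches = np.array([noisy[i:i + w] for i in starts])
dico = MiniBatchDictionaryLearning(n_components=32, alpha=0.5, batch_size=64,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=4, random_state=0)
codes = dico.fit(patches).transform(patches)                # sparse codes over learned atoms
recon_patches = codes @ dico.components_

# Overlap-average the reconstructed patches back into one denoised record.
denoised, weight = np.zeros_like(noisy), np.zeros_like(noisy)
for k, i in enumerate(starts):
    denoised[i:i + w] += recon_patches[k]
    weight[i:i + w] += 1
denoised /= np.maximum(weight, 1)

print("noise std before:", np.std(noisy - clean), "after:", np.std(denoised - clean))
```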
In the course of network supported collaborative design, data processing plays a very vital role. Much effort has been spent in this area, and many kinds of approaches have been proposed. Based on the relevant materials, this paper presents an extensible markup language (XML) based strategy for several important problems of data processing in network supported collaborative design, such as the representation of the standard for the exchange of product model data (STEP) with XML in product information expression and the management of XML documents using a relational database. The paper gives a detailed exposition on how to clarify the mapping between the XML structure and the relational database structure and how XML-QL queries can be translated into structured query language (SQL) queries. Finally, the structure of the data processing system based on XML is presented.
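The core idea of storing XML product data relationally and answering path-style queries with SQL can be illustrated with a toy fragment. The element names, the table schema, and the query below are invented for illustration; they are not the paper's STEP mapping or its XML-QL translation rules.

```python
# Toy sketch: shred an XML product fragment into a relational table and answer an
# XML-path-style query with SQL. Element names and schema are invented placeholders.
import sqlite3
import xml.etree.ElementTree as ET

xml_doc = """
<product id="P-100">
  <part name="shaft" material="steel" mass="1.2"/>
  <part name="gear"  material="alloy" mass="0.4"/>
</product>
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (product_id TEXT, name TEXT, material TEXT, mass REAL)")

root = ET.fromstring(xml_doc)
for part in root.findall("part"):             # map XML elements/attributes to rows/columns
    conn.execute("INSERT INTO part VALUES (?, ?, ?, ?)",
                 (root.get("id"), part.get("name"), part.get("material"),
                  float(part.get("mass"))))

# The path query /product/part[@material="steel"]/@name becomes a SQL predicate:
rows = conn.execute("SELECT name FROM part WHERE material = ?", ("steel",)).fetchall()
print(rows)   # [('shaft',)]
```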
In a non-homogeneous environment, traditional space-time adaptive processing does not effectively suppress interference and detect targets, because the secondary data do not exactly reflect the statistical characteristics of the range cell under test. A novel methodology utilizing the direct data domain approach to space-time adaptive processing (STAP) in airborne radar non-homogeneous environments is presented. The deterministic least squares adaptive signal processing technique operates on a snapshot-by-snapshot basis to determine the adaptive weights for nulling interference and estimating the signal of interest (SOI). Furthermore, this approach eliminates the requirement of estimating the covariance matrix from the data of neighboring range cells, which avoids calculating the inverse of the covariance matrix and can be implemented to operate in real time. Simulation results illustrate the efficiency of interference suppression in non-homogeneous environments.
Due to the limited scenes that synthetic aperture radar (SAR) satellites can detect, the full-track utilization rate is not high. Because of the computing and storage limitations of a single satellite, it is difficult to process large amounts of spaceborne SAR data. It is proposed to use a new method of networked satellite data processing to improve the efficiency of data processing. A multi-satellite distributed SAR real-time processing method based on the Chirp Scaling (CS) imaging algorithm is studied in this paper, and a distributed data processing system is built with field programmable gate array (FPGA) chips as the kernel. Different from traditional CS algorithm processing, the system divides data processing into three stages. The computing tasks are reasonably allocated to different data processing units (i.e., satellites) in each stage. The method effectively saves computing and storage resources of satellites, improves the utilization rate of a single satellite, and shortens the data processing time. Gaofen-3 (GF-3) satellite SAR raw data is processed by the system, and the performance of the method is verified.
Due to the increasing number of cloud applications, the amount of data in the cloud shows signs of growing faster than ever before. The nature of cloud computing requires cloud data processing systems that can handle huge volumes of data and have high performance. However, most cloud storage systems currently adopt a hash-like approach to retrieving data that only supports simple keyword-based enquiries, but lacks various forms of information search. Therefore, a scalable and efficient indexing scheme is clearly required. In this paper, we present a skip list-based cloud index, called SLC-index, which is a novel, scalable skip list-based indexing for cloud data processing. The SLC-index offers a two-layered architecture for extending indexing scope and facilitating better throughput. Dynamic load-balancing for the SLC-index is achieved by online migration of index nodes between servers. Furthermore, it is a flexible system due to its dynamic addition and removal of servers. The SLC-index is efficient for both point and range queries. Experimental results show the efficiency of the SLC-index and its usefulness as an alternative approach for cloud-suitable data structures.
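A simplified, single-process analogue of the two-layered idea is sketched below: an upper "express" layer keeps every k-th key and points into the sorted lower layer, serving point and range lookups. This illustrates the skip-list-style layering only; it is not the distributed SLC-index implementation.

```python
# Simplified single-process analogue of a two-layered skip-list-style index supporting
# point and range queries. Illustrative only; not the distributed SLC-index itself.
import bisect

class TwoLayerIndex:
    def __init__(self, items, skip=4):
        self.lower = sorted(items)                       # sorted (key, value) pairs
        self.keys = [k for k, _ in self.lower]
        self.upper = self.keys[::skip]                   # sparse "express lane" of every k-th key
        self.skip = skip

    def _start(self, key):
        # Jump close to the target via the upper layer, then search the lower layer locally.
        j = max(bisect.bisect_right(self.upper, key) - 1, 0)
        return bisect.bisect_left(self.keys, key, lo=j * self.skip)

    def point(self, key):
        i = self._start(key)
        return self.lower[i][1] if i < len(self.keys) and self.keys[i] == key else None

    def range(self, lo, hi):
        i, out = self._start(lo), []
        while i < len(self.keys) and self.keys[i] <= hi:
            out.append(self.lower[i])
            i += 1
        return out

idx = TwoLayerIndex([(k, f"val{k}") for k in range(0, 100, 3)])
print(idx.point(27))        # -> 'val27'
print(idx.range(10, 20))    # -> [(12, 'val12'), (15, 'val15'), (18, 'val18')]
```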
The present study aims to improve the efficiency of typical procedures used for post-processing flow field data by applying neural-network technology. Taking a problem of aircraft design as the workhorse, a FCN-VGG19 regression model for processing the aircraft flow data is elaborated based on VGGNet (Visual Geometry Group Net) and FCN (Fully Convolutional Network) techniques. As shown by the results, the model displays a strong fitting ability, and there is almost no over-fitting in training. Moreover, the model has good accuracy and convergence. For different input data and different grids, the model basically achieves convergence, showing good performance. It is shown that the proposed simulation regression model based on FCN has great potential in typical problems of computational fluid dynamics (CFD) and related data processing.
To improve our understanding of the formation and evolution of the Moon, one of the payloads onboard the Chang'e-3 (CE-3) rover is Lunar Penetrating Radar (LPR). This investigation is the first attempt to explore the lunar subsurface structure by using ground penetrating radar with high resolution. We have probed the subsurface to a depth of several hundred meters using LPR. In-orbit testing, data processing and the preliminary results are presented. These observations have revealed the configuration of regolith, where the thickness of regolith varies from about 4 m to 6 m. In addition, one layer of lunar rock, which is about 330 m deep and might have been accumulated during the depositional hiatus of mare basalts, was detected.
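The depth figures quoted above come from converting two-way radar travel time to depth through the wave speed in the medium, depth = c·t / (2·√εr). The sketch below applies this standard GPR conversion; the relative permittivity and echo time are typical assumed values, not the paper's derived ones.

```python
# Standard GPR two-way-time-to-depth conversion: depth = c * t / (2 * sqrt(eps_r)).
# The permittivity and echo time below are typical regolith assumptions, for illustration.
C = 299_792_458.0          # speed of light in vacuum, m/s

def depth_m(two_way_time_ns, eps_r):
    """Depth for a two-way radar travel time in a medium of relative permittivity eps_r."""
    v = C / eps_r ** 0.5                        # wave speed in the medium
    return v * (two_way_time_ns * 1e-9) / 2.0   # halve: down and back

print(f"{depth_m(60.0, eps_r=3.0):.1f} m for a 60 ns echo in eps_r = 3 regolith")
```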
The Chang'e-3 Visible and Near-infrared Imaging Spectrometer (VNIS) is one of the four payloads on the Yutu rover. After traversing the landing site during the first two lunar days, four different areas are detected, and Level 2A and 2B radiance data have been released to the scientific community. The released data have been processed by dark current subtraction, correction for the effect of temperature, radiometric calibration and geometric calibration. We emphasize approaches for reflectance analysis and mineral identification for in-situ analysis with VNIS. Then the preliminary spectral and mineralogical results from the landing site are derived. After comparing spectral data from VNIS with data collected by the Ma instrument and samples of mare that were returned from the Apollo program, all the reflectance data have been found to have similar absorption features near 1000 nm except lunar sample 71061. In addition, there is also a weak absorption feature between 1750-2400 nm on VNIS, but the slopes of VNIS and Ma reflectance at longer wavelengths are lower than data taken from samples of lunar mare. Spectral parameters such as Band Centers and Integrated Band Depth Ratios are used to analyze mineralogical features. The results show that detection points E and N205 are mixtures of high-Ca pyroxene and olivine, and the composition of olivine at point N205 is higher than that at point E, but the compositions of detection points S3 and N203 are mainly olivine-rich. Since there are no obvious absorption features near 1250 nm, plagioclase is not directly identified at the landing site.
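The band-center and band-depth parameters mentioned above are computed by fitting a straight-line continuum across an absorption feature and measuring the depth below it, as sketched below. The wavelengths and reflectance values are invented for illustration, not VNIS measurements.

```python
# Sketch of band-depth style spectral parameters: fit a straight-line continuum across
# the ~1000 nm feature and measure the depth below it. Values invented for illustration.
import numpy as np

wavelength = np.array([750, 850, 950, 1000, 1050, 1150, 1250], dtype=float)   # nm
reflectance = np.array([0.21, 0.20, 0.17, 0.16, 0.175, 0.20, 0.215])

# Continuum: straight line between the two shoulders of the absorption feature.
left, right = 0, len(wavelength) - 1
slope = (reflectance[right] - reflectance[left]) / (wavelength[right] - wavelength[left])
continuum = reflectance[left] + slope * (wavelength - wavelength[left])

band_depth = 1.0 - reflectance / continuum        # depth below the continuum at each channel
center = wavelength[np.argmax(band_depth)]
print(f"band center ~{center:.0f} nm, maximum band depth {band_depth.max():.3f}")
```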
The Extreme Ultraviolet Camera (EUVC) onboard the Chang'e-3 (CE-3) lander is used to observe the structure and dynamics of Earth's plasmasphere from the Moon. By detecting the resonance line emission of helium ions (He+) at 30.4 nm, the EUVC images the entire plasmasphere with a time resolution of 10 min and a spatial resolution of about 0.1 Earth radius (RE) in a single frame. We first present details about the data processing from EUVC and the data acquisition in the commissioning phase, and then report some initial results, which reflect the basic features of the plasmasphere well. The photon count and emission intensity of EUVC are consistent with previous observations and models, which indicate that the EUVC works normally and can provide high quality data for future studies.
Since 2008 a network of five sea-level monitoring stations has progressively been installed in French Polynesia. The stations are autonomous, and the data, collected at a sampling rate of 1 or 2 min, are not only recorded locally but also transferred in real time by a radio link to NOAA through the GOES satellite. The new ET34-ANA-V80 version of ETERNA, initially developed for Earth tides analysis, is now able to analyze ocean tide records. Through a two-step validation scheme, we took advantage of the flexibility of this new version, operated in conjunction with the preprocessing facilities of the Tsoft software, to recover corrected data series able to model sea-level variations after elimination of the ocean tide signal. We performed the tidal analysis of the tide gauge data with the highest possible selectivity (optimal wave grouping) and a maximum of additional terms (shallow water constituents). Our goal was to provide corrected data series and a modelled ocean tide signal to compute tide-free sea-level variations as well as tidal prediction models with centimeter precision. We also present in this study the characteristics of the ocean tides in French Polynesia and preliminary results on the non-tidal variations of the sea level related to the tide gauge setting.
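A toy version of the harmonic fit underlying this kind of tidal analysis is shown below: least-squares estimation of amplitude and phase for two well-known constituents (M2 and S2) from a synthetic record. A real ETERNA-style analysis fits many more constituents, optimal wave groups, and shallow-water terms; the record and noise level here are invented.

```python
# Toy harmonic tidal analysis: least-squares fit of amplitude and phase for the M2 and
# S2 constituents from a synthetic sea-level record. Illustrative, not an ETERNA run.
import numpy as np

hours = np.arange(0, 30 * 24, 0.5)                     # 30 days sampled every 30 min
freqs = {"M2": 1 / 12.4206012, "S2": 1 / 12.0}         # constituent frequencies, cycles/hour

rng = np.random.default_rng(2)
true = {"M2": (0.45, 1.0), "S2": (0.15, 0.3)}          # amplitude (m), phase (rad)
sea = sum(a * np.cos(2 * np.pi * f * hours - p)
          for (a, p), f in zip(true.values(), freqs.values()))
sea = sea + 0.05 * rng.normal(size=hours.size)         # add observational noise

# Design matrix with cos/sin columns per constituent; solve by linear least squares.
cols = []
for f in freqs.values():
    cols += [np.cos(2 * np.pi * f * hours), np.sin(2 * np.pi * f * hours)]
A = np.column_stack(cols)
coef, *_ = np.linalg.lstsq(A, sea, rcond=None)

for i, name in enumerate(freqs):
    c, s = coef[2 * i], coef[2 * i + 1]
    print(f"{name}: amplitude {np.hypot(c, s):.3f} m, phase {np.arctan2(s, c):.3f} rad")
```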