The convergence of the Internet of Things (IoT), 5G, and cloud collaboration offers tailored solutions to the rigorous demands of multi-flow integrated energy aggregation dispatch data processing. While generative adversarial networks (GANs) are instrumental in resource scheduling, their application in this domain is impeded by challenges such as slow convergence, inferior optimality-searching capability, and the inability to learn from feedback on failed decisions. Therefore, a cloud-edge collaborative federated GAN-based communication and computing resource scheduling algorithm that is sensitive to long-term constraint violations is proposed to address these challenges. The proposed algorithm facilitates real-time, energy-efficient data processing by optimizing transmission power control, data migration, and computing resource allocation. It employs federated learning for global parameter aggregation to enhance GAN parameter updating, and it dynamically adjusts GAN learning rates and global aggregation weights based on energy consumption constraint violations. Simulation results indicate that the proposed algorithm effectively reduces data processing latency, energy consumption, and convergence time.
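The global aggregation step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the weighting rule, so the rule below (a device's weight shrinks as its constraint violation grows) is an assumption, and `aggregate`, `violations`, and `sensitivity` are hypothetical names rather than the authors' algorithm.

```python
def aggregate(local_params, violations, sensitivity=1.0):
    """Weighted average of per-device parameter vectors.

    local_params: list of equal-length lists of floats, one per edge device.
    violations:   non-negative constraint-violation measure per device
                  (hypothetical, e.g. energy used above a budget).
    """
    # Assumed rule: a larger violation gives a smaller raw weight.
    raw = [1.0 / (1.0 + sensitivity * v) for v in violations]
    total = sum(raw)
    weights = [r / total for r in raw]  # normalise so the weights sum to 1

    dim = len(local_params[0])
    global_params = [
        sum(w * p[i] for w, p in zip(weights, local_params))
        for i in range(dim)
    ]
    return global_params, weights
```

With equal violations the rule reduces to a plain average of the local parameters; a device that violates its constraint contributes less to the global model.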
A novel method for noise removal from the rotating accelerometer gravity gradiometer (MAGG) is presented. It introduces a head-to-tail data expansion technique based on the zero-phase filtering principle. A scheme for determining band-pass filter parameters based on signal-to-noise ratio gain, smoothness index, and cross-correlation coefficient is designed using the Chebyshev optimal consistent approximation theory. Additionally, a wavelet denoising evaluation function is constructed, with the dmey wavelet basis function identified as the most effective for processing gravity gradient data. The results of hardware-in-the-loop simulation and prototype experiments show that, compared with other commonly used methods, the proposed processing method achieves a 14% improvement in the measurement variance of gravity gradient signals and a measurement accuracy within 4 E. This verifies that the proposed method effectively removes noise from the gradient signals and improves gravity gradiometry accuracy, and it offers technical insights for high-precision airborne gravity gradiometry.
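The zero-phase principle mentioned above can be sketched as forward-backward filtering on an edge-extended copy of the signal. This is only an illustration: the first-order low-pass filter stands in for the paper's band-pass filter, and the odd reflection used for padding is an assumed substitute for the head-to-tail expansion, whose details the abstract does not give. The names `smooth_forward`, `zero_phase`, `alpha`, and `pad` are hypothetical.

```python
def smooth_forward(x, alpha):
    """First-order low-pass (exponential smoothing); a single pass adds phase lag."""
    y = [x[0]]
    for v in x[1:]:
        y.append(alpha * v + (1.0 - alpha) * y[-1])
    return y


def zero_phase(x, alpha=0.3, pad=20):
    """Filter forward, then backward, so the two phase lags cancel."""
    pad = min(pad, len(x) - 1)
    # Assumed edge extension: odd reflection about the first and last samples,
    # which reduces start-up transients at both ends of the record.
    head = [2 * x[0] - x[i] for i in range(pad, 0, -1)]
    tail = [2 * x[-1] - x[-1 - i] for i in range(1, pad + 1)]
    ext = head + list(x) + tail

    fwd = smooth_forward(ext, alpha)
    bwd = smooth_forward(fwd[::-1], alpha)[::-1]
    return bwd[pad:pad + len(x)]  # drop the padding again
```

Applied to a single impulse, the output is symmetric about the impulse position, which is the signature of a zero-phase response.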
Gas hydrate drilling expeditions in the Pearl River Mouth Basin, South China Sea, have identified concentrated gas hydrates with variable thickness. Moreover, free gas and the coexistence of gas hydrate and free gas have been confirmed by logging, coring, and production tests in the foraminifera-rich silty sediments with complex bottom-simulating reflectors (BSRs). Broad-band processing is conducted on conventional three-dimensional (3D) seismic data to improve the imaging and detection accuracy of gas hydrate-bearing layers and to delineate the saturation and thickness of gas hydrate- and free gas-bearing sediments. Several geophysical attributes extracted along the base of the gas hydrate stability zone are used to demonstrate the variable distribution of gas hydrate and the factors controlling its differential enrichment. The inverted gas hydrate saturation at the production zone is over 40% with a thickness of 90 m, showing an interbedded distribution with different boundaries between gas hydrate- and free gas-bearing layers. However, the gas hydrate saturation at the adjacent canyon is 70%, with 30-m-thick patches and linear features. The lithological and fault controls on gas hydrate and free gas distributions are demonstrated by tracing each gas hydrate-bearing layer. Moreover, the BSR depths based on broad-band reprocessed 3D seismic data not only exhibit variations due to small-scale topographic changes caused by seafloor sedimentation and erosion but also show the upward shift of the BSR and the blocky distribution of the coexistence of gas hydrate and free gas in the Pearl River Mouth Basin.
Recently published in Joule, Feng Liu and colleagues from Shanghai Jiaotong University reported a record-breaking 20.8% power conversion efficiency in organic solar cells (OSCs) with an interpenetrating fibril network active layer morphology, featuring a bulk p-i-n structure and proper vertical segregation achieved through additive-assisted layer-by-layer deposition. This optimized hierarchical gradient fibrillar morphology and optical management synergistically facilitate exciton diffusion, reduce recombination losses, and enhance light capture capability. This approach not only offers a route to high-efficiency devices but also demonstrates the potential for commercial applications of OSCs.
Thin walls of an AZ91 magnesium alloy with fine equiaxed grains were fabricated via cold arc-based wire arc additive manufacturing (CA-WAAM), and the droplet transfer behaviours, microstructures, and mechanical properties were investigated. The results showed that the cold arc process reduced splashing at the moment of liquid bridge breakage and effectively shortened the droplet transfer period. The microstructures of the deposited samples exhibited layered characteristics with alternating distributions of coarse and fine grains. During layer-by-layer deposition, the β-phase precipitated and grew preferentially along grain boundaries, while the fine η-Al₈Mn₅ phase was dispersed in the α-Mg matrix. The mechanical properties of the CA-WAAM deposited sample showed isotropic characteristics. The ultimate tensile strength and elongation in the building direction (BD) were 282.7 MPa and 14.2%, respectively. The microhardness values of the deposited parts were relatively uniform, with an average value of HV 69.6.
Due to the restricted satellite payloads in LEO mega-constellation networks (LMCNs), remote sensing image analysis, online learning, and other big data services need onboard distributed processing (OBDP). In existing technologies, the efficiency of big data applications (BDAs) in distributed systems hinges on stable, low-latency links between worker nodes. However, LMCNs, with their highly dynamic nodes and long-distance links, cannot provide these conditions, which makes the performance of OBDP hard to measure intuitively. To bridge this gap, a multidimensional simulation platform is indispensable, one that can simulate the network environment of LMCNs and run BDAs in it for performance testing. Using STK's APIs and a parallel computing framework, we achieve real-time simulation of thousands of satellite nodes, which are mapped to application nodes through software-defined networking (SDN) and container technologies. We elaborate on the architecture and mechanism of the simulation platform and take Starlink and Hadoop as realistic examples for simulations. The results indicate that LMCNs have a dynamic end-to-end latency that fluctuates periodically with the constellation movement. Compared with ground data center networks (GDCNs), LMCNs degrade the throughput of computing and storage jobs, which can be alleviated by using erasure codes and data flow scheduling of worker nodes.
Attitude is one of the crucial parameters for space objects and plays a vital role in collision prediction and debris removal. Analyzing light curves to determine attitude is the most commonly used method. In photometric observations, outliers may exist in the obtained light curves for various reasons. Therefore, preprocessing is required to remove these outliers and obtain high-quality light curves. Through statistical analysis, the causes of outliers can be categorized into two main types: first, the brightness of the object significantly increases due to the passage of a star nearby, referred to as “stellar contamination,” and second, the brightness markedly decreases due to cloud cover, referred to as “cloudy contamination.” The traditional approach of manually inspecting images for contamination is time-consuming and labor-intensive. We therefore propose machine learning methods as a substitute. Convolutional Neural Networks and SVMs are employed to identify cases of stellar contamination and cloudy contamination, achieving F1 scores of 1.00 and 0.98 on a test set, respectively. We also explore other machine learning methods, such as ResNet-18 and Light Gradient Boosting Machine, and conduct comparative analyses of the results.
In this study, we delve into the realm of efficient Big Data Engineering and Extract, Transform, Load (ETL) processes within the healthcare sector, leveraging the robust foundation provided by the MIMIC-III Clinical Database. Our investigation entails a comprehensive exploration of various methodologies aimed at enhancing the efficiency of ETL processes, with a primary emphasis on optimizing time and resource utilization. Through meticulous experimentation utilizing a representative dataset, we shed light on the advantages associated with the incorporation of PySpark and Docker containerized applications. Our research illuminates significant advancements in time efficiency, process streamlining, and resource optimization attained through the utilization of PySpark for distributed computing within Big Data Engineering workflows. Additionally, we underscore the strategic integration of Docker containers, delineating their pivotal role in augmenting scalability and reproducibility within the ETL pipeline. This paper encapsulates the pivotal insights gleaned from our experimental journey, accentuating the practical implications and benefits entailed in the adoption of PySpark and Docker. By streamlining Big Data Engineering and ETL processes in the context of clinical big data, our study contributes to the ongoing discourse on optimizing data processing efficiency in healthcare applications. The source code is available on request.
The recent pandemic crisis has highlighted the importance of the availability and management of health data to respond quickly and effectively to health emergencies, while respecting the fundamental rights of every individual. In this context, it is essential to find a balance between the protection of privacy and the safeguarding of public health, using tools that guarantee transparency and the population's consent to the processing of data. This work starts from a pilot investigation conducted in the Polyclinic of Bari as part of the Horizon Europe Seeds project entitled “Multidisciplinary analysis of technological tracing models of contagion: the protection of rights in the management of health data”. Its objective is to promote greater patient awareness regarding the processing of their health data and the protection of privacy. The methodology used the PHICAT (Personal Health Information Competence Assessment Tool): through the administration of a questionnaire, the aim was to evaluate the patients’ ability to express their consent to the release and processing of health data. The results were analyzed in relation to the 4 domains into which the process is divided, which allow the patients’ ability to express a conscious choice to be evaluated, and also in relation to the socio-demographic and clinical characteristics of the patients themselves. This study can contribute to understanding patients’ ability to give their consent and can improve information regarding the management of health data by increasing confidence in granting the use of their data for research and clinical management.
Gravitational wave detection is one of the most cutting-edge research areas in modern physics, with its success relying on advanced data analysis and signal processing techniques. This study provides a comprehensive review of data analysis methods and signal processing techniques in gravitational wave detection. The research begins by introducing the characteristics of gravitational wave signals and the challenges faced in their detection, such as extremely low signal-to-noise ratios and complex noise backgrounds. It then systematically analyzes the application of time-frequency analysis methods in extracting transient gravitational wave signals, including wavelet transforms and Hilbert-Huang transforms. The study focuses on discussing the crucial role of matched filtering techniques in improving signal detection sensitivity and explores strategies for template bank optimization. Additionally, the research evaluates the potential of machine learning algorithms, especially deep learning networks, in rapidly identifying and classifying gravitational wave events. The study also analyzes the application of Bayesian inference methods in parameter estimation and model selection, as well as their advantages in handling uncertainties. However, the research also points out the challenges faced by current technologies, such as dealing with non-Gaussian noise and improving computational efficiency. To address these issues, the study proposes a hybrid analysis framework combining physical models and data-driven methods. Finally, the research looks ahead to the potential applications of quantum computing in future gravitational wave data analysis. This study provides a comprehensive theoretical foundation for the optimization and innovation of gravitational wave data analysis methods, contributing to the advancement of gravitational wave astronomy.
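The matched filtering idea highlighted in this review can be shown with a minimal time-domain sketch: a unit-norm template is slid over the data and the correlation peak marks the best-matching offset. The sketch assumes white noise; practical searches instead weight by the detector noise spectrum and work in the frequency domain, which is not modelled here. The function name `matched_filter` is hypothetical.

```python
import math


def matched_filter(data, template):
    """Slide a unit-norm template over the data and return the correlation series."""
    norm = math.sqrt(sum(t * t for t in template))
    unit = [t / norm for t in template]
    n, m = len(data), len(template)
    # One output per offset at which the template fits entirely inside the data.
    return [
        sum(data[i + j] * unit[j] for j in range(m))
        for i in range(n - m + 1)
    ]
```

The index of the largest output value is the estimated arrival offset of the template in the data; a template bank repeats this for many candidate waveforms.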
For real-time processing of ultra-wide bandwidth low-frequency pulsar baseband data, we designed and implemented an ultra-wide bandwidth low-frequency pulsar data processing pipeline (UWLPIPE) based on a shared ring buffer and GPU parallel technology. UWLPIPE runs on a GPU cluster and can simultaneously receive multiple 128 MHz dual-polarization VDIF data packets preprocessed by the front-end FPGA. After the dual-polarization data are aligned, the data of multiple 128 MHz subbands are packaged into PSRDADA baseband data or multi-channel coherent-dedispersion filterbank data, and the filterbank data of multiple subbands can be spliced into wideband data after time alignment. We used the Nanshan 26 m radio telescope with the L-band receiver at 964-1732 MHz to observe multiple pulsars. Finally, we processed the data using the DSPSR software; the results showed that each subband could correctly fold out the pulse profile and that the wideband pulse profile accumulated from multiple subbands could be correctly aligned.
To attain good-quality transfer function estimates from magnetotelluric field data (i.e., smooth behavior and small uncertainties across all frequencies), we compare time series data processing with and without a multitaper approach for spectral estimation. There are several common ways to increase the reliability of Fourier spectral estimation from experimental (noisy) data; for example, one can subdivide the experimental time series into segments, taper these segments (using a single taper), perform the Fourier transform of the individual segments, and average the resulting spectra.
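The single-taper procedure listed in this abstract (segment, taper, transform, average) can be sketched as follows. This illustrates the conventional baseline, not the multitaper approach the paper compares it with; the Hann taper, the non-overlapping segments, and the name `averaged_spectrum` are assumptions made for the example.

```python
import cmath
import math


def averaged_spectrum(x, seg_len):
    """Average the tapered periodograms of non-overlapping segments of x."""
    n_seg = len(x) // seg_len
    if n_seg < 1 or seg_len < 2:
        raise ValueError("need at least one segment of two or more samples")

    # Single Hann taper, normalised by its power so the scale is comparable.
    taper = [0.5 - 0.5 * math.cos(2 * math.pi * n / (seg_len - 1))
             for n in range(seg_len)]
    power = sum(w * w for w in taper)

    n_freq = seg_len // 2 + 1  # non-negative frequencies only
    avg = [0.0] * n_freq
    for s in range(n_seg):
        seg = x[s * seg_len:(s + 1) * seg_len]
        tapered = [w * v for w, v in zip(taper, seg)]
        for k in range(n_freq):
            # Direct DFT of the tapered segment at frequency index k.
            coeff = sum(v * cmath.exp(-2j * math.pi * k * n / seg_len)
                        for n, v in enumerate(tapered))
            avg[k] += abs(coeff) ** 2 / (power * n_seg)
    return avg
```

Averaging over more segments lowers the variance of each spectral estimate at the cost of coarser frequency resolution, which is the trade-off that motivates alternatives such as multitapering.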
Current methodologies for cleaning wind power anomaly data exhibit limited capabilities in identifying abnormal data within extensive datasets and struggle to accommodate the considerable variability and intricacy of wind farm data. Consequently, a method for cleaning wind power anomaly data by combining image processing with community detection algorithms (CWPAD-IPCDA) is proposed. To precisely identify and initially clean anomalous data, wind power curve (WPC) images are converted into graph structures, to which the Louvain community recognition algorithm and graph-theoretic methods are applied for community detection and segmentation. Furthermore, the mathematical morphology operation (MMO) determines the main part of the initially cleaned wind power curve images and maps it back to the normal wind power points to complete the final cleaning. To validate its feasibility, the CWPAD-IPCDA method was applied to clean datasets from 25 wind turbines (WTs) in two wind farms in northwest China. It was compared with the density-based spatial clustering of applications with noise (DBSCAN) algorithm, an improved isolation forest algorithm, and an image-based (IB) algorithm. The experimental results demonstrate that the CWPAD-IPCDA method surpasses the other three algorithms, achieving an approximately 7.23% higher average data cleaning rate. The mean of the sum of squared errors (SSE) of the dataset after cleaning is approximately 6.887 lower than that of the other algorithms. Moreover, the mean overall accuracy, as measured by the F1-score, exceeds that of the other methods by approximately 10.49%; this indicates that the CWPAD-IPCDA method is more conducive to improving the accuracy and reliability of wind power curve modeling and wind farm power forecasting.
To address the problem of real-time processing of ultra-wide bandwidth pulsar baseband data, we designed and implemented a pulsar baseband data processing algorithm (PSRDP) based on GPU parallel computing technology. PSRDP can perform operations such as baseband data unpacking, channel separation, coherent dedispersion, Stokes detection, phase and folding period prediction, and folding integration on GPU clusters. We tested the algorithm using the J0437-4715 pulsar baseband data generated by the CASPSR and Medusa backends of the Parkes telescope and the J0332+5434 pulsar baseband data generated by the self-developed backend of the Nan Shan Radio Telescope, and we obtained the pulse profile for each baseband data set. Experimental analysis shows that the pulse profiles generated by the PSRDP algorithm are essentially consistent with the processing results of the Digital Signal Processing Software for Pulsar Astronomy (DSPSR), which verifies the effectiveness of the PSRDP algorithm. Furthermore, using the same baseband data, we compared the processing speed of PSRDP with that of DSPSR, and the results showed that PSRDP was not slower than DSPSR. The theoretical and technical experience gained from the PSRDP algorithm research lays a technical foundation for the real-time processing of QTT (Qi Tai radio Telescope) ultra-wide bandwidth pulsar baseband data.
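Of the steps listed above, folding integration is the simplest to illustrate: detected samples are averaged into phase bins of the pulse period. The sketch below is a plain CPU version that assumes a constant, already known period; it does not reproduce PSRDP's GPU implementation or its phase and period prediction, and `fold` and its parameters are hypothetical names.

```python
def fold(samples, dt, period, n_bins):
    """Average detected samples into n_bins phase bins of a known period.

    samples: detected intensities at uniform spacing dt (seconds).
    period:  folding period in seconds (assumed constant here).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, value in enumerate(samples):
        t = (i + 0.5) * dt            # time stamp at the centre of the sample
        phase = (t / period) % 1.0    # rotational phase in [0, 1)
        b = int(phase * n_bins) % n_bins
        sums[b] += value
        counts[b] += 1
    # Bins that received no samples are reported as 0.0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

Because noise averages down while the pulse adds up at a fixed phase, the folded profile reveals pulses that are too weak to see in individual rotations.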
Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, and Roman Arabic, grappling with resource-poor languages such as Urdu remains a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, and Saraiki. Urdu literature, characterized by distinct character sets and linguistic features, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu language sentiment analysis, employing sophisticated deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. After data collection, thorough cleaning and preprocessing are carried out to ensure the quality of the data. The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models’ efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, attaining a significantly higher accuracy of 91%. This result accentuates the exceptional performance of RNN, solidifying its status as a compelling option for conducting sentiment analysis tasks in the Urdu language.
To reduce the risk of non-performing loans and losses and to improve loan approval efficiency, it is necessary to establish an intelligent loan risk and approval prediction system. A hybrid deep learning model with a 1DCNN-attention network and enhanced preprocessing techniques is proposed for loan approval prediction. The proposed model consists of enhanced data preprocessing followed by a stack of multiple hybrid modules. First, the enhanced data preprocessing combines methods such as standardization, SMOTE oversampling, feature construction, recursive feature elimination (RFE), information value (IV), and principal component analysis (PCA), which not only eliminates the effects of data jitter and class imbalance but also removes redundant features while improving the representation of features. Subsequently, a hybrid module that combines a 1DCNN with an attention mechanism is proposed to extract local and global spatio-temporal features. Finally, comprehensive experiments validate that the proposed model surpasses state-of-the-art baseline models across various performance metrics, including accuracy, precision, recall, F1 score, and AUC. The proposed model helps to automate the loan approval process and provides scientific guidance to financial institutions for loan risk control.
The processing of measuring data plays an important role in reverse engineering. Based on grey system theory, we first propose some methods for the processing of measuring data in reverse engineering. The measured data usually have some abnormalities. When the abnormal data are eliminated by filtering, blanks are created. The grey generation and GM(1,1) are used to create new data for these blanks. For the uneven data sequence created by measuring error, the mean generation is used to smooth it, and then the stepwise and smooth generations are used to improve the data sequence.
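The GM(1,1) model named in this abstract can be sketched in its standard textbook form: accumulate the sequence, form the mean-generated background values, fit the two parameters by least squares, and predict from the exponential response. How the authors apply it to fill blanks is not described, so the code below shows only the generic model, with `gm11_fit` and `gm11_predict` as hypothetical names.

```python
import math


def gm11_fit(x0):
    """Fit GM(1,1) to a positive sequence x0; return parameters (a, b)."""
    # Accumulated generating operation (AGO): running sum of x0.
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Mean generation: background value between consecutive AGO terms.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Ordinary least squares for the grey equation x0[k] = -a * z[k] + b.
    n = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(p * q for p, q in zip(z, y))
    slope = (n * szy - sz * sy) / (n * szz - sz * sz)
    return -slope, (sy - slope * sz) / n


def gm11_predict(x0, k, a, b):
    """Model value for x0[k]; k may lie beyond the data. Requires a != 0."""
    def x1_hat(i):
        # Time response of the whitened equation dx1/dt + a * x1 = b.
        return (x0[0] - b / a) * math.exp(-a * i) + b / a
    return x0[0] if k == 0 else x1_hat(k) - x1_hat(k - 1)
```

For a near-exponential sequence the model extrapolates the next value closely, which is why it is a candidate for generating replacement points where measurements were removed.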
To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (simple protocol and RDF query language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. The minimal connectable units are joined according to the semantic queries to produce semantically correct query plans. Algorithms for query rewriting and transforming are presented, and their computational complexity is discussed. In the worst case, the query decomposing algorithm finishes in O(n^2) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments, which show that when the query length is less than 8, the query processing algorithms provide satisfactory performance.
A television-based multistatic radar system is described. A commercial television transmitter is used as the illuminator in the multistatic radar system, and the reflected commercial television signals are measured by an array of sensors. A data processing scheme is developed that adapts to the poor signal processing ability. The innovation is focused on the construction of the observation space, which can reduce the nonlinearity error. The new method leads to better system stability than the traditional one. Monte Carlo simulation is used to compare the new method with the traditional method.
Funding (cloud-edge federated GAN resource scheduling study): supported by the China Southern Power Grid Technology Project under Grant 03600KK52220019 (GDKJXM20220253).
Funding (Pearl River Mouth Basin gas hydrate study): supported by the State Key Laboratory of Natural Gas Hydrate (No. 2022-KFJJ-SHW), the National Natural Science Foundation of China (No. 42376058), the International Science & Technology Cooperation Program of China (No. 2023YFE0119900), the Hainan Province Key Research and Development Project (No. ZDYF2024GXJS002), and the Research Start-Up Funds of Zhufeng Scholars Program.
Funding: Technology Development Program of Jilin Province (YDZJ202201ZYTS640); the National Key Research and Development Program of China (2022YFB4200400) funded by MOST; the National Natural Science Foundation of China (52172048 and 52103221); Shandong Provincial Natural Science Foundation (ZR2021QB024 and ZR2021ZD06); Guangdong Basic and Applied Basic Research Foundation (2023A1515012323, 2023A1515010943, and 2024A1515010023); the Qingdao New Energy Shandong Laboratory Open Project (QNESL OP 202309); and the Fundamental Research Funds of Shandong University.
Abstract: Recently published in Joule, Feng Liu and colleagues from Shanghai Jiaotong University reported a record-breaking 20.8% power conversion efficiency in organic solar cells (OSCs) with an interpenetrating fibril network active layer morphology, featuring a bulk p-i-n structure and proper vertical segregation achieved through additive-assisted layer-by-layer deposition. This optimized hierarchical gradient fibrillar morphology and optical management synergistically facilitate exciton diffusion, reduce recombination losses, and enhance light capture. The approach not only offers a route to high-efficiency devices but also demonstrates the potential of OSCs for commercial applications.
Funding: supported by the National Natural Science Foundation of China (No. 51805265) and the Fundamental Research Funds for the Central Universities, China (No. 30922010921).
Abstract: Thin walls of an AZ91 magnesium alloy with fine equiaxed grains were fabricated via cold arc-based wire arc additive manufacturing (CA-WAAM), and the droplet transfer behaviours, microstructures, and mechanical properties were investigated. The results showed that the cold arc process reduced splashing at the moment of liquid bridge breakage and effectively shortened the droplet transfer period. The microstructures of the deposited samples exhibited layered characteristics with alternating distributions of coarse and fine grains. During layer-by-layer deposition, the β-phase precipitated and grew preferentially along grain boundaries, while the fine η-Al₈Mn₅ phase was dispersed in the α-Mg matrix. The mechanical properties of the CA-WAAM deposited sample showed isotropic characteristics. The ultimate tensile strength and elongation in the building direction (BD) were 282.7 MPa and 14.2%, respectively. The microhardness values of the deposited parts were relatively uniform, with an average value of HV 69.6.
Funding: supported by the National Natural Science Foundation of China (Nos. 62271165, 62027802, 62201307); the Guangdong Basic and Applied Basic Research Foundation (No. 2023A1515030297); the Shenzhen Science and Technology Program (ZDSYS20210623091808025); the Stable Support Plan Program (GXWD20231129102638002); and the Major Key Project of PCL (No. PCL2024A01).
Abstract: Due to the restricted satellite payloads in LEO mega-constellation networks (LMCNs), remote sensing image analysis, online learning, and other big data services need onboard distributed processing (OBDP). In existing technologies, the efficiency of big data applications (BDAs) in distributed systems hinges on stable, low-latency links between worker nodes. However, LMCNs with highly dynamic nodes and long-distance links cannot provide these conditions, which makes the performance of OBDP hard to measure intuitively. To bridge this gap, a multidimensional simulation platform is needed that can simulate the network environment of LMCNs and run BDAs in it for performance testing. Using STK's APIs and a parallel computing framework, we achieve real-time simulation of thousands of satellite nodes, which are mapped to application nodes through software-defined networking (SDN) and container technologies. We describe the architecture and mechanism of the simulation platform, and take Starlink and Hadoop as realistic examples for simulations. The results indicate that LMCNs have dynamic end-to-end latency that fluctuates periodically with the constellation movement. Compared to ground data center networks (GDCNs), LMCNs reduce the throughput of computing and storage jobs, an effect that can be alleviated by using erasure codes and data flow scheduling of worker nodes.
Funding: funded by the National Natural Science Foundation of China (NSFC, Nos. 12373086 and 12303082); the CAS “Light of West China” Program; the Yunnan Revitalization Talent Support Program in Yunnan Province; and the National Key R&D Program of China, Gravitational Wave Detection Project (No. 2022YFC2203800).
Abstract: Attitude is one of the crucial parameters for space objects and plays a vital role in collision prediction and debris removal. Analyzing light curves to determine attitude is the most commonly used method. In photometric observations, outliers may exist in the obtained light curves for various reasons. Therefore, preprocessing is required to remove these outliers and obtain high-quality light curves. Through statistical analysis, the causes of outliers can be categorized into two main types: first, the brightness of the object increases significantly due to the passage of a nearby star, referred to as “stellar contamination,” and second, the brightness decreases markedly due to cloud cover, referred to as “cloudy contamination.” The traditional approach of manually inspecting images for contamination is time-consuming and labor-intensive, so we propose machine learning methods as a substitute. Convolutional Neural Networks and SVMs are employed to identify cases of stellar contamination and cloudy contamination, achieving F1 scores of 1.00 and 0.98 on a test set, respectively. We also explore other machine learning methods such as ResNet-18 and Light Gradient Boosting Machine, and conduct comparative analyses of the results.
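For readers unfamiliar with the metric quoted above, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a binary contamination label; the function name `f1_score` and the toy labels in the usage note are hypothetical and are not the paper's data.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With made-up labels `[1, 1, 1, 0, 0, 1]` and predictions `[1, 0, 1, 0, 1, 1]`, there are three true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 0.75.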
Abstract: In this study, we examine efficient Big Data Engineering and Extract, Transform, Load (ETL) processes in the healthcare sector, building on the MIMIC-III Clinical Database. We explore several methodologies aimed at making ETL processes more efficient, with a primary emphasis on optimizing time and resource utilization. Through experiments on a representative dataset, we show the advantages of incorporating PySpark and Docker containerized applications. Our results show gains in time efficiency, process streamlining, and resource optimization from using PySpark for distributed computing within Big Data Engineering workflows. Additionally, we describe the integration of Docker containers and their role in improving scalability and reproducibility within the ETL pipeline. This paper summarizes the main insights from our experiments and the practical benefits of adopting PySpark and Docker. By streamlining Big Data Engineering and ETL processes in the context of clinical big data, our study contributes to the ongoing discourse on optimizing data processing efficiency in healthcare applications. The source code is available on request.
Abstract: The recent pandemic crisis has highlighted the importance of the availability and management of health data to respond quickly and effectively to health emergencies, while respecting the fundamental rights of every individual. In this context, it is essential to find a balance between the protection of privacy and the safeguarding of public health, using tools that guarantee transparency and the population's consent to the processing of data. This work starts from a pilot investigation conducted in the Polyclinic of Bari as part of the Horizon Europe Seeds project entitled “Multidisciplinary analysis of technological tracing models of contagion: the protection of rights in the management of health data”. Its objective is to promote greater patient awareness regarding the processing of their health data and the protection of privacy. The methodology used the PHICAT (Personal Health Information Competence Assessment Tool), administered as a questionnaire, to evaluate the patients’ ability to express their consent to the release and processing of health data. The results were analyzed in relation to the four domains into which the process is divided, which allow evaluation of the patients’ ability to express a conscious choice, and also in relation to the socio-demographic and clinical characteristics of the patients themselves. This study can contribute to understanding patients’ ability to give their consent and to improving information regarding the management of health data, thereby increasing confidence in granting the use of their data for research and clinical management.
Abstract: Gravitational wave detection is one of the most cutting-edge research areas in modern physics, and its success relies on advanced data analysis and signal processing techniques. This study provides a comprehensive review of data analysis methods and signal processing techniques in gravitational wave detection. It begins by introducing the characteristics of gravitational wave signals and the challenges faced in their detection, such as extremely low signal-to-noise ratios and complex noise backgrounds. It then systematically analyzes the application of time-frequency analysis methods, including wavelet transforms and Hilbert-Huang transforms, in extracting transient gravitational wave signals. The study discusses the crucial role of matched filtering techniques in improving signal detection sensitivity and explores strategies for template bank optimization. Additionally, it evaluates the potential of machine learning algorithms, especially deep learning networks, in rapidly identifying and classifying gravitational wave events. The study also analyzes the application of Bayesian inference methods in parameter estimation and model selection, as well as their advantages in handling uncertainties. It further points out the challenges faced by current technologies, such as dealing with non-Gaussian noise and improving computational efficiency. To address these issues, the study proposes a hybrid analysis framework combining physical models and data-driven methods. Finally, it looks ahead to the potential applications of quantum computing in future gravitational wave data analysis. This study provides a theoretical foundation for the optimization and innovation of gravitational wave data analysis methods, contributing to the advancement of gravitational wave astronomy.
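Matched filtering, mentioned above, correlates the data with a known waveform template and looks for the offset with the strongest response. The following minimal time-domain sketch assumes white noise; production gravitational wave searches instead weight by the detector noise spectrum in the frequency domain. The function name `matched_filter` and the toy template in the usage note are hypothetical.

```python
import math


def matched_filter(data, template):
    """Correlate `data` with a unit-norm copy of `template` at every offset.

    Assumes white noise, so no spectral weighting is applied.
    """
    norm = math.sqrt(sum(t * t for t in template))
    m = len(template)
    return [
        sum(data[i + k] * template[k] for k in range(m)) / norm
        for i in range(len(data) - m + 1)
    ]
```

For example, if the template `[1.0, -2.0, 3.0, -2.0, 1.0]` is embedded at offset 7 in an otherwise zero series, the output peaks at index 7, where it equals the template norm.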
Funding: supported by the National Key R&D Program of China (Nos. 2021YFC2203502 and 2022YFF0711502); the National Natural Science Foundation of China (NSFC) (12173077); the Tianshan Talent Project of Xinjiang Uygur Autonomous Region (2022TSYCCX0095 and 2023TSYCCX0112); the Scientific Instrument Developing Project of the Chinese Academy of Sciences (grant No. PTYQ2022YZZD01); the China National Astronomical Data Center (NADC); the Operation, Maintenance and Upgrading Fund for Astronomical Telescopes and Facility Instruments, budgeted from the Ministry of Finance of China (MOF) and administrated by the Chinese Academy of Sciences (CAS); and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01A360).
Abstract: For real-time processing of ultra-wide bandwidth low-frequency pulsar baseband data, we designed and implemented an ultra-wide bandwidth low-frequency pulsar data processing pipeline (UWLPIPE) based on a shared ring buffer and GPU parallel technology. UWLPIPE runs on a GPU cluster and can simultaneously receive multiple 128 MHz dual-polarization VDIF data packets preprocessed by the front-end FPGA. After the dual-polarization data are aligned, the data of multiple 128 MHz subbands are packaged into PSRDADA baseband data or multi-channel coherently dedispersed filterbank data, and the filterbank data of multiple subbands can be spliced into wideband data after time alignment. We used the Nanshan 26 m radio telescope with the L-band receiver at 964 to 1732 MHz to observe multiple pulsars. Finally, we processed the data using the DSPSR software. The results showed that each subband could correctly fold out the pulse profile, and that the wideband pulse profile accumulated from multiple subbands was correctly aligned.
Abstract: To obtain good-quality transfer function estimates from magnetotelluric field data (i.e., smooth behavior and small uncertainties across all frequencies), we compare time series data processing with and without a multitaper approach for spectral estimation. There are several common ways to increase the reliability of Fourier spectral estimation from experimental (noisy) data; for example, one can subdivide the experimental time series into segments, taper these segments (using a single taper), perform the Fourier transform of the individual segments, and average the resulting spectra.
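The single-taper segment-averaging procedure described above (segment, taper, transform, average) can be sketched in a few lines. This shows only the conventional baseline, not the multitaper approach the study compares against. The Hann taper, the non-overlapping segments, and the name `averaged_spectrum` are assumptions for illustration.

```python
import cmath
import math


def averaged_spectrum(x, seg_len):
    """Average the tapered power spectra of non-overlapping segments."""
    # Assumed single taper: a periodic Hann window.
    taper = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len)
             for n in range(seg_len)]
    n_seg = len(x) // seg_len
    n_bins = seg_len // 2 + 1
    power = [0.0] * n_bins
    for s in range(n_seg):
        seg = [x[s * seg_len + n] * taper[n] for n in range(seg_len)]
        for k in range(n_bins):
            # Direct DFT of the tapered segment at frequency bin k.
            coeff = sum(v * cmath.exp(-2j * math.pi * k * n / seg_len)
                        for n, v in enumerate(seg))
            power[k] += abs(coeff) ** 2 / n_seg
    return power
```

Averaging over segments lowers the variance of the estimate at the cost of frequency resolution, which is the trade-off that motivates comparing it with a multitaper estimate.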
Funding: supported by the National Natural Science Foundation of China (Project No. 51767018) and the Natural Science Foundation of Gansu Province (Project No. 23JRRA836).
Abstract: Current methodologies for cleaning wind power anomaly data have limited ability to identify abnormal data within extensive datasets and struggle to accommodate the considerable variability and intricacy of wind farm data. Consequently, a method for cleaning wind power anomaly data by combining image processing with community detection algorithms (CWPAD-IPCDA) is proposed. To precisely identify and initially clean anomalous data, wind power curve (WPC) images are converted into graph structures, on which the Louvain community recognition algorithm and graph-theoretic methods are employed for community detection and segmentation. Furthermore, the mathematical morphology operation (MMO) determines the main part of the initially cleaned wind power curve images and maps it back to the normal wind power points to complete the final cleaning. The CWPAD-IPCDA method was applied to clean datasets from 25 wind turbines (WTs) in two wind farms in northwest China to validate its feasibility. It was compared with the density-based spatial clustering of applications with noise (DBSCAN) algorithm, an improved isolation forest algorithm, and an image-based (IB) algorithm. The experimental results demonstrate that the CWPAD-IPCDA method surpasses the other three algorithms, achieving an approximately 7.23% higher average data cleaning rate. The mean value of the sum of the squared errors (SSE) of the dataset after cleaning is approximately 6.887 lower than that of the other algorithms. Moreover, the mean overall accuracy, as measured by the F1-score, exceeds that of the other methods by approximately 10.49%. This indicates that the CWPAD-IPCDA method is more conducive to improving the accuracy and reliability of wind power curve modeling and wind farm power forecasting.
Funding: supported by the National Key R&D Program of China (Nos. 2021YFC2203502 and 2022YFF0711502); the National Natural Science Foundation of China (NSFC) (12173077 and 12003062); the Tianshan Innovation Team Plan of Xinjiang Uygur Autonomous Region (2022D14020); the Tianshan Talent Project of Xinjiang Uygur Autonomous Region (2022TSYCCX0095); the Scientific Instrument Developing Project of the Chinese Academy of Sciences (grant No. PTYQ2022YZZD01); the China National Astronomical Data Center (NADC); the Operation, Maintenance and Upgrading Fund for Astronomical Telescopes and Facility Instruments, budgeted from the Ministry of Finance of China (MOF) and administrated by the Chinese Academy of Sciences (CAS); and the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01A360).
Abstract: To address the problem of real-time processing of ultra-wide bandwidth pulsar baseband data, we designed and implemented a pulsar baseband data processing algorithm (PSRDP) based on GPU parallel computing technology. PSRDP can perform operations such as baseband data unpacking, channel separation, coherent dedispersion, Stokes detection, phase and folding period prediction, and folding integration on GPU clusters. We tested the algorithm using the J0437-4715 pulsar baseband data generated by the CASPSR and Medusa backends of the Parkes telescope, and the J0332+5434 pulsar baseband data generated by the self-developed backend of the Nan Shan Radio Telescope, and obtained the pulse profile for each baseband dataset. The pulse profiles generated by PSRDP are essentially consistent with the processing results of the Digital Signal Processing Software for Pulsar Astronomy (DSPSR), which verifies the effectiveness of the PSRDP algorithm. Furthermore, using the same baseband data, we compared the processing speed of PSRDP with that of DSPSR, and the results showed that PSRDP was not slower than DSPSR. The theoretical and technical experience gained from the PSRDP research lays a technical foundation for the real-time processing of ultra-wide bandwidth pulsar baseband data from the QTT (Qi Tai radio Telescope).
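Of the steps listed above, folding integration is the easiest to illustrate: detected samples are assigned to pulse-phase bins at the known period and averaged. The sketch below is a plain CPU version and is not the PSRDP GPU implementation. The name `fold`, the use of sample midpoints for the phase, and the constant period are assumptions.

```python
def fold(samples, dt, period, n_bins):
    """Average a detected time series into `n_bins` pulse-phase bins."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, value in enumerate(samples):
        # Phase of the sample midpoint, in [0, 1), for a constant period.
        phase = ((i + 0.5) * dt / period) % 1.0
        b = min(int(phase * n_bins), n_bins - 1)
        sums[b] += value
        counts[b] += 1
    # Mean per bin; bins that received no samples are reported as 0.0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]
```

A real pipeline would predict the phase from a timing model rather than assume a constant period, which is why the abstract lists phase and folding period prediction as a separate step.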
Abstract: Sentiment analysis, a crucial task in discerning emotional tones within text, plays a pivotal role in understanding public opinion and user sentiment across diverse languages. While numerous scholars conduct sentiment analysis in widely spoken languages such as English, Chinese, Arabic, and Roman Arabic, resource-poor languages such as Urdu remain a challenge. Urdu is a uniquely crafted language, characterized by a script that amalgamates elements from diverse languages, including Arabic, Parsi, Pashtu, Turkish, Punjabi, and Saraiki. Urdu literature, with its distinct character sets and linguistic features, presents an additional hurdle due to the lack of accessible datasets, rendering sentiment analysis a formidable undertaking. The limited availability of resources has fueled increased interest among researchers, prompting a deeper exploration into Urdu sentiment analysis. This research is dedicated to Urdu language sentiment analysis, employing deep learning models on an extensive dataset categorized into five labels: Positive, Negative, Neutral, Mixed, and Ambiguous. The primary objective is to discern sentiments and emotions within the Urdu language, despite the absence of well-curated datasets. To tackle this challenge, the initial step involves the creation of a comprehensive Urdu dataset by aggregating data from various sources such as newspapers, articles, and social media comments. After data collection, a thorough process of cleaning and preprocessing is implemented to ensure the quality of the data. The study leverages two well-known deep learning models, namely Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), for both training and evaluating sentiment analysis performance. Additionally, the study explores hyperparameter tuning to optimize the models’ efficacy. Evaluation metrics such as precision, recall, and the F1-score are employed to assess the effectiveness of the models. The research findings reveal that RNN surpasses CNN in Urdu sentiment analysis, reaching a significantly higher accuracy of 91%. This result highlights the strong performance of RNN and makes it a compelling option for sentiment analysis tasks in the Urdu language.
Abstract: To reduce the risk of non-performing loans and losses and to improve loan approval efficiency, it is necessary to establish an intelligent loan risk and approval prediction system. A hybrid deep learning model with a 1DCNN-attention network and enhanced preprocessing techniques is proposed for loan approval prediction. The proposed model consists of enhanced data preprocessing and a stack of multiple hybrid modules. First, the enhanced data preprocessing combines methods such as standardization, SMOTE oversampling, feature construction, recursive feature elimination (RFE), information value (IV), and principal component analysis (PCA), which not only eliminates the effects of data jitter and class imbalance but also removes redundant features while improving the representation of features. Subsequently, a hybrid module that combines a 1DCNN with an attention mechanism is proposed to extract local and global spatio-temporal features. Finally, comprehensive experiments validate that the proposed model surpasses state-of-the-art baseline models across various performance metrics, including accuracy, precision, recall, F1 score, and AUC. The proposed model helps to automate the loan approval process and provides scientific guidance to financial institutions for loan risk control.
Abstract: The processing of measuring data plays an important role in reverse engineering. Based on grey system theory, we propose several methods for processing measuring data in reverse engineering. The measured data usually contain some abnormalities. When the abnormal data are eliminated by filtering, blanks are created. Grey generation and GM(1,1) are used to create new data for these blanks. For an uneven data sequence created by measuring error, mean generation is used to smooth it, and then stepwise and smooth generations are used to improve the data sequence.
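As a rough illustration of how GM(1,1) can supply a value for a blank, the sketch below fits the standard grey model to the points preceding the gap and extrapolates one step. It is the textbook GM(1,1) formulation and not necessarily the exact procedure of the paper. The names `gm11_fit` and `gm11_predict` are hypothetical, and the code assumes a positive sequence with a nonzero development coefficient `a`.

```python
import math


def gm11_fit(x0):
    """Least-squares fit of x0[k] = -a * z1[k] + b for the GM(1,1) model."""
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)  # accumulated generating sequence
    # Mean generation of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)  # slope equals -a
    b = (sy - slope * sz) / m
    return -slope, b


def gm11_predict(x0, k):
    """Model value of the original sequence at index k (k may exceed len(x0) - 1)."""
    a, b = gm11_fit(x0)  # assumes a != 0

    def x1_hat(i):
        # Time response of the accumulated sequence.
        return (x0[0] - b / a) * math.exp(-a * i) + b / a

    return x0[0] if k == 0 else x1_hat(k) - x1_hat(k - 1)
```

Calling `gm11_predict(points_before_gap, len(points_before_gap))` gives a one-step extrapolation that could stand in for a removed sample. For near-exponential data the estimate is close but not exact, since the model discretizes the underlying differential equation.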
Funding: Weaponry Equipment Pre-Research Foundation of PLA Equipment Ministry (No. 9140A06050409JB8102); Pre-Research Foundation of PLA University of Science and Technology (No. 2009JSJ11).
Abstract: To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (simple protocol and RDF query language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. Minimal connectable units are joined according to semantic queries to produce semantically correct query plans. Algorithms for query rewriting and transforming are presented, and their computational complexity is discussed. In the worst case, the query decomposing algorithm finishes in O(n²) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments, which show that when the length of the query is less than 8, the query processing algorithms provide satisfactory performance.
Abstract: A television-based multistatic radar system is described. The commercial television transmitter is used as the illuminator in the multistatic radar system. The reflected commercial television signals are measured by an array of sensors. A data processing scheme is developed that adapts to the poor signal processing ability. The innovation lies in the construction of the observation space, which reduces the nonlinearity error. The new method leads to better system stability than the traditional one. Monte Carlo simulation is used to compare the new method with the traditional method.