Research data infrastructures form the cornerstone in both cyber and physical spaces, driving the progression of the data-intensive scientific research paradigm. This opinion paper presents an overview of global research data infrastructure, drawing insights from national roadmaps and strategic documents related to research data infrastructure. It emphasizes the pivotal role of research data infrastructures by delineating four new missions aimed at positioning them at the core of the current scientific research and communication ecosystem. The four new missions of research data infrastructures are: (1) as a pioneer, to transcend disciplinary borders and address complex, cutting-edge scientific and social challenges with problem- and data-oriented insights; (2) as an architect, to establish a digital, intelligent, flexible research and knowledge services environment; (3) as a platform, to foster high-end academic communication; (4) as a coordinator, to balance scientific openness with ethical needs.
In order to address the problems of a single encryption algorithm, such as low encryption efficiency and unreliable metadata, for static data storage on big data platforms in the cloud computing environment, we propose a Hadoop-based big data secure storage scheme. Firstly, in order to disperse the NameNode service from a single server to multiple servers, we combine the HDFS federation and HDFS high-availability mechanisms, and use the Zookeeper distributed coordination mechanism to coordinate each node to achieve dual-channel storage. Then, we improve the ECC encryption algorithm for the encryption of ordinary data, and adopt a homomorphic encryption algorithm to encrypt data that needs to be calculated. To accelerate the encryption, we adopt the dual-thread encryption mode. Finally, the HDFS control module is designed to combine the encryption algorithm with the storage model. Experimental results show that the proposed solution solves the problem of a single point of failure of metadata, performs well in terms of metadata reliability, and can realize the fault tolerance of the server. The improved encryption algorithm integrates the dual-channel storage mode, and the encryption storage efficiency improves by 27.6% on average.
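The dual-thread encryption mode lends itself to a short illustration. Below is a minimal Python sketch of the dispatch logic, assuming two hypothetical placeholder ciphers (encrypt_ordinary standing in for the improved ECC path, encrypt_computable for the homomorphic path); the scheme's actual algorithms are not reproduced here.

```python
# Minimal sketch of dual-thread encryption dispatch; the two cipher
# functions are hypothetical placeholders, not the paper's algorithms.
from concurrent.futures import ThreadPoolExecutor

def encrypt_ordinary(block: bytes) -> bytes:
    # Placeholder for the improved ECC path (ordinary data).
    return bytes(b ^ 0x5A for b in block)

def encrypt_computable(block: bytes) -> bytes:
    # Placeholder for the homomorphic path (data that must stay computable).
    return bytes(b ^ 0xA5 for b in block)

def encrypt_blocks(blocks):
    """Route each (kind, payload) pair to one of two encryption workers."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(
                encrypt_computable if kind == "computable" else encrypt_ordinary,
                payload,
            )
            for kind, payload in blocks
        ]
        return [f.result() for f in futures]

print(encrypt_blocks([("ordinary", b"hdfs-block-1"), ("computable", b"salary:100")]))
```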
Capabilities to assimilate Geostationary Operational Environmental Satellite “R-series” (GOES-R) Geostationary Lightning Mapper (GLM) flash extent density (FED) data within the operational Gridpoint Statistical Interpolation ensemble Kalman filter (GSI-EnKF) framework were previously developed and tested with a mesoscale convective system (MCS) case. In this study, such capabilities are further developed to assimilate GOES GLM FED data within the GSI ensemble-variational (EnVar) hybrid data assimilation (DA) framework. The results of assimilating the GLM FED data using 3DVar and pure En3DVar (PEn3DVar, using 100% ensemble covariance and no static covariance) are compared with those of EnKF/DfEnKF for a supercell storm case. The focus of this study is to validate the correctness and evaluate the performance of the new implementation rather than to compare the performance of FED DA among different DA schemes. Only the results of 3DVar and PEn3DVar are examined and compared with EnKF/DfEnKF. Assimilation of a single FED observation shows that the magnitude and horizontal extent of the analysis increments from PEn3DVar are generally larger than those from EnKF, which is mainly caused by the different localization strategies used in EnKF/DfEnKF and PEn3DVar, as well as by the integration limits of the graupel mass in the observation operator. Overall, the forecast performance of PEn3DVar is comparable to EnKF/DfEnKF, suggesting correct implementation.
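The role of the graupel-mass integration limits in the FED observation operator can be made concrete with a schematic sketch. The operator below simply scales vertically integrated graupel mass between two height limits; the limits and the scaling coefficient are illustrative assumptions, not the calibrated values of the study.

```python
# Schematic FED observation operator: flash extent density taken as
# proportional to column-integrated graupel mass between two limits.
import numpy as np

def fed_from_graupel(qg, rho, z, z_bot=2000.0, z_top=15000.0, coeff=1.0e-8):
    """Map a graupel mixing-ratio profile (kg/kg) to a flash extent density.

    qg, rho, z are 1-D profiles: mixing ratio, air density (kg/m^3), height (m).
    z_bot/z_top are the integration limits; coeff is an illustrative scaling.
    """
    mask = (z >= z_bot) & (z <= z_top)                      # integration limits
    dz = np.gradient(z)                                     # layer thickness (m)
    column_mass = np.sum(qg[mask] * rho[mask] * dz[mask])   # kg/m^2
    return coeff * column_mass                              # flashes per cell per window

z = np.linspace(0.0, 18000.0, 60)
rho = 1.2 * np.exp(-z / 8000.0)                             # synthetic density profile
qg = 1e-3 * np.exp(-(((z - 7000.0) / 2000.0) ** 2))         # synthetic graupel layer
print(fed_from_graupel(qg, rho, z))
```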
Various approaches have been proposed to minimize the upper-level systematic biases in global numerical weather prediction (NWP) models by using satellite upper-air sounding channels as anchors. However, since the China Meteorological Administration Global Forecast System (CMA-GFS) has a model top near 0.1 hPa (60 km), the upper-level temperature bias may exceed 4 K near 1 hPa and further extend to 5 hPa. In this study, channels 12–14 of the Advanced Microwave Sounding Unit-A (AMSU-A) onboard five satellites of NOAA and METOP, whose weighting-function peaks range from 10 to 2 hPa, are all used as anchor observations in CMA-GFS. It is shown that the new “Anchor” approach can effectively reduce the biases near the model top and their downward propagation in three-month assimilation cycles. The bias growth rate of simulated upper-level channel observations is reduced to ±0.001 K d⁻¹, compared to −0.03 K d⁻¹ derived from the current dynamic correction scheme. The relatively stable bias significantly improves the upper-level analysis field and leads to better global medium-range forecasts up to 10 days, with significant reductions in the temperature and geopotential forecast errors above 10 hPa.
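The quoted growth rates can be reproduced from an innovation time series by a simple linear fit, as in this small sketch; the series and its drift are synthetic, invented for illustration only.

```python
# Bias growth rate as the slope of a linear fit to daily mean
# observation-minus-background brightness temperatures (K per day).
import numpy as np

days = np.arange(90.0)                                   # three-month cycle
rng = np.random.default_rng(0)
bias = 0.2 - 0.03 * days + 0.05 * rng.standard_normal(90)  # synthetic drifting bias
growth_rate = np.polyfit(days, bias, 1)[0]               # slope, K per day
print(f"bias growth rate: {growth_rate:+.3f} K/day")
```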
The main obstacle to the open sharing of scientific data is the lack of a legal protection system for intellectual property. This article analyzes the progress of research papers on intellectual property in scientific data in China through literature search and statistics. Currently, research subjects are unbalanced, research content is uneven, research methods are homogeneous, and research depth is insufficient. It is recommended that different stakeholders engage in deep cross-disciplinary cooperation, further improve China's legal and policy protection system for scientific data intellectual property, and promote the open sharing of scientific data.
Every day, an NDT (Non-Destructive Testing) report will govern key decisions and inform inspection strategies that could affect the flow of millions of dollars, which ultimately affects local environments and potential risk to life. There is a direct correlation between report quality and equipment capability. The more able the equipment is, in terms of efficient data gathering, signal-to-noise ratio, positioning, and coverage, the more actionable the report is. This results in optimal maintenance and repair strategies, provided the report is clear and well presented. Furthermore, when considering tank floor storage inspection, it is essential that asset owners have total confidence in inspection findings and the ensuing reports. Tank floor inspection equipment must not only be efficient and highly capable, but data sets should be traceable and integrity maintained throughout. Corrosion mapping of large surface areas such as storage tank bottoms is an inherently arduous and time-consuming process. MFL (magnetic flux leakage) based tank bottom scanners present a well-established and highly rated method for inspection. There are many benefits of using modern MFL technology to generate actionable reports. Chief among these is efficiency of coverage while gaining valuable information regarding defect location, severity, surface origin, and the extent of coverage. More recent advancements in modern MFL tank bottom scanners afford the ability to scan and record data sets at areas of the tank bottom which were previously classed as dead zones, or areas not scanned due to physical restraints. An example of this includes scanning the CZ (critical zone), which is the area close to the annular-to-shell junction weld. Inclusion of these additional dead zones increases overall inspection coverage, quality, and traceability. Inspection of the CZ areas allows engineers to quickly determine the integrity of arguably the most important area of the tank bottom. Herein we discuss notable developments in CZ coverage, inspection efficiency, and data integrity that combine to deliver an actionable report. The asset owner can interrogate this report to develop pertinent and accurate maintenance and repair strategies.
That the world is a global village is no longer news, thanks to the tremendous advancement in Information Communication Technology (ICT). The metamorphosis of human data storage and analysis from analogue, through the Jacquard loom and the mainframe computer, to the present high-power processing computers with sextillion-byte storage capacity has prompted discussion of the Big Data concept as a tool for managing hitherto all human challenges of complex human-system multiplier effects. Supply chain management (SCM), which deals with spatial service delivery that must be safe, efficient, reliable, cheap, transparent, and foreseeable to meet customers' needs, cannot but employ big data tools in its operation. This study employs secondary data online to review the importance of big data in supply chain management and the levels of adoption in Nigeria. The study revealed that the application of big data tools in SCM and other industrial sectors is synonymous with human and national development. It is therefore recommended that both private and governmental bodies should key into e-transactions for easy data assemblage and analysis for profitable forecasting and policy formation.
With the rapid development of modern science and technology, traditional randomized controlled trials have become insufficient to meet current scientific research needs, particularly in the field of clinical research. The emergence of real-world data studies, which align more closely with actual clinical evidence, has garnered significant attention in recent years. The following is a brief overview of the specific utilization of real-world data in drug development, which often involves large sample sizes and analyses covering a relatively diverse population without strict inclusion and exclusion criteria. Real-world data often reflects real clinical practice: treatment options are chosen according to the actual conditions and willingness of patients rather than through random assignment. Analysis based on real-world data also focuses on endpoints highly relevant to clinical benefits and the quality of life of patients. The booming big data technology supports the utilization of real-world data to accelerate new drug development, serving as an important supplement to traditional clinical trials.
As an introductory course for the emerging major of big data management and application, “Introduction to Big Data” has not yet acquired a widely accepted curriculum standard and implementation plan. To this end, we discuss some of our explorations and attempts in the construction and teaching of big data courses for the major of big data management and application, from the perspectives of course planning, course implementation, and course summary. Interviews with students and questionnaire feedback show that students are highly satisfied with some of the teaching measures and programs currently adopted.
Background: Missing data frequently occur in clinical studies. With the development of precision medicine, there is increased interest in N-of-1 trials. Bayesian models are one of the main statistical methods for analyzing the data of N-of-1 trials. This simulation study aimed to compare two statistical methods for handling missing values of quantitative data in Bayesian N-of-1 trials. Methods: Simulated data of N-of-1 trials with different coefficients of autocorrelation, effect sizes, and missing ratios were obtained with the SAS 9.1 system. The missing values were filled with mean filling and regression filling, respectively, under different coefficients of autocorrelation, effect sizes, and missing ratios using SPSS 25.0 software. Bayesian models were built to estimate the posterior means with WinBUGS 14 software. Results: When the missing ratio is relatively small, e.g., 5%, missing values have relatively little effect on the results. Therapeutic effects may be underestimated when the coefficient of autocorrelation increases and no filling is used. However, they may be overestimated when mean or regression filling is used, and the results after mean filling are closer to the actual effect than those after regression filling. In the case of a moderate missing ratio, the estimated effect after mean filling is closer to the actual effect than that after regression filling. When a large missing ratio (20%) occurs, missing data can lead to significant underestimation of the effect. In this case, the estimated effect after regression filling is closer to the actual effect than that after mean filling. Conclusion: Missing data can affect the therapeutic effects estimated with Bayesian models in N-of-1 trials. The present study suggests that mean filling can be used when the missing ratio is ≤10%; otherwise, regression filling may be preferable.
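For concreteness, here is a minimal sketch of the two filling strategies compared in the study, applied to one synthetic autocorrelated series; the paper's actual simulation pipeline (SAS/SPSS/WinBUGS) is not reproduced.

```python
# Mean filling vs. regression filling on one simulated N-of-1 series.
import numpy as np

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(0.0, 1.0, 40)) + 10.0   # autocorrelated outcome series
miss = rng.random(40) < 0.10                     # ~10% of points missing
t = np.arange(40.0)

# Mean filling: replace missing points with the mean of observed points.
y_mean = y.copy()
y_mean[miss] = y[~miss].mean()

# Regression filling: predict missing points from a linear fit on time.
slope, intercept = np.polyfit(t[~miss], y[~miss], 1)
y_reg = y.copy()
y_reg[miss] = slope * t[miss] + intercept

print("mean-filled:", np.round(y_mean[miss], 2))
print("regression-filled:", np.round(y_reg[miss], 2))
```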
This paper investigates the security issue of multisensor remote estimation systems. An optimal stealthy false data injection (FDI) attack scheme based on historical and current residuals, which only tampers with the measurement residuals of partial sensors due to limited attack resources, is proposed to maximally degrade system estimation performance. The attack stealthiness condition is given, and then the estimation error covariance in the compromised state is derived to quantify the system performance under attack. The optimal attack strategy is obtained by solving several convex optimization problems that maximize the trace of the compromised estimation error covariance subject to the stealthiness condition. Moreover, due to the constraint of attack resources, a selection principle for the attacked sensors is provided to determine which sensor to attack so as to have the greatest impact on system performance. Finally, simulation results are presented to verify the theoretical analysis.
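The optimization pattern (maximize the trace of an error-covariance term subject to a stealthiness constraint) can be sketched as a toy semidefinite program in cvxpy. The matrices below are illustrative stand-ins, not the paper's system model, and the constraint is a simplified surrogate for the paper's stealthiness condition.

```python
# Toy SDP in the spirit of the attack design: maximize trace of a
# compromised-covariance term subject to a semidefinite stealthiness bound.
import cvxpy as cp
import numpy as np

n = 3
Sigma_r = np.diag([1.0, 2.0, 1.5])        # nominal residual covariance (stand-in)
K = np.array([[0.5, 0.1, 0.0],
              [0.0, 0.4, 0.1],
              [0.1, 0.0, 0.6]])           # estimator gain (stand-in)

T = cp.Variable((n, n), PSD=True)          # covariance added by the attack
objective = cp.Maximize(cp.trace(K @ T @ K.T))
constraints = [T << Sigma_r]               # stealthiness surrogate: stay under nominal
prob = cp.Problem(objective, constraints)
prob.solve()
print(round(prob.value, 3))
print(np.round(T.value, 3))
```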
With advanced communication technologies, cyber-physical systems such as networked industrial control systems can be monitored and controlled by a remote control center via communication networks. While many benefits can be achieved with such a configuration, it also brings the concern of cyber attacks to the industrial control systems, such as networked manipulators that are widely adopted in industrial automation. For such systems, a false data injection attack on a control-center-to-manipulator (CC-M) communication channel is undesirable and has negative effects on manufacturing quality. In this paper, we propose a resilient remote kinematic control method for serial manipulators undergoing a false data injection attack by leveraging the kinematic model. Theoretical analysis shows that the proposed method can guarantee asymptotic convergence of the regulation error to zero in the presence of a type of false data injection attack. The efficacy of the proposed method is validated via simulations.
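As background, the kinematic-model-based loop such a controller builds on can be sketched for a planar two-link arm: velocity-level regulation through the Jacobian pseudoinverse. The paper's resilience mechanism is not reproduced here; this only shows the nominal loop, with illustrative link lengths and gains.

```python
# Nominal kinematic regulation for a planar two-link arm:
# qdot = pinv(J(q)) @ (k * (x_star - fk(q))).
import numpy as np

L1 = L2 = 1.0  # link lengths (m), assumed for illustration

def fk(q):
    """End-effector position of the two-link arm."""
    return np.array([L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
                     L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])])

def jac(q):
    """Manipulator Jacobian d(fk)/dq."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

q = np.array([0.3, 0.8])            # initial joint angles (rad)
x_star = np.array([1.2, 0.8])       # desired end-effector position (m)
dt, k = 0.01, 2.0                   # step size (s) and feedback gain
for _ in range(500):
    q = q + dt * (np.linalg.pinv(jac(q)) @ (k * (x_star - fk(q))))
print(np.round(fk(q), 4))           # converges toward x_star
```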
Data protection in databases is critical for any organization, as unauthorized access or manipulation can have severe negative consequences. Intrusion detection systems are essential for keeping databases secure. Advancements in technology will lead to significant changes in the medical field, improving healthcare services through real-time information sharing. However, reliability and consistency issues still need to be solved. Safeguards against cyber-attacks are necessary due to the risk of unauthorized access to sensitive information and potential data corruption. Disruptions to data items can propagate throughout the database, making it crucial to reverse fraudulent transactions without delay, especially in the healthcare industry, where real-time data access is vital. This research presents a role-based access control architecture for an anomaly detection technique. Additionally, the Structured Query Language (SQL) queries are stored in a new data structure called a Pentaplet. These pentaplets allow us to maintain the correlation between SQL statements within the same transaction by employing transaction-log entry information, thereby increasing detection accuracy, particularly for individuals within the company exhibiting unusual behavior. To identify anomalous queries, this system employs a supervised machine learning technique called the Support Vector Machine (SVM). According to experimental findings, the proposed model performed well in terms of detection accuracy, achieving 99.92% through SVM with One-Hot Encoding and Principal Component Analysis (PCA).
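A minimal scikit-learn pipeline in the spirit of this detector (one-hot encoding, then PCA, then an SVM) is sketched below on toy query features. The Pentaplet structure and transaction-log correlation of the paper are not reproduced; the features and labels are invented, and sparse_output=False assumes scikit-learn 1.2 or later.

```python
# Toy anomaly classifier: one-hot encode query features, reduce with PCA,
# classify with an SVM. Features and labels are illustrative only.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.decomposition import PCA
from sklearn.svm import SVC

X = np.array([["SELECT", "patients"], ["SELECT", "billing"],
              ["UPDATE", "patients"], ["DELETE", "billing"],
              ["SELECT", "patients"], ["DELETE", "patients"]])
y = np.array([0, 0, 0, 1, 0, 1])          # 1 = anomalous transaction

clf = Pipeline([
    ("onehot", OneHotEncoder(handle_unknown="ignore", sparse_output=False)),
    ("pca", PCA(n_components=2)),
    ("svm", SVC(kernel="rbf")),
])
clf.fit(X, y)
print(clf.predict([["DELETE", "billing"], ["SELECT", "patients"]]))
```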
Nowadays, numerous applications are associated with the cloud, and user data gets collected globally and stored in cloud units. In addition to shared data storage, the cloud computing technique offers multiple advantages for the user through different distribution designs like hybrid cloud, public cloud, community cloud, and private cloud. Though cloud-based computing solutions are highly convenient to users, they also bring a challenge, i.e., the security of the shared data. Hence, in the current research paper, blockchain with a data integrity authentication technique is developed for efficient and secure operation with a user authentication process. Blockchain technology is utilized in this study to enable efficient and secure operation, which not only empowers cloud security but also avoids threats and attacks. Additionally, the data integrity authentication technique is also utilized to limit unwanted access to data in the cloud storage unit. The major objective of the projected technique is to empower data security and user authentication in the cloud computing environment. To improve the proposed authentication process, a cuckoo filter and a Merkle Hash Tree (MHT) are utilized. The proposed methodology was validated using a few performance metrics such as processing time, uploading time, downloading time, authentication time, consensus time, waiting time, and initialization time, in addition to storage overhead. The proposed method was compared with conventional cloud security techniques, and the outcomes establish the supremacy of the proposed method.
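The Merkle Hash Tree side of such a scheme reduces to a root computation over data blocks, sketched below (the cuckoo-filter side is omitted). Any single-block change yields a different root, which is what exposes tampering.

```python
# Merkle root over data blocks; recomputing the root verifies integrity.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash leaves pairwise up to a single root, duplicating odd tail nodes."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                      # duplicate last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

print(merkle_root([b"block-0", b"block-1", b"block-2"]).hex())
print(merkle_root([b"block-0", b"tampered", b"block-2"]).hex())  # differs
```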
The recent developments in smart cities pose major security issues for Internet of Things (IoT) devices. These security issues directly result from inappropriate security management protocols and their implementation by IoT gadget developers. Cyber-attackers take advantage of such gadgets' vulnerabilities through various attacks such as injection and Distributed Denial of Service (DDoS) attacks. In this background, Intrusion Detection (ID) is the only way to identify the attacks and mitigate their damage. The recent advancements in Machine Learning (ML) and Deep Learning (DL) models are useful in effectively classifying cyber-attacks. The current research paper introduces a new Coot Optimization Algorithm with Deep Learning-based False Data Injection Attack Recognition (COADL-FDIAR) model for the IoT environment. The presented COADL-FDIAR technique aims to identify false data injection attacks in the IoT environment. To accomplish this, the COADL-FDIAR model initially preprocesses the input data and selects features with the help of the Chi-square test. To detect and classify false data injection attacks, the Stacked Long Short-Term Memory (SLSTM) model is exploited in this study. Finally, the COA algorithm adjusts the SLSTM model's hyperparameters effectively and accomplishes a superior recognition efficiency. The proposed COADL-FDIAR model was experimentally validated using a standard dataset, and the outcomes were scrutinized under distinct aspects. The comparative analysis results assured the superior performance of the proposed COADL-FDIAR model over other recent approaches, with a maximum accuracy of 98.84%.
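The Chi-square feature-selection step can be illustrated with scikit-learn on synthetic non-negative features, as below; the stacked-LSTM classifier and the COA hyperparameter search are not reproduced.

```python
# Chi-square feature selection on synthetic non-negative data.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
X = rng.integers(0, 10, size=(200, 8)).astype(float)   # non-negative features
y = (X[:, 2] + X[:, 5] > 10).astype(int)               # label driven by features 2 and 5

selector = SelectKBest(chi2, k=3).fit(X, y)
print("selected feature indices:", np.flatnonzero(selector.get_support()))
```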
Data integrity is a critical component of data lifecycle management, and its importance increases even further in a complex and dynamic landscape. Actions like unauthorized access, unauthorized modification, data manipulation, audit tampering, data backdating, data falsification, phishing, and spoofing are no longer restricted to rogue individuals but are in fact also prevalent in systematic organizations and states. Therefore, data security requires strong data integrity measures and associated technical controls to be in place. Without a proper customized framework in place, organizations are prone to a high risk of financial, reputational, and revenue losses, bankruptcies, and legal penalties, which we shall discuss throughout this paper. We will also explore some of the improvised and innovative techniques in product development that better tackle the challenges and requirements of data security and integrity.
Since its inception in the 1970s, multi-dimensional magnetic resonance (MR) has emerged as a powerful tool for non-invasive investigations of structures and molecular interactions. MR spectroscopy beyond one dimension allows the study of correlation and exchange processes, and the separation of overlapping spectral information. The multi-dimensional concept has been re-implemented over the last two decades to explore molecular motion and spin dynamics in porous media. Apart from the Fourier transform, methods have been developed for processing multi-dimensional time-domain data, identifying fluid components, and estimating pore surface permeability via joint relaxation and diffusion spectra. Through the resolution of spectroscopic signals with spatial encoding gradients, multi-dimensional MR imaging has been widely used to investigate the microscopic environment of living tissues and distinguish diseases. Signals in each voxel are usually expressed as a multi-exponential decay, representing microstructures or environments along multiple pore scales. The separation of contributions from different environments is a common ill-posed problem, which can be resolved numerically. Moreover, the inversion methods and experimental parameters determine the resolution of multi-dimensional spectra. This paper reviews the algorithms that have been proposed to process multi-dimensional MR datasets in different scenarios. Detailed information at the microscopic level, such as tissue components, fluid types, and food structures in multi-disciplinary sciences, can be revealed through multi-dimensional MR.
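A compact example of such an ill-posed multi-exponential inversion: recover a relaxation-time spectrum f(T2) from y(t) = Σ_j exp(−t/T2_j) f_j by non-negative least squares with Tikhonov damping. The kernel, grid, and regularization weight below are illustrative choices on synthetic data, not a specific method from the review.

```python
# Regularized NNLS inversion of a synthetic multi-exponential decay.
import numpy as np
from scipy.optimize import nnls

t = np.linspace(1e-3, 1.0, 200)                  # measurement times (s)
T2 = np.logspace(-3, 0, 50)                      # candidate relaxation times (s)
K = np.exp(-t[:, None] / T2[None, :])            # multi-exponential kernel

f_true = np.exp(-0.5 * ((np.log10(T2) + 1.5) / 0.15) ** 2)   # one spectral peak
y = K @ f_true + 1e-3 * np.random.default_rng(2).standard_normal(t.size)

alpha = 1e-2                                     # Tikhonov regularization weight
K_aug = np.vstack([K, np.sqrt(alpha) * np.eye(T2.size)])     # damped system
y_aug = np.concatenate([y, np.zeros(T2.size)])
f_est, _ = nnls(K_aug, y_aug)
print("recovered peak near T2 =", T2[np.argmax(f_est)], "s")
```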
Assimilation of Advanced Geostationary Radiation Imager (AGRI) clear-sky radiances in a regional model is performed. The forecasting effectiveness of assimilating two water vapor (WV) channels together with conventional observations for the “21·7” Henan extremely heavy rainfall is analyzed and compared with a baseline test that assimilates only conventional observations. The results show that the 24-h cumulative precipitation forecast by the assimilation experiment with the addition of AGRI exceeds 500 mm, compared to a maximum value of 532.6 mm measured by the national meteorological stations, and that the location of the maximum precipitation is consistent with the observations. For the short periods of intense precipitation, the simulated location and intensity of the 3-h cumulative precipitation are also relatively accurate. The analysis increment shows that the main difference between the two sets of assimilation experiments is over the ocean, owing to the additional ocean observations provided by FY-4A, which compensate for the lack of ocean observations. The assimilation of satellite data adjusts the vertical and horizontal wind fields over the ocean by adjusting the atmospheric temperature and humidity, which ultimately results in a narrower and stronger WV transport path to the center of heavy precipitation in Zhengzhou in the lower troposphere. Conversely, the WV convergence and upward motion in the control experiment are more dispersed; therefore, the precipitation centers are also correspondingly more dispersed.
Since 2008, a network of five sea-level monitoring stations has been progressively installed in French Polynesia. The stations are autonomous, and the data, collected at a sampling rate of 1 or 2 min, are not only recorded locally but also transferred in real time by a radio link to NOAA through the GOES satellite. The new ET34-ANA-V80 version of ETERNA, initially developed for Earth tides analysis, is now able to analyze ocean tides records. Through a two-step validation scheme, we took advantage of the flexibility of this new version, operated in conjunction with the preprocessing facilities of the Tsoft software, to recover corrected data series able to model sea-level variations after elimination of the ocean tides signal. We performed the tidal analysis of the tide gauge data with the highest possible selectivity (optimal wave grouping) and a maximum of additional terms (shallow-water constituents). Our goal was to provide corrected data series and a modelled ocean tides signal to compute tide-free sea-level variations, as well as tidal prediction models with centimeter precision. We also present in this study the characteristics of the ocean tides in French Polynesia and preliminary results concerning the non-tidal variations of the sea level at the tide gauge sites.
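At its core, tidal analysis of this kind is a harmonic least-squares fit. The bare-bones sketch below estimates amplitudes for four named constituents from a synthetic record; ET34-ANA-V80 of course does this with far finer wave grouping, shallow-water terms, and error estimation.

```python
# Harmonic least-squares fit of a synthetic 60-day, 2-min sea-level record.
import numpy as np

freqs = {"M2": 28.9841042, "S2": 30.0, "K1": 15.0410686, "O1": 13.9430356}  # deg/hour
t = np.arange(0.0, 24.0 * 60.0, 1.0 / 30.0)      # 60 days sampled every 2 min (hours)

rng = np.random.default_rng(3)
sea = (0.50 * np.cos(np.radians(freqs["M2"]) * t - 1.0)    # synthetic M2
       + 0.12 * np.cos(np.radians(freqs["K1"]) * t + 0.4)  # synthetic K1
       + 0.02 * rng.standard_normal(t.size))               # noise

cols = []
for w in freqs.values():                          # cos/sin pair per constituent
    cols += [np.cos(np.radians(w) * t), np.sin(np.radians(w) * t)]
A = np.column_stack(cols)
coef, *_ = np.linalg.lstsq(A, sea, rcond=None)
for name, (a, b) in zip(freqs, coef.reshape(-1, 2)):
    print(f"{name}: amplitude {np.hypot(a, b):.3f} m")
```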
Wind and wave data are essential in climatological and engineering design applications. In this study, data from 15 buoys located throughout the South China Sea (SCS) were used to evaluate the ERA5 wind and wave data. Such an applicability assessment is beneficial for gaining insight into the reliability of the ERA5 data in the SCS. The bias between the ERA5 and observed wind-speed data ranged from −0.78 to 0.99 m/s. The result indicates that, while underestimation of the ERA5 wind-speed data was dominant, overestimation existed as well. Additionally, the ERA5 data underestimated the annual maximum wind speed by up to 38%, with a correlation coefficient >0.87. The bias between the ERA5 and observed significant wave height (SWH) data varied from −0.24 to 0.28 m. The ERA5 data showed a positive SWH bias, which implied a general underestimation at all locations, except those in the Beibu Gulf and central-western SCS, where overestimation was observed. Under extreme conditions, the annual maximum SWH in the ERA5 data was underestimated by up to 30%. The correlation coefficients between the ERA5 and observed SWH data at all locations were greater than 0.92, except in the central-western SCS (0.84). The bias between the ERA5 and observed mean wave period (MWP) data varied from −0.74 to 0.57 s. The ERA5 data showed negative MWP biases, implying a general overestimation at all locations, except for B1 (the Beibu Gulf) and B7 (the northeastern SCS), where underestimation was observed. The correlation coefficient between the ERA5 and observed MWP data in the Beibu Gulf was the smallest (0.56), and those of the other locations fluctuated within a narrow range from 0.82 to 0.90. The intercomparison indicates that, during the analyzed time span, the ERA5 data generally underestimated wind speed and SWH but overestimated MWP. Under non-extreme conditions, the ERA5 wind-speed and SWH data can be used with confidence in most regions of the SCS, except the central-western SCS.
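The basic statistics used throughout this comparison (bias in the observed-minus-ERA5 sense, so positive bias means ERA5 underestimates, plus the Pearson correlation coefficient) are computed below for a synthetic buoy/reanalysis wind-speed pair.

```python
# Bias and Pearson correlation for a synthetic buoy vs. reanalysis pair.
import numpy as np

rng = np.random.default_rng(4)
buoy = rng.gamma(4.0, 1.5, 1000)                            # observed wind speed (m/s)
era5 = 0.95 * buoy - 0.2 + 0.8 * rng.standard_normal(1000)  # reanalysis with bias

bias = np.mean(buoy - era5)            # positive => ERA5 underestimates
r = np.corrcoef(buoy, era5)[0, 1]
print(f"bias = {bias:+.2f} m/s, r = {r:.2f}")
```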