In order to overcome the defects that the analysis of multi-well typical curves of shale gas reservoirs is rarely applied to engineering,this study proposes a robust production data analysis method based on deconvolut...In order to overcome the defects that the analysis of multi-well typical curves of shale gas reservoirs is rarely applied to engineering,this study proposes a robust production data analysis method based on deconvolution,which is used for multi-well inter-well interference research.In this study,a multi-well conceptual trilinear seepage model for multi-stage fractured horizontal wells was established,and its Laplace solutions under two different outer boundary conditions were obtained.Then,an improved pressure deconvolution algorithm was used to normalize the scattered production data.Furthermore,the typical curve fitting was carried out using the production data and the seepage model solution.Finally,some reservoir parameters and fracturing parameters were interpreted,and the intensity of inter-well interference was compared.The effectiveness of the method was verified by analyzing the production dynamic data of six shale gas wells in Duvernay area.The results showed that the fitting effect of typical curves was greatly improved due to the mutual restriction between deconvolution calculation parameter debugging and seepage model parameter debugging.Besides,by using the morphological characteristics of the log-log typical curves and the time corresponding to the intersection point of the log-log typical curves of two models under different outer boundary conditions,the strength of the interference between wells on the same well platform was well judged.This work can provide a reference for the optimization of well spacing and hydraulic fracturing measures for shale gas wells.展开更多
Seeing is an important index to evaluate the quality of an astronomical site.To estimate seeing at the Muztagh-Ata site with height and time quantitatively,the European Centre for Medium-Range Weather Forecasts reanal...Seeing is an important index to evaluate the quality of an astronomical site.To estimate seeing at the Muztagh-Ata site with height and time quantitatively,the European Centre for Medium-Range Weather Forecasts reanalysis database(ERA5)is used.Seeing calculated from ERA5 is compared consistently with the Differential Image Motion Monitor seeing at the height of 12 m.Results show that seeing decays exponentially with height at the Muztagh-Ata site.Seeing decays the fastest in fall in 2021 and most slowly with height in summer.The seeing condition is better in fall than in summer.The median value of seeing at 12 m is 0.89 arcsec,the maximum value is1.21 arcsec in August and the minimum is 0.66 arcsec in October.The median value of seeing at 12 m is 0.72arcsec in the nighttime and 1.08 arcsec in the daytime.Seeing is a combination of annual and about biannual variations with the same phase as temperature and wind speed indicating that seeing variation with time is influenced by temperature and wind speed.The Richardson number Ri is used to analyze the atmospheric stability and the variations of seeing are consistent with Ri between layers.These quantitative results can provide an important reference for a telescopic observation strategy.展开更多
Peanut allergy is majorly related to severe food induced allergic reactions.Several food including cow's milk,hen's eggs,soy,wheat,peanuts,tree nuts(walnuts,hazelnuts,almonds,cashews,pecans and pistachios),fis...Peanut allergy is majorly related to severe food induced allergic reactions.Several food including cow's milk,hen's eggs,soy,wheat,peanuts,tree nuts(walnuts,hazelnuts,almonds,cashews,pecans and pistachios),fish and shellfish are responsible for more than 90%of food allergies.Here,we provide promising insights using a large-scale data-driven analysis,comparing the mechanistic feature and biological relevance of different ingredients presents in peanuts,tree nuts(walnuts,almonds,cashews,pecans and pistachios)and soybean.Additionally,we have analysed the chemical compositions of peanuts in different processed form raw,boiled and dry-roasted.Using the data-driven approach we are able to generate new hypotheses to explain why nuclear receptors like the peroxisome proliferator-activated receptors(PPARs)and its isoform and their interaction with dietary lipids may have significant effect on allergic response.The results obtained from this study will direct future experimeantal and clinical studies to understand the role of dietary lipids and PPARisoforms to exert pro-inflammatory or anti-inflammatory functions on cells of the innate immunity and influence antigen presentation to the cells of the adaptive immunity.展开更多
As the risks associated with air turbulence are intensified by climate change and the growth of the aviation industry,it has become imperative to monitor and mitigate these threats to ensure civil aviation safety.The ...As the risks associated with air turbulence are intensified by climate change and the growth of the aviation industry,it has become imperative to monitor and mitigate these threats to ensure civil aviation safety.The eddy dissipation rate(EDR)has been established as the standard metric for quantifying turbulence in civil aviation.This study aims to explore a universally applicable symbolic classification approach based on genetic programming to detect turbulence anomalies using quick access recorder(QAR)data.The detection of atmospheric turbulence is approached as an anomaly detection problem.Comparative evaluations demonstrate that this approach performs on par with direct EDR calculation methods in identifying turbulence events.Moreover,comparisons with alternative machine learning techniques indicate that the proposed technique is the optimal methodology currently available.In summary,the use of symbolic classification via genetic programming enables accurate turbulence detection from QAR data,comparable to that with established EDR approaches and surpassing that achieved with machine learning algorithms.This finding highlights the potential of integrating symbolic classifiers into turbulence monitoring systems to enhance civil aviation safety amidst rising environmental and operational hazards.展开更多
Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision...Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel’s graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python’s scripting capabilities. This paper covers the integration solution of empowering non-programmers to leverage Python’s capabilities within the familiar Excel environment. This enables users to perform advanced data analysis and automation tasks without requiring extensive programming knowledge. Based on Soliciting feedback from non-programmers who have tested the integration solution, the case study shows how the solution evaluates the ease of implementation, performance, and compatibility of Python with Excel versions.展开更多
To obtain more stable spectral data for accurate quantitative analysis of multi-element,especially for the large-area in-situ elements detection of soils, we propose a method for a multielement quantitative analysis o...To obtain more stable spectral data for accurate quantitative analysis of multi-element,especially for the large-area in-situ elements detection of soils, we propose a method for a multielement quantitative analysis of soils using calibration-free laser-induced breakdown spectroscopy(CF-LIBS) based on data filtering. In this study, we analyze a standard soil sample doped with two heavy metal elements, Cu and Cd, with a specific focus on the line of Cu I324.75 nm for filtering the experimental data of multiple sample sets. Pre-and post-data filtering,the relative standard deviation for Cu decreased from 30% to 10%, The limits of detection(LOD)values for Cu and Cd decreased by 5% and 4%, respectively. Through CF-LIBS, a quantitative analysis was conducted to determine the relative content of elements in soils. Using Cu as a reference, the concentration of Cd was accurately calculated. The results show that post-data filtering, the average relative error of the Cd decreases from 11% to 5%, indicating the effectiveness of data filtering in improving the accuracy of quantitative analysis. Moreover, the content of Si, Fe and other elements can be accurately calculated using this method. To further correct the calculation, the results for Cd was used to provide a more precise calculation. This approach is of great importance for the large-area in-situ heavy metals and trace elements detection in soil, as well as for rapid and accurate quantitative analysis.展开更多
The recent pandemic crisis has highlighted the importance of the availability and management of health data to respond quickly and effectively to health emergencies, while respecting the fundamental rights of every in...The recent pandemic crisis has highlighted the importance of the availability and management of health data to respond quickly and effectively to health emergencies, while respecting the fundamental rights of every individual. In this context, it is essential to find a balance between the protection of privacy and the safeguarding of public health, using tools that guarantee transparency and consent to the processing of data by the population. This work, starting from a pilot investigation conducted in the Polyclinic of Bari as part of the Horizon Europe Seeds project entitled “Multidisciplinary analysis of technological tracing models of contagion: the protection of rights in the management of health data”, has the objective of promoting greater patient awareness regarding the processing of their health data and the protection of privacy. The methodology used the PHICAT (Personal Health Information Competence Assessment Tool) as a tool and, through the administration of a questionnaire, the aim was to evaluate the patients’ ability to express their consent to the release and processing of health data. The results that emerged were analyzed in relation to the 4 domains in which the process is divided which allows evaluating the patients’ ability to express a conscious choice and, also, in relation to the socio-demographic and clinical characteristics of the patients themselves. This study can contribute to understanding patients’ ability to give their consent and improve information regarding the management of health data by increasing confidence in granting the use of their data for research and clinical management.展开更多
How can we efficiently store and mine dynamically generated dense tensors for modeling the behavior of multidimensional dynamic data?Much of the multidimensional dynamic data in the real world is generated in the form...How can we efficiently store and mine dynamically generated dense tensors for modeling the behavior of multidimensional dynamic data?Much of the multidimensional dynamic data in the real world is generated in the form of time-growing tensors.For example,air quality tensor data consists of multiple sensory values gathered from wide locations for a long time.Such data,accumulated over time,is redundant and consumes a lot ofmemory in its raw form.We need a way to efficiently store dynamically generated tensor data that increase over time and to model their behavior on demand between arbitrary time blocks.To this end,we propose a Block IncrementalDense Tucker Decomposition(BID-Tucker)method for efficient storage and on-demand modeling ofmultidimensional spatiotemporal data.Assuming that tensors come in unit blocks where only the time domain changes,our proposed BID-Tucker first slices the blocks into matrices and decomposes them via singular value decomposition(SVD).The SVDs of the time×space sliced matrices are stored instead of the raw tensor blocks to save space.When modeling from data is required at particular time blocks,the SVDs of corresponding time blocks are retrieved and incremented to be used for Tucker decomposition.The factor matrices and core tensor of the decomposed results can then be used for further data analysis.We compared our proposed BID-Tucker with D-Tucker,which our method extends,and vanilla Tucker decomposition.We show that our BID-Tucker is faster than both D-Tucker and vanilla Tucker decomposition and uses less memory for storage with a comparable reconstruction error.We applied our proposed BID-Tucker to model the spatial and temporal trends of air quality data collected in South Korea from 2018 to 2022.We were able to model the spatial and temporal air quality trends.We were also able to verify unusual events,such as chronic ozone alerts and large fire events.展开更多
Multimodal sentiment analysis utilizes multimodal data such as text,facial expressions and voice to detect people’s attitudes.With the advent of distributed data collection and annotation,we can easily obtain and sha...Multimodal sentiment analysis utilizes multimodal data such as text,facial expressions and voice to detect people’s attitudes.With the advent of distributed data collection and annotation,we can easily obtain and share such multimodal data.However,due to professional discrepancies among annotators and lax quality control,noisy labels might be introduced.Recent research suggests that deep neural networks(DNNs)will overfit noisy labels,leading to the poor performance of the DNNs.To address this challenging problem,we present a Multimodal Robust Meta Learning framework(MRML)for multimodal sentiment analysis to resist noisy labels and correlate distinct modalities simultaneously.Specifically,we propose a two-layer fusion net to deeply fuse different modalities and improve the quality of the multimodal data features for label correction and network training.Besides,a multiple meta-learner(label corrector)strategy is proposed to enhance the label correction approach and prevent models from overfitting to noisy labels.We conducted experiments on three popular multimodal datasets to verify the superiority of ourmethod by comparing it with four baselines.展开更多
This research paper compares Excel and R language for data analysis and concludes that R language is more suitable for complex data analysis tasks.R language’s open-source nature makes it accessible to everyone,and i...This research paper compares Excel and R language for data analysis and concludes that R language is more suitable for complex data analysis tasks.R language’s open-source nature makes it accessible to everyone,and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks.It is also highly customizable,allowing users to create custom functions and packages to meet their specific needs.Additionally,R language provides high reproducibility,making it easy to replicate and verify research results,and it has excellent collaboration capabilities,enabling multiple users to work on the same project simultaneously.These advantages make R language a more suitable choice for complex data analysis tasks,particularly in scientific research and business applications.The findings of this study will help people understand that R is not just a language that can handle more data than Excel and demonstrate that r is essential to the field of data analysis.At the same time,it will also help users and organizations make informed decisions regarding their data analysis needs and software preferences.展开更多
Integrated data and energy transfer(IDET)enables the electromagnetic waves to transmit wireless energy at the same time of data delivery for lowpower devices.In this paper,an energy harvesting modulation(EHM)assisted ...Integrated data and energy transfer(IDET)enables the electromagnetic waves to transmit wireless energy at the same time of data delivery for lowpower devices.In this paper,an energy harvesting modulation(EHM)assisted multi-user IDET system is studied,where all the received signals at the users are exploited for energy harvesting without the degradation of wireless data transfer(WDT)performance.The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel.With the aid of the AO based algorithm,the average effective data rate among users are maximized by ensuring the BER and the wireless energy transfer(WET)performance.Simulation results validate and evaluate the IDET performance of the EHM assisted system,which also demonstrates that the optimal number of user clusters and IDET time slots should be allocated,in order to improve the WET and WDT performance.展开更多
This study investigates university English teachers’acceptance and willingness to use learning management system(LMS)data analysis tools in their teaching practices.The research employs a mixed-method approach,combin...This study investigates university English teachers’acceptance and willingness to use learning management system(LMS)data analysis tools in their teaching practices.The research employs a mixed-method approach,combining quantitative surveys and qualitative interviews to understand teachers’perceptions and attitudes,and the factors influencing their adoption of LMS data analysis tools.The findings reveal that perceived usefulness,perceived ease of use,technical literacy,organizational support,and data privacy concerns significantly impact teachers’willingness to use these tools.Based on these insights,the study offers practical recommendations for educational institutions to enhance the effective adoption of LMS data analysis tools in English language teaching.展开更多
The connectivity of sandbodies is a key constraint to the exploration effectiveness of Bohai A Oilfield.Conventional connectivity studies often use methods such as seismic attribute fusion,while the development of con...The connectivity of sandbodies is a key constraint to the exploration effectiveness of Bohai A Oilfield.Conventional connectivity studies often use methods such as seismic attribute fusion,while the development of contiguous composite sandbodies in this area makes it challenging to characterize connectivity changes with conventional seismic attributes.Aiming at the above problem in the Bohai A Oilfield,this study proposes a big data analysis method based on the Deep Forest algorithm to predict the sandbody connectivity.Firstly,by compiling the abundant exploration and development sandbodies data in the study area,typical sandbodies with reliable connectivity were selected.Then,sensitive seismic attribute were extracted to obtain training samples.Finally,based on the Deep Forest algorithm,mapping model between attribute combinations and sandbody connectivity was established through machine learning.This method achieves the first quantitative determination of the connectivity for continuous composite sandbodies in the Bohai Oilfield.Compared with conventional connectivity discrimination methods such as high-resolution processing and seismic attribute analysis,this method can combine the sandbody characteristics of the study area in the process of machine learning,and jointly judge connectivity by combining multiple seismic attributes.The study results show that this method has high accuracy and timeliness in predicting connectivity for continuous composite sandbodies.Applied to the Bohai A Oilfield,it successfully identified multiple sandbody connectivity relationships and provided strong support for the subsequent exploration potential assessment and well placement optimization.This method also provides a new idea and method for studying sandbody connectivity under similar complex geological conditions.展开更多
This paper was motivated by the existing problems of Cloud Data storage in Imo State University, Nigeria such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hac...This paper was motivated by the existing problems of Cloud Data storage in Imo State University, Nigeria such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hackers, thereby making customer/client data visible and unprotected. Also, this led to enormous risk of the clients/customers due to defective equipment, bugs, faulty servers, and specious actions. The aim if this paper therefore is to analyze a secure model using Unicode Transformation Format (UTF) base 64 algorithms for storage of data in cloud securely. The methodology used was Object Orientated Hypermedia Analysis and Design Methodology (OOHADM) was adopted. Python was used to develop the security model;the role-based access control (RBAC) and multi-factor authentication (MFA) to enhance security Algorithm were integrated into the Information System developed with HTML 5, JavaScript, Cascading Style Sheet (CSS) version 3 and PHP7. This paper also discussed some of the following concepts;Development of Computing in Cloud, Characteristics of computing, Cloud deployment Model, Cloud Service Models, etc. The results showed that the proposed enhanced security model for information systems of cooperate platform handled multiple authorization and authentication menace, that only one login page will direct all login requests of the different modules to one Single Sign On Server (SSOS). This will in turn redirect users to their requested resources/module when authenticated, leveraging on the Geo-location integration for physical location validation. The emergence of this newly developed system will solve the shortcomings of the existing systems and reduce time and resources incurred while using the existing system.展开更多
Mobile networks possess significant information and thus are considered a gold mine for the researcher’s community.The call detail records(CDR)of a mobile network are used to identify the network’s efficacy and the ...Mobile networks possess significant information and thus are considered a gold mine for the researcher’s community.The call detail records(CDR)of a mobile network are used to identify the network’s efficacy and the mobile user’s behavior.It is evident from the recent literature that cyber-physical systems(CPS)were used in the analytics and modeling of telecom data.In addition,CPS is used to provide valuable services in smart cities.In general,a typical telecom company hasmillions of subscribers and thus generatesmassive amounts of data.From this aspect,data storage,analysis,and processing are the key concerns.To solve these issues,herein we propose a multilevel cyber-physical social system(CPSS)for the analysis and modeling of large internet data.Our proposed multilevel system has three levels and each level has a specific functionality.Initially,raw Call Detail Data(CDR)was collected at the first level.Herein,the data preprocessing,cleaning,and error removal operations were performed.In the second level,data processing,cleaning,reduction,integration,processing,and storage were performed.Herein,suggested internet activity record measures were applied.Our proposed system initially constructs a graph and then performs network analysis.Thus proposed CPSS system accurately identifies different areas of internet peak usage in a city(Milan city).Our research is helpful for the network operators to plan effective network configuration,management,and optimization of resources.展开更多
The inter-city linkage heat data provided by Baidu Migration is employed as a characterization of inter-city linkages in order to facilitate the study of the network linkage characteristics and hierarchical structure ...The inter-city linkage heat data provided by Baidu Migration is employed as a characterization of inter-city linkages in order to facilitate the study of the network linkage characteristics and hierarchical structure of urban agglomeration in the Greater Bay Area through the use of social network analysis method.This is the inaugural application of big data based on location services in the study of urban agglomeration network structure,which represents a novel research perspective on this topic.The study reveals that the density of network linkages in the Greater Bay Area urban agglomeration has reached 100%,indicating a mature network-like spatial structure.This structure has given rise to three distinct communities:Shenzhen-Dongguan-Huizhou,Guangzhou-Foshan-Zhaoqing,and Zhuhai-Zhongshan-Jiangmen.Additionally,cities within the Greater Bay Area urban agglomeration play different roles,suggesting that varying development strategies may be necessary to achieve staggered development.The study demonstrates that large datasets represented by LBS can offer novel insights and methodologies for the examination of urban agglomeration network structures,contingent on the appropriate mining and processing of the data.展开更多
Guest Editors Prof.Ling Tian Prof.Jian-Hua Tao University of Electronic Science and Technology of China Tsinghua University lingtian@uestc.edu.cn jhtao@tsinghua.edu.cn Dr.Bin Zhou National University of Defense Techno...Guest Editors Prof.Ling Tian Prof.Jian-Hua Tao University of Electronic Science and Technology of China Tsinghua University lingtian@uestc.edu.cn jhtao@tsinghua.edu.cn Dr.Bin Zhou National University of Defense Technology binzhou@nudt.edu.cn, Since the concept of “Big Data” was first introduced in Nature in 2008, it has been widely applied in fields, such as business, healthcare, national defense, education, transportation, and security. With the maturity of artificial intelligence technology, big data analysis techniques tailored to various fields have made significant progress, but still face many challenges in terms of data quality, algorithms, and computing power.展开更多
Laser-induced breakdown spectroscopy(LIBS)has become a widely used atomic spectroscopic technique for rapid coal analysis.However,the vast amount of spectral information in LIBS contains signal uncertainty,which can a...Laser-induced breakdown spectroscopy(LIBS)has become a widely used atomic spectroscopic technique for rapid coal analysis.However,the vast amount of spectral information in LIBS contains signal uncertainty,which can affect its quantification performance.In this work,we propose a hybrid variable selection method to improve the performance of LIBS quantification.Important variables are first identified using Pearson's correlation coefficient,mutual information,least absolute shrinkage and selection operator(LASSO)and random forest,and then filtered and combined with empirical variables related to fingerprint elements of coal ash content.Subsequently,these variables are fed into a partial least squares regression(PLSR).Additionally,in some models,certain variables unrelated to ash content are removed manually to study the impact of variable deselection on model performance.The proposed hybrid strategy was tested on three LIBS datasets for quantitative analysis of coal ash content and compared with the corresponding data-driven baseline method.It is significantly better than the variable selection only method based on empirical knowledge and in most cases outperforms the baseline method.The results showed that on all three datasets the hybrid strategy for variable selection combining empirical knowledge and data-driven algorithms achieved the lowest root mean square error of prediction(RMSEP)values of 1.605,3.478 and 1.647,respectively,which were significantly lower than those obtained from multiple linear regression using only 12 empirical variables,which are 1.959,3.718 and 2.181,respectively.The LASSO-PLSR model with empirical support and 20 selected variables exhibited a significantly improved performance after variable deselection,with RMSEP values dropping from 1.635,3.962 and 1.647 to 1.483,3.086 and 1.567,respectively.Such results demonstrate that using empirical knowledge as a support for datadriven variable selection can be a viable approach to improve the accuracy and reliability of LIBS quantification.展开更多
Capabilities to assimilate Geostationary Operational Environmental Satellite “R-series ”(GOES-R) Geostationary Lightning Mapper(GLM) flash extent density(FED) data within the operational Gridpoint Statistical Interp...Capabilities to assimilate Geostationary Operational Environmental Satellite “R-series ”(GOES-R) Geostationary Lightning Mapper(GLM) flash extent density(FED) data within the operational Gridpoint Statistical Interpolation ensemble Kalman filter(GSI-EnKF) framework were previously developed and tested with a mesoscale convective system(MCS) case. In this study, such capabilities are further developed to assimilate GOES GLM FED data within the GSI ensemble-variational(EnVar) hybrid data assimilation(DA) framework. The results of assimilating the GLM FED data using 3DVar, and pure En3DVar(PEn3DVar, using 100% ensemble covariance and no static covariance) are compared with those of EnKF/DfEnKF for a supercell storm case. The focus of this study is to validate the correctness and evaluate the performance of the new implementation rather than comparing the performance of FED DA among different DA schemes. Only the results of 3DVar and pEn3DVar are examined and compared with EnKF/DfEnKF. Assimilation of a single FED observation shows that the magnitude and horizontal extent of the analysis increments from PEn3DVar are generally larger than from EnKF, which is mainly caused by using different localization strategies in EnFK/DfEnKF and PEn3DVar as well as the integration limits of the graupel mass in the observation operator. Overall, the forecast performance of PEn3DVar is comparable to EnKF/DfEnKF, suggesting correct implementation.展开更多
The literary review presented in the following paper aims to analyze the tracking tools used in different countries during the period of the COVID-19 pandemic. Tracking apps that have been adopted in many countries to...The literary review presented in the following paper aims to analyze the tracking tools used in different countries during the period of the COVID-19 pandemic. Tracking apps that have been adopted in many countries to collect data in a homogeneous and immediate way have made up for the difficulty of collecting data and standardizing evaluation criteria. However, the regulation on the protection of personal data in the health sector and the adoption of the new General Data Protection Regulation in European countries has placed a strong limitation on their use. This has not been the case in non-European countries, where monitoring methodologies have become widespread. The textual analysis presented is based on co-occurrence and multiple correspondence analysis to show the contact tracing methods adopted in different countries in the pandemic period by relating them to the issue of privacy. It also analyzed the possibility of applying Blockchain technology in applications for tracking contagions from COVID-19 and managing health data to provide a high level of security and transparency, including through anonymization, thus increasing user trust in using the apps.展开更多
基金financial support from PetroChina Innovation Foundation。
文摘In order to overcome the defects that the analysis of multi-well typical curves of shale gas reservoirs is rarely applied to engineering,this study proposes a robust production data analysis method based on deconvolution,which is used for multi-well inter-well interference research.In this study,a multi-well conceptual trilinear seepage model for multi-stage fractured horizontal wells was established,and its Laplace solutions under two different outer boundary conditions were obtained.Then,an improved pressure deconvolution algorithm was used to normalize the scattered production data.Furthermore,the typical curve fitting was carried out using the production data and the seepage model solution.Finally,some reservoir parameters and fracturing parameters were interpreted,and the intensity of inter-well interference was compared.The effectiveness of the method was verified by analyzing the production dynamic data of six shale gas wells in Duvernay area.The results showed that the fitting effect of typical curves was greatly improved due to the mutual restriction between deconvolution calculation parameter debugging and seepage model parameter debugging.Besides,by using the morphological characteristics of the log-log typical curves and the time corresponding to the intersection point of the log-log typical curves of two models under different outer boundary conditions,the strength of the interference between wells on the same well platform was well judged.This work can provide a reference for the optimization of well spacing and hydraulic fracturing measures for shale gas wells.
基金funded by the National Natural Science Foundation of China(NSFC)the Chinese Academy of Sciences(CAS)(grant No.U2031209)the National Natural Science Foundation of China(NSFC,grant Nos.11872128,42174192,and 91952111)。
文摘Seeing is an important index to evaluate the quality of an astronomical site.To estimate seeing at the Muztagh-Ata site with height and time quantitatively,the European Centre for Medium-Range Weather Forecasts reanalysis database(ERA5)is used.Seeing calculated from ERA5 is compared consistently with the Differential Image Motion Monitor seeing at the height of 12 m.Results show that seeing decays exponentially with height at the Muztagh-Ata site.Seeing decays the fastest in fall in 2021 and most slowly with height in summer.The seeing condition is better in fall than in summer.The median value of seeing at 12 m is 0.89 arcsec,the maximum value is1.21 arcsec in August and the minimum is 0.66 arcsec in October.The median value of seeing at 12 m is 0.72arcsec in the nighttime and 1.08 arcsec in the daytime.Seeing is a combination of annual and about biannual variations with the same phase as temperature and wind speed indicating that seeing variation with time is influenced by temperature and wind speed.The Richardson number Ri is used to analyze the atmospheric stability and the variations of seeing are consistent with Ri between layers.These quantitative results can provide an important reference for a telescopic observation strategy.
文摘Peanut allergy is majorly related to severe food induced allergic reactions.Several food including cow's milk,hen's eggs,soy,wheat,peanuts,tree nuts(walnuts,hazelnuts,almonds,cashews,pecans and pistachios),fish and shellfish are responsible for more than 90%of food allergies.Here,we provide promising insights using a large-scale data-driven analysis,comparing the mechanistic feature and biological relevance of different ingredients presents in peanuts,tree nuts(walnuts,almonds,cashews,pecans and pistachios)and soybean.Additionally,we have analysed the chemical compositions of peanuts in different processed form raw,boiled and dry-roasted.Using the data-driven approach we are able to generate new hypotheses to explain why nuclear receptors like the peroxisome proliferator-activated receptors(PPARs)and its isoform and their interaction with dietary lipids may have significant effect on allergic response.The results obtained from this study will direct future experimeantal and clinical studies to understand the role of dietary lipids and PPARisoforms to exert pro-inflammatory or anti-inflammatory functions on cells of the innate immunity and influence antigen presentation to the cells of the adaptive immunity.
基金supported by the Meteorological Soft Science Project(Grant No.2023ZZXM29)the Natural Science Fund Project of Tianjin,China(Grant No.21JCYBJC00740)the Key Research and Development-Social Development Program of Jiangsu Province,China(Grant No.BE2021685).
文摘As the risks associated with air turbulence are intensified by climate change and the growth of the aviation industry,it has become imperative to monitor and mitigate these threats to ensure civil aviation safety.The eddy dissipation rate(EDR)has been established as the standard metric for quantifying turbulence in civil aviation.This study aims to explore a universally applicable symbolic classification approach based on genetic programming to detect turbulence anomalies using quick access recorder(QAR)data.The detection of atmospheric turbulence is approached as an anomaly detection problem.Comparative evaluations demonstrate that this approach performs on par with direct EDR calculation methods in identifying turbulence events.Moreover,comparisons with alternative machine learning techniques indicate that the proposed technique is the optimal methodology currently available.In summary,the use of symbolic classification via genetic programming enables accurate turbulence detection from QAR data,comparable to that with established EDR approaches and surpassing that achieved with machine learning algorithms.This finding highlights the potential of integrating symbolic classifiers into turbulence monitoring systems to enhance civil aviation safety amidst rising environmental and operational hazards.
文摘Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel’s graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python’s scripting capabilities. This paper covers the integration solution of empowering non-programmers to leverage Python’s capabilities within the familiar Excel environment. This enables users to perform advanced data analysis and automation tasks without requiring extensive programming knowledge. Based on Soliciting feedback from non-programmers who have tested the integration solution, the case study shows how the solution evaluates the ease of implementation, performance, and compatibility of Python with Excel versions.
基金supported by the Major Science and Technology Project of Gansu Province(No.22ZD6FA021-5)the Industrial Support Project of Gansu Province(Nos.2023CYZC-19 and 2021CYZC-22)the Science and Technology Project of Gansu Province(Nos.23YFFA0074,22JR5RA137 and 22JR5RA151).
文摘To obtain more stable spectral data for accurate quantitative analysis of multi-element,especially for the large-area in-situ elements detection of soils, we propose a method for a multielement quantitative analysis of soils using calibration-free laser-induced breakdown spectroscopy(CF-LIBS) based on data filtering. In this study, we analyze a standard soil sample doped with two heavy metal elements, Cu and Cd, with a specific focus on the line of Cu I324.75 nm for filtering the experimental data of multiple sample sets. Pre-and post-data filtering,the relative standard deviation for Cu decreased from 30% to 10%, The limits of detection(LOD)values for Cu and Cd decreased by 5% and 4%, respectively. Through CF-LIBS, a quantitative analysis was conducted to determine the relative content of elements in soils. Using Cu as a reference, the concentration of Cd was accurately calculated. The results show that post-data filtering, the average relative error of the Cd decreases from 11% to 5%, indicating the effectiveness of data filtering in improving the accuracy of quantitative analysis. Moreover, the content of Si, Fe and other elements can be accurately calculated using this method. To further correct the calculation, the results for Cd was used to provide a more precise calculation. This approach is of great importance for the large-area in-situ heavy metals and trace elements detection in soil, as well as for rapid and accurate quantitative analysis.
文摘The recent pandemic crisis has highlighted the importance of the availability and management of health data to respond quickly and effectively to health emergencies, while respecting the fundamental rights of every individual. In this context, it is essential to find a balance between the protection of privacy and the safeguarding of public health, using tools that guarantee transparency and consent to the processing of data by the population. This work, starting from a pilot investigation conducted in the Polyclinic of Bari as part of the Horizon Europe Seeds project entitled “Multidisciplinary analysis of technological tracing models of contagion: the protection of rights in the management of health data”, has the objective of promoting greater patient awareness regarding the processing of their health data and the protection of privacy. The methodology used the PHICAT (Personal Health Information Competence Assessment Tool) as a tool and, through the administration of a questionnaire, the aim was to evaluate the patients’ ability to express their consent to the release and processing of health data. The results that emerged were analyzed in relation to the 4 domains in which the process is divided which allows evaluating the patients’ ability to express a conscious choice and, also, in relation to the socio-demographic and clinical characteristics of the patients themselves. This study can contribute to understanding patients’ ability to give their consent and improve information regarding the management of health data by increasing confidence in granting the use of their data for research and clinical management.
基金supported by the Institute of Information&Communications Technology Planning&Evaluation (IITP)grant funded by the Korean government (MSIT) (No.2022-0-00369)by the NationalResearch Foundation of Korea Grant funded by the Korean government (2018R1A5A1060031,2022R1F1A1065664).
文摘How can we efficiently store and mine dynamically generated dense tensors for modeling the behavior of multidimensional dynamic data?Much of the multidimensional dynamic data in the real world is generated in the form of time-growing tensors.For example,air quality tensor data consists of multiple sensory values gathered from wide locations for a long time.Such data,accumulated over time,is redundant and consumes a lot ofmemory in its raw form.We need a way to efficiently store dynamically generated tensor data that increase over time and to model their behavior on demand between arbitrary time blocks.To this end,we propose a Block IncrementalDense Tucker Decomposition(BID-Tucker)method for efficient storage and on-demand modeling ofmultidimensional spatiotemporal data.Assuming that tensors come in unit blocks where only the time domain changes,our proposed BID-Tucker first slices the blocks into matrices and decomposes them via singular value decomposition(SVD).The SVDs of the time×space sliced matrices are stored instead of the raw tensor blocks to save space.When modeling from data is required at particular time blocks,the SVDs of corresponding time blocks are retrieved and incremented to be used for Tucker decomposition.The factor matrices and core tensor of the decomposed results can then be used for further data analysis.We compared our proposed BID-Tucker with D-Tucker,which our method extends,and vanilla Tucker decomposition.We show that our BID-Tucker is faster than both D-Tucker and vanilla Tucker decomposition and uses less memory for storage with a comparable reconstruction error.We applied our proposed BID-Tucker to model the spatial and temporal trends of air quality data collected in South Korea from 2018 to 2022.We were able to model the spatial and temporal air quality trends.We were also able to verify unusual events,such as chronic ozone alerts and large fire events.
基金supported by STI 2030-Major Projects 2021ZD0200400National Natural Science Foundation of China(62276233 and 62072405)Key Research Project of Zhejiang Province(2023C01048).
文摘Multimodal sentiment analysis utilizes multimodal data such as text,facial expressions and voice to detect people’s attitudes.With the advent of distributed data collection and annotation,we can easily obtain and share such multimodal data.However,due to professional discrepancies among annotators and lax quality control,noisy labels might be introduced.Recent research suggests that deep neural networks(DNNs)will overfit noisy labels,leading to the poor performance of the DNNs.To address this challenging problem,we present a Multimodal Robust Meta Learning framework(MRML)for multimodal sentiment analysis to resist noisy labels and correlate distinct modalities simultaneously.Specifically,we propose a two-layer fusion net to deeply fuse different modalities and improve the quality of the multimodal data features for label correction and network training.Besides,a multiple meta-learner(label corrector)strategy is proposed to enhance the label correction approach and prevent models from overfitting to noisy labels.We conducted experiments on three popular multimodal datasets to verify the superiority of ourmethod by comparing it with four baselines.
文摘This research paper compares Excel and R language for data analysis and concludes that R language is more suitable for complex data analysis tasks.R language’s open-source nature makes it accessible to everyone,and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks.It is also highly customizable,allowing users to create custom functions and packages to meet their specific needs.Additionally,R language provides high reproducibility,making it easy to replicate and verify research results,and it has excellent collaboration capabilities,enabling multiple users to work on the same project simultaneously.These advantages make R language a more suitable choice for complex data analysis tasks,particularly in scientific research and business applications.The findings of this study will help people understand that R is not just a language that can handle more data than Excel and demonstrate that r is essential to the field of data analysis.At the same time,it will also help users and organizations make informed decisions regarding their data analysis needs and software preferences.
基金supported in part by the MOST Major Research and Development Project(Grant No.2021YFB2900204)the National Natural Science Foundation of China(NSFC)(Grant No.62201123,No.62132004,No.61971102)+3 种基金China Postdoctoral Science Foundation(Grant No.2022TQ0056)in part by the financial support of the Sichuan Science and Technology Program(Grant No.2022YFH0022)Sichuan Major R&D Project(Grant No.22QYCX0168)the Municipal Government of Quzhou(Grant No.2022D031)。
文摘Integrated data and energy transfer(IDET)enables the electromagnetic waves to transmit wireless energy at the same time of data delivery for lowpower devices.In this paper,an energy harvesting modulation(EHM)assisted multi-user IDET system is studied,where all the received signals at the users are exploited for energy harvesting without the degradation of wireless data transfer(WDT)performance.The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel.With the aid of the AO based algorithm,the average effective data rate among users are maximized by ensuring the BER and the wireless energy transfer(WET)performance.Simulation results validate and evaluate the IDET performance of the EHM assisted system,which also demonstrates that the optimal number of user clusters and IDET time slots should be allocated,in order to improve the WET and WDT performance.
文摘This study investigates university English teachers’acceptance and willingness to use learning management system(LMS)data analysis tools in their teaching practices.The research employs a mixed-method approach,combining quantitative surveys and qualitative interviews to understand teachers’perceptions and attitudes,and the factors influencing their adoption of LMS data analysis tools.The findings reveal that perceived usefulness,perceived ease of use,technical literacy,organizational support,and data privacy concerns significantly impact teachers’willingness to use these tools.Based on these insights,the study offers practical recommendations for educational institutions to enhance the effective adoption of LMS data analysis tools in English language teaching.
文摘The connectivity of sandbodies is a key constraint to the exploration effectiveness of Bohai A Oilfield.Conventional connectivity studies often use methods such as seismic attribute fusion,while the development of contiguous composite sandbodies in this area makes it challenging to characterize connectivity changes with conventional seismic attributes.Aiming at the above problem in the Bohai A Oilfield,this study proposes a big data analysis method based on the Deep Forest algorithm to predict the sandbody connectivity.Firstly,by compiling the abundant exploration and development sandbodies data in the study area,typical sandbodies with reliable connectivity were selected.Then,sensitive seismic attribute were extracted to obtain training samples.Finally,based on the Deep Forest algorithm,mapping model between attribute combinations and sandbody connectivity was established through machine learning.This method achieves the first quantitative determination of the connectivity for continuous composite sandbodies in the Bohai Oilfield.Compared with conventional connectivity discrimination methods such as high-resolution processing and seismic attribute analysis,this method can combine the sandbody characteristics of the study area in the process of machine learning,and jointly judge connectivity by combining multiple seismic attributes.The study results show that this method has high accuracy and timeliness in predicting connectivity for continuous composite sandbodies.Applied to the Bohai A Oilfield,it successfully identified multiple sandbody connectivity relationships and provided strong support for the subsequent exploration potential assessment and well placement optimization.This method also provides a new idea and method for studying sandbody connectivity under similar complex geological conditions.
文摘This paper was motivated by the existing problems of Cloud Data storage in Imo State University, Nigeria such as outsourced data causing the loss of data and misuse of customer information by unauthorized users or hackers, thereby making customer/client data visible and unprotected. Also, this led to enormous risk of the clients/customers due to defective equipment, bugs, faulty servers, and specious actions. The aim if this paper therefore is to analyze a secure model using Unicode Transformation Format (UTF) base 64 algorithms for storage of data in cloud securely. The methodology used was Object Orientated Hypermedia Analysis and Design Methodology (OOHADM) was adopted. Python was used to develop the security model;the role-based access control (RBAC) and multi-factor authentication (MFA) to enhance security Algorithm were integrated into the Information System developed with HTML 5, JavaScript, Cascading Style Sheet (CSS) version 3 and PHP7. This paper also discussed some of the following concepts;Development of Computing in Cloud, Characteristics of computing, Cloud deployment Model, Cloud Service Models, etc. The results showed that the proposed enhanced security model for information systems of cooperate platform handled multiple authorization and authentication menace, that only one login page will direct all login requests of the different modules to one Single Sign On Server (SSOS). This will in turn redirect users to their requested resources/module when authenticated, leveraging on the Geo-location integration for physical location validation. The emergence of this newly developed system will solve the shortcomings of the existing systems and reduce time and resources incurred while using the existing system.
基金supported by Basic Science Research Program through the National Research Foundation of Korea(NRF)funded by the Ministry of Education(NRF-2021R1A6A1A03039493).
文摘Mobile networks possess significant information and thus are considered a gold mine for the researcher’s community.The call detail records(CDR)of a mobile network are used to identify the network’s efficacy and the mobile user’s behavior.It is evident from the recent literature that cyber-physical systems(CPS)were used in the analytics and modeling of telecom data.In addition,CPS is used to provide valuable services in smart cities.In general,a typical telecom company hasmillions of subscribers and thus generatesmassive amounts of data.From this aspect,data storage,analysis,and processing are the key concerns.To solve these issues,herein we propose a multilevel cyber-physical social system(CPSS)for the analysis and modeling of large internet data.Our proposed multilevel system has three levels and each level has a specific functionality.Initially,raw Call Detail Data(CDR)was collected at the first level.Herein,the data preprocessing,cleaning,and error removal operations were performed.In the second level,data processing,cleaning,reduction,integration,processing,and storage were performed.Herein,suggested internet activity record measures were applied.Our proposed system initially constructs a graph and then performs network analysis.Thus proposed CPSS system accurately identifies different areas of internet peak usage in a city(Milan city).Our research is helpful for the network operators to plan effective network configuration,management,and optimization of resources.
文摘The inter-city linkage heat data provided by Baidu Migration is employed as a characterization of inter-city linkages in order to facilitate the study of the network linkage characteristics and hierarchical structure of urban agglomeration in the Greater Bay Area through the use of social network analysis method.This is the inaugural application of big data based on location services in the study of urban agglomeration network structure,which represents a novel research perspective on this topic.The study reveals that the density of network linkages in the Greater Bay Area urban agglomeration has reached 100%,indicating a mature network-like spatial structure.This structure has given rise to three distinct communities:Shenzhen-Dongguan-Huizhou,Guangzhou-Foshan-Zhaoqing,and Zhuhai-Zhongshan-Jiangmen.Additionally,cities within the Greater Bay Area urban agglomeration play different roles,suggesting that varying development strategies may be necessary to achieve staggered development.The study demonstrates that large datasets represented by LBS can offer novel insights and methodologies for the examination of urban agglomeration network structures,contingent on the appropriate mining and processing of the data.
文摘Guest Editors Prof.Ling Tian Prof.Jian-Hua Tao University of Electronic Science and Technology of China Tsinghua University lingtian@uestc.edu.cn jhtao@tsinghua.edu.cn Dr.Bin Zhou National University of Defense Technology binzhou@nudt.edu.cn, Since the concept of “Big Data” was first introduced in Nature in 2008, it has been widely applied in fields, such as business, healthcare, national defense, education, transportation, and security. With the maturity of artificial intelligence technology, big data analysis techniques tailored to various fields have made significant progress, but still face many challenges in terms of data quality, algorithms, and computing power.
基金financial supports from National Natural Science Foundation of China(No.62205172)Huaneng Group Science and Technology Research Project(No.HNKJ22-H105)Tsinghua University Initiative Scientific Research Program and the International Joint Mission on Climate Change and Carbon Neutrality。
文摘Laser-induced breakdown spectroscopy(LIBS)has become a widely used atomic spectroscopic technique for rapid coal analysis.However,the vast amount of spectral information in LIBS contains signal uncertainty,which can affect its quantification performance.In this work,we propose a hybrid variable selection method to improve the performance of LIBS quantification.Important variables are first identified using Pearson's correlation coefficient,mutual information,least absolute shrinkage and selection operator(LASSO)and random forest,and then filtered and combined with empirical variables related to fingerprint elements of coal ash content.Subsequently,these variables are fed into a partial least squares regression(PLSR).Additionally,in some models,certain variables unrelated to ash content are removed manually to study the impact of variable deselection on model performance.The proposed hybrid strategy was tested on three LIBS datasets for quantitative analysis of coal ash content and compared with the corresponding data-driven baseline method.It is significantly better than the variable selection only method based on empirical knowledge and in most cases outperforms the baseline method.The results showed that on all three datasets the hybrid strategy for variable selection combining empirical knowledge and data-driven algorithms achieved the lowest root mean square error of prediction(RMSEP)values of 1.605,3.478 and 1.647,respectively,which were significantly lower than those obtained from multiple linear regression using only 12 empirical variables,which are 1.959,3.718 and 2.181,respectively.The LASSO-PLSR model with empirical support and 20 selected variables exhibited a significantly improved performance after variable deselection,with RMSEP values dropping from 1.635,3.962 and 1.647 to 1.483,3.086 and 1.567,respectively.Such results demonstrate that using empirical knowledge as a support for datadriven variable selection can be a viable approach to improve the accuracy and reliability of LIBS quantification.
基金supported by NOAA JTTI award via Grant #NA21OAR4590165, NOAA GOESR Program funding via Grant #NA16OAR4320115provided by NOAA/Office of Oceanic and Atmospheric Research under NOAA-University of Oklahoma Cooperative Agreement #NA11OAR4320072, U.S. Department of Commercesupported by the National Oceanic and Atmospheric Administration (NOAA) of the U.S. Department of Commerce via Grant #NA18NWS4680063。
文摘Capabilities to assimilate Geostationary Operational Environmental Satellite “R-series ”(GOES-R) Geostationary Lightning Mapper(GLM) flash extent density(FED) data within the operational Gridpoint Statistical Interpolation ensemble Kalman filter(GSI-EnKF) framework were previously developed and tested with a mesoscale convective system(MCS) case. In this study, such capabilities are further developed to assimilate GOES GLM FED data within the GSI ensemble-variational(EnVar) hybrid data assimilation(DA) framework. The results of assimilating the GLM FED data using 3DVar, and pure En3DVar(PEn3DVar, using 100% ensemble covariance and no static covariance) are compared with those of EnKF/DfEnKF for a supercell storm case. The focus of this study is to validate the correctness and evaluate the performance of the new implementation rather than comparing the performance of FED DA among different DA schemes. Only the results of 3DVar and pEn3DVar are examined and compared with EnKF/DfEnKF. Assimilation of a single FED observation shows that the magnitude and horizontal extent of the analysis increments from PEn3DVar are generally larger than from EnKF, which is mainly caused by using different localization strategies in EnFK/DfEnKF and PEn3DVar as well as the integration limits of the graupel mass in the observation operator. Overall, the forecast performance of PEn3DVar is comparable to EnKF/DfEnKF, suggesting correct implementation.
文摘The literary review presented in the following paper aims to analyze the tracking tools used in different countries during the period of the COVID-19 pandemic. Tracking apps that have been adopted in many countries to collect data in a homogeneous and immediate way have made up for the difficulty of collecting data and standardizing evaluation criteria. However, the regulation on the protection of personal data in the health sector and the adoption of the new General Data Protection Regulation in European countries has placed a strong limitation on their use. This has not been the case in non-European countries, where monitoring methodologies have become widespread. The textual analysis presented is based on co-occurrence and multiple correspondence analysis to show the contact tracing methods adopted in different countries in the pandemic period by relating them to the issue of privacy. It also analyzed the possibility of applying Blockchain technology in applications for tracking contagions from COVID-19 and managing health data to provide a high level of security and transparency, including through anonymization, thus increasing user trust in using the apps.