Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel’s graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python’s scripting capabilities. This paper covers an integration solution that empowers non-programmers to leverage Python’s capabilities within the familiar Excel environment. This enables users to perform advanced data analysis and automation tasks without requiring extensive programming knowledge. Based on feedback solicited from non-programmers who have tested the integration solution, the case study evaluates the ease of implementation, performance, and compatibility of Python with Excel versions.
This research paper compares Excel and the R language for data analysis and concludes that R is more suitable for complex data analysis tasks. R’s open-source nature makes it accessible to everyone, and its powerful data management and analysis tools make it suitable for handling complex data analysis tasks. It is also highly customizable, allowing users to create custom functions and packages to meet their specific needs. Additionally, R provides high reproducibility, making it easy to replicate and verify research results, and it has excellent collaboration capabilities, enabling multiple users to work on the same project simultaneously. These advantages make R a more suitable choice for complex data analysis tasks, particularly in scientific research and business applications. The findings of this study will help people understand that R is not just a language that can handle more data than Excel and demonstrate that R is essential to the field of data analysis. At the same time, they will also help users and organizations make informed decisions regarding their data analysis needs and software preferences.
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
As COVID-19 poses a major threat to people’s health and economy, there is an urgent need for forecasting methodologies that can anticipate its trajectory efficiently. In non-stationary time series forecasting tasks, the predicted values frequently show hysteresis (a lag) relative to the real values. To address this problem, this paper proposes an enhanced Multilayer Deep Time Convolutional Neural Network (MDTCNet) for COVID-19 prediction, which combines a multilayer deep-time convolutional network with a feature fusion network. In particular, the deep features and temporal dependencies in uncertain time series can be captured, and the features can then be combined using a feature fusion network and a multilayer perceptron. Finally, experimental verification is conducted on the task of predicting real daily confirmed COVID-19 cases, with uncertainty, in the world and in the United States. The experiments realize short-term and long-term prediction of daily confirmed cases, verify the effectiveness and accuracy of the proposed prediction method, and show reduced hysteresis in the prediction results.
Human life would be impossible without clean air. Consistent advancements in practically every aspect of contemporary human life have harmed air quality. Everyday industrial, transportation, and household activities release dangerous contaminants into our surroundings. This study investigated two years’ worth of air quality data from two Indian cities for outlier detection. Studies on air pollution have used numerous types of methodologies, with the various gases typically treated as a vector whose components are the gas concentration values for each observation performed. In our technique, we use curves to represent the monthly average of daily gas emissions. The approach, which is based on functional depth, was used to find outliers in the gas emissions of Delhi and Kolkata, and the outcomes were compared to those from the traditional method. In the evaluation and comparison of these models’ performances, the functional approach performed well.
Air quality is a critical concern for public health and environmental regulation. The Air Quality Index (AQI), an index widely adopted by the US Environmental Protection Agency (EPA), serves as a crucial metric for reporting site-specific air pollution levels. Accurately predicting air quality, as measured by the AQI, is essential for effective air pollution management. In this study, we aim to identify the most reliable regression model among linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), logistic regression, and K-nearest neighbors (KNN). We conducted four different regression analyses using a machine learning approach to determine the model with the best performance. Using the confusion matrix and error percentages, we compared the models, which yielded prediction error rates of 22%, 23%, 20%, and 27% for the LDA, QDA, logistic regression, and KNN models, respectively. The logistic regression model outperformed the other three statistical models in predicting AQI. Understanding these models’ performance can help address an existing gap in air quality research and contribute to the integration of regression techniques in AQI studies, ultimately benefiting stakeholders like environmental regulators, healthcare professionals, urban planners, and researchers.
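The error percentages quoted above are the kind of figure read off a confusion matrix. A minimal sketch of that calculation follows, assuming rows are true classes and columns are predicted classes; the function name `error_rate` and the counts are hypothetical illustrations, not data from the study.

```python
def error_rate(confusion):
    """Fraction of misclassified observations in a square confusion matrix.

    Assumes confusion[i][j] counts observations of true class i
    that were predicted as class j (hypothetical layout).
    """
    total = sum(sum(row) for row in confusion)
    # Correct predictions sit on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Hypothetical two-class counts, for illustration only.
example = [
    [40, 10],  # true class 0: 40 correct, 10 wrong
    [10, 40],  # true class 1: 10 wrong, 40 correct
]
print(f"error rate: {error_rate(example):.0%}")
```

Comparing this single number across classifiers fitted to the same test set is one simple way to rank them, which appears to be how the abstract arrives at its ordering.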
This article presents a comprehensive analysis of the current state of research on the English translation of Lu You’s poetry, utilizing a data sample comprising research papers published in the CNKI Full-text Database from 2001 to 2022. Employing rigorous longitudinal statistical methods, the study examines the progress achieved over the past two decades. Notably, domestic researchers have displayed considerable interest in the study of Lu You’s English translation works since 2001. The research on the English translation of Lu You’s poetry reveals a diverse range of perspectives, indicating a rich body of scholarship. However, several challenges persist, including insufficient research, limited translation coverage, and a noticeable focus on specific poems such as “Phoenix Hairpin” in the realm of English translation research. Consequently, there is ample room for improvement in the quality of research output on the English translation of Lu You’s poems, as well as its recognition within the academic community. Building on these findings, it is argued that future investigations pertaining to the English translation of Lu You’s poetry should transcend the boundaries of textual analysis and encompass broader theoretical perspectives and research methodologies. By undertaking this shift, scholars will develop a more profound comprehension of Lu You’s poetic works and make substantive contributions to the field of translation studies. Thus, this article aims to bridge the gap between past research endeavors and future possibilities, serving as a guide and inspiration for scholars to embark on a more nuanced and enriching exploration of Lu You’s poetry as well as other Chinese literature classics.
Space debris poses a serious threat to human space activities and needs to be measured and cataloged. As a new technology for space target surveillance, diffuse reflection laser ranging (DRLR) offers much higher measurement accuracy than microwave radar and optoelectronic measurement. Based on the laser ranging data of space debris acquired by the DRLR system at Shanghai Astronomical Observatory in March-April 2013, the characteristics and precision of the laser ranging data are analyzed and their application to orbit determination of space debris is discussed, which is implemented for the first time in China. The experiment indicates that the precision of the laser ranging data ranges from 39 cm to 228 cm. When the data are sufficient (four arcs measured over three days), the orbital accuracy of space debris can reach 50 m.
This paper presents the development and application of production data analysis software that can analyze and forecast the production performance and reservoir properties of shale gas wells. The theories used in the study were based on analytical and empirical approaches. Its reliability has been confirmed through comparisons with commercial software. Using transient data relating to multi-stage hydraulically fractured horizontal wells, it was confirmed that the modified hyperbolic method showed an error of approximately 4% compared to the actual estimated ultimate recovery (EUR). On the basis of the developed model, reliable productivity forecasts have been obtained by analyzing field production data relating to wells in Canada. The EUR was computed as 9.6 Bcf using the modified hyperbolic method. Employing the Power Law Exponential method, the EUR would be 9.4 Bcf. The models developed in this study will allow new analytical and empirical theories to be integrated in the future more readily than commercial models do.
RNA-sequencing (RNA-seq), based on next-generation sequencing technologies, has rapidly become a standard and popular technology for transcriptome analysis. However, serious challenges still exist in analyzing and interpreting RNA-seq data. With the development of high-throughput sequencing technology, the sequencing depth of RNA-seq data has increased explosively. The biological processes of the transcriptome are more complicated and diverse than previously imagined. Moreover, most of the remaining organisms still have no available reference genome or have only incomplete genome annotations. Therefore, a large number of bioinformatics methods for various transcriptomics studies have been proposed to address these challenges. This review comprehensively summarizes the various studies in RNA-seq data analysis and their corresponding analysis methods, including genome annotation, quality control and pre-processing of reads, read alignment, transcriptome assembly, gene and isoform expression quantification, differential expression analysis, data visualization and other analyses.
A novel study using LC-MS (liquid chromatography tandem mass spectrometry) coupled with multivariate data analysis and bioactivity evaluation was established for discrimination of the aqueous extract and vinegar extract of Shixiao San. Batches of these two kinds of samples were subjected to analysis, and the datasets of sample codes, tR-m/z pairs and ion intensities were processed with principal component analysis (PCA). The score plot showed a clear classification of the aqueous and vinegar groups, and the chemical markers contributing most to the differentiation were screened out on the loading plot. The chemical markers were identified by comparing their mass fragments and retention times with those of reference compounds and/or known compounds published in the literature. Based on the proposed strategy, quercetin-3-O-neohesperidoside, isorhamnetin-3-O-neohesperidoside, kaempferol-3-O-neohesperidoside, isorhamnetin-3-O-rutinoside and isorhamnetin-3-O-(2G-α-L-rhamnosyl)-rutinoside were identified as representative markers distinguishing the vinegar extract from the aqueous extract. The anti-hyperlipidemic activities of the two processed extracts of Shixiao San were examined on serum levels of lipids, lipoprotein and blood antioxidant enzymes in a rat hyperlipidemia model, and the vinegar extract, exerting strong lipid-lowering and antioxidative effects, was superior to the aqueous extract. Therefore, boiling with vinegar was predicted to be the best processing procedure for the anti-hyperlipidemic effect of Shixiao San. Furthermore, combining the changes in the metabolic profiling and the bioactivity evaluation, the five representative markers may be related to the observed anti-hyperlipidemic effect.
Under Industry 4.0, the Internet of Things (IoT), especially radio frequency identification (RFID) technology, has been widely applied in manufacturing environments. This technology can bring convenience to production control and production transparency. Meanwhile, it generates increasing production data that are sometimes discrete, uncorrelated, and hard to use. Thus, an efficient analysis method is needed to utilize the invaluable data. This work provides an RFID-based production data analysis method for production control in IoT-enabled smart job-shops. The physical configuration and operation logic of IoT-enabled smart job-shop production are first described. Based on that, an RFID-based production data model is built to formalize and correlate the heterogeneous production data. Then, an event-driven RFID-based production data analysis method is proposed to construct the RFID events and judge the process command execution. Furthermore, a near big data approach is used to excavate hidden information and knowledge from the historical production data. A demonstrative case is studied to verify the feasibility of the proposed model and methods. It is expected that our work will provide a different insight into RFID-based production data analysis.
With the rapid development of the Internet, many enterprises have launched their network platforms. When users browse, search, and click the products of these platforms, most platforms keep records of these network behaviors. These records, which are often heterogeneous, are called log data. This work aims to analyze and manage these heterogeneous log data effectively, so that enterprises can grasp the behavior characteristics of their platform users in time, make targeted recommendations to users, increase the sales volume of their products, and accelerate their development. First, we design the system following the process of big data collection, storage, analysis, and visualization. Then, we adopt HDFS storage technology, YARN resource management technology, and Nginx load balancing technology to build a Hadoop cluster to process the log data, and adopt MapReduce processing technology and Hive data warehouse technology to analyze the log data and obtain the results. Finally, the obtained results are displayed visually, and a log data analysis system is successfully constructed. Practice shows that the system effectively realizes the collection, analysis and visualization of log data, and can accurately realize the recommendation of products by enterprises. The system is stable and effective.
A new dynamic model identification method is developed for continuous-time series analysis and forward prediction applications. The quantum of data is defined over moving time intervals in sliding window coordinates for compressing the size of stored data while retaining the resolution of information. Quantum vectors are introduced as the basis of a linear space for defining a Dynamic Quantum Operator (DQO) model of the system defined by its data stream. The transport of the quantum of compressed data is modeled between the time interval bins during the movement of the sliding time window. The DQO model is identified from the samples of the real-time flow of data over the sliding time window. A least-square-fit identification method is used for evaluating the parameters of the quantum operator model, utilizing the repeated use of the sampled data through a number of time steps. The method is tested by analyzing and forward-predicting air temperature variations accessed from weather data as well as methane concentration variations obtained from measurements in an operating mine. The results show efficient forward prediction capabilities, surpassing those using neural networks and other methods for the same task.
The issue of privacy protection for mobile social networks is a frontier topic in the field of social network applications. Existing research on user privacy protection in mobile social networks mainly focuses on privacy-preserving data publishing and access control. There is little research on the association of user privacy information, which makes it difficult to design personalized privacy protection strategies and also increases the complexity of user privacy settings. Therefore, this paper concentrates on the association of user privacy information using big data analysis tools, so as to provide data support for the design of personalized privacy protection strategies.
Big data analysis has penetrated into all fields of society and has brought about profound changes. However, there is relatively little research on using college and university big data to support student management. Taking student card information as the research sample and scholarship evaluation as an example, the data are analyzed using Spark big data mining technology and the K-Means clustering algorithm. The analysis covers students’ daily behavior from multiple dimensions, and it can prevent unreasonable scholarship evaluation caused by unfair factors such as plagiarism, votes of teachers and students, etc. At the same time, students’ absenteeism, physical health and psychological status can be predicted in advance, which makes student management work more proactive, accurate and effective.
With the arrival of the era of big data, the audit mindset has been pushed to change. Under the influence of big data, auditing will become a continuous activity. Through cloud data, staff can monitor the operation status and risk assessment of the whole enterprise, analyze, control and respond to risks in a timely manner, and help the enterprise reduce risks. With the advent of the era of big data, audit data analysis is becoming more and more important. At the same time, the large amount of data analysis also brings challenges to auditors. How to deal with these challenges has become an urgent problem. This paper mainly studies the challenges that changes in audit approaches and methods bring to audit data analysis against the background of big data, and the corresponding countermeasures, so as to continuously innovate and improve audit technology in practice and promote the healthy and rapid development of the social economy.
A factor analysis was applied to soil geochemical data to define anomalies related to buried Pb-Zn mineralization. A favorable main factor with a strong association of the elements Zn, Cu and Pb, related to mineralization, was selected for interpretation. The median + 2 MAD (median absolute deviation) method of exploratory data analysis (EDA) and C-A (concentration-area) fractal modeling were then applied to the Mahalanobis distance, as defined by Zn, Cu and Pb from the factor analysis, to set the thresholds for defining multi-element anomalies. As a result, the median + 2 MAD method more successfully identified the Pb-Zn mineralization than the C-A fractal model. The soil anomaly identified by the median + 2 MAD method on the Mahalanobis distances defined by three principal elements (Zn, Cu and Pb) rather than thirteen elements (Co, Zn, Cu, V, Mo, Ni, Cr, Mn, Pb, Ba, Sr, Zr and Ti) was the more favorable reflection of the ore body. The identified soil geochemical anomalies were compared with the in situ economic Pb-Zn ore bodies for validation. The results showed that the median + 2 MAD approach is capable of mapping both strong and weak geochemical anomalies related to buried Pb-Zn mineralization, which is therefore useful at the reconnaissance drilling stage.
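The median + 2 MAD threshold named above can be sketched in a few lines. This is an illustrative sketch only: the function names and the sample values are hypothetical, the inputs stand in for the Mahalanobis distances, and the MAD is left unscaled, which is an assumption since the abstract does not say whether a consistency factor was applied.

```python
from statistics import median


def mad_threshold(values, k=2.0):
    """Return median + k * MAD, with MAD the median absolute deviation.

    Assumption: the MAD is used unscaled (no normal-consistency factor).
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med + k * mad


def flag_anomalies(values, k=2.0):
    """Mark each value that lies strictly above the median + k * MAD threshold."""
    threshold = mad_threshold(values, k)
    return [v > threshold for v in values]


# Hypothetical distance values, for illustration only.
distances = [2, 3, 4, 5, 6, 7, 30]
print(mad_threshold(distances))   # median 5, MAD 2, so the threshold is 9
print(flag_anomalies(distances))  # only the last value exceeds the threshold
```

Because both the centre and the spread are medians, a single extreme sample such as the last value barely moves the threshold, which is the usual argument for this rule over mean plus two standard deviations.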
Purpose: The study aimed to describe youth time-use compositions, focusing on time spent in shorter and longer bouts of sedentary behavior and physical activity (PA), and to examine associations of these time-use compositions with cardiometabolic biomarkers. Methods: Accelerometer and cardiometabolic biomarker data from 2 Australian studies involving youths 7-13 years old were pooled (complete cases with accelerometry and adiposity marker data, n = 782). A 9-component time-use composition was formed using compositional data analysis: time in shorter and longer bouts of sedentary behavior; time in shorter and longer bouts of light-, moderate-, or vigorous-intensity PA; and "other time" (i.e., non-wear/sleep). Shorter and longer bouts of sedentary time were defined as <5 min and >5 min, respectively. Shorter bouts of light-, moderate-, and vigorous-intensity PA were defined as <1 min; longer bouts were defined as ≥1 min. Regression models examined associations between overall time-use composition and cardiometabolic biomarkers. Then, associations were derived between ratios of longer activity patterns relative to shorter activity patterns, and of each intensity level relative to the other intensity levels and "other time", and cardiometabolic biomarkers. Results: Confounder-adjusted models showed that the overall time-use composition was associated with adiposity, blood pressure, lipids, and the summary score. Specifically, more time in longer bouts of light-intensity PA relative to shorter bouts of light-intensity PA was significantly associated with greater body mass index z-score (zBMI) (β = 1.79; SE = 0.68) and waist circumference (β = 18.35; SE = 4.78). When each activity intensity was considered relative to all higher intensities and "other time", more time in light- and vigorous-intensity PA, and less time in sedentary behavior and moderate-intensity PA, were associated with lower waist circumference. Conclusion: Accumulating PA, particularly light-intensity PA, in frequent short bursts may be more beneficial for limiting adiposity compared to accumulating the same amount of PA at these intensities in longer bouts.
In e-commerce, multidimensional data analysis based on Web data requires integrating various data sources, such as XML data and relational data, at the conceptual level. A conceptual data description approach to the multidimensional data model, the UML galaxy diagram, is presented in order to conduct multidimensional data analysis for multiple subjects. The approach is illustrated using a case of a 2_roots UML galaxy diagram that considers marketing analysis of TV products involving one retailer and several suppliers.
Funding (scRNA-seq data analysis review): supported by the National Key Research and Development Program of China (2022YFC2702502); the National Natural Science Foundation of China (32170742, 31970646, and 32060152); the Start Fund for Specially Appointed Professor of Jiangsu Province; the Hainan Province Science and Technology Special Fund (ZDYF2021SHFZ051); the Natural Science Foundation of Hainan Province (820MS053); the Start Fund for High-level Talents of Nanjing Medical University (NMUR2020009); the Marshal Initiative Funding of Hainan Medical University (JBGS202103); the Hainan Province Clinical Medical Center (QWYH202175); the Bioinformatics for Major Diseases Science Innovation Group of Hainan Medical University; and the Shenzhen Science and Technology Program (JCYJ20210324140407021).
Funding: Supported by the Major Scientific and Technological Research Project of Chongqing Education Commission (KJZD-M202000802) and the first batch of Industrial and Informatization Key Special Fund Support Projects in Chongqing in 2022 (2022000537).
Abstract: As COVID-19 poses a major threat to people’s health and the economy, there is an urgent need for forecasting methodologies that can anticipate its trajectory efficiently. In non-stationary time series forecasting tasks, there is frequently a hysteresis in the predicted values relative to the real values. To address this problem, this paper proposes an enhanced Multilayer Deep Time Convolutional Neural Network (MDTCNet) for COVID-19 prediction, which combines a multilayer deep-time convolutional network with a feature fusion network. In particular, the deep features and temporal dependencies in uncertain time series can be captured, and the features are then combined using a feature fusion network and a multilayer perceptron. Finally, experimental verification is conducted on the task of predicting real daily confirmed COVID-19 cases, with uncertainty, in the world and the United States. The experiments realize short-term and long-term prediction of daily confirmed cases, verify the effectiveness and accuracy of the proposed prediction method, and show reduced hysteresis in the prediction results.
Abstract: Human life would be impossible without clean air. Consistent advancements in practically every aspect of contemporary human life have harmed air quality. Everyday industrial, transportation, and home activities release dangerous contaminants into our surroundings. This study investigated outlier detection in two years’ worth of air quality data from two Indian cities. Studies on air pollution have used numerous types of methodologies, with the various gases treated as a vector whose components are the gas concentration values for each observation performed. In our technique, we use curves to represent the monthly average of daily gas emissions. The approach, which is based on functional depth, was used to find outliers in the gas emissions of the cities of Delhi and Kolkata, and the outcomes were compared to those from the traditional method. In the evaluation and comparison of these models’ performances, the functional approach performed well.
Abstract: Air quality is a critical concern for public health and environmental regulation. The Air Quality Index (AQI), a widely adopted index by the US Environmental Protection Agency (EPA), serves as a crucial metric for reporting site-specific air pollution levels. Accurately predicting air quality, as measured by the AQI, is essential for effective air pollution management. In this study, we aim to identify the most reliable regression model among linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), logistic regression, and K-nearest neighbors (KNN). We conducted four different regression analyses using a machine learning approach to determine the model with the best performance. By employing the confusion matrix and error percentages, we selected the best-performing model; the prediction error rates were 22%, 23%, 20%, and 27% for the LDA, QDA, logistic regression, and KNN models, respectively. The logistic regression model outperformed the other three statistical models in predicting AQI. Understanding these models' performance can help address an existing gap in air quality research and contribute to the integration of regression techniques in AQI studies, ultimately benefiting stakeholders like environmental regulators, healthcare professionals, urban planners, and researchers.
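As a rough illustration of how an error percentage is read off a confusion matrix, the following minimal sketch counts (true, predicted) label pairs and divides the off-diagonal count by the total. The function names and the toy AQI category labels are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))

def error_rate(y_true, y_pred):
    """Share of observations whose predicted label differs from the true one."""
    cm = confusion_matrix(y_true, y_pred)
    total = sum(cm.values())
    wrong = sum(n for (t, p), n in cm.items() if t != p)
    return wrong / total

# Hypothetical AQI categories, for illustration only.
y_true = ["good", "good", "bad", "bad", "bad"]
y_pred = ["good", "bad", "bad", "bad", "good"]
rate = error_rate(y_true, y_pred)
```

Comparing this rate across several fitted classifiers on the same held-out data is one simple way to rank them, which is the kind of comparison the abstract reports.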
Abstract: This article presents a comprehensive analysis of the current state of research on the English translation of Lu You’s poetry, utilizing a data sample comprising research papers published in the CNKI Full-text Database from 2001 to 2022. Employing rigorous longitudinal statistical methods, the study examines the progress achieved over the past two decades. Notably, domestic researchers have displayed considerable interest in the study of Lu You’s English translation works since 2001. The research on the English translation of Lu You’s poetry reveals a diverse range of perspectives, indicating a rich body of scholarship. However, several challenges persist, including insufficient research, limited translation coverage, and a noticeable focus on specific poems such as “Phoenix Hairpin” in the realm of English translation research. Consequently, there is ample room for improvement in the quality of research output on the English translation of Lu You’s poems, as well as its recognition within the academic community. Building on these findings, it is argued that future investigations pertaining to the English translation of Lu You’s poetry should transcend the boundaries of textual analysis and encompass broader theoretical perspectives and research methodologies. By undertaking this shift, scholars will develop a more profound comprehension of Lu You’s poetic works and make substantive contributions to the field of translation studies. Thus, this article aims to bridge the gap between past research endeavors and future possibilities, serving as a guide and inspiration for scholars to embark on a more nuanced and enriching exploration of Lu You’s poetry as well as other Chinese literature classics.
Funding: Supported by the National Natural Science Foundation of China.
Abstract: Space debris poses a serious threat to human space activities and needs to be measured and cataloged. As a new technology for space target surveillance, diffuse reflection laser ranging (DRLR) offers much higher measurement accuracy than microwave radar and optoelectronic measurement. Based on the laser ranging data of space debris acquired by the DRLR system at Shanghai Astronomical Observatory in March-April 2013, the characteristics and precision of the laser ranging data are analyzed and their applications in orbit determination of space debris are discussed, which is implemented for the first time in China. The experiment indicates that the precision of the laser ranging data can reach 39 cm to 228 cm. When the data are sufficient (four arcs measured over three days), the orbital accuracy of space debris can be up to 50 m.
Funding: Supported by the Energy Efficiency & Resources Core Technology Program of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), granted financial resource from the Ministry of Trade, Industry & Energy, Republic of Korea (No. 20172510102090).
Abstract: This paper presents the development and application of production data analysis software that can analyze and forecast the production performance and reservoir properties of shale gas wells. The theories used in the study were based on analytical and empirical approaches. Its reliability has been confirmed through comparisons with a commercial software package. Using transient data relating to multi-stage hydraulically fractured horizontal wells, it was confirmed that the modified hyperbolic method showed an error of approximately 4% compared to the actual estimated ultimate recovery (EUR). On the basis of the developed model, reliable productivity forecasts have been obtained by analyzing field production data relating to wells in Canada. The EUR was computed as 9.6 Bcf using the modified hyperbolic method. Employing the Power Law Exponential method, the EUR would be 9.4 Bcf. The models developed in this study will allow new analytical and empirical theories to be integrated in the future more readily than in commercial models.
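The abstract does not state the equations behind its modified hyperbolic method, so the sketch below assumes one common formulation: an Arps hyperbolic decline, q(t) = qi / (1 + b·Di·t)^(1/b), that switches to exponential decline once the instantaneous decline rate falls to a terminal value. The function name, parameter names, and numeric values are illustrative assumptions, not the paper's.

```python
import math

def modified_hyperbolic_rate(t, qi, di, b, d_lim):
    """Production rate at time t under an assumed modified hyperbolic decline.

    qi: initial rate, di: initial decline rate, b: hyperbolic exponent (> 0),
    d_lim: terminal decline rate (0 < d_lim < di). Time units must match di.
    """
    # Instantaneous decline is D(t) = di / (1 + b*di*t); solve D(t) = d_lim.
    t_switch = (di / d_lim - 1.0) / (b * di)
    if t <= t_switch:
        # Hyperbolic (Arps) segment.
        return qi / (1.0 + b * di * t) ** (1.0 / b)
    # Exponential tail, continuous with the hyperbolic segment at t_switch.
    q_switch = qi / (1.0 + b * di * t_switch) ** (1.0 / b)
    return q_switch * math.exp(-d_lim * (t - t_switch))

# Illustrative parameters only: the switch happens at t = 8.
rate_at_2 = modified_hyperbolic_rate(2.0, qi=1000.0, di=0.5, b=1.0, d_lim=0.1)
```

An EUR estimate would follow by integrating this rate over the well's economic life; that step is omitted here because the abstract gives no cutoff criteria.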
Abstract: RNA-sequencing (RNA-seq), based on next-generation sequencing technologies, has rapidly become a standard and popular technology for transcriptome analysis. However, serious challenges still exist in analyzing and interpreting RNA-seq data. With the development of high-throughput sequencing technology, the sequencing depth of RNA-seq data has increased explosively. The biological processes of the transcriptome are more complicated and diverse than we imagined. Moreover, most organisms still have no available reference genome or have only incomplete genome annotations. Therefore, a large number of bioinformatics methods for various transcriptomics studies have been proposed to address these challenges. This review comprehensively summarizes the various studies in RNA-seq data analysis and their corresponding analysis methods, including genome annotation, quality control and pre-processing of reads, read alignment, transcriptome assembly, gene and isoform expression quantification, differential expression analysis, data visualization, and other analyses.
Funding: Natural Science Foundation of China (T11036061/T0108).
Abstract: A novel approach using LC-MS (liquid chromatography tandem mass spectrometry) coupled with multivariate data analysis and bioactivity evaluation was established to discriminate between the aqueous extract and the vinegar extract of Shixiao San. Batches of these two kinds of samples were subjected to analysis, and the datasets of sample codes, tR-m/z pairs, and ion intensities were processed with principal component analysis (PCA). The score plot showed a clear classification of the aqueous and vinegar groups, and the chemical markers contributing most to the differentiation were screened out on the loading plot. The chemical markers were identified by comparing their mass fragments and retention times with those of reference compounds and/or known compounds published in the literature. Based on the proposed strategy, quercetin-3-O-neohesperidoside, isorhamnetin-3-O-neohesperidoside, kaempferol-3-O-neohesperidoside, isorhamnetin-3-O-rutinoside, and isorhamnetin-3-O-(2G-α-L-rhamnosyl)-rutinoside were explored as representative markers distinguishing the vinegar extract from the aqueous extract. The anti-hyperlipidemic activities of the two processed extracts of Shixiao San were examined on serum levels of lipids, lipoprotein, and blood antioxidant enzymes in a rat hyperlipidemia model, and the vinegar extract, exerting strong lipid-lowering and antioxidative effects, was superior to the aqueous extract. Therefore, boiling with vinegar was predicted to be the best processing procedure for the anti-hyperlipidemic effect of Shixiao San. Furthermore, combining the changes in the metabolic profiling with the bioactivity evaluation, the five representative markers may be related to the observed anti-hyperlipidemic effect.
Funding: Supported by the National Natural Science Foundation of China (71571142, 51275396).
Abstract: Under Industry 4.0, the internet of things (IoT), especially radio frequency identification (RFID) technology, has been widely applied in manufacturing environments. This technology can bring convenience to production control and production transparency. Meanwhile, it generates increasing production data that are sometimes discrete, uncorrelated, and hard to use. Thus, an efficient analysis method is needed to utilize the invaluable data. This work provides an RFID-based production data analysis method for production control in IoT-enabled smart job-shops. The physical configuration and operation logic of IoT-enabled smart job-shop production are first described. Based on that, an RFID-based production data model is built to formalize and correlate the heterogeneous production data. Then, an event-driven RFID-based production data analysis method is proposed to construct the RFID events and judge the process command execution. Furthermore, a near big data approach is used to excavate hidden information and knowledge from the historical production data. A demonstrative case is studied to verify the feasibility of the proposed model and methods. It is expected that our work will provide a different insight into RFID-based production data analysis.
Funding: Supported by the Huaihua University Science Foundation under Grant HHUY2019-24.
Abstract: With the rapid development of the Internet, many enterprises have launched their network platforms. When users browse, search, and click the products on these platforms, most platforms keep records of these network behaviors. These records, called log data, are often heterogeneous. This work aims to analyze and manage these heterogeneous log data effectively, so that enterprises can grasp the behavior characteristics of their platform users in time, make targeted recommendations to users, increase the sales volume of their products, and accelerate their development. First, we follow the process of big data collection, storage, analysis, and visualization to design the system. Then, we adopt HDFS storage technology, YARN resource management technology, and Nginx load balancing technology to build a Hadoop cluster to process the log data, and adopt MapReduce processing technology and Hive data warehouse technology to analyze the log data and obtain the results. Finally, the obtained results are displayed visually, and a log data analysis system is successfully constructed. Practice has shown that the system effectively realizes the collection, analysis, and visualization of log data, and can accurately realize the recommendation of products by enterprises. The system is stable and effective.
Abstract: A new dynamic model identification method is developed for continuous-time series analysis and forward prediction applications. The quantum of data is defined over moving time intervals in sliding window coordinates for compressing the size of stored data while retaining the resolution of information. Quantum vectors are introduced as the basis of a linear space for defining a Dynamic Quantum Operator (DQO) model of the system defined by its data stream. The transport of the quantum of compressed data is modeled between the time interval bins during the movement of the sliding time window. The DQO model is identified from the samples of the real-time flow of data over the sliding time window. A least-square-fit identification method is used for evaluating the parameters of the quantum operator model, utilizing the repeated use of the sampled data through a number of time steps. The method is tested by analyzing and forward-predicting air temperature variations accessed from weather data as well as methane concentration variations obtained from measurements of an operating mine. The results show efficient forward prediction capabilities, surpassing those using neural networks and other methods for the same task.
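The DQO model itself is not specified in the abstract, so the sketch below swaps in a much simpler stand-in to illustrate the general idea of least-squares identification over a sliding window followed by forward prediction: a first-order linear model x[t+1] ≈ a·x[t] + b refitted on the most recent samples. All names and the sample series are hypothetical.

```python
def fit_first_order(window):
    """Ordinary least-squares fit of x[t+1] = a * x[t] + b on one window."""
    xs, ys = window[:-1], window[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0  # constant window: fall back to the mean
    b = mean_y - a * mean_x
    return a, b

def predict_next(series, width):
    """Refit on the last `width` samples and predict one step ahead."""
    window = series[-width:]
    a, b = fit_first_order(window)
    return a * window[-1] + b

# Hypothetical readings; a steadily rising series is extrapolated linearly.
forecast = predict_next([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], width=4)
```

Calling `predict_next` again after each new sample arrives moves the window forward, so the fitted parameters track slow changes in the data stream.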
Funding: We thank the anonymous reviewers and editors for their very constructive comments. This work was supported by the National Social Science Foundation Project of China under Grant 16BTQ085.
Abstract: Privacy protection for mobile social networks is a frontier topic in the field of social network applications. Existing research on user privacy protection in mobile social networks mainly focuses on privacy-preserving data publishing and access control. There is little research on the association of user privacy information, which makes it difficult to design personalized privacy protection strategies and also increases the complexity of user privacy settings. Therefore, this paper concentrates on the association of user privacy information, using big data analysis tools, so as to provide data support for the design of personalized privacy protection strategies.
Funding: Nanjing Key Laboratory of Intelligent Information Processing Open Fund Project (No. 19AIP05).
Abstract: Big data analysis has penetrated all fields of society and has brought about profound changes. However, there is relatively little research on using college and university big data to support student management. Taking student card information as the research sample and scholarship evaluation as an example, the data are analyzed using Spark big data mining technology and the K-Means clustering algorithm. The analysis covers students’ daily behavior from multiple dimensions, and it can prevent unreasonable scholarship evaluation caused by unfair factors such as plagiarism and the votes of teachers and students. At the same time, students’ absenteeism, physical health, and psychological status can be predicted in advance, which makes student management work more active, accurate, and effective.
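The study runs K-Means on Spark; as a plain-Python stand-in, the minimal sketch below shows the two alternating steps of the algorithm (assign each point to its nearest centroid, then move each centroid to the mean of its members). The function name, the starting centroids, and the two-feature student records are assumptions for illustration only.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on tuples of floats, starting from given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical features per student, e.g. (canteen visits, library visits).
records = [(1.0, 1.0), (1.0, 2.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(records, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

In practice the features would be scaled before clustering, since K-Means uses Euclidean distance and is sensitive to differing units.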
Funding: Key Major of Audit Science in Quality Engineering Project of Private Universities in 2020 (Grant No.: HS2020ZLGC06); Supervisor System Research Project of Huashang College of Guangdong University of Finance and Economics in 2018 (Grant No.: 2018HSDS03); University Quality Engineering of Huashang College in 2021 (Grant No.: HS2021ZLGC19).
Abstract: The arrival of the era of big data has prompted a change in the audit thinking mode. Under the influence of big data, auditing will become a continuous activity. Through cloud data, staff can monitor the operation status and risk assessment of the whole enterprise, analyze, control, and respond to risks in a timely manner, and help the enterprise reduce risks. With the advent of the era of big data, audit data analysis is becoming more and more important. At the same time, analyzing large amounts of data also brings challenges to auditors. How to deal with these challenges has become an urgent problem. This paper mainly studies the challenges that changes in audit approaches and methods bring to audit data analysis against the background of big data, together with the corresponding countermeasures, so as to continuously innovate and improve audit technology in practice and promote the healthy and rapid development of the social economy.
Abstract: A factor analysis was applied to soil geochemical data to define anomalies related to buried Pb-Zn mineralization. A favorable main factor with a strong association of the elements Zn, Cu, and Pb, related to mineralization, was selected for interpretation. The median + 2 MAD (median absolute deviation) method of exploratory data analysis (EDA) and C-A (concentration-area) fractal modeling were then applied to the Mahalanobis distance, as defined by Zn, Cu, and Pb from the factor analysis, to set the thresholds for defining multi-element anomalies. As a result, the median + 2 MAD method identified the Pb-Zn mineralization more successfully than the C-A fractal model. The soil anomaly identified by the median + 2 MAD method on the Mahalanobis distances defined by three principal elements (Zn, Cu, and Pb) rather than thirteen elements (Co, Zn, Cu, V, Mo, Ni, Cr, Mn, Pb, Ba, Sr, Zr, and Ti) was the more favorable reflection of the ore body. The identified soil geochemical anomalies were compared with the in situ economic Pb-Zn ore bodies for validation. The results showed that the median + 2 MAD approach is capable of mapping both strong and weak geochemical anomalies related to buried Pb-Zn mineralization, which is therefore useful at the reconnaissance drilling stage.
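The median + 2 MAD rule named in the abstract can be sketched in a few lines. This minimal version assumes the raw (unscaled) MAD and takes an already computed list of distances as input; the function names and sample values are hypothetical, and the Mahalanobis distance calculation itself is not shown.

```python
from statistics import median

def mad_threshold(values, k=2.0):
    """Return median + k * MAD, where MAD is the median absolute deviation."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med + k * mad

def flag_anomalies(values, k=2.0):
    """Mark every value that lies above the median + k * MAD threshold."""
    threshold = mad_threshold(values, k)
    return [v > threshold for v in values]

# Hypothetical distances for five soil samples; the last one stands out.
distances = [1.0, 2.0, 3.0, 4.0, 100.0]
threshold = mad_threshold(distances)
flags = flag_anomalies(distances)
```

Because both the median and the MAD are resistant to extreme values, a single very large distance barely moves the threshold, which is why the rule suits data that contain the anomalies being sought.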
Abstract: Purpose: The study aimed to describe youth time-use compositions, focusing on time spent in shorter and longer bouts of sedentary behavior and physical activity (PA), and to examine associations of these time-use compositions with cardiometabolic biomarkers. Methods: Accelerometer and cardiometabolic biomarker data from 2 Australian studies involving youths 7-13 years old were pooled (complete cases with accelerometry and adiposity marker data, n = 782). A 9-component time-use composition was formed using compositional data analysis: time in shorter and longer bouts of sedentary behavior; time in shorter and longer bouts of light-, moderate-, or vigorous-intensity PA; and "other time" (i.e., non-wear/sleep). Shorter and longer bouts of sedentary time were defined as <5 min and >5 min, respectively. Shorter bouts of light-, moderate-, and vigorous-intensity PA were defined as <1 min; longer bouts were defined as ≥1 min. Regression models examined associations between overall time-use composition and cardiometabolic biomarkers. Then, associations were derived between ratios of longer activity patterns relative to shorter activity patterns, and of each intensity level relative to the other intensity levels and "other time", and cardiometabolic biomarkers. Results: Confounder-adjusted models showed that the overall time-use composition was associated with adiposity, blood pressure, lipids, and the summary score. Specifically, more time in longer bouts of light-intensity PA relative to shorter bouts of light-intensity PA was significantly associated with greater body mass index z-score (zBMI) (β = 1.79; SE = 0.68) and waist circumference (β = 18.35; SE = 4.78). When each activity intensity was considered relative to all higher intensities and "other time", more time in light- and vigorous-intensity PA, and less time in sedentary behavior and moderate-intensity PA, were associated with lower waist circumference. Conclusion: Accumulating PA, particularly light-intensity PA, in frequent short bursts may be more beneficial for limiting adiposity compared to accumulating the same amount of PA at these intensities in longer bouts.
Funding: This project was supported by the China Postdoctoral Science Foundation (2005037506) and the National Natural Science Foundation of China (70472029).
Abstract: In e-commerce, multidimensional data analysis based on Web data requires integrating various data sources, such as XML data and relational data, at the conceptual level. A conceptual data description approach to the multidimensional data model, the UML galaxy diagram, is presented in order to conduct multidimensional data analysis for multiple subjects. The approach is illustrated using a case of a 2_roots UML galaxy diagram that covers marketing analysis of TV products involving one retailer and several suppliers.