Journal articles: 46,045 results found
Hadoop-based secure storage solution for big data in cloud computing environment (cited by: 1)
1
Authors: Shaopeng Guan, Conghui Zhang, Yilin Wang, Wenqing Liu. Digital Communications and Networks (SCIE, CSCD), 2024, No. 1, pp. 227-236 (10 pages)
In order to address the problems of the single encryption algorithm, such as low encryption efficiency and unreliable metadata for static data storage of big data platforms in the cloud computing environment, we propose a Hadoop-based big data secure storage scheme. First, to disperse the NameNode service from a single server to multiple servers, we combine the HDFS federation and HDFS high-availability mechanisms and use the ZooKeeper distributed coordination mechanism to coordinate each node to achieve dual-channel storage. Then, we improve the ECC encryption algorithm for the encryption of ordinary data and adopt a homomorphic encryption algorithm to encrypt data that needs to be calculated. To accelerate the encryption, we adopt the dual-thread encryption mode. Finally, the HDFS control module is designed to combine the encryption algorithm with the storage model. Experimental results show that the proposed solution solves the problem of a single point of failure of metadata, performs well in terms of metadata reliability, and can realize the fault tolerance of the server. The improved encryption algorithm integrates the dual-channel storage mode, and the encryption storage efficiency improves by 27.6% on average.
Keywords: big data security data encryption Hadoop parallel encrypted storage ZooKeeper
Reliability evaluation of IGBT power module on electric vehicle using big data (cited by: 1)
2
Authors: Li Liu, Lei Tang, Huaping Jiang, Fanyi Wei, Zonghua Li, Changhong Du, Qianlei Peng, Guocheng Lu. Journal of Semiconductors (EI, CAS, CSCD), 2024, No. 5, pp. 50-60 (11 pages)
There are challenges to the reliability evaluation of insulated gate bipolar transistors (IGBTs) on electric vehicles, such as junction temperature measurement and limited computational and storage resources. In this paper, a junction temperature estimation approach based on a neural network without additional cost is proposed, and the lifetime calculation for IGBTs using electric vehicle big data is performed. The direct current (DC) voltage, operation current, switching frequency, negative thermal coefficient thermistor (NTC) temperature, and IGBT lifetime are the inputs, and the junction temperature (T_j) is the output. With the rain flow counting method, the classified irregular temperatures are brought into the life model for the failure cycles. The fatigue accumulation method is then used to calculate the IGBT lifetime. To overcome the limited computational and storage resources of electric vehicle controllers, the IGBT lifetime calculation runs on a big data platform. The lifetime is then transmitted wirelessly to electric vehicles as an input for the neural network. Thus, the junction temperature of the IGBT under long-term operating conditions can be accurately estimated. A test platform of the motor controller combined with the vehicle big data server is built for the IGBT accelerated aging test. Subsequently, the IGBT lifetime predictions are derived from the junction temperature estimation by the neural network method and the thermal network method. The experiment shows that the lifetime prediction based on a neural network with big data demonstrates a higher accuracy than that of the thermal network, which improves the reliability evaluation of the system.
Keywords: IGBT junction temperature neural network electric vehicles big data
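The fatigue accumulation step described in the abstract above can be illustrated with a minimal sketch. It assumes linear (Miner-type) damage accumulation and a Coffin-Manson-style power law for the cycles to failure; the abstract does not state which life model the authors use, and the constants `a` and `n` and the function names are hypothetical placeholders, not values from the paper. The input is taken to be the output of a rain flow count: pairs of (temperature swing, number of cycles).

```python
def cycles_to_failure(delta_t: float, a: float, n: float) -> float:
    """Assumed Coffin-Manson-style life model: N_f = a * delta_t ** (-n)."""
    return a * delta_t ** (-n)


def accumulated_damage(cycle_bins, a: float, n: float) -> float:
    """Linear damage sum over rain-flow bins of (delta_t, cycle_count)."""
    return sum(count / cycles_to_failure(delta_t, a, n)
               for delta_t, count in cycle_bins)


def repetitions_until_failure(cycle_bins, a: float, n: float) -> float:
    """How many times the recorded load profile can repeat before damage reaches 1."""
    return 1.0 / accumulated_damage(cycle_bins, a, n)


# Hypothetical example: one mission profile, already rain-flow counted.
profile = [(10.0, 100), (20.0, 10)]
print(repetitions_until_failure(profile, a=1.0e6, n=2.0))
```

Failure is predicted when the damage sum reaches 1, so the reciprocal of the damage for one recorded profile gives the number of profile repetitions the module is expected to survive under these assumptions.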
Data-Driven Decision-Making for Bank Target Marketing Using Supervised Learning Classifiers on Imbalanced Big Data
3
Authors: Fahim Nasir, Abdulghani Ali Ahmed, Mehmet Sabir Kiraz, Iryna Yevseyeva, Mubarak Saif. Computers, Materials & Continua (SCIE, EI), 2024, No. 10, pp. 1703-1728 (26 pages)
Integrating machine learning and data mining is crucial for processing big data and extracting valuable insights to enhance decision-making. However, imbalanced target variables within big data present technical challenges that hinder the performance of supervised learning classifiers on key evaluation metrics, limiting their overall effectiveness. This study presents a comprehensive review of both common and recently developed Supervised Learning Classifiers (SLCs) and evaluates their performance in data-driven decision-making. The evaluation uses various metrics, with a particular focus on the Harmonic Mean Score (F-1 score) on an imbalanced real-world bank target marketing dataset. The findings indicate that grid-search random forest and random-search random forest excel in Precision and area under the curve, while Extreme Gradient Boosting (XGBoost) outperforms other traditional classifiers in terms of F-1 score. Employing oversampling methods to address the imbalanced data shows significant performance improvement in XGBoost, delivering superior results across all metrics, particularly when using the SMOTE variant known as the BorderlineSMOTE2 technique. The study identifies several key factors for effectively addressing the challenges of supervised learning with imbalanced datasets. These factors include the importance of selecting appropriate datasets for training and testing, choosing the right classifiers, employing effective techniques for processing and handling imbalanced datasets, and identifying suitable metrics for performance evaluation. Additionally, the factors include the utilisation of effective exploratory data analysis in conjunction with visualisation techniques to yield insights conducive to data-driven decision-making.
Keywords: big data machine learning data mining data visualization label encoding imbalanced dataset sampling techniques
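Because the abstract above singles out the F-1 score as the key metric for imbalanced data, a short sketch of how it is computed may help. The function name and the example counts are illustrative assumptions, not taken from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts.

    F1 is the harmonic mean of precision and recall; undefined ratios
    (zero denominators) are reported as 0.0 here by convention.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical minority-class counts: 8 true positives, 2 false positives, 8 misses.
print(precision_recall_f1(tp=8, fp=2, fn=8))
```

True negatives do not appear in the formula, which is why F1 is less flattered than plain accuracy by a large majority class.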
Big Data Access Control Mechanism Based on Two-Layer Permission Decision Structure
4
Authors: Aodi Liu, Na Wang, Xuehui Du, Dibin Shan, Xiangyu Wu, Wenjuan Wang. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 1705-1726 (22 pages)
Big data resources are characterized by large scale, wide sources, and strong dynamics. Existing access control mechanisms based on manual policy formulation by security experts suffer from drawbacks such as low policy management efficiency and difficulty in accurately describing the access control policy. To overcome these problems, this paper proposes a big data access control mechanism based on a two-layer permission decision structure. This mechanism extends the attribute-based access control (ABAC) model. Business attributes are introduced in the ABAC model as business constraints between entities. The proposed mechanism implements a two-layer permission decision structure composed of the inherent attributes of access control entities and the business attributes, which constitute the general permission decision algorithm based on logical calculation and the business permission decision algorithm based on a bi-directional long short-term memory (BiLSTM) neural network, respectively. The general permission decision algorithm is used to implement accurate policy decisions, while the business permission decision algorithm implements fuzzy decisions based on the business constraints. The BiLSTM neural network is used to calculate the similarity of the business attributes to realize intelligent, adaptive, and efficient access control permission decisions. Through the two-layer permission decision structure, the complex and diverse big data access control management requirements can be satisfied by considering the security and availability of resources. Experimental results show that the proposed mechanism is effective and reliable. In summary, it can efficiently support the secure sharing of big data resources.
Keywords: big data access control data security BiLSTM
Leveraging the potential of big genomic and phenotypic data for genome-wide association mapping in wheat
5
Authors: Moritz Lell, Yusheng Zhao, Jochen C. Reif. The Crop Journal (SCIE, CSCD), 2024, No. 3, pp. 803-813 (11 pages)
Genome-wide association mapping studies (GWAS) based on Big Data are a potential approach to improve marker-assisted selection in plant breeding. The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing. Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimation of underlying genetic effects, but is hindered by data heterogeneity and lack of interoperability. In this study, we used genomic and phenotypic data sets, focusing on Central European winter wheat populations evaluated for heading date. We explored strategies for integrating these data and subsequently the resulting potential for GWAS. Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols, resulting in high-quality integrated phenotypic data. In this context, genomic prediction proved to be a suitable tool to study the relevance of interactions between genotypes and experimental series, which was low in our case. Contrary to expectations, fewer associations between markers and traits were found in the larger combined data than in the individual experimental series. However, the predictive power based on the marker-trait associations of the integrated data set was higher across data sets. Therefore, the results show that the integration of medium-sized to Big Data is an approach to increase the power to detect QTL in GWAS. The results encourage further efforts to standardize and share data in the plant breeding community.
Keywords: big data Genome-wide association study data integration Genomic prediction wheat
Research on Tensor Multi-Clustering Distributed Incremental Updating Method for Big Data
6
Authors: Hongjun Zhang, Zeyu Zhang, Yilong Ruan, Hao Ye, Peng Li, Desheng Shi. Computers, Materials & Continua (SCIE, EI), 2024, No. 10, pp. 1409-1432 (24 pages)
The scale and complexity of big data are growing continuously, posing severe challenges to traditional data processing methods, especially in the field of clustering analysis. To address this issue, this paper introduces a new method named Big Data Tensor Multi-Cluster Distributed Incremental Update (BDTMCDIncreUpdate), which combines distributed computing, storage technology, and incremental update techniques to provide an efficient and effective means for clustering analysis. First, the original dataset is divided into multiple sub-blocks, and distributed computing resources are utilized to process the sub-blocks in parallel, enhancing efficiency. Then, initial clustering is performed on each sub-block using tensor-based multi-clustering techniques to obtain preliminary results. When new data arrives, incremental update technology is employed to update the core tensor and factor matrix, ensuring that the clustering model can adapt to changes in data. Finally, by combining the updated core tensor and factor matrix with historical computational results, refined clustering results are obtained, achieving real-time adaptation to dynamic data. Through experimental simulation on the Aminer dataset, the BDTMCDIncreUpdate method has demonstrated outstanding performance in terms of accuracy (ACC) and normalized mutual information (NMI) metrics, achieving an accuracy rate of 90% and an NMI score of 0.85, which outperforms existing methods such as TClusInitUpdate and TKLClusUpdate in most scenarios. Therefore, the BDTMCDIncreUpdate method offers an innovative solution to the field of big data analysis, integrating distributed computing, incremental updates, and tensor-based multi-clustering techniques. It not only improves the efficiency and scalability in processing large-scale high-dimensional datasets but also has been validated for its effectiveness and accuracy through experiments. This method shows great potential in real-world applications where dynamic data growth is common, and it is of significant importance for advancing the development of data analysis technology.
Keywords: tensor incremental update distributed clustering processing big data
An Innovative K-Anonymity Privacy-Preserving Algorithm to Improve Data Availability in the Context of Big Data
7
Authors: Linlin Yuan, Tiantian Zhang, Yuling Chen, Yuxiang Yang, Huang Li. Computers, Materials & Continua (SCIE, EI), 2024, No. 4, pp. 1561-1579 (19 pages)
The development of technologies such as big data and blockchain has brought convenience to life, but at the same time, privacy and security issues are becoming more and more prominent. The K-anonymity algorithm is an effective privacy-preserving algorithm with low computational complexity that can safeguard users' privacy by anonymizing big data. However, the algorithm currently suffers from the problem of focusing only on improving user privacy while ignoring data availability. In addition, ignoring the impact of quasi-identified attributes on sensitive attributes reduces the usability of the processed data for statistical analysis. Based on this, we propose a new K-anonymity algorithm to solve the privacy security problem in the context of big data, while guaranteeing improved data usability. Specifically, we construct a new information loss function based on the information quantity theory. Considering that different quasi-identification attributes have different impacts on sensitive attributes, we set weights for each quasi-identification attribute when designing the information loss function. In addition, to reduce information loss, we improve K-anonymity in two ways. First, we make the loss of information smaller than in the original table while guaranteeing privacy based on common artificial intelligence algorithms, i.e., the greedy algorithm and the 2-means clustering algorithm. In addition, we improve the 2-means clustering algorithm by designing a mean-center method to select the initial center of mass. Meanwhile, we design the K-anonymity algorithm of this scheme based on the constructed information loss function, the improved 2-means clustering algorithm, and the greedy algorithm, which reduces the information loss. Finally, we experimentally demonstrate the effectiveness of the algorithm in improving the effect of 2-means clustering and reducing information loss.
Keywords: Blockchain big data K-anonymity 2-means clustering greedy algorithm mean-center method
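For readers unfamiliar with the property the abstract above builds on, the following minimal sketch checks whether a table is K-anonymous: every combination of quasi-identifier values must be shared by at least k records. It illustrates only the standard definition, not the authors' weighted information loss function or their improved 2-means clustering; the record layout and column names are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)


def is_k_anonymous(records, quasi_identifiers, k: int) -> bool:
    """True if every quasi-identifier combination occurs at least k times."""
    return all(size >= k for size in group_sizes(records, quasi_identifiers).values())


# Hypothetical, already generalized table: age ranges and masked ZIP codes.
table = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "cold"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
]
print(is_k_anonymous(table, ["age", "zip"], k=2))
```

The last record forms a group of size one, so this example table is not 2-anonymous until that record is generalized further or merged into another group.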
Big Data Application Simulation Platform Design for Onboard Distributed Processing of LEO Mega-Constellation Networks
8
Authors: Zhang Zhikai, Gu Shushi, Zhang Qinyu, Xue Jiayin. China Communications (SCIE, CSCD), 2024, No. 7, pp. 334-345 (12 pages)
Due to the restricted satellite payloads in LEO mega-constellation networks (LMCNs), remote sensing image analysis, online learning, and other big data services need onboard distributed processing (OBDP). In existing technologies, the efficiency of big data applications (BDAs) in distributed systems hinges on stable-state and low-latency links between worker nodes. However, LMCNs with highly dynamic nodes and long-distance links cannot provide these conditions, which makes the performance of OBDP hard to measure intuitively. To bridge this gap, a multidimensional simulation platform is indispensable that can simulate the network environment of LMCNs and run BDAs in it for performance testing. Using STK's APIs and a parallel computing framework, we achieve real-time simulation for thousands of satellite nodes, which are mapped as application nodes through software defined network (SDN) and container technologies. We elaborate the architecture and mechanism of the simulation platform, and take Starlink and Hadoop as realistic examples for simulations. The results indicate that LMCNs have dynamic end-to-end latency which fluctuates periodically with the constellation movement. Compared to ground data center networks (GDCNs), LMCNs deteriorate the computing and storage job throughput, which can be alleviated by the utilization of erasure codes and data flow scheduling of worker nodes.
Keywords: big data application Hadoop LEO mega-constellation multidimensional simulation onboard distributed processing
Exploring impacts of COVID-19 on spatial and temporal patterns of visitors to Canadian Rocky Mountain National Parks from social media big data
9
Authors: Dehui Christina Geng, Amy Li, Jieyu Zhang, Howie W. Harshaw, Christopher Gaston, Wanli Wu, Guangyu Wang. Journal of Forestry Research (SCIE, EI, CAS, CSCD), 2024, No. 4, pp. 13-33 (21 pages)
COVID-19 posed challenges for global tourism management. Changes in visitor temporal and spatial patterns and their associated determinants pre- and peri-pandemic in Canadian Rocky Mountain National Parks are analyzed. Data were collected through social media programming and analyzed using spatiotemporal analysis and a geographically weighted regression (GWR) model. Results highlight that COVID-19 significantly changed park visitation patterns. Visitors tended to explore more remote areas peri-pandemic. The GWR model also indicated that distance to nearby trails was a significant influence on visitor density. Our results indicate that the pandemic influenced tourism temporal and spatial imbalance. This research presents a novel approach using combined social media big data which can be extended to the field of tourism management, and has important implications for managing visitor patterns and allocating resources efficiently to satisfy multiple objectives of park management.
Keywords: Tourism management Social media big data National parks COVID-19 Geographical weighted regression
Standard Framework Construction of Technology and Equipment for Big Data in Crop Phenomics
10
Authors: Weiliang Wen, Shenghao Gu, Ying Zhang, Wanneng Yang, Xinyu Guo. Engineering (SCIE, EI, CAS, CSCD), 2024, No. 11, pp. 175-184 (10 pages)
Crop phenomics has rapidly progressed in recent years due to the growing need for crop functional genomics, digital breeding, and smart cultivation. Despite this advancement, the lack of standards for the creation and usage of crop phenomics technology and equipment has become a bottleneck, limiting the industry's high-quality development. This paper begins with an overview of the crop phenotyping industry and presents an industrial mapping of technology and equipment for big data in crop phenomics. It analyzes the necessity and current state of constructing a standard framework for crop phenotyping. Furthermore, this paper proposes the intended organizational structure and goals of the standard framework. It details the essentials of the standard framework in the research and development of hardware and equipment, data acquisition, and the storage and management of crop phenotyping data. Finally, it discusses promoting the construction and evaluation of the standard framework, aiming to provide ideas for developing a high-quality standard framework for crop phenotyping.
Keywords: Crop phenomics big data Phenotyping technology and equipment Standard framework Industrial mapping
Evaluation of a software positioning tool to support SMEs in adoption of big data analytics
11
Authors: Matthew Willetts, Anthony S. Atkins. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, No. 1, pp. 13-24 (12 pages)
Big data analytics has been widely adopted by large companies to achieve measurable benefits including increased profitability, customer demand forecasting, cheaper development of products, and improved stock control. Small and medium sized enterprises (SMEs) are the backbone of the global economy, comprising 90% of businesses worldwide. However, only 10% of SMEs have adopted big data analytics despite the competitive advantage they could achieve. Previous research has analysed the barriers to adoption, and a strategic framework has been developed to help SMEs adopt big data analytics. The framework was converted into a scoring tool which has been applied to multiple case studies of SMEs in the UK. This paper documents the process of evaluating the framework based on the structured feedback from a focus group composed of experienced practitioners. The results of the evaluation are presented with a discussion, and the paper concludes with recommendations to improve the scoring tool based on the proposed framework. The research demonstrates that this positioning tool is beneficial for SMEs to achieve competitive advantages by increasing the application of business intelligence and big data analytics.
Keywords: big data analytics Evaluation Small and medium sized enterprises (SMEs) Strategic framework
Big data challenge for monitoring quality in higher education institutions using business intelligence dashboards
12
Authors: Ali Sorour, Anthony S. Atkins. Journal of Electronic Science and Technology (EI, CAS, CSCD), 2024, No. 1, pp. 25-41 (17 pages)
As big data becomes an apparent challenge to handle when building a business intelligence (BI) system, there is a motivation to handle this challenging issue in higher education institutions (HEIs). Monitoring quality in HEIs encompasses handling huge amounts of data coming from different sources. This paper reviews big data and analyses the cases from the literature regarding quality assurance (QA) in HEIs. It also outlines a framework that can address the big data challenge in HEIs to handle QA monitoring using BI dashboards, and a prototype dashboard is presented. The dashboard was developed using a utilisation tool to monitor QA in HEIs to provide visual representations of big data. The prototype dashboard enables stakeholders to monitor compliance with QA standards while addressing the big data challenge associated with the substantial volume of data managed by HEIs' QA systems. This paper also outlines how the developed system integrates big data from social media into the monitoring dashboard.
Keywords: big data Business intelligence (BI) Dashboards Higher education (HE) Quality assurance (QA) Social media
The Review of Land Use/Land Cover Mapping AI Methodology and Application in the Era of Remote Sensing Big Data
13
Authors: ZHANG Xinchang, SHI Qian, SUN Ying, HUANG Jianfeng, HE Da. Journal of Geodesy and Geoinformation Science (CSCD), 2024, No. 3, pp. 1-23 (23 pages)
With the increasing number of remote sensing satellites, the diversification of observation modalities, and the continuous advancement of artificial intelligence algorithms, historic opportunities have been brought to the applications of earth observation and information retrieval, including climate change monitoring, natural resource investigation, ecological environment protection, and territorial space planning. Over the past decade, artificial intelligence technology represented by deep learning has made significant contributions to the field of Earth observation. Therefore, this review focuses on the bottlenecks and development process of using deep learning methods for land use/land cover mapping of the Earth's surface. First, it introduces the basic framework of semantic segmentation network models for land use/land cover mapping. Then, we summarize the development of semantic segmentation models in the geographical field, focusing on spatial and semantic feature extraction, context relationship perception, multi-scale effects modelling, and the transferability of models under geographical differences. Next, the applications of semantic segmentation models in agricultural management, building boundary extraction, single tree segmentation, and inter-species classification are reviewed. Finally, we discuss the future development prospects of deep learning technology in the context of remote sensing big data.
Keywords: remote sensing big data deep learning semantic segmentation land use/land cover mapping
Multi-dimensional database design and implementation of dam safety monitoring system (cited by: 1)
14
Authors: Zhao Erfeng, Wang Yachao, Jiang Yufeng, Zhang Lei, Yu Hong. Water Science and Engineering (EI, CAS), 2008, No. 3, pp. 112-120 (9 pages)
To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logic design was achieved in multi-dimensional database mode. The optimal data model was confirmed by identifying data objects, defining relations, and reviewing entities. The conversion of relations among entities to external keys, and of entities and physical attributes to tables and fields, was interpreted completely. On this basis, a multi-dimensional database that reflects the management and analysis of a dam safety monitoring system on monitoring data information has been established, for which factual tables and dimensional tables have been designed. Finally, based on service design and user interface design, the dam safety monitoring system has been developed with Delphi as the development tool. This development project shows that the multi-dimensional database can simplify the development process and minimize hidden dangers in the database structure design. It is superior to other dam safety monitoring system development models and can provide a new research direction for system developers.
Keywords: dam safety multi-dimensional database conceptual data model database mode monitoring system
Goodness-of-fit tests for multi-dimensional copulas: Expanding application to historical drought data (cited by: 2)
15
Authors: Ming-wei MA, Li-liang REN, Song-bai SONG, Jia-li SONG, Shan-hu JIANG. Water Science and Engineering (EI, CAS, CSCD), 2013, No. 1, pp. 18-30 (13 pages)
The question of how to choose a copula model that best fits a given dataset is a predominant limitation of the copula approach, and the present study aims to investigate the techniques of goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was mathematically expanded from two dimensions to three dimensions, and procedures of a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application of historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. The goodness-of-fit tests for multi-dimensional copulas can provide further support for the potential application of a wider range of copulas to describe the associations of correlated hydrological variables. However, for the application of copulas with more than three dimensions, more complicated computational efforts as well as exploration and parameterization of corresponding copulas are required.
Keywords: goodness-of-fit test multi-dimensional copulas stochastic simulation Rosenblatt's transformation bootstrap approach drought data
Data-Oriented Operating System for Big Data and Cloud
16
Authors: Selwyn Darryl Kessler, Kok-Why Ng, Su-Cheng Haw. Intelligent Automation & Soft Computing, 2024, No. 4, pp. 633-647 (15 pages)
An Operating System (OS) is a critical piece of software that manages a computer's hardware and resources, acting as the intermediary between the computer and the user. The existing OS is not designed for Big Data and Cloud Computing, resulting in data processing and management inefficiency. This paper proposes a simplified and improved kernel on an x86 system designed for Big Data and Cloud Computing purposes. The proposed algorithm utilizes the performance benefits from the improved Input/Output (I/O) performance. The performance engineering runs the data-oriented design on traditional data management to improve data processing speed by reducing memory access overheads in conventional data management. The OS incorporates a data-oriented design to "modernize" various Data Science and management aspects. The resulting OS contains a basic input/output system (BIOS) bootloader that boots into Intel 32-bit protected mode, a text display terminal, 4 GB paging memory, 4096 heap block size, a Hard Disk Drive (HDD) I/O Advanced Technology Attachment (ATA) driver, and more. There are also I/O scheduling algorithm prototypes that demonstrate how a simple Sweeping algorithm is superior to more conventionally known I/O scheduling algorithms. A MapReduce prototype is implemented using Message Passing Interface (MPI) for big data purposes. An attempt was made to optimize binary search using modern performance engineering and data-oriented design.
Keywords: Operating system big data cloud computing MapReduce data-oriented
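The abstract above mentions a "simple Sweeping algorithm" for I/O scheduling without describing it. As a rough illustration only, the sketch below assumes it resembles the classic LOOK-style elevator sweep (serve all requests above the head in ascending order, then the rest in descending order) and compares its head travel with first-come-first-served order. The function names are hypothetical, and the cylinder numbers are a common textbook-style example rather than data from the paper.

```python
def total_head_movement(start: int, order: list[int]) -> int:
    """Sum of cylinder distances travelled when serving requests in the given order."""
    position, travelled = start, 0
    for cylinder in order:
        travelled += abs(cylinder - position)
        position = cylinder
    return travelled


def sweep_order(start: int, requests: list[int]) -> list[int]:
    """LOOK-style sweep: ascending pass from the head position, then a descending pass."""
    upward = sorted(c for c in requests if c >= start)
    downward = sorted((c for c in requests if c < start), reverse=True)
    return upward + downward


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(total_head_movement(53, queue))                   # arrival (FCFS) order
print(total_head_movement(53, sweep_order(53, queue)))  # sweep order
```

With the head starting at cylinder 53, arrival order costs 640 cylinders of travel and the sweep costs 299, which shows why a sweep can beat a naive policy; whether the paper's prototype behaves this way is an assumption.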
An Overview of the Application of Big Data in Supply Chain Management and Adaptation in Nigeria
17
作者 Jehoshaphat Jaiye Dukiya 《Journal of Computer and Communications》 2024年第8期37-51,共15页
That the world is a global village is no longer news through the tremendous advancement in the Information Communication Technology (ICT). The metamorphosis of the human data storage and analysis from analogue through... That the world is a global village is no longer news through the tremendous advancement in the Information Communication Technology (ICT). The metamorphosis of the human data storage and analysis from analogue through the jaguars-loom mainframe computer to the present modern high power processing computers with sextillion bytes storage capacity has prompted discussion of Big Data concept as a tool in managing hitherto all human challenges of complex human system multiplier effects. The supply chain management (SCM) that deals with spatial service delivery that must be safe, efficient, reliable, cheap, transparent, and foreseeable to meet customers’ needs cannot but employ bid data tools in its operation. This study employs secondary data online to review the importance of big data in supply chain management and the levels of adoption in Nigeria. The study revealed that the application of big data tools in SCM and other industrial sectors is synonymous to human and national development. It is therefore recommended that both private and governmental bodies should key into e-transactions for easy data assemblage and analysis for profitable forecasting and policy formation. 展开更多
Keywords: Big Data; IoT; Optimization; Right Data; Supply Chain; Transport Management
Data Visualization in Big Data Analysis: Applications and Future Trends
18
Author: Wenyi Ouyang. Journal of Computer and Communications, 2024, No. 11, pp. 76-85 (10 pages)
The advent of the big data era has made data visualization a crucial tool for enhancing the efficiency and insights of data analysis. This theoretical research delves into the current applications and potential future trends of data visualization in big data analysis. The article first systematically reviews the theoretical foundations and technological evolution of data visualization, and thoroughly analyzes the challenges faced by visualization in the big data environment, such as massive data processing, real-time visualization requirements, and multi-dimensional data display. Through extensive literature research, it explores innovative application cases and theoretical models of data visualization in multiple fields including business intelligence, scientific research, and public decision-making. The study reveals that interactive visualization, real-time visualization, and immersive visualization technologies may become the main directions for future development and analyzes the potential of these technologies in enhancing user experience and data comprehension. The paper also delves into the theoretical potential of artificial intelligence technology in enhancing data visualization capabilities, such as automated chart generation, intelligent recommendation of visualization schemes, and adaptive visualization interfaces. The research also focuses on the role of data visualization in promoting interdisciplinary collaboration and data democratization.
Finally, the paper proposes theoretical suggestions for promoting data visualization technology innovation and application popularization, including strengthening visualization literacy education, developing standardized visualization frameworks, and promoting open-source sharing of visualization tools. This study provides a comprehensive theoretical perspective for understanding the importance of data visualization in the big data era and its future development directions.
Keywords: Data Visualization; Big Data Analysis; Artificial Intelligence; Interactive Visualization; Data-Driven Decision Making
Optimizing Healthcare Big Data Processing with Containerized PySpark and Parallel Computing: A Study on ETL Pipeline Efficiency
19
Authors: Ehsan Soltanmohammadi, Neset Hikmet. Journal of Data Analysis and Information Processing, 2024, No. 4, pp. 544-565 (22 pages)
In this study, we delve into the realm of efficient Big Data Engineering and Extract, Transform, Load (ETL) processes within the healthcare sector, leveraging the robust foundation provided by the MIMIC-III Clinical Database. Our investigation entails a comprehensive exploration of various methodologies aimed at enhancing the efficiency of ETL processes, with a primary emphasis on optimizing time and resource utilization. Through meticulous experimentation utilizing a representative dataset, we shed light on the advantages associated with the incorporation of PySpark and Docker containerized applications. Our research illuminates significant advancements in time efficiency, process streamlining, and resource optimization attained through the utilization of PySpark for distributed computing within Big Data Engineering workflows. Additionally, we underscore the strategic integration of Docker containers, delineating their pivotal role in augmenting scalability and reproducibility within the ETL pipeline. This paper encapsulates the pivotal insights gleaned from our experimental journey, accentuating the practical implications and benefits entailed in the adoption of PySpark and Docker. By streamlining Big Data Engineering and ETL processes in the context of clinical big data, our study contributes to the ongoing discourse on optimizing data processing efficiency in healthcare applications. The source code is available on request.
Keywords: Big Data Engineering; ETL; Healthcare Sector; Containerized Applications; Distributed Computing; Resource Optimization; Data Processing Efficiency
Sports Prediction Model through Cloud Computing and Big Data Based on Artificial Intelligence Method
20
Authors: Aws I. Abu Eid, Achraf Ben Miled, Ahlem Fatnassi, Majid A. Nawaz, Ashraf F. A. Mahmoud, Faroug A. Abdalla, Chams Jabnoun, Aida Dhibi, Firas M. Allan, Mohammed Ahmed Elhossiny, Salem Belhaj, Imen Ben Mohamed. Journal of Intelligent Learning Systems and Applications, 2024, No. 2, pp. 53-79 (27 pages)
This article delves into the intricate relationship between big data, cloud computing, and artificial intelligence, shedding light on their fundamental attributes and interdependence. It explores the seamless amalgamation of AI methodologies within cloud computing and big data analytics, encompassing the development of a cloud computing framework built on the robust foundation of the Hadoop platform, enriched by AI learning algorithms. Additionally, it examines the creation of a predictive model empowered by tailored artificial intelligence techniques. Rigorous simulations are conducted to extract valuable insights, facilitating method evaluation and performance assessment, all within the dynamic Hadoop environment, thereby reaffirming the precision of the proposed approach. The results and analysis section reveals compelling findings derived from comprehensive simulations within the Hadoop environment. These outcomes demonstrate the efficacy of the Sport AI Model (SAIM) framework in enhancing the accuracy of sports-related outcome predictions. Through meticulous mathematical analyses and performance assessments, integrating AI with big data emerges as a powerful tool for optimizing decision-making in sports. The discussion section extends the implications of these results, highlighting the potential for SAIM to revolutionize sports forecasting, strategic planning, and performance optimization for players and coaches. The combination of big data, cloud computing, and AI offers a promising avenue for future advancements in sports analytics. This research underscores the synergy between these technologies and paves the way for innovative approaches to sports-related decision-making and performance enhancement.
Keywords: Artificial Intelligence; Machine Learning; Spark; Apache; Big Data; SAIM