Big data analytics has been widely adopted by large companies to achieve measurable benefits including increased profitability, customer demand forecasting, cheaper development of products, and improved stock control. Small and medium-sized enterprises (SMEs) are the backbone of the global economy, comprising 90% of businesses worldwide. However, only 10% of SMEs have adopted big data analytics despite the competitive advantage they could achieve. Previous research has analysed the barriers to adoption, and a strategic framework has been developed to help SMEs adopt big data analytics. The framework was converted into a scoring tool which has been applied to multiple case studies of SMEs in the UK. This paper documents the process of evaluating the framework based on structured feedback from a focus group composed of experienced practitioners. The results of the evaluation are presented and discussed, and the paper concludes with recommendations to improve the scoring tool based on the proposed framework. The research demonstrates that this positioning tool helps SMEs achieve competitive advantages by increasing the application of business intelligence and big data analytics.
In recent years, huge volumes of healthcare data have been generated in various forms. Advances in medical imaging have been tremendous, making biomedical image acquisition easier and quicker. Given this massive generation of big data, new methods based on Big Data Analytics (BDA), Machine Learning (ML), and Artificial Intelligence (AI) have become essential. In this context, the current research work develops a new Big Data Analytics with Cat Swarm Optimization based Deep Learning (BDA-CSODL) technique for medical image classification in an Apache Spark environment. The aim of the proposed BDA-CSODL technique is to classify medical images and diagnose disease accurately. The BDA-CSODL technique involves several stages of operation: preprocessing, segmentation, feature extraction, and classification. It follows a multi-level thresholding-based image segmentation approach to detect infected regions in a medical image. A deep convolutional neural network-based Inception v3 model is used as the feature extractor, and a Stochastic Gradient Descent (SGD) model is used for parameter tuning. Furthermore, a CSO with Long Short-Term Memory (CSO-LSTM) model is employed as the classifier to determine the appropriate class labels. Both the SGD and CSO design approaches help improve the overall image classification performance of the proposed BDA-CSODL technique. A wide range of simulations was conducted on benchmark medical image datasets, and the comprehensive comparative results demonstrate the superiority of the proposed BDA-CSODL technique under different measures.
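As a rough illustration of the multi-level thresholding step mentioned in this abstract, the sketch below labels each pixel by the intensity band it falls into. It is not the BDA-CSODL implementation: the function name `multilevel_threshold`, the threshold values, and the toy image are assumptions made for the example, and the abstract does not say how the thresholds are chosen.

```python
from bisect import bisect_right


def multilevel_threshold(image, thresholds):
    """Map each pixel intensity to a region label in 0..len(thresholds).

    A pixel equal to a threshold is assigned to the band above it.
    The thresholds here are hypothetical inputs, not learned values.
    """
    cuts = sorted(thresholds)
    return [[bisect_right(cuts, pixel) for pixel in row] for row in image]


# Toy 2x3 grayscale image (values 0..255), purely illustrative.
image = [
    [12, 40, 90],
    [130, 200, 255],
]
labels = multilevel_threshold(image, thresholds=[50, 128, 192])
print(labels)  # [[0, 0, 1], [2, 3, 3]]
```

With k thresholds the image is split into k + 1 regions; a real pipeline would pick the thresholds with an optimization criterion rather than fix them by hand.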
Lately, Internet of Things (IoT) applications require millions of structured and unstructured data records, which raises numerous problems in data organization, production, and capture. Big data analytics is a leading technology for addressing these shortcomings. Even though big data and IoT can make human life more convenient, those benefits come at the expense of security. To manage these threats, intrusion detection systems (IDS) have been extensively applied to identify malicious network traffic, particularly once preventive techniques fail at the level of endpoint IoT devices. As cyberattacks targeting IoT have gradually become stealthier and more sophisticated, IDS must continually evolve to manage emerging security threats. This study devises a Big Data Analytics with the Internet of Things Assisted Intrusion Detection using Modified Buffalo Optimization Algorithm with Deep Learning (IDMBOA-DL) algorithm. In the presented IDMBOA-DL model, the Hadoop MapReduce tool is used to manage big data. The MBOA algorithm is applied to select an optimal subset of features. Finally, the sine cosine algorithm (SCA) with a convolutional autoencoder (CAE) mechanism is used to recognize and classify intrusions in the IoT network. A wide range of simulations was conducted to demonstrate the enhanced results of the IDMBOA-DL algorithm. The comparison outcomes show the better performance of the IDMBOA-DL model over other approaches.
Big Data applications face different types of complexity in classification. Cleaning data by eliminating irrelevant or redundant records becomes a complex operation when the discriminative features of the processed data must be preserved. Existing schemes have many disadvantages, including continuity in training, the need for more samples and training time in feature selection, and increased classification execution times. Recently, ensemble methods have made a mark in classification tasks because they combine multiple results into a single representation. Compared with a single model, this technique offers improved prediction. Ensemble-based feature selection parallels multiple experts' judgments on a single topic. The major goal of this research is to propose HEFSM (Heterogeneous Ensemble Feature Selection Model), a hybrid approach that combines multiple algorithms. Further, the individual outputs of methods that produce feature subsets, rankings, or votes are also combined in this work. A KNN (K-Nearest Neighbor) classifier is used to classify the big dataset obtained from the ensemble learning approach. The results of the study are good, demonstrating the proposed model's classification efficiency in terms of precision, recall, F-measure, and accuracy.
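The idea of combining rankings from several feature selectors and then classifying with KNN can be sketched as follows. This is a minimal illustration, not the HEFSM method itself: the abstract does not state how rankings are merged, so Borda counting is an assumed stand-in, and the feature names, rankings, and training points are hypothetical.

```python
import math
from collections import Counter


def borda_aggregate(rankings):
    """Merge several feature rankings (best first) with a Borda count.

    Borda counting is an assumption made for this sketch; the source
    abstract does not specify the aggregation rule.
    """
    scores = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] += n - position  # top-ranked feature earns n points
    # Highest score first; ties broken by feature name for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))


def knn_predict(train, labels, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


# Hypothetical rankings produced by three different selectors.
rankings = [
    ["f2", "f0", "f1", "f3"],
    ["f0", "f2", "f3", "f1"],
    ["f2", "f1", "f0", "f3"],
]
selected = borda_aggregate(rankings)[:2]
print(selected)  # ['f2', 'f0']

# Toy data already restricted to the two selected features.
train = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["a", "a", "b", "b"]
print(knn_predict(train, labels, (0.2, 0.1), k=3))  # a
```

Using different selector families for the input rankings is what makes the ensemble heterogeneous; the aggregation step itself is independent of how each ranking was produced.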
This paper focuses on facilitating state-of-the-art applications of big data analytics (BDA) architectures and infrastructures in the telecommunications (telecom) industrial sector. Telecom companies deal with terabytes to petabytes of data on a daily basis, and IoT applications in telecom contribute further to this data deluge. Recent advances in BDA have opened new opportunities to extract actionable insights from telecom big data. These benefits and the fast-changing BDA technology landscape make it important to investigate existing BDA applications in the telecom sector. To this end, we first identify published research on BDA applications in telecom through a systematic literature review, from which we filter 38 articles and categorize them into frameworks, use cases, literature reviews, white papers, and experimental validations. We also discuss the benefits and challenges mentioned in these articles. We find that the experiments are all proofs of concept (POC) on a severely limited BDA technology stack (compared with the available technology stack); that is, we did not find any work focusing on a full-fledged BDA implementation in an operational telecom environment. To facilitate these applications at the research level, we propose a state-of-the-art lambda architecture for BDA pipeline implementation (called LambdaTel), based entirely on open-source BDA technologies and the standard Python language, along with relevant guidelines. We found only one research paper that presented a relatively limited lambda architecture, using the proprietary AWS cloud infrastructure. We believe LambdaTel presents a clear roadmap for telecom industry practitioners to implement and enhance BDA applications in their enterprises.
The advent of healthcare information management systems (HIMSs) continues to produce large volumes of healthcare data for patient care and for compliance and regulatory requirements at a global scale. Analysis of this big data opens up vast potential for knowledge discovery. Big data analytics (BDA) in healthcare can, for instance, help determine causes of diseases, generate effective diagnoses, enhance QoS guarantees by increasing the efficiency of healthcare delivery and the effectiveness and viability of treatments, generate accurate predictions of readmissions, enhance clinical care, and pinpoint opportunities for cost savings. However, BDA implementations in any domain are generally complicated and resource-intensive, with a high failure rate and no roadmap or success strategies to guide practitioners. In this paper, we present a comprehensive roadmap for deriving insights from BDA in the healthcare (patient care) domain, based on the results of a systematic literature review. We first determine big data characteristics for healthcare and then review BDA applications to healthcare in academic research, focusing particularly on NoSQL databases. We also identify the limitations and challenges of these applications and justify the potential of NoSQL databases to address these challenges and further enhance BDA healthcare research. We then propose and describe a state-of-the-art BDA architecture for the healthcare domain, called Med-BDA, which solves all current BDA challenges and is based on the latest zeta big data paradigm. We also present success strategies to ensure the working of Med-BDA and outline the major benefits of BDA applications to healthcare. Finally, we compare our work with other related literature reviews across twelve hallmark features to justify the novelty and importance of our work. These contributions are collectively unique and clearly present a roadmap for clinical administrators, practitioners, and professionals to successfully implement BDA initiatives in their organizations.
To obtain the platform's big data analytics support, manufacturers in the traditional retail channel must decide whether to use the direct online channel. A retail supply chain model and a direct online supply chain model are built, in which manufacturers design products alone in the retail channel, while the platform and manufacturer complete the product design together in the direct online channel. These two models are analyzed using game theory and numerical simulation. The findings indicate that if the manufacturers' design capabilities are not very high and the commission rate is not very low, the manufacturers will choose the direct online channel when the platform's technical efforts fall within an interval. When the platform's technical efforts are exogenous, they positively influence the manufacturers' decisions; in the endogenous case, however, the platform's effect on the manufacturers is reflected in the interaction of the commission rate and cost efficiency. The manufacturers and the platform should make synthetic effort decisions based on the manufacturer's development capabilities, the intensity of market competition, and the cost efficiency of the platform.
Big data analytics is emerging as one of the most important kinds of workload in modern data centers. Hence, it is of great interest to identify how to achieve the best performance for big data analytics workloads running on state-of-the-art SMT (simultaneous multithreading) processors, which requires a comprehensive understanding of workload characteristics. This paper chooses Spark workloads as representative big data analytics workloads and performs comprehensive measurements on the POWER8 platform, which supports a wide range of multithreading. The research finds that the thread assignment policy and cache contention have significant impacts on application performance. To identify potential optimization methods from the experimental results, this study performs micro-architecture-level characterization by means of hardware performance counters and draws implications accordingly.
Big data analysis (BDA) can increase the supply chain analysis capability of manufacturing companies. Many manufacturing companies therefore want to use BDA, but implementation has proved difficult, especially in developing countries, owing to various barriers related to finance, government regulations, and so on. This paper aims to investigate the barriers to BDA implementation in Iranian companies. In the literature, limited work has been done on identifying barriers to implementing BDA in developing countries. In this regard, 34 barriers to BDA adoption in Iran were identified through a literature review and feedback from experts. Then, the 14 most important barriers were analyzed using an integrated Interpretive Structural Modeling and MICMAC approach. Results show that two barriers, namely the lack of sufficient knowledge among senior managers and the weakness of governance policies, are the most significant. Finally, crucial policy measures and recommendations are proposed to assist managers and government bodies.
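The core computation behind an Interpretive Structural Modeling and MICMAC analysis can be sketched as below: a direct-influence matrix between barriers is closed transitively into a reachability matrix, and each barrier's driving power (row sum) and dependence power (column sum) are then read off. The three-barrier matrix is a made-up example, not data from the study, and the function names are assumptions.

```python
def reachability(adjacency):
    """Final reachability matrix: transitive closure plus self-reachability.

    Uses Warshall's algorithm; adjacency[i][j] == 1 means barrier i
    directly influences barrier j.
    """
    n = len(adjacency)
    reach = [[1 if i == j or adjacency[i][j] else 0 for j in range(n)]
             for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = 1
    return reach


def micmac_powers(reach):
    """Driving power = row sums; dependence power = column sums."""
    driving = [sum(row) for row in reach]
    dependence = [sum(col) for col in zip(*reach)]
    return driving, dependence


# Hypothetical chain: barrier 0 influences 1, and 1 influences 2.
adjacency = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
reach = reachability(adjacency)
driving, dependence = micmac_powers(reach)
print(driving)     # [3, 2, 1]
print(dependence)  # [1, 2, 3]
```

Barriers with high driving power and low dependence (barrier 0 in this toy case) are the ones such an analysis typically flags as root causes.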
With the business benefits of big data analytics increasing rapidly (when it is used appropriately and intelligently by businesses in the private and public sectors), this study focused on empirically identifying big data analytics (BDA) attributes. These attributes were classified into four groups (value innovation, social impact, precision, and completeness of BDA quality) and were found to influence decision-making performance and business performance outcomes. A structural equation modeling analysis using 382 responses from BDA-related practitioners indicated that the attributes of representativeness, predictability, interpretability, and innovativeness, which relate to value innovation, greatly enhanced the decision-making confidence and effectiveness of decision makers who use big data. In addition, individuality, collectivity, and willfulness, which relate to social impact, also greatly improved the decision-making confidence and effectiveness of the same decision makers. This shows that value innovation and social impact, which have received relatively little attention in previous studies, are crucial attributes of BDA quality because they influence decision-making performance. Comprehensiveness, factuality, and realism, which are linked to completeness, showed similar results. Furthermore, the higher the decision-making confidence of decision makers who used big data, the higher the financial performance of their companies. High decision-making confidence using big data was also found to improve nonfinancial performance metrics such as customer satisfaction, quality levels, and product development capabilities. High decision-making effectiveness with big data was likewise shown to improve nonfinancial performance metrics.
Big data analysis is confronted with the obstacle of high dimensionality in data samples. To address this issue, researchers have devised a multitude of intelligent optimization algorithms aimed at enhancing big data analysis techniques. Among these algorithms is War Strategy Optimization (WSO), proposed in 2022, which distinguishes itself from other intelligence algorithms through its potent optimization capabilities. Nevertheless, WSO exhibits limitations in its global search capacity and is susceptible to becoming trapped in local optima when dealing with high-dimensional problems. To surmount these shortcomings and improve the performance of WSO in handling the challenges posed by high dimensionality in big data, this paper introduces an enhanced version of WSO based on the carnivorous plant algorithm (CPA) and shared niche. The grouping concept and update strategy of CPA are incorporated into WSO, and its update strategy is modified through the introduction of a shared small habitat approach combined with an elite strategy to create a novel improved algorithm. Simulation experiments were conducted to compare this new War Strategy Optimization (CSWSO) with WSO, RKWSO, I-GWO, NCHHO and FDB-SDO using 16 test functions. Experimental results demonstrate that the proposed enhanced algorithm exhibits superior optimization accuracy and stability, providing a novel approach to addressing the challenges posed by high dimensionality in big data.
Big Data Analytics is an emerging field since massive storage and computing capabilities have been made available by advanced e-infrastructures. Earth and Environmental sciences are likely to benefit from Big Data Analytics techniques supporting the processing of the large number of Earth Observation datasets currently acquired and generated through observations and simulations. However, Earth Science data and applications present specificities in terms of relevance of the geospatial information, wide heterogeneity of data models and formats, and complexity of processing. Therefore, Big Earth Data Analytics requires specifically tailored techniques and tools. The EarthServer Big Earth Data Analytics engine offers a solution for coverage-type datasets, built around a high performance array database technology, and the adoption and enhancement of standards for service interaction (OGC WCS and WCPS). The EarthServer solution, led by the collection of requirements from scientific communities and international initiatives, provides a holistic approach that ranges from query languages and scalability up to mobile access and visualization. The result is demonstrated and validated through the development of lighthouse applications in the Marine, Geology, Atmospheric, Planetary and Cryospheric science domains.
Over the past few decades, with the development of automatic identification, data capture, and storage technologies, people have been generating data much faster and collecting far more of it than ever before in business, science, engineering, education, and other areas. Big data has emerged as an important area of study for both practitioners and researchers, with a huge impact on data-related problems. In this paper, we identify the key issues related to big data analytics and then investigate its applications specifically related to business problems.
Big data has attracted much attention from academia and industry. But the discussion of big data is disparate, fragmented and distributed among different outlets. This paper conducts a systematic and extensive review of 186 journal publications about big data from 2011 to 2015 in the Science Citation Index (SCI) and the Social Science Citation Index (SSCI) databases, aiming to provide scholars and practitioners with a comprehensive overview and big picture of research on big data. The selected papers are grouped into 20 research categories. The contents of the paper(s) in each research category are summarized. Research directions for each category are outlined as well. The results of this study indicate that the selected papers were mainly published between 2013 and 2015 and focus on technological issues regarding big data. Diverse new approaches, methods, frameworks and systems are proposed for data collection, storage, transport, processing and analysis in the selected papers. Possible directions for future research on big data are discussed.
In this paper, recent developments on the Internet of Things (IoT) and its applications are surveyed, and the impact of newly developed Big Data (BD) on manufacturing information systems is especially discussed. Big Data analytics (BDA) has been identified as a critical technology to support data acquisition, storage, and analytics in data management systems in modern manufacturing. The purpose of the presented work is to clarify the requirements of predictive systems, and to identify research challenges and opportunities on BDA to support cloud-based information systems.
In this work, we design a multisensory IoT-based online vitals monitor (hereinafter referred to as the VITALS) to sense four bedside physiological parameters: pulse (heart) rate, body temperature, blood pressure, and peripheral oxygen saturation. The proposed system constantly transfers these signals to the analytics system, which aids in enhancing diagnostics at an earlier stage as well as monitoring after recovery. The core hardware of the VITALS includes commercial off-the-shelf sensing devices and medical equipment, a powerful microcontroller, a reliable wireless communication module, and a big data analytics system. It extracts human vital signs at a pre-programmed interval of 30 min and sends them to the big data analytics system through the WiFi module for further analysis. We use Apache Kafka (to gather live data streams from connected sensors), Apache Spark (to categorize the patient vitals and notify medical professionals when abnormalities in physiological parameters are identified), the Hadoop Distributed File System (HDFS) (to archive data streams for further analysis and long-term storage), and Spark SQL, Hive and Matplotlib (to help caregivers access and visualize appropriate information from the collected data streams and to explore and understand the health status of individuals). In addition, we develop a mobile application to send statistical graphs to doctors and patients so that they can monitor health conditions remotely. The proposed system was deployed on three patients for 7 days to check the effectiveness of the sensing, data processing, and data transmission mechanisms. To validate system accuracy, we compare the data values collected from the established sensors with readouts measured using a commercial healthcare monitor, the Welch Allyn® Spot Check. The proposed system provides improved care solutions, especially for those whose access to care services is limited.
With more and more data being generated, it has become a big challenge for traditional architectures and infrastructures to process large amounts of data within acceptable time and resource limits. To extract value from these data efficiently, organizations need to find new tools and methods specialized for big data processing. For this reason, big data analytics has become a key factor for companies seeking to reveal hidden information and achieve competitive advantages in the market. Currently, the enormous number of publications on big data analytics makes it difficult for practitioners and researchers to find topics of interest and stay up to date. This paper aims to present an overview of the content, scope, and findings of big data analytics, as well as the opportunities provided by its application.
The availability of digital technology in the hands of citizens worldwide makes available an unprecedented amount of data. The capability to process these gigantic amounts of data in real time with Big Data Analytics (BDA) tools and Machine Learning (ML) algorithms brings many benefits. However, the large number of free BDA tools, platforms, and data mining tools makes it challenging to select the appropriate one for a given task. This paper presents a comprehensive mini literature review of ML in BDA. Using a keyword search, a total of 1512 published articles were identified; these were screened down to 140 based on the study's proposed novel taxonomy. The study outcome shows that deep neural networks (15%), support vector machines (15%), artificial neural networks (14%), decision trees (12%), and ensemble learning techniques (11%) are widely applied in BDA. The related application fields, challenges, and, most importantly, openings for future research are detailed.
In this study, the key drivers of sustainability commitment, green supply chain management, big data integration, and green human resource practice are explored, and the impact of these sustainable capabilities on the environmental and financial performance of banks is elaborated. In addition, the influence of green management practices on integrating big data technology into operations is presented. The concept of dynamic capability is used to propose and empirically test conceptual models. Data were collected through a self-administered survey questionnaire from 317 people working in 37 banks in six Asian countries. The research suggests that big data analytics strategies have an impact on internal processes and on the stability and financial performance of banks. It also indicates that banks are committed to proper data monitoring of their customers in order to achieve operational efficiency and sustainability goals. Furthermore, our results show that banks practicing Green Innovation strategies experience better environmental and economic performance because their employees are already trained in Green HR. Finally, our study found that internal and external green supply chain management practices have a positive effect on the environmental and financial performance of banks, so that banks in the Association of Southeast Asian Nations (ASEAN) mitigate the environmental impact of their operations and ultimately experience an increase in financial performance.
Big data analytics applications are increasingly deployed on cloud computing infrastructures, and picking the optimal cloud configuration in a cost-effective way remains a big challenge. In this paper, we address this problem with high accuracy and low overhead. We propose Apollo, a data-driven approach that can rapidly pick the optimal cloud configuration by reusing data from similar workloads. We first classify 12 typical workloads in BigDataBench by characterizing pairwise correlations in our offline benchmarks. When a new workload arrives, we run it with several small datasets to rank its key characteristics and find similar workloads. Based on the rank, we then limit the search space of cloud configurations through a classification mechanism. Finally, we leverage a hierarchical regression model to measure which cluster is more suitable and use a local search strategy to pick the optimal cloud configuration in a few extra tests. Our evaluation on 12 typical workloads in HiBench shows that, compared with state-of-the-art approaches, Apollo can improve search accuracy by up to 30% while reducing the overhead of picking the optimal cloud configuration by as much as 50%.
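The step of finding workloads similar to a new one can be illustrated with a simple correlation ranking. This is only a sketch under stated assumptions: the abstract mentions pairwise correlations but not the exact metric, so Pearson correlation is assumed here, and the workload names and profile vectors are hypothetical, not taken from BigDataBench or HiBench.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_similar(new_profile, known_profiles):
    """Return known workload names, most correlated with the new one first."""
    scores = {name: pearson(new_profile, profile)
              for name, profile in known_profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical profiles, e.g. runtimes measured on four small datasets.
known_profiles = {
    "sort": [1, 2, 3, 4],
    "kmeans": [4, 3, 2, 1],
    "wordcount": [1, 3, 2, 4],
}
new_profile = [2, 4, 6, 8]
print(rank_similar(new_profile, known_profiles))
# ['sort', 'wordcount', 'kmeans']
```

The top-ranked neighbours would then be used to narrow the configuration search space; that later stage (classification, hierarchical regression, local search) is not shown here.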
Funding: The author extends his appreciation to the Deanship of Scientific Research at Majmaah University for funding this study under Project Number (R-2022-61).
文摘In recent years,huge volumes of healthcare data are getting generated in various forms.The advancements made in medical imaging are tremendous owing to which biomedical image acquisition has become easier and quicker.Due to such massive generation of big data,the utilization of new methods based on Big Data Analytics(BDA),Machine Learning(ML),and Artificial Intelligence(AI)have become essential.In this aspect,the current research work develops a new Big Data Analytics with Cat Swarm Optimization based deep Learning(BDA-CSODL)technique for medical image classification on Apache Spark environment.The aim of the proposed BDA-CSODL technique is to classify the medical images and diagnose the disease accurately.BDA-CSODL technique involves different stages of operations such as preprocessing,segmentation,fea-ture extraction,and classification.In addition,BDA-CSODL technique also fol-lows multi-level thresholding-based image segmentation approach for the detection of infected regions in medical image.Moreover,a deep convolutional neural network-based Inception v3 method is utilized in this study as feature extractor.Stochastic Gradient Descent(SGD)model is used for parameter tuning process.Furthermore,CSO with Long Short-Term Memory(CSO-LSTM)model is employed as a classification model to determine the appropriate class labels to it.Both SGD and CSO design approaches help in improving the overall image classification performance of the proposed BDA-CSODL technique.A wide range of simulations was conducted on benchmark medical image datasets and the com-prehensive comparative results demonstrate the supremacy of the proposed BDA-CSODL technique under different measures.
Abstract: Internet of Things (IoT) applications now require millions of structured and unstructured data items and face numerous problems, such as data organization, production, and capture. To address these shortcomings, big data analytics is a leading technology that has to be adopted. Even though big data and IoT could make human life more convenient, those benefits come at the expense of security. To manage these threats, intrusion detection systems (IDS) have been extensively applied to identify malicious network traffic, particularly once preventive techniques fail at the level of endpoint IoT devices. As cyberattacks targeting IoT have gradually become stealthier and more sophisticated, IDS must continually evolve to manage emerging security threats. This study devises a Big Data Analytics with Internet of Things Assisted Intrusion Detection using Modified Buffalo Optimization Algorithm with Deep Learning (IDMBOA-DL) algorithm. In the presented IDMBOA-DL model, the Hadoop MapReduce tool is used for managing big data. The MBOA algorithm is applied to select an optimal subset of features. Finally, the sine cosine algorithm (SCA) with a convolutional autoencoder (CAE) mechanism is used to recognize and classify intrusions in the IoT network. A wide range of simulations was conducted to demonstrate the enhanced results of the IDMBOA-DL algorithm. The comparison outcomes emphasized the better performance of the IDMBOA-DL model over other approaches.
Abstract: Big data applications face different types of complexity in classification. Cleaning data by eliminating irrelevant or redundant items becomes a complex operation when discriminative features must be maintained in the processed data. Existing schemes have many disadvantages, including continuity in training, the need for more samples and training time in feature selection, and increased classification execution times. Recently, ensemble methods have made a mark in classification tasks because they combine multiple results into a single representation. Compared with a single model, this technique offers improved prediction. Ensemble-based feature selection parallels multiple experts' judgments on a single topic. The major goal of this research is to propose HEFSM (Heterogeneous Ensemble Feature Selection Model), a hybrid approach that combines multiple algorithms. Further, the individual outputs of methods that produce feature subsets, rankings, or votes are also combined in this work. A KNN (K-Nearest Neighbor) classifier is used to classify the big dataset obtained from the ensemble learning approach. The results of the study are good, demonstrating the proposed model's efficiency in classification in terms of precision, recall, F-measure, and accuracy.
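The idea of combining rankings from several feature selectors can be illustrated with a small sketch. The abstract does not say which aggregation rule HEFSM uses, so the Borda count below is an assumption chosen for illustration; the function name `borda_aggregate`, the feature names, and the three rankings are all hypothetical, and the KNN classification step is not shown.

```python
from collections import defaultdict

def borda_aggregate(rankings, k):
    """Combine several feature rankings (best first) and keep the top k features."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # The best-ranked feature earns n points, the worst earns 1.
            scores[feature] += n - position
    # Sort by descending score; break ties alphabetically for determinism.
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return ordered[:k]

# Three hypothetical selectors, each ranking the same four hypothetical features.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f2", "f1", "f4", "f3"],
    ["f1", "f3", "f2", "f4"],
]
selected = borda_aggregate(rankings, k=2)
print(selected)
```

The selected subset would then be passed to a classifier; rank aggregation is only one of several ways to merge heterogeneous selector outputs.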
Funding: Supported in part by the Big Data Analytics Laboratory (BDALAB) at the Institute of Business Administration under the research grant approved by the Higher Education Commission of Pakistan (www.hec.gov.pk) and the Darbi company (www.darbi.io).
Abstract: This paper focuses on facilitating state-of-the-art applications of big data analytics (BDA) architectures and infrastructures in the telecommunications (telecom) industrial sector. Telecom companies deal with terabytes to petabytes of data on a daily basis, and IoT applications in telecom are further contributing to this data deluge. Recent advances in BDA have exposed new opportunities to get actionable insights from telecom big data. These benefits and the fast-changing BDA technology landscape make it important to investigate existing BDA applications in the telecom sector. To this end, we first identify published research on BDA applications in telecom through a systematic literature review, from which we filter 38 articles and categorize them into frameworks, use cases, literature reviews, white papers, and experimental validations. We also discuss the benefits and challenges mentioned in these articles. We find that the experiments are all proofs of concept (POCs) on a severely limited BDA technology stack (compared to the available technology stack), i.e., we did not find any work focusing on a full-fledged BDA implementation in an operational telecom environment. To facilitate these applications at the research level, we propose a state-of-the-art lambda architecture for BDA pipeline implementation (called LambdaTel) based completely on open-source BDA technologies and the standard Python language, along with relevant guidelines. We discovered only one research paper that presented a relatively limited lambda architecture, using the proprietary AWS cloud infrastructure. We believe LambdaTel presents a clear roadmap for telecom industry practitioners to implement and enhance BDA applications in their enterprises.
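For readers unfamiliar with the lambda architecture pattern mentioned above, the following is a minimal in-memory sketch of its three parts: an append-only master log, a batch view recomputed from that log, and a speed view covering events since the last batch run. It is not the LambdaTel design; the class `LambdaCounter`, its methods, and the event keys are hypothetical, and a real pipeline would use distributed storage and processing engines.

```python
from collections import Counter

class LambdaCounter:
    """Toy lambda architecture that counts events per key."""

    def __init__(self):
        self.master = []             # append-only log of every event
        self.batch_view = Counter()  # recomputed from the full log on each batch run
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, key):
        # New events go to the master log and are visible at once via the speed view.
        self.master.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        # The batch layer rebuilds its view from scratch; the speed view is then reset.
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, key):
        # The serving layer merges both views at query time.
        return self.batch_view[key] + self.speed_view[key]

pipeline = LambdaCounter()
for cell in ["cell_A", "cell_A", "cell_B"]:
    pipeline.ingest(cell)
pipeline.run_batch()
pipeline.ingest("cell_A")  # arrives after the batch run
print(pipeline.query("cell_A"), pipeline.query("cell_B"))
```

The sketch assumes batch runs are instantaneous; in practice events keep arriving while a batch job runs, so the speed view must be trimmed rather than cleared.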
Funding: Supported by two research grants provided by the Karachi Institute of Economics and Technology (KIET) and the Big Data Analytics Laboratory at the Institute of Business Administration (IBA Karachi).
Abstract: The advent of healthcare information management systems (HIMSs) continues to produce large volumes of healthcare data for patient care and for compliance and regulatory requirements at a global scale. Analysis of this big data offers boundless potential for discovering knowledge. Big data analytics (BDA) in healthcare can, for instance, help determine causes of diseases, generate effective diagnoses, enhance QoS guarantees by increasing the efficiency of healthcare delivery and the effectiveness and viability of treatments, generate accurate predictions of readmissions, enhance clinical care, and pinpoint opportunities for cost savings. However, BDA implementations in any domain are generally complicated and resource-intensive, with a high failure rate and no roadmap or success strategies to guide practitioners. In this paper, we present a comprehensive roadmap to derive insights from BDA in the healthcare (patient care) domain, based on the results of a systematic literature review. We first determine big data characteristics for healthcare and then review BDA applications to healthcare in academic research, focusing particularly on NoSQL databases. We also identify the limitations and challenges of these applications and justify the potential of NoSQL databases to address these challenges and further enhance BDA healthcare research. We then propose and describe a state-of-the-art BDA architecture called Med-BDA for the healthcare domain, which solves all current BDA challenges and is based on the latest zeta big data paradigm. We also present success strategies to ensure the working of Med-BDA and outline the major benefits of BDA applications to healthcare. Finally, we compare our work with other related literature reviews across twelve hallmark features to justify the novelty and importance of our work. These contributions are collectively unique and clearly present a roadmap for clinical administrators, practitioners, and professionals to successfully implement BDA initiatives in their organizations.
Funding: The National Natural Science Foundation of China (No. 72071039) and the Foundation of China Scholarship Council (No. 202106090197).
Abstract: To obtain the platform's big data analytics support, manufacturers in the traditional retail channel must decide whether to use the direct online channel. A retail supply chain model and a direct online supply chain model are built, in which manufacturers design products alone in the retail channel, while the platform and the manufacturer complete the product design together in the direct online channel. These two models are analyzed using a game-theoretical model and numerical simulation. The findings indicate that if the manufacturers' design capabilities are not very high and the commission rate is not very low, the manufacturers will choose the direct online channel when the platform's technical efforts fall within an interval. When the platform's technical efforts are exogenous, they positively influence the manufacturers' decisions; however, in the endogenous case, the platform's effect on the manufacturers is reflected in the interaction of the commission rate and cost efficiency. The manufacturers and the platform should make synthetic effort decisions based on the manufacturer's development capabilities, the intensity of market competition, and the cost efficiency of the platform.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2015AA015308) and the State Key Development Program for Basic Research of China (No. 2014CB340402).
Abstract: Big data analytics is emerging as one of the most important kinds of workload in modern data centers. Hence, it is of great interest to identify how to achieve the best performance for big data analytics workloads running on state-of-the-art SMT (simultaneous multithreading) processors, which requires a comprehensive understanding of workload characteristics. This paper chooses Spark workloads as representative big data analytics workloads and performs comprehensive measurements on the POWER8 platform, which supports a wide range of multithreading. The research finds that the thread assignment policy and cache contention have significant impacts on application performance. To identify potential optimization methods from the experimental results, this study performs micro-architecture-level characterizations by means of hardware performance counters and gives implications accordingly.
Abstract: Big data analysis (BDA) can increase the supply chain analysis capability of manufacturing companies. Many manufacturing companies therefore want to use BDA, but its implementation has proven difficult, especially in developing countries, because of various barriers related to finance, government regulations, and so on. This paper aims to investigate the barriers to BDA implementation in Iranian companies. In the literature, limited work has been done on identifying barriers to implementing BDA in developing countries. In this regard, 34 barriers to BDA adoption in Iran were identified through a literature review and feedback received from experts. Then, the 14 most important barriers were analyzed using an integrated Interpretive Structural Modeling and MICMAC approach. Results show that two barriers, namely lack of sufficient knowledge among senior managers and weakness of governance policies, are the most significant. Finally, crucial policy measures and recommendations are proposed to assist managers and government bodies.
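As a rough illustration of the Interpretive Structural Modeling and MICMAC steps named above, the sketch below builds a reachability matrix from a direct-influence matrix by transitive closure, then computes each barrier's driving power (row sum) and dependence power (column sum). The three-barrier matrix is hypothetical and is not data from the study; the function names are likewise assumptions.

```python
def reachability(adjacency):
    """Transitive closure (Warshall's algorithm) with every element reaching itself."""
    n = len(adjacency)
    reach = [[1 if i == j or adjacency[i][j] else 0 for j in range(n)]
             for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = 1
    return reach

def micmac_powers(reach):
    """Driving power is the row sum; dependence power is the column sum."""
    driving = [sum(row) for row in reach]
    dependence = [sum(col) for col in zip(*reach)]
    return driving, dependence

# Hypothetical example: barrier 0 influences barrier 1, which influences barrier 2.
adjacency = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
reach = reachability(adjacency)
driving, dependence = micmac_powers(reach)
print(driving, dependence)
```

Barriers with high driving power and low dependence are the root causes in a MICMAC diagram, which is how a result such as "two barriers are the most significant" is typically read.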
Abstract: With the accelerating increase in business benefits produced by big data analytics (if used appropriately and intelligently by businesses in the private and public sectors), this study focused on empirically identifying big data analytics (BDA) attributes. These attributes were classified into four groups (i.e., value innovation, social impact, precision, and completeness of BDA quality) and were found to influence decision-making performance and business performance outcomes. A structural equation modeling analysis using 382 responses from BDA-related practitioners indicated that the attributes of representativeness, predictability, interpretability, and innovativeness, as related to value innovation, greatly enhanced the decision-making confidence and effectiveness of decision makers who make decisions using big data. In addition, individuality, collectivity, and willfulness, which are related to social impact, also greatly improved the decision-making confidence and effectiveness of the same decision makers. This shows that value innovation and social impact, which have received relatively little attention in previous studies, are crucial attributes of BDA quality because they influence decision-making performance. Comprehensiveness, factuality, and realism, which are linked to completeness, show similar results. Furthermore, the higher the decision-making confidence of the decision makers who used big data, the higher the financial performance of their companies. High decision-making confidence using big data was also found to improve nonfinancial performance metrics such as customer satisfaction and quality levels, as well as product development capabilities. High decision-making effectiveness with big data was likewise shown to improve the nonfinancial performance metrics.
Abstract: Big data analysis is confronted with the obstacle of high dimensionality in data samples. To address this issue, researchers have devised a multitude of intelligent optimization algorithms aimed at enhancing big data analysis techniques. Among these algorithms is War Strategy Optimization (WSO), proposed in 2022, which distinguishes itself from other intelligent algorithms through its potent optimization capabilities. Nevertheless, WSO has limited global search capacity and is susceptible to becoming trapped in local optima when dealing with high-dimensional problems. To overcome these shortcomings and improve the performance of WSO on the challenges posed by high dimensionality in big data, this paper introduces an enhanced version of WSO based on the carnivorous plant algorithm (CPA) and a shared niche. The grouping concept and update strategy of CPA are incorporated into WSO, and its update strategy is modified by introducing a shared-niche approach combined with an elite strategy to create a novel improved algorithm. Simulation experiments were conducted to compare this new War Strategy Optimization (CSWSO) with WSO, RKWSO, I-GWO, NCHHO, and FDB-SDO using 16 test functions. Experimental results demonstrate that the proposed enhanced algorithm exhibits superior optimization accuracy and stability, providing a novel approach to the challenges posed by high dimensionality in big data.
Funding: Supported by the European Community under grant agreement 283610 EarthServer.
Abstract: Big Data Analytics is an emerging field since massive storage and computing capabilities have been made available by advanced e-infrastructures. Earth and Environmental sciences are likely to benefit from Big Data Analytics techniques supporting the processing of the large number of Earth Observation datasets currently acquired and generated through observations and simulations. However, Earth Science data and applications present specificities in terms of relevance of the geospatial information, wide heterogeneity of data models and formats, and complexity of processing. Therefore, Big Earth Data Analytics requires specifically tailored techniques and tools. The EarthServer Big Earth Data Analytics engine offers a solution for coverage-type datasets, built around a high-performance array database technology and the adoption and enhancement of standards for service interaction (OGC WCS and WCPS). The EarthServer solution, led by the collection of requirements from scientific communities and international initiatives, provides a holistic approach that ranges from query languages and scalability up to mobile access and visualization. The result is demonstrated and validated through the development of lighthouse applications in the Marine, Geology, Atmospheric, Planetary, and Cryospheric science domains.
Abstract: Over the past few decades, with the development of automatic identification, data capture, and storage technologies, people have been generating data much faster and collecting far more of it than ever before in business, science, engineering, education, and other areas. Big data has emerged as an important area of study for both practitioners and researchers, and it has huge impacts on data-related problems. In this paper, we identify the key issues related to big data analytics and then investigate its applications specifically related to business problems.
Abstract: Big data has attracted much attention from academia and industry, but the discussion of big data is disparate, fragmented, and distributed among different outlets. This paper conducts a systematic and extensive review of 186 journal publications about big data from 2011 to 2015 in the Science Citation Index (SCI) and the Social Science Citation Index (SSCI) databases, aiming to provide scholars and practitioners with a comprehensive overview and big picture of research on big data. The selected papers are grouped into 20 research categories. The contents of the paper(s) in each research category are summarized, and research directions for each category are outlined as well. The results of this study indicate that the selected papers were mainly published between 2013 and 2015 and focus on technological issues regarding big data. Diverse new approaches, methods, frameworks, and systems are proposed in the selected papers for data collection, storage, transport, processing, and analysis. Possible directions for future research on big data are discussed.
Abstract: In this paper, recent developments in the Internet of Things (IoT) and its applications are surveyed, and the impact of newly developed Big Data (BD) on manufacturing information systems is discussed in particular. Big Data analytics (BDA) has been identified as a critical technology to support data acquisition, storage, and analytics in data management systems in modern manufacturing. The purpose of the presented work is to clarify the requirements of predictive systems and to identify research challenges and opportunities in BDA to support cloud-based information systems.
Abstract: In this work, we design a multisensory IoT-based online vitals monitor (hereinafter referred to as the VITALS) to sense four bedside physiological parameters: pulse (heart) rate, body temperature, blood pressure, and peripheral oxygen saturation. The proposed system constantly transfers these signals to the analytics system, which aids in enhancing diagnostics at an earlier stage as well as monitoring after recovery. The core hardware of the VITALS includes commercial off-the-shelf sensing devices/medical equipment, a powerful microcontroller, a reliable wireless communication module, and a big data analytics system. It extracts human vital signs at a pre-programmed interval of 30 min and sends them to the big data analytics system through the WiFi module for further analysis. We use Apache Kafka (to gather live data streams from connected sensors), Apache Spark (to categorize the patient vitals and notify medical professionals when abnormalities in physiological parameters are identified), Hadoop Distributed File System (HDFS) (to archive data streams for further analysis and long-term storage), and Spark SQL, Hive, and Matplotlib (to help caregivers access and visualize appropriate information from collected data streams and explore and understand the health status of individuals). In addition, we develop a mobile application to send statistical graphs to doctors and patients so that they can monitor health conditions remotely. Our proposed system was deployed with three patients for 7 days to check the effectiveness of the sensing, data processing, and data transmission mechanisms. To validate the system accuracy, we compare the data values collected from established sensors with the readouts measured using a commercial healthcare monitor, the Welch Allyn® Spot Check. Our proposed system provides improved care solutions, especially for those whose access to care services is limited.
Abstract: With more and more data being generated, it has become a big challenge for traditional architectures and infrastructures to process large amounts of data within acceptable time and resource limits. To efficiently extract value from these data, organizations need to find new tools and methods specialized for big data processing. For this reason, big data analytics has become a key factor for companies to reveal hidden information and achieve competitive advantages in the market. Currently, the enormous number of publications on big data analytics makes it difficult for practitioners and researchers to find topics they are interested in and stay up to date. This paper aims to present an overview of the content, scope, and findings of big data analytics as well as the opportunities provided by its application.
Abstract: The availability of digital technology in the hands of citizens worldwide makes available an unprecedented, massive amount of data. The capability to process these gigantic amounts of data in real time with Big Data Analytics (BDA) tools and Machine Learning (ML) algorithms carries many benefits. However, the high number of free BDA tools, platforms, and data mining tools makes it challenging to select the appropriate one for the right task. This paper presents a comprehensive mini literature review of ML in BDA. Using a keyword search, a total of 1512 published articles were identified. The articles were screened down to 140 based on the study's proposed novel taxonomy. The study outcome shows that deep neural networks (15%), support vector machines (15%), artificial neural networks (14%), decision trees (12%), and ensemble learning techniques (11%) are widely applied in BDA. The related application fields, challenges, and, most importantly, the openings for future research are detailed.
Abstract: In this study, the key drivers of sustainability commitment, green supply chain management, big data integration, and green human resource practice are explored, and the impact of these sustainable capabilities on the environmental and financial performance of banks is elaborated. In addition, the influence of green management practices on integrating big data technology into operations is presented. The concept of dynamic capability is used to propose and empirically test conceptual models. Data were collected through a self-administered survey questionnaire from 317 people working in 37 banks in six Asian countries. The research suggests that big data analytics strategies have an impact on internal processes and on the stability and financial performance of banks. It also indicates that banks are committed to proper monitoring of their customers' data in order to achieve operational efficiency and sustainability goals. Furthermore, our results show that banks practicing Green Innovation strategies experience better environmental and economic performance because their employees are already trained in Green HR. Finally, our study found that internal and external green supply chain management practices have a positive effect on the environmental and financial performance of banks, thus ensuring that banks in the Association of Southeast Asian Nations (ASEAN) mitigate the environmental impact of their operations and ultimately experience an increase in financial performance.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2017YFB1001804.
Abstract: Big data analytics applications are increasingly deployed on cloud computing infrastructures, and picking the optimal cloud configuration in a cost-effective way is still a big challenge. In this paper, we address this problem with high accuracy and low overhead. We propose Apollo, a data-driven approach that can rapidly pick the optimal cloud configurations by reusing data from similar workloads. We first classify 12 typical workloads in BigDataBench by characterizing pairwise correlations in our offline benchmarks. When a new workload arrives, we run it with several small datasets to rank its key characteristics and find its similar workloads. Based on the rank, we then limit the search space of cloud configurations through a classification mechanism. Finally, we leverage a hierarchical regression model to measure which cluster is more suitable and use a local search strategy to pick the optimal cloud configurations in a few extra tests. Our evaluation on 12 typical workloads in HiBench shows that, compared with state-of-the-art approaches, Apollo can improve search accuracy by up to 30% while reducing the overhead of picking the optimal cloud configurations by as much as 50%.
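The step of finding workloads similar to a new one can be sketched with a plain correlation measure. The abstract only says that pairwise correlations are characterized, so the use of Pearson correlation here is an assumption, and the workload names, profile vectors, and function names are hypothetical rather than taken from Apollo.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def most_similar(new_profile, known_profiles):
    """Return the name of the known workload whose profile correlates best."""
    return max(known_profiles,
               key=lambda name: pearson(new_profile, known_profiles[name]))

# Hypothetical normalized characteristics, e.g. CPU, memory, and I/O intensity.
known_profiles = {
    "sort": [0.9, 0.2, 0.7],
    "pagerank": [0.3, 0.8, 0.4],
}
new_profile = [0.8, 0.3, 0.6]
match = most_similar(new_profile, known_profiles)
print(match)
```

Configurations that performed well for the matched workload would then seed a narrower search, which is where the claimed overhead savings would come from.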