Big data analytics has been widely adopted by large companies to achieve measurable benefits, including increased profitability, customer demand forecasting, cheaper product development, and improved stock control. Small and medium-sized enterprises (SMEs) are the backbone of the global economy, comprising 90% of businesses worldwide. However, only 10% of SMEs have adopted big data analytics despite the competitive advantage they could achieve. Previous research has analysed the barriers to adoption, and a strategic framework has been developed to help SMEs adopt big data analytics. The framework was converted into a scoring tool, which has been applied to multiple case studies of SMEs in the UK. This paper documents the process of evaluating the framework based on structured feedback from a focus group composed of experienced practitioners. The results of the evaluation are presented and discussed, and the paper concludes with recommendations to improve the scoring tool based on the proposed framework. The research demonstrates that this positioning tool helps SMEs achieve competitive advantage by increasing their application of business intelligence and big data analytics.
An anisotropic diffusion filter can be used to model a flow-dependent background error covariance matrix, which can be achieved by solving the advection-diffusion equation. Because of the directionality of the advection term, the discretization method needs to be chosen very carefully. The finite analytic method is an alternative scheme for solving the advection-diffusion equation. As a combination of analytical and numerical methods, it not only has high calculation accuracy but also has an automatic upwind property. To demonstrate its ability, one-dimensional steady and unsteady advection-diffusion numerical examples are solved with the finite analytic method. The more widely used upwind difference method serves as the control approach. The results indicate that the finite analytic method has higher accuracy than the upwind difference method. In the two-dimensional case, the finite analytic method still performs better. In the three-dimensional variational assimilation experiment, the finite analytic method effectively improves the accuracy of the analysis field, and its effect is significantly better than that of the upwind difference and central difference methods. Moreover, it remains the more effective solution method in the strong-flow region, where the advection-diffusion filter performs most prominently.
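The comparison described in this abstract can be illustrated on the simplest case, the one-dimensional steady advection-diffusion equation u·φ' = D·φ'' on [0, 1] with φ(0) = 0 and φ(1) = 1. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the constant coefficients, the boundary values, the grid size, and the function names `solve_three_point` and `max_errors` are all chosen here for illustration. The "finite analytic" coefficients are obtained by fitting the local analytic solution a + b·exp(u·x/D) to three neighbouring nodes, which is exact for this constant-coefficient problem.

```python
import math


def solve_three_point(c_w, c_e, n):
    """Solve phi_i = c_w*phi_{i-1} + c_e*phi_{i+1} for i = 1..n-1
    with phi_0 = 0 and phi_n = 1, using the Thomas algorithm."""
    m = n - 1                      # number of interior unknowns
    rhs = [0.0] * m
    rhs[-1] = c_e                  # contribution of the boundary value phi_n = 1
    cp = [0.0] * m                 # modified super-diagonal
    dp = [0.0] * m                 # modified right-hand side
    cp[0] = -c_e
    dp[0] = rhs[0]
    for i in range(1, m):
        denom = 1.0 + c_w * cp[i - 1]
        cp[i] = -c_e / denom
        dp[i] = (rhs[i] + c_w * dp[i - 1]) / denom
    phi = [0.0] * m
    phi[-1] = dp[-1]
    for i in range(m - 2, -1, -1):
        phi[i] = dp[i] - cp[i] * phi[i + 1]
    return [0.0] + phi + [1.0]


def max_errors(u=20.0, d=1.0, n=10):
    """Return (upwind error, finite analytic error) in the max norm.
    u, d and n are illustrative values, not taken from the paper."""
    h = 1.0 / n
    p = u * h / d                  # cell Peclet number
    exact = [(math.exp(u / d * i * h) - 1.0) / (math.exp(u / d) - 1.0)
             for i in range(n + 1)]
    # First-order upwind difference for the advection term (u > 0).
    upwind = solve_three_point((1.0 + p) / (2.0 + p), 1.0 / (2.0 + p), n)
    # Coefficients from the local analytic solution; the upstream weight
    # tends to 1 as p grows, which is the automatic upwind behaviour.
    e = math.exp(p)
    analytic = solve_three_point(e / (1.0 + e), 1.0 / (1.0 + e), n)

    def err(num):
        return max(abs(a - b) for a, b in zip(num, exact))

    return err(upwind), err(analytic)


if __name__ == "__main__":
    up_err, fa_err = max_errors()
    print(f"upwind max error:          {up_err:.3e}")
    print(f"finite analytic max error: {fa_err:.3e}")
```

In this constant-coefficient setting the locally fitted scheme reproduces the exact nodal values up to round-off, while the upwind scheme smears the boundary layer. The unsteady, two-dimensional, and assimilation experiments in the paper are beyond what this sketch covers.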
With the advent of digital therapeutics (DTx), the development of software as a medical device (SaMD) for mobile and wearable devices has gained significant attention in recent years. Existing DTx evaluations, such as randomized clinical trials, mostly focus on verifying the effectiveness of DTx products. To acquire a deeper understanding of DTx engagement and behavioral adherence, beyond efficacy, a large amount of contextual and interaction data from mobile and wearable devices during field deployment would be required for analysis. In this work, the overall flow of data-driven DTx analytics is reviewed to help researchers and practitioners explore DTx datasets, investigate contextual patterns associated with DTx usage, and establish the (causal) relationship between DTx engagement and behavioral adherence. This review of the key components of data-driven analytics provides novel research directions in the analysis of mobile sensor and interaction datasets, which helps to iteratively improve the receptivity of existing DTx.
Data breaches have massive consequences for companies, affecting them financially and undermining their reputation, which poses significant challenges to online security and the long-term viability of businesses. This study analyzes trends in data breaches in the United States, examining the frequency, causes, and magnitude of breaches across various industries. We document that data breaches are increasing, with hacking emerging as the leading cause. Our descriptive analyses explore factors influencing breaches, including security vulnerabilities, human error, and malicious attacks. The findings provide policymakers and businesses with actionable insights to bolster data security through proactive audits, patching, encryption, and response planning. By better understanding breach patterns and risk factors, organizations can take targeted steps to enhance protections and mitigate the potential damage of future incidents.
Gestational Diabetes Mellitus (GDM) is a significant health concern affecting pregnant women worldwide. It is characterized by elevated blood sugar levels during pregnancy and poses risks to both maternal and fetal health. Maternal complications of GDM include an increased risk of developing type 2 diabetes later in life, as well as hypertension and preeclampsia during pregnancy. Fetal complications may include macrosomia (large birth weight), birth injuries, and an increased risk of developing metabolic disorders later in life. Understanding the demographics, risk factors, and biomarkers associated with GDM is crucial for effective management and prevention strategies. This research aims to address these aspects comprehensively through the analysis of a dataset comprising 600 pregnant women. By exploring the demographics of the dataset and employing data modeling techniques, the study seeks to identify key risk factors associated with GDM. Moreover, by analyzing various biomarkers, the research aims to gain insights into the physiological mechanisms underlying GDM and its implications for maternal and fetal health. The significance of this research lies in its potential to inform clinical practice and public health policies related to GDM. By identifying demographic patterns and risk factors, healthcare providers can better tailor screening and intervention strategies for pregnant women at risk of GDM. Additionally, insights into biomarkers associated with GDM may contribute to the development of novel diagnostic tools and therapeutic approaches. Ultimately, by enhancing our understanding of GDM, this research aims to improve maternal and fetal outcomes and reduce the burden of this condition on healthcare systems and society.
However, it’s important to acknowledge the limitations of the dataset used in this study. Further research utilizing larger and more diverse datasets, perhaps employing advanced data analysis tools such as Power BI, is warranted to corroborate and expand upon the findings of this research. This underscores the ongoing need for continued investigation into GDM to refine our understanding and improve clinical management strategies.
Internet of Things (IoT) applications now involve massive volumes of structured and unstructured data, which raises numerous problems in data organization, production, and capture. To address these shortcomings, big data analytics is the most suitable technology to adopt. Even though big data and IoT could make human life more convenient, those benefits come at the expense of security. To manage these kinds of threats, intrusion detection systems (IDS) have been extensively applied to identify malicious network traffic, particularly once preventive techniques fail at the level of endpoint IoT devices. As cyberattacks targeting IoT have gradually become stealthier and more sophisticated, IDS must continually evolve to manage emerging security threats. This study devises a Big Data Analytics with Internet of Things Assisted Intrusion Detection using Modified Buffalo Optimization Algorithm with Deep Learning (IDMBOA-DL) algorithm. In the presented IDMBOA-DL model, the Hadoop MapReduce tool is used to manage big data. The MBOA algorithm is applied to select an optimal subset of features. Finally, the sine cosine algorithm (SCA) with a convolutional autoencoder (CAE) mechanism is utilized to recognize and classify intrusions in the IoT network. A wide range of simulations was conducted to demonstrate the enhanced results of the IDMBOA-DL algorithm. The comparison outcomes showed that the IDMBOA-DL model performs better than other approaches.
In recent years, huge volumes of healthcare data have been generated in various forms. Advances in medical imaging have been tremendous, making biomedical image acquisition easier and quicker. Because of this massive generation of big data, new methods based on Big Data Analytics (BDA), Machine Learning (ML), and Artificial Intelligence (AI) have become essential. In this context, the current research work develops a new Big Data Analytics with Cat Swarm Optimization based Deep Learning (BDA-CSODL) technique for medical image classification in the Apache Spark environment. The aim of the proposed BDA-CSODL technique is to classify medical images and diagnose disease accurately. The BDA-CSODL technique involves several stages of operation: preprocessing, segmentation, feature extraction, and classification. In addition, the BDA-CSODL technique follows a multi-level thresholding-based image segmentation approach for detecting infected regions in medical images. Moreover, a deep convolutional neural network-based Inception v3 method is used as the feature extractor. A Stochastic Gradient Descent (SGD) model is used for parameter tuning. Furthermore, a CSO with Long Short-Term Memory (CSO-LSTM) model is employed as the classifier to assign the appropriate class labels. Both the SGD and CSO design approaches help improve the overall image classification performance of the proposed BDA-CSODL technique. A wide range of simulations was conducted on benchmark medical image datasets, and the comprehensive comparative results demonstrate the superiority of the proposed BDA-CSODL technique under different measures.
Big Data applications face different types of complexity in classification. Cleaning and purifying data by eliminating irrelevant or redundant data becomes a complex operation when discriminative features must be maintained in the processed data. Existing schemes have many disadvantages, including continuity in training, the need for more samples and training time in feature selection, and increased classification execution times. Recently, ensemble methods have made a mark in classification tasks because they combine multiple results into a single representation. Compared with a single model, this technique offers improved prediction. Ensemble-based feature selection parallels multiple experts' judgments on a single topic. The major goal of this research is to propose HEFSM (Heterogeneous Ensemble Feature Selection Model), a hybrid approach that combines multiple algorithms. Further, the individual outputs of methods that produce feature subsets, rankings, or votes are also combined in this work. A KNN (K-Nearest Neighbor) classifier is used to classify the big dataset obtained from the ensemble learning approach. The results of the study are good, demonstrating the proposed model's efficiency in classification in terms of precision, recall, F-measure, and accuracy.
Climate change and global warming result in natural hazards, including flash floods. Flash floods can create blue spots: areas where transport networks (roads, tunnels, bridges, passageways) and other engineering structures within them are at flood risk. The economic and social impact of flooding shows that the damage caused by flash floods leading to blue spots is very high, both in dollar terms and in direct impacts on people’s lives. The impact of flooding within blue spots is either infrastructural or social, affecting lives and properties. Currently, more than 16.1 million properties in the U.S. are vulnerable to flooding, and this is projected to increase by 3.2% within the next 30 years. Several models have been developed for flood risk analysis and management, including hydrological models, algorithms, machine learning models, and geospatial models. The models and methods reviewed are based on location data collection, statistical analysis and computation, and visualization (mapping). This research aims to create a blue spots model for the State of Tennessee using the ArcGIS visual programming language (model) and a data analytics pipeline.
We are living in an age of big data, analytics, and artificial intelligence (AI). After reviewing a dozen different books on big data, data analytics, data science, AI, and business intelligence (BI), the following questions arise: (1) What are the relationships between data, analytics, and intelligence? (2) What are the relationships between big data and big data analytics? (3) What is the relationship between BI and data analytics? This article first discusses the heuristics of the Greek philosopher Plato and the French mathematician Descartes and how to reshape the world. Then it addresses the above questions based on a Boolean structure, which decomposes big data, data analytics, data science, and AI into data, analytics, and intelligence as the Boolean atoms. Data, analytics, and intelligence are reorganized and reassembled, based on the Boolean structure, into data analytics, analytics intelligence, data intelligence, and data analytics intelligence. The research analyses each of them after examining system intelligence. The approach proposed in this research might facilitate the research and development of big data, data analytics, AI, and data science.
The intermediate link compression characteristics of e-commerce express logistics networks influence the traditional mode of circulation of goods and economic organization, and alter the city spatial pattern. Based on the theory of the space of flows, this study adopts China Smart Logistics Network relational data to build China's e-commerce express logistics network and explores its spatial structure characteristics through social network analysis (SNA), the PageRank technique, and geospatial methods. The results are as follows. The network density is 0.9270, which is close to 1, indicating that e-commerce express logistics lines between Chinese cities are nearly complete and form a typical network structure, thereby eliminating fragmented spaces. Moreover, the average minimum number of edges is 1.1375, which indicates that the network has a small-world effect and thus a high flow efficiency of logistics elements. A significant hierarchical diffusion effect was observed in dominant flows with the highest edge weights. A diamond-structured network was formed with Shanghai, Guangzhou, Chongqing, and Beijing as the four core nodes. Other node cities with a large logistics scale and importance in the network are mainly located in the 19 city agglomerations of China, revealing that the development of city agglomerations is essential for promoting the separation of experience space and changing the urban spatial pattern. This study enriches the theory of urban networks, reveals the flow laws of modern logistics elements, and encourages coordinated development of urban logistics.
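Two of the measures named in this abstract, network density and PageRank, can be sketched on a toy directed graph. This is only an illustrative sketch: the four placeholder nodes and their edge weights are invented here and are not the China Smart Logistics Network data, the directed-graph definition of density (edges divided by n(n-1)) is an assumption about how the study computed it, and the damping factor 0.85 is the conventional default rather than a value reported in the paper.

```python
def density(nodes, edges):
    """Density of a directed graph without self-loops: |E| / (n * (n - 1))."""
    n = len(nodes)
    return len(edges) / (n * (n - 1))


def pagerank(nodes, weights, damping=0.85, iters=100):
    """Weighted PageRank by power iteration.
    weights maps (source, target) -> non-negative edge weight."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out = {v: 0.0 for v in nodes}          # total outgoing weight per node
    for (src, _dst), w in weights.items():
        out[src] += w
    for _ in range(iters):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out[v] == 0.0)
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for (src, dst), w in weights.items():
            new[dst] += damping * rank[src] * w / out[src]
        rank = new
    return rank


if __name__ == "__main__":
    # Hypothetical toy network; the labels are placeholders, not real cities.
    nodes = ["A", "B", "C", "D"]
    weights = {("A", "B"): 1.0, ("C", "B"): 1.0, ("D", "B"): 1.0, ("B", "A"): 1.0}
    print("density:", density(nodes, weights))
    for node, score in sorted(pagerank(nodes, weights).items(),
                              key=lambda kv: -kv[1]):
        print(node, round(score, 4))
```

A density close to 1, as reported in the abstract, would mean that almost every ordered pair of cities is directly connected. PageRank then helps distinguish core nodes by the weight of the flows they receive.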
This paper focuses on facilitating state-of-the-art applications of big data analytics (BDA) architectures and infrastructures in the telecommunications (telecom) industrial sector. Telecom companies deal with terabytes to petabytes of data on a daily basis. IoT applications in telecom are further contributing to this data deluge. Recent advances in BDA have exposed new opportunities to get actionable insights from telecom big data. These benefits and the fast-changing BDA technology landscape make it important to investigate existing BDA applications in the telecom sector. For this, we first identify published research on BDA applications in telecom through a systematic literature review, from which we filter 38 articles and categorize them into frameworks, use cases, literature reviews, white papers, and experimental validations. We also discuss the benefits and challenges mentioned in these articles. We find that the experiments are all proofs of concept (POC) on a severely limited BDA technology stack (compared with the available technology stack), i.e., we did not find any work focusing on a full-fledged BDA implementation in an operational telecom environment. To facilitate these applications at the research level, we propose a state-of-the-art lambda architecture for BDA pipeline implementation (called LambdaTel) based completely on open-source BDA technologies and the standard Python language, along with relevant guidelines. We discovered only one research paper that presented a relatively limited lambda architecture, using the proprietary AWS cloud infrastructure. We believe LambdaTel presents a clear roadmap for telecom industry practitioners to implement and enhance BDA applications in their enterprises.
The advent of healthcare information management systems (HIMSs) continues to produce large volumes of healthcare data for patient care and for compliance and regulatory requirements at a global scale. Analysis of this big data offers boundless potential for discovering knowledge. Big data analytics (BDA) in healthcare can, for instance, help determine causes of diseases, generate effective diagnoses, enhance QoS guarantees by increasing the efficiency of healthcare delivery and the effectiveness and viability of treatments, generate accurate predictions of readmissions, enhance clinical care, and pinpoint opportunities for cost savings. However, BDA implementations in any domain are generally complicated and resource-intensive, with a high failure rate and no roadmap or success strategies to guide practitioners. In this paper, we present a comprehensive roadmap to derive insights from BDA in the healthcare (patient care) domain, based on the results of a systematic literature review. We first determine big data characteristics for healthcare and then review BDA applications to healthcare in academic research, focusing particularly on NoSQL databases. We also identify the limitations and challenges of these applications and justify the potential of NoSQL databases to address these challenges and further enhance BDA healthcare research. We then propose and describe a state-of-the-art BDA architecture for the healthcare domain, called Med-BDA, which solves all current BDA challenges and is based on the latest zeta big data paradigm. We also present success strategies to ensure the working of Med-BDA, along with an outline of the major benefits of BDA applications to healthcare. Finally, we compare our work with other related literature reviews across twelve hallmark features to justify the novelty and importance of our work. The aforementioned contributions of our work are collectively unique and clearly present a roadmap for clinical administrators, practitioners, and professionals to successfully implement BDA initiatives in their organizations.
This pioneering research represents a unique and singular study conducted within the United States, with a specific focus on non-technical graduate students pursuing degrees in business analytics. The primary impetus behind this study stems from the escalating demand for data-driven professionals, the diverse academic backgrounds of students, the imperative for adaptable pedagogical methods, the ever-evolving landscape of curriculum designs, and the overarching commitment to fostering educational equity. To investigate these multifaceted dynamics, we employed a data collection method that included the distribution of an online survey on platforms such as LinkedIn. Our survey reached and engaged 74 graduate students actively pursuing degrees in Business Analytics within the United States. This comprehensive research is the first and only one of its kind conducted in this context, and it serves as a vanguard exploration into the challenges and influences that shape the learning journey of Python among non-technical graduate Business Analytics students. The analytical insights derived from this research underscore the pivotal role of hands-on learning strategies, exemplified by practice exercises and assignments. Moreover, the study highlights the positive and constructive influence of collaboration and peer support in the process of learning Python. These invaluable findings significantly augment the existing body of knowledge in the field of business analytics. Furthermore, they offer an essential resource for educators and institutions seeking to optimize the educational experiences of non-technical students as they acquire essential Python skills.
In this paper we aim to identify certain social factors that influence, and thus can be used to predict, the occurrence of crimes. The factors under consideration for this analysis are social demographics such as age, sex, and poverty, as well as train ridership, traffic density, and the number of business licenses per community area in Chicago, IL. A factor is considered pertinent if there is a high correlation between it and the number of crimes of a particular type in that community area.
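The pertinence criterion in this abstract (a factor matters if it correlates strongly with the crime count per community area) can be sketched as follows. This is a hedged illustration only: the abstract does not state which correlation measure or cutoff was used, so the Pearson coefficient, the 0.7 threshold, the function names, and all numbers below are assumptions made for the example and not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.
    Returns 0.0 when either sequence has zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0
    return cov / (sd_x * sd_y)


def pertinent_factors(factors, crimes, threshold=0.7):
    """Keep factors whose |correlation| with the crime counts reaches the
    threshold. The threshold is a hypothetical cutoff, not from the paper."""
    selected = {}
    for name, values in factors.items():
        r = pearson(values, crimes)
        if abs(r) >= threshold:
            selected[name] = r
    return selected


if __name__ == "__main__":
    # Made-up values for five hypothetical community areas.
    factors = {
        "poverty_rate": [10.0, 20.0, 30.0, 40.0, 50.0],
        "business_licenses": [5.0, 1.0, 4.0, 2.0, 3.0],
    }
    crimes = [12.0, 19.0, 33.0, 41.0, 48.0]
    for name, r in pertinent_factors(factors, crimes).items():
        print(f"{name}: r = {r:.3f}")
```

Correlation of this kind indicates association rather than causation, so a factor selected this way is a candidate predictor, in line with the paper's stated aim of prediction.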
In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations’ enterprise systems. This phenomenon provides organizations with unprecedented opportunities to tap into big data to mine valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is the illustration of the development of a novel big data stream analytics framework named BDSASA that leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research work is that organizations can apply our big data stream analytics framework to analyze consumers’ product preferences, and hence develop more effective marketing and production strategies.
Monitoring, understanding, and predicting origin-destination (OD) flows in a city is an important problem for city planning and human activity. Taxi GPS traces, a typical kind of crowd-sensed data, can be used to mine the semantics of OD flows. In this paper, we first construct and analyze a complex network of OD flows based on large-scale taxi GPS traces of a city in China. The spatiotemporal analysis of the OD flow complex network shows that there are distinctive patterns in OD flows. Then, based on a novel complex network model, a semantics mining method for OD flows is proposed that compounds the Points of Interest (POI) network and the public transport network with the OD flow network. The proposed method would offer a novel way to predict location characteristics and future traffic conditions accurately.
In recent years we have witnessed tremendous growth in the volume and availability of data. This results primarily from the emergence of a multitude of sources (e.g., computers, mobile devices, sensors, or social networks) that continuously produce structured, semi-structured, or unstructured data. Database Management Systems and Data Warehouses are no longer the only technologies used to store and analyze datasets, mainly because the volume and complex structure of today's data degrade their performance and scalability. Big Data is one of the recent challenges, since it implies new requirements in terms of data storage, processing, and visualization. Nevertheless, properly analyzing Big Data can bring great advantages because it allows patterns and correlations in datasets to be discovered. Users can use this processed information to gain deeper insights and business advantages. Thus, data modeling and data analytics have evolved so that we are able to process huge amounts of data without compromising performance and availability, but instead by “relaxing” the usual ACID properties. This paper provides a broad view and discussion of the current state of this subject, with a particular focus on data modeling and data analytics, describing and clarifying the main differences between the three main approaches in these respects, namely operational databases, decision support databases, and Big Data technologies.
Big Data and Data Analytics affect almost all aspects of modern organisations’ decision-making and business strategies. Big Data and Data Analytics create opportunities, challenges, and implications for the external auditing procedure. The purpose of this article is to reveal essential aspects of the impact of Big Data and Data Analytics on external auditing. It seems that Big Data Analytics is a critical tool for organisations, as well as auditors, that contributes to the enhancement of the auditing process. Also, legislative implications must be taken into consideration, since existing standards may need to change. Lastly, auditors need to develop new skills and competence, and educational organisations need to change their educational programs in order to respond to new market needs.
Funding (finite analytic method study of the advection-diffusion equation): The National Key Research and Development Program of China under contract Nos. 2022YFC3104804, 2021YFC3101501, and 2017YFC1404103; the National Programme on Global Change and Air-Sea Interaction of China under contract No. GASI-IPOVAI-04; the National Natural Science Foundation of China under contract Nos. 41876014, 41606039, and 11801402.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Korea government (MSIT) (2020R1A4A1018774).
Abstract: With the advent of digital therapeutics (DTx), the development of software as a medical device (SaMD) for mobile and wearable devices has gained significant attention in recent years. Existing DTx evaluations, such as randomized clinical trials, mostly focus on verifying the effectiveness of DTx products. To acquire a deeper understanding of DTx engagement and behavioral adherence, beyond efficacy, a large amount of contextual and interaction data from mobile and wearable devices during field deployment would be required for analysis. In this work, the overall flow of data-driven DTx analytics is reviewed to help researchers and practitioners explore DTx datasets, investigate contextual patterns associated with DTx usage, and establish the (causal) relationship between DTx engagement and behavioral adherence. This review of the key components of data-driven analytics provides novel research directions in the analysis of mobile sensor and interaction datasets, which helps to iteratively improve the receptivity of existing DTx.
Abstract: Data breaches have massive consequences for companies, affecting them financially and undermining their reputation, which poses significant challenges to online security and the long-term viability of businesses. This study analyzes trends in data breaches in the United States, examining the frequency, causes, and magnitude of breaches across various industries. We document that data breaches are increasing, with hacking emerging as the leading cause. Our descriptive analyses explore factors influencing breaches, including security vulnerabilities, human error, and malicious attacks. The findings provide policymakers and businesses with actionable insights to bolster data security through proactive audits, patching, encryption, and response planning. By better understanding breach patterns and risk factors, organizations can take targeted steps to enhance protections and mitigate the potential damage of future incidents.
Abstract: Gestational Diabetes Mellitus (GDM) is a significant health concern affecting pregnant women worldwide. It is characterized by elevated blood sugar levels during pregnancy and poses risks to both maternal and fetal health. Maternal complications of GDM include an increased risk of developing type 2 diabetes later in life, as well as hypertension and preeclampsia during pregnancy. Fetal complications may include macrosomia (large birth weight), birth injuries, and an increased risk of developing metabolic disorders later in life. Understanding the demographics, risk factors, and biomarkers associated with GDM is crucial for effective management and prevention strategies. This research aims to address these aspects comprehensively through the analysis of a dataset comprising 600 pregnant women. By exploring the demographics of the dataset and employing data modeling techniques, the study seeks to identify key risk factors associated with GDM. Moreover, by analyzing various biomarkers, the research aims to gain insights into the physiological mechanisms underlying GDM and its implications for maternal and fetal health. The significance of this research lies in its potential to inform clinical practice and public health policies related to GDM. By identifying demographic patterns and risk factors, healthcare providers can better tailor screening and intervention strategies for pregnant women at risk of GDM. Additionally, insights into biomarkers associated with GDM may contribute to the development of novel diagnostic tools and therapeutic approaches. Ultimately, by enhancing our understanding of GDM, this research aims to improve maternal and fetal outcomes and reduce the burden of this condition on healthcare systems and society. However, it is important to acknowledge the limitations of the dataset used in this study. Further research utilizing larger and more diverse datasets, perhaps employing advanced data analysis tools such as Power BI, is warranted to corroborate and expand upon the findings of this research. This underscores the need for continued investigation into GDM to refine our understanding and improve clinical management strategies.
Abstract: Internet of Things (IoT) applications must now handle millions of structured and unstructured data items, which raises numerous problems such as data organization, production, and capture. To address these shortcomings, big data analytics is the most suitable technology to adopt. Even though big data and IoT could make human life more convenient, those benefits come at the expense of security. To manage these threats, intrusion detection systems (IDS) have been extensively applied to identify malicious network traffic, particularly once preventive techniques fail at the level of endpoint IoT devices. As cyberattacks targeting IoT have gradually become stealthier and more sophisticated, IDS must continually evolve to manage emerging security threats. This study devises a Big Data Analytics with the Internet of Things Assisted Intrusion Detection using Modified Buffalo Optimization Algorithm with Deep Learning (IDMBOA-DL) algorithm. In the presented IDMBOA-DL model, the Hadoop MapReduce tool is used to manage big data. The MBOA algorithm is applied to select an optimal subset of features. Finally, the sine cosine algorithm (SCA) with a convolutional autoencoder (CAE) mechanism is used to recognize and classify intrusions in the IoT network. A wide range of simulations was conducted to demonstrate the enhanced results of the IDMBOA-DL algorithm. The comparison outcomes showed that the IDMBOA-DL model performs better than other approaches.
Funding: The author extends his appreciation to the Deanship of Scientific Research at Majmaah University for funding this study under Project Number (R-2022-61).
Abstract: In recent years, huge volumes of healthcare data have been generated in various forms. Advances in medical imaging have been tremendous, making biomedical image acquisition easier and quicker. Owing to this massive generation of big data, new methods based on Big Data Analytics (BDA), Machine Learning (ML), and Artificial Intelligence (AI) have become essential. In this context, the current research work develops a new Big Data Analytics with Cat Swarm Optimization based Deep Learning (BDA-CSODL) technique for medical image classification in the Apache Spark environment. The aim of the proposed BDA-CSODL technique is to classify medical images and diagnose disease accurately. The BDA-CSODL technique involves several stages of operation: preprocessing, segmentation, feature extraction, and classification. In addition, the BDA-CSODL technique follows a multi-level thresholding-based image segmentation approach to detect infected regions in the medical image. Moreover, a deep convolutional neural network-based Inception v3 method is used as the feature extractor, and a Stochastic Gradient Descent (SGD) model is used for parameter tuning. Furthermore, a CSO with Long Short-Term Memory (CSO-LSTM) model is employed as the classification model to determine the appropriate class labels. Both the SGD and CSO design approaches help improve the overall image classification performance of the proposed BDA-CSODL technique. A wide range of simulations was conducted on benchmark medical image datasets, and the comprehensive comparative results demonstrate the superiority of the proposed BDA-CSODL technique under different measures.
Abstract: Big Data applications face different types of complexity in classification. Cleaning and purifying data by eliminating irrelevant or redundant data becomes a complex operation for big data applications when discriminative features must be maintained in the processed data. Existing schemes have many disadvantages, including continuity in training, more samples and training time in feature selection, and increased classification execution times. Recently, ensemble methods have made a mark in classification tasks because they combine multiple results into a single representation. Compared with a single model, this technique offers improved prediction. Ensemble-based feature selection parallels multiple experts' judgments on a single topic. The major goal of this research is to propose HEFSM (Heterogeneous Ensemble Feature Selection Model), a hybrid approach that combines multiple algorithms. Further, the individual outputs of methods that produce feature subsets, rankings, or votes are also combined in this work. A KNN (K-Nearest Neighbor) classifier is used to classify the big dataset obtained from the ensemble learning approach. The results of the study are good, demonstrating the proposed model's efficiency in classification in terms of the performance metrics used: precision, recall, F-measure, and accuracy.
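The abstract above mentions combining feature subsets, rankings, and votes from several selectors but does not say how. As a hedged illustration only, the sketch below uses two generic combination rules, mean-rank (Borda-style) aggregation and majority voting; both rules, the function names, and the feature labels `f1`..`f4` are assumptions for this example and are not taken from HEFSM.

```python
from collections import Counter


def aggregate_rankings(rankings):
    """Combine feature rankings (best first) by mean rank position; ties break by name."""
    features = rankings[0]
    mean_rank = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))


def majority_vote(subsets):
    """Keep the features chosen by more than half of the selectors."""
    votes = Counter(f for s in subsets for f in set(s))
    return sorted(f for f, v in votes.items() if v > len(subsets) / 2)


# Hypothetical outputs of three different feature selectors.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f3", "f1", "f2", "f4"],
    ["f1", "f3", "f4", "f2"],
]
subsets = [["f1", "f3"], ["f1", "f2"], ["f3", "f1", "f4"]]

combined_ranking = aggregate_rankings(rankings)
voted_subset = majority_vote(subsets)
```

The aggregated ranking or voted subset would then determine which columns are passed to a downstream classifier such as KNN.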
Abstract: Climate change and global warming result in natural hazards, including flash floods. Flash floods can create blue spots: areas where transport networks (roads, tunnels, bridges, passageways) and other engineering structures within them are at flood risk. Studies of the economic and social impact of flooding show that the damage caused by flash floods leading to blue spots is very high, both in dollar terms and in direct impacts on people's lives. The impact of flooding within blue spots is either infrastructural or social, affecting lives and properties. Currently, more than 16.1 million properties in the U.S. are vulnerable to flooding, and this is projected to increase by 3.2% within the next 30 years. Several models have been developed for flood risk analysis and management, including hydrological models, algorithms, and machine learning and geospatial models. The models and methods reviewed are based on location data collection, statistical analysis and computation, and visualization (mapping). This research aims to create a blue spots model for the State of Tennessee using the ArcGIS visual programming language (model) and a data analytics pipeline.
Funding: Supported partially by the Papua New Guinea Science and Technology Secretariat (PNGSTS) under project grant No. 1-3962 PNGSTS.
Abstract: We are living in an age of big data, analytics, and artificial intelligence (AI). After reviewing a dozen different books on big data, data analytics, data science, AI, and business intelligence (BI), the following questions remain: (1) What are the relationships between data, analytics, and intelligence? (2) What are the relationships between big data and big data analytics? (3) What is the relationship between BI and data analytics? This article first discusses the heuristics of the Greek philosopher Plato and the French mathematician Descartes and how they can be used to reshape the world. It then addresses the above questions based on a Boolean structure, which decomposes big data, data analytics, data science, and AI into data, analytics, and intelligence as the Boolean atoms. Data, analytics, and intelligence are reorganized and reassembled, based on the Boolean structure, into data analytics, analytics intelligence, data intelligence, and data analytics intelligence. The research analyses each of them after examining system intelligence. The proposed approach might facilitate the research and development of big data, data analytics, AI, and data science.
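The reassembly of the three atoms described above can be read as enumerating every combination of two or more atoms. The sketch below illustrates that reading only; it assumes the composites are plain combinations in the fixed order data, analytics, intelligence, which is an interpretation of the abstract rather than the article's formal definition.

```python
from itertools import combinations

# The three Boolean atoms named in the abstract.
ATOMS = ("data", "analytics", "intelligence")


def boolean_composites(atoms=ATOMS):
    """List every combination of two or more atoms, keeping the atoms' original order."""
    return [
        " ".join(combo)
        for size in range(2, len(atoms) + 1)
        for combo in combinations(atoms, size)
    ]


composites = boolean_composites()
```

With three atoms this yields exactly the four composites listed in the abstract: data analytics, data intelligence, analytics intelligence, and data analytics intelligence.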
Funding: Under the auspices of the National Natural Science Foundation of China (No. 42071165, 41801144) and the GDAS' Project of Science and Technology Development (No. 2023GDASZH-2023010101, 2021GDASYL-20210103004).
Abstract: The intermediate link compression characteristics of e-commerce express logistics networks influence the traditional mode of circulation of goods and economic organization, and alter the city spatial pattern. Based on the theory of the space of flows, this study adopts China Smart Logistics Network relational data to build China's e-commerce express logistics network and explore its spatial structure characteristics through social network analysis (SNA), the PageRank technique, and geospatial methods. The results are as follows. The network density is 0.9270, which is close to 1, indicating that e-commerce express logistics lines between Chinese cities are nearly complete and form a typical network structure, thereby eliminating fragmented spaces. Moreover, the average minimum number of edges is 1.1375, which indicates that the network has a small-world effect and thus a high flow efficiency of logistics elements. A significant hierarchical diffusion effect was observed in dominant flows with the highest edge weights. A diamond-structured network was formed with Shanghai, Guangzhou, Chongqing, and Beijing as the four core nodes. Other node cities with a large logistics scale and importance in the network are mainly located in the 19 city agglomerations of China, revealing that the development of city agglomerations is essential for promoting the separation of experience space and changing the urban spatial pattern. This study enriches the theory of urban networks, reveals the flow laws of modern logistics elements, and encourages the coordinated development of urban logistics.
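The three quantities reported above (network density, average minimum number of edges between city pairs, and PageRank) can be sketched on a toy graph. Everything below is an illustrative assumption: the graph is undirected and unweighted, the edge list is invented for the four core cities named in the abstract, and the damping factor 0.85 is the conventional default rather than a value from the study.

```python
from collections import deque


def build_adjacency(nodes, edges):
    """Undirected adjacency lists from a list of (city, city) pairs."""
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def density(nodes, edges):
    """Share of possible undirected links that are present."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1))


def average_shortest_path(adj):
    """Mean minimum number of edges over all ordered pairs (assumes a connected graph)."""
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from each source
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    n = len(adj)
    return total / (n * (n - 1))


def pagerank(adj, damping=0.85, iterations=100):
    """Power iteration; every node here has at least one neighbour, so no dangling mass."""
    n = len(adj)
    rank = {node: 1.0 / n for node in adj}
    for _ in range(iterations):
        rank = {
            node: (1 - damping) / n
            + damping * sum(rank[nb] / len(adj[nb]) for nb in adj[node])
            for node in adj
        }
    return rank


# Hypothetical toy network: all links among four cities except Guangzhou-Chongqing.
cities = ["Shanghai", "Guangzhou", "Chongqing", "Beijing"]
links = [
    ("Shanghai", "Guangzhou"), ("Shanghai", "Chongqing"), ("Shanghai", "Beijing"),
    ("Guangzhou", "Beijing"), ("Chongqing", "Beijing"),
]
graph = build_adjacency(cities, links)
toy_density = density(cities, links)
toy_path_length = average_shortest_path(graph)
toy_rank = pagerank(graph)
```

A density near 1 together with an average path length near 1 means that almost every city pair is directly connected, which is the pattern the abstract reports for the full network.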
Funding: Supported in part by the Big Data Analytics Laboratory (BDALAB) at the Institute of Business Administration under the research grant approved by the Higher Education Commission of Pakistan (www.hec.gov.pk) and the Darbi company (www.darbi.io).
Abstract: This paper focuses on facilitating state-of-the-art applications of big data analytics (BDA) architectures and infrastructures in the telecommunications (telecom) industrial sector. Telecom companies deal with terabytes to petabytes of data on a daily basis, and IoT applications in telecom are further contributing to this data deluge. Recent advances in BDA have exposed new opportunities to get actionable insights from telecom big data. These benefits and the fast-changing BDA technology landscape make it important to investigate existing BDA applications in the telecom sector. For this, we first identify published research on BDA applications to telecom through a systematic literature review, through which we filter 38 articles and categorize them into frameworks, use cases, literature reviews, white papers, and experimental validations. We also discuss the benefits and challenges mentioned in these articles. We find that the experiments are all proofs of concept (POC) on a severely limited BDA technology stack (compared to the available technology stack), i.e., we did not find any work focusing on a full-fledged BDA implementation in an operational telecom environment. To facilitate these applications at the research level, we propose a state-of-the-art lambda architecture for BDA pipeline implementation (called LambdaTel), based completely on open source BDA technologies and the standard Python language, along with relevant guidelines. We discovered only one research paper that presented a relatively limited lambda architecture, using the proprietary AWS cloud infrastructure. We believe LambdaTel presents a clear roadmap for telecom industry practitioners to implement and enhance BDA applications in their enterprises.
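For readers unfamiliar with the lambda architecture mentioned above, the following is a minimal, generic sketch of its three layers: a batch layer that recomputes a view from the full master dataset, a speed layer that covers events not yet in the batch view, and a serving step that merges the two. It is not the architecture proposed in the paper; the event fields, the per-cell counting task, and all function names are hypothetical.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute call counts per cell tower from the full history."""
    return Counter(event["cell"] for event in master_dataset)


def speed_view(recent_events):
    """Speed layer: count only the events that arrived after the last batch run."""
    return Counter(event["cell"] for event in recent_events)


def serve(batch, speed, cell):
    """Serving layer: answer a query by merging the batch and real-time views."""
    return batch.get(cell, 0) + speed.get(cell, 0)


# Hypothetical call records; "cell" is an invented field name.
history = [{"cell": "A"}, {"cell": "A"}, {"cell": "B"}]
arrivals = [{"cell": "A"}, {"cell": "C"}]

calls_on_a = serve(batch_view(history), speed_view(arrivals), "A")
calls_on_c = serve(batch_view(history), speed_view(arrivals), "C")
```

In a production pipeline each layer would run on dedicated open source components; the point of the sketch is only the split between a slow, complete view and a fast, incremental one.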
Funding: Supported by two research grants provided by the Karachi Institute of Economics and Technology (KIET) and the Big Data Analytics Laboratory at the Institute of Business Administration (IBA Karachi).
Abstract: The advent of healthcare information management systems (HIMSs) continues to produce large volumes of healthcare data for patient care and for compliance and regulatory requirements at a global scale. Analysis of this big data allows for boundless potential outcomes for discovering knowledge. Big data analytics (BDA) in healthcare can, for instance, help determine causes of diseases, generate effective diagnoses, enhance QoS guarantees by increasing the efficiency of healthcare delivery and the effectiveness and viability of treatments, generate accurate predictions of readmissions, enhance clinical care, and pinpoint opportunities for cost savings. However, BDA implementations in any domain are generally complicated and resource-intensive, with a high failure rate and no roadmap or success strategies to guide practitioners. In this paper, we present a comprehensive roadmap to derive insights from BDA in the healthcare (patient care) domain, based on the results of a systematic literature review. We first determine big data characteristics for healthcare and then review BDA applications to healthcare in academic research, focusing particularly on NoSQL databases. We also identify the limitations and challenges of these applications and justify the potential of NoSQL databases to address these challenges and further enhance BDA healthcare research. We then propose and describe a state-of-the-art BDA architecture called Med-BDA for the healthcare domain, which solves all current BDA challenges and is based on the latest zeta big data paradigm. We also present success strategies to ensure the working of Med-BDA, along with outlining the major benefits of BDA applications to healthcare. Finally, we compare our work with other related literature reviews across twelve hallmark features to justify the novelty and importance of our work. The aforementioned contributions of our work are collectively unique and clearly present a roadmap for clinical administrators, practitioners, and professionals to successfully implement BDA initiatives in their organizations.
Abstract: This pioneering research is a unique study conducted within the United States, with a specific focus on non-technical graduate students pursuing degrees in business analytics. The primary impetus behind this study stems from the escalating demand for data-driven professionals, the diverse academic backgrounds of students, the imperative for adaptable pedagogical methods, the ever-evolving landscape of curriculum designs, and the overarching commitment to fostering educational equity. To investigate these multifaceted dynamics, we employed a data collection method that included the distribution of an online survey on platforms such as LinkedIn. Our survey reached and engaged 74 graduate students actively pursuing degrees in Business Analytics within the United States. This comprehensive research is the first and only one of its kind conducted in this context, and it serves as a vanguard exploration into the challenges and influences that shape the learning journey of Python among non-technical graduate Business Analytics students. The analytical insights derived from this research underscore the pivotal role of hands-on learning strategies, exemplified by practice exercises and assignments. Moreover, the study highlights the positive and constructive influence of collaboration and peer support in the process of learning Python. These findings significantly augment the existing body of knowledge in the field of business analytics. Furthermore, they offer an essential resource for educators and institutions seeking to optimize the educational experiences of non-technical students as they acquire essential Python skills.
Abstract: In this paper we aim to identify certain social factors that influence, and thus can be used to predict, the occurrence of crimes. The factors under consideration for this analysis are social demographics such as age, sex, and poverty, as well as train ridership, traffic density, and the number of business licenses per community area in Chicago, IL. A factor is considered pertinent if there is a high correlation between it and the number of crimes of a particular type in that community area.
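The pertinence rule in the abstract above (a factor matters if it correlates strongly with crime counts across community areas) can be sketched with the Pearson correlation coefficient. The abstract does not say which correlation measure or cutoff was used, so Pearson's r, the 0.7 threshold, and all numbers below are assumptions made up for illustration; they are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pertinent_factors(factors, crime_counts, threshold=0.7):
    """Names of factors whose absolute correlation with crime counts meets the threshold."""
    return sorted(
        name
        for name, values in factors.items()
        if abs(pearson(values, crime_counts)) >= threshold
    )


# Invented values for four hypothetical community areas.
factors = {
    "poverty_rate": [10, 20, 30, 40],
    "business_licenses": [5, 1, 4, 2],
}
crimes = [12, 25, 31, 44]

selected = pertinent_factors(factors, crimes)
```

Using the absolute value keeps strongly negative relationships as well, since a factor that reliably lowers crime counts is just as useful for prediction.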
Abstract: In the era of big data, huge volumes of data are generated from online social networks, sensor networks, mobile devices, and organizations' enterprise systems. This phenomenon provides organizations with unprecedented opportunities to tap into big data to mine valuable business intelligence. However, traditional business analytics methods may not be able to cope with the flood of big data. The main contribution of this paper is the illustration of the development of a novel big data stream analytics framework named BDSASA that leverages a probabilistic language model to analyze the consumer sentiments embedded in hundreds of millions of online consumer reviews. In particular, an inference model is embedded into the classical language modeling framework to enhance the prediction of consumer sentiments. The practical implication of our research work is that organizations can apply our big data stream analytics framework to analyze consumers' product preferences, and hence develop more effective marketing and production strategies.
Funding: This work is supported by the Shandong Provincial Natural Science Foundation, China, under Grant No. ZR2017MG011. This work is also supported by the Key Research and Development Program of Shandong Province (2017GGX90103).
Abstract: Monitoring, understanding, and predicting origin-destination (OD) flows in a city is an important problem for city planning and human activity. Taxi GPS traces, a typical kind of crowd-sensed data, can be used to mine the semantics of OD flows. In this paper, we first construct and analyze a complex network of OD flows based on large-scale taxi GPS traces from a city in China. The spatiotemporal analysis of the OD flow complex network shows that there are distinctive patterns in OD flows. Then, based on a novel complex network model, a semantics mining method for OD flows is proposed by combining a Points of Interest (POI) network and a public transport network with the OD flow network. The proposed method offers a novel way to accurately predict location characteristics and future traffic conditions.
Abstract: In recent years we have witnessed tremendous growth in the volume and availability of data. This results primarily from the emergence of a multitude of sources (e.g. computers, mobile devices, sensors, or social networks) that continuously produce structured, semi-structured, or unstructured data. Database Management Systems and Data Warehouses are no longer the only technologies used to store and analyze datasets, because the volume and complex structure of today's data degrade their performance and scalability. Big Data is one of the recent challenges, since it implies new requirements in terms of data storage, processing, and visualization. Despite that, properly analyzing Big Data can bring great advantages because it allows patterns and correlations in datasets to be discovered. Users can use this processed information to gain deeper insights and business advantages. Thus, data modeling and data analytics have evolved so that we are able to process huge amounts of data without compromising performance and availability, but instead by “relaxing” the usual ACID properties. This paper provides a broad view and discussion of the current state of this subject, with a particular focus on data modeling and data analytics, describing and clarifying the main differences between the three main approaches with respect to these aspects, namely: operational databases, decision support databases, and Big Data technologies.
Abstract: Big Data and Data Analytics affect almost all aspects of modern organisations' decision-making and business strategies. Big Data and Data Analytics create opportunities, challenges, and implications for the external auditing procedure. The purpose of this article is to reveal essential aspects of the impact of Big Data and Data Analytics on external auditing. It seems that Big Data Analytics is a critical tool for organisations, as well as auditors, that contributes to the enhancement of the auditing process. Also, legislative implications must be taken into consideration, since existing standards may need to change. Lastly, auditors need to develop new skills and competences, and educational organisations need to change their educational programs in order to respond to new market needs.