Funding: Teaching Reform Project of Beijing Union University, "Exploration of Teaching Reform of Big Data Analysis and Visualization Course under the Background of New Engineering" (JJ2024Y025).
Abstract: In the era of big data, education for big data majors is undergoing profound teaching reform and innovation. As big data technology plays an ever larger role in analysis and decision-making, updating and expanding the teaching content of big data majors has become particularly important. Modern enterprises have raised new and higher demands for big data talent, requiring not only traditional data analysis skills but also knowledge of data visualization and information technology. To address these challenges, big data education needs to reform and innovate in the development and use of teaching content, methods, and resources. This paper proposes teaching models and reform methods for big data majors and analyzes the corresponding teaching reforms and innovations needed to meet the requirements of the field's new development. Traditional classroom teaching is no longer sufficient to meet students' learning needs, and more dynamic and interactive teaching methods, such as case studies, flipped classrooms, and project-based learning, are becoming increasingly essential. These innovative methods more effectively cultivate students' practical skills and independent thinking while allowing them to learn advanced knowledge in a real big-data environment. In addition, the paper discusses the construction of big data processing and analysis platforms, as well as innovative teaching management and evaluation systems to improve teaching quality.
Abstract: In the Internet era, widely used web applications have become targets of hacker attacks because they contain large amounts of personal information. Among these attacks, stealing private data through cross-site scripting (XSS) is one of the techniques hackers use most often. Deep learning-based XSS detection methods have good application prospects; however, they suffer from problems such as a tendency to overfit, a high false alarm rate, and low accuracy. To address these issues, we propose a multi-stage feature extraction and fusion model for XSS detection based on Random Forest feature enhancement. The model uses Random Forests to capture the intrinsic structure and patterns of the data by extracting leaf node indices as features, which are subsequently merged with the original data features to form a feature set with richer information content. Further feature extraction is conducted through three parallel channels. Channel I uses parallel one-dimensional convolutional layers (1D convolutional layers) with different kernel sizes to extract local features at different scales and performs multi-scale feature fusion; Channel II employs one-dimensional max pooling layers (max 1D pooling layers) of various sizes to extract key features from the data; and Channel III extracts global information bi-directionally using a Bi-Directional Long Short-Term Memory network (Bi-LSTM) and incorporates a multi-head attention mechanism to enhance global features. Finally, XSS is classified and predicted by fusing the features of the three channels. To test the effectiveness of the model, we conduct experiments on six datasets, achieving an accuracy of 100% on the UNSW-NB15 dataset and 99.99% on the CICIDS2017 dataset, higher than existing models.
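To make the three-channel design concrete, here is a minimal sketch in Python/Keras (layer widths, kernel sizes, and the forest size are illustrative assumptions; the abstract does not give the authors' hyperparameters):

# Minimal sketch of the three-channel XSS model described above.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from tensorflow.keras import layers, Model

def rf_leaf_features(X_train, y_train, X):
    """Random Forest feature enhancement: leaf-node indices per tree,
    merged with the original features."""
    rf = RandomForestClassifier(n_estimators=10).fit(X_train, y_train)
    leaves = rf.apply(X)                      # (n_samples, n_trees)
    return np.hstack([X, leaves])

def build_model(seq_len, n_feats):
    inp = layers.Input(shape=(seq_len, n_feats))
    # Channel I: parallel 1D convolutions at several kernel sizes.
    convs = [layers.GlobalMaxPooling1D()(layers.Conv1D(64, k, activation="relu")(inp))
             for k in (3, 5, 7)]
    ch1 = layers.Concatenate()(convs)
    # Channel II: 1D max pooling at several window sizes.
    pools = [layers.Flatten()(layers.MaxPooling1D(pool_size=p)(inp)) for p in (2, 4)]
    ch2 = layers.Concatenate()(pools)
    # Channel III: Bi-LSTM with multi-head self-attention over its outputs.
    h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(inp)
    att = layers.MultiHeadAttention(num_heads=4, key_dim=16)(h, h)
    ch3 = layers.GlobalAveragePooling1D()(att)
    # Fuse the three channels and classify.
    out = layers.Dense(1, activation="sigmoid")(layers.Concatenate()([ch1, ch2, ch3]))
    return Model(inp, out)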
Abstract: Nowadays, the amount of web data is increasing rapidly, which presents a serious challenge for web monitoring. Text sentiment analysis, an important research topic in natural language processing, is a crucial task in the web monitoring area. The accuracy of traditional text sentiment analysis methods may degrade when dealing with massive data. Deep learning has been a hot research topic in artificial intelligence in recent years. Several research groups have studied the sentiment analysis of English texts using deep learning methods; by contrast, relatively few works have considered Chinese text sentiment analysis in this direction. In this paper, a method for analyzing Chinese text sentiment is proposed based on the convolutional neural network (CNN) in order to improve analysis accuracy. The feature values of the CNN after training are nonuniformly distributed; to overcome this problem, a method for normalizing the feature values is proposed. Moreover, the dimensions of the text features are optimized through simulations. Finally, a method for updating the learning rate during CNN training is presented to achieve better performance. Experimental results on typical datasets indicate that the accuracy of the proposed method is improved compared with traditional supervised machine learning methods, e.g., the support vector machine.
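The abstract names a feature-value normalization step and a learning-rate update rule without giving formulas; the sketch below shows one plausible reading (per-dimension min-max scaling and step decay are assumptions, not the paper's definitions):

# Illustrative sketch only.
import numpy as np

def normalize_features(F, eps=1e-8):
    """Rescale each CNN feature dimension to [0, 1] to even out the
    nonuniform distribution of trained feature values."""
    lo, hi = F.min(axis=0), F.max(axis=0)
    return (F - lo) / (hi - lo + eps)

def step_decay_lr(lr0, epoch, drop=0.5, every=10):
    """Halve the learning rate every `every` epochs during CNN training."""
    return lr0 * (drop ** (epoch // every))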
Funding: Supported by the National Natural Science Foundation of China (No. 61972040), the Science and Technology Projects of Beijing Municipal Education Commission (No. KM201711417011), and the Premium Funding Project for Academic Human Resources Development in Beijing Union University (No. BPHR2020AZ03).
Abstract: In multi-target stance detection, content describing different targets can influence one another, reducing accuracy. To solve this problem, a multi-target stance detection algorithm based on a bidirectional long short-term memory (Bi-LSTM) network with position-weight is proposed. First, the position of each target in the input text is calculated to obtain the position-weight vector. Next, the position information and the output of the Bi-LSTM layer are fused by a position-weight fusion layer. Finally, the stances toward the different targets are predicted using an LSTM network and softmax classification. The multi-target stance detection corpus of the 2016 American election is used to validate the proposed method. The results demonstrate that the Bi-LSTM network with position-weight achieves a 1.4% advantage in macro-average F1 over recent algorithms.
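A minimal sketch of position-weighted fusion, assuming a linear decay of token weight with distance from the target mention (the paper's exact weighting formula is not given in the abstract):

# Sketch only; weights are precomputed per sample from the target's position.
import numpy as np
from tensorflow.keras import layers, Model

def position_weights(seq_len, target_idx):
    """Weight each token by closeness to the target's position in the text."""
    pos = np.arange(seq_len, dtype="float32")
    return 1.0 - np.abs(pos - target_idx) / seq_len      # shape (seq_len,)

def build_stance_model(seq_len, emb_dim, n_stances=3):
    tokens = layers.Input(shape=(seq_len, emb_dim))
    weights = layers.Input(shape=(seq_len, 1))
    h = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(tokens)
    fused = h * weights                                  # position-weight fusion
    h2 = layers.LSTM(64)(fused)                          # second LSTM stage
    out = layers.Dense(n_stances, activation="softmax")(h2)
    return Model([tokens, weights], out)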
Funding: Supported by the National Natural Science Foundation of China (Nos. 71373023, 61372148, 61571045), the Beijing Advanced Innovation Center for Imaging Technology (No. BAICIT-2016002), the National Key Technology R&D Program (Nos. 2014BAK08B02, 2015BAH55F03), and the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (No. CIT&TCD201504039).
Abstract: A graph theory model of the human nature structure (GMH) for machine vision and image/graphics processing is described in this paper. Independent of the motion and deformation of contours, the human nature structure (HNS) embodies the most basic movement characteristics of the body. The human body can be divided into basic units such as the head, torso, and limbs, and from these units a graph theory model of the HNS can be constructed. The GMH provides a basic model for human posture processing, and its outline in the perspective projection plane is the body contour in an image. In addition, the GMH can be applied to articulated motion and deformable objects, e.g., in the design and analysis of body posture, by modifying the mapping parameters of the GMH.
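A toy rendering of the body-graph idea (the unit set and the posture parameterization below are illustrative assumptions, not the GMH definition):

# Basic units as nodes, articulation between units as edges.
BODY_GRAPH = {
    "head":      ["torso"],
    "torso":     ["head", "left_arm", "right_arm", "left_leg", "right_leg"],
    "left_arm":  ["torso"], "right_arm": ["torso"],
    "left_leg":  ["torso"], "right_leg": ["torso"],
}

def neighbors(unit):
    """Units directly articulated with `unit`."""
    return BODY_GRAPH[unit]

# A posture can then be parameterized by per-edge joint angles; projecting the
# posed units onto the image plane yields the body contour.
posture = {("torso", "left_arm"): 45.0, ("torso", "right_arm"): -30.0}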
Funding: This research was funded by the National Natural Science Foundation of China (No. 61802010), the Hundred-Thousand-Ten Thousand Talents Project of Beijing (No. 2020A28), the National Social Science Fund of China (No. 19BGL184), the Beijing Excellent Talent Training Support Project for Young Top-Notch Team (No. 2018000026833TD01), and the Academic Research Projects of Beijing Union University (No. ZK30202103).
Abstract: Underwater sensor networks have important application value in fields such as water environment data collection and marine environment monitoring. They are characterized by low available bandwidth, large propagation delays, and limited energy, which bring new challenges to current research. Research on coverage control in underwater sensor networks underpins other related work, and a good node coverage control method can effectively improve the quality of water environment monitoring. To address the high dynamics and uncertainty of monitoring targets, random events are divided into two levels, serious events and general events; the sensors are set to sense events of different levels and respond differently. An event-driven optimization algorithm is then proposed that determines sensor target locations based on a self-organizing map. To address the limited energy of underwater sensor nodes, a movement control algorithm based on residual energy probability is proposed that considers the moving distance, coverage redundancy, and residual energy of sensor nodes. Simulation results show that, compared with a simple movement algorithm, the proposed algorithm can effectively improve the coverage and lifetime of the sensor network and realize real-time monitoring of the water environment.
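A sketch of residual-energy-based movement selection, with an assumed score that trades off remaining energy against moving distance (the paper's probability model is not reproduced here):

# Sketch only; the scoring rule is an assumption for illustration.
import math, random

def choose_mover(nodes, event_xy):
    """nodes: list of dicts {'xy': (x, y), 'energy': joules_left}.
    Prefer nearby nodes with high residual energy, chosen probabilistically."""
    def score(n):
        d = math.dist(n["xy"], event_xy)
        return n["energy"] / (1.0 + d)        # high energy, short move -> high score
    total = sum(score(n) for n in nodes)
    r = random.uniform(0, total)              # roulette-wheel selection by score
    for n in nodes:
        r -= score(n)
        if r <= 0:
            return n
    return nodes[-1]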
Funding: This research was funded by the National Natural Science Foundation of China (No. 61802010), the Hundred-Thousand-Ten Thousand Talents Project of Beijing (No. 2020A28), the National Social Science Fund of China (No. 19BGL184), the Beijing Excellent Talent Training Support Project for Young Top-Notch Team (No. 2018000026833TD01), and the Academic Research Projects of Beijing Union University (No. ZK30202103).
Abstract: Wireless sensor networks (WSNs) are an important part of the Internet of Things (IoT), used for information exchange and communication between smart objects. In practical applications, the WSN lifecycle can be shortened by an unbalanced distribution of node centrality, excessive energy consumption, and other factors. To overcome these problems, a heterogeneous wireless sensor network model with small-world characteristics is constructed to balance centrality and enhance the invulnerability of the network. A new WSN centrality measurement method and a new invulnerability measurement model are also proposed based on WSN data transmission characteristics. Simulation results show that the lifecycle and data transmission volume of the network can be improved at a lower network construction cost, and the invulnerability of the network is effectively enhanced.
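A sketch of adding small-world shortcut links to a sensor adjacency structure, in the Watts-Strogatz spirit (the paper's heterogeneous construction and centrality measure are not reproduced here):

# Sketch only; adj maps each node to the set of its neighbors.
import random

def add_shortcuts(adj, p=0.1):
    """With probability p per node, add one long-range link to a random
    non-neighbor, shortening path lengths and spreading betweenness
    centrality more evenly across the network."""
    nodes = list(adj)
    for u in nodes:
        if random.random() < p:
            candidates = [v for v in nodes if v != u and v not in adj[u]]
            if candidates:
                v = random.choice(candidates)
                adj[u].add(v)
                adj[v].add(u)
    return adj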
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 62272026 and 62104014), the fund of the State Key Laboratory of Software Development Environment (No. SKLSDE-2022ZX-07), and Iluvatar CoreX Semiconductor Co., Ltd.
Abstract: Geographically replicating objects across multiple data centers improves the performance and reliability of cloud storage systems, but maintaining consistent replicas comes with high synchronization costs in the form of expensive WAN transport and increased latency. Periodic replication is the technique most widely used to reduce synchronization costs; however, the periodic replication strategies in existing cloud storage systems are too static to handle traffic changes, leaving them inflexible in the face of unforeseen loads and incurring additional synchronization cost. We propose quantitative analysis models that quantify consistency and synchronization cost for periodically replicated systems, and derive the optimal synchronization period that achieves the best tradeoff between the two. On this basis, we propose a dynamic periodic synchronization method, Sync-Opt, which lets systems set the optimal synchronization period according to the variable load in clouds so as to minimize synchronization cost. Simulation results demonstrate the effectiveness of our models; compared with the policies widely used in modern cloud storage systems, the Sync-Opt strategy significantly reduces synchronization cost.
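A toy version of the period-selection tradeoff: syncing more often costs more transport, syncing less often leaves more writes inconsistent. The cost forms below are assumptions for illustration; the paper derives its optimum analytically:

# Toy cost model, not the paper's analysis.
def total_cost(T, write_rate, sync_cost=1.0, staleness_penalty=0.01):
    """Cost per unit time: one sync every T seconds, plus a penalty that grows
    with the writes left inconsistent during the period."""
    return sync_cost / T + staleness_penalty * write_rate * T

def best_period(write_rate, candidates=(1, 2, 5, 10, 30, 60, 120)):
    return min(candidates, key=lambda T: total_cost(T, write_rate))

# Re-evaluate as load varies, in the spirit of Sync-Opt's dynamic adjustment:
for rate in (0.1, 1.0, 10.0):
    print(rate, "writes/s ->", best_period(rate), "s period")

Under this toy model, heavier write loads push the chosen period shorter, which is the qualitative behavior a dynamic policy needs.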
Funding: Supported by the National Natural Science Foundation of China (No. 61873142), the Science and Technology Research Program of the Chongqing Municipal Education Commission, China (Nos. KJZD-K202201901, KJQN202201109, KJQN202101904, KJQN202001903 and CXQT21035), the Scientific Research Foundation of Chongqing University of Technology, China (No. 2019ZD76), the Scientific Research Foundation of Chongqing Institute of Engineering, China (No. 2020xzky05), and the Chongqing Municipal Natural Science Foundation, China (No. cstc2020jcyj-msxmX0666).
Abstract: In industrial process control systems, economic or technical limitations often make some key variables very difficult to measure online. The data-driven soft sensor is an effective solution because it provides a reliable and stable online estimate of such variables. This paper employs a deep neural network with multiscale feature extraction layers to build soft sensors, which are applied to the benchmark Tennessee-Eastman process (TEP) and a real wind farm case. A comparison of modelling results demonstrates that the multiscale feature extraction layers have the following advantages over other methods. First, they significantly reduce the number of parameters compared to other deep neural networks. Second, they extract dataset characteristics powerfully. Finally, by fully considering historical measurements, they capture richer useful information and yield improved representations compared to traditional data-driven models.
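A minimal sketch of a multiscale feature extraction layer over a window of historical measurements (kernel sizes, widths, and the input dimensions are assumptions; the point is parallel convolutions at several temporal scales with few parameters):

# Sketch only; the paper's exact architecture is not given in the abstract.
from tensorflow.keras import layers, Model

def build_soft_sensor(window, n_vars):
    """Input: a window of historical process measurements (window, n_vars)."""
    inp = layers.Input(shape=(window, n_vars))
    scales = [layers.Conv1D(16, k, padding="same", activation="relu")(inp)
              for k in (3, 7, 15)]                    # short/medium/long history
    h = layers.Concatenate()(scales)
    h = layers.GlobalAveragePooling1D()(h)
    out = layers.Dense(1)(h)                          # estimate of the key variable
    return Model(inp, out)

model = build_soft_sensor(window=60, n_vars=52)       # illustrative dimensions
model.compile(optimizer="adam", loss="mse")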
Funding: Supported by the Support Project of High-Level Teachers in Beijing Municipal Universities in the Period of the 13th Five-Year Plan (CIT&TCD 201704069), the Advanced Research Project for Science and Technology Development of Harbin Normal University (901-220601094), and the Natural Science Foundation of Heilongjiang Province (JJ2019LH0418).
Abstract: The ever-increasing complexity of on-chip interconnection poses great challenges for the architecture of the conventional system-on-chip (SoC) in the semiconductor industry. The rapid development of process technology enables the creation of stacked 3-dimensional (3D) SoCs by means of through-silicon vias (TSVs). Stacked 3D SoC testing involves two major issues: test architecture optimization and test scheduling. This paper proposes a game theory based optimization of test scheduling and test architecture that achieves a win-win result as well as individual rationality for each player in the game; game theory helps reach an equilibrium between the two correlated sides and thereby find an optimal solution. Experimental results on handcrafted 3D SoCs built from the ITC'02 benchmarks demonstrate that the proposed approach achieves comparable or better test times at negligible computing time.
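A toy best-response iteration between two players, only to illustrate how alternating optimization can reach an equilibrium (the paper's players, strategy spaces, and payoffs are not specified in the abstract):

# Sketch only; payoff_a/payoff_b map a strategy pair to that player's payoff.
def best_response_equilibrium(payoff_a, payoff_b, strategies_a, strategies_b):
    """Alternate best responses until neither player can improve (a pure
    Nash equilibrium, if the iteration converges)."""
    a, b = strategies_a[0], strategies_b[0]
    while True:
        a2 = max(strategies_a, key=lambda s: payoff_a(s, b))   # A replies to b
        b2 = max(strategies_b, key=lambda s: payoff_b(a2, s))  # B replies to a2
        if (a2, b2) == (a, b):
            return a, b
        a, b = a2, b2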
Funding: This work was supported by the National Key R&D Program of China (2018YFB0203901), the National Natural Science Foundation of China (Grant No. 61772053), and the fund of the State Key Laboratory of Software Development Environment (SKLSDE-2020ZX15).
Abstract: Wide-area high-performance computing is widely used for large-scale parallel computing applications owing to its abundant computing and storage resources. However, the geographical distribution of these resources makes efficient task distribution and data placement more challenging. To achieve higher system performance, this study proposes a two-level global collaborative scheduling strategy for wide-area high-performance computing environments. The strategy integrates lightweight solution selection, redundant data placement, and task stealing mechanisms, optimizing task distribution and data placement for efficient computing in wide-area environments. The experimental results indicate that, compared with the state-of-the-art collaborative scheduling algorithm HPS+, the proposed strategy reduces the makespan by 23.24%, improves computing and storage resource utilization by 8.28% and 21.73% respectively, and achieves similar global data migration costs.
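A sketch of the task-stealing ingredient of such a scheduler (the queues, the victim choice, and the threshold are simplified assumptions, not the paper's implementation):

# Sketch only; each site holds a double-ended queue of pending tasks.
from collections import deque

class Site:
    def __init__(self, name):
        self.name, self.queue = name, deque()

def steal(idle, sites, threshold=2):
    """An idle site steals from the back of the most loaded site's queue,
    provided the victim keeps at least `threshold` tasks."""
    victim = max(sites, key=lambda s: len(s.queue))
    if victim is not idle and len(victim.queue) > threshold:
        idle.queue.append(victim.queue.pop())   # take the coldest (back) task
        return True
    return False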