Target maneuver recognition is a prerequisite for air combat situation awareness, trajectory prediction, threat assessment and maneuver decision-making. To free current target maneuver recognition methods from their dependence on empirical criteria and sample data, and to extract target maneuver patterns automatically and adaptively, this paper proposes an air combat maneuver pattern extraction method based on time series segmentation and clustering analysis, combining an autoencoder, the G-G clustering algorithm and a selective ensemble clustering algorithm. Firstly, the autoencoder extracts key features of the maneuvering trajectory to remove the impact of redundant variables and reduce the data dimension. Then, taking time information into account, the maneuver characteristic time series is segmented with the improved FSTS-AEGG algorithm, and a large number of maneuver primitives are extracted. Finally, the maneuver primitives are grouped into categories using a selective ensemble multiple-time-series clustering algorithm, such that each class represents a distinct maneuver action. Applied to small-scale air combat trajectories, the method recognizes and correctly partitions at least 71.3% of maneuver actions, indicating that it is effective and satisfies engineering accuracy requirements. In addition, the method can provide data support for the various target maneuver recognition methods proposed in the literature, greatly reducing their workload and improving recognition accuracy.
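To illustrate the segmentation idea only (this is not the paper's FSTS-AEGG algorithm), the following minimal sketch cuts a one-dimensional feature series into primitives wherever a sample departs from the running segment mean by more than a threshold; the threshold and data are invented:

```python
def segment_series(values, threshold):
    """Split a 1-D feature series into primitives wherever the value
    departs from the current segment's running mean by more than
    `threshold`. A toy stand-in for the time-series segmentation step."""
    segments, current = [], [values[0]]
    for v in values[1:]:
        mean = sum(current) / len(current)
        if abs(v - mean) > threshold:
            segments.append(current)
            current = [v]
        else:
            current.append(v)
    segments.append(current)
    return segments

# a flat stretch, a sharp maneuver, then a return to level flight
series = [0.1, 0.2, 0.1, 5.0, 5.1, 4.9, 0.0, 0.1]
print(segment_series(series, 1.0))
```

A real implementation would segment in the autoencoder's feature space and feed the primitives to the ensemble clustering stage; this sketch only shows the boundary-detection mechanics.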
In response to the lack of reliable physical parameters in process simulation of butadiene extraction, a large amount of phase equilibrium data was collected in the context of the actual process of butadiene production by acetonitrile extraction. The accuracy of five prediction methods, UNIFAC (UNIQUAC Functional-group Activity Coefficients), UNIFAC-LL, UNIFAC-LBY, UNIFAC-DMD and COSMO-RS, applied to the butadiene extraction process was verified using partial phase equilibrium data. The results showed that the UNIFAC-DMD method had the highest accuracy in predicting phase equilibrium data for the missing systems. COSMO-RS predicted multiple systems with good accuracy, and a large number of missing phase equilibrium data were estimated using the UNIFAC-DMD and COSMO-RS methods. The predicted phase equilibrium data were checked for thermodynamic consistency. The NRTL-RK (Non-Random Two-Liquid with Redlich-Kwong equation of state) and UNIQUAC thermodynamic models were used to correlate the phase equilibrium data. Industrial device simulations were used to verify the accuracy of the thermodynamic model applied to the butadiene extraction process. The simulation results showed that the average deviations of the results obtained with the correlated thermodynamic model from the actual plant values were less than 2%, much smaller than the deviations of simulations using the commercial simulation software Aspen Plus and its database (>10%), indicating that the obtained phase equilibrium data are highly accurate and reliable. The best phase equilibrium data and thermodynamic model parameters for butadiene extraction are provided. This improves the accuracy and reliability of the design, optimization and control of the process, and
provides a basis and guarantee for developing a more environmentally friendly and economical butadiene extraction process.
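The "average deviation" figures quoted above (<2% vs. >10%) are presumably mean absolute relative deviations between simulated and plant values; a minimal sketch of that metric, with invented numbers, looks like this:

```python
def average_relative_deviation(simulated, actual):
    """Mean absolute relative deviation (%) between simulated values
    and actual plant values; our reading of the deviation metric,
    not code from the paper."""
    devs = [abs(s - a) / abs(a) for s, a in zip(simulated, actual)]
    return 100.0 * sum(devs) / len(devs)

# hypothetical stream compositions / flows from a simulation vs. the plant
sim = [98.2, 45.1, 60.3]
act = [100.0, 45.0, 61.0]
print(round(average_relative_deviation(sim, act), 2))
```

A run below 2% on key streams would match the paper's acceptance criterion for the correlated thermodynamic model.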
An algorithm named DPP is presented. In it, a new model based on the concept of irregularity degree is introduced to evaluate the regularity of cells. The algorithm derives the structural regularity of cells by exploiting the signal flow of the circuit, then converts the bit-slice structure into parallel constraints to enable the Q place algorithm. The design flow and the main algorithms are introduced. Finally, satisfactory experimental results of the tool, compared with the Cadence placement tool SE, are discussed.
More and more web pages apply AJAX (Asynchronous JavaScript and XML) for its rich interactivity and incremental communication. Observation shows that AJAX contents, which cannot be seen by traditional crawlers, are generally well structured and belong to one specific domain. Extracting structured data from AJAX contents and annotating their semantics are significant for further applications. In this paper, a structured AJAX data extraction method for the agricultural domain based on an agricultural ontology is proposed. Firstly, Crawljax, an open AJAX crawling tool, was extended to explore and retrieve AJAX contents. Secondly, the retrieved contents were partitioned into items and classified with the help of the agricultural ontology; HTML tags and punctuation were used to segment the retrieved contents into entity items. Finally, the entity items were clustered, and semantic annotations were assigned to the clustering results according to the agricultural ontology. Experimental evaluation showed that the proposed approach is effective in resource exploration, entity extraction, and semantic annotation.
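The tag-and-punctuation segmentation step can be sketched in a few lines; the delimiter set and sample fragment below are invented for illustration and are not the paper's implementation:

```python
import re

def split_entity_items(html_fragment):
    """Split an AJAX-retrieved fragment into candidate entity items,
    using HTML tags and punctuation (Western and Chinese) as delimiters."""
    text_parts = re.split(r'<[^>]+>|[;,。；，]', html_fragment)
    return [p.strip() for p in text_parts if p.strip()]

fragment = '<li>wheat blight</li><li>rice, planthopper</li>'
print(split_entity_items(fragment))
```

Each resulting item would then be clustered and matched against ontology concepts for semantic annotation.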
In computational physics, proton transfer phenomena can be viewed as pattern classification problems: based on a set of input features, the proton motion is classified into two categories, transfer 'occurred' and transfer 'not occurred'. The goal of this paper is to evaluate the use of artificial neural networks in the classification of proton transfer events, with a feed-forward back-propagation neural network used as a classifier to distinguish between the two transfer cases. We use a newly developed data mining and pattern recognition tool for automating, controlling, and charting the output data of an existing Empirical Valence Bond code. The study analyzes the need for pattern recognition in aqueous proton transfer processes and shows how the error back-propagation learning approach (multilayer perceptron algorithms) can be satisfactorily employed in the present case. We present a tool for pattern recognition and validate the code on a real physical case study. The results of applying the artificial neural network methodology to transfer patterns based on selected physical properties (e.g., temperature, density) show the network's ability to learn proton transfer patterns corresponding to properties of the aqueous environment, which proves fully compatible with previous proton transfer studies.
Anomaly detection has been an active research topic in network intrusion detection for many years. A novel method is presented for anomaly detection based on system calls into the kernels of Unix or Linux systems. The method uses data mining techniques to model the normal behavior of a privileged program, and uses a variable-length pattern matching algorithm to compare current behavior with historical normal behavior, which is more suitable for this problem than the fixed-length pattern matching algorithm proposed by Forrest et al. At the detection stage, the particularity of the audit data is taken into account, and two alternative schemes can be used to distinguish normal behavior from intrusions. The method balances computational efficiency and detection accuracy and is especially applicable to on-line detection. Its performance was evaluated on a typical testing data set; the results show that it is significantly better than the anomaly detection method based on hidden Markov models proposed by Yan et al. and the method based on fixed-length patterns proposed by Forrest and Hofmeyr. The method has been applied in practical host-based intrusion detection systems and achieved high detection performance.
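A minimal sketch of the general idea (a variable-length pattern base mined from normal system-call traces, with unseen windows counted as anomalous) is shown below; the pattern lengths, scoring rule and traces are invented, not the paper's algorithm:

```python
def build_patterns(trace, max_len):
    """Collect all variable-length subsequences (lengths 1..max_len) of a
    normal system-call trace; a toy stand-in for the mined pattern base."""
    patterns = set()
    for n in range(1, max_len + 1):
        for i in range(len(trace) - n + 1):
            patterns.add(tuple(trace[i:i + n]))
    return patterns

def anomaly_score(trace, patterns, n):
    """Fraction of length-n windows in `trace` absent from the pattern base."""
    windows = [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]
    misses = sum(1 for w in windows if w not in patterns)
    return misses / len(windows)

normal = ['open', 'read', 'write', 'close']
patterns = build_patterns(normal, 3)
print(anomaly_score(['open', 'read', 'write', 'close'], patterns, 2))
print(anomaly_score(['open', 'exec', 'close'], patterns, 2))
```

A replayed normal trace scores 0.0, while a trace containing unseen call pairs scores high; a threshold on this score separates normality from intrusion.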
The increasing availability of data in the urban context (e.g., mobile phone, smart card and social media data) allows us to study urban dynamics at much finer temporal resolutions (e.g., diurnal urban dynamics). Mobile phone data, for instance, have proved a useful source for extracting diurnal human mobility patterns and for understanding urban dynamics. While previous studies often use call detail record (CDR) data, this study deploys aggregated network-driven mobile phone data, which may reveal human mobility patterns more comprehensively and can mitigate some of the privacy concerns raised by mobile phone data usage. We first propose an analytical framework for characterizing and classifying urban areas based on the temporal activity patterns extracted from mobile phone data. Specifically, the diurnal spatiotemporal signatures of urban areas' human mobility patterns are obtained from longitudinal mobile phone data, and urban areas are then classified based on these signatures. The classification provides insights for city planning and development. Using the proposed framework, a case study was conducted in the city of Wuhu, China, to understand its urban dynamics. The empirical results suggest that human activities in Wuhu are highly concentrated at the Traffic Analysis Zone (TAZ) level. This large share of local activities suggests that Wuhu requires development and planning strategies different from those used by metropolitan Chinese cities. The article concludes with a discussion of several common challenges associated with using network-driven mobile phone data, which should be addressed in future studies.
Distance-based outlier detection identifies implied outliers by calculating distances between points in a dataset, but its computational complexity is particularly high on multidimensional datasets. In addition, traditional outlier detection does not consider the frequency of subset occurrence, so the detected outliers may not fit the definition of an outlier (i.e., rarely appearing). Pattern mining-based outlier detection approaches solve this problem, but they do not take the importance of each pattern into account, so the detected outliers cannot truly reflect the actual situation. To address these problems, a two-phase minimal weighted rare pattern mining-based outlier detection approach, called MWRPM-Outlier, is proposed to effectively detect outliers on weighted data streams. In the pattern mining phase, a method called MWRPM quickly mines the minimal weighted rare patterns; in the outlier detection phase, two deviation factors are defined to measure the abnormality of each transaction on the weighted data stream. Experimental results show that MWRPM-Outlier performs excellently in outlier detection and that MWRPM outperforms existing approaches in weighted rare pattern mining.
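To make "weighted support" and "deviation factor" concrete, here is a deliberately simplified single-item sketch (the paper's factors operate on minimal weighted rare patterns, not single items; the weights, threshold and transactions below are invented):

```python
from collections import Counter

def weighted_support(transactions, weights):
    """Weighted support of each item: its frequency scaled by its weight."""
    counts = Counter(item for t in transactions for item in set(t))
    n = len(transactions)
    return {item: weights.get(item, 1.0) * c / n for item, c in counts.items()}

def deviation_factor(transaction, support, threshold):
    """Toy deviation factor: the share of a transaction's items whose
    weighted support falls below `threshold` (i.e., rare items)."""
    rare = sum(1 for item in transaction if support[item] < threshold)
    return rare / len(transaction)

txns = [['a', 'b'], ['a', 'b'], ['a', 'c'], ['a', 'b']]
w = {'a': 0.5, 'b': 1.0, 'c': 1.0}
sup = weighted_support(txns, w)
scores = [deviation_factor(t, sup, 0.5) for t in txns]
print(scores)
```

Transactions dominated by low-weighted-support items score high and are flagged as outliers; the stream setting additionally requires windowed, incremental maintenance of the pattern base.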
The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services. Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service, so failure prediction is an important means of ensuring service availability. Predicting node failure in cloud-based data centers is challenging because failure symptoms have complex characteristics and the distribution imbalance between failure samples and normal samples is widespread, resulting in inaccurate prediction. Targeting these challenges, this paper proposes a novel failure prediction method, FP-STE (Failure Prediction based on Spatio-Temporal feature Extraction). Firstly, an improved recurrent neural network, HW-GRU (improved GRU based on highway networks), and a convolutional neural network (CNN) are used to extract the temporal and spatial features of multivariate data respectively, increasing the discrimination of different types of failure symptoms and thereby improving prediction accuracy. The intermediate results of the two models are then added as features into SCS-XGBoost to predict the possibility and the precise type of node failure in the future. SCS-XGBoost is an ensemble learning model improved by an integrated strategy of oversampling and cost-sensitive learning. Experimental results on real data sets confirm the effectiveness and superiority of FP-STE.
A 16 kV/20 A power supply was developed for the extraction grid of the prototype radio frequency (RF) ion source of a neutral beam injector. To acquire the state signals of the extraction grid power supply (EGPS) and control its operation, a data acquisition and control system has been developed. The system mainly consists of an interlock protection circuit board, a photoelectric conversion circuit, optical fibers, an industrial Compact Peripheral Component Interconnect (CPCI) computer and a host computer. The human-machine interface of the host computer delivers commands and data to the program on the CPCI computer, and offers a convenient client for setting parameters and displaying EGPS status. The CPCI computer acquires the status of the power supply, and the system can turn off the EGPS quickly when EGPS faults occur. The system has been applied to the EGPS of the prototype RF ion source; test results show that it meets the operational requirements of the prototype RF ion source.
A novel local binary pattern-based reversible data hiding (LBP-RDH) technique is suggested to maintain a fair balance between perceptual transparency and hiding capacity. During embedding, the image is divided into 3×3 blocks. Then, using the LBP-based image descriptor, the LBP code of each block is computed. Next, the obtained LBP codes are XORed with the embedding bits and concealed in the respective blocks using the proposed pixel readjustment process. Further, each cover image (CI) pixel produces two different stego-image pixels. Likewise, during extraction, the CI pixels are restored without the loss of a single bit of information. The outcome of the proposed technique with respect to perceptual transparency measures, such as peak signal-to-noise ratio and structural similarity index, is found to be superior to that of some recent and state-of-the-art techniques. In addition, the proposed technique shows excellent resilience to various stego-attacks, such as pixel difference histogram analysis as well as regular and singular analysis. Besides, the out-of-boundary pixel problem, which persists in most contemporary data hiding techniques, is successfully addressed.
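The LBP-code-then-XOR step can be sketched directly. The neighbor ordering and the ">= centre" convention below are one common LBP variant (the paper may use another), and the block and payload are invented:

```python
def lbp_code(block):
    """8-bit LBP code of a 3x3 block: each neighbour (clockwise from the
    top-left corner) contributes one bit, set when it is >= the centre."""
    center = block[1][1]
    neighbors = [block[0][0], block[0][1], block[0][2], block[1][2],
                 block[2][2], block[2][1], block[2][0], block[1][0]]
    code = 0
    for bit, p in enumerate(neighbors):
        if p >= center:
            code |= 1 << bit
    return code

block = [[120, 130, 110],
         [125, 128, 140],
         [119, 128, 131]]
code = lbp_code(block)
payload = 0b10110010            # 8 embedding bits for this block
hidden = code ^ payload         # XORed bits are what gets concealed
print(code, hidden)
```

Because XOR is self-inverse, recomputing the block's LBP code at extraction time and XORing it with the concealed bits recovers the payload, which is what makes the scheme reversible at the bit level.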
With the rapid global development of the Internet since the beginning of the 21st century, the amount of data has increased exponentially. Data help improve people's livelihood, working conditions and learning efficiency; therefore, data extraction, analysis and processing have become a hot issue in many fields. Traditional recommendation algorithms still suffer from problems such as inaccuracy, low diversity and low performance. To solve these problems and improve the accuracy and diversity of recommendation algorithms, this research combines convolutional neural networks (CNN) with an attention model to design a recommendation algorithm based on a neural network framework. Through the text convolutional network, the input layer of the CNN is transformed into two channels: a static one and a non-static one. Meanwhile, the self-attention mechanism allows the data to be better processed and raises the accuracy of feature extraction. The recommendation algorithm combines CNN and the attention mechanism and divides the embedding layer into user information feature embedding and data name feature embedding. It obtains data name features through convolution kernels; the top pooling layer then produces fixed-length vectors, and the attention layer captures the characteristics of the data type. Experimental results show that the proposed algorithm performs better in data extraction than the traditional CNN algorithm and other currently popular recommendation algorithms, with excellent accuracy and robustness.
Nowadays most cloud applications process large amounts of data to provide the desired results. In the Internet environment, enterprise network advertising and network marketing plans need partner sites selected as carriers and publishers. Websites present enterprise marketing solutions to users through static pages, dynamic pages, floating windows, ad links and various active push channels. When users access these pages, eye-catching and concentration effects attract them to read or click again, giving them a detailed and comprehensive understanding of the marketing plan, which in turn affects their real purchase decisions. We therefore combine the cloud environment with search engine optimization techniques; the results show that our method outperforms other approaches.
In order to explore the travel characteristics and space-time distribution of different groups of bikeshare users, an online analytical processing (OLAP) tool called a data cube was used for processing and displaying multi-dimensional data. We extended the traditional three-dimensional data cube into four dimensions, space, date, time and user, each with a user-specified hierarchy, and took transaction count and travel time as two quantitative measures. The results suggest that there are two clear transaction peaks during the morning and afternoon rush hours on weekdays, while weekend volume is distributed approximately evenly, and that bad weather significantly restricts bikeshare usage. Besides, seamless smartcard users generally take longer trips than exclusive smartcard users, and non-native users ride faster than native users. These findings not only support the applicability and efficiency of the data cube for visualizing massive smartcard data, but also raise equity concerns among bikeshare users with different demographic backgrounds.
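A four-dimensional cube with the two stated measures can be sketched as a keyed aggregation; the field names and trip records below are invented for illustration:

```python
from collections import defaultdict

def build_cube(records, dims):
    """Aggregate bikeshare transactions into a cube keyed by the chosen
    dimensions; the measures are trip count and total travel time (min)."""
    cube = defaultdict(lambda: [0, 0.0])
    for rec in records:
        key = tuple(rec[d] for d in dims)
        cell = cube[key]
        cell[0] += 1                 # measure 1: transaction count
        cell[1] += rec['minutes']    # measure 2: accumulated travel time
    return dict(cube)

trips = [
    {'zone': 'A', 'day': 'weekday', 'hour': 8,  'user': 'native',     'minutes': 12.0},
    {'zone': 'A', 'day': 'weekday', 'hour': 8,  'user': 'non-native', 'minutes': 9.5},
    {'zone': 'B', 'day': 'weekend', 'hour': 14, 'user': 'native',     'minutes': 20.0},
]
# roll up from (space, date, time, user) to the (day, hour) level
print(build_cube(trips, ('day', 'hour')))
```

Choosing a different `dims` tuple performs the OLAP roll-up or drill-down along the four dimensions without touching the raw records.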
A variety of faulty radar echoes may cause serious problems for radar data applications, especially radar data assimilation and quantitative precipitation estimation. In this study, the "test pattern" caused by test signals or radar hardware failures in operational observations of CINRAD (China New Generation Weather Radar) SA and SB radars is investigated. To distinguish the test pattern from other types of radar echoes, such as precipitation, clear air and other non-meteorological echoes, five feature parameters are proposed: the effective reflectivity data percentage (Rz), velocity range-folding (RF) data percentage (RRF), missing velocity data percentage (RM), averaged along-azimuth reflectivity fluctuation (RNr,z) and averaged along-beam reflectivity fluctuation (RNa,z). Based on the fuzzy logic method, a test pattern identification algorithm is developed, and statistical results over all the different kinds of radar echoes indicate its performance. Two typical cases with heavy precipitation echoes located inside the test pattern are analyzed. The statistical results show that the algorithm performs well, recognizing the test pattern in most cases; moreover, it can effectively remove the test pattern signal while retaining strong precipitation echoes in heavy rainfall events.
Traditional pattern representations in information extraction lack the ability to represent domain-specific concepts and are therefore inflexible. To overcome these restrictions, an enhanced pattern representation is designed which includes ontological concepts, neighboring-tree structures and soft constraints. An information-extraction inference engine based on hypothesis generation and conflict resolution is implemented. The proposed technique is successfully applied to an information extraction system for the Chinese-language query front-end of a job-recruitment search engine.
Web data extraction obtains valuable data from the tremendous information resource of the World Wide Web according to pre-defined patterns, processing and classifying the data on the Web. A formalization of the Web data extraction procedure is presented, together with a description of the crawling and extraction algorithms. Based on this formalization, an XML-based page structure description language, TIDL, is introduced, including its object model, HTML object reference model and tag definitions. Finally, a Web data gathering and querying application based on Internet agent technology, named the Web Integration Services Kit (WISK), is described.
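Extraction against a pre-defined pattern can be sketched with a named-group regular expression standing in for a TIDL page description (the page markup and field names below are invented):

```python
import re

def extract_by_pattern(html, pattern):
    """Pull structured records out of a page according to a pre-defined
    pattern; a minimal stand-in for TIDL-driven extraction."""
    return [m.groupdict() for m in re.finditer(pattern, html)]

page = '<tr><td>Widget</td><td>9.99</td></tr><tr><td>Gadget</td><td>4.50</td></tr>'
pat = r'<tr><td>(?P<name>[^<]+)</td><td>(?P<price>[^<]+)</td></tr>'
print(extract_by_pattern(page, pat))
```

A real TIDL description would additionally model the page's object hierarchy rather than matching flat text, but the record-per-pattern-match idea is the same.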
Satellite remote sensing data are usually used to analyze the spatial distribution patterns of geological structures and generally serve as a significant means for the identification of alteration zones. Based on Landsat Enhanced Thematic Mapper (ETM+) data, which have better spectral resolution (8 bands) and spatial resolution (15 m in the PAN band), synthesis processing techniques are presented to fulfill alteration information extraction: data preparation, vegetation indices and band ratios, and expert classifier-based classification. These techniques have been implemented in the MapGIS-RSP software (version 1.0), developed by Wuhan Zondy Cyber Technology Co., Ltd., China. In an application to extracting alteration information in the Zhaoyuan gold mines, Shandong Province, China, several hydrothermally altered zones (including two new sites) were found after satellite imagery interpretation coupled with field surveys. It is concluded that these synthesis processing techniques are useful and applicable to a wide range of gold-mineralization alteration information extraction.
In this paper, we present an open Python procedure, with a Jupyter notebook, for the data extraction and vectorization of geophysical exploration profiles. Constrained by observation routes and traffic conditions, geophysical exploration profiles tend to bend along curved roads for ease of observation; however, the profile must be projected onto a straight line for data processing and analysis. After projection, the true position of the obtained crustal structure is no longer known. Nonetheless, when the results are used as an initial constraint for other geophysical inversions, such as gravity inversion, we need the true position of the data rather than the distance to the starting point. We solve this problem by profile vectorization and reprojection. The method can be used to extract data from various geophysical exploration profiles, such as seismic reflection profiles and gravity profiles.
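The core of the reprojection step is mapping an along-profile distance back to a geographic position on the bent route; a minimal sketch (with an invented route, not the paper's notebook code) looks like this:

```python
import math

def cumulative_distances(points):
    """Along-route distance of each vertex of a (possibly bent) profile."""
    dists = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    return dists

def reproject(points, s):
    """Recover the true (x, y) position of a sample taken at distance `s`
    along the profile, by linear interpolation between route vertices."""
    dists = cumulative_distances(points)
    for i in range(len(dists) - 1):
        if dists[i] <= s <= dists[i + 1]:
            t = (s - dists[i]) / (dists[i + 1] - dists[i])
            (x0, y0), (x1, y1) = points[i], points[i + 1]
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    raise ValueError('distance lies outside the profile')

route = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]   # a bent survey route
print(reproject(route, 6.5))
```

Running this over every sample's along-profile distance vectorizes the projected result back onto the true route, which is exactly what a downstream gravity inversion needs as a positional constraint.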
Based on night light data, urban area data and economic data of the Wuhan urban agglomeration from 2009 to 2015, we use the spatial correlation dimension, spatial autocorrelation analysis and the weighted standard deviation ellipse to identify the general and dynamic evolution characteristics of the urban spatial pattern and the pattern of economic disparity. The results show that between 2009 and 2013, the Wuhan urban agglomeration expanded gradually from northwest to southeast, with dynamic evolution "along the river and the road". The spatial structure is pronounced, forming a "core-periphery" pattern. The development of the Wuhan urban agglomeration is clearly unbalanced in economic-geographic space, showing a tendency of "one prominent city, stronger in the west and weaker in the east". The contrast within the Wuhan urban agglomeration gradually decreases. Wuhan city and its surrounding areas, as well as the cities along the Yangtze River, show stronger economic growth. However, the relative development rate of the Wuhan city area is still far higher than that of other cities and counties.
Funding: supported by the National Natural Science Foundation of China (Project No. 72301293).
Funding: supported by the National Natural Science Foundation of China (22178190).
文摘In response to the lack of reliable physical parameters in the process simulation of the butadiene extraction,a large amount of phase equilibrium data were collected in the context of the actual process of butadiene production by acetonitrile.The accuracy of five prediction methods,UNIFAC(UNIQUAC Functional-group Activity Coefficients),UNIFAC-LL,UNIFAC-LBY,UNIFAC-DMD and COSMO-RS,applied to the butadiene extraction process was verified using partial phase equilibrium data.The results showed that the UNIFAC-DMD method had the highest accuracy in predicting phase equilibrium data for the missing system.COSMO-RS-predicted multiple systems showed good accuracy,and a large number of missing phase equilibrium data were estimated using the UNIFAC-DMD method and COSMO-RS method.The predicted phase equilibrium data were checked for consistency.The NRTL-RK(non-Random Two Liquid-Redlich-Kwong Equation of State)and UNIQUAC thermodynamic models were used to correlate the phase equilibrium data.Industrial device simulations were used to verify the accuracy of the thermodynamic model applied to the butadiene extraction process.The simulation results showed that the average deviations of the simulated results using the correlated thermodynamic model from the actual values were less than 2%compared to that using the commercial simulation software,Aspen Plus and its database.The average deviation was much smaller than that of the simulations using the Aspen Plus database(>10%),indicating that the obtained phase equilibrium data are highly accurate and reliable.The best phase equilibrium data and thermodynamic model parameters for butadiene extraction are provided.This improves the accuracy and reliability of the design,optimization and control of the process,and provides a basis and guarantee for developing a more environmentally friendly and economical butadiene extraction process.
Abstract: An algorithm named DPP is presented. In it, a new model based on the concept of irregularity degree is established to evaluate the regularity of cells. The algorithm derives the structural regularity of cells by exploiting the signal flow of the circuit, and then converts the bit-slice structure into parallel constraints to enable the Q place algorithm. The design flow and the main algorithms are introduced. Finally, satisfactory experimental results of the tool, compared with the Cadence placement tool SE, are discussed.
Funding: Supported by the Knowledge Innovation Program of the Chinese Academy of Sciences and the National High-Tech R&D Program of China (2008BAK49B05).
Abstract: More and more web pages apply AJAX (Asynchronous JavaScript and XML) for its rich interactivity and incremental communication. It is observed that AJAX contents, which cannot be seen by traditional crawlers, are generally well structured and belong to a specific domain. Extracting structured data from AJAX contents and annotating their semantics is very significant for further applications. In this paper, a structured AJAX data extraction method for the agricultural domain based on an agricultural ontology is proposed. Firstly, Crawljax, an open AJAX crawling tool, was overridden to explore and retrieve the AJAX contents; secondly, the retrieved contents were partitioned into items and then classified in combination with the agricultural ontology, with HTML tags and punctuation used to segment the retrieved contents into entity items. Finally, the entity items were clustered and semantic annotations were assigned to the clustering results according to the agricultural ontology. Experimental evaluation proved the proposed approach effective in resource exploration, entity extraction, and semantic annotation.
Funding: Dr. Steve Jones, Scientific Advisor of the Canon Foundation for Scientific Research (7200 The Quorum, Oxford Business Park, Oxford OX4 2JZ, England). The Canon Foundation for Scientific Research funded the UPC 2013 tuition fees of the corresponding author while she was writing this article.
Abstract: In computational physics, proton transfer phenomena can be viewed as pattern classification problems based on a set of input features, allowing classification of the proton motion into two categories: transfer 'occurred' and transfer 'not occurred'. The goal of this paper is to evaluate the use of artificial neural networks in the classification of proton transfer events, based on a feed-forward back-propagation neural network used as a classifier to distinguish between the two transfer cases. In this paper, we use a newly developed data mining and pattern recognition tool for automating, controlling, and charting the output data of an existing Empirical Valence Bond code. The study analyzes the need for pattern recognition in aqueous proton transfer processes and how the error back-propagation learning approach (multilayer perceptron algorithms) can be satisfactorily employed in the present case. We present a tool for pattern recognition and validate the code on a real physical case study. The results of applying the artificial neural network methodology to proton transfer patterns based upon selected physical properties (e.g., temperature, density) show the ability of the network to learn patterns corresponding to properties of the aqueous environments, which in turn proves fully compatible with previous proton transfer studies.
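As a minimal, hedged illustration of the two-class idea (not the paper's actual multilayer network), a single logistic unit trained by gradient descent on one input feature can already separate 'occurred' from 'not occurred' when the classes are linearly separable; all names below are invented for the sketch:

```python
import math

def train_logistic(xs, ys, lr=0.5, epochs=200):
    """Tiny one-feature logistic classifier trained by gradient descent,
    a minimal stand-in for a feed-forward back-propagation network."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # sigmoid activation
            err = p - y            # gradient of the cross-entropy loss
            w -= lr * err * x      # back-propagate into the weight
            b -= lr * err          # ... and the bias
    return w, b

def predict(w, b, x):
    """Class label: 1 = transfer occurred, 0 = not occurred."""
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5 else 0
```

Training on a separable toy feature (negative values labelled 0, positive labelled 1) yields a classifier that assigns unseen values to the correct side of the boundary.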
Funding: Supported by the National Grand Fundamental Research "973" Program of China (2004CB318109), the National High-Technology Research and Development Plan of China (2006AA01Z452) and the National Information Security "242" Program of China (2005C39).
Abstract: Anomaly detection has been an active research topic in the field of network intrusion detection for many years. A novel method is presented for anomaly detection based on system calls into the kernels of Unix or Linux systems. The method uses data mining techniques to model the normal behavior of a privileged program, and uses a variable-length pattern matching algorithm to compare current behavior with historic normal behavior, which is more suitable for this problem than the fixed-length pattern matching algorithm proposed by Forrest et al. At the detection stage, the particularity of the audit data is taken into account, and two alternative schemes can be used to distinguish between normal behavior and intrusions. The method balances computational efficiency and detection accuracy and is especially applicable to on-line detection. Its performance is evaluated on a typical testing data set, and the results show that it is significantly better than the anomaly detection method based on hidden Markov models proposed by Yan et al. and the method based on fixed-length patterns proposed by Forrest and Hofmeyr. The novel method has been applied in practical host-based intrusion detection systems and achieved high detection performance.
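The pattern-database idea behind such system-call methods can be sketched in a few lines. This is a simplification, not the paper's algorithm: it collects variable-length subsequences from normal traces and scores a new trace by the fraction of its windows absent from that database (function names are invented):

```python
def normal_patterns(traces, max_len=3):
    """Collect variable-length subsequences (length 1..max_len) observed
    in normal system-call traces: the 'normal behavior' database."""
    pats = set()
    for t in traces:
        for n in range(1, max_len + 1):
            for i in range(len(t) - n + 1):
                pats.add(tuple(t[i:i + n]))
    return pats

def anomaly_score(trace, pats, max_len=3):
    """Fraction of windows in the trace not found in the normal database;
    0.0 means entirely normal, values near 1.0 suggest an intrusion."""
    windows = [tuple(trace[i:i + max_len])
               for i in range(len(trace) - max_len + 1)]
    misses = sum(1 for w in windows if w not in pats)
    return misses / len(windows) if windows else 0.0
```

A trace replaying only known behavior scores 0.0, while a trace containing unseen call sequences (e.g., an injected `exec`) scores high, which is the signal a detector would threshold.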
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41571146) and the China Postdoctoral Science Foundation (No. 2019M651784).
Abstract: The increasing availability of data in the urban context (e.g., mobile phone, smart card and social media data) allows us to study urban dynamics at much finer temporal resolutions (e.g., diurnal urban dynamics). Mobile phone data, for instance, are found to be a useful data source for extracting diurnal human mobility patterns and for understanding urban dynamics. While previous studies often use call detail record (CDR) data, this study deploys aggregated network-driven mobile phone data that may reveal human mobility patterns more comprehensively and can mitigate some of the privacy concerns raised by mobile phone data usage. We first propose an analytical framework for characterizing and classifying urban areas based on their temporal activity patterns extracted from mobile phone data. Specifically, urban areas' diurnal spatiotemporal signatures of human mobility patterns are obtained from longitudinal mobile phone data, and urban areas are then classified based on the obtained signatures. The classification provides insights into city planning and development. Using the proposed framework, a case study was carried out in the city of Wuhu, China to understand its urban dynamics. The empirical study suggests that human activities in Wuhu are highly concentrated at the Traffic Analysis Zone (TAZ) level. This large portion of local activities suggests that development and planning strategies different from those used by metropolitan Chinese cities should be applied in Wuhu. The article concludes with a discussion of several common challenges associated with using network-driven mobile phone data, which should be addressed in future studies.
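The signature-based classification step can be illustrated with a nearest-prototype sketch. The prototype labels, the cosine similarity measure, and the function names here are assumptions for illustration, not the paper's actual classifier:

```python
import math

def cosine(u, v):
    """Cosine similarity between two diurnal activity profiles."""
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u)) *
           math.sqrt(sum(b * b for b in v)))
    return num / den if den else 0.0

def classify_zone(signature, prototypes):
    """Assign a zone's diurnal activity signature (e.g., hourly activity
    counts) to the most similar labelled prototype pattern."""
    return max(prototypes, key=lambda k: cosine(signature, prototypes[k]))
```

For example, a zone active at night and in the early morning matches a "residential"-shaped prototype rather than a midday-peaked "commercial" one.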
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2018XD004).
Abstract: The distance-based outlier detection method detects implied outliers by calculating the distances between points in a dataset, but its computational complexity is particularly high when processing multidimensional datasets. In addition, traditional outlier detection methods do not consider the frequency of subset occurrence, so the detected outliers do not fit the definition of outliers (i.e., rarely appearing). Pattern mining-based outlier detection approaches have solved this problem, but the importance of each pattern is not taken into account in the outlier detection process, so the detected outliers cannot truly reflect the actual situation. Aimed at these problems, a two-phase minimal weighted rare pattern mining-based outlier detection approach, called MWRPM-Outlier, is proposed to effectively detect outliers on a weighted data stream. In particular, a method called MWRPM is proposed in the pattern mining phase to quickly mine minimal weighted rare patterns, and two deviation factors are then defined in the outlier detection phase to measure the abnormal degree of each transaction on the weighted data stream. Experimental results show that the proposed MWRPM-Outlier approach has excellent outlier detection performance and that the MWRPM approach outperforms alternatives in weighted rare pattern mining.
Funding: Supported in part by the National Key Research and Development Program of China (2019YFB2103200), NSFC (61672108), the Open Subject Funds of the Science and Technology on Information Transmission and Dissemination in Communication Networks Laboratory (SKX182010049), the Fundamental Research Funds for the Central Universities (5004193192019PTB-019), and the Industrial Internet Innovation and Development Project 2018 of China.
Abstract: The development of cloud computing and virtualization technology has brought great challenges to the reliability of data center services. Data centers typically contain a large number of compute and storage nodes which may fail and affect the quality of service. Failure prediction is an important means of ensuring service availability. Predicting node failure in cloud-based data centers is challenging because the observed failure symptoms have complex characteristics, and imbalance between failure samples and normal samples is widespread, resulting in inaccurate failure prediction. Targeting these challenges, this paper proposes a novel failure prediction method, FP-STE (Failure Prediction based on Spatio-Temporal feature Extraction). Firstly, an improved recurrent neural network, HW-GRU (improved GRU based on HighWay network), and a convolutional neural network (CNN) are used to extract the temporal and spatial features of multivariate data respectively, increasing the discrimination of different types of failure symptoms and thus improving prediction accuracy. The intermediate results of the two models are then added as features into SCS-XGBoost to predict the possibility and the precise type of node failure in the future. SCS-XGBoost is an ensemble learning model improved by an integrated strategy of oversampling and cost-sensitive learning. Experimental results based on real data sets confirm the effectiveness and superiority of FP-STE.
Funding: Supported by the National Natural Science Foundation of China (Contract Nos. 11505225 & 11675216), the Foundation of ASIPP (Contract No. DSJJ-15-GC03) and the Key Program of Research and Development of Hefei Science Center, CAS (2016HSC-KPRD002).
Abstract: A 16 kV/20 A power supply was developed for the extraction grid of the prototype radio frequency (RF) ion source of a neutral beam injector. To acquire the state signals of the extraction grid power supply (EGPS) and control its operation, a data acquisition and control system has been developed. The system mainly consists of an interlock protection circuit board, a photoelectric conversion circuit, optical fibers, an industrial Compact Peripheral Component Interconnect (CPCI) computer and a host computer. The human-machine interface of the host computer delivers commands and data to the program on the CPCI computer, and offers a convenient client for setting parameters and displaying EGPS status. The CPCI computer acquires the status of the power supply, and the system can turn off the EGPS quickly when EGPS faults occur. The system has been applied to the EGPS of the prototype RF ion source, and test results show that it meets the operational requirements of the prototype RF ion source.
Abstract: A novel local binary pattern-based reversible data hiding (LBP-RDH) technique is suggested to maintain a fair balance between perceptual transparency and hiding capacity. During embedding, the image is divided into 3×3 blocks. Then, using the LBP-based image descriptor, the LBP code for each block is computed. Next, the obtained LBP codes are XORed with the embedding bits and concealed in the respective blocks using the proposed pixel readjustment process. Further, each cover image (CI) pixel produces two different stego-image pixels. Likewise, during extraction, the CI pixels are restored without the loss of a single bit of information. The outcome of the proposed technique with respect to perceptual transparency measures, such as peak signal-to-noise ratio and structural similarity index, is found to be superior to that of some recent state-of-the-art techniques. In addition, the proposed technique shows excellent resilience to various stego-attacks, such as pixel difference histogram analysis as well as regular and singular analysis. Besides, the out-of-boundary pixel problem, which persists in most contemporary data hiding techniques, has been successfully addressed.
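The first step of the embedding pipeline, computing the LBP code of a 3×3 block, can be sketched in a few lines. The clockwise neighbour ordering used here is one common convention; the paper may use a different ordering or comparison rule:

```python
def lbp_code(block):
    """8-bit LBP code of a 3x3 block: each neighbour (clockwise from the
    top-left corner) contributes a 1-bit if it is >= the centre pixel."""
    c = block[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2),
             (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, col) in enumerate(order):
        if block[r][col] >= c:
            code |= 1 << (7 - bit)   # most significant bit first
    return code
```

A flat block yields 255 (all neighbours equal the centre), a bright isolated centre yields 0, and each lit neighbour contributes exactly one bit, which is the 8-bit code the scheme XORs with the embedding bits.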
Abstract: With the rapid global development of the Internet since the 21st century, the amount of data has increased exponentially. Data helps improve people's livelihoods and working conditions, as well as learning efficiency. Therefore, data extraction, analysis, and processing have become a hot issue for people from all walks of life. Traditional recommendation algorithms still have problems such as inaccuracy, low diversity, and poor performance. To solve these problems and improve the accuracy and diversity of recommendation algorithms, this research combines convolutional neural networks (CNN) and an attention model to design a recommendation algorithm based on a neural network framework. Through the text convolutional network, the input layer of the CNN is transformed into two channels: a static one and a non-static one. Meanwhile, the self-attention mechanism weights the most relevant features so that data can be better processed and feature extraction becomes more accurate. The recommendation algorithm combines the CNN and the attention mechanism, and divides the embedding layer into user information feature embedding and data name feature extraction embedding. It obtains data name features through a convolution kernel; the top pooling layer then yields a fixed-length vector, and the attention layer captures the characteristics of the data type. Experimental results show that the proposed recommendation algorithm combining CNN and the attention mechanism performs better in data extraction than the traditional CNN algorithm and other currently popular recommendation algorithms, with excellent accuracy and robustness.
Abstract: Nowadays most cloud applications process large amounts of data to provide the desired results. In the Internet environment, enterprise network advertising and marketing plans need partner sites to be selected as carriers and publishers. Websites present enterprise marketing solutions to users in a variety of ways, through static pages, dynamic pages, floating windows, ad links, and active push. When users access these web pages, eye-catching and concentration effects attract them to read or click the pages again, giving them a detailed and comprehensive understanding of the marketing plan, which in turn affects their real purchase decisions. Therefore, we combine the cloud environment with search engine optimization techniques, and the results show that our method outperforms other approaches.
Funding: Supported by the Projects of International Cooperation and Exchange of the National Natural Science Foundation of China (51561135003), the Key Project of the National Natural Science Foundation of China (51338003) and the Scientific Research Foundation of the Graduate School of Southeast University (YBJJ1842).
Abstract: In order to explore the travel characteristics and space-time distribution of different groups of bikeshare users, an online analytical processing (OLAP) tool called a data cube was used for treating and displaying multi-dimensional data. We extended and modified the traditional three-dimensional data cube into four dimensions, space, date, time, and user, each with a user-specified hierarchy, and took transaction numbers and travel time as two quantitative measures. The results suggest that there are two obvious transaction peaks during the morning and afternoon rush hours on weekdays, while the volume at weekends is distributed approximately evenly. Bad weather conditions significantly restrict bikeshare usage. Besides, seamless smartcard users generally take longer trips than exclusive smartcard users, and non-native users ride faster than native users. These findings not only support the applicability and efficiency of the data cube in visualizing massive smartcard data, but also raise equity concerns among bikeshare users with different demographic backgrounds.
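The four-dimensional cube idea can be sketched as a dictionary keyed by any chosen subset of the dimensions, with the two measures (transaction count and total travel time) accumulated per cell. The record field names below are illustrative, not the paper's actual schema:

```python
from collections import defaultdict

def build_cube(records, dims):
    """Aggregate transaction count and total travel time over a chosen
    subset of the four dimensions (space, date, time, user)."""
    cube = defaultdict(lambda: [0, 0.0])
    for rec in records:
        key = tuple(rec[d] for d in dims)   # one cell per dimension combo
        cell = cube[key]
        cell[0] += 1                        # number of transactions
        cell[1] += rec["travel_time"]       # accumulated travel time
    return dict(cube)
```

Rolling up along the `user` dimension alone, for instance, collapses space, date and time and yields per-user-group totals, which is exactly the kind of slice used to compare seamless and exclusive smartcard users.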
Funding: Supported by the National Key Program for Developing Basic Sciences under Grant 2012CB417202, the National Natural Science Foundation of China under Grants No. 41175038, No. 41305088 and No. 41075023, the Meteorological Special Project "Radar network observation technology and QC", the CMA Key Project "Radar Operational Software Engineering", the Chinese Academy of Meteorological Sciences Basic Scientific Operational Project "Observation and retrieval methods of micro-physics and dynamic parameters of cloud and precipitation with multi-wavelength remote sensing", and the Project of the State Key Laboratory of Severe Weather under Grant 2012LASW-B04.
Abstract: A variety of faulty radar echoes may cause serious problems in radar data applications, especially radar data assimilation and quantitative precipitation estimation. In this study, the "test pattern" caused by test signals or radar hardware failures in CINRAD (China New Generation Weather Radar) SA and SB operational observations is investigated. In order to distinguish the test pattern from other types of radar echoes, such as precipitation, clear air and other non-meteorological echoes, five feature parameters are proposed: the effective reflectivity data percentage (Rz), the velocity RF (range folding) data percentage (RRF), the missing velocity data percentage (RM), the averaged along-azimuth reflectivity fluctuation (RNr,z) and the averaged along-beam reflectivity fluctuation (RNa,z). Based on the fuzzy logic method, a test pattern identification algorithm is developed, and statistical results over all the different kinds of radar echoes indicate the performance of the algorithm. Analyses of two typical cases with heavy precipitation echoes located inside the test pattern are performed. The statistical results show that the test pattern identification algorithm performs well, since the test pattern is recognized in most cases. Moreover, the algorithm can effectively remove the test pattern signal while retaining strong precipitation echoes in heavy rainfall events.
Abstract: Traditional pattern representations in information extraction lack the ability to represent domain-specific concepts and are therefore devoid of flexibility. To overcome these restrictions, an enhanced pattern representation is designed which includes ontological concepts, neighboring-tree structures and soft constraints. An information-extraction inference engine based on hypothesis generation and conflict resolution is implemented. The proposed technique is successfully applied to an information extraction system for the Chinese-language query front-end of a job-recruitment search engine.
Funding: Contents discussed in this paper are part of a key project, No. 2000-A31-01-04, sponsored by the Ministry of Science and Technology of P.R. China.
Abstract: Web data extraction obtains valuable data from the tremendous information resources of the World Wide Web according to pre-defined patterns, processing and classifying the data on the Web. A formalization of the procedure of Web data extraction is presented, as well as a description of the crawling and extraction algorithms. Based on this formalization, an XML-based page structure description language, TIDL, is proposed, including the object model, the HTML object reference model and the definition of tags. Finally, a Web data gathering and querying application based on Internet agent technology, named the Web Integration Services Kit (WISK), is described.
Funding: Supported by the Research Foundation for Outstanding Young Teachers, China University of Geosciences (Wuhan) (Nos. CUGQNL0628, CUGQNL0640), the National High-Tech Research and Development Program (863 Program) (No. 2001AA135170) and the Postdoctoral Foundation of the Shandong Zhaojin Group Co. (No. 20050262120).
Abstract: Satellite remote sensing data are usually used to analyze the spatial distribution pattern of geological structures and generally serve as a significant means for the identification of alteration zones. Based on Landsat Enhanced Thematic Mapper (ETM+) data, which have better spectral resolution (8 bands) and spatial resolution (15 m in the PAN band), synthesis processing techniques are presented to accomplish alteration information extraction: data preparation, vegetation indices and band ratios, and expert classifier-based classification. These techniques have been implemented in the MapGIS-RSP software (version 1.0), developed by the Wuhan Zondy Cyber Technology Co., Ltd., China. In an application to extracting alteration information in the Zhaoyuan (招远) gold mines, Shandong (山东) Province, China, several hydrothermally altered zones (including two new sites) were found after satellite imagery interpretation coupled with field surveys. It is concluded that these synthesis processing techniques are useful and applicable to a wide range of gold-mineralization alteration information extraction tasks.
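The band-ratio step can be sketched as follows. The thresholds and the combination rule below are hypothetical placeholders for the expert classifier, not values from the paper; real alteration mapping would use calibrated ratios (e.g., between specific ETM+ bands) and field-validated cut-offs:

```python
def band_ratio(b_num, b_den):
    """Element-wise ratio of two co-registered bands (flat pixel lists);
    zero-denominator pixels yield 0.0 to avoid division errors."""
    return [n / d if d else 0.0 for n, d in zip(b_num, b_den)]

def alteration_mask(ratio, ndvi, ratio_thresh=2.0, ndvi_thresh=0.4):
    """Flag pixels whose alteration-sensitive band ratio is high while
    vegetation cover (NDVI) is low -- a toy expert-classifier rule."""
    return [r > ratio_thresh and v < ndvi_thresh
            for r, v in zip(ratio, ndvi)]
```

Masking out vegetated pixels before thresholding the ratio mirrors the role of the vegetation-index step in the processing chain: it suppresses false positives where plant cover, not mineral alteration, drives the spectral response.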
Funding: Supported by Science for Earthquake Resilience of the China Earthquake Administration (XH17022), the National Natural Science Foundation of China (Grant No. U1939204, No. 41204014), the Scientific Research Fund of the Institute of Seismology and the Institute of Crustal Dynamics, China Earthquake Administration (Grant No. IS20146141), and the National Key Research and Development Plan (Grant No. 2017YFC1500204).
Abstract: In this paper, we present an open Python procedure, with a Jupyter notebook, for data extraction and vectorization of geophysical exploration profiles. Constrained by observation routes and traffic conditions, geophysical exploration profiles tend to follow curved roads for ease of observation; however, the data must be projected onto a straight line for processing and analysis. After projection, the true position of the obtained crustal structure is no longer known. Nonetheless, when the results are used as initial constraint conditions for other geophysical inversions, such as gravity inversion, we need the true position of the data rather than the distance to the starting point. We solve this problem by profile vectorization and reprojection. The method can be used to extract data from various geophysical exploration profiles, such as seismic reflection profiles and gravity profiles.
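The reprojection idea can be sketched as follows, assuming a simplified version of the notebook's procedure: compute the cumulative along-profile distance at each vertex of the curved survey line, then map a distance on the straightened profile back to true (x, y) coordinates by linear interpolation between vertices:

```python
import math

def cumulative_distance(points):
    """Along-profile distance of each vertex of a curved survey line."""
    dist = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dist.append(dist[-1] + math.hypot(x1 - x0, y1 - y0))
    return dist

def reproject(points, dist, s):
    """Map distance s along the straightened profile back to the true
    (x, y) position on the curved line by linear interpolation."""
    for i in range(1, len(dist)):
        if s <= dist[i]:
            t = (s - dist[i - 1]) / (dist[i] - dist[i - 1])
            x0, y0 = points[i - 1]
            x1, y1 = points[i]
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return points[-1]   # beyond the last vertex: clamp to the endpoint
```

For an L-shaped route, a structure found 5 km along the straightened profile is restored to its true position partway up the second leg, which is the coordinate a subsequent gravity inversion would need.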
Abstract: Based on night light data, urban area data, and economic data of the Wuhan Urban Agglomeration from 2009 to 2015, we use the spatial correlation dimension, spatial autocorrelation analysis and the weighted standard deviation ellipse to identify the general and dynamic evolution characteristics of the urban spatial pattern and the economic disparity pattern. The results show that between 2009 and 2013, the Wuhan Urban Agglomeration expanded gradually from northwest to southeast and presented the dynamic evolution feature of growth "along the river and the road". The spatial structure is obvious, forming a "core-periphery" pattern. The development of the Wuhan Urban Agglomeration is markedly unbalanced in economic-geographic space, presenting a development tendency of "one prominent core, stronger in the west and weaker in the east". The contrast within the Wuhan Urban Agglomeration gradually decreases. Wuhan and its surrounding areas, as well as the cities along the Yangtze River, have stronger economic growth. However, the relative development rate of the Wuhan city area is still far higher than that of other cities and counties.
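A reduced version of the weighted standard deviation ellipse can be sketched as follows: the economically weighted mean centre plus the weighted standard deviations along x and y. This is a simplification, since the full method also rotates the ellipse to the direction of maximum dispersion:

```python
import math

def weighted_center(xs, ys, ws):
    """Weighted mean centre of a set of cities (weights e.g. GDP)."""
    total = sum(ws)
    return (sum(w * x for x, w in zip(xs, ws)) / total,
            sum(w * y for y, w in zip(ys, ws)) / total)

def weighted_axes(xs, ys, ws):
    """Weighted standard deviations along x and y about the weighted
    centre -- the axis lengths of an un-rotated deviation ellipse."""
    cx, cy = weighted_center(xs, ys, ws)
    total = sum(ws)
    sx = math.sqrt(sum(w * (x - cx) ** 2 for x, w in zip(xs, ws)) / total)
    sy = math.sqrt(sum(w * (y - cy) ** 2 for y, w in zip(ys, ws)) / total)
    return sx, sy
```

Tracking how the centre and axes shift year by year is what reveals the "along the river and the road" drift of the agglomeration's economic weight.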