Abstract: To improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. The generalization performance of binary decision trees is analyzed on the basis of statistical learning theory, and an assessment rule is proposed. Guided by this assessment rule, the MMDT algorithm is implemented. The algorithm maps training examples from the original space to a high-dimensional feature space and constructs a decision tree there. In the feature space, a new decision node splitting criterion, the max-min rule, is used, and the margin of each decision node is maximized using a support vector machine to improve generalization performance. Experimental results show that the new learning algorithm substantially outperforms others such as C4.5 and OC1.
Funding: Supported by the National Natural Science Foundation of China (No. 60435020).
Abstract: Word sense disambiguation (WSD) decides the sense of an ambiguous word in a particular context. Most current studies on WSD use only a few ambiguous words as test samples, which limits their practical application. In this paper, we study WSD on a large-scale real-world corpus using two unsupervised learning algorithms, one based on a ±n-improved Bayesian model and one based on a Dependency Grammar (DG)-improved Bayesian model. The ±n-improved classifiers shrink the context window of the ambiguous word with a close-distance feature extraction method and reduce interference from useless features, which clearly improves accuracy, reaching 83.18% in the open test. The DG-improved classifier more effectively overcomes the noise effect present in the Naive Bayesian classifier. Experimental results show that this approach performs better on Chinese WSD, achieving an accuracy of 86.27% in the open test.
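The ±n context window can be illustrated with a small sketch. This is not the authors' method: their classifiers are unsupervised and their exact model is not given here, whereas the toy below trains a plain Naive Bayes classifier with add-one smoothing on two hand-labeled sentences. The names `window_features` and `WindowNaiveBayes`, the English example sentences, and the sense labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def window_features(tokens, index, n):
    """Return the words within ±n positions of tokens[index], excluding it."""
    left = tokens[max(0, index - n):index]
    right = tokens[index + 1:index + 1 + n]
    return left + right


class WindowNaiveBayes:
    """Toy Naive Bayes over ±n window words (supervised, for illustration only)."""

    def __init__(self, n=2):
        self.n = n
        self.sense_counts = Counter()            # how often each sense was seen
        self.word_counts = defaultdict(Counter)  # window-word counts per sense
        self.vocab = set()

    def train(self, tokens, index, sense):
        feats = window_features(tokens, index, self.n)
        self.sense_counts[sense] += 1
        self.word_counts[sense].update(feats)
        self.vocab.update(feats)

    def predict(self, tokens, index):
        feats = window_features(tokens, index, self.n)
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            # Add-one smoothing keeps unseen window words from zeroing the score.
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            score = math.log(count / total)
            for word in feats:
                score += math.log((self.word_counts[sense][word] + 1) / denom)
            if score > best_score:
                best, best_score = sense, score
        return best


# Hypothetical labeled examples for the ambiguous word "bank".
clf = WindowNaiveBayes(n=2)
clf.train("he sat on the bank of the river".split(), 4, "shore")
clf.train("she deposited cash at the bank this morning".split(), 5, "finance")
guess = clf.predict("the bank of the river flooded".split(), 1)
```

A smaller `n` keeps only the nearest neighbors of the ambiguous word, which is the sense in which a narrow window discards distant, less informative features.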
Abstract: The peripheral nervous system plays a major role in the maintenance of our physiology. Several peripheral nerves intimately regulate the state of the brain, spinal cord, and visceral systems. A new class of therapeutics, called bioelectronic medicines, is being developed to precisely regulate physiology and treat dysfunction using peripheral nerve stimulation. In this review, we first discuss new work using closed-loop bioelectronic medicine to treat upper limb paralysis. In contrast to open-loop bioelectronic medicines, closed-loop approaches trigger "on-demand" peripheral nerve stimulation in response to a change in function (e.g., during an upper limb movement or a change in cardiopulmonary state). We also outline our perspective on timing rules for closed-loop bioelectronic stimulation, interface features for non-invasively stimulating peripheral nerves, and machine learning algorithms to recognize disease events for closed-loop stimulation control. Although this emerging field will face several challenges, we look forward to future bioelectronic medicines that can autonomously sense changes in the body in order to provide closed-loop peripheral nerve stimulation and treat disease.
Funding: Supported by the National Natural Science Foundation of China (No. 60832009), the Natural Science Foundation of Beijing (No. 4102044), and the National Natural Science Foundation for Young Scholars of China (No. 61001115).
Abstract: Spectrum sensing is one of the key issues in cognitive radio networks. Most previous work concentrates on sensing the spectrum in a single spectrum band. In this paper, we propose a spectrum sensing sequence prediction scheme for cognitive radio networks with multiple spectrum bands, to decrease the spectrum sensing time and increase the throughput of secondary users. The scheme is based on recent advances in computational learning theory, which have shown that prediction is synonymous with data compression. A Ziv-Lempel data compression algorithm is used to design the spectrum sensing sequence prediction scheme, and the spectrum band usage history is used for the prediction. Simulation results show that the proposed scheme can significantly reduce the average sensing time and improve the system throughput.
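The link between Ziv-Lempel compression and prediction can be sketched as follows. The abstract does not say which Ziv-Lempel variant the scheme uses, so this example assumes LZ78-style incremental parsing: the usage history is parsed into phrases stored in a trie, and the next symbol is guessed as the one seen most often after the current phrase prefix. The class name `LZ78Predictor` and the band labels are hypothetical.

```python
from collections import defaultdict


class LZ78Predictor:
    """Next-symbol predictor built on an LZ78-style phrase trie (assumed variant)."""

    def __init__(self):
        self.children = [dict()]           # node id -> {symbol: child node id}
        self.counts = [defaultdict(int)]   # node id -> {symbol: times seen next}
        self.node = 0                      # current trie position; 0 is the root

    def update(self, symbol):
        """Feed one observed symbol, e.g. the band found idle in this slot."""
        self.counts[self.node][symbol] += 1
        child = self.children[self.node].get(symbol)
        if child is None:
            # The current phrase is new: store it and restart from the root.
            self.children.append({})
            self.counts.append(defaultdict(int))
            self.children[self.node][symbol] = len(self.children) - 1
            self.node = 0
        else:
            self.node = child

    def predict(self):
        """Return the most frequent follower of the current context, or None."""
        # Fall back to root statistics when this context has no followers yet.
        stats = self.counts[self.node] or self.counts[0]
        if not stats:
            return None
        return max(stats, key=stats.get)


# Hypothetical usage history: index of the band that was free in each slot.
history = [1, 2, 1, 2, 1, 2, 1, 2]
predictor = LZ78Predictor()
for band in history:
    predictor.update(band)
next_band = predictor.predict()
```

A secondary user could sense the predicted band first and move to the others only if the guess is wrong, which is one way a good predictor shortens the average sensing time.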
Funding: Specialized Research Fund for the Doctoral Program of Higher Education, China (No. 20060183041).
Abstract: Bayesian statistics assigns basic probabilities to singletons (single-element sets). The Dempster-Shafer evidence theory generalizes Bayesian statistics by assigning basic probabilities to subsets, in order to represent evidence and to develop evidential reasoning. This paper discusses the strengths of evidence theory. As an application of evidence theory, evidential reasoning in air battle systems is discussed. In the air battle system, evidential reasoning is applied to fuse multisensor information and identify the type of aircraft. The effectiveness of this fusion approach is evaluated using simulated data.
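Fusing two sources in Dempster-Shafer theory is usually done with Dempster's rule of combination, sketched below. Each source assigns mass to subsets of the hypotheses; masses of intersecting subsets are multiplied and pooled, and the mass that falls on empty intersections (the conflict) is normalized away. The function name `combine`, the two-class frame, and the sensor values are hypothetical and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two mass functions.

    Each mass function maps a frozenset of hypotheses to its basic probability.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # evidence that contradicts itself
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    norm = 1.0 - conflict
    return {subset: mass / norm for subset, mass in combined.items()}


# Hypothetical readings from two sensors about an aircraft's type.
FIGHTER, BOMBER = "fighter", "bomber"
EITHER = frozenset({FIGHTER, BOMBER})  # mass here expresses ignorance
radar = {frozenset({FIGHTER}): 0.6, EITHER: 0.4}
infrared = {frozenset({FIGHTER}): 0.5, frozenset({BOMBER}): 0.3, EITHER: 0.2}
fused = combine(radar, infrared)
```

Mass placed on `EITHER` is what a purely Bayesian assignment over singletons cannot express: a sensor can commit part of its belief to "fighter or bomber" without splitting it between the two.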
Abstract: In this paper, we present Real-Time Flow Filter (RTFF), a system that adopts a middle ground between coarse-grained volume anomaly detection and deep packet inspection. RTFF was designed with the goals of scaling to the high-volume data feeds that are common in large Tier-1 ISP networks and of providing rich, timely information on observed attacks. It is a software solution designed to run on off-the-shelf hardware platforms, and it incorporates a scalable data processing architecture along with lightweight analysis algorithms that make it suitable for deployment in large networks. RTFF also uses state-of-the-art machine learning algorithms to construct attack models that can be used to detect as well as predict attacks.
Abstract: This paper presents a new inductive learning algorithm, HGR (Version 2.0), based on the newly developed extension matrix theory. The basic idea is to partition the positive examples of a specific class in a given example set into consistent groups, where each group corresponds to a consistent rule that covers all the examples in the group and none of the negative examples. The paper then compares the performance of the HGR algorithm with other inductive algorithms, such as C4.5, OC1, HCV, and SVM. The authors selected 15 databases from the well-known UCI machine learning repository and also considered a real-world problem. Experimental results show that their method achieves higher accuracy and produces fewer rules than the other algorithms.
Abstract: This paper describes the implementation of an Information Systems (IS) capstone project management course that is a requirement for graduating seniors in an undergraduate Computer Information Systems (CIS) program at a regional university. The description provides a model that represents the culmination of students' academic training in an IS curriculum, which is part of a Bachelor of Business Administration (BBA) program in an accredited college of business. The course requires students to apply technical and business skills, as well as systems development and project management skills, while working on an actual IS project for an external sponsoring organization. The rationale for implementing this type of course includes the benefits it provides to the students, the project sponsors, and the IS department offering the course. Feedback from the course is used as an integral part of the CIS curriculum assessment process for accreditation purposes.
Abstract: In this paper, we report an in-depth analysis of, and research on, optimizing computer network structure based on a genetic algorithm and modified convex optimization theory. Machine learning methods are widely used in this setting, and one of their core problems is solving the underlying optimization problem. Unlike traditional batch algorithms, the stochastic gradient descent algorithm optimizes the loss of only a single sample point in each iteration, which can greatly reduce the memory overhead. The experiment illustrates the feasibility of our proposed approach.
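The contrast with batch methods can be shown on a deliberately simple problem. The sketch below fits a one-dimensional least-squares line with stochastic gradient descent: each update uses the loss of one sample only, so no per-batch gradient buffer is needed. It is unrelated to the network-structure model of the paper; the function name `sgd_linear_regression`, the learning rate, the epoch count, and the synthetic data are all assumptions chosen for illustration.

```python
import random


def sgd_linear_regression(samples, lr=0.1, epochs=500, seed=0):
    """Fit y ≈ w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a fresh random order
        for x, y in data:
            # Gradient of 0.5 * (w*x + b - y)**2 for this single sample only.
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


# Synthetic, noise-free data on the line y = 2x + 1 (hypothetical example).
points = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(11)]
w, b = sgd_linear_regression(points)
```

A batch method would sum the gradient over all of `points` before each update; here the working state is just `w`, `b`, and the current sample.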
Funding: Supported by the National Natural Science Foundation of China (21210003 and 81230076 to H.J., 81773634 to M.Z., and 81430084 to K.C.); the "Personalized Medicines-Molecular Signature-based Drug Discovery and Development" Strategic Priority Research Program of the Chinese Academy of Sciences (XDA12050201 to M.Z.); the National Key Research & Development Plan (2016YFC1201003 to M.Z.); and the National Basic Research Program (2015CB910304 to X.L.).
Abstract: Thanks to rapid improvements in computing power and the rapid development of computational chemistry and biology, computer-aided drug design techniques have been successfully applied in almost every stage of the drug discovery and development pipeline, speeding up research and reducing the cost and risk related to preclinical and clinical trials. Owing to the development of machine learning theory and the accumulation of pharmacological data, artificial intelligence (AI) technology, as a powerful data mining tool, has made its mark in various fields of drug design, such as virtual screening, activity scoring, quantitative structure-activity relationship (QSAR) analysis, de novo drug design, and in silico evaluation of absorption, distribution, metabolism, excretion, and toxicity (ADME/T) properties. Although it is still challenging to provide a physical explanation of AI-based models, AI has been a powerful force in drug discovery through its versatile frameworks. Recently, owing to their strong generalization ability and powerful feature extraction capability, deep learning methods have been employed to predict molecular properties and to generate desired molecules, which will further promote the application of AI technologies in the field of drug design.
Abstract: Fast prediction of permeability directly from images, enabled by image recognition neural networks, is a novel pore-scale modeling method with great potential. This article presents a framework that includes (1) generation of porous media samples, (2) computation of permeability via fluid dynamics simulations, (3) training of convolutional neural networks (CNN) with simulated data, and (4) validation against simulations. Comparison of the machine learning results with the ground truth suggests excellent predictive performance across a wide range of porosities and pore geometries, especially for those with dilated pores. Owing to such heterogeneity, the permeability cannot be estimated using the conventional Kozeny–Carman approach. Computational time was reduced by several orders of magnitude compared with fluid dynamics simulations. We found that, by including physical parameters known to affect permeability in the neural network, the physics-informed CNN generated better results than a regular CNN. However, the improvements vary with the implemented heterogeneity.
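For context, the Kozeny–Carman estimate can be written in a few lines. The sketch assumes the commonly quoted form for a bed of uniform spheres, k = φ³ d² / (180 (1 − φ)²), where φ is the porosity and d the grain diameter; the constant 180 and the single grain diameter belong to that textbook form and are not details taken from the article. The function name `kozeny_carman_permeability` is hypothetical.

```python
def kozeny_carman_permeability(porosity, grain_diameter):
    """Permeability from the sphere-pack Kozeny-Carman relation (assumed form).

    The result has the units of grain_diameter squared, e.g. m^2 for metres.
    """
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must lie strictly between 0 and 1")
    return porosity ** 3 * grain_diameter ** 2 / (180.0 * (1.0 - porosity) ** 2)


# Hypothetical sample: 35% porosity, 0.2 mm grains.
k_estimate = kozeny_carman_permeability(0.35, 2.0e-4)
```

The estimate depends only on bulk porosity and one length scale, so two samples with the same porosity but different pore geometry receive the same value, which is consistent with the abstract's remark that the approach fails for heterogeneous media.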
Funding: Partly supported by the Natural Science Foundation of China under Grant Nos. 71101100 and 70731160635; the New Teachers' Fund for Doctor Stations, Ministry of Education, under Grant No. 20110181120047; the Excellent Youth Fund of Sichuan University under Grant No. 2013SCU04A08; the China Postdoctoral Science Foundation under Grant Nos. 2011M500418, 2012T50148, and 2013M530753; the Frontier and Cross-innovation Foundation of Sichuan University under Grant No. skqy201352; the Soft Science Foundation of Sichuan Province under Grant No. 2013ZR0016; the Humanities and Social Sciences Youth Foundation of the Ministry of Education of China under Grant No. 11YJC870028; and the Self-determined Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE under Grant No. CCNU13F030.
Abstract: Accurate forecasting of a port's future container throughput is very important for its construction, upgrading, and operation management. This study proposes a transfer forecasting model guided by a discrete particle swarm optimization algorithm (TF-DPSO). It first uses a transfer learning technique to transfer related time series from the source domain to assist in modeling the target time series, and then constructs the forecasting model with a pattern matching method called analog complexing. Finally, the discrete particle swarm optimization algorithm is introduced to find the optimal match between the two important parameters in TF-DPSO. The container throughput time series of two important ports in China, Shanghai Port and Ningbo Port, are used for empirical analysis, and the results show the effectiveness of the proposed model.