Abstract: This paper describes the representation of rules, the rule-control strategy, and the existing problems in the English-Chinese machine translation (MT) system BT863 I. It then puts forward a decision-tree-based method for processing these rules. With this method, problems such as rule conflict and rule redundancy in BT863 I have been solved, and the efficiency of the MT system has been greatly improved. The method is also generally applicable to rule-based expert systems.
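The abstract does not say how BT863 I represents rules or how its decision tree is built, so the following is only a minimal sketch of what rule conflict and rule redundancy can mean in practice. The rule format (a name, a set of condition strings, and an action), the function `audit_rules`, and the sample rules are all hypothetical. Here a conflict is taken to mean identical conditions with different actions, and redundancy to mean an exact repeat of conditions and action.

```python
from collections import defaultdict


def audit_rules(rules):
    """Group rules by their condition set and report conflicts and repeats.

    rules: list of (name, conditions, action) tuples. This layout is an
    assumption made for illustration, not the BT863 I rule format.
    """
    by_conditions = defaultdict(list)
    for name, conditions, action in rules:
        # frozenset makes the order of conditions irrelevant
        by_conditions[frozenset(conditions)].append((name, action))

    conflicts, redundant = [], []
    for group in by_conditions.values():
        seen = {}  # action -> name of the first rule that produced it
        for name, action in group:
            if action in seen:
                redundant.append((seen[action], name))
            else:
                seen[action] = name
        if len(seen) > 1:  # same conditions, more than one distinct action
            conflicts.append(sorted(seen.values()))
    return conflicts, redundant


rules = [
    ("r1", ["cat=V", "obj=NP"], "translate_as_A"),
    ("r2", ["obj=NP", "cat=V"], "translate_as_B"),  # same conditions as r1
    ("r3", ["cat=N"], "translate_as_C"),
    ("r4", ["cat=N"], "translate_as_C"),            # exact repeat of r3
]
conflicts, redundant = audit_rules(rules)
print(conflicts)   # [['r1', 'r2']]
print(redundant)   # [('r3', 'r4')]
```

Organising rules in a tree keyed on their conditions makes both checks local, because rules with the same conditions end up at the same leaf.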
Funding: Supported by the National Social Science Foundation of China (No. 08BYY001) and the Worldwide Universities Network 2009 Research Mobility Programme.
Abstract: In machine translation (MT) practice, there is an urgent need to construct a set of Chinese-to-English aspect transferring rules that define the transferring conditions. An integrated feature set was used to generalize and justify the Chinese-to-English transferring rule of the 'ZHE' aspect (ZHE Rule). A ZHE classification model was built in this study. The impacts of the temporal, lexical aspectual, and syntactic feature sets on the accuracy of the ZHE Rule were tested, both individually and in combination. Over 600 misclassified corpus sentences were manually examined. A decision tree algorithm was used with 10-fold cross-validation. The main results are: (1) The ZHE Rule was generalized and justified to have a higher accuracy under two metrics: the precision rate and the area under the receiver operating characteristic curve (AUC). (2) The temporal, lexical aspectual, and syntactic feature sets make an integrated contribution to the accuracy of the ZHE Rule. The syntactic and temporal features have an impact on ZHE aspect derivations, while the lexical aspectual features are not predictive of ZHE aspect derivation. (3) When associated with active verbs, the ZHE aspect can denote a perfective situation. This study suggests that the temporal and syntactic features are the predictive ZHE aspect classification features and that the ZHE Rule, with an overall precision rate of 80.1%, is accurate enough to be further explored in MT practice. The decision tree, a machine learning method, can be applied to automatic aspect transferring in MT research and to aspectual interpretation in linguistic research.
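The evaluation protocol named in this abstract (10-fold cross-validation, precision, AUC) can be sketched in plain Python. This is a generic illustration under stated assumptions, not the study's code: the function names, the interleaved fold assignment, and the toy labels and scores are all made up for the example. AUC is computed as the fraction of positive-negative pairs that the scores rank correctly, with ties counted as half.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k interleaved folds over n items."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that are truly positive."""
    hits = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not hits:
        return 0.0
    return sum(1 for t in hits if t == positive) / len(hits)


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs both positive and negative examples")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels and classifier scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(round(precision(y_true, y_pred), 3))  # 0.667
print(round(roc_auc(y_true, scores), 3))    # 0.778
print(len(list(k_fold_indices(20, k=10))))  # 10
```

In a real run, a classifier would be fitted on each `train` split and scored on the matching `test` split, and the metrics would be averaged over the ten folds.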
Abstract: As the use of blockchain for digital payments continues to rise, it becomes susceptible to various malicious attacks. Successfully detecting anomalies within blockchain transactions is essential for bolstering trust in digital payments. However, anomaly detection in blockchain transaction data is challenging because illicit transactions occur infrequently. Although several studies have been conducted in the field, a limitation persists: the lack of explanations for the model’s predictions. This study seeks to overcome this limitation by integrating explainable artificial intelligence (XAI) techniques and anomaly rules into tree-based ensemble classifiers for detecting anomalous Bitcoin transactions. The Shapley additive explanation (SHAP) method, which is compatible with ensemble models, is employed to measure the contribution of each feature. Moreover, we present rules for interpreting whether a Bitcoin transaction is anomalous or not. Additionally, we introduce an under-sampling algorithm named XGBCLUS, designed to balance anomalous and non-anomalous transaction data. This algorithm is compared against other commonly used under-sampling and over-sampling techniques. Finally, the outcomes of various tree-based single classifiers are compared with those of stacking and voting ensemble classifiers. Our experimental results demonstrate that: (i) XGBCLUS enhances true positive rate (TPR) and receiver operating characteristic-area under curve (ROC-AUC) scores compared to state-of-the-art under-sampling and over-sampling techniques, and (ii) our proposed ensemble classifiers outperform traditional single tree-based machine learning classifiers in terms of accuracy, TPR, and false positive rate (FPR) scores.
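The abstract gives no details of XGBCLUS, so the sketch below does not attempt it. Instead it illustrates three generic building blocks the abstract refers to: plain random under-sampling (the kind of baseline XGBCLUS is compared against), hard majority voting over several classifiers, and the TPR and FPR metrics. All function names and the toy predictions are assumptions made for illustration.

```python
import random
from collections import Counter


def random_undersample(X, y, majority=0, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    A plain baseline, NOT the XGBCLUS algorithm named in the abstract.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label != majority]
    majority_idx = [i for i, label in enumerate(y) if label == majority]
    kept = rng.sample(majority_idx, min(len(minority_idx), len(majority_idx)))
    idx = sorted(minority_idx + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


def hard_vote(predictions):
    """Majority vote across classifiers; predictions is a list of label lists."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def tpr_fpr(y_true, y_pred, positive=1):
    """Return (true positive rate, false positive rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Toy imbalanced data (hypothetical): 8 normal rows, 2 anomalous rows.
X = list(range(10))
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_undersample(X, y)
print(Counter(y_bal))

# Toy predictions from three hypothetical tree-based classifiers.
y_true = [1, 1, 0, 0, 0]
clf_a = [1, 0, 0, 0, 1]
clf_b = [1, 1, 0, 1, 0]
clf_c = [0, 1, 0, 0, 0]
ensemble = hard_vote([clf_a, clf_b, clf_c])
print(tpr_fpr(y_true, clf_a))     # single classifier
print(tpr_fpr(y_true, ensemble))  # voting ensemble
```

In this toy case each classifier makes a different mistake, so the vote cancels them out. That is the intuition behind voting ensembles, not evidence for the paper's reported results.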
Funding: The Planning Program of Science and Technology of Hunan Province (No. 05JT1039)
Abstract: To overcome the limitation that rank learning algorithms cannot process complex data types with nominal attributes, a new rank learning algorithm is designed. In this decision-tree-based learning algorithm, the splitting rule of the decision tree is revised with a new definition of rank impurity. The resulting rank learning algorithm can be explained intuitively, and its theoretical basis is provided. The experimental results show that, in terms of average rank loss, the ranking tree algorithm outperforms the perceptron ranking and ordinal regression algorithms, and it also converges faster. The decision-tree-based rank learning algorithm is able to process categorical data and select relevant features.
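The abstract does not state the paper's definition of rank impurity, so the sketch below substitutes a simple stand-in: the mean absolute deviation of a node's ranks from their median. This choice, along with every function name and the toy data, is an assumption for illustration only. The sketch shows how a rank-aware impurity can drive the choice of a categorical split, and how average rank loss (the mean absolute difference between predicted and true ranks) is computed.

```python
from collections import defaultdict
from statistics import median


def average_rank_loss(true_ranks, predicted_ranks):
    """Mean absolute difference between true and predicted ranks."""
    total = sum(abs(t - p) for t, p in zip(true_ranks, predicted_ranks))
    return total / len(true_ranks)


def rank_impurity(ranks):
    """ASSUMED impurity: mean absolute deviation from the median rank.

    This is a stand-in, not the definition used in the paper.
    """
    m = median(ranks)
    return sum(abs(r - m) for r in ranks) / len(ranks)


def split_impurity(rows, ranks, attribute):
    """Weighted impurity after splitting on each value of a categorical attribute."""
    groups = defaultdict(list)
    for row, rank in zip(rows, ranks):
        groups[row[attribute]].append(rank)
    n = len(ranks)
    return sum(len(g) / n * rank_impurity(g) for g in groups.values())


def best_attribute(rows, ranks, attributes):
    """Pick the attribute whose split leaves the lowest weighted impurity."""
    return min(attributes, key=lambda a: split_impurity(rows, ranks, a))


# Toy data (hypothetical): two categorical attributes, ranks from 1 to 3.
rows = [
    {"colour": "red", "size": "S"},
    {"colour": "red", "size": "L"},
    {"colour": "blue", "size": "S"},
    {"colour": "blue", "size": "L"},
]
ranks = [1, 1, 3, 3]

print(split_impurity(rows, ranks, "colour"))          # 0.0
print(split_impurity(rows, ranks, "size"))            # 1.0
print(best_attribute(rows, ranks, ["colour", "size"]))  # colour
print(average_rank_loss([1, 2, 3], [1, 3, 1]))        # 1.0
```

Because the split groups rows by attribute value rather than by a numeric threshold, categorical attributes need no encoding, which matches the abstract's claim that the tree-based ranker handles categorical data directly.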