Abstract: Online banking fraud occurs whenever a criminal can seize accounts and transfer funds from an individual’s online bank account. Successfully preventing this requires the detection of as many fraudsters as possible, without producing too many false alarms. This is a challenge for machine learning owing to the extremely imbalanced data and complexity of fraud. In addition, classical machine learning methods must be extended, minimizing expected financial losses. Finally, fraud can only be combated systematically and economically if the risks and costs in payment channels are known. We define three models that overcome these challenges: machine learning-based fraud detection, economic optimization of machine learning results, and a risk model to predict the risk of fraud while considering countermeasures. The models were tested utilizing real data. Our machine learning model alone reduces the expected and unexpected losses in the three aggregated payment channels by 15% compared to a benchmark consisting of static if-then rules. Optimizing the machine-learning model further reduces the expected losses by 52%. These results hold with a low false positive rate of 0.4%. Thus, the risk framework of the three models is viable from a business and risk perspective.
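The idea of optimizing a fraud classifier for financial loss rather than for accuracy can be illustrated with a small sketch. This is not the authors' model: the cost structure (a fixed `review_cost` per flagged transaction, the full amount lost for every missed fraud) and all names and numbers are assumptions made for illustration only.

```python
def total_cost(threshold, transactions, review_cost):
    """Cost of flagging every transaction whose score is >= threshold.

    transactions: list of (score, amount, is_fraud) tuples (hypothetical data).
    Assumption: a flagged transaction costs one manual review and any fraud in it
    is stopped; an unflagged fraud costs its full amount.
    """
    cost = 0.0
    for score, amount, is_fraud in transactions:
        if score >= threshold:
            cost += review_cost
        elif is_fraud:
            cost += amount
    return cost


def best_threshold(transactions, review_cost):
    """Pick the score threshold with the lowest total cost on labelled data."""
    # Every observed score is a candidate; infinity means "flag nothing".
    candidates = sorted({score for score, _, _ in transactions}) + [float("inf")]
    return min(candidates, key=lambda t: total_cost(t, transactions, review_cost))


# Toy labelled transactions: (model score, amount, is_fraud).
history = [(0.9, 1000.0, True), (0.7, 50.0, False), (0.6, 20.0, True), (0.2, 30.0, False)]
threshold = best_threshold(history, review_cost=25.0)
```

In this toy data the cheapest policy reviews only the highest-scoring transaction: catching the small fraud would cost more in reviews than the fraud itself, which is the kind of trade-off an accuracy-only objective cannot see.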
Abstract: For aircraft manufacturing industries, the analysis and prediction of part machining error during the machining process are very important for controlling and improving part machining quality. In order to control machining error effectively, a method integrating multivariate statistical process control (MSPC) and stream of variations (SoV) is proposed. Firstly, machining error is modeled by multi-operation approaches for the part machining process. SoV is adopted to establish the mathematical model of the relationship between the error of upstream operations and the error of downstream operations. Here the error sources include not only the influence of upstream operations but also many other error sources. A standard SoV model and a predictive SoV model are built, depending on whether the operation has been completed, to satisfy different requirements during the part machining process. Secondly, the one-step ahead forecast error (OSFE) method is used to eliminate the autocorrelation of the sample data from the SoV model, and the T² control chart in MSPC is built to detect machining error according to the data characteristics of the above error model, which can judge whether the operation is out of control. If it is, feedback is sent to the operations. The error model is modified by adjusting the out-of-control operation, and it continues to be used to monitor operations. Finally, a machining instance containing two operations demonstrates the effectiveness of the machining error control method presented in this paper.
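As a rough illustration of the control-chart step, the sketch below computes Hotelling's T² statistic for a two-variable error vector, which is presumably the statistic behind the T² chart named in the abstract. It assumes a known in-control mean and covariance; the function names and the example control limit are hypothetical and do not come from the paper.

```python
def hotelling_t2(x, mean, cov):
    """Hotelling's T^2 = (x - mean)^T cov^{-1} (x - mean) for two variables.

    x, mean: pairs of floats; cov: 2x2 covariance matrix as nested pairs.
    The 2x2 inverse is written out by hand to keep the sketch dependency-free.
    """
    dx = x[0] - mean[0]
    dy = x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def out_of_control(x, mean, cov, limit):
    """Signal when the statistic exceeds the control limit."""
    return hotelling_t2(x, mean, cov) > limit


# Assumed limit: the 0.99 chi-square quantile with 2 degrees of freedom (about 9.21),
# which is appropriate only when the mean and covariance are treated as known.
signal = out_of_control((3.0, 4.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)), limit=9.21)
```

In a setting like the one described, `x` would be the one-step ahead forecast error of the SoV model rather than the raw measurement, so that autocorrelation does not inflate the alarm rate.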
Abstract: The goals of this paper are twofold: we describe common features in data sets from motor vehicle insurance companies and we investigate a general strategy which exploits the knowledge of such features. The results of the strategy are a basis to develop insurance tariffs. We use a nonparametric approach based on a combination of kernel logistic regression and ε-support vector regression, which both have good robustness properties. The strategy is applied to a data set from motor vehicle insurance companies.
Funding: Supported by the Institute for Information & Communications Technology Promotion under Grant No. R0101-16-0176, the Project of Core Technology Development for Human-Like Self-Taught Learning Based on Symbolic Approach.
Abstract: This paper describes experiments with Korean-to-Vietnamese statistical machine translation (SMT). Korean is a morphologically complex language without clear optimal word boundaries, which poses a major problem for translating into or from Korean. To solve this problem, we present a method for Korean morphological analysis that uses a pre-analyzed partial word-phrase dictionary (PWD). In addition, we build a Korean-Vietnamese parallel corpus for training SMT models by collecting text from multilingual magazines. We then apply this morphological analysis to the Korean sentences in the collected parallel corpus as a preprocessing step. The experimental results demonstrate a remarkable improvement in Korean-to-Vietnamese translation quality in terms of bilingual evaluation understudy (BLEU).
Funding: National Natural Science Foundation of China (No. 60803078); National High Technology Research and Development Programs of China (No. 2006AA010107, No. 2006AA010108).
Abstract: This paper proposes a method to incorporate syntax-based language models in phrase-based statistical machine translation (SMT) systems. The syntax-based language model used in this paper is based on link grammar, which is a highly lexical formalism. In order to apply language models based on link grammar in phrase-based models, the concept of linked phrases, an extension of the traditional phrases of phrase-based models, is introduced. Experiments were conducted, and the results showed that the use of syntax-based language models could greatly improve the performance of the phrase-based models.
Funding: The National High Technology Research and Development Program of China (No.200606010108.2006AA01Z150).
Abstract: A novel model based on structure alignments is proposed for statistical machine translation in this paper. The meta-structure and the sequence of meta-structures for a parse tree are defined. During the translation process, a parse tree is decomposed to deal with structure divergence, and the alignments can be constructed at different levels of recombination of meta-structure (RM). This method can perform structure mapping across sub-tree structures between languages. As a result, we obtain not only the translation for the target language but also the sequence of meta-structures of its parse tree at the same time. Experiments show that the model, in the framework of the log-linear model, has better generative ability and significantly outperforms Pharaoh, a phrase-based system.
Funding: Supported by the National Natural Science Foundation of China (No. 61303082); the Research Fund for the Doctoral Program of Higher Education of China (No. 20120121120046).
Abstract: Lexicalized reordering models are very important components of phrase-based translation systems. By examining the reordering relationships between adjacent phrases, conventional methods learn these models from the word-aligned bilingual corpus, while ignoring the effect of the number of adjacent bilingual phrases. In this paper, we propose a method to take the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure named the reordering graph to represent all phrase segmentations of a parallel sentence; the effect of the adjacent phrase number is then quantified in a forward-backward fashion and finally incorporated into the estimation of reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish data sets show that our approach significantly outperforms the baseline method.
Funding: The National Natural Science Foundation of China under Grant 52007193 and the 2115 Talent Development Program of China Agricultural University.
Abstract: As the new generation of artificial intelligence (AI) continues to evolve, weather big data and statistical machine learning (SML) technologies complement each other and are deeply integrated to significantly improve the processing and forecasting accuracy of fishery weather. Accurate fishery weather services play a crucial role in fishery production, serving as a great safeguard for economic benefits and personal safety, enabling fishermen to carry out fishery production better, and contributing to the sustainable development of the fishery industry. The objective of this paper is to offer an understanding of the present state of research and development in SML technology for simulating and forecasting fishery weather. Specifically, we analyze the current state of research and the technical features of SML in weather and summarize the applications of SML in the simulation and forecasting of fishery weather, which mainly include three aspects: fishery weather scenario generation, fishery weather forecasting, and fishery extreme weather warning. We also illustrate the main technical means and principles of SML technology. Finally, we summarize the most advanced SML fields and provide an outlook on their application value in the field of fishery weather.
Funding: Supported by the National Natural Science Foundation of China under Grant 52007193 and the 2115 Talent Development Program of China Agricultural University.
Abstract: New energy integration and flexible demand response make smart grid operation scenarios complex and changeable, which brings challenges to network planning. If every possible scenario is considered, solving the planning problem can become extremely time-consuming and difficult. This paper introduces statistical machine learning (SML) techniques to carry out multi-scenario based probabilistic power flow calculations and describes their application to the stochastic planning of distribution networks. The proposed SML includes linear regression, probability distribution, Markov chain, isoprobabilistic transformation, maximum likelihood estimator, stochastic response surface and center point method. Based on the above SML model, capricious weather, photovoltaic power generation, thermal load, power flow and uncertainty programming are simulated. Taking a 33-bus distribution system as an example, this paper compares the stochastic planning model based on SML with the traditional models published in the literature. The results verify that the proposed model greatly improves planning performance while meeting accuracy requirements. The case study also considers a realistic power distribution system operating under stressed conditions.
Funding: Supported by the State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources under Grant No. LAPS21016; the National Natural Science Foundation of China under Grant 52007193; the 2115 Talent Development Program of China Agricultural University.
Abstract: The development of distributed renewable energy, such as photovoltaic power and wind power generation, makes the energy system cleaner and is of great significance in reducing carbon emissions. However, weather can affect distributed renewable energy power generation, and the uncertainty of output brings challenges to uncertainty planning for distributed renewable energy. Energy systems with high penetration of distributed renewable energy involve the high-dimensional, nonlinear dynamics of large-scale complex systems, and the optimal solution of the uncertainty model is a difficult problem. From the perspective of statistical machine learning, the theory of planning of distributed renewable energy systems under uncertainty is reviewed, and some key technologies are put forward for applying advanced artificial intelligence to distributed renewable power uncertainty planning.
Funding: Supported by the National High Technology Research and Development 863 Program of China under Grant Nos. 2011AA01A207, 2012AA011101, and 2012AA011102.
Abstract: Unknown words are one of the key factors that greatly affect translation quality. Traditionally, nearly all related research focuses on obtaining the translation of the unknown words. However, these approaches have two disadvantages. On the one hand, they usually rely on many additional resources such as bilingual web data; on the other hand, they cannot guarantee good reordering and lexical selection of surrounding words. This paper gives a new perspective on handling unknown words in statistical machine translation (SMT). Instead of making great efforts to find the translation of unknown words, we focus on determining the semantic function of the unknown word in the test sentence and keeping the semantic function unchanged in the translation process. In this way, unknown words can help the phrase reordering and lexical selection of their surrounding words even though they remain untranslated. In order to determine the semantic function of an unknown word, we employ the distributional semantic model and the bidirectional language model. Extensive experiments on both phrase-based and linguistically syntax-based SMT models in Chinese-to-English translation show that our method can substantially improve the translation quality.
Funding: Project supported by the National High-Tech R&D Program of China (No. 2012BAH14F03); the National Natural Science Foundation of China (Nos. 61005052 and 61303082); the Research Fund for the Doctoral Program of Higher Education of China (No. 20120121120046); the Natural Science Foundation of Fujian Province of China (No. 2011J01360); the Fundamental Research Funds for the Central Universities, China (No. 2010121068).
Abstract: The pivot language approach for statistical machine translation (SMT) is a good method to break the resource bottleneck for certain language pairs. However, in the implementation of conventional approaches, pivot-side context information is far from fully utilized, resulting in erroneous estimations of translation probabilities. In this study, we propose two topic-aware pivot language approaches to use different levels of pivot-side context. The first method takes advantage of document-level context by assuming that the bridged phrase pairs should be similar in their document-level topic distributions. The second method focuses on the effect of local context. Central to this approach are the assumptions that the phrase sense can be reflected by local context in the form of probabilistic topics, and that bridged phrase pairs should be compatible in their latent sense distributions. Then, we build an interpolated model bringing the above methods together to further enhance the system performance. Experimental results on French-Spanish and French-German translations using English as the pivot language demonstrate the effectiveness of topic-based context in pivot-based SMT.
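To make the problem concrete, here is a minimal sketch of the conventional context-free pivot step, often called triangulation, in which source-to-target phrase probabilities are obtained by summing over shared pivot phrases: p(t|s) = Σ_p p(t|p) · p(p|s). The phrase tables and probabilities below are invented toy values, and the function name is hypothetical; the topic-aware models of the paper are not reproduced here.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Bridge two phrase tables through the pivot language.

    src_to_pivot[s][p] = p(p | s) and pivot_to_tgt[p][t] = p(t | p).
    Returns src_to_tgt[s][t] = sum over p of p(p | s) * p(t | p).
    """
    src_to_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt_given_pivot in pivot_to_tgt.get(pivot, {}).items():
                src_to_tgt[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in src_to_tgt.items()}


# Toy tables (French -> English -> Spanish); all numbers are made up.
fr_en = {"banque": {"bank": 1.0}}
en_es = {"bank": {"banco": 0.6, "orilla": 0.4}}
fr_es = triangulate(fr_en, en_es)
```

Because the English pivot "bank" is ambiguous, the bridged table gives the French financial term "banque" a sizeable probability of translating to "orilla" (river bank). This is the kind of erroneous estimate that pivot-side context is meant to suppress.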
Abstract: Companies like Google, MSN and Yahoo provide translation services on their websites, generating translations based on statistical bilingual text corpora. Human translation seems to be inferior in the face of the huge amount of information and the fast development of computer science. Despite the functions and versatility of statistical machine translation, it may never take the place of human effort. Teachers are supposed to guide students in using online translation systems.
Funding: Supported by the National Social Science Fund of China (Youth Program): “A Study of Acceptability of Chinese Government Public Signs in the New Era and the Countermeasures of the English Translation” (No. 13CYY010); the Subject Construction and Management Project of Zhejiang Gongshang University: “Research on the Organic Integration Path of Constructing Ideological and Political Training and Design of Mixed Teaching Platform during Epidemic Period” (No. XKJS2020007); the Ministry of Education Industry-University Cooperative Education Program: “Research on the Construction of Cross-border Logistics Marketing Bilingual Course Integration” (No. 202102494002).
Abstract: Retelling extraction is an important branch of Natural Language Processing (NLP), and high-quality retelling resources are very helpful for improving the performance of machine translation. However, traditional methods based on the bilingual parallel corpus often ignore the document background in the process of retelling acquisition and application. In order to solve this problem, we introduce topic model information into the translation model and propose a topic-based statistical machine translation method to improve the translation performance. In this method, Probabilistic Latent Semantic Analysis (PLSA) is used to obtain the co-occurrence relationship between words and documents by hybrid matrix decomposition. Then we design a decoder to simplify the decoding process. Experiments show that the proposed method can effectively improve the accuracy of translation.
Funding: The High Technology Research and Development Program of China (No. 2004AA117010-08).
Abstract: A loose phrase extraction method is proposed and applied to phrase-based statistical machine translation. The method extracts phrase pairs that are not strictly consistent with word alignments. Two types of constraints on word positions are investigated for this method. Furthermore, n-best alignments are introduced for phrase extraction instead of the one-best alignment. Experimental results show that the proposed approach outperforms the baseline system, Pharaoh, for both one-best and n-best alignments.
Funding: Supported by the National Natural Science Foundation Committee of China (61503259); China Postdoctoral Science Foundation Funded Project (2017M611261); Chinese Scholarship Council (201608210107); Hanyu Plan of Shenyang Jianzhu University (XKHY2-64).
Abstract: Building energy consumption accounts for nearly 40% of global energy consumption, and HVAC (Heating, Ventilating, and Air Conditioning) systems are the major building energy consumers. As one type of HVAC system, the heat pump air conditioning system is more energy-efficient than the traditional air conditioning system and is being used more widely to save energy. However, in northern China, extreme climatic conditions increase the cooling and heating load of the heat pump air conditioning system and accelerate the aging of the equipment, and the sensors may detect drifted parameters owing to climate change. These non-linear drifted parameters increase the false alarm rate of fault detection and the need for unnecessary troubleshooting. To overcome the impact of device aging and drifted parameters, a Kalman filter and SPC (statistical process control) fault detection method is introduced in this paper. In this method, the model parameter and its standard variance can be estimated by a Kalman filter based on the gray model and the real-time data of the air conditioning system. SPC is then used to construct dynamic control limits, which reduces the false alarm rate. This paper mainly focuses on the cold machine failure among the component failures and on its soft fault detection. The approach has been tested on a simulation model of the heat pump air conditioning system of the "Sino-German Energy Conservation Demonstration Center" building in Shenyang, China. The results show that the Kalman filter and SPC fault detection method is simple and highly efficient, has a low false alarm rate, and can deal with the difficulties caused by the extreme environment and the non-linear influence of the parameters. It also provides a good foundation for dynamic fault diagnosis and fault prediction analysis.
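The idea of combining a Kalman filter with dynamic SPC control limits can be illustrated with a scalar example. This is a minimal sketch under stated assumptions rather than the method of the paper: it swaps the gray model for a simple random-walk state model, and the function name `kalman_spc`, the noise variances `q` and `r`, and the 3-sigma rule are hypothetical choices made for illustration.

```python
import math


def kalman_spc(readings, q=1e-4, r=0.04, k_sigma=3.0):
    """Flag readings whose innovation falls outside dynamic control limits.

    Assumed model (not from the paper): the true value follows a random walk
    with process variance q, and each reading adds sensor noise of variance r.
    Returns the indices of the readings that raise an alarm.
    """
    x, p = readings[0], 1.0  # state estimate and its variance
    alarms = []
    for i, z in enumerate(readings[1:], start=1):
        p_pred = p + q                    # predict: uncertainty grows by q
        s = p_pred + r                    # innovation variance
        limit = k_sigma * math.sqrt(s)    # dynamic control limit for this step
        innovation = z - x
        if abs(innovation) > limit:
            alarms.append(i)
            p = p_pred                    # reject the reading, keep the estimate
        else:
            gain = p_pred / s
            x += gain * innovation        # update: follow slow drift
            p = (1.0 - gain) * p_pred
    return alarms


# Slowly drifting sensor signal with one abrupt fault at index 30.
readings = [20.0 + 0.01 * i for i in range(50)]
readings[30] += 5.0
alarms = kalman_spc(readings)
```

Because the filter tracks the slow drift, the control limits move with the estimate, so the gradual change does not raise alarms while the abrupt jump does. A fixed limit around the initial value would instead start alarming once the drift accumulated.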
Funding: No specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Abstract: Online banking fraud occurs whenever a criminal can seize accounts and transfer funds from an individual’s online bank account. Successfully preventing this requires the detection of as many fraudsters as possible without producing too many false alarms. This is a challenge for machine learning owing to the extremely imbalanced data and the complexity of fraud. In addition, classical machine learning methods must be extended to minimize expected financial losses. Finally, fraud can only be combated systematically and economically if the risks and costs in payment channels are known. We define three models that overcome these challenges: machine learning-based fraud detection, economic optimization of machine learning results, and a risk model to predict the risk of fraud while considering countermeasures. The models were tested using real data. Our machine learning model alone reduces the expected and unexpected losses in the three aggregated payment channels by 15% compared to a benchmark consisting of static if-then rules. Optimizing the machine learning model further reduces the expected losses by 52%. These results hold with a low false positive rate of 0.4%. Thus, the risk framework of the three models is viable from a business and risk perspective.
Funding: National Natural Science Foundation of China (70931004).
Abstract: For aircraft manufacturing industries, the analysis and prediction of part machining error during the machining process are very important for controlling and improving part machining quality. To control machining error effectively, a method that integrates multivariate statistical process control (MSPC) and stream of variations (SoV) is proposed. First, machining error is modeled with multi-operation approaches for the part machining process. SoV is adopted to establish a mathematical model of the relationship between the error of upstream operations and the error of downstream operations. Here the error sources include not only the influence of upstream operations but also many other error sources. A standard SoV model and a predicted SoV model are built, depending on whether or not the operation has been carried out, to satisfy different requirements during the part machining process. Second, the one-step ahead forecast error (OSFE) method is used to eliminate autocorrelation in the sample data from the SoV model, and the T² control chart in MSPC is built according to the data characteristics of the above error model to detect machining error and judge whether an operation is out of control. If it is, feedback is sent to the operations. The error model is modified by adjusting the out-of-control operation and then continues to be used to monitor operations. Finally, a machining instance containing two operations demonstrates the effectiveness of the machining error control method presented in this paper.
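The T² control chart mentioned above rests on Hotelling's statistic, T² = (x − μ)ᵀ Σ⁻¹ (x − μ). The following is a minimal sketch for two monitored quality characteristics, not the authors' implementation: it assumes the in-control mean and covariance are known, uses a chi-square control limit that is exact only under that assumption, and the names `hotelling_t2` and `chi2_limit_2dof` as well as the numbers are hypothetical. In the paper the chart is applied to OSFE residuals of the SoV model, which is not reproduced here.

```python
import math


def hotelling_t2(x, mean, cov):
    """Hotelling T^2 for a 2-dimensional observation x.

    cov is a 2x2 covariance matrix given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of a 2x2 matrix is (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def chi2_limit_2dof(alpha):
    """Upper control limit for 2 variables with known parameters.

    For 2 degrees of freedom the chi-square quantile has the closed form
    -2 * ln(alpha), where alpha is the accepted false alarm probability.
    """
    return -2.0 * math.log(alpha)


# Illustrative in-control parameters for two correlated error components.
mean = (0.0, 0.0)
cov = ((1.0, 0.5), (0.5, 1.0))
ucl = chi2_limit_2dof(0.0027)

in_control = hotelling_t2((1.0, 1.0), mean, cov) <= ucl
out_of_control = hotelling_t2((3.0, -3.0), mean, cov) > ucl
```

The second observation is flagged even though each coordinate is only three standard deviations from the mean, because its direction contradicts the positive correlation between the two components. This is the kind of joint deviation that separate univariate charts can miss.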
Abstract: The goals of this paper are twofold: we describe common features in data sets from motor vehicle insurance companies, and we investigate a general strategy that exploits the knowledge of such features. The results of the strategy provide a basis for developing insurance tariffs. We use a nonparametric approach based on a combination of kernel logistic regression and ε-support vector regression, both of which have good robustness properties. The strategy is applied to a data set from motor vehicle insurance companies.