Abstract: Social networking platforms are a vital channel for disseminating information across the globe, particularly during disasters, and they are a valuable means of obtaining first-hand accounts of a disaster. Twitter is one such platform and has been used extensively by the scientific community because of its unidirectional model. Identifying eyewitness tweets about an incident among the millions of tweets shared by Twitter users is a challenging task. The research community has proposed diverse techniques for identifying eyewitness accounts. A recent state-of-the-art approach proposed a comprehensive set of features for this purpose. However, that approach has some limitations. First, automatically extracting the feature-words for each identified feature remains difficult. Second, not all of the identified features were incorporated in its implementation. This paper uses language structure, linguistics, and word relations to extract feature-words automatically by creating grammar rules. In addition, it implements all of the identified features, including those left out by the state-of-the-art model. A generic approach is taken to cover different types of disaster, such as earthquakes, floods, hurricanes, and wildfires, and the proposed approach was evaluated on all of these disaster types. Using a static dictionary, the approach of Zahra et al. achieved an F-score of 0.92 for eyewitness identification in the earthquake category. The proposed approach achieved an F-score of 0.81 in the same category, which can be considered a significant result given that no static dictionary is used.
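The F-scores quoted in this abstract combine precision and recall into a single number. The sketch below shows how such a score is computed from classification counts. It assumes the balanced F1 variant, since the abstract does not say which variant was used, and the function name and the counts are hypothetical, not taken from the paper.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return the balanced F-score (F1) from true positives, false
    positives, and false negatives. Assumed variant: F1 (beta = 1)."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined),
        # so the score is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for an "eyewitness" class, for illustration only.
score = f1_from_counts(tp=8, fp=2, fn=2)
```

Because F1 is a harmonic mean, it is pulled towards the lower of precision and recall, so a classifier cannot reach a high score by maximizing only one of them.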
Abstract: This article introduces some of the main themes in CL, with a special focus on the role of constructions and idioms and on the ways in which they are motivated. It is argued that a language can, in fact, be regarded as a very large set of idioms and constructions. This view contrasts with the more traditional view that a language can be analyzed as a dictionary and a grammar. Some conclusions are drawn for foreign language pedagogy and for the design of learning aids for language students.
Abstract: Securities fraud is a common worldwide problem that causes serious harm to securities markets each year. Securities regulatory commissions in various countries attach great importance to detecting and preventing securities fraud. In China, securities fraud is increasing with the rapid expansion of the securities market. A number of data mining techniques could help the China Securities Regulatory Commission (CSRC) with the task of securities fraud detection. In this paper, we investigate the usefulness of a logistic regression model, Neural Networks (NNs), Sequential Minimal Optimization (SMO), Radial Basis Function (RBF) networks, Bayesian networks, and Grammar Based Genetic Programming (GBGP) for classifying the real, large, and up-to-date China Corporate Securities Fraud (CCSF) database. The six data mining techniques are compared in terms of their performance, and we find that GBGP outperforms the others. This paper describes in detail how GBGP is applied to the CCSF problem. In addition, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to generate synthetic minority-class examples for the imbalanced CCSF dataset.
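The core idea of SMOTE, as mentioned in this abstract, is to create each synthetic minority-class example by interpolating between an existing minority sample and one of its nearest minority neighbours. The following is a minimal sketch of that idea in plain Python. It is not the authors' implementation; the function name, the parameters `k` and `seed`, and the toy data are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority
    samples and their k nearest minority neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature minority class, for illustration only.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_synthetic=3, k=2)
```

Every synthetic point lies on a line segment between two real minority samples, so the new examples stay inside the region already occupied by the minority class instead of simply duplicating existing rows.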