Abstract: This study examined Japanese patents in terms of the quantitative characteristics of application documents that resulted in the acquisition of rights, in order to clarify the relationship between the features of applications and their patentability. The group of approved applications and the group of applications that had not been approved were compared on 12 variables: publication time lag; numbers of inventors, classifications, pages, figures, tables, claims, priority claims, countries for priority claims, cited patents, and cited non-patent documents; and median citation age. Furthermore, the authors carried out experiments in which patent applications were automatically classified into the two groups using random forests, a machine learning method. Statistically significant differences between the two groups were observed for the following variables (p < .001): the numbers of inventors, pages, figures, claims, priority claims, and countries for priority claims were significantly larger in the group of approved applications, while the time lag until publication was smaller. In particular, the publication time lag and the numbers of inventors, pages, and figures contributed most to discriminating approved applications in the random forest classification, which implies that these variables have relatively strong relationships with patentability.
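The two-group comparison described above can be illustrated with a minimal sketch. The abstract does not name the statistical test that was used, so the permutation test on the difference in means below is an assumption, and the page counts are invented toy values rather than data from the study. The function name `permutation_p_value` is hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Assumption: this stands in for whichever test the study applied.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Toy page counts, invented for illustration only.
approved_pages = [14, 18, 21, 16, 19, 23, 17, 20]
rejected_pages = [9, 11, 8, 12, 10, 13, 9, 11]
p = permutation_p_value(approved_pages, rejected_pages)
print(f"estimated p-value: {p:.4f}")
```

A small p-value here only says that the toy groups differ in mean page count. It says nothing about which variables a random forest would rank as most discriminating, which the study assessed separately.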
Abstract: Forecasting economic indices on the basis of information extracted from text documents, such as newspaper articles, is an attractive idea. With the help of text mining techniques, in particular sentiment analysis, we evaluate the tone of individual New York Times (NYT) articles and compare our results to the Chicago Fed National Activity Index (CFNAI). In this paper, we present a simple, intuitive framework for deriving sentiment scores from text documents. In particular, articles are tagged based on terms and their connoted sentiment. Subsequently, we forecast CFNAI movements via support vector machines (SVM) trained on a subset of the observed sentiment scores. We apply our model to two different data sets: all NYT articles, and the articles categorized as NYT business news. On both data sets, we apply a simple performance measure to evaluate the accuracy of the CFNAI forecasts.
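The term-based tagging step can be sketched as follows. This is a minimal illustration, not the authors' framework: the lexicon, the scoring formula, and the function names `sentiment_score` and `directional_accuracy` are all assumptions. The abstract does not specify its performance measure, so directional accuracy is used here as one plausible simple choice. The SVM forecasting step is omitted.

```python
import re

# Hypothetical mini-lexicon mapping a term to its connoted sentiment.
LEXICON = {
    "growth": 1, "gain": 1, "recovery": 1,
    "loss": -1, "recession": -1, "layoffs": -1,
}


def sentiment_score(text, lexicon=LEXICON):
    """Mean sentiment of the lexicon terms found in the text.

    Returns a value in [-1, 1], or 0.0 when no lexicon term occurs.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def directional_accuracy(predicted, actual):
    """Share of periods in which predicted and actual movements agree in sign."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum((p > 0) == (a > 0) for p, a in zip(predicted, actual))
    return agree / len(actual)


score = sentiment_score("Growth and recovery offset the loss")
accuracy = directional_accuracy([1, -1, 1, -1], [1, 1, 1, -1])
print(score, accuracy)
```

In a fuller pipeline, per-article scores would be aggregated per period and passed to a classifier as features. The aggregation scheme is not described in the abstract and is left out here.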