Funding: Supported by the major project of the National Social Science Foundation of China, "Big Data-driven Semantic Evaluation System of Science and Technology Literature" (Grant No. 21&ZD329).
Abstract: Purpose: To address the "anomalies" that occur when scientific breakthroughs emerge, this study focuses on identifying early signs and nascent stages of breakthrough innovations from the perspective of outliers, aiming to achieve early identification of scientific breakthroughs in papers. Design/methodology/approach: This study uses semantic technology to extract research entities from the titles and abstracts of papers to represent each paper's research content. Outlier detection methods are then employed to measure and analyze the anomalies in breakthrough papers during their early stages. The development and evolution process is traced using literature time tags. Finally, a case study is conducted using the key publications of the 2021 Nobel Prize laureates in Physiology or Medicine. Findings: Manual analysis of all identified outlier papers verifies the effectiveness of the proposed method for the early identification of potential scientific breakthroughs. Research limitations: The method has been empirically tested only in the biomedical field. More data from various fields are needed to validate its robustness and generalizability. Practical implications: This study provides a valuable supplement to current methods for early identification of scientific breakthroughs, effectively supporting technological intelligence decision-making and services. Originality/value: The study introduces a novel approach to early identification of scientific breakthroughs by leveraging outlier analysis of research entities, offering a more sensitive, precise, and fine-grained alternative to traditional citation-based evaluations, which enhances the ability to identify nascent breakthrough innovations.
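The abstract does not say which outlier detection method is applied to the extracted research entities. As a minimal sketch of the general idea, and assuming each paper is reduced to a set of entity strings, the example below scores a paper by its mean Jaccard distance to its k nearest neighbours. The function names, the choice of Jaccard distance, the value of k, and the toy entity sets are all illustrative assumptions, not the authors' method or data.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two entity sets (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def knn_outlier_scores(papers, k=2):
    """Score each paper by its mean distance to its k nearest other papers.

    `papers` maps a paper id to a set of research entities. A higher score
    means the paper shares fewer entities with even its closest neighbours.
    """
    scores = {}
    for pid, entities in papers.items():
        dists = sorted(
            jaccard_distance(entities, other)
            for oid, other in papers.items()
            if oid != pid
        )
        nearest = dists[:k]
        # With a single paper there are no neighbours to compare against.
        scores[pid] = sum(nearest) / len(nearest) if nearest else 0.0
    return scores


# Hypothetical toy corpus; the entity strings are made up for illustration.
papers = {
    "p1": {"ion channel", "capsaicin", "pain"},
    "p2": {"ion channel", "capsaicin", "heat"},
    "p3": {"ion channel", "pain", "heat"},
    "p4": {"graph theory", "network flow"},
}
scores = knn_outlier_scores(papers, k=2)
most_anomalous = max(scores, key=scores.get)
print(most_anomalous, scores)
```

In this toy corpus the paper that shares no entities with the others receives the highest score. Tracing such scores across publication years would be one way to follow the "time tag" evolution the abstract mentions, though that step is not sketched here.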
Abstract: The rapid development of technologies that generate arrays of gene data enables a global view of the transcription levels of hundreds of thousands of genes simultaneously. Outlier detection for gene data is important but is complicated by high dimensionality. The sparsity of data in high-dimensional space makes every point appear to be a relatively good outlier under traditional distance-based definitions, so finding outliers in high-dimensional data is more complex. In this paper, some basic outlier analysis algorithms are discussed and a new genetic algorithm is presented. The algorithm finds the best dimension projections based on a revised cell-based algorithm and provides explanations for its solutions. It can solve the outlier detection problem for gene expression data and for other high-dimensional data as well.
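The claim that sparsity makes every point look like an outlier under distance-based definitions can be illustrated numerically. The sketch below is not the paper's genetic or cell-based algorithm; it only measures, for uniformly random points, how the relative gap between the farthest and nearest neighbour of a query point shrinks as the number of dimensions grows. The function name, the sample sizes, and the uniform data are assumptions chosen for illustration.

```python
import math
import random


def distance_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min for one query point.

    Distances are Euclidean, measured from a random query point to
    n_points other points drawn uniformly from the unit hypercube.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


low_dim_contrast = distance_contrast(500, 2)
high_dim_contrast = distance_contrast(500, 500)
print(f"contrast in   2 dims: {low_dim_contrast:.2f}")
print(f"contrast in 500 dims: {high_dim_contrast:.2f}")
```

When the contrast is small, the nearest and farthest neighbours are almost equally far away, so a fixed distance threshold cannot separate outliers from ordinary points. That is the motivation for searching low-dimensional projections instead of working in the full space.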
Funding: Supported by the Aeronautical Science Foundation of China (2008ZA52012), the Six Kinds of Excellent Talent Project in Jiangsu Province of China (2010JZ004), and the Research Foundation of Nanjing University of Aeronautics and Astronautics (NS2010027).
Abstract: A two-step method is proposed for the detection and identification of invisible impact damage in composite structures under temperature changes using Lamb waves. First, a statistical outlier analysis is employed to distinguish whether changes in the Lamb wave signals are induced by damage within a monitoring area or are caused only by temperature changes. Damage indices are defined after the Lamb wave signals are processed by Fourier transform, and a Monte Carlo procedure is used to obtain the threshold value for the damage indices in the undamaged state. If the damage indices in the operation state exceed the threshold value, damage is judged to be present. Then, a probabilistic damage imaging algorithm, which displays the probability of damage being present within the monitoring area, is adopted to fuse information collected from multiple actuator-sensor paths and identify the location of the damage. Damage indices in the damaged state are used to generate the diagnostic image. An experimental study on a stiffened composite panel with random temperature changes is performed to demonstrate the effectiveness of the proposed method.
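The first step described above (a Monte Carlo threshold from the undamaged state, then a comparison in operation) can be sketched in a simplified, univariate form. The abstract does not give the damage index definition or the exact procedure, so everything here is an assumption: a scalar damage index, a z-score style discordancy measure, Gaussian baseline variation, a 99% quantile, and made-up baseline values.

```python
import random
import statistics


def monte_carlo_threshold(n_baseline, n_trials=2000, quantile=0.99, seed=0):
    """Estimate a discordancy threshold for a baseline of size n_baseline.

    Each trial draws n_baseline standard normal values, standardises them
    with their own sample mean and standard deviation, and records the
    largest absolute deviation. The threshold is the chosen quantile of
    those maxima across all trials.
    """
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_baseline)]
        mu = statistics.fmean(sample)
        sigma = statistics.stdev(sample)
        maxima.append(max(abs(x - mu) / sigma for x in sample))
    maxima.sort()
    return maxima[int(quantile * (n_trials - 1))]


def discordancy(value, baseline):
    """Absolute deviation of value from the baseline mean, in baseline std units."""
    mu = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)
    return abs(value - mu) / sigma


# Hypothetical damage indices recorded in the undamaged state, for example
# under varying temperature; the numbers are invented for illustration.
rng = random.Random(1)
baseline = [rng.gauss(0.10, 0.02) for _ in range(50)]

threshold = monte_carlo_threshold(len(baseline))
new_index = 0.30  # hypothetical damage index measured in operation
is_damaged = discordancy(new_index, baseline) > threshold
print(f"threshold={threshold:.2f}, damaged={is_damaged}")
```

Because the threshold is calibrated on the spread of the undamaged baseline, ordinary temperature-induced variation stays below it while a shift well outside that spread is flagged. The second step, fusing multiple actuator-sensor paths into a probability image, is not covered by this sketch.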