A large amount of heterogeneous data is distributed across various sources in the upstream business of PetroChina. These data can be valuable assets if they are fully used. Meanwhile, the knowledge graph, an emerging technique, provides a way to integrate multi-source heterogeneous data. In this paper, we present one application of the knowledge graph in the upstream of PetroChina. Specifically, we first construct a knowledge graph from both structured and unstructured data using multiple NLP (natural language processing) methods. Then, we introduce two typical knowledge-graph-powered applications and show the benefit that the knowledge graph brings to them: compared with the traditional machine learning approach, the well log interpretation method powered by the knowledge graph shows an accuracy improvement of more than 7.69%.
Line transect sampling is a very useful method for surveying wildlife populations. Confidence interval estimation for the density D of a biological population is proposed based on a sequential design. The survey area is occupied by a population whose size is unknown. A stopping rule is proposed based on a kernel-based estimator of the density function of the perpendicular distances. With this stopping rule, we construct several confidence intervals for D by different procedures. Some bias reduction techniques are used to modify the confidence intervals. These intervals provide the desired coverage probability as the bandwidth in the stopping rule approaches zero. A simulation study is also given to illustrate the performance of the proposed sequential kernel procedure.
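As an illustration of the kind of estimator the abstract refers to, the following is a minimal sketch of a fixed-sample kernel estimate of density from line transect data. It assumes the standard line transect relation D = n f(0) / (2L), where f is the probability density of the perpendicular distances and L is the total transect length, and it estimates f(0) with a Gaussian kernel reflected about zero. The function names, the Gaussian kernel, and the reflection correction are assumptions made for this example; the paper's sequential stopping rule, bias reduction, and confidence intervals are not reproduced here.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def f0_hat(distances, bandwidth):
    """Kernel estimate of the perpendicular-distance density at zero.

    Distances are non-negative, so the sample is reflected about zero,
    which doubles the ordinary kernel estimate at the boundary.
    """
    n = len(distances)
    if n == 0:
        raise ValueError("at least one perpendicular distance is required")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    kernel_sum = sum(gaussian_kernel(x / bandwidth) for x in distances)
    return 2.0 * kernel_sum / (n * bandwidth)


def density_hat(distances, transect_length, bandwidth):
    """Estimate population density via D = n * f(0) / (2 * L)."""
    if transect_length <= 0:
        raise ValueError("transect length must be positive")
    n = len(distances)
    return n * f0_hat(distances, bandwidth) / (2.0 * transect_length)
```

In a sequential design, an estimate of this kind would be recomputed as sightings accumulate until a stopping criterion is met; the specific criterion is defined in the paper and is not shown in this sketch.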
Funding: Supported by the National Natural Science Funds for Distinguished Young Scholar (No. 70825004); the National Natural Science Foundation of China (Nos. 10731010, 10628104 and 10721101); the Leading Academic Discipline Program, the 10th Five-Year Plan of the 211 Project for Shanghai University of Finance and Economics; and the 211 Project for Shanghai University of Finance and Economics (the 3rd phase).