Funding: Science and Technology Innovation 2030-Major Project of “New Generation Artificial Intelligence”, granted by the Ministry of Science and Technology, Grant Number 2020AAA0109300.
Abstract: In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to improve the model’s performance on domain-specific relational triple extraction tasks. We first propose an attention interaction module. This method enhances the semantic interaction between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model’s adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
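The abstract does not say how the votes are counted, so the following is only a minimal sketch of one plausible reading: each predictor (a fine-tuned SLM or an LLM run) proposes a set of (head, relation, tail) triples for a challenging sample, and a triple is kept when enough predictors agree. The function name `vote_on_triples`, the threshold `min_votes`, and the sample triples are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# A relational triple: (head entity, relation label, tail entity).
Triple = tuple[str, str, str]


def vote_on_triples(predictions: list[set[Triple]], min_votes: int) -> set[Triple]:
    """Keep every triple proposed by at least `min_votes` predictors.

    Hypothetical helper: the paper's actual voting rule is not given
    in the abstract, so plain agreement counting is assumed here.
    """
    counts = Counter(triple for pred in predictions for triple in pred)
    return {triple for triple, votes in counts.items() if votes >= min_votes}


# Made-up outputs for one challenging sentence.
slm_output = {("aspirin", "treats", "headache"), ("aspirin", "causes", "nausea")}
llm_output_a = {("aspirin", "treats", "headache")}
llm_output_b = {
    ("aspirin", "treats", "headache"),
    ("aspirin", "causes", "nausea"),
    ("aspirin", "treats", "fever"),
}

accepted = vote_on_triples([slm_output, llm_output_a, llm_output_b], min_votes=2)
```

With `min_votes=2`, the two triples that at least two predictors agree on are kept and the triple proposed by a single predictor is dropped. A weighted scheme that trusts the fine-tuned model more would be an equally reasonable variant.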
Abstract: Building model data organization is often programmed to solve a specific problem, resulting in the inability to organize indoor and outdoor 3D scenes in an integrated manner. In this paper, existing building spatial data models are studied, and the characteristics of building information modeling standards (IFC), city geographic modeling language (CityGML), indoor modeling language (IndoorGML), and other models are compared and analyzed. CityGML and IndoorGML models face challenges in satisfying diverse application scenarios and requirements due to limitations in their expression capabilities. It is proposed to combine the semantic information of the model objects to effectively partition and organize the indoor and outdoor spatial 3D model data and to construct the indoor and outdoor data organization mechanism of “chunk-layer-subobject-entrances-area-detail object.” This method is verified by proposing a 3D data organization method for indoor and outdoor space and constructing a 3D visualization system based on it.
Abstract: Semantic Web (SW) provides new opportunities for the study and application of big data, that is, massive ranges of data sets in varied formats from multiple sources. Related studies focus on potential SW technologies for resolving big data problems, such as structurally and semantically heterogeneous data that result from the variety of data formats (structured, semi-structured, numeric, unstructured text data, email, video, audio, stock ticker). SW offers information semantically both for people and machines to retain the vast volume of data and provide a meaningful output of unstructured data. In the current research, we implement a new semantic Extract Transform Load (ETL) model that uses SW technologies for aggregating, integrating, and representing data as linked data. First, geospatial data resources are aggregated from the internet, and then a semantic ETL model is used to store the aggregated data in a semantic model after converting it to Resource Description Framework (RDF) format for successful integration and representation. The principal contribution of this research is the synthesis, aggregation, and semantic representation of geospatial data to solve problems. A case study of city data is used to illustrate the semantic ETL model’s functionalities. The results show that the proposed model solves the structural and semantic heterogeneity problems in diverse data sources for successful data aggregation, integration, and representation.
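As a rough illustration of the extract, transform, and load steps described above, the sketch below turns a tiny tabular city data set into RDF serialized as N-Triples, using only the Python standard library. The namespaces `http://example.org/city/` and `http://example.org/vocab#`, the column names, and the sample rows are assumptions made for this example; the paper's actual vocabulary and tooling are not given in the abstract.

```python
import csv
import io

BASE = "http://example.org/city/"     # hypothetical resource namespace
VOCAB = "http://example.org/vocab#"   # hypothetical vocabulary namespace
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# Made-up source data standing in for an aggregated geospatial resource.
RAW_CSV = """id,name,lat,lon
1,Springfield,39.80,-89.64
2,Shelbyville,39.41,-88.79
"""


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def extract(text):
    """Extract: parse the tabular source into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: map each row to subject-predicate-object statements."""
    triples = []
    for row in rows:
        subject = f"<{BASE}{row['id']}>"
        name = escape_literal(row["name"])
        triples.append(f'{subject} <{VOCAB}name> "{name}" .')
        for key in ("lat", "lon"):
            triples.append(f'{subject} <{VOCAB}{key}> "{row[key]}"^^<{XSD_DECIMAL}> .')
    return triples


def load(triples):
    """Load: serialize the statements as one N-Triples document."""
    return "\n".join(triples) + "\n"


ntriples = load(transform(extract(RAW_CSV)))
```

Each source row yields three statements that share one subject IRI, which is what lets records from structurally different sources be merged once they are expressed against a common vocabulary.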
Funding: the National Natural Science Foundation (No. 40271088) and the Research Fund of the International Institute of Geo-information Science and Earth Observation.
Abstract: This paper focuses on the issues of categorical database generalization and emphasizes the roles of the supporting data model, integrated data model, spatial analysis, and semantic analysis in database generalization. The framework contents of categorical database generalization transformation are defined. This paper presents an integrated spatial supporting data structure, a semantic supporting model, and a similarity model for categorical database generalization. The concept of a transformation unit is proposed in generalization.
Abstract: In accordance with the requirements of expanding Machine-to-Machine (M2M) communication, network overlay is in progress in several domains such as the Smart Grid. Consequently, opportunities and cases for integrating data yielded by devices such as sensors can be expected to increase further. Accordingly, the importance of Ontology and Information Models (IM), which normalize semantics including sensor expressions, has increased, and standards for these definitions have become more important as well. So far, there have been multiple initiatives for standardizing the Ontology and IM for sensor expressions, such as Sensor Standards Harmonization by the National Institute of Standards and Technology (NIST), the W3C Semantic Sensor Network (SSN), and the recent W3C IoT-Lite Ontology. However, there is still room to improve the current Ontology and IM from the viewpoint of the implementing structure. This paper presents a set of IMs on abstract sensors and on contexts regarding the phenomena around these sensors, from the point of view of a structure implementing the specified sensors. As several previous studies have pointed out, multiple aspects of the sensors should be modeled. Accordingly, multiple sets of Ontology and IM for these sensors should be defined. Our study intends to clarify the relationship between the configurations and the physical measured quantities of the structures implementing a set of sensors. Up to the present, these have not been generalized and have remained unformulated. As a result of this analysis, it is expected to become easier to implement a more generalized translator module that aggregates the measured data from the sensors at the middleware level managing these Ontology and IM, instead of in the layer of user application programs.
Funding: granted by the National Key R&D Program of China (Grant No. 2016YFC0600510) and the Ministry of Land and Resources "Twelfth Five-Year Plan" Key Projects (Grant No. 1212011220352).
Abstract: With the rapid development of technology, geological big data is increasing explosively and plays an increasingly important role in the national economy (Zhang and Zhou, 2017; Zhou et al., 2018). Governments and agencies at home and abroad attach great importance to open internet services for geological big data and information (Yan et al., 2013; Guo et al., 2014). Rich and varied products are the basic norm of geological data and information services in Western countries.
Abstract: Geology is the basis for highway and tunnel construction. With the fast development of national highway construction, highway tunnel construction projects are becoming more and more complex. Completeness and accuracy of ground information are essential for the planning, design, and construction of projects, yet the available ground information is poor in terms of being systematic, reliable, and timely. Therefore, the development of underground road tunnels and the implementation of informationized spatial information management are urgent for highway construction. A 3D geological tunnel model is intuitive, highly efficient, and convenient, which greatly facilitates the maintenance and security of highway tunnel construction, and it will be the trend for future highway tunnel development.
Abstract: The IEC 61850 standard stipulates the Substation Configuration Description Language (SCL) file as a means to define the substation equipment, the IED functions, and the communication mechanism for the substation area network. The SCL file is an eXtensible Markup Language (XML) based file that describes the configuration of the substation Intelligent Electronic Devices (IEDs), including their associated functions. The SCL file is also configured to contain all IED capabilities, including the data model, which is structured into objects for easy descriptive modeling. The effective functioning of this SCL file relies on appropriate validation techniques that check the data model for errors due to non-conformity to the IEC 61850 standard. In this research, we extend the conventional SCL validation algorithm to develop a more advanced validator that can validate the standard data model using the Unified Modeling Language (UML). By using the rule-based SCL validation tool, we implement validation test cases for a more comprehensive understanding of the various validation functionalities. It can be observed from the algorithm and the implemented test cases that the proposed validation tool can improve SCL information validation and also help automation engineers to comprehend the IEC 61850 substation system architecture.
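To make the idea of rule-based SCL validation concrete, here is a minimal sketch of a single consistency rule: every logical node instance should reference an `lnType` that is declared as an `LNodeType` in the data type templates. This is an illustration only, not the paper's UML-based validator. The fragment is simplified (no XML namespace, most mandatory SCL content omitted), and the function name `check_ln_types` and all sample identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free SCL-like fragment made up for this example.
SCL_FRAGMENT = """
<SCL>
  <IED name="IED1">
    <AccessPoint name="AP1">
      <Server>
        <LDevice inst="LD0">
          <LN0 lnClass="LLN0" inst="" lnType="LLN0_T"/>
          <LN lnClass="XCBR" inst="1" lnType="XCBR_T"/>
          <LN lnClass="MMXU" inst="1" lnType="MMXU_T"/>
        </LDevice>
      </Server>
    </AccessPoint>
  </IED>
  <DataTypeTemplates>
    <LNodeType id="LLN0_T" lnClass="LLN0"/>
    <LNodeType id="XCBR_T" lnClass="XCBR"/>
  </DataTypeTemplates>
</SCL>
"""


def check_ln_types(root):
    """Report logical nodes whose lnType has no matching LNodeType id."""
    defined = {node_type.get("id") for node_type in root.iter("LNodeType")}
    errors = []
    for tag in ("LN0", "LN"):
        for ln in root.iter(tag):
            ln_type = ln.get("lnType")
            if ln_type not in defined:
                name = f"{ln.get('lnClass')}{ln.get('inst')}"
                errors.append(f"{tag} {name}: undefined lnType {ln_type!r}")
    return errors


errors = check_ln_types(ET.fromstring(SCL_FRAGMENT))
```

Here the `MMXU` node is flagged because its type is never declared. A fuller validator would register many such rules and run them all against the parsed file, collecting the findings into one report.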
Abstract: In modern workforce management, the demand for new ways to maximize worker satisfaction, productivity, and security levels is endless. Workforce movement data, such as source data from an access control system, can support this ongoing process with subsequent analysis. In this study, a solution for attaining this goal is proposed, based on the design and implementation of a data mart as part of a dimensional trajectory data warehouse (TDW) that acts as a repository for the management of movement data. A novel methodological approach is proposed for modeling multiple spatial and temporal dimensions in a logical model. The case study presented in this paper for modeling and analyzing workforce movement data is intended to support human resource management decision-making, and the following discussion provides a representative example of the contribution of a TDW to information management and decision support systems. The entire process of exporting, cleaning, consolidating, and transforming data is implemented to achieve an appropriate format for final import. Structured query language (SQL) queries demonstrate the convenience of dimensional design for data analysis, and valuable information can be extracted from the movements of employees on company premises to manage the workforce efficiently and effectively. Visual analytics through data visualization support the analysis and facilitate decision-making and business intelligence.
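The convenience of a dimensional design for such queries can be illustrated with a toy star schema. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table names (`fact_movement`, `dim_area`, `dim_employee`), their columns, and the sample rows are assumptions made for this example and do not come from the paper's data mart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table of movements, two dimensions.
conn.executescript("""
CREATE TABLE dim_employee (employee_id INTEGER PRIMARY KEY, department TEXT);
CREATE TABLE dim_area (area_id INTEGER PRIMARY KEY, area_name TEXT);
CREATE TABLE fact_movement (
    employee_id INTEGER REFERENCES dim_employee(employee_id),
    area_id INTEGER REFERENCES dim_area(area_id),
    entry_hour INTEGER,
    minutes_spent INTEGER
);
INSERT INTO dim_employee VALUES (1, 'Engineering'), (2, 'Sales');
INSERT INTO dim_area VALUES (10, 'Lab'), (20, 'Cafeteria');
INSERT INTO fact_movement VALUES
    (1, 10, 9, 120), (1, 20, 12, 30), (2, 20, 12, 45), (2, 10, 14, 15);
""")

# Total time spent per area and department: join the fact table to its
# dimensions, then aggregate along the chosen dimension attributes.
rows = conn.execute("""
SELECT a.area_name, e.department, SUM(f.minutes_spent) AS total_minutes
FROM fact_movement AS f
JOIN dim_area AS a ON a.area_id = f.area_id
JOIN dim_employee AS e ON e.employee_id = f.employee_id
GROUP BY a.area_name, e.department
ORDER BY a.area_name, e.department
""").fetchall()
```

Changing the analysis, for example to group by `entry_hour` instead of department, only means swapping the grouping columns; the fact table itself is untouched.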
Funding: supported by the Science and Technology Foundation of the State Grid Corporation of China (XT71-14-043).
Abstract: Making use of electric power knowledge is very important for the development of electric power big data technology. A new electric power knowledge theory model is proposed here to solve the problem of modeling electric power knowledge in a normalized way for the management and analysis of electric power big data. Current modeling techniques for electric power knowledge are viewed as inadequate because of the complexity and variety of the relationships among electric power system data. Ontology theory and semantic web technologies, used in electric power systems and in many other industry domains, provide a new kind of knowledge modeling method. Based on this, this paper proposes the structure, elements, basic calculations, and multidimensional reasoning method of the new knowledge model. A modeling example of the regulations defined in an electric power system operation standard is demonstrated. Different forms of the model and related technologies are also introduced, including electric power system standard modeling, multi-type data management, unstructured data searching, knowledge display, and data analysis based on semantic expansion and reduction. Research shows that the new model developed here is powerful and can adapt to various knowledge expression requirements of electric power big data. With the development of electric power big data technology, it is expected that the knowledge model will be improved and will be used in more applications.