Funding: Supported by the Natural Science Foundation of Liaoning Province (Grant No. 2023-MSBA-070) and the National Natural Science Foundation of China (Grant No. 62302086).
Abstract: Multi-modal fusion technology has gradually become a fundamental task in many fields, such as autonomous driving, smart healthcare, sentiment analysis, and human-computer interaction. It is rapidly becoming a dominant research direction because of its powerful perception and judgment capabilities. In complex scenes, multi-modal fusion exploits the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions. However, achieving outstanding performance is challenging because of equipment performance limitations, missing information, and data noise. This paper comprehensively reviews existing methods based on multi-modal fusion techniques and provides a detailed, in-depth analysis. According to the stage at which data are fused, multi-modal fusion has four primary methods: early fusion, deep fusion, late fusion, and hybrid fusion. The paper surveys the three major multi-modal fusion technologies that can significantly enhance the effect of data fusion and further explores the applications of multi-modal fusion technology in various fields. Finally, it discusses the challenges and explores potential research opportunities. Multi-modal tasks still need intensive study because of data heterogeneity and quality. Preserving complementary information and eliminating redundant information between modalities is critical in multi-modal technology. Ineffective data fusion methods may introduce extra noise and lead to worse results. This paper provides a comprehensive and detailed summary in response to these challenges.
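As a rough illustration of how the fusion stage changes the computation, the sketch below contrasts early fusion (combine features, then apply one model) with late fusion (apply one model per modality, then combine the decisions). It is not taken from the paper: the function names, the toy image and audio features, the weights, and the blending factor `alpha` are all assumptions chosen for illustration, and a weighted sum stands in for a trained model.

```python
# Illustrative only: toy features and hand-picked weights, not from the paper.

def linear_score(features, weights):
    """Weighted sum standing in for a trained model."""
    return sum(f * w for f, w in zip(features, weights))


def early_fusion(image_feats, audio_feats, joint_weights):
    """Concatenate the modality features first, then apply one joint model."""
    fused = image_feats + audio_feats  # list concatenation
    return linear_score(fused, joint_weights)


def late_fusion(image_feats, audio_feats, image_weights, audio_weights, alpha=0.5):
    """Score each modality separately, then blend the two decisions."""
    image_score = linear_score(image_feats, image_weights)
    audio_score = linear_score(audio_feats, audio_weights)
    return alpha * image_score + (1 - alpha) * audio_score


image_feats = [1.0, 2.0]  # hypothetical image features
audio_feats = [3.0]       # hypothetical audio feature

early = early_fusion(image_feats, audio_feats, [0.5, 0.25, 1.0])
late = late_fusion(image_feats, audio_feats, [0.5, 0.25], [1.0], alpha=0.5)
print(early, late)
```

Deep fusion would combine intermediate representations inside the model rather than raw features or final scores, and hybrid fusion mixes these strategies; neither is shown here.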
Funding: Supported by the National Key R&D Program of China (2018AAA0101502) and the Science and Technology Project of SGCC (State Grid Corporation of China): Fundamental Theory of Human-in-the-Loop Hybrid-Augmented Intelligence for Power Grid Dispatch and Control.
Abstract: Knowledge graphs (KGs) have been widely accepted as powerful tools for modeling the complex relationships between concepts and developing knowledge-based services. In recent years, researchers in the field of power systems have explored KGs to develop intelligent dispatching systems for increasingly large power grids. Because multiple power grid dispatching knowledge graphs (PDKGs) have been constructed by different agencies, fusing the knowledge of different PDKGs is useful for providing more accurate decision support. To achieve this, entity alignment, which connects different KGs by identifying equivalent entities, is a critical step. Existing entity alignment methods cannot integrate useful structural, attribute, and relational information when calculating entity similarities and are prone to making many-to-one alignments, so they can hardly achieve the best performance. To address these issues, this paper proposes a collective entity alignment model that integrates the three kinds of available information and makes collective counterpart assignments. The model introduces a novel knowledge graph attention network (KGAT) to learn the embeddings of entities and relations explicitly and calculates entity similarities by adaptively incorporating the structural, attribute, and relational similarities. The counterpart assignment task is then formulated as an integer programming (IP) problem to obtain one-to-one alignments. We not only conduct experiments on a pair of PDKGs but also evaluate our model on three commonly used cross-lingual KGs. Experimental comparisons indicate that our model outperforms other methods and provides an effective tool for the knowledge fusion of PDKGs.
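To make the many-to-one problem concrete, the sketch below compares independent per-entity matching with a collective one-to-one assignment on a small similarity matrix. It is not the authors' model: the similarity values are invented, the function names are hypothetical, and an exhaustive search over permutations stands in for the integer programming formulation, so it is only practical for a handful of entities.

```python
from itertools import permutations

# Illustrative only: sim[i][j] is an assumed combined similarity between
# entity i in one KG and entity j in the other; the values are invented.
sim = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.10, 0.30, 0.70],
]


def greedy_alignment(sim):
    """Match each source entity to its most similar target independently."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]


def one_to_one_alignment(sim):
    """Pick the one-to-one assignment with the highest total similarity.

    Brute force over all permutations (O(n!)); a real system would hand the
    same objective and constraints to an IP or assignment solver.
    """
    n = len(sim)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(sim[i][perm[i]] for i in range(n)),
    )
    return list(best)


greedy = greedy_alignment(sim)
collective = one_to_one_alignment(sim)
print(greedy)      # source entities 0 and 1 both claim target 0
print(collective)  # each target is used exactly once
```

In this toy case the independent matching assigns two source entities to the same target, while the collective assignment gives up the single best pair for entity 0 in exchange for a higher total similarity with no conflicts.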