Abstract
The digital twin is a concept that transcends reality, acting as the reverse feedback from the real physical space to the virtual digital space. People hold great expectations for this emerging technology. To upgrade the digital twin industrial chain, it is urgent to introduce more modalities, such as vision, haptics, hearing, and smell, into the virtual digital space, which helps physical entities and virtual objects form a closer connection. Perceptual understanding and object recognition have therefore become a pressing topic in digital twin research. Existing surface material classification schemes often perform recognition through machine learning or deep learning in a single modality, ignoring the complementarity between multiple modalities. To overcome this limitation, we propose a multimodal fusion network that combines two modalities, visual and haptic, for surface material recognition. On the one hand, the network makes full use of the potential correlations between modalities to deeply mine the modal semantics and complete the data mapping. On the other hand, the network is extensible and can serve as a universal architecture that accommodates more modalities. Experiments show that the constructed multimodal fusion network achieves 99.42% classification accuracy while reducing complexity.
Funding
the National Natural Science Foundation of China (62001246, 62001248, 62171232)
Key R&D Program of Jiangsu Province Key project and topics under Grant BE2021095
the Natural Science Foundation of Jiangsu Province Higher Education Institutions (20KJB510020)
the Future Network Scientific Research Fund Project (FNSRFP-2021-YB-16)
the open research fund of Key Lab of Broadband Wireless Communication and Sensor Network Technology (JZNY202110)
the NUPTSF under Grant NY220070.