Funding: Supported by the National Natural Science Foundation of China (70071042, 60073043, 60133010).
Abstract: This paper presents a two-phase genetic algorithm (TPGA) based on the multi-parent genetic algorithm (MPGA). Analysis shows that MPGA drives the population's evolution toward either diversity or convergence, depending on the population size and the crossover size, so we run it in different configurations during the global and local optimization phases to form TPGA. Experimental results show that TPGA is very efficient for optimizing low-dimensional multi-modal functions; it usually obtains all of the global optimal solutions.
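The abstract does not spell out the crossover operator, so the following is only a minimal sketch of one common multi-parent scheme: the child is a random affine combination of the selected parents. The function name, the coefficient range and the normalization step are all assumptions made for illustration; the number of parents passed in plays the role of the "crossover size".

```python
import random


def multi_parent_crossover(parents, low=-0.5, high=1.5, rng=random):
    """Return a child as a random affine combination of the parents.

    Assumed scheme (not taken from the paper): draw one coefficient per
    parent from [low, high], then shift all of them by the same amount so
    that they sum to 1. After the shift a coefficient may fall slightly
    outside [low, high].
    """
    m = len(parents)
    raw = [rng.uniform(low, high) for _ in range(m)]
    shift = (1.0 - sum(raw)) / m
    coeffs = [a + shift for a in raw]
    dim = len(parents[0])
    # Each coordinate of the child is the weighted sum of that coordinate
    # over all parents.
    return [sum(c * p[k] for c, p in zip(coeffs, parents)) for k in range(dim)]


if __name__ == "__main__":
    population = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8]]
    # Crossover size 3: three parents sampled from the population.
    print(multi_parent_crossover(random.sample(population, 3)))
```

Because the coefficients sum to 1, identical parents always reproduce themselves, which gives a simple sanity check for the operator.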
Funding: Supported by Grant-in-Aid for Young Scientists (A) (Grant No. 26700021), Japan Society for the Promotion of Science, and the Strategic Information and Communications R&D Promotion Programme (Grant No. 142103011), Ministry of Internal Affairs and Communications.
Abstract: Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. As motion sensors have developed, multiple data sources have become available, which has led to the rise of multi-modal gesture recognition. Because our previous approach to gesture recognition relied on a unimodal system, it had difficulty classifying similar motion patterns. To solve this problem, we propose a novel approach that integrates motion, audio and video models, using a dataset captured by Kinect. The proposed system recognizes observed gestures with three models; their recognition results are integrated by the proposed framework, and the integrated output is the final result. The motion and audio models are learned with Hidden Markov Models, and the video model is learned with a Random Forest classifier. In the experiments that test the performance of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying the feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All experiments are conducted on the dataset provided by the organizer of MMGRC, a workshop for the Multi-Modal Gesture Recognition Challenge. The comparison shows that the multi-modal model composed of the three models achieves the highest recognition rate. This improvement indicates that the complementary relationship among the three models improves the accuracy of gesture recognition. The proposed system provides an application technology for understanding human actions in daily life more precisely.
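The abstract does not describe how the three recognition results are integrated. As a rough illustration only, the sketch below uses weighted averaging of per-class scores (a generic late-fusion rule, not necessarily the authors' framework). The function name, the equal default weights and the example scores are hypothetical.

```python
def fuse_scores(model_scores, weights=None):
    """Combine per-class scores from several models by weighted averaging.

    model_scores: one list of class scores per model, all of equal length.
    weights: one weight per model; defaults to equal weights (an assumption).
    Returns the index of the winning class and the fused score list.
    """
    n_models = len(model_scores)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(model_scores[0])
    fused = [
        sum(w * scores[c] for w, scores in zip(weights, model_scores))
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: fused[c])
    return best, fused


# Hypothetical scores over three gesture classes.
motion = [0.5, 0.4, 0.1]  # the motion model alone would pick class 0
audio = [0.2, 0.7, 0.1]
video = [0.3, 0.6, 0.1]
label, fused = fuse_scores([motion, audio, video])
```

In this made-up example the motion model on its own prefers class 0, while the fused scores favour class 1, which is the kind of complementary effect the abstract attributes to combining modalities.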
Funding: Supported by the National Natural Science Foundation of China (70071042, 60073043, 60133010).
Abstract: This paper proposes a new algorithm for solving multi-modal function optimization problems: the two-level subspace evolutionary algorithm. In the first level, the improved GT algorithm performs a global recombination search so that the whole population is separated into several niches according to the positions of the solutions. In the second level, a niche evolutionary strategy performs local search in the subspaces obtained in the first level until the solutions of the problem are found. The new algorithm has been tested on several hard problems, and good results were obtained.
Funding: Supported by the National Natural Science Foundation of China (61976209, 62020106015, U21A20388), in part by the CAS International Collaboration Key Project (173211KYSB20190024), and in part by the Strategic Priority Research Program of CAS (XDB32040000).
Abstract: Traditional electroencephalograph (EEG)-based emotion recognition requires a large number of calibration samples to build a model for a specific subject, which restricts the practical application of the affective brain-computer interface (BCI). We attempt to use multi-modal data from a past session to realize emotion recognition when only a small number of calibration samples is available. To solve this problem, we propose a multimodal domain adaptive variational autoencoder (MMDA-VAE) method, which learns shared cross-domain latent representations of the multi-modal data. Our method builds a multi-modal variational autoencoder (MVAE) to project the data of multiple modalities into a common space. Through adversarial learning and cycle-consistency regularization, our method reduces the distribution difference of each domain on the shared latent representation layer and realizes knowledge transfer. Extensive experiments are conducted on two public datasets, SEED and SEED-IV, and the results show the superiority of the proposed method. Our work can effectively improve the performance of emotion recognition with a small amount of labelled multi-modal data.
Funding: Project supported by the State Key Development Program for Basic Research of China (Grant Nos. 2008CB717803 and 2007ID103) and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 200610001023).
Abstract: The improved version of the Los Alamos model with the multi-modal fission approach is used to analyse the prompt fission neutron spectrum and multiplicity for the neutron-induced fission of 237Np. The spectra of neutrons emitted from fragments for the three most dominant fission modes (standard Ⅰ, standard Ⅱ and superlong) are calculated separately, and the total spectrum is synthesized from them. The multi-modal parameters contained in the spectrum model are determined on the basis of experimental data on fission fragment mass distributions. The calculated total prompt fission neutron spectrum and multiplicity are in better agreement with the experimental data than those obtained from the conventional treatment of the Los Alamos model.
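The abstract does not give the formula used to synthesize the total spectrum. One form often used in multi-modal analyses, written here as an assumption rather than as the authors' exact expression, weights each modal spectrum by the mode's branching ratio and its average neutron multiplicity. The symbols $w_m$, $\bar{\nu}_m$ and $N_m(E)$ are introduced only for this illustration.

```latex
% Assumed notation: m runs over the modes S1 (standard I), S2 (standard II), SL (superlong);
% w_m = branching ratio of mode m, taken from the fitted fragment mass distribution;
% \bar{\nu}_m = average prompt neutron multiplicity of mode m;
% N_m(E) = normalized prompt neutron spectrum of mode m.
\[
  N_{\mathrm{tot}}(E)
    = \frac{\sum_{m} w_m \,\bar{\nu}_m \, N_m(E)}{\sum_{m} w_m \,\bar{\nu}_m},
  \qquad
  \bar{\nu}_{\mathrm{tot}} = \sum_{m} w_m \,\bar{\nu}_m,
  \qquad
  \sum_{m} w_m = 1 .
\]
```

With this weighting, a mode contributes to the total spectrum in proportion to the number of neutrons it emits, not merely to how often it occurs.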
Funding: Supported by the State Key Development Program for Basic Research of China (Nos. 2008CB717803, 2009GB107001, and 2007CB209903) and the Research Fund for the Doctoral Program of Higher Education of China (No. 200610011023).
Abstract: An attempt is made to improve the evaluation of the prompt fission neutron emission from the 233U(n, f) reaction for incident neutron energies below 6 MeV. The multi-modal fission approach is applied to the improved version of the Los Alamos model and to the point-by-point model. The prompt fission neutron spectra and the prompt fission neutron multiplicity as a function of fragment mass, ν(A) (usually called "sawtooth" data), are calculated independently for the three most dominant fission modes (standard I, standard II and superlong), and the total spectra and ν(A) are synthesized from them. The multi-modal parameters are determined on the basis of experimental data on fission fragment mass distributions. The calculated results describe the experimental data very well, so the proposed treatment is a useful tool for predicting prompt fission neutron emission.
Abstract: With the growth of the elderly population and the rising cost of health care, service robots are playing an increasingly important role in aiding the disabled and the elderly. Many researchers around the world have paid close attention to healthcare robots and rehabilitation robots. To achieve natural and harmonious communication between the user and a service robot, the robot's information perception and feedback ability and its interaction ability become especially important among the many key issues.
Abstract: In a telerobotic system for remote welding, the human-machine interface is one of the most important factors for enhancing capability and efficiency. This paper presents an architecture design of a human-machine interface for a welding telerobotic system: the welding multi-modal human-machine interface. The interface integrates several control modes, namely shared control, teleteaching, supervisory control and local autonomous control. A space mouse, a panoramic vision camera and a graphics simulation system are also integrated into the interface for welding teleoperation. Finally, weld seam tracing and welding experiments on a U-shaped seam are performed with each of these control modes. The results show that the system performs better in human-machine interaction and in welding in complex environments.
Funding: Supported, in part, by the National Natural Science Foundation of China (Grant No. 61773249), the Natural Science Foundation of Shanghai (Grant No. 16ZRl411700), and the Science and Technology on Near-Surface Detection Laboratory (Grant Nos. 6142414090117 and TCGZ2017A006).
Abstract: The acoustic vibration characteristics of landmines are investigated by means of modal analysis. According to the mechanical structure of landmines, a number of points are marked on the landmine shell to analyze its multi-modal vibration characteristics. Based on a laser self-mixing interferometer and taking the 69 plastic landmine as an example, a vibration detection experiment system is built to demonstrate the results of the multi-modal testing method. The first- and second-order natural frequencies are 38 Hz and 106 Hz for bricks, 112 Hz and 232 Hz for plastic landmines, and 74 Hz and 290 Hz for metal landmines. The first- and second-order natural frequencies of the bricks are thus far lower than those of the plastic and metal landmines. This indicates that landmines show multi-modal vibration characteristics under external excitation that are significantly different from those of bricks. The findings can be used for further research on acoustic landmine detection technology.
Abstract: Based on Forceville's theory of multi-modal metaphor, this paper adopts qualitative and quantitative research methods to analyze 60 social safety ads from China and America. It aims to demonstrate the similarities and differences in how the chosen ads use multi-modal metaphor and discusses the factors that cause these differences.
Funding: Supported by the National Natural Science Foundation of China (No. 62076035).
Abstract: Leveraging deep learning-based techniques to classify diseases has attracted extensive research interest in recent years. Nevertheless, most current studies consider only single-modal medical images, and the number of ophthalmic diseases that can be classified is relatively small. Moreover, the imbalanced data distribution of different ophthalmic diseases is not taken into consideration, which limits the application of deep learning techniques in realistic clinical scenes. In this paper, we propose a Multimodal Multi-disease Long-tailed Classification Network (M²LC-Net) in response to these challenges. M²LC-Net uses ResNet18-CBAM to extract features from fundus images and Optical Coherence Tomography (OCT) images separately, and then performs feature fusion to classify 11 common ophthalmic diseases. Moreover, Class Activation Mapping (CAM) is employed to visualize each modality and improve the interpretability of M²LC-Net. We conduct comprehensive experiments on a realistic dataset collected from a Grade III Level A ophthalmology hospital in China, comprising 34,396 images with 11 disease labels. Experimental results demonstrate the effectiveness of the proposed M²LC-Net. Compared with the state of the art, various performance metrics are improved significantly. Specifically, Cohen's kappa coefficient κ is improved by 3.21%, which is a remarkable improvement.
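For readers unfamiliar with the metric quoted above, Cohen's kappa compares the observed agreement between predicted and true labels with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a small self-contained sketch of that standard definition; the function name and the toy labels are illustrative and are not taken from the paper.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa for two equally long label sequences.

    p_o is the observed agreement; p_e is the chance agreement computed
    from the marginal label frequencies. The value is undefined when
    p_e == 1 (both sequences contain a single identical label).
    """
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # A Counter returns 0 for labels that never occur in the predictions.
    p_e = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)


# Toy example: p_o = 0.75, p_e = 0.5, so kappa = 0.5.
kappa = cohens_kappa([0, 0, 1, 1], [0, 0, 1, 0])
```

Unlike plain accuracy, kappa discounts agreement that a classifier could reach by always favouring frequent classes, which is why it is a natural metric for long-tailed label distributions.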
Funding: Supported by the Education and Teaching Reform and Research Project of Xi'an University of Science and Technology (JG14110), the Cultivation Fund of Xi'an University of Science and Technology (201640), and the Science and Technology Innovation Team Fund of the College of Architecture and Civil Engineering (17JGCXTD004).
Abstract: Taking the teaching practice of agricultural landscape planning as an example, this paper applies the multi-modal teaching idea to teaching design on the basis of traditional lecture-style teaching, covering multi-modal teaching materials, multi-modal teaching methods and multi-modal teaching evaluation. The results show that this method can effectively improve students' interest in learning, reinforce the theoretical basis of agricultural landscape planning, and improve practical skills in agricultural landscape planning. It is an active exploration of the multi-modal teaching model and a useful complement to traditional classroom teaching.
Abstract: Near-duplicate image detection is a necessary operation for refining image search results so that users can explore them efficiently. The existence of large numbers of near duplicates requires fast and accurate automatic near-duplicate detection methods. We have designed a coarse-to-fine near-duplicate detection framework to speed up the process and a multi-modal integration scheme for accurate detection. Duplicate pairs are detected with both a global feature (partition-based color histogram) and local features (CPAM and the SIFT Bag-of-Words model). Experimental results on a large-scale data set demonstrate the effectiveness of the proposed design.
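The coarse-to-fine idea can be sketched as a two-stage filter: a cheap global comparison rejects most pairs, and only the survivors reach the expensive local comparison. The code below is an illustrative outline under stated assumptions, not the authors' pipeline: histogram intersection stands in for the global color-histogram test, `fine_similarity` is a hypothetical callback standing in for the local-feature matcher, and both thresholds are made-up values.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two histograms that each sum to 1."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def coarse_to_fine_pairs(global_hists, fine_similarity,
                         coarse_thr=0.8, fine_thr=0.5):
    """Return index pairs (i, j), i < j, that pass both stages.

    global_hists: one normalized global histogram per image.
    fine_similarity: callable (i, j) -> score; a placeholder for an
        expensive local-feature comparison, invoked only for pairs that
        survive the coarse stage.
    """
    pairs = []
    n = len(global_hists)
    for i in range(n):
        for j in range(i + 1, n):
            # Coarse stage: cheap global check prunes dissimilar pairs.
            if histogram_intersection(global_hists[i], global_hists[j]) < coarse_thr:
                continue
            # Fine stage: costly check only on the remaining candidates.
            if fine_similarity(i, j) >= fine_thr:
                pairs.append((i, j))
    return pairs


# Toy data: images 0 and 1 share a histogram, image 2 differs.
hists = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
fine_calls = []


def fake_fine(i, j):
    fine_calls.append((i, j))
    return 1.0


duplicates = coarse_to_fine_pairs(hists, fake_fine)
```

In the toy run the fine stage is invoked for only one of the three candidate pairs, which is where the speed-up of a coarse-to-fine design comes from.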
Funding: Project supported by the National Natural Science Foundation of China (Nos. 11702170, 11320011, and 11802279) and the China Postdoctoral Science Foundation (No. 2016M601585).
Abstract: Modal parameter identification is a mature technology. However, some challenges remain in practical applications, such as identifying vibration systems with closely spaced modes and intensive noise contamination. This paper proposes a new time-frequency method based on intrinsic chirp component decomposition (ICCD) to address these issues. In this method, a redundant Fourier model is used to ameliorate border distortions and improve the accuracy of signal reconstruction. The effectiveness and accuracy of the proposed method are illustrated with three examples: a cantilever beam structure with intensive noise contamination or environmental interference, a four-degree-of-freedom structure with two closely spaced modes, and an impact test on a cantilever rectangular plate. Comparison with the identification method based on the empirical wavelet transform (EWT) shows that the presented method is effective even in a high-noise environment and that the dynamic characteristics of closely spaced modes are accurately determined.
Abstract: Diagnosing and predicting satellite faults is more difficult than for other equipment because of the complex structure of satellites and the multiple excitation sources of satellite faults. Generally, one kind of reasoning model can diagnose and predict only one kind of satellite fault. This paper introduces an application of a new method that uses multi-modal reasoning to diagnose and predict satellite faults. The method has been used successfully in the development of the knowledge-based satellite fault diagnosis and recovery system (KSFDRS), and it is shown to be effective.
Abstract: Based on observation and analysis of teaching videos of middle school English teachers, this paper identifies the problems of underuse, misuse and overuse of teaching gestures in middle school English teaching. It then puts forward corresponding solutions from three aspects: concept, theory and practice. It is hoped that this will provide a further reference on the complementary role of teaching gesture and teaching discourse.