Funding: Supported by the National Key Research and Development Project under Grant 2020YFB1807602, the Key Program of Marine Economy Development Special Foundation of Department of Natural Resources of Guangdong Province (GDNRC[2023]24), and the National Natural Science Foundation of China under Grant 62271267.
Abstract: Recently, there have been significant advancements in the study of semantic communication in single-modal scenarios. However, the ability to process information in multi-modal environments remains limited. Inspired by the research and applications of natural language processing across different modalities, our goal is to accurately extract frame-level semantic information from videos and ultimately transmit high-quality videos. Specifically, we propose a deep learning-based Multi-Modal Mutual Enhancement Video Semantic Communication system, called M3E-VSC. Built upon a Vector Quantized Generative Adversarial Network (VQGAN), our system aims to leverage mutual enhancement among different modalities by using text as the main carrier of transmission. With it, semantic information can be extracted from the key-frame images and audio of the video, and a differential-value step is performed to ensure that the extracted text conveys accurate semantic information with fewer bits, thus improving the capacity of the system. Furthermore, a multi-frame semantic detection module is designed to facilitate semantic transitions during video generation. Simulation results demonstrate that the proposed model maintains high robustness in complex noise environments, particularly in low signal-to-noise ratio conditions, improving the accuracy and speed of semantic transmission in video communication by approximately 50 percent.
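The abstract names VQGAN as the backbone but does not describe its internals. As a rough illustration of the generic vector-quantization step that such models rely on (not the M3E-VSC implementation), the sketch below maps each latent vector to the index of its nearest codebook entry, so that only integer indices need to be transmitted. The function names, the toy codebook, and the latent values are all assumptions made for this example.

```python
def quantize(vectors, codebook):
    """Return, for each vector, the index of the nearest codebook entry
    under squared Euclidean distance (generic VQ step, illustrative only)."""
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def dequantize(indices, codebook):
    """Recover the quantized vectors from the transmitted indices."""
    return [codebook[i] for i in indices]


# Hypothetical 2-D codebook and encoder outputs.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
latents = [[0.1, -0.1], [0.9, 0.8], [0.2, 0.9]]

indices = quantize(latents, codebook)        # what would be sent over the channel
reconstructed = dequantize(indices, codebook)
```

Sending indices instead of continuous latents is what makes the representation compact; how the paper combines this with text and differential values is not specified in the abstract.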
Funding: the National Natural Science Foundation of China (No. 61976080); the Academic Degrees & Graduate Education Reform Project of Henan Province (No. 2021SJGLX195Y); the Teaching Reform Research and Practice Project of Henan Undergraduate Universities (No. 2022SYJXLX008); the Key Project on Research and Practice of Henan University Graduate Education and Teaching Reform (No. YJSJG2023XJ006).
Abstract: Unsupervised multi-modal image translation is an emerging domain of computer vision whose goal is to transform an image from the source domain into many diverse styles in the target domain. However, advanced approaches typically employ a multi-generator mechanism to model different domain mappings, which results in inefficient training of neural networks and mode collapse, and in turn limits the diversity of the generated images. To address this issue, this paper introduces a multi-modal unsupervised image translation framework that uses a single generator to perform multi-modal image translation. First, a domain code is introduced to explicitly control the different generation tasks. Second, the squeeze-and-excitation (SE) mechanism and a feature attention (FA) module are incorporated. Finally, the model integrates multiple optimization objectives to ensure efficient multi-modal translation. Qualitative and quantitative experiments on multiple unpaired benchmark image translation datasets demonstrate the benefits of the proposed method over existing technologies. Overall, the experimental results show that the proposed method is versatile and scalable.
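For readers unfamiliar with the squeeze-and-excitation (SE) mechanism mentioned above, here is a minimal pure-Python sketch of a standard SE block: global average pooling per channel (squeeze), two small fully connected layers with ReLU and sigmoid (excitation), and channel-wise rescaling. The weights, shapes, and function name are assumptions for illustration; the abstract does not say how the paper configures its SE layers.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Apply a standard SE block to a feature map.

    feature_map: list of C channels, each an H x W list of lists.
    w1: hidden x C weight matrix; w2: C x hidden weight matrix (no biases).
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map
    ]
    # Excitation: FC -> ReLU -> FC -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * x for x in row] for row in ch] for g, ch in zip(gates, feature_map)]
    return scaled, gates


# Hypothetical 2-channel, 2x2 feature map with a 1-unit bottleneck.
feature_map = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]
w2 = [[1.0], [-1.0]]
scaled, gates = squeeze_excite(feature_map, w1, w2)
```

The gates lie strictly between 0 and 1, so the block can only attenuate channels relative to one another; it never changes spatial layout.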
Funding: National Key R&D Program of China (No. 2022ZD0118401).
Abstract: Deep learning-based methods have been successfully applied to semantic segmentation of optical remote sensing images. However, as more and more remote sensing data becomes available, comprehensively utilizing multi-modal remote sensing data to break through the performance bottleneck of single-modal interpretation is a new challenge. In addition, semantic segmentation and height estimation in remote sensing data are two strongly correlated tasks, but existing methods usually study them separately, which leads to high computational resource overhead. To this end, we propose a Multi-Task learning framework for Multi-Modal remote sensing images (MM_MT). Specifically, we design a Cross-Modal Feature Fusion (CMFF) method, which aggregates complementary information from different modalities to improve the accuracy of semantic segmentation and height estimation. In addition, a dual-stream multi-task learning method is introduced for Joint Semantic Segmentation and Height Estimation (JSSHE), extracting common features in a shared network to save time and resources, and then learning task-specific features in two task branches. Experimental results on the public multi-modal remote sensing image dataset Potsdam show that, compared with training the two tasks independently, multi-task learning saves 20% of training time and achieves competitive performance, with an mIoU of 83.02% for semantic segmentation and an accuracy of 95.26% for height estimation.
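The segmentation result above is reported as mIoU. As a reminder of how that metric is commonly computed, the sketch below averages per-class intersection-over-union over flattened label lists. The function name and the toy labels are assumptions; the paper's exact evaluation protocol (for example, which classes are ignored) is not given in the abstract.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps with three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.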
Funding: This research was funded by the General Project of Philosophy and Social Science of Heilongjiang Province, Grant Number: 20SHB080.
Abstract: In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and how to incorporate it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-modal Dominance and fusing Multiple Multi-modal Cues (TD-MMC), which utilizes three valuable multi-modal cues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is dominated by textual content and assisted by image information, while using social network information to enhance the text representation. To reduce interference from irrelevant social structure information, we use a unidirectional cross-modal attention mechanism to selectively learn the social structure's features. A cross-modal attention mechanism is adopted to obtain text-image cross-modal features while retaining textual features, reducing the loss of important information. In addition, TD-MMC employs a new multi-modal loss to improve the model's generalization ability. Extensive experiments have been conducted on two public real-world datasets, one English and one Chinese, and the results show that the proposed model outperforms state-of-the-art methods on classification evaluation metrics.
Abstract: In this paper, a study was carried out on four African ports, each of which has the capability to become a hub port serving the central African region. The paper sought to determine which port was most suitable, and port indexing was the method used to evaluate them. The ports evaluated were the port of Kribi, the port of Bata, the port of Libreville, and the port of Pointe-Noire. Other models were also used, including linear regression and linear programming, all of which contributed to the final assessment of which port has the greatest potential to serve as a hub port. The final results showed that the port of Pointe-Noire was the most suitable port to serve the central African region as a hub port.
Abstract: The rapid development of electric buses has brought a surge in the number of bus hubs and in their charging and discharging capacities. The location and construction scale of bus hubs will therefore greatly affect the operating costs and benefits of urban distribution networks in the future. Scientific and reasonable planning of public transport hubs, on the premise of meeting basic public transport service needs, can reduce the negative impact of electric bus charging loads on the power grid. Furthermore, the flexible operating characteristics of such hubs can provide flexible support for the distribution network. In this paper, taking the impact of public transport hubs on distribution network reliability as the starting point, a three-level programming optimization model based on the value and economy of distribution network load loss is proposed. The upper-level model generates several planning schemes, which provide boundary conditions for the middle-level optimization. The normal-operation dispatching scheme of the public transport hub obtained from the middle-level optimization provides boundary conditions for the lower-level optimization. The lower-level optimization yields the expected load loss of the whole distribution system, including the bus hub, under the planning scheme given by the upper level. The effectiveness of the model is verified on an IEEE 33-bus example.
Funding: Supported by the National Natural Science Foundation of China under Grants No. 61720106004 and No. 61872405, and the Key R&D Project of Sichuan Province, China under Grants No. 20ZDYF2772 and No. 2020YFS0243.
Abstract: Cardiomyopathies represent the most common clinically and genetically heterogeneous group of diseases that affect heart function. Though progress has been made in elucidating the process, the molecular mechanisms of different classes of cardiomyopathies remain elusive. This paper aims to describe the similarities and differences in the molecular features of dilated cardiomyopathy (DCM) and ischemic cardiomyopathy (ICM). We first detected co-expressed modules using weighted gene co-expression network analysis (WGCNA). Significant modules associated with DCM/ICM were identified by the Pearson correlation coefficient (PCC) between the modules and the DCM/ICM phenotype. The differentially expressed genes in the modules were selected for functional enrichment analysis. Potential transcription factors (TFs) were predicted for the transcriptional regulation of hub genes. Apoptosis and cardiac conduction were perturbed in DCM and ICM, respectively. The TF analysis demonstrated that the biomarkers and the transcriptional regulation in DCM and ICM were different, which helps discriminate between them more accurately at the molecular level. In conclusion, comprehensive analyses of the molecular features may advance our understanding of the causes and progression of DCM and ICM. This understanding may in turn promote the development of innovative diagnoses and treatments.
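The module-selection step above relies on the Pearson correlation coefficient between a module and the phenotype. A minimal sketch of that calculation follows, assuming each module is summarized by one value per sample (for example, a module eigengene) and the phenotype is coded 0/1. The variable names and numbers are invented for illustration and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical module summary values for four samples and a binary phenotype
# (0 = control, 1 = disease).
module_summary = [0.2, 0.4, 0.6, 0.8]
phenotype = [0, 0, 1, 1]
r = pearson(module_summary, phenotype)
```

A module whose summary values track the phenotype closely gives |r| near 1; in practice a significance test would normally accompany the coefficient, but the abstract does not state which thresholds were used.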