Funding: Supported by the National Natural Science Foundation of China (No. 62302540), with author Fangfang Shan; see https://www.nsfc.gov.cn/ (accessed on 31/05/2024). Also funded by the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022020), where Fangfang Shan is an author; see http://xt.hnkjt.gov.cn/data/pingtai/ (accessed on 31/05/2024). Also supported by the Natural Science Foundation of Henan Province Youth Science Fund Project (No. 232300420422); see https://kjt.henan.gov.cn/2022/09-02/2599082.html (accessed on 31/05/2024).
Abstract: Social media has become increasingly significant in modern society, but it has also turned into a breeding ground for the propagation of misleading information, potentially causing a detrimental impact on public opinion and daily life. Compared to pure text content, multimodal content significantly increases the visibility and shareability of posts. This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection. To effectively address the critical challenge of accurately detecting fake news on social media, this paper proposes a fake news detection model based on cross-modal message aggregation and a gated fusion network (MAGF). MAGF first uses BERT to extract cumulative textual feature representations and word-level features, applies Faster Region-based Convolutional Neural Network (Faster R-CNN) to obtain image objects, and leverages ResNet-50 and Visual Geometry Group-19 (VGG-19) to obtain image region features and global features. The image region features and word-level text features are then projected into a low-dimensional space to calculate a text-image affinity matrix for cross-modal message aggregation. The gated fusion network combines text and image region features to obtain adaptively aggregated features. The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to produce multimodal representations. Finally, these fused features are fed into a classifier for news categorization. Experiments were conducted on two public datasets, Twitter and Weibo. Results show that the proposed model achieves accuracy rates of 91.8% and 88.7% on the two datasets, respectively, significantly outperforming traditional unimodal and existing multimodal models.
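The affinity-based message aggregation and gated fusion steps named in this abstract can be illustrated with a minimal pure-Python sketch. It is not the MAGF implementation: the abstract does not give the formulas, so the scaled dot-product affinity, the softmax weighting, and the parameter-free gate below are assumptions, and all function names are hypothetical. In a trained model the gate would use learned weights.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affinity_matrix(words, regions):
    """Assumed affinity: scaled dot product between each word and each region."""
    scale = math.sqrt(len(words[0]))
    return [[dot(w, r) / scale for r in regions] for w in words]


def aggregate_messages(words, regions):
    """For each word, build a softmax-weighted sum of region features."""
    dim = len(regions[0])
    messages = []
    for row in affinity_matrix(words, regions):
        weights = softmax(row)
        messages.append(
            [sum(wt * r[k] for wt, r in zip(weights, regions)) for k in range(dim)]
        )
    return messages


def gated_fusion(words, messages):
    """Element-wise gate (assumed, parameter-free): g*word + (1-g)*message."""
    fused = []
    for w, m in zip(words, messages):
        row = []
        for wk, mk in zip(w, m):
            g = sigmoid(wk + mk)  # stand-in for a learned gate
            row.append(g * wk + (1.0 - g) * mk)
        fused.append(row)
    return fused
```

Each fused value is a convex combination of the word feature and its aggregated image message, so the gate decides per dimension how much visual evidence to let through.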
Funding: Supported by the National Natural Science Foundation of China (No. 62302540), with author Fangfang Shan; see https://www.nsfc.gov.cn/ (accessed on 05 June 2024). Also funded by the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022020), where Fangfang Shan is an author; see http://xt.hnkjt.gov.cn/data/pingtai/ (accessed on 05 June 2024). Also supported by the Natural Science Foundation of Henan Province Youth Science Fund Project (No. 232300420422); see https://kjt.henan.gov.cn (accessed on 05 June 2024).
Abstract: With the emergence and development of social networks, people can stay in touch with friends, family, and colleagues more quickly and conveniently, regardless of their location. This ubiquitous digital internet environment has also led to large-scale disclosure of personal privacy. Due to the complexity and subtlety of sensitive information, traditional sensitive information identification technologies cannot thoroughly address the characteristics of each piece of data, thus weakening the deep connections between text and images. In this context, this paper adopts the CLIP model as a modality discriminator. By using contrastive learning between sensitive image descriptions and images, the similarity between the images and the sensitive descriptions is obtained to determine whether the images contain sensitive information. This provides the basis for identifying sensitive information using different modalities. Specifically, if the original data does not contain sensitive information, only single-modality text-sensitive information identification is performed; if the original data contains sensitive information, multimodal sensitive information identification is conducted. This approach allows for differentiated processing of each piece of data, thereby achieving more accurate sensitive information identification. The modality discriminator can thus address the limitations of existing sensitive information identification technologies, making the identification of sensitive information from the original data more appropriate and precise.
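The routing decision made by the modality discriminator can be sketched as below. This is an illustration under stated assumptions, not the authors' code: it takes image and description embeddings as plain lists of floats (in practice they would come from a CLIP encoder), uses cosine similarity, and applies a hypothetical `threshold` that the abstract does not specify. The function names are also hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def route_modality(image_emb, description_embs, threshold=0.8):
    """Pick the identification path for one post.

    image_emb: embedding of the post's image (assumed precomputed).
    description_embs: embeddings of sensitive image descriptions.
    threshold: hypothetical cut-off; the source gives no value.
    """
    best = max(cosine_similarity(image_emb, d) for d in description_embs)
    # A close match to any sensitive description sends the post down the
    # multimodal path; otherwise only the text is analysed.
    return "multimodal" if best >= threshold else "text_only"
```

For example, `route_modality([0.9, 0.1, 0.0], [[1.0, 0.0, 0.0]])` selects the multimodal path, because the image embedding is nearly parallel to the sensitive description embedding.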
Funding: The Basic Science Center Program (32288101); the National Natural Science Foundation of China (NSFC) grants (32271186, 31771325, 32030020, 31961130380, T2122007, and 32070577); and the National Science and Technology Basic Research Project (2015FY111700 to LJ).
Abstract: Phenotypic diversity, especially that of facial morphology, has not been fully investigated in the Han Chinese, which is the largest ethnic group in the world. In this study, we systematically analyzed a total of 14,838 facial traits representing 15 categories with both a large-scale three-dimensional (3D) manual landmarking database and computer-aided facial segmented phenotyping in 2379 Han Chinese individuals. Our results illustrate that homogeneous and heterogeneous facial morphological traits exist among Han Chinese populations across the three geographical regions: Zhengzhou, Taizhou, and Nanning. We identified 1560 shared features from extracted phenotypes, which characterized well the basic facial morphology of the Han Chinese. In particular, heterogeneous phenotypes showing population structures corresponded to geographical subpopulations. The greatest facial variation among these geographical populations was in the angle of the glabella, left subalare, and right cheilion (p = 3.4 × 10^(−161)). Interestingly, we found that Han Chinese populations could be classified into northern Han, central Han, and southern Han at the phenotypic level, and the facial morphological variation pattern of central Han Chinese was between the typical differentiation of northern and southern Han Chinese. This result was highly consistent with the results revealed by the genetic data. These findings provide new insights into the analysis of multidimensional phenotypes as well as a valuable resource for further facial phenotype-genotype association studies in Han Chinese and East Asian populations.
Funding: Supported by the National Natural Science Foundation of China Project (No. 62302540); see https://www.nsfc.gov.cn/ (accessed on 18 June 2024). Also supported by the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022020); see http://xt.hnkjt.gov.cn/data/pingtai/ (accessed on 18 June 2024). Also supported by the Natural Science Foundation of Henan Province Youth Science Fund Project (No. 232300420422); see https://kjt.henan.gov.cn/2022/09-02/2599082.html (accessed on 18 June 2024).
Abstract: In response to the challenges of generating Attribute-Based Access Control (ABAC) policies, this paper proposes a deep learning-based method to automatically generate ABAC policies from natural language documents. This method is aimed at organizations such as companies and schools that are transitioning from traditional access control models to the ABAC model. The manual retrieval and analysis involved in this transition are inefficient, prone to errors, and costly. Most organizations have high-level specifications defined for security policies that include a set of access control policies, which often exist in the form of natural language documents. Utilizing this rich source of information, our method effectively identifies and extracts the necessary attributes and rules for access control from natural language documents, thereby constructing and optimizing access control policies. This work transforms the problem of automated policy generation into two tasks: extraction of access control statements and mining of access control attributes. First, the Chat General Language Model (ChatGLM) is employed to extract access control-related statements from a wide range of natural language documents by constructing unique prompts and leveraging the model's In-Context Learning to contextualize the statements. Then, the Iterated Dilated-Convolutions-Conditional Random Field (ID-CNN-CRF) model is used to annotate access control attributes within these extracted statements, including subject attributes, object attributes, and action attributes, thus reassembling new access control policies. Experimental results show that our method, compared to baseline methods, achieved the highest F1 score of 0.961, confirming the model's effectiveness and accuracy.
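The final step, reassembling a policy from annotated attributes, can be sketched as follows. The sketch assumes the sequence tagger emits BIO-style labels; the tag names `SUB`, `ACT`, and `OBJ`, the output dictionary layout, and the function name are all hypothetical, since the abstract does not describe the tagging scheme.

```python
def assemble_policy(tokens, tags):
    """Group BIO-tagged tokens into subject, action, and object attributes.

    tokens: words of one access control statement.
    tags: one label per token, e.g. "B-SUB", "I-SUB", "B-ACT", "O"
          (assumed label set, not taken from the source).
    """
    label_to_field = {"SUB": "subject", "ACT": "action", "OBJ": "object"}
    policy = {"subject": [], "action": [], "object": []}
    current_field, current_words = None, []

    def flush():
        # Store the span collected so far, then reset the buffer.
        nonlocal current_field, current_words
        if current_field is not None:
            policy[current_field].append(" ".join(current_words))
        current_field, current_words = None, []

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        field = label_to_field.get(label)
        if prefix == "B" and field:
            flush()
            current_field, current_words = field, [token]
        elif prefix == "I" and field and field == current_field:
            current_words.append(token)
        else:
            flush()  # "O" or an inconsistent tag ends the current span
    flush()
    return policy
```

Given the statement "A registered nurse can read patient records ." tagged `O B-SUB I-SUB O B-ACT B-OBJ I-OBJ O`, the function groups "registered nurse" as the subject, "read" as the action, and "patient records" as the object.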
Funding: Shanghai Municipal Science and Technology Major Project (2017SHZDZX01); CAMS Innovation Fund for Medical Sciences (2019-I2M-5-066); the National Basic Research Program of China (2015FY111700). This work was also supported by the Postdoctoral Science Foundation of China (2018M640333 and 2019M651354).
Abstract: Next-generation sequencing technologies have significantly accelerated the identification of disease-causing mutations and facilitated the emergence of personalized medicine (Genomes Project Consortium et al., 2015; Goodwin et al., 2016; Sirugo et al., 2019). In comparison with whole-genome sequencing, whole-exome sequencing (WES), which covers the coding regions of the genome, offers a cost-efficacy balance. WES provides deeper sequencing depth (>100) and allows the more accurate detection of rare variants that are tailored for clinical applications (Lek et al., 2016).
Funding: Supported by Shanghai Municipal Science and Technology Major Project (2017SHZDZX01); National Science Foundation of China (31330038); CAMS Innovation Fund for Medical Sciences (2019-I2M-5-066); Science and Technology Committee of Shanghai Municipality (16JC1400500); Ministry of Science and Technology (2015FY1117000); the 111 Project (B13016); Major Project of Special Development Funds of Zhangjiang National Independent Innovation Demonstration Zone (ZJ2019-ZD-004); and the Postdoctoral Science Foundation of China (2018M640333).
Abstract: Altitude acclimatization is a human physiological process of adjusting to decreased oxygen availability. Since several physiological processes are involved and their correlations are complicated, analyses of single traits are insufficient to reveal the complex mechanism of high-altitude acclimatization. In this study, we examined these physiological responses as composite phenotypes that are represented by a linear combination of physiological traits. We developed a strategy that combines both spectral clustering and partial least squares path modeling (PLSPM) to define composite phenotypes based on a cohort study of 883 Chinese Han males. In addition, we captured 14 composite phenotypes from 28 physiological traits of high-altitude acclimatization. Using these composite phenotypes, we applied k-means clustering to reveal hidden population physiological heterogeneity in high-altitude acclimatization. Furthermore, we employed multivariate linear regression to systematically model (Models 1 and 2) oxygen saturation (SpO₂) changes in high-altitude acclimatization and evaluated model fitness performance. Composite phenotypes based on Model 2 fit better than single trait-based Model 1 in all measurement indices. This new strategy of using composite phenotypes may potentially be employed as a general strategy for complex-trait research such as genetic loci discovery and analyses of phenomics.
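The idea of a composite phenotype as a linear combination of physiological traits can be shown with a small sketch. The weights here are hand-picked placeholders; in the study they would come from PLSPM, which this sketch does not implement. Standardizing each trait before combining is an assumption, and the trait names and function names are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw measurements to z-scores (population standard deviation)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_phenotype(traits, weights):
    """Weighted sum of standardized traits, one score per subject.

    traits: dict mapping trait name to a list of per-subject measurements.
    weights: dict mapping trait name to its weight (placeholder values,
             standing in for weights estimated by PLSPM).
    """
    z = {name: standardize(vals) for name, vals in traits.items()}
    n_subjects = len(next(iter(traits.values())))
    return [
        sum(weights[name] * z[name][i] for name in weights)
        for i in range(n_subjects)
    ]
```

A call such as `composite_phenotype({"heart_rate": [60, 70, 80], "spo2": [98, 94, 90]}, {"heart_rate": 0.5, "spo2": -0.5})` yields one score per subject, which could then be passed to a clustering or regression step.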
Funding: Support for this work, provided by the National Natural Science Foundation of China (Grant No. 61972153), the National Key Research and Development Program (No. 2018YFE0101000), and the Key Projects of the Ministry of Science and Technology (No. 2020AAA0107800), is gratefully acknowledged.
Abstract: Autonomous driving has attracted widespread attention from academia and industry. Deep learning is effective and essential for the development of AI components of Autonomous Vehicles (AVs). However, it is challenging to adopt multi-source heterogeneous data in deep learning. Therefore, we propose a novel data-driven approach for the delivery of high-quality Spatio-Temporal Trajectory Data (STTD) to AVs, which can be deployed to assist the development of AI components with deep learning. The novelty of our work is that the meta-model of STTD is constructed based on the domain knowledge of autonomous driving. Our approach, which covers the collection, preprocessing, storage, and modeling of STTD as well as the training of AI components, helps to process and utilize huge amounts of STTD efficiently. To further demonstrate the usability of our approach, a case study of vehicle behavior prediction using Long Short-Term Memory (LSTM) networks is discussed. Experimental results show that our approach facilitates the training process of AI components with the STTD.