Funding: Supported by the National Natural Science Foundation of China (61976160, 61906137, 61976158, 62076184, 62076182) and the Shanghai Science and Technology Plan Project (21DZ1204800).
Abstract: Background: External knowledge representations play an essential role in knowledge-based visual question answering, helping models better understand complex scenarios in the open world. Recent entity-relationship embedding approaches are deficient in representing some complex relations, resulting in a lack of topic-related knowledge and redundancy in topic-irrelevant information. Methods: To this end, we propose MKEAH: Multimodal Knowledge Extraction and Accumulation on Hyperplanes. To ensure that the lengths of the feature vectors projected onto the hyperplane compare equally and to filter out sufficient topic-irrelevant information, two losses are proposed to learn the triplet representations from complementary views: range loss and orthogonal loss. To interpret the capability of extracting topic-related knowledge, we present the Topic Similarity (TS) between topic and entity-relations. Results: Experimental results demonstrate the effectiveness of hyperplane embedding for knowledge representation in knowledge-based visual question answering. Our model outperformed state-of-the-art methods by 2.12% and 3.24% on two challenging knowledge-request datasets, OK-VQA and KRVQA, respectively. Conclusions: The clear advantages of our model in TS show that using hyperplane embedding to represent multimodal knowledge can improve its ability to extract topic-related knowledge.
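The abstract does not give the projection formula or the definitions of the range and orthogonal losses. As a rough illustration of the geometric idea only, the sketch below assumes a hyperplane through the origin described by a normal vector `w` and removes from a feature vector `v` its component along `w`. The function names are hypothetical and are not taken from MKEAH.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_onto_hyperplane(v, w):
    """Project v onto the hyperplane through the origin with normal w.

    Assumption: the projection is v - (v . w_hat) * w_hat, where w_hat is
    w scaled to unit length. The paper may use a different formulation.
    """
    norm = math.sqrt(dot(w, w))
    if norm == 0.0:
        raise ValueError("normal vector must be non-zero")
    unit = [x / norm for x in w]
    along_normal = dot(v, unit)
    return [x - along_normal * u for x, u in zip(v, unit)]


# Illustrative values only: the projected vector has no component along w.
projected = project_onto_hyperplane([1.0, 2.0, 3.0], [0.0, 0.0, 2.0])
print(projected, dot(projected, [0.0, 0.0, 2.0]))
```

After projection, the inner product with the normal is zero, which is the property any hyperplane embedding of this kind relies on.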
Funding: Supported by the Sichuan Science and Technology Program (2021YFQ0003, 2023YFSY0026, 2023YFH0004).
Abstract: In the field of natural language processing (NLP), various pre-trained language models have appeared in recent years, with question answering systems gaining significant attention. However, as algorithms, data, and computing power advance, models have grown ever larger and their parameter counts have increased. Consequently, model training has become more costly and less efficient. To enhance the efficiency and accuracy of the training process while reducing the model size, this paper proposes a first-order pruning model, PAL-BERT, based on the ALBERT model and tailored to the characteristics of question-answering (QA) systems and language models. First, a first-order network pruning method based on the ALBERT model is designed, forming the PAL-BERT model. Then, the parameter optimization strategy of the PAL-BERT model is formulated, and the Mish function is used as the activation function instead of ReLU to improve performance. Finally, comparison experiments with the traditional deep learning models TextCNN and BiLSTM confirm that PAL-BERT is a pruning-based model compression method that can significantly reduce training time and optimize training efficiency. Compared with traditional models, PAL-BERT significantly improves performance on the NLP task.
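The abstract names Mish as the replacement for ReLU without defining it. A minimal sketch, assuming the standard definition mish(x) = x * tanh(softplus(x)), shows how the two activations differ for negative inputs. The helper names are illustrative and do not come from the PAL-BERT code.

```python
import math


def softplus(x):
    """Numerically stable softplus: ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation, assuming the usual form x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def relu(x):
    """ReLU activation, for comparison."""
    return max(x, 0.0)


# Mish is smooth and lets small negative values through; ReLU clips them to 0.
for value in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(value, relu(value), mish(value))
```

For large positive inputs the two functions nearly coincide, so the difference is concentrated around and below zero.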
Abstract: The weapon and equipment operational requirement analysis (WEORA) is a necessary condition for winning a future war, and acquiring knowledge about weapons and equipment is a major challenge within it. The main difficulty is that existing weapons and equipment data lack a structured knowledge representation, and knowledge navigation based on natural language cannot efficiently support the WEORA. To solve the above problem, this research proposes a method based on question answering (QA) over a weapons and equipment knowledge graph (WEKG) to construct and navigate the knowledge related to weapons and equipment in the WEORA. The method first constructs the WEKG and then builds a neural network-based QA system over the WEKG by means of semantic parsing for knowledge navigation. Finally, the method is evaluated, and a chatbot built on the QA system is developed for the WEORA. The proposed method performs well in the accuracy and efficiency of searching for target knowledge and can effectively assist the WEORA.
Funding: Supported by the Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004).
Abstract: Recent advancements in natural language processing have given rise to numerous pre-trained language models in question-answering systems. However, with the constant evolution of algorithms, data, and computing power, the increasing size and complexity of these models have led to higher training costs and reduced efficiency. This study aims to minimize the inference time of such models while maintaining computational performance. It proposes a novel distillation model for PAL-BERT (DPAL-BERT), which employs knowledge distillation, using the PAL-BERT model as the teacher to train two student models: DPAL-BERT-Bi and DPAL-BERT-C. The dataset is augmented through techniques such as masking, replacement, and n-gram sampling to optimize knowledge transfer. The experimental results show that the distilled models greatly outperform models trained from scratch. In addition, although the distilled models exhibit a slight decrease in performance compared to PAL-BERT, they reduce inference time to just 0.25% of the original. This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency.
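The abstract does not state which distillation objective DPAL-BERT uses. As a hedged sketch, the code below assumes the common soft-target formulation: the student is trained to match the teacher's temperature-softened output distribution, measured by KL divergence and scaled by the squared temperature. The function names, the default temperature, and the logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, times T^2.

    Assumption: this is the generic soft-target loss, not necessarily
    the exact objective used for DPAL-BERT.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0.0)
    return kl * temperature ** 2


# Made-up logits: the loss is zero for identical outputs, positive otherwise.
print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]))
```

In practice this term is usually combined with the ordinary cross-entropy loss on the true labels; the weighting between the two is another choice the abstract leaves open.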
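Abstract: Irrigating farmland with saline or brackish water is one of the main ways to ease the imbalance between agricultural water supply and demand in Xinjiang, China, and thereby safeguard the sustainable development of the local cotton industry. To clarify how different saline-water irrigation practices affect cotton yield and economic returns, this study obtained soil, crop, and irrigation data for nine experimental sites in Xinjiang from a two-year field experiment on cotton under mulched drip irrigation and from a literature search. It evaluated the applicability and reliability of the analytical model of crop yield response to water and salt stress (ANalytical Salt WatER, ANSWER) for assessing cotton yield in Xinjiang and, combined with an economic revenue-expenditure balance method, simulated and analyzed the effects of different saline-water irrigation practices (combinations of irrigation quota and irrigation-water electrical conductivity) on cotton yield and economic returns. Model accuracy was evaluated with the coefficient of determination (R²), root mean squared error (RMSE), and relative root mean squared error (RRMSE). The results showed that at all nine sites the ANSWER model estimated the relative yield of cotton fairly accurately, with R² ≥ 0.54, RMSE ≤ 0.14, and RRMSE ≤ 0.16 between estimated and measured values. The optimized biological parameters of the model (parameters related to the response of cotton root water uptake to water and salt stress) differed little among sites, with absolute values of the coefficient of variation between 0.08 and 0.37. When relative cotton yield at each site was estimated using the means of the biological parameters optimized across sites, the estimates still agreed well with the measured values (R² = 0.59, RMSE = 0.06, RRMSE = 0.07). In addition, for a given irrigation-water electrical conductivity, cotton net income first increased and then decreased as the irrigation quota increased, and the irrigation quota needed to reach peak net income rose rapidly as the electrical conductivity increased. When the irrigation-water electrical conductivity did not exceed 10 dS/m, increasing the water supply could achieve net income comparable to that of freshwater irrigation. The study can provide a theoretical basis for assessing cotton yield and returns and for the rational development and use of saline water resources in Xinjiang.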
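The three accuracy metrics are named in the abstract but not written out. A minimal sketch follows, assuming R² is computed as 1 minus the ratio of residual to total sum of squares and RRMSE as RMSE divided by the mean of the measured values; the study may use other conventions (for example, R² as a squared correlation). The numbers are invented for illustration and are not data from the paper.

```python
import math


def rmse(measured, estimated):
    """Root mean squared error between measured and estimated values."""
    squared = [(m - e) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared) / len(measured))


def rrmse(measured, estimated):
    """Relative RMSE: RMSE divided by the mean measured value (assumed form)."""
    return rmse(measured, estimated) / (sum(measured) / len(measured))


def r_squared(measured, estimated):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed form)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Invented example values, not results from the study.
measured = [1.0, 2.0, 3.0]
estimated = [1.0, 2.0, 4.0]
print(r_squared(measured, estimated), rmse(measured, estimated), rrmse(measured, estimated))
```

Because relative yield is dimensionless, RMSE and RRMSE in the abstract are both unitless, which is why their reported magnitudes are directly comparable.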
Abstract: Obesity is recognized as the second highest risk factor for cancer. The pathogenic mechanisms underlying tobacco-related cancers are well characterized, and effective programs have led to a decline in smoking and related cancers, but there is a global epidemic of obesity without a clear understanding of how obesity causes cancer. Obesity is heterogeneous, and approximately 25% of obese individuals remain healthy (metabolically healthy obese, MHO), so which fat deposition (subcutaneous versus visceral, adipose versus ectopic) is "malignant"? What is the mechanism of carcinogenesis? Is it metabolic dysregulation or chronic inflammation? Through which chemokines, genes, or signaling pathways does adipose tissue influence carcinogenesis? Can selective inhibition of these pathways uncouple obesity from cancers? Do all obesity-related cancers (ORCs) share a molecular signature? Are there common (overlapping) genetic loci that make individuals susceptible to obesity, metabolic syndrome, and cancers? Can we identify precursor lesions of ORCs, and will early intervention in high-risk individuals alter the natural history? It appears unlikely that the obesity epidemic will be controlled anytime soon; answers to these questions will help to reduce the adverse effect of obesity on the human condition.
Abstract: Over the last couple of decades, community question-answering sites (CQAs) have been a topic of much academic interest. Scholars have often leveraged traditional machine learning (ML) and deep learning (DL) to explore the ever-growing volume of content that CQAs engender. To clarify the current state of the CQA literature that has used ML and DL, this paper reports a systematic literature review. The goal is to summarise and synthesise the major themes of CQA research related to (i) questions, (ii) answers and (iii) users. The final review included 133 articles. Dominant research themes include question quality, answer quality, and expert identification. In terms of datasets, some of the most widely studied platforms include Yahoo! Answers, Stack Exchange and Stack Overflow. The scope of most articles was confined to just one platform, with few cross-platform investigations. Articles with ML outnumber those with DL. Nonetheless, the use of DL in CQA research is on an upward trajectory. A number of research directions are proposed.
Abstract: How to Differentiate and Treat Bi-syndrome by Acupuncture and Moxibustion? Bi-syndrome is the syndrome due to invasion of the exogenous pathogenic factors of wind, cold and dampness, which obstruct the channels and collaterals, leading to stagnated flow of qi and blood. It is characterized by such clinical manifestations as aching pain, numbness, heaviness, limited flexion and extension of the muscles, tendons and joints, or swelling and burning heat of the joints. This syndrome includes rheumatic arthritis, rheumatoid arthritis, osseous arthritis, and various neuralgias. The endogenous causative factors for the occurrence of Bi-syndrome are insufficiency of yang-qi, essence and blood, while the exogenous causative factors are the pathogenic wind, cold, and dampness. At the initial stage of the disease, excess of the pathogen usually prevails, and the disease tends to be located in the limbs, skin and muscles, and channels and collaterals; at the chronic stage, there is often deficiency of the vital-qi or deficiency and excess intermixed, and the disease tends to be located deeper, in the tendons and bones or in the zang-fu organs.
Abstract: Expert Recommendation (ER) aims to identify domain experts with high expertise and willingness to provide answers to questions in Community Question Answering (CQA) web services. How to model questions and users in the heterogeneous content network is critical to this task. Most traditional methods focus on modeling questions and users based on the textual content left in the community while ignoring the structural properties of heterogeneous CQA networks, and they suffer from textual data sparsity. Recent approaches take advantage of structural proximities between nodes and attempt to fuse the textual content of nodes for modeling. However, they often fail to distinguish the nodes' personalized preferences and consider the textual content of only part of the nodes in network embedding learning, while ignoring the semantic relevance of nodes. In this paper, we propose a novel framework that jointly considers the structural proximity relations and textual semantic relevance to model users and questions more comprehensively. Specifically, we learn topology-based embeddings through a hierarchical attentive network learning strategy, in which the proximity information and the personalized preference of nodes are encoded and preserved. Meanwhile, we use each node's textual content and the text correlation between adjacent nodes to build the content-based embedding through a meta-context-aware skip-gram model. In addition, the user's relative answer quality is incorporated to improve ranking performance. Experimental results show that, by combining deep semantic understanding with structural feature learning, the proposed framework consistently and significantly outperforms state-of-the-art baselines on three real-world datasets. Performance is analyzed in terms of MRR, P@K, and MAP, on which the proposed work surpasses the existing methodologies.
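MRR, P@K, and MAP are reported above without definitions. The sketch below uses their usual textbook forms for a ranked list of candidate experts per question; it is an illustration under that assumption, and the user identifiers and helper names are invented rather than taken from the paper's evaluation code.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is found."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def average_precision(ranked, relevant):
    """Mean of precision values taken at each relevant item's rank."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_over_questions(scores):
    """MRR and MAP are the per-question scores averaged over all questions."""
    return sum(scores) / len(scores)


# Invented example: one question, three ranked candidates, two true experts.
ranked = ["u3", "u1", "u2"]
relevant = {"u1", "u2"}
print(reciprocal_rank(ranked, relevant),
      precision_at_k(ranked, relevant, 2),
      average_precision(ranked, relevant))
```

MRR rewards placing any true expert near the top, while MAP also accounts for how early the remaining experts appear, so the two can rank systems differently.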
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (No. 2020R1G1A1100493).
Abstract: Recently, pre-trained language representation models such as bidirectional encoder representations from transformers (BERT) have performed well in commonsense question answering (CSQA). However, these models do not directly use explicit information from external knowledge sources. To address this, additional methods such as the knowledge-aware graph network (KagNet) and the multi-hop graph relation network (MHGRN) have been proposed. In this study, we propose using a recent pre-trained language model, a lite bidirectional encoder representations from transformers (ALBERT), together with a knowledge graph information extraction technique. We also propose applying a novel method, schema graph expansion, to recent language models. We then analyze the effect of applying knowledge graph-based knowledge extraction techniques to recent pre-trained language models and confirm that schema graph expansion is effective to some extent. Furthermore, we show that our proposed model achieves better performance than the existing KagNet and MHGRN models on the CommonsenseQA dataset.
Abstract: Teachers' questions have been regarded as an important component of foreign language teaching. The present paper presents a brief investigation into teachers' question types and students' answers in primary school English teaching, and tries to draw some implications for primary school English teachers. The video was transcribed and analyzed by the researcher. Based on the findings of the study, some questioning strategies are put forward for future primary English teaching.