Abstract: Creating practice questions for programming learning is not easy. It requires the instructor to diligently organize heterogeneous learning resources, that is, conceptual programming knowledge and procedural programming rules. Today's programming question generation (PQG) still relies largely on this demanding creation task being performed by instructors without advanced technological support. In this work, we propose a semantic PQG model that aims to help the instructor generate new programming questions and expand the pool of assessment items. The PQG model transforms conceptual and procedural programming knowledge from textbooks into a semantic network using a Local Knowledge Graph (LKG) and Abstract Syntax Trees (AST). For any given question, the model queries the established network to find related code examples and generates a set of questions from the associated LKG/AST semantic structures. We conduct an analysis comparing instructor-made questions from 9 undergraduate introductory programming courses with textbook questions. The results show that the instructor-made questions were of much lower complexity than the textbook ones. The disparity in topic distribution motivated us to further study the breadth and depth of question quality and to investigate the complexity of the questions in relation to student performance. Finally, we report the results of a user study that examines the quality of the questions generated by the proposed Artificial Intelligence-infused semantic PQG model.
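The abstract gives no implementation details; the following is a minimal, hypothetical sketch of the general LKG/AST idea it describes, with all concept names, graph edges, and question templates invented for illustration: a code example is parsed into an abstract syntax tree, its syntax nodes are mapped to concept labels in a small local knowledge graph, and question variants are instantiated from those concepts.

```python
import ast

# A tiny "local knowledge graph": concept -> related concepts (illustrative edges only).
LKG = {
    "for loop": ["iteration", "range"],
    "if statement": ["condition", "boolean expression"],
    "assignment": ["variable", "expression"],
}

# Map AST node types to the conceptual label they exercise.
NODE_TO_CONCEPT = {ast.For: "for loop", ast.If: "if statement", ast.Assign: "assignment"}

def concepts_in(code: str) -> set:
    """Walk the AST of a code example and collect the programming concepts it uses."""
    tree = ast.parse(code)
    return {NODE_TO_CONCEPT[type(node)] for node in ast.walk(tree) if type(node) in NODE_TO_CONCEPT}

def generate_questions(code: str) -> list:
    """Instantiate simple question variants grounded in the snippet's concepts."""
    questions = [f"What is printed when the following code runs?\n{code}"]
    for concept in sorted(concepts_in(code)):
        related = ", ".join(LKG.get(concept, []))
        questions.append(
            f"This snippet illustrates a {concept}. Which related concepts ({related}) does it also rely on?\n{code}"
        )
    return questions

example = "total = 0\nfor i in range(3):\n    total = total + i\nprint(total)"
for question in generate_questions(example):
    print(question)
    print("---")
```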
Funding: This work is supported by the National Natural Science Foundation (No. 61972436).
Abstract: As the dual task of question answering, question generation (QG) is a significant and challenging task that aims to generate valid and fluent questions from a given paragraph. The QG task is of great value to question answering systems, conversational systems, and machine reading comprehension systems. Recent sequence-to-sequence neural models have achieved outstanding performance on English and Chinese QG tasks. However, Tibetan QG has rarely been addressed, and the key factor impeding its development is the lack of a public Tibetan QG dataset. Facing this challenge, this paper first collects 425 articles from the Tibetan Wikipedia website and constructs 7,234 question–answer pairs through crowdsourcing. We then propose a Tibetan QG model based on the sequence-to-sequence framework to generate Tibetan questions from given paragraphs. To generate answer-aware questions, we introduce an attention mechanism that can capture the key semantic information related to the answer. Meanwhile, we adopt a copy mechanism that copies words from the paragraph to avoid generating unknown or rare words in the question. Finally, experiments show that our model achieves higher performance than baseline models. We also further explore the attention and copy mechanisms and demonstrate their effectiveness through experiments.
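For readers unfamiliar with the two mechanisms mentioned above, the sketch below (an assumed, simplified implementation, not the authors' code) shows how an attention step over encoder states can be combined with a pointer-generator-style copy gate so that source words can be copied directly into the generated question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionCopyDecoderStep(nn.Module):
    """One decoding step: attend over encoder states, then mix generate/copy distributions."""

    def __init__(self, hidden: int, vocab: int):
        super().__init__()
        self.attn = nn.Linear(2 * hidden, 1)        # scores the pair [decoder state; encoder state]
        self.gen = nn.Linear(2 * hidden, vocab)     # generation (vocabulary) distribution
        self.copy_gate = nn.Linear(2 * hidden, 1)   # probability of copying from the source

    def forward(self, dec_state, enc_states, src_ids):
        # dec_state: (B, H); enc_states: (B, T, H); src_ids: (B, T) source token ids.
        B, T, H = enc_states.shape
        expanded = dec_state.unsqueeze(1).expand(B, T, H)
        scores = self.attn(torch.cat([expanded, enc_states], dim=-1)).squeeze(-1)   # (B, T)
        attn = F.softmax(scores, dim=-1)
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)               # (B, H)

        features = torch.cat([dec_state, context], dim=-1)
        p_vocab = F.softmax(self.gen(features), dim=-1)                             # (B, V)
        p_copy = torch.sigmoid(self.copy_gate(features))                            # (B, 1)

        # The copy probability mass is scattered back onto the source tokens' vocabulary ids.
        mixed = (1 - p_copy) * p_vocab
        mixed = mixed.scatter_add(1, src_ids, p_copy * attn)
        return mixed                                                                 # (B, V), rows sum to 1

# Toy usage with random tensors.
step = AttentionCopyDecoderStep(hidden=8, vocab=20)
dist = step(torch.randn(2, 8), torch.randn(2, 5, 8), torch.randint(0, 20, (2, 5)))
print(dist.shape, dist.sum(dim=-1))
```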
Funding: This research was partially supported by the National Key R&D Program of China (No. 2021YFF0901003).
Abstract: Question Generation (QG) is the task of generating questions according to given contexts. Most existing methods are based on Recurrent Neural Networks (RNNs) and take passage-level input to provide more detail, which makes them suffer severely from problems such as vanishing gradients and ineffective information utilization. In fact, reasonably extracting useful information from a given context is more in line with actual needs during questioning, especially in educational scenarios. To that end, in this paper we propose a novel Hierarchical Answer-Aware and Context-Aware Network (HACAN) that constructs a high-quality passage representation and judges the balance between individual sentences and the whole passage. Specifically, a Hierarchical Passage Encoder (HPE) is proposed to construct an answer-aware and context-aware passage representation with a multi-hop reasoning strategy. Then, drawing inspiration from the actual human questioning process, we design a Hierarchical Passage-aware Decoder (HPD) that determines when to utilize the passage information. We conduct extensive experiments on the SQuAD dataset, where the results verify the effectiveness of our model in comparison with several baselines.
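A hierarchical passage encoder of the kind described above can be sketched roughly as follows. This is an illustrative simplification (module names, dimensions, and the answer-tag feature are assumptions, and the multi-hop reasoning is omitted), not the HACAN implementation: a word-level GRU encodes each sentence, and a sentence-level GRU encodes the resulting sequence of sentence vectors.

```python
import torch
import torch.nn as nn

class HierarchicalPassageEncoder(nn.Module):
    """Word-level GRU per sentence, then a sentence-level GRU over sentence vectors."""

    def __init__(self, vocab: int, emb: int = 32, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.word_gru = nn.GRU(emb + 1, hidden, batch_first=True)  # +1 for the answer-tag feature
        self.sent_gru = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, token_ids, answer_tags):
        # token_ids, answer_tags: (num_sentences, max_words); tags are 1 inside the answer span.
        words = torch.cat([self.embed(token_ids), answer_tags.unsqueeze(-1).float()], dim=-1)
        word_states, sent_vecs = self.word_gru(words)  # sent_vecs: (1, num_sentences, hidden)
        # Reuse each sentence's final word-GRU state as a sequence of sentence vectors
        # (a batch of one passage) for the sentence-level GRU.
        passage_states, _ = self.sent_gru(sent_vecs)
        return word_states, passage_states

# Toy usage: a 3-sentence passage with 6 tokens per sentence; the answer span sits in sentence 2.
encoder = HierarchicalPassageEncoder(vocab=100)
tokens = torch.randint(0, 100, (3, 6))
tags = torch.zeros(3, 6, dtype=torch.long)
tags[1, 2:4] = 1
word_states, passage_states = encoder(tokens, tags)
print(word_states.shape, passage_states.shape)  # (3, 6, 64) and (1, 3, 64)
```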
Abstract: A math word problem uses a real-world story to present basic arithmetic operations through textual narration. It is used to develop students' comprehension skills alongside the ability to produce a solution that agrees with the story given in the problem. To master math word problem solving, students need a fresh and plentiful supply of problems, which ordinary textbooks and teachers often fail to provide. To fill this gap, a few research works have proposed techniques to automatically generate math word problems and equations, mainly for the English-speaking community. Amharic is a Semitic language spoken by more than one hundred million Ethiopians and is the language of instruction in elementary schools in Ethiopia, yet it remains a less-resourced language in linguistics and natural language processing (NLP). Hence, this paper proposes a strategy for the automatic generation of Amharic Math Word (AMW) problems and equations, a first attempt to introduce a template-based shallow NLP approach to generating math word problems for the Amharic language, as a step toward enabling comprehension and problem-solving learning in mathematics for primary school students. The proposed technique accepts a sample AMW problem as user input to form a template. A template provides the AMW problem with placeholders, the type of problem, and an equation template, and it is used as a pattern to generate semantically equivalent AMW problems with their equations. To validate the proposed approach, a prototype was developed and used as a testing platform. Experimental results show 93.84% overall efficiency on the core task of forming templates from a corpus of AMW problems collected from elementary school mathematics textbooks and other school worksheets. Human judges also found the generated AMW problems and equations to be as solvable as the textbook problems.
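As an illustration of the template idea (using English placeholders rather than Amharic, and invented slot names), the sketch below turns one seed problem into a template with placeholders and an equation pattern, then re-instantiates it to produce semantically equivalent problem–equation pairs.

```python
import random

# A seed problem already converted into a template: placeholders, problem type, equation pattern.
TEMPLATE = {
    "problem": "{name} has {a} {item}. {name} buys {b} more {item}. How many {item} does {name} have now?",
    "equation": "x = {a} + {b}",
    "type": "addition",
}

NAMES = ["Abebe", "Sara", "Kebede"]
ITEMS = ["oranges", "books", "pencils"]

def instantiate(template: dict) -> dict:
    """Fill the placeholders to produce one new problem/equation pair."""
    slots = {
        "name": random.choice(NAMES),
        "item": random.choice(ITEMS),
        "a": random.randint(2, 20),
        "b": random.randint(2, 20),
    }
    return {
        "problem": template["problem"].format(**slots),
        "equation": template["equation"].format(**slots),
        "answer": slots["a"] + slots["b"],
    }

for _ in range(2):
    generated = instantiate(TEMPLATE)
    print(generated["problem"], "=>", generated["equation"], "answer:", generated["answer"])
```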
Funding: This work was supported by the National Natural Science Foundation of China (No. 62166050), Yunnan Fundamental Research Projects (No. 202201AS070021), the Yunnan Innovation Team of Education Informatization for Nationalities, the Scientific Technology Innovation Team of Educational Big Data Application Technology in University of Yunnan Province, and the Yunnan Normal University Graduate Research and Innovation Fund in 2020 (No. ysdyjs2020006).
Abstract: Question Generation (QG) is the task of using Artificial Intelligence (AI) technology to generate questions that can be answered by a span of text within a given passage. Existing research on QG in the educational field struggles with two challenges: mainstream sequence-to-sequence QG models fail to utilize the structured information in the passage, and specialized educational QG datasets are lacking. To address these challenges, a specialized QG dataset, the reading comprehension dataset from examinations for QG (named RACE4QG), is reconstructed by applying a new answer-tagging approach and a data-filtering strategy to the RACE dataset. Furthermore, an end-to-end QG model that can exploit intra- and inter-sentence information to generate better questions is proposed. In our model, the encoder uses a Gated Recurrent Units (GRU) network that takes the concatenation of word embeddings, answer tags, and Graph Attention Network (GAT) embeddings as input. The hidden states of the GRU are processed with a gated self-attention to obtain the final passage-answer representation, which is fed to the decoder. Results show that our model outperforms the baselines on automatic metrics and in human evaluation. Specifically, the model improves over the baseline by 0.44, 1.32, and 1.34 points on the BLEU-4, ROUGE-L, and METEOR metrics, respectively, indicating its effectiveness and reliability. The remaining gap with human expectations also reflects the potential for further research.
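The encoder described above can be approximated by the following simplified sketch; the GAT embeddings are stubbed with random vectors and all dimensions are assumptions, so this illustrates the concatenation-plus-gated-self-attention idea rather than the authors' exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedSelfAttentionEncoder(nn.Module):
    """Concatenate word, answer-tag, and graph embeddings; encode with a GRU; refine with gated self-attention."""

    def __init__(self, vocab: int, word_dim: int = 64, tag_dim: int = 4, graph_dim: int = 16, hidden: int = 64):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, word_dim)
        self.tag_emb = nn.Embedding(3, tag_dim)                      # B/I/O answer tags
        self.gru = nn.GRU(word_dim + tag_dim + graph_dim, hidden, batch_first=True)
        self.gate = nn.Linear(2 * hidden, 2 * hidden)
        self.proj = nn.Linear(2 * hidden, hidden)

    def forward(self, tokens, tags, graph_emb):
        x = torch.cat([self.word_emb(tokens), self.tag_emb(tags), graph_emb], dim=-1)
        h, _ = self.gru(x)                                           # (B, T, H)
        # Self-attention: every position attends over the whole passage.
        scores = torch.bmm(h, h.transpose(1, 2)) / h.size(-1) ** 0.5
        context = torch.bmm(F.softmax(scores, dim=-1), h)            # (B, T, H)
        fused = torch.cat([h, context], dim=-1)
        gated = torch.sigmoid(self.gate(fused)) * fused              # gated fusion of states and context
        return self.proj(gated)                                      # passage-answer representation for the decoder

# Toy usage; the GAT embedding is replaced by random vectors here.
encoder = GatedSelfAttentionEncoder(vocab=500)
output = encoder(torch.randint(0, 500, (2, 10)), torch.randint(0, 3, (2, 10)), torch.randn(2, 10, 16))
print(output.shape)  # (2, 10, 64)
```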