Identifying rare patterns for medical diagnosis is a challenging task due to the heterogeneity and volume of the data. Data summarization can create a concise version of the original data that can be used for effective diagnosis. In this paper, we propose an ensemble summarization method that combines clustering and sampling to create a summary of the original data that ensures the inclusion of rare patterns. To the best of our knowledge, no existing technique both augments the performance of anomaly detection techniques and simultaneously increases the efficiency of medical diagnosis. The performance of popular anomaly detection algorithms improves significantly in terms of accuracy and computational complexity when the summaries are used, making medical diagnosis more effective. Our experimental results show that the combination of the proposed summarization scheme with each of the underlying algorithms used in this paper outperforms the most popular anomaly detection techniques.
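As a concrete illustration of the cluster-then-sample idea, the sketch below (our own minimal construction, not the paper's code) clusters the data with k-means and allocates the sampling budget inversely to cluster size, so that small clusters carrying rare patterns keep at least one representative:

```python
# Minimal sketch of clustering + sampling summarization; the allocation
# rule (inverse cluster size) is an illustrative assumption.
import numpy as np
from sklearn.cluster import KMeans

def cluster_sample_summary(X, n_clusters=5, budget=50, seed=0):
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    counts = np.bincount(labels, minlength=n_clusters)
    weights = 1.0 / counts                      # rare clusters weigh more
    quotas = np.maximum(1, np.round(budget * weights / weights.sum()).astype(int))
    picks = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        picks.extend(rng.choice(members, size=min(quotas[c], len(members)), replace=False))
    return X[picks]

# 980 common points plus a 20-point rare group far from the bulk
X = np.vstack([np.random.randn(980, 5), np.random.randn(20, 5) + 8.0])
print(cluster_sample_summary(X).shape)
```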
Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for users to obtain information from long video series. The process of producing an abstract of an entire video that includes its most representative frames is known as static video summarization, and it enables rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and the bisecting K-means clustering algorithm. The proposed method recognizes relevant frames by extracting BRISK keypoints and descriptors from video sequences. The frames' BRISK features are clustered using bisecting K-means, and each keyframe is determined by selecting the frame closest to its cluster center. Rather than requiring clustering parameters to be set in advance, the appropriate number of clusters is determined using the silhouette coefficient. Experiments were carried out on the publicly available Open Video Project (OVP) dataset, which contains videos of different genres. The proposed method's effectiveness is compared to existing methods using a variety of evaluation metrics, and it achieves a favorable trade-off between computational cost and summary quality.
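The pipeline lends itself to a compact sketch with OpenCV and scikit-learn (BisectingKMeans requires scikit-learn 1.1+); pooling per-keypoint descriptors into one frame vector by averaging is our simplification, not necessarily the paper's exact feature construction:

```python
# Sketch of BRISK-feature keyframe selection with silhouette-based model
# selection; frames are assumed to be grayscale uint8 arrays.
import cv2
import numpy as np
from sklearn.cluster import BisectingKMeans
from sklearn.metrics import silhouette_score

def frame_descriptor(gray, brisk=cv2.BRISK_create()):
    _, desc = brisk.detectAndCompute(gray, None)
    # Average the per-keypoint descriptors into one fixed-length vector.
    return desc.mean(axis=0) if desc is not None else np.zeros(64)

def select_keyframes(frames, k_range=range(2, 8)):
    X = np.array([frame_descriptor(f) for f in frames])
    best_k = max(k_range, key=lambda k: silhouette_score(
        X, BisectingKMeans(n_clusters=k, random_state=0).fit_predict(X)))
    km = BisectingKMeans(n_clusters=best_k, random_state=0).fit(X)
    # Keyframe = frame closest to its cluster center.
    return sorted(int(np.argmin(((X - c) ** 2).sum(axis=1)))
                  for c in km.cluster_centers_)

# usage: select_keyframes([gray_frame_0, gray_frame_1, ...])
```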
Video summarization aims to select key frames or key shots to create summaries for fast retrieval, compression, and efficient browsing of videos. Graph neural networks efficiently capture information about graph nodes and their neighbors, but ignore the dynamic dependencies between nodes. To address this challenge, we propose an innovative Adaptive Graph Convolutional Adjacency Matrix Network (TAMGCN), leveraging the attention mechanism to dynamically adjust dependencies between graph nodes. Specifically, we first segment shots and extract features of each frame, then compute the representative features of each shot. Subsequently, we utilize the attention mechanism to dynamically adjust the adjacency matrix of the graph convolutional network to better capture the dynamic dependencies between graph nodes. Finally, we fuse temporal features extracted by a Bi-directional Long Short-Term Memory network with structural features extracted by the graph convolutional network to generate high-quality summaries. Extensive experiments are conducted on two benchmark datasets, TVSum and SumMe, yielding F1-scores of 60.8% and 53.2%, respectively. Experimental results demonstrate that our method outperforms most state-of-the-art video summarization techniques.
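The core mechanism, attention scores rewriting the adjacency matrix before a graph-convolution step, can be sketched in a few lines of PyTorch (dimensions and layer names are illustrative, not the TAMGCN implementation):

```python
# Attention-derived dynamic adjacency feeding one GCN layer over shot features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionAdjacencyGCN(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q, self.k = nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.gc = nn.Linear(dim, dim)

    def forward(self, x):                    # x: (n_shots, dim)
        scores = self.q(x) @ self.k(x).T / x.size(-1) ** 0.5
        adj = F.softmax(scores, dim=-1)      # dynamic dependencies between nodes
        return F.relu(self.gc(adj @ x))      # one graph-convolution step

x = torch.randn(12, 128)                     # 12 shots, 128-d features
print(AttentionAdjacencyGCN(128)(x).shape)   # torch.Size([12, 128])
```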
The rapid expansion of online content and big data has created an urgent need for efficient summarization techniques that allow vast textual documents to be comprehended swiftly without compromising their original integrity. Current approaches in Extractive Text Summarization (ETS) leverage the modeling of inter-sentence relationships, a task of paramount importance in producing coherent summaries. This study introduces an innovative model that integrates Graph Attention Networks (GATs) with Bidirectional Encoder Representations from Transformers (BERT) and Latent Dirichlet Allocation (LDA), further enhanced by Term Frequency-Inverse Document Frequency (TF-IDF) values, to improve sentence selection by capturing comprehensive topical information. Our approach constructs a graph with nodes representing sentences, words, and topics, thereby increasing interconnectivity and enabling a more refined understanding of text structure. The model is extended from Single-Document Summarization to Multi-Document Summarization (MDS), offering significant improvements over existing models such as THGS-GMM and Topic-GraphSum, as demonstrated by empirical evaluations on benchmark news datasets such as Cable News Network (CNN)/Daily Mail (DM) and Multi-News. The results consistently demonstrate superior performance, showcasing the model's robustness in handling complex summarization tasks across single- and multi-document contexts. This research not only advances the integration of BERT and LDA within a GAT framework but also underscores our model's capacity to effectively manage global information and adapt to diverse summarization challenges.
Biography videos, based on the life performances of prominent historical figures, aim to describe the lives of great individuals. In this paper, a novel interactive summarization approach for biography videos based on multimodal fusion is proposed, which visualizes the specific features of a biography video and allows interaction with the video content by taking advantage of multiple modalities. In general, a film's story progresses through character dialogue, and the subtitles produced from that dialogue contain the information relevant to the film. In this paper, JGibbsLDA is applied to extract keywords from subtitles, because a biography video consists of different aspects depicting the subject's whole life. To fuse keywords and key-frames, affinity propagation is adopted to calculate the similarity between each key-frame cluster and the keywords. Through this method, a video summarization based on multimodal fusion is produced that describes the video content more completely. To reduce the time spent searching for video content of interest and to expose the relationships between main characters, a map-style visualization is adopted to present the video content and to interact with the summary. An experiment conducted to evaluate the summarization demonstrates that the system facilitates the exploration of video content while improving interaction and finding events of interest efficiently.
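The fusion step can be approximated as below: affinity propagation groups key-frame features, and each keyword is attached to its most similar cluster center. The shared feature space is a simplifying assumption; the paper operates on real subtitle keywords and frame features rather than these random stand-ins:

```python
# Sketch of keyword/key-frame fusion via affinity propagation.
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics.pairwise import cosine_similarity

frame_feats = np.random.rand(40, 32)      # stand-in key-frame features
keyword_feats = np.random.rand(8, 32)     # stand-in keyword embeddings

ap = AffinityPropagation(random_state=0).fit(frame_feats)
# Keyword -> most similar key-frame cluster center.
assignment = cosine_similarity(keyword_feats, ap.cluster_centers_).argmax(axis=1)
print(assignment)
```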
This paper presents two different algorithms that derive cohesion structure in the form of lexical chains from two language resources, HowNet and TongYiCiCiLin. Research connecting the cohesion structure of a text to the derivation of its summary is presented. A novel model of automatic text summarization is devised, based on the data provided by lexical chains from the original texts. Moreover, the construction rules of lexical chains are modified according to the characteristics of the knowledge database in order to be more suitable for Chinese summarization. Evaluation results show that high-quality indicative summaries are produced from Chinese texts.
In free viewpoint video (FVV) and 3DTV, the depth image-based rendering method has been put forward for rendering virtual-view video based on the multi-view video plus depth (MVD) format. However, projection from a slightly different perspective turns previously covered background regions into hole regions in the rendered video. This paper presents a depth-enhanced image summarization generation model for hole filling that exploits texture fidelity and geometry consistency between the hole and the remaining nearby regions. Texture fidelity and geometry consistency are enforced by drawing texture details and pixel-wise depth information into the energy cost of the similarity measure. The proposed approach offers significant improvement for hole-filled images in virtual viewpoint video: a 0.2 dB PSNR gain, a 0.06 SSIM gain, and enhanced subjective quality.
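A toy version of the energy cost makes the two terms explicit: a texture-fidelity term over color patches plus a weighted geometry-consistency term over the aligned depth patches (the weighting factor is illustrative):

```python
# Depth-enhanced patch similarity energy for ranking hole-filling candidates.
import numpy as np

def patch_energy(tex_target, tex_cand, depth_target, depth_cand, lam=0.5):
    texture_cost = np.sum((tex_target - tex_cand) ** 2)    # texture fidelity
    depth_cost = np.sum((depth_target - depth_cand) ** 2)  # geometry consistency
    return texture_cost + lam * depth_cost

t, c = np.random.rand(2, 7, 7, 3)   # 7x7 RGB patches
dt, dc = np.random.rand(2, 7, 7)    # aligned depth patches
print(patch_energy(t, c, dt, dc))
```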
Automatic text summarization involves reducing a text document or a larger corpus of multiple documents to a short set of sentences or paragraphs that convey the main meaning of the text. In this paper, we discuss multi-document summarization, which differs from single-document summarization in that compression, speed, redundancy, and passage selection are critical to forming useful summaries. Since the number and variety of online medical news items make it impractical for medical experts to read them all, automatic multi-document summarization can aid the study of information on the web. We therefore propose a new approach based on the machine learning meta-learning algorithm AdaBoost for summarization. We treat a document as a set of sentences, and the learning algorithm must learn to classify sentences as positive or negative examples based on their scores. For this learning task, we apply the AdaBoost meta-learning algorithm with a C4.5 decision tree as the base learner. In our experiment, we use 450 news items downloaded from different medical websites, and we compare our results with several existing approaches.
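The learning setup maps directly onto scikit-learn, with an entropy-criterion decision tree standing in for C4.5 (the features and labels below are random stand-ins for real sentence scores):

```python
# AdaBoost over a decision-tree base learner for sentence classification.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X = np.random.rand(400, 6)                   # stand-in sentence feature vectors
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)    # stand-in summary/non-summary labels

clf = AdaBoostClassifier(
    DecisionTreeClassifier(criterion="entropy", max_depth=3),  # C4.5 analogue
    n_estimators=50, random_state=0).fit(X, y)
print(clf.score(X, y))
```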
Nowadays, people use online resources such as educational videos and courses. However, such videos and courses are mostly long, so summarizing them is valuable. The video contents (visual, audio, and subtitles) can be analyzed to generate textual summaries, i.e., notes. Since videos' subtitles contain significant information, summarizing subtitles is an effective way to concentrate on the necessary details. Most existing studies have used Term Frequency-Inverse Document Frequency (TF-IDF) and Latent Semantic Analysis (LSA) models to create lecture summaries. This study takes another approach and applies Latent Dirichlet Allocation (LDA), which has proved its effectiveness in document summarization. Specifically, the proposed LDA summarization model follows three phases. The first phase prepares the subtitle file for modelling by performing preprocessing steps such as removing stop words. In the second phase, the LDA model is trained on the subtitles to generate the keyword list used to extract important sentences. In the third phase, a summary is generated based on the keyword list. The summaries generated by LDA were lengthy; thus, a length enhancement method is proposed. For the evaluation, the authors developed manual summaries of the existing "EDUVSUM" educational videos dataset and compared the generated summaries with the manually generated outlines using two methods: (i) Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and (ii) human evaluation. The LDA-based summaries outperform those generated by TF-IDF and LSA. Besides reducing the summaries' length, the proposed length enhancement method also improved the summaries' precision rates. Other domains, such as news videos, can apply the proposed method for video summarization.
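The second and third phases reduce to a short scikit-learn sketch: fit LDA on the preprocessed subtitle sentences, harvest top topic words as keywords, and score sentences by keyword overlap (the sentences and cut-offs are illustrative):

```python
# LDA keyword extraction and keyword-overlap sentence scoring.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

sentences = ["the model learns word topics", "topics summarize the lecture",
             "students review the lecture notes", "notes cover word topics"]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(sentences)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

vocab = vec.get_feature_names_out()
keywords = {vocab[i] for comp in lda.components_ for i in comp.argsort()[-3:]}
scores = [sum(w in keywords for w in s.split()) for s in sentences]
print([s for _, s in sorted(zip(scores, sentences), reverse=True)[:2]])
```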
Text summarization is the process of automatically creating a compressed version of a given document that preserves its information content. There are two types of summarization: extractive and abstractive. Extractive summarization methods simplify the problem of summarization into the problem of selecting a representative subset of the sentences in the original documents. Abstractive summarization may compose novel sentences, unseen in the original sources. In our study we focus on sentence-based extractive document summarization. Extractive summarization systems are typically based on techniques for sentence extraction and aim to cover the set of sentences that are most important for the overall understanding of a given document. In this paper, we propose an unsupervised document summarization method that creates the summary by clustering and extracting sentences from the original document. For this purpose, new criterion functions for sentence clustering are proposed. Similarity measures play an increasingly important role in document clustering. We have also developed a discrete differential evolution algorithm to optimize the criterion functions. The experimental results show that our suggested approach improves performance compared to state-of-the-art summarization approaches.
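The cluster-then-extract skeleton looks as follows; note that the paper optimizes bespoke criterion functions with a discrete differential evolution algorithm, for which plain k-means is substituted here for brevity:

```python
# Cluster sentences over TF-IDF vectors; keep the sentence nearest each centroid.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def cluster_summary(sentences, n_clusters=2):
    X = TfidfVectorizer().fit_transform(sentences).toarray()
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    picks = {int(np.argmin(((X - c) ** 2).sum(axis=1))) for c in km.cluster_centers_}
    return [sentences[i] for i in sorted(picks)]

print(cluster_summary(["cats purr softly", "dogs bark loudly",
                       "cats nap all day", "dogs chase the ball"]))
```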
Due to advanced developments in the Internet and information technologies, the quantity of electronic data in the biomedical sector has increased exponentially. To handle this huge amount of biomedical data, automated multi-document biomedical text summarization is an effective and robust approach for accessing the growing body of technical and medical literature: multiple source documents are summarized while the most informative content is retained. Multi-document biomedical text summarization thus plays a vital role in alleviating the difficulty of accessing precise and up-to-date information. This paper presents a Deep Learning based Attention Long Short Term Memory (DL-ALSTM) model for multi-document biomedical text summarization. The proposed DL-ALSTM model first performs data preprocessing to convert the available medical data into a format compatible with further processing. The DL-ALSTM model is then executed to summarize the contents of multiple biomedical documents. To tune the summarization performance of the DL-ALSTM model, a chaotic glowworm swarm optimization (CGSO) algorithm is employed. Extensive experimental analysis is performed to validate the DL-ALSTM model, and the results are investigated using the PubMed dataset. A comprehensive comparative analysis showcases the efficiency of the proposed DL-ALSTM model against recently presented models.
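An attention-pooled bidirectional LSTM in the spirit of DL-ALSTM can be sketched in PyTorch as below; hyperparameters are illustrative and the CGSO tuning step is omitted:

```python
# Bi-LSTM with attention pooling producing a salience score per input.
import torch
import torch.nn as nn

class AttnBiLSTM(nn.Module):
    def __init__(self, vocab=5000, emb=64, hid=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hid, 1)
        self.out = nn.Linear(2 * hid, 1)

    def forward(self, tokens):                   # tokens: (batch, seq)
        h, _ = self.lstm(self.emb(tokens))       # (batch, seq, 2*hid)
        w = torch.softmax(self.attn(h), dim=1)   # attention over time steps
        ctx = (w * h).sum(dim=1)                 # attention-pooled context
        return torch.sigmoid(self.out(ctx))      # salience score in [0, 1]

print(AttnBiLSTM()(torch.randint(0, 5000, (4, 30))).shape)  # torch.Size([4, 1])
```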
Automatic Chinese text summarization for dialogue style is a relatively new research area. In this paper, Latent Semantic Analysis (LSA) is first used to extract semantic knowledge from a given document; all question paragraphs are identified; an automatic text segmentation approach analogous to TextTiling is exploited to improve the precision of correlating question paragraphs with answer paragraphs; and finally some "important" sentences are extracted from the generic content and the question-answer pairs to generate a complete summary. Experimental results showed that our approach is highly efficient and significantly improves the coherence of the summary without compromising informativeness.
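The LSA step corresponds to a truncated SVD over TF-IDF sentence vectors; a minimal sketch (English stand-in sentences, since the paper works on Chinese text) ranks sentences by their strength in the latent topic space:

```python
# LSA sentence salience via truncated SVD of the TF-IDF matrix.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = ["what causes the error", "the error comes from bad input",
             "logs show the failing module", "restart clears the state"]
X = TfidfVectorizer().fit_transform(sentences)
topics = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)
salience = np.abs(topics).max(axis=1)     # strength in any latent topic
print([sentences[i] for i in salience.argsort()[::-1][:2]])
```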
News feeds are a major information source, providing updates on various topics across different domains. These updates need to be collected, since users interested in a specific domain need its important updates organized from multiple sources. In this paper, a news summarization system is proposed for news data streams from RSS feeds and Google News. Since news stream analysis requires live content, news data are continuously collected for our experimentation. The major contributions of this work are domain-corpus-based news collection, news content extraction, hierarchical clustering of the news, and summarization of the news. Many existing news summarization systems fail to provide dynamic content with domain-wise representation. Our proposed system alleviates this by tagging the news feed with domain corpora and organizing the news streams into a hierarchical, topic-wise structure. Further, the news streams are summarized for users with a novel summarization algorithm. The proposed system generates topic-wise summaries effectively, and no system in the literature has handled news summarization by collecting data dynamically and organizing the content hierarchically. The proposed system is compared with existing systems and achieves better results in generating news summaries. Online news content editors benefit greatly from this system by instantly obtaining news summaries for their domains of interest.
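The hierarchical organization step can be sketched with SciPy's agglomerative clustering over TF-IDF features of the collected items (toy headlines stand in for real feed content):

```python
# Hierarchical clustering of news items, cut into topic groups.
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.feature_extraction.text import TfidfVectorizer

news = ["central bank raises rates", "rates climb after bank meeting",
        "team wins the league final", "final ends with late goal"]
X = TfidfVectorizer().fit_transform(news).toarray()
Z = linkage(X, method="average", metric="cosine")
topics = fcluster(Z, t=2, criterion="maxclust")   # cut the tree into 2 topics
print(list(zip(topics, news)))
```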
Given the increasing volume of text documents, automatic summarization is an important tool for the quick and optimal utilization of such sources. Automatic summarization is a text compression process that produces a shorter document giving quick access to the important goals and main features of the input document. In this study, a novel method is introduced for selective text summarization using a genetic algorithm and the generation of repetitive patterns. A distinguishing feature of the proposed summarizer, compared with previous methods, is that it identifies and extracts relationships between the main features of the input text and creates repetitive patterns in order to produce and optimize the document feature vector used in constructing the summary. The study attempts to encompass all the main qualities of a summary, including unambiguity, high precision, continuity, and consistency. To investigate the efficiency of the proposed algorithm, the results were evaluated with respect to precision and recall criteria. The evaluation showed optimization of the feature dimensions and generation of a sequence of summary sentences most consistent with the main goals and features of the input document.
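A toy genetic algorithm over binary sentence-selection masks illustrates the optimization loop; the fitness function and operators below are illustrative stand-ins, not the paper's repetitive-pattern criteria:

```python
# GA over binary masks: selection, one-point crossover, bit-flip mutation.
import numpy as np

rng = np.random.default_rng(0)
n_sent, budget, pop_size = 12, 4, 30
relevance = rng.random(n_sent)                  # stand-in sentence scores

def fitness(mask):
    if mask.sum() == 0 or mask.sum() > budget:  # enforce the length budget
        return -1.0
    return relevance[mask.astype(bool)].sum()

pop = rng.integers(0, 2, (pop_size, n_sent))
for _ in range(100):
    scores = np.array([fitness(m) for m in pop])
    parents = pop[scores.argsort()[-pop_size // 2:]]       # keep the fittest half
    cut = rng.integers(1, n_sent)
    kids = np.vstack([np.concatenate([a[:cut], b[cut:]])   # one-point crossover
                      for a, b in zip(parents, parents[::-1])])
    flip = rng.random(kids.shape) < 0.05                   # bit-flip mutation
    pop = np.vstack([parents, np.where(flip, 1 - kids, kids)])

best = pop[np.argmax([fitness(m) for m in pop])]
print(np.flatnonzero(best))                     # indices of selected sentences
```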
In recent years, many text summarization models based on pretraining methods have achieved very good results. However, in these models, semantic deviations easily occur between the original input representation and the representation produced by the multi-layer encoder, which may result in inconsistencies between the generated summary and the source text. Bidirectional Encoder Representations from Transformers (BERT) improves the performance of many tasks in Natural Language Processing (NLP), but although BERT has a strong capability to encode context, it lacks fine-grained semantic representation. To solve these two problems, we propose a semantic supervision method based on Capsule Networks. First, we extract the fine-grained semantic representations of both the input and the encoded result in BERT using a Capsule Network. Second, we use the fine-grained semantic representation of the input to supervise the fine-grained semantic representation of the encoded result. We evaluated our model on a popular Chinese social media dataset (LCSTS), and the results showed that our model achieved higher ROUGE scores (including R-1 and R-2) and outperformed baseline systems. Finally, a comparative study of model stability showed that our model is more stable.
In the era of Big Data, we face an inevitable and challenging problem of information overload. To alleviate this problem, it is important to use effective automatic text summarization techniques to obtain key information quickly and efficiently from huge amounts of text. In this paper, we propose a hybrid method of extractive text summarization based on deep learning and graph ranking algorithms (ETSDG). In this method, a pre-trained deep learning model is designed to yield useful sentence embeddings. Given the associations between sentences in the raw documents, a traditional LexRank algorithm with fine-tuning is adopted in ETSDG, integrating the graph ranking step with deep learning to improve the performance of extractive text summarization. Testing results on the DUC2004 dataset show that ETSDG performs better in ROUGE metrics than certain benchmark methods.
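The graph-ranking half of ETSDG can be sketched as LexRank-style power iteration over a cosine-similarity graph of sentence embeddings (random vectors stand in for the pre-trained model's output):

```python
# LexRank-style ranking over an embedding cosine-similarity graph.
import numpy as np

def lexrank(emb, damping=0.85, iters=50):
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    sim = (emb @ emb.T) / (norms @ norms.T)      # cosine similarity graph
    np.fill_diagonal(sim, 0.0)
    P = sim / sim.sum(axis=1, keepdims=True)     # row-stochastic transition
    r = np.full(len(emb), 1.0 / len(emb))
    for _ in range(iters):
        r = (1 - damping) / len(emb) + damping * (P.T @ r)
    return r

emb = np.random.rand(6, 384)                     # stand-in sentence embeddings
print(lexrank(emb).argsort()[::-1])              # sentence indices by salience
```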
With the remarkable growth of textual data sources in recent years, easy, fast, and accurate text processing has become a challenge with significant payoffs. Automatic text summarization is the process of compressing text documents into shorter summaries for easier review of their core contents, which must be done without losing important features and information. This paper introduces a new hybrid method for extractive text summarization with feature selection based on text structure. The major advantage of the proposed method over previous systems is its modeling of text structure and of the relationships between entities in the input text, which improves the sentence feature selection process and leads to the generation of unambiguous, concise, consistent, and coherent summaries. The paper also presents an evaluation of the proposed method based on precision and recall criteria, showing that the method produces summaries consisting of chains of sentences with the aforementioned characteristics from the original text.
Automation is now considered vital in most fields, since computing methods play a significant role in facilitating work such as automatic text summarization. However, most of the computing methods used in real systems are based on graph models, which are characterized by their simplicity and stability. This paper therefore proposes an improved extractive text summarization algorithm based on both topic and graph models. The methodology consists of two stages. First, the well-known TextRank algorithm is analyzed and its shortcomings are investigated. Then, an improved method is proposed with a new computational model of sentence weights. Experiments were carried out on the standard DUC2004 and DUC2006 datasets, comparing the proposed algorithm, TG-SMR (Topic Graph-Summarizer), to four other text summarization systems. The experimental results prove that TG-SMR achieves higher ROUGE scores. It is foreseen that the TG-SMR algorithm will open new avenues for improving performance on ROUGE evaluation indicators.
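For reference, the plain TextRank baseline that the first stage analyzes can be written in a few lines with networkx; TG-SMR's contribution is replacing the word-overlap edge weights below with its topic-aware sentence-weight model:

```python
# Plain TextRank: word-overlap similarity graph scored with PageRank.
import itertools
import networkx as nx

def textrank(sentences, top_k=2):
    g = nx.Graph()
    g.add_nodes_from(range(len(sentences)))
    words = [set(s.lower().split()) for s in sentences]
    for i, j in itertools.combinations(range(len(sentences)), 2):
        overlap = len(words[i] & words[j])
        if overlap:
            g.add_edge(i, j, weight=overlap)
    scores = nx.pagerank(g, weight="weight")
    return [sentences[i] for i in sorted(scores, key=scores.get, reverse=True)[:top_k]]

print(textrank(["graphs rank sentences well", "topic models weight sentences",
                "graphs and topic models combine", "the cat sat on a mat"]))
```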
This paper reports part of a study to develop a method for automatic multi-document summarization. The current focus is on dissertation abstracts in the field of sociology. The summarization method uses macro-level and micro-level discourse structure to identify important information that can be extracted from dissertation abstracts, and then uses a variable-based framework to integrate and organize extracted information across dissertation abstracts. This framework focuses on research concepts and their relationships as found in sociology dissertation abstracts and has a hierarchical structure. A taxonomy is constructed to support the summarization process in two ways: (1) helping to identify important concepts and relations expressed in the text, and (2) providing a structure for linking similar concepts in different abstracts. This paper describes the variable-based framework and the summarization process, and then reports the construction of the taxonomy for supporting the summarization process. An example is provided to show how to use the constructed taxonomy to identify important concepts and integrate the concepts extracted from different abstracts.
We present a novel unsupervised integrated-score framework that generates generic extractive multi-document summaries by ranking sentences based on a dynamic programming (DP) strategy. Whereas cluster-based methods proposed by other researchers tend to ignore the informativeness of words when generating summaries, our framework comprehensively takes the relevance, diversity, informativeness, and length constraint of sentences into consideration. We apply Density Peaks Clustering (DPC) to obtain relevance scores and diversity scores of sentences simultaneously. Our framework produces the best performance on DUC2004, with ROUGE-1, ROUGE-2, and ROUGE-SU4 scores of 0.396, 0.094, and 0.143 respectively, outperforming a series of popular baselines such as DUC Best, FGB [7], and BSTM [10].
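The two DPC quantities map onto sentence scoring directly: local density rho acts as the relevance signal and the distance delta to the nearest denser point as the diversity signal. A minimal sketch over stand-in sentence vectors:

```python
# Density Peaks scores: rho (local density) and delta (distance to a denser point).
import numpy as np

def dpc_scores(X, cutoff=1.0):
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    rho = (d < cutoff).sum(axis=1) - 1           # local density -> relevance
    delta = np.empty(len(X))
    for i in range(len(X)):
        denser = np.flatnonzero(rho > rho[i])
        delta[i] = d[i, denser].min() if len(denser) else d[i].max()
    return rho, delta                            # delta -> diversity

X = np.random.rand(20, 8)                        # stand-in sentence vectors
rho, delta = dpc_scores(X)
print(rho.argmax(), delta.argmax())
```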