Abstract: TextRank is a popular tool for extracting the important words or phrases needed by many natural language processing (NLP) tasks. This paper presents a practical, domain-specific TextRank approach based on Field Association (FA) words. We present a keyphrase extraction technique not for a single document but for a particular domain. The first step builds a specific domain field. The second step collects a list of ideal FA terms and compound FA terms from that domain, which are treated as candidate keyphrases. We then combine two-word node weights and field tree relationships into a new approach for generating keyphrases from a particular domain. Experiments with the modified approach show that techniques that include FA terms outperform those that use ordinary words, with precision reaching 90%.
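As a rough illustration of the baseline that the abstract builds on, the following is a minimal sketch of plain TextRank: PageRank run over a word co-occurrence graph. It does not implement the FA-word or field-tree weighting described above. The function name `textrank_keywords` and the parameters `window`, `damping`, `iterations`, and `top_k` are assumptions made for this example.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank tokens with plain TextRank (PageRank on a co-occurrence graph).

    Illustrative sketch only: the FA-term and field-tree weights from the
    abstract are NOT modeled here.
    """
    # Link every pair of distinct words that occur within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                neighbors[word].add(tokens[j])
                neighbors[tokens[j]].add(word)

    # Iterate the PageRank update; each word shares its score equally
    # among its neighbors.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest-scoring words first.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    text = "field association words rank field terms and compound field terms"
    print(textrank_keywords(text.split(), window=2, top_k=3))
```

A domain-specific variant along the lines of the abstract would presumably replace the uniform edge weights with weights derived from FA terms; how that is done is not specified in the text above.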
Abstract: Text summarization is an essential area in text mining, which has procedures for text extraction. In natural language processing, text summarization maps documents to a representative set of descriptive words. The objective of text extraction is therefore to obtain reduced, expressive content from text documents. Text summarization has two main areas: abstractive and extractive summarization. Extractive summarization in turn has two approaches: the first applies a sentence score algorithm, and the second follows word embedding principles. All such text extractions have limitations in capturing the basic theme of the underlying documents. In this paper, we perform text summarization using TF-IDF with PageRank keywords, a sentence score algorithm, and Word2Vec word embedding. The study compares these forms of summarization with the actual text by calculating cosine similarities. Furthermore, TF-IDF-based PageRank keywords are extracted from the other two extractive summarizations. An intersection over these three sets of TF-IDF keywords is then performed to generate a more representative set of keywords for each text document. This technique generates variable-length keyword sets according to document diversity instead of selecting a fixed number of keywords for each document. This form of abstractive summarization improves metadata similarity to the original text compared with all other forms of summarized text. It also solves the problem of deciding the number of representative keywords for a specific text document. To evaluate the technique, the study used a sample of more than eighteen hundred text documents. The abstractive summarization follows the principles of deep learning to create uniform similarity of the extracted words with the actual text and with all other forms of text summarization. The proposed technique provides a stable measure of similarity compared with existing forms of text summarization.
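The comparison and intersection steps mentioned in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the helper names `tfidf_vectors`, `cosine`, and `top_terms` are hypothetical, the smoothed IDF formula is one common choice, and the PageRank, sentence scoring, and Word2Vec stages are not modeled.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes term frequency normalized by document length and a smoothed
    IDF, ln((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_terms(vector, k):
    """Set of the k highest-weighted terms in a vector."""
    return set(sorted(vector, key=vector.get, reverse=True)[:k])


if __name__ == "__main__":
    # Toy stand-ins for the original text and three summaries of it.
    original = "text summarization maps documents to descriptive keywords".split()
    summary_a = "summarization maps documents to keywords".split()
    summary_b = "text summarization keywords".split()
    summary_c = "documents summarization descriptive keywords".split()

    vec_orig, vec_a, vec_b, vec_c = tfidf_vectors(
        [original, summary_a, summary_b, summary_c]
    )
    print("similarity to original:", cosine(vec_orig, vec_a))

    # Intersecting the per-summary keyword sets yields a result whose size
    # varies from document to document rather than being fixed in advance.
    shared = top_terms(vec_a, 4) & top_terms(vec_b, 4) & top_terms(vec_c, 4)
    print("shared keywords:", shared)
```

Because the final set is an intersection, its size depends on how much the three summaries agree, which is one way to read the abstract's point about variable-length keyword sets.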
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2022R1A2C1007434) and by the BK21 FOUR Program of the NRF of Korea funded by the Ministry of Education (NRF5199991014091).
Abstract: In recent years, the growing popularity of social media platforms has led to several interesting natural language processing (NLP) applications. However, these social media-based NLP applications are subject to different types of adversarial attacks due to the vulnerabilities of machine learning (ML) and NLP techniques. This work presents a new low-level adversarial attack recipe inspired by textual variations in online social media communication. These variations are generated to convey the message using out-of-vocabulary words based on visual and phonetic similarities of characters and words in the shortest possible form. The intuition behind the proposed scheme is to generate adversarial examples influenced by human cognition in text generation on social media platforms while preserving human robustness in text understanding with the fewest possible perturbations. The intentional textual variations introduced by users in online communication motivate us to replicate such trends in attacking text, in order to see the effects of such widely used textual variations on deep learning classifiers. In this work, the four most commonly used textual variations are chosen to generate adversarial examples. Moreover, this article introduces a word importance ranking-based beam search algorithm as a search method for selecting the best possible perturbation. The effectiveness of the proposed adversarial attacks is demonstrated on four benchmark datasets in an extensive experimental setup.