Class Title: Radiological imaging method, a comprehensive overview. Purpose: This GPT paper provides an overview of the different forms of radiological imaging and the diagnostic capabilities they offer, as well as recent advances in the field. Materials and Methods: This paper provides an overview of conventional radiography, digital radiography, panoramic radiography, computed tomography, and cone-beam computed tomography. Additionally, recent advances in radiological imaging are discussed, such as imaging diagnosis and modern computer-aided diagnosis systems. Results: This paper details the differences between the imaging techniques, the benefits of each, and the current advances in the field that aid in the diagnosis of medical conditions. Conclusion: Radiological imaging is an extremely important tool in modern medicine for assisting medical diagnosis. This work provides an overview of the types of imaging techniques used, the recent advances made, and their potential applications.
In this paper, a hybrid intelligent text zero-watermarking approach is proposed that integrates text zero-watermarking and a hidden Markov model as natural language processing techniques for content authentication and tampering detection of Arabic text. The proposed approach is known as the Second order of Alphanumeric Mechanism of Markov model and Zero-Watermarking Approach (SAMMZWA). The second-order alphanumeric mechanism based on the hidden Markov model is integrated with text zero-watermarking techniques to improve the overall performance and tampering detection accuracy of the proposed approach. The SAMMZWA approach embeds and detects the watermark logically without altering the original text document. The extracted features are used as watermark information and integrated with digital zero-watermarking techniques. To detect eventual tampering, SAMMZWA has been implemented and validated with attacked Arabic text. Experiments were performed on four datasets of varying lengths under multiple random locations of insertion, reorder, and deletion attacks. The experimental results show that the method is more sensitive to all kinds of tampering attacks, with higher tampering detection accuracy, than the compared methods.
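The idea of a "logical" watermark that never touches the document can be illustrated with a minimal sketch. It is not the SAMMZWA algorithm: it assumes a plain second-order character-transition table stands in for the Markov features, and the function names (`markov_pattern`, `watermark_key`, `is_tampered`) and the use of SHA-256 are hypothetical choices made only for this illustration.

```python
import hashlib
from collections import defaultdict


def markov_pattern(text, order=2):
    """Map each run of `order` alphanumeric characters to the characters that follow it."""
    # str.isalnum() keeps Arabic letters and digits but skips spaces,
    # punctuation, and combining diacritics.
    chars = [c for c in text if c.isalnum()]
    table = defaultdict(set)
    for i in range(len(chars) - order):
        state = "".join(chars[i:i + order])
        table[state].add(chars[i + order])
    return {state: sorted(nexts) for state, nexts in table.items()}


def watermark_key(text, order=2):
    """Derive a key from the text's features; the text itself is left unchanged."""
    pattern = markov_pattern(text, order)
    canonical = "|".join(f"{s}>{''.join(n)}" for s, n in sorted(pattern.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_tampered(text, stored_key, order=2):
    """Regenerate the key from the received text and compare it with the stored one."""
    return watermark_key(text, order) != stored_key
```

In this sketch the key would be stored with a trusted party and compared on receipt. Because the table records only which transitions occur, an edit that preserves the set of transitions goes unnoticed, which is one reason a real scheme needs richer features.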
Digital text is the most common medium transferred via the internet for various purposes, and it is highly exposed to illegal tampering during transfer. Therefore, improving the security and authenticity of text transferred via the internet has become one of the most difficult challenges that researchers face today. Arabic text is more sensitive than text in other languages because of its diacritics (Harakat), such as Kasra and Damma: even basic changes, such as modifying the arrangement of diacritics, can change the meaning of the text. In this paper, an intelligent hybrid solution is proposed with highly sensitive detection of any tampering with Arabic text exchanged via the internet. Natural language processing, entropy, and watermarking techniques are integrated into this method to improve the security and reliability of Arabic text without limitations on the nature or size of the text, or on the type or volume of tampering attack. The proposed scheme is implemented, simulated, and validated using four standard Arabic datasets of varying lengths under multiple random locations of insertion, reorder, and deletion attacks. The experimental and simulation results demonstrate the tampering detection accuracy of the proposed scheme against all kinds of tampering attacks. Comparison results show that the proposed approach outperforms all of the other baseline approaches in terms of tampering detection accuracy.
Text in natural scene images usually carries abundant semantic information. However, due to variations of text and complexity of background, detecting text in scene images is a critical and challenging task. In this paper, we present a novel method to detect text in scene images. First, we decompose scene images into background and text components using morphological component analysis (MCA), which reduces the adverse effects of complex backgrounds on the detection results. To improve the performance of image decomposition, two discriminative dictionaries of background and text are learned from the training samples. Moreover, Laplacian sparse regularization is introduced into our proposed dictionary learning method, which improves the discrimination of the dictionaries. Based on the text dictionary and the sparse-representation coefficients of text, we can construct the text component. After that, the text in the query image can be detected by applying certain heuristic rules. The results of experiments show the effectiveness of the proposed method.
Scene text detection is an important task in computer vision. In this paper, we present YOLOv5 Scene Text (YOLOv5ST), an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection. Our primary goal is to enhance inference speed without sacrificing significant detection accuracy, thereby enabling robust performance on resource-constrained devices like drones, closed-circuit television cameras, and other embedded systems. To achieve this, we propose key modifications to the network architecture to lighten the original backbone and improve feature aggregation, including replacing standard convolution with depth-wise convolution, adopting the C2 sequence module in place of C3, employing Spatial Pyramid Pooling Global (SPPG) instead of Spatial Pyramid Pooling Fast (SPPF), and integrating Bi-directional Feature Pyramid Network (BiFPN) into the neck. Experimental results demonstrate a remarkable 26% improvement in inference speed compared to the baseline, with only marginal reductions of 1.6% and 4.2% in mean average precision (mAP) at the intersection over union (IoU) thresholds of 0.5 and 0.5:0.95, respectively. Our work represents a significant advancement in scene text detection, striking a balance between speed and accuracy, making it well-suited for performance-constrained environments.
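The two IoU settings quoted for mAP can be made concrete with a short sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2) and reads "0.5:0.95" in the usual COCO sense of averaging over thresholds from 0.50 to 0.95 in steps of 0.05; the function names are illustrative and nothing here is taken from the YOLOv5ST code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds():
    """Thresholds averaged over for mAP@0.5:0.95 (0.50, 0.55, ..., 0.95)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a true positive when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

The stricter thresholds in the 0.5:0.95 range penalize loosely localized boxes, which is consistent with the larger drop reported for that metric.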
Text perception is crucial for understanding the semantics of outdoor scenes, making it a key requirement for building intelligent systems for driver assistance or autonomous driving. Text information in car-mounted videos can assist drivers in making decisions. However, car-mounted video text images pose challenges such as complex backgrounds, small fonts, and the need for real-time detection. We propose a robust Car-mounted Video Text Detector (CVTD), a lightweight text detection model that uses ResNet18 for feature extraction and can detect text of arbitrary shape. The model efficiently extracts global text positions through Coordinate Attention Threshold Activation (CATA) and strengthens feature representation by stacking two Feature Pyramid Enhancement Fusion Modules (FPEFM), which integrate local text features with global position information. The enhanced feature maps, when acted upon by Text Activation Maps (TAM), effectively distinguish text foreground from non-text regions. Additionally, we collected and annotated a dataset of 2200 Car-mounted Video Text (CVT) images under various road conditions for training and evaluating the model. We further tested the model on four other challenging public natural scene text detection benchmark datasets, demonstrating its strong generalization ability and real-time detection speed. This model holds potential for practical applications in real-world scenarios.
In recent years, images have played an increasingly important role in daily life and social communication. To some extent, the textual information contained in pictures is an important factor in understanding the content of the scenes themselves. The more accurate the text detection in natural scenes is, the more accurate our semantic understanding of the images will be. Thus, scene text detection has become a hot spot in computer vision. In this paper, we present a modified text detection network that builds on and improves the Connectionist Text Proposal Network (CTPN) proposed by previous researchers. To extract deeper features that are less affected by differences between images, we use the Residual Network (ResNet) to replace the Visual Geometry Group Network (VGGNet) used in the original network. Meanwhile, to enhance the robustness of the models to multiple languages, we train on the multi-lingual scene text detection and script identification (MLT) datasets of the 2017 International Conference on Document Analysis and Recognition (ICDAR2017). In addition, an attention mechanism is used to obtain a more reasonable weight distribution. The proposed models achieve an F1-score of 0.91 on the ICDAR2011 test set, about 5% better than CTPN trained on the same datasets.
The task of cross-target stance detection faces significant challenges due to the lack of additional background information in emerging knowledge domains and the colloquial nature of language patterns. Traditional stance detection methods often struggle to understand limited context and generalize insufficiently across diverse sentiments and semantic structures. This paper focuses on effectively mining and utilizing sentiment-semantics knowledge for stance knowledge transfer and proposes a sentiment-aware hierarchical attention network (SentiHAN) for cross-target stance detection. SentiHAN introduces an improved hierarchical attention network designed to maximize the use of high-level representations of targets and texts at various fine-grained levels. The model integrates phrase-level combinatorial sentiment knowledge to bridge the knowledge gap between known and unknown targets. By doing so, it enables a comprehensive understanding of stance representations for unknown targets across different sentiments and semantic structures. The model's ability to leverage sentiment-semantics knowledge enhances its performance in detecting stances that may not be directly observable from the immediate context. Extensive experimental results indicate that SentiHAN significantly outperforms existing benchmark methods in terms of both accuracy and robustness. Moreover, the paper employs ablation studies and visualization techniques to explore the intricate relationship between sentiment and stance. These analyses further confirm the effectiveness of sentence-level combinatorial sentiment knowledge in improving stance detection capabilities.
Text embedded in images is one of many important cues for indexing and retrieval of images and videos. In this paper, we present a novel method of detecting text aligned either horizontally or vertically, in which a pyramid structure is used to represent an image and the features of the text are extracted using the SUSAN edge detector. Text regions at each level of the pyramid are identified according to autocorrelation analysis. New techniques are introduced to split the text regions into basic ones and merge them into text lines. Evaluated on a set of images, the method achieves very good text detection performance.
The text of the Quran is principally dependent on the Arabic language. Therefore, improving the security and reliability of the Quran's text when it is exchanged via internet networks has become one of the most difficult challenges that researchers face today. Consequently, the diacritical marks in the Holy Quran which represent Arabic vowels (i,j.s) known as the kashida (or "extended letters") must be protected from changes. The cover text of the Quran and its watermarked text differ because of low values of the Peak Signal to Noise Ratio (PSNR) and Normalized Cross-Correlation (NCC); thus, the accuracy of tamper localization is low. The gap addressed in this paper is to improve the security of Arabic text in the Holy Quran by using vowels with kashida. The proposed scheme enhances the watermarking of the Quran's text using hybrid techniques (XOR and queuing techniques). It consists of four phases. The first phase is pre-processing. The second phase is an embedding process that hides the data after the vowel letters: if the secret bit is "1", a kashida is inserted, and if the bit is "0", no kashida is inserted. The third phase is an extraction process, and the last phase evaluates the performance of the proposed scheme using PSNR (for imperceptibility) and NCC (for the security of the watermarking). Experiments were performed on three datasets of varying lengths under multiple random locations of insertion, reorder, and deletion attacks. The experimental results revealed an improvement of 1.76% in NCC and 9.6% in PSNR compared to currently available schemes.
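The embedding and extraction phases described above can be sketched in a few lines. This is a minimal illustration, not the proposed scheme: the XOR and queuing steps are omitted, the vowel set (alif, waw, ya) is an assumption, the cover text is assumed to contain no kashida of its own, and the names `embed` and `extract` are hypothetical. The kashida is the Unicode character U+0640 (ARABIC TATWEEL).

```python
KASHIDA = "\u0640"  # ARABIC TATWEEL
# Assumed vowel letters for this sketch: alif, waw, ya.
VOWELS = {"\u0627", "\u0648", "\u064A"}


def embed(cover, bits):
    """Insert a kashida after a vowel letter for bit 1; insert nothing for bit 0."""
    out = []
    remaining = iter(bits)
    for ch in cover:
        out.append(ch)
        if ch in VOWELS:
            bit = next(remaining, None)  # None once all bits are embedded
            if bit == 1:
                out.append(KASHIDA)
    return "".join(out)


def extract(marked, n_bits):
    """Read one bit per vowel letter: 1 if a kashida follows it, otherwise 0."""
    bits = []
    for i, ch in enumerate(marked):
        if ch in VOWELS and len(bits) < n_bits:
            following = marked[i + 1] if i + 1 < len(marked) else ""
            bits.append(1 if following == KASHIDA else 0)
    return bits
```

The capacity of this sketch is one bit per vowel letter in the cover text, and removing every kashida from the marked text recovers the cover text exactly.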
Due to the rapid increase in the exchange of text information via internet networks, the security and reliability of digital content have become a major research issue. The main challenges faced by researchers are authentication, integrity verification, and tampering detection of digital content. In this paper, a text zero-watermarking and text feature-based approach is proposed to improve the tampering detection accuracy for English text. The proposed approach embeds and detects the watermark logically without altering the original English text document. Based on a hidden Markov model (HMM), the fourth-level order of the word mechanism is used to analyze the contents of the given English text to find the interrelationship between the contexts. The extracted features are used as watermark information and integrated with digital zero-watermarking techniques. To detect eventual tampering, the proposed approach has been implemented and validated with attacked English text. Experiments were performed using four standard datasets of varying lengths under multiple random locations of insertion, reorder, and deletion attacks. The experimental and simulation results demonstrate the tampering detection accuracy of the method against all kinds of tampering attacks. Comparison results show that the proposed approach outperforms all the other baseline approaches in terms of tampering detection accuracy.
Detecting and recognizing text in natural scene images is challenging because image quality depends on the conditions in which the image is captured, such as viewing angle, blurring, and sensor noise. In this paper, a prototype for text detection and recognition from natural scene images is proposed. The prototype is based on the Raspberry Pi 4 and a Universal Serial Bus (USB) camera and embeds our text detection and recognition model, which was developed in Python. The model uses the deep-learning-based Efficient and Accurate Scene Text Detector (EAST) for text localization and detection and Tesseract-OCR as the Optical Character Recognition (OCR) engine for text recognition. The prototype is controlled with the Virtual Network Computing (VNC) tool from a computer via a wireless connection. The experimental results show that the recognition rate for images captured through the camera by the prototype can reach 99.75% with low computational complexity. Furthermore, the prototype achieves a higher recognition rate than the Tesseract software. It also matches the recognition rate of the EasyOCR software on the Raspberry Pi 4 board while reducing execution time by an average of 89%.
Segmentation-based scene text detection has drawn a great deal of attention, as it can describe text instances of arbitrary shape based on pixel-level prediction. However, most segmentation-based methods suffer from complex post-processing to separate text instances that are close to each other, resulting in considerable time consumption during inference. In this paper, a label enhancement method is proposed to construct two kinds of training labels for segmentation-based scene text detection. The label distribution learning (LDL) method is used to overcome the problem caused by pure shrunk text labels, which might result in suboptimal detection performance. The experimental results on three benchmarks demonstrate that the proposed method consistently improves performance without sacrificing inference speed.
Scene text detection is an important step in a scene text reading system. Existing text detection methods still have two problems: (1) the small receptive field of shallow convolutional layers is not sufficiently sensitive to the target area in the image; (2) the large receptive field of deep convolutional layers loses a lot of spatial feature information. Therefore, detecting scene text remains a challenging issue. In this work, we design an effective text detector named Adaptive Multi-Scale HyperNet (AMSHN) to improve text detection performance. Specifically, AMSHN enhances the sensitivity to target semantics in shallow features with a new attention mechanism that strengthens the region of interest in the image and weakens regions of no interest. In addition, it reduces the loss of spatial features by fusing features on multiple paths, which significantly improves text detection performance. Experimental results on the Robust Reading Challenge on Reading Chinese Text on Signboard (ReCTS) dataset show that the proposed method achieves state-of-the-art results, which demonstrates the ability of the detector in both specific and general applications.
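Scene text detection has developed rapidly in recent years. This paper proposes Mask Text Detector, a novel algorithm for detecting text of arbitrary shape. Building on Mask R-CNN, the algorithm replaces the original RPN layer with an anchor-free method for generating proposals, which reduces the number of hyperparameters, the number of model parameters, and the amount of computation. It also proposes LQCS (Localization Quality and Classification Score) joint regression, which ties localization quality to the classification score and removes the inconsistency between them at the prediction stage. To help the network distinguish complex samples, a Socle-Mask branch that draws on traditional edge detection algorithms is proposed to generate segmentation masks. This module extracts texture features separately in the horizontal and vertical directions and adds a channel self-attention mechanism so that the network selects channel features on its own. Extensive experiments on three challenging datasets (Total-Text, CTW1500, and ICDAR2015) verify that the algorithm achieves strong text detection performance.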
We present a robust connected-component (CC) based method for automatic detection and segmentation of text in real-scene images. This technique can be applied in robot vision, sign recognition, meeting processing, and video indexing. First, a Non-Linear Niblack method (NLNiblack) is proposed to decompose the image into candidate CCs. Then, all these CCs are fed into a cascade of classifiers trained by the Adaboost algorithm. Each classifier in the cascade responds to one feature of the CC. Proposed here are 12 novel features which are insensitive to noise, scale, text orientation, and text language. The classifier cascade allows non-text CCs of the image to be rapidly discarded while more computation is spent on promising text-like CCs. The CCs passing through the cascade are considered text components and are used to form the segmentation result. A prototype system was built, with experimental results demonstrating the effectiveness and efficiency of the proposed method.
This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected components (CCs) by the Niblack clustering algorithm. Then all the CCs, including text CCs and non-text CCs, are verified on their text features by a two-stage classification module, in which most non-text CCs are discarded by an attentional cascade classifier and the remaining CCs are further verified by an SVM. All the accepted CCs are output to produce a text-only binary image. Experiments with many images of different scenes showed satisfactory performance of the proposed method.
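The decomposition step starts from a local binarization, which can be sketched as follows. This sketch assumes the classic Niblack rule, in which each pixel is compared with a local threshold T = m + k·s (m and s being the mean and standard deviation of the surrounding window); the window radius, the value k = -0.2, and the function name are illustrative choices, not values taken from the abstracts, and the connected-component labeling and classifier stages are not shown.

```python
from statistics import mean, pstdev


def niblack_binarize(img, radius=1, k=-0.2):
    """Mark a pixel as foreground (1) when it is darker than its local Niblack threshold."""
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the pixels of the window, clipped at the image border.
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            threshold = mean(window) + k * pstdev(window)
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out
```

Because the threshold follows the local statistics, dark strokes are separated from the background even under uneven illumination. In flat regions the standard deviation is zero and no pixel is marked.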
Single-pass clustering is commonly used in topic detection and tracking (TDT) due to its simplicity, high efficiency, and low cost. When dealing with large-scale data, however, its time cost increases sharply and clustering performance is greatly affected. To address this problem, a hierarchical clustering algorithm based on single-pass is proposed, which is inspired by hierarchical and concurrent ideas and divides the clustering process into three stages. News reports are first classified into different categories. Then two single-pass clustering passes are run within each category, followed by one agglomerative clustering across categories. In addition, to capture semantic similarity in news reports, the topic model is improved based on named entities. Experimental results show that the proposed method can effectively accelerate the process as well as improve the performance.
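The baseline that the proposed algorithm builds on can be sketched briefly. This is plain single-pass clustering only, not the three-stage hierarchical method: it assumes whitespace-tokenized bag-of-words vectors, cosine similarity, summed term counts as the cluster representative, and an arbitrary threshold of 0.3; all names are illustrative.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.3):
    """Assign each document, in arrival order, to the closest cluster or open a new one."""
    clusters, labels = [], []
    for text in docs:
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for idx, representative in enumerate(clusters):
            sim = cosine(vec, representative)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is not None and best_sim >= threshold:
            clusters[best].update(vec)  # fold the document into the cluster's term counts
            labels.append(best)
        else:
            clusters.append(vec)  # no cluster is close enough: start a new topic
            labels.append(len(clusters) - 1)
    return labels
```

Each document is compared with every existing cluster, so the cost grows with the number of clusters. That is the scaling problem the abstract attributes to large-scale data, and pre-classifying reports into categories limits how many clusters each document must be compared against.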
Topic models such as Latent Dirichlet Allocation (LDA) have been successfully applied to many text mining tasks for extracting topics embedded in corpora. However, existing topic models generally cannot discover bursty topics that experience a sudden increase during a period of time. In this paper, we propose a new topic model named Burst-LDA, which simultaneously discovers topics and reveals their burstiness by explicitly modeling each topic's burst states with a first-order Markov chain and using the chain to generate the topic proportions of documents in a Logistic Normal fashion. A Gibbs sampling algorithm is developed for the posterior inference of the proposed model. Experimental results on a news data set show that the model can efficiently discover bursty topics, outperforming the state-of-the-art method.
Due to the widespread use of social media in daily life, sentiment analysis has become an important field in pattern recognition and Natural Language Processing (NLP). In this field, users' feedback data on a specific issue are evaluated and analyzed. Detecting emotions within text is therefore considered one of the important challenges of current NLP research. Emotions have been widely studied in psychology and behavioral science, as they are an integral part of human nature. Emotions describe a state of mind comprising distinct behaviors, feelings, thoughts, and experiences. The main objective of this paper is to propose a new model named BERT-CNN to detect emotions from text. The model combines Bidirectional Encoder Representations from Transformers (BERT) with a Convolutional Neural Network (CNN) for text classification. It uses BERT to train the word semantic representation language model. According to the word context, the semantic vector is dynamically generated and then fed into the CNN to predict the output. Results of a comparative study show that the BERT-CNN model outperforms the state-of-the-art baselines produced by different models in the literature on the SemEval-2019 Task 3 and ISEAR datasets. The BERT-CNN model achieves an accuracy of 94.7% and an F1-score of 94% on the SemEval-2019 Task 3 dataset, and an accuracy of 75.8% and an F1-score of 76% on the ISEAR dataset.
Funding (SAMMZWA Arabic text zero-watermarking study): The Deanship of Scientific Research at King Khalid University funded this work under grant number (R.G.P.2/55/40/2019), received by Fahd N. Al-Wesabi. www.kku.edu.sa
Funding (hybrid NLP, entropy, and watermarking study of Arabic text): The author extends his appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under Grant Number (R.G.P.2/55/40/2019), received by Fahd N. Al-Wesabi. www.kku.edu.sa
Funding (MCA-based scene text detection study): Supported in part by the National Natural Science Foundation of China (61302041, 61363044, 61562053, 61540042) and the Applied Basic Research Foundation of Yunnan Provincial Science and Technology Department (2013FD011, 2016FD039).
Funding: The National Natural Science Foundation of PR China (42075130); Nari Technology Co., Ltd. (4561655965).
Abstract: Scene text detection is an important task in computer vision. In this paper, we present YOLOv5 Scene Text (YOLOv5ST), an optimized architecture based on YOLOv5 v6.0 tailored for fast scene text detection. Our primary goal is to enhance inference speed without sacrificing significant detection accuracy, thereby enabling robust performance on resource-constrained devices such as drones, closed-circuit television cameras, and other embedded systems. To achieve this, we propose key modifications to the network architecture that lighten the original backbone and improve feature aggregation: replacing standard convolution with depth-wise convolution, adopting the C2 sequence module in place of C3, employing Spatial Pyramid Pooling Global (SPPG) instead of Spatial Pyramid Pooling Fast (SPPF), and integrating a Bi-directional Feature Pyramid Network (BiFPN) into the neck. Experimental results demonstrate a 26% improvement in inference speed compared to the baseline, with only marginal reductions of 1.6% and 4.2% in mean average precision (mAP) at the intersection over union (IoU) thresholds of 0.5 and 0.5:0.95, respectively. Our work strikes a balance between speed and accuracy in scene text detection, making it well suited for performance-constrained environments.
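The accuracy figures above are reported at IoU thresholds of 0.5 and 0.5:0.95. As a minimal sketch of what those thresholds mean, assuming axis-aligned boxes in `(x1, y1, x2, y2)` form (scene text benchmarks often use rotated boxes or polygons, which this does not cover), the IoU of a predicted and a ground-truth box can be computed as follows. The function names and the box format are illustrative assumptions and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# "0.5:0.95" conventionally denotes thresholds 0.50, 0.55, ..., 0.95.
THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def thresholds_passed(pred, truth):
    """Number of IoU thresholds at which `pred` counts as a match for `truth`."""
    score = iou(pred, truth)
    return sum(1 for t in THRESHOLDS if score >= t)
```

A detection that overlaps the ground truth only loosely passes the 0.5 threshold but fails the stricter ones, which is why the averaged 0.5:0.95 metric is usually more sensitive to localization quality than the single 0.5 threshold.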
Funding: This work is supported in part by the National Natural Science Foundation of China (Grant Number 61971078), which provided domain expertise and computational power that greatly assisted the activity. This work was financially supported by Chongqing Municipal Education Commission Grants for Major Science and Technology Project (KJZD-M202301901) and the Science and Technology Research Project of Jiangxi Department of Education (GJJ2201049).
Abstract: Text perception is crucial for understanding the semantics of outdoor scenes, making it a key requirement for building intelligent systems for driver assistance or autonomous driving. Text information in car-mounted videos can assist drivers in making decisions. However, car-mounted video text images pose challenges such as complex backgrounds, small fonts, and the need for real-time detection. We propose a robust Car-mounted Video Text Detector (CVTD), a lightweight text detection model that uses ResNet18 for feature extraction and can detect text of arbitrary shapes. The model efficiently extracts global text positions through Coordinate Attention Threshold Activation (CATA) and stacks two Feature Pyramid Enhancement Fusion Modules (FPEFM) to strengthen feature representation and integrate local text features with global position information. The enhanced feature maps, when acted upon by Text Activation Maps (TAM), effectively distinguish text foreground from non-text regions. Additionally, we collected and annotated a dataset containing 2200 images of Car-mounted Video Text (CVT) under various road conditions for training and evaluating the model's performance. We further tested the model on four other challenging public natural scene text detection benchmark datasets, demonstrating its strong generalization ability and real-time detection speed. This model holds potential for practical applications in real-world scenarios.
Funding: Supported by the National Natural Science Foundation of China (Nos. U1536121, 61370195).
Abstract: In recent years, images have played an increasingly important role in daily life and social communication. To some extent, the textual information contained in pictures is an important factor in understanding the content of the scenes themselves. The more accurate the text detection in natural scenes is, the more accurate our semantic understanding of the images will be. Thus, scene text detection has become a hot spot in the domain of computer vision. In this paper, we present a modified text detection network based on further research into and improvement of the Connectionist Text Proposal Network (CTPN) proposed by previous researchers. To extract deeper features that are less affected by differences between images, we use a Residual Network (ResNet) to replace the Visual Geometry Group Network (VGGNet) used in the original network. Meanwhile, to enhance the robustness of the models to multiple languages, we train on the multi-lingual scene text detection and script identification datasets (MLT) of the 2017 International Conference on Document Analysis and Recognition (ICDAR2017). In addition, an attention mechanism is used to obtain a more reasonable weight distribution. We found that the proposed models achieve a 0.91 F1-score on the ICDAR2011 test set, better than CTPN trained on the same datasets by about 5%.
Funding: Supported by the National Social Science Fund of China (20BXW101).
Abstract: The task of cross-target stance detection faces significant challenges due to the lack of additional background information in emerging knowledge domains and the colloquial nature of language patterns. Traditional stance detection methods often struggle with understanding limited context and generalize insufficiently across diverse sentiments and semantic structures. This paper focuses on effectively mining and utilizing sentiment-semantics knowledge for stance knowledge transfer and proposes a sentiment-aware hierarchical attention network (SentiHAN) for cross-target stance detection. SentiHAN introduces an improved hierarchical attention network designed to maximize the use of high-level representations of targets and texts at various levels of granularity. The model integrates phrase-level combinatorial sentiment knowledge to effectively bridge the knowledge gap between known and unknown targets. By doing so, it enables a comprehensive understanding of stance representations for unknown targets across different sentiments and semantic structures. The model's ability to leverage sentiment-semantics knowledge enhances its performance in detecting stances that may not be directly observable from the immediate context. Extensive experimental results indicate that SentiHAN significantly outperforms existing benchmark methods in terms of both accuracy and robustness. Moreover, the paper employs ablation studies and visualization techniques to explore the intricate relationship between sentiment and stance. These analyses further confirm the effectiveness of sentence-level combinatorial sentiment knowledge in improving stance detection capabilities.
Abstract: Text embedded in images is one of many important cues for indexing and retrieval of images and videos. In this paper, we present a novel method of detecting text aligned either horizontally or vertically, in which a pyramid structure is used to represent an image and the features of the text are extracted using the SUSAN edge detector. Text regions at each level of the pyramid are identified by autocorrelation analysis. New techniques are introduced to split the text regions into basic ones and merge them into text lines. Evaluated on a set of images, the method achieves very good text detection performance.
Funding: Funded by MOHE (FRGS: R.K130000.7856.5F026), received by Nilam Nur Amir Sjarif.
Abstract: The text of the Quran is principally dependent on the Arabic language. Therefore, improving the security and reliability of the Quran's text when it is exchanged via internet networks has become one of the most difficult challenges that researchers face today. Consequently, the diacritical marks in the Holy Quran which represent Arabic vowels (i,j.s), known as the kashida (or “extended letters”), must be protected from changes. The cover text of the Quran and its watermarked text differ, as indicated by low values of the Peak Signal to Noise Ratio (PSNR) and Normalized Cross-Correlation (NCC); thus, the accuracy of locating tampering is low. The gap addressed in this paper is improving the security of Arabic text in the Holy Quran by using vowels with kashida, enhancing the watermarking scheme for the text of the Quran with the hybrid techniques (XOR and queuing techniques) of the proposed scheme. The proposed scheme consists of four phases. The first phase is pre-processing. The second phase is an embedding process that hides the data after the vowel letters: if the secret bit is “1”, a kashida is inserted, and if the bit is “0”, no kashida is inserted. The third phase is an extraction process, and the last phase evaluates the performance of the proposed scheme using PSNR (for imperceptibility) and NCC (for the security of the watermarking). Experiments were performed on three datasets of varying lengths under multiple random locations of insertion, reorder, and deletion attacks. The experimental results revealed an improvement of the NCC by 1.76% and of the PSNR by 9.6% compared to currently available schemes.
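The embedding rule described above (insert a kashida after a vowel letter for bit "1", insert nothing for bit "0") can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheme: the set of "vowel letters" is assumed here to be alef, waw, and yeh (the abstract does not list them), the cover text is assumed to contain no kashida of its own, and the XOR and queuing steps are omitted because the abstract does not specify them.

```python
KASHIDA = "\u0640"  # ARABIC TATWEEL, the elongation character

# Assumption: alef, waw, yeh. The abstract does not say which letters are used.
VOWEL_LETTERS = {"\u0627", "\u0648", "\u064A"}


def embed(cover: str, bits: str) -> str:
    """Hide `bits` (a string of '0'/'1') after the vowel letters of `cover`."""
    out = []
    used = 0
    for ch in cover:
        out.append(ch)
        if ch in VOWEL_LETTERS and used < len(bits):
            if bits[used] == "1":
                out.append(KASHIDA)  # bit 1: insert a kashida
            used += 1                # bit 0: consume the slot, insert nothing
    if used < len(bits):
        raise ValueError("cover text has too few vowel letters for the payload")
    return "".join(out)


def extract(stego: str, n_bits: int) -> str:
    """Read back `n_bits` by checking whether a kashida follows each vowel letter."""
    bits = []
    for i, ch in enumerate(stego):
        if len(bits) == n_bits:
            break
        if ch in VOWEL_LETTERS:
            follows = stego[i + 1] if i + 1 < len(stego) else ""
            bits.append("1" if follows == KASHIDA else "0")
    return "".join(bits)
```

Stripping every kashida from the watermarked text recovers the cover text exactly, which is why the change is hard to notice visually; the same property means an attacker who removes all kashidas also erases the payload.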
Funding: The author extends his appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (R.G.P.2/55/40/2019), received by Fahd N. Al-Wesabi. www.kku.edu.sa.
Abstract: Due to the rapid increase in the exchange of text information via internet networks, the security and reliability of digital content have become a major research issue. The main challenges faced by researchers are authentication, integrity verification, and tampering detection of digital content. In this paper, an approach based on text zero-watermarking and text features is proposed to improve the tampering detection accuracy for English text. The proposed approach embeds and detects the watermark logically without altering the original English text document. Based on a hidden Markov model (HMM), a fourth-order word-level mechanism is used to analyze the contents of the given English text and find the interrelationships between contexts. The extracted features are used as watermark information and integrated with digital zero-watermarking techniques. To detect eventual tampering, the proposed approach has been implemented and validated with attacked English text. Experiments were performed using four standard datasets of varying lengths under multiple random locations of insertion, reorder, and deletion attacks. The experimental and simulation results demonstrate the tampering detection accuracy of our method against all kinds of tampering attacks. Comparison results show that our proposed approach outperforms all the other baseline approaches in terms of tampering detection accuracy.
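The idea of a zero-watermark derived from fourth-order word transitions can be illustrated with a rough sketch. Everything below is an assumption made for illustration: the abstract does not say how the Markov features are turned into a watermark, so this sketch simply records which word follows each run of four words and hashes that table with SHA-256. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def transitions(text: str, order: int = 4):
    """Map each run of `order` consecutive words to the words that follow it."""
    words = text.lower().split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        table[state].append(words[i + order])
    return table


def watermark(text: str, order: int = 4) -> str:
    """Derive a digest from the transition table; the text itself is not modified."""
    digest = hashlib.sha256()
    table = transitions(text, order)
    for state in sorted(table):
        line = " ".join(state) + "->" + ",".join(table[state]) + "\n"
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()


def is_tampered(text: str, stored_watermark: str, order: int = 4) -> bool:
    """Compare the watermark recomputed from `text` with the stored one."""
    return watermark(text, order) != stored_watermark
```

Because the watermark is stored separately from the document, the document stays byte-for-byte unchanged. Note that this sketch ignores letter case and whitespace, and a single digest can only say whether tampering occurred; locating it would require comparing the transition tables themselves.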
Funding: This work was funded by the Deanship of Scientific Research at Jouf University (Kingdom of Saudi Arabia) under Grant No. DSR-2021-02-0392.
Abstract: Detecting and recognizing text in natural scene images is challenging because the image quality depends on the conditions in which the image is captured, such as viewing angle, blurring, and sensor noise. In this paper, a prototype for text detection and recognition from natural scene images is proposed. The prototype is based on the Raspberry Pi 4 and a Universal Serial Bus (USB) camera and embeds our text detection and recognition model, which was developed in Python. Our model uses the deep learning based Efficient and Accurate Scene Text Detector (EAST) for text localization and detection and Tesseract-OCR as the Optical Character Recognition (OCR) engine for text recognition. The prototype is controlled from a computer over a wireless connection using the Virtual Network Computing (VNC) tool. The experimental results show that the recognition rate for images captured through the camera by our prototype can reach 99.75% with low computational complexity. Furthermore, our prototype outperforms the Tesseract software in terms of recognition rate. In addition, it matches the recognition rate of the EasyOCR software on the Raspberry Pi 4 board while reducing the execution time by 89% on average.
Funding: Supported by ZTE Industry-University-Institute Cooperation Funds under Grant No. HC-CN-20200717012.
Abstract: Segmentation-based scene text detection has drawn a great deal of attention, as it can describe text instances of arbitrary shape based on pixel-level prediction. However, most segmentation-based methods require complex post-processing to separate text instances that are close to each other, resulting in considerable time consumption during inference. In this paper, a label enhancement method is proposed to construct two kinds of training labels for segmentation-based scene text detection. The label distribution learning (LDL) method is used to overcome the problem that pure shrunk text labels might result in suboptimal detection performance. The experimental results on three benchmarks demonstrate that the proposed method can consistently improve performance without sacrificing inference speed.
Funding: This work is supported by the National Natural Science Foundation of China (61872231, 61701297).
Abstract: Scene text detection is an important step in a scene text reading system. Existing text detection methods still have two problems: (1) the small receptive field of shallow convolutional layers is not sufficiently sensitive to the target area in the image; (2) the large receptive field of deep convolutional layers loses a lot of spatial feature information. Therefore, detecting scene text remains a challenging issue. In this work, we design an effective text detector named Adaptive Multi-Scale HyperNet (AMSHN) to improve text detection performance. Specifically, AMSHN enhances the sensitivity to target semantics in shallow features with a new attention mechanism that strengthens the regions of interest in the image and weakens the regions of no interest. In addition, it reduces the loss of spatial features by fusing features on multiple paths, which significantly improves text detection performance. Experimental results on the Robust Reading Challenge on Reading Chinese Text on Signboard (ReCTS) dataset show that the proposed method achieves state-of-the-art results, which demonstrates the ability of our detector in both specific and general applications.
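Abstract: Scene text detection has developed rapidly in recent years. This paper proposes Mask Text Detector, a novel algorithm for detecting text of arbitrary shape. Building on Mask R-CNN, the algorithm replaces the original RPN layer with an anchor-free method for generating proposal boxes, which reduces the number of hyperparameters, the number of model parameters, and the amount of computation. It also proposes LQCS (Localization Quality and Classification Score) joint regression, which links localization quality with the classification score and eliminates the inconsistency at the prediction stage. To help the network distinguish complex samples, a Socle-Mask branch that incorporates traditional edge detection algorithms is proposed to generate segmentation masks. This module extracts texture features separately in the horizontal and vertical directions and adds a channel self-attention mechanism so that the network can select channel features on its own. Extensive experiments on three challenging datasets (Total-Text, CTW1500, and ICDAR2015) verify that the algorithm achieves strong text detection performance.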
Abstract: We present a robust connected-component (CC) based method for automatic detection and segmentation of text in real-scene images. This technique can be applied in robot vision, sign recognition, meeting processing, and video indexing. First, a Non-Linear Niblack method (NLNiblack) is proposed to decompose the image into candidate CCs. Then, all these CCs are fed into a cascade of classifiers trained by the AdaBoost algorithm. Each classifier in the cascade responds to one feature of the CC. Proposed here are 12 novel features which are insensitive to noise, scale, text orientation, and text language. The classifier cascade allows non-text CCs of the image to be rapidly discarded while more computation is spent on promising text-like CCs. The CCs passing through the cascade are considered text components and are used to form the segmentation result. A prototype system was built, with experimental results demonstrating the effectiveness and efficiency of the proposed method.
Funding: Project supported by the OMRON and SJTU Collaborative Foundation under PVS project (2005.03~2005.10).
Abstract: This paper proposes a learning-based method for text detection and text segmentation in natural scene images. First, the input image is decomposed into multiple connected components (CCs) by a Niblack clustering algorithm. Then all the CCs, including text CCs and non-text CCs, are verified on their text features by a 2-stage classification module, where most non-text CCs are discarded by an attentional cascade classifier and the remaining CCs are further verified by an SVM. All the accepted CCs are output to form a text-only binary image. Experiments with many images of different scenes showed satisfactory performance of the proposed method.
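For orientation, the classic Niblack rule binarizes each pixel against a local threshold T = m + k·s, where m and s are the mean and standard deviation of the pixel's neighborhood. The sketch below shows that standard rule only; the abstract does not describe its "Niblack clustering" step in detail, so this may differ from what the authors actually use. The window size and k = -0.2 are common textbook defaults, chosen here as assumptions, and the image is a plain list of rows of gray levels.

```python
from statistics import mean, pstdev


def niblack_binarize(img, window=3, k=-0.2):
    """Binarize a grayscale image (list of rows) with the local rule T = m + k*s."""
    height, width = len(img), len(img[0])
    radius = window // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighborhood clipped at the image border.
            patch = [
                img[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            threshold = mean(patch) + k * pstdev(patch)
            out[y][x] = 1 if img[y][x] > threshold else 0
    return out
```

The resulting binary image would then be split into connected components, which are the candidates passed to the classifiers. This version marks bright pixels as foreground; dark text on a bright background would need the comparison reversed.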
Funding: Supported by the National Natural Science Foundation of China (No. 61502312), the Fundamental Research Funds for the Central Universities (No. 2017BQ024), the Natural Science Foundation of Guangdong Province (No. 2017A030310428), and the Science and Technology Program of Guangzhou (No. 201806020075, 20180210025).
Abstract: Single-pass clustering is commonly used in topic detection and tracking (TDT) due to its simplicity, high efficiency, and low cost. However, when dealing with large-scale data, its time cost increases sharply and its clustering performance is greatly affected. To address this problem, a hierarchical clustering algorithm based on single-pass is proposed, which draws on hierarchical and concurrent ideas to divide the clustering process into three stages. News reports are first classified into different categories. Then single-pass clustering is performed twice within each category, followed by one agglomerative clustering across different categories. In addition, to capture semantic similarity between news reports, the topic model is improved based on named entities. Experimental results show that the proposed method can effectively accelerate the process as well as improve performance.
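The baseline that the paper builds on, plain single-pass clustering, can be sketched in a few lines: each incoming document joins the most similar existing cluster if the similarity reaches a threshold, and otherwise starts a new cluster. The sketch below is an illustration of that baseline only, not the proposed three-stage algorithm. The bag-of-words representation, cosine similarity, and the threshold of 0.3 are assumptions chosen for the example.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.3):
    """Assign each document a cluster index in one pass over `docs`."""
    centroids = []  # summed term counts of each cluster
    labels = []
    for doc in docs:
        vec = Counter(doc.lower().split())
        best, best_sim = -1, 0.0
        for idx, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)  # merge the document into the cluster
            labels.append(best)
        else:
            centroids.append(vec)        # start a new cluster
            labels.append(len(centroids) - 1)
    return labels
```

Every document is compared against every existing cluster, so the cost grows with the number of clusters; this is the scaling problem the abstract refers to, and classifying reports into categories first limits the comparisons to clusters within the same category.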
Funding: Supported by the National High Technology Research and Development Program of China (No. 2012AA011005).
Abstract: Topic models such as Latent Dirichlet Allocation (LDA) have been successfully applied to many text mining tasks for extracting topics embedded in corpora. However, existing topic models generally cannot discover bursty topics that experience a sudden increase during a period of time. In this paper, we propose a new topic model named Burst-LDA, which simultaneously discovers topics and reveals their burstiness by explicitly modeling each topic's burst states with a first-order Markov chain and using the chain to generate the topic proportions of documents in a Logistic Normal fashion. A Gibbs sampling algorithm is developed for the posterior inference of the proposed model. Experimental results on a news data set show that our model can efficiently discover bursty topics, outperforming the state-of-the-art method.
Abstract: Due to the widespread use of social media in daily life, sentiment analysis has become an important field in pattern recognition and Natural Language Processing (NLP). In this field, users' feedback data on a specific issue are evaluated and analyzed. Detecting emotions within text is therefore considered one of the important challenges of current NLP research. Emotions have been widely studied in psychology and behavioral science, as they are an integral part of human nature. Emotions describe a state of mind with distinct behaviors, feelings, thoughts, and experiences. The main objective of this paper is to propose a new model named BERT-CNN to detect emotions from text. This model combines Bidirectional Encoder Representations from Transformers (BERT) and Convolutional Neural Networks (CNN) for textual classification. The model uses BERT to train the word semantic representation language model. According to the word context, the semantic vector is dynamically generated and then fed into the CNN to predict the output. Results of a comparative study showed that the BERT-CNN model outperforms the state-of-the-art baselines produced by different models in the literature on the SemEval-2019 Task 3 and ISEAR datasets. The BERT-CNN model achieves an accuracy of 94.7% and an F1-score of 94% on the SemEval-2019 Task 3 dataset and an accuracy of 75.8% and an F1-score of 76% on the ISEAR dataset.