Funding: Supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2023R1A2C1005950).
Abstract: The challenging task of handwriting style synthesis requires capturing the individuality and diversity of human handwriting. Most currently available methods use either a generative adversarial network (GAN) or a recurrent neural network (RNN) to generate new handwriting styles. These techniques, however, frequently fall short of producing diverse and realistic text images, particularly for words that are rarely used. To address this, the research proposes a novel deep learning model consisting of a style encoder and a text generator to synthesize different handwriting styles. The network generates conditional text by extracting style vectors from a series of style images. The model performs well on a range of handwriting synthesis tasks, including the production of out-of-vocabulary text. It outperforms previous approaches, showing lower values on key GAN evaluation metrics, such as Geometric Score (GS) (3.21×10^(-5)) and Fréchet Inception Distance (FID) (8.75), as well as on text recognition metrics, such as Character Error Rate (CER) and Word Error Rate (WER). A thorough component analysis revealed a steady improvement in image generation quality and highlighted the importance of specific handwriting styles. Applicable fields include digital forensics, creative writing, and document security.
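The abstract reports CER and WER without defining them. A minimal sketch of the usual definition (edit distance divided by reference length) is given below; the function names are illustrative, and the paper may compute these metrics with a different tool or normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, `cer("hello", "hallo")` counts one substitution over five characters, and `wer` applies the same computation to whitespace-separated words.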
Abstract: This paper proposes a new multi-stage algorithm for the recognition of isolated characters. Similar earlier work used only the center of gravity (this paper is an extended version of “A fast recognition system for isolated printed characters using center of gravity”, LAP LAMBERT Academic Publishing 2011, ISBN: 978-38465-0002-6); here the principal axis is added to make the algorithm rotation invariant. In the author's previous work, a rotated character could not be recognized, which constrained the document to be well oriented. Here the principal axis is used to unify the orientation of the character set and of the characters in the scanned document. The algorithm can be applied to any isolated characters, such as Latin, Chinese, Japanese, and Arabic characters, but in this paper it is applied to Arabic characters. The approach takes normalized, isolated characters of the same size and extracts an image signature based on the center of gravity of the character after making the character's principal axis vertical. The system then compares these values with a set of signatures for typical characters of the set and reports the closeness of match to all other characters in the set.
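The abstract does not say how the center of gravity and principal axis are computed. A minimal sketch using standard image moments on a binary image follows; the representation (a list of rows of 0/1 pixels) and the function names are assumptions for illustration, not the paper's implementation.

```python
import math


def center_of_gravity(img):
    """Return (x_bar, y_bar) of the foreground pixels of a binary image."""
    total = sum_x = sum_y = 0
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if value:
                total += 1
                sum_x += x
                sum_y += y
    if total == 0:
        raise ValueError("image has no foreground pixels")
    return sum_x / total, sum_y / total


def principal_axis_angle(img):
    """Angle (radians, from the x-axis) of the principal axis, computed
    from the second-order central moments of the foreground pixels."""
    cx, cy = center_of_gravity(img)
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if value:
                dx, dy = x - cx, y - cy
                mu20 += dx * dx
                mu02 += dy * dy
                mu11 += dx * dy
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
```

Rotating the character by `math.pi / 2 - principal_axis_angle(img)` would bring this axis to the vertical. The moment-based angle is only defined up to 180 degrees, so an upside-down character yields the same axis; how the paper resolves that ambiguity is not stated in the abstract.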
Abstract: The main purposes of this thesis are to investigate further the lexical features of business English correspondence and to present methods of lexical application based on the basic rules of effective business English letter writing.
Funding: the Key Programming Research Project of Education Science during the 11th Five-Year Plan Period of Guangdong Province, No. 06TJZ014; the Programming Project of Education Science during the 11th Five-Year Plan Period of Guangzhou City, No. 07B290
Abstract: BACKGROUND: The role of the left midfusiform gyrus as a target for visual word processing has been a topic of discussion. Numerous studies have used alphabetic writing as the stimulus material. However, few have addressed visual processing of Chinese characters in the left midfusiform gyrus. OBJECTIVE: To verify visual processing of Chinese characters and images in the left midfusiform gyrus using functional magnetic resonance imaging. DESIGN, TIME AND SETTING: A blocked design paradigm study. Experiments were performed at the Room of Magnetic Resonance, Guangdong Provincial Second People's Hospital, China, from May to June 2009. PARTICIPANTS: A total of eight undergraduate students were recruited from Guangzhou University of China, comprising two females and six males, aged 20-23 years. The subjects were right-handed, as determined by a Chinese standard questionnaire. None of the subjects had a history of psychoneurosis, familial disease, color blindness, or color weakness. METHODS: A total of eight undergraduates were enrolled as subjects. Picture-naming and verb generation tasks were performed during functional magnetic resonance imaging. Analysis of Functional NeuroImages software was used to process the data. MAIN OUTCOME MEASURES: Visual processing of Chinese characters and images in the left midfusiform gyrus was measured. RESULTS: Picture-naming and verb generation tasks were shown to significantly activate the bilateral midfusiform gyrus. Activation occurred in the visual word form area of the left midfusiform gyrus. CONCLUSION: The left midfusiform gyrus plays a general role in visual processing of Chinese characters and images.
Abstract: Artificial neural networks can learn by example and are capable of solving problems that are hard to solve with ordinary rule-based programming. They have many design parameters that affect their performance, such as the number and sizes of the hidden layers. Large networks are slow, and small networks are generally less accurate. Tuning the neural network size is a hard task because the design space is often large and training is often a long process. We use design-of-experiments techniques to tune the recurrent neural network used in an Arabic handwriting recognition system. We show that the best results are achieved with three hidden layers and two subsampling layers. To tune the sizes of these five layers, we use a fractional factorial experiment design to limit the number of experiments to a feasible number. Moreover, we replicate each experiment configuration multiple times to overcome the randomness in the training process. The accuracy and time measurements are analyzed and modeled. The two models are then used to locate network sizes that lie on the Pareto optimal frontier. The approach described in this paper reduces the label error from 26.2% to 19.8%.
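The abstract mentions locating network sizes on the Pareto optimal frontier of accuracy and time. A minimal sketch of that selection step is shown below, assuming each candidate configuration is summarized as an (error, time) pair in which lower is better for both; the function name and the numbers in the usage example are made up for illustration and are not results from the paper.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is an (error, time) tuple; lower is better in both.
    A point q dominates p if q is no worse in both objectives and
    strictly better in at least one.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (label error, training time) pairs, for illustration only.
candidates = [(0.26, 10), (0.20, 30), (0.22, 20), (0.25, 25), (0.20, 40)]
frontier = pareto_front(candidates)
```

Here `(0.25, 25)` is dropped because `(0.22, 20)` is both more accurate and faster, and `(0.20, 40)` is dropped because `(0.20, 30)` matches its error in less time. This quadratic scan is adequate for the small number of configurations a fractional factorial design produces.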
Funding: Supported by the National Natural Science Foundation of China (Nos. 61303179, U1135005, 61175020)
Abstract: This paper designs a local and global context representation learning model for Chinese characters and proposes a Chinese word segmentation method based on character representations. First, the proposed Chinese character learning model uses the semantics of the local context and the global context to learn representations of Chinese characters. Then, a Chinese word segmentation model is built with a neural network and trained with the character representations as its input features. Finally, experimental results show that the Chinese character representations effectively capture semantic information: characters with similar semantics cluster together in the visualization space. Moreover, the proposed Chinese word segmentation model also improves precision, recall, and F-measure.
Abstract: An unsupervised framework is described that partially resolves four issues in developing a robust Chinese segmentation system: ambiguity, unknown words, knowledge acquisition, and algorithmic efficiency. It first proposes a statistical segmentation model that integrates the simplified character juncture model (SCJM) with word formation power. The advantage of this model is that it can use the affinity of characters inside or outside a word together with word formation power to perform disambiguation, and all of its parameters can be estimated in an unsupervised way. After investigating the differences between the real and theoretical sizes of the segmentation space, we apply the A* algorithm to perform segmentation without exhaustively searching all potential segmentations. Finally, an unsupervised version of Chinese word formation patterns for detecting unknown words is presented. Experiments show that the proposed methods are efficient.
Abstract: An italicized word in a character's utterance indicates that extra stress is put on the word. It is meant to convey implied meanings and therefore should be given adequate attention in the translation of novels. Through a comparative case study of three Chinese versions of Pride and Prejudice, the thesis points out that stress attached to words is often neglected in translation, and that a literal rendition of stress risks changing or deleting the original meaning or adding meanings not intended by the character and the novelist. The problem can be solved by employing lexical and syntactic means in translation.
Abstract: Character derivation means that when a graph represents several meanings, a new graph based on the original one is created to bear one or two of those meanings; this is a natural law of the Chinese writing system. The old graph is called the original character, and the newly generated one is called the derived character. Two kinds of phenomena, derivation of cognate words (同詞孳乳) and differentiation of unidentical words (異詞別異), promote the derivation of Chinese characters. In return, derived characters not only bear a meaning of the original one but also serve as the symbol of an independent word and consolidate the graph-meaning relationship. The law governing the process of character derivation deserves close attention.
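Abstract: To obtain structured phenotypic and genetic descriptions of wheat varieties, and to address the fuzzy entity boundaries and overlapping relations found in unstructured wheat germplasm data, this work proposes WGIE-DCWF (wheat germplasm information extraction model based on deep character and word fusion), a joint entity and relation extraction model for wheat germplasm information. The encoding layer improves the recognition of dense entity features through deep character-word fusion and contextual semantic feature fusion. The triple extraction layer builds a cascaded pointer network to improve the extraction of overlapping relations. A series of comparative experiments on a wheat germplasm dataset and on public datasets shows that WGIE-DCWF effectively improves joint entity and relation extraction on wheat germplasm data and generalizes well, providing technical support for building a wheat germplasm information knowledge base.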