BACKGROUND: The role of the left midfusiform gyrus as a target for visual word processing has been a topic of discussion. Numerous studies have utilized alphabetic writing for subject matter. However, few have addres...BACKGROUND: The role of the left midfusiform gyrus as a target for visual word processing has been a topic of discussion. Numerous studies have utilized alphabetic writing for subject matter. However, few have addressed visual processing of Chinese characters in the left midfusiform gyrus. OBJECTIVE: To verify visual processing of Chinese characters and images in the left midfusiform gyrus using functional magnetic resonance imaging. DESIGN, TIME AND SETTING: A blocked design paradigm study. Experiments were performed at the Room of Magnetic Resonance, Guangdong Provincial Second People's Hospital, China from May to June 2009. PARTICIPANTS: A total of eight undergraduate students were recruited from Guangzhou University of China, comprising two females and six males, aged 20-23 years. The subjects were right-handed which was determined by a Chinese standard questionnaire. None of the subjects had a history of psychoneurosis, familial disease, color blindness, or color weakness. METHODS: A total of eight undergraduates were enrolled as subjects. Picture-naming and verb generation tasks were employed through the use of functional magnetic resonance imaging. Analysis of Functional Neurolmages software was used to process the data. MAIN OUTCOME MEASURES: Visual processing of Chinese characters and images in the left midfusiform gyrus was measured. RESULTS: Picture-naming and verb generation tasks were shown to significantly activate the bilateral midfusiform gyrus. Activation occurred in the visual word form area of the left midfusiform gyrus. CONCLUSION: The left midfusiform gyrus plays a general role in visual processing of Chinese characters and images.展开更多
The purpose of this paper is to propose a new multi stage algorithm for the recognition of isolated characters. It was similar work done before using only the center of gravity (This paper is extended version of “A f...The purpose of this paper is to propose a new multi stage algorithm for the recognition of isolated characters. It was similar work done before using only the center of gravity (This paper is extended version of “A fast recognition system for isolated printed characters using center of gravity”, LAP LAMBERT Academic Publishing 2011, ISBN: 978-38465-0002-6), but here we add using principal axis in order to make the algorithm rotation invariant. In my previous work which is published in LAP LAMBERT, I face a big problem that when the character is rotated I can’t recognize the character. So this adds constrain on the document to be well oriented but here I use the principal axis in order to unify the orientation of the character set and the characters in the scanned document. The algorithm can be applied for any isolated character such as Latin, Chinese, Japanese, and Arabic characters but it has been applied in this paper for Arabic characters. The approach uses normalized and isolated characters of the same size and extracts an image signature based on the center of gravity of the character after making the character principal axis vertical, and then the system compares these values to a set of signatures for typical characters of the set. The system then provides the closeness of match to all other characters in the set.展开更多
Italicized word in character utterance, which indicates that extra stress is put on the word, is meant to convey implied meanings and therefore should be given adequate attention in the translation of novels. Through ...Italicized word in character utterance, which indicates that extra stress is put on the word, is meant to convey implied meanings and therefore should be given adequate attention in the translation of novels. Through a comparative case study of three Chinese versions of Pride and prejudice, the thesis points out that stress attached to words is often neglected in translation and that a literal rendition of stress runs the risk of changing, deleting the original meaning conveyed or adding meanings unwanted by the character and the novelist. The problem can be solved by employing lexical and syntactic means in translation.展开更多
The challenging task of handwriting style synthesis requires capturing the individuality and diversity of human handwriting.The majority of currently available methods use either a generative adversarial network(GAN)o...The challenging task of handwriting style synthesis requires capturing the individuality and diversity of human handwriting.The majority of currently available methods use either a generative adversarial network(GAN)or a recurrent neural network(RNN)to generate new handwriting styles.This is why these techniques frequently fall short of producing diverse and realistic text pictures,particularly for terms that are not commonly used.To resolve that,this research proposes a novel deep learning model that consists of a style encoder and a text generator to synthesize different handwriting styles.This network excels in generating conditional text by extracting style vectors from a series of style images.The model performs admirably on a range of handwriting synthesis tasks,including the production of text that is out-of-vocabulary.It works more effectively than previous approaches by displaying lower values on key Generative Adversarial Network evaluation metrics,such Geometric Score(GS)(3.21×10^(-5))and Fréchet Inception Distance(FID)(8.75),as well as text recognition metrics,like Character Error Rate(CER)and Word Error Rate(WER).A thorough component analysis revealed the steady improvement in image production quality,highlighting the importance of specific handwriting styles.Applicable fields include digital forensics,creative writing,and document security.展开更多
A local and global context representation learning model for Chinese characters is designed and a Chinese word segmentation method based on character representations is proposed in this paper. First, the proposed Chin...A local and global context representation learning model for Chinese characters is designed and a Chinese word segmentation method based on character representations is proposed in this paper. First, the proposed Chinese character learning model uses the semanties of loeal context and global context to learn the representation of Chinese characters. Then, Chinese word segmentation model is built by a neural network, while the segmentation model is trained with the eharaeter representations as its input features. Finally, experimental results show that Chinese charaeter representations can effectively learn the semantic information. Characters with similar semantics cluster together in the visualize space. Moreover, the proposed Chinese word segmentation model also achieves a pretty good improvement on precision, recall and f-measure.展开更多
An unsupervised framework to partially resolve the four issues, namely ambiguity, unknown word, knowledge acquisition and efficient algorithm, in developing a robust Chinese segmentation system is described. It first ...An unsupervised framework to partially resolve the four issues, namely ambiguity, unknown word, knowledge acquisition and efficient algorithm, in developing a robust Chinese segmentation system is described. It first proposes a statistical segmentation model integrating the simplified character juncture model (SCJM) with word formation power. The advantage of this model is that it can employ the affinity of characters inside or outside a word and word formation power simultaneously to process disambiguation and all the parameters can be estimated in an unsupervised way. After investigating the differences between real and theoretical size of segmentation space, we apply A * algorithm to perform segmentation without exhaustively searching all the potential segmentations. Finally, an unsupervised version of Chinese word formation patterns to detect unknown words is presented. Experiments show that the proposed methods are efficient.展开更多
ESA is an unsupervised approach to word segmentation previously proposed by Wang, which is an iterative process consisting of three phases: Evaluation, Selection and Adjustment. In this article, we propose Ex ESA, the...ESA is an unsupervised approach to word segmentation previously proposed by Wang, which is an iterative process consisting of three phases: Evaluation, Selection and Adjustment. In this article, we propose Ex ESA, the extension of ESA. In Ex ESA, the original approach is extended to a 2-pass process and the ratio of different word lengths is introduced as the third type of information combined with cohesion and separation. A maximum strategy is adopted to determine the best segmentation of a character sequence in the phrase of Selection. Besides, in Adjustment, Ex ESA re-evaluates separation information and individual information to overcome the overestimation frequencies. Additionally, a smoothing algorithm is applied to alleviate sparseness. The experiment results show that Ex ESA can further improve the performance and is time-saving by properly utilizing more information from un-annotated corpora. Moreover, the parameters of Ex ESA can be predicted by a set of empirical formulae or combined with the minimum description length principle.展开更多
Artificial neural networks have the abilities to learn by example and are capable of solving problems that are hard to solve using ordinary rule-based programming. They have many design parameters that affect their pe...Artificial neural networks have the abilities to learn by example and are capable of solving problems that are hard to solve using ordinary rule-based programming. They have many design parameters that affect their performance such as the number and sizes of the hidden layers. Large sizes are slow and small sizes are generally not accurate. Tuning the neural network size is a hard task because the design space is often large and training is often a long process. We use design of experiments techniques to tune the recurrent neural network used in an Arabic handwriting recognition system. We show that best results are achieved with three hidden layers and two subsampling layers. To tune the sizes of these five layers, we use fractional factorial experiment design to limit the number of experiments to a feasible number. Moreover, we replicate the experiment configuration multiple times to overcome the randomness in the training process. The accuracy and time measurements are analyzed and modeled. The two models are then used to locate network sizes that are on the Pareto optimal frontier. The approach described in this paper reduces the label error from 26.2% to 19.8%.展开更多
基金the Key Programming Research Project of Education Science During the 11~(th) Five-Year Plan Period of Guangdong Province, No. 06TJZ014the Programming Project of Education Science During the 11~(th) Five-Year Plan Period of Guangzhou City, No. 07B290
文摘BACKGROUND: The role of the left midfusiform gyrus as a target for visual word processing has been a topic of discussion. Numerous studies have utilized alphabetic writing for subject matter. However, few have addressed visual processing of Chinese characters in the left midfusiform gyrus. OBJECTIVE: To verify visual processing of Chinese characters and images in the left midfusiform gyrus using functional magnetic resonance imaging. DESIGN, TIME AND SETTING: A blocked design paradigm study. Experiments were performed at the Room of Magnetic Resonance, Guangdong Provincial Second People's Hospital, China from May to June 2009. PARTICIPANTS: A total of eight undergraduate students were recruited from Guangzhou University of China, comprising two females and six males, aged 20-23 years. The subjects were right-handed which was determined by a Chinese standard questionnaire. None of the subjects had a history of psychoneurosis, familial disease, color blindness, or color weakness. METHODS: A total of eight undergraduates were enrolled as subjects. Picture-naming and verb generation tasks were employed through the use of functional magnetic resonance imaging. Analysis of Functional Neurolmages software was used to process the data. MAIN OUTCOME MEASURES: Visual processing of Chinese characters and images in the left midfusiform gyrus was measured. RESULTS: Picture-naming and verb generation tasks were shown to significantly activate the bilateral midfusiform gyrus. Activation occurred in the visual word form area of the left midfusiform gyrus. CONCLUSION: The left midfusiform gyrus plays a general role in visual processing of Chinese characters and images.
文摘The purpose of this paper is to propose a new multi stage algorithm for the recognition of isolated characters. It was similar work done before using only the center of gravity (This paper is extended version of “A fast recognition system for isolated printed characters using center of gravity”, LAP LAMBERT Academic Publishing 2011, ISBN: 978-38465-0002-6), but here we add using principal axis in order to make the algorithm rotation invariant. In my previous work which is published in LAP LAMBERT, I face a big problem that when the character is rotated I can’t recognize the character. So this adds constrain on the document to be well oriented but here I use the principal axis in order to unify the orientation of the character set and the characters in the scanned document. The algorithm can be applied for any isolated character such as Latin, Chinese, Japanese, and Arabic characters but it has been applied in this paper for Arabic characters. The approach uses normalized and isolated characters of the same size and extracts an image signature based on the center of gravity of the character after making the character principal axis vertical, and then the system compares these values to a set of signatures for typical characters of the set. The system then provides the closeness of match to all other characters in the set.
文摘Italicized word in character utterance, which indicates that extra stress is put on the word, is meant to convey implied meanings and therefore should be given adequate attention in the translation of novels. Through a comparative case study of three Chinese versions of Pride and prejudice, the thesis points out that stress attached to words is often neglected in translation and that a literal rendition of stress runs the risk of changing, deleting the original meaning conveyed or adding meanings unwanted by the character and the novelist. The problem can be solved by employing lexical and syntactic means in translation.
基金supported by the National Research Foundation of Korea(NRF)Grant funded by the Korean government(MSIT)(NRF-2023R1A2C1005950).
文摘The challenging task of handwriting style synthesis requires capturing the individuality and diversity of human handwriting.The majority of currently available methods use either a generative adversarial network(GAN)or a recurrent neural network(RNN)to generate new handwriting styles.This is why these techniques frequently fall short of producing diverse and realistic text pictures,particularly for terms that are not commonly used.To resolve that,this research proposes a novel deep learning model that consists of a style encoder and a text generator to synthesize different handwriting styles.This network excels in generating conditional text by extracting style vectors from a series of style images.The model performs admirably on a range of handwriting synthesis tasks,including the production of text that is out-of-vocabulary.It works more effectively than previous approaches by displaying lower values on key Generative Adversarial Network evaluation metrics,such Geometric Score(GS)(3.21×10^(-5))and Fréchet Inception Distance(FID)(8.75),as well as text recognition metrics,like Character Error Rate(CER)and Word Error Rate(WER).A thorough component analysis revealed the steady improvement in image production quality,highlighting the importance of specific handwriting styles.Applicable fields include digital forensics,creative writing,and document security.
基金Supported by the National Natural Science Foundation of China(No.61303179,U1135005,61175020)
文摘A local and global context representation learning model for Chinese characters is designed and a Chinese word segmentation method based on character representations is proposed in this paper. First, the proposed Chinese character learning model uses the semanties of loeal context and global context to learn the representation of Chinese characters. Then, Chinese word segmentation model is built by a neural network, while the segmentation model is trained with the eharaeter representations as its input features. Finally, experimental results show that Chinese charaeter representations can effectively learn the semantic information. Characters with similar semantics cluster together in the visualize space. Moreover, the proposed Chinese word segmentation model also achieves a pretty good improvement on precision, recall and f-measure.
文摘An unsupervised framework to partially resolve the four issues, namely ambiguity, unknown word, knowledge acquisition and efficient algorithm, in developing a robust Chinese segmentation system is described. It first proposes a statistical segmentation model integrating the simplified character juncture model (SCJM) with word formation power. The advantage of this model is that it can employ the affinity of characters inside or outside a word and word formation power simultaneously to process disambiguation and all the parameters can be estimated in an unsupervised way. After investigating the differences between real and theoretical size of segmentation space, we apply A * algorithm to perform segmentation without exhaustively searching all the potential segmentations. Finally, an unsupervised version of Chinese word formation patterns to detect unknown words is presented. Experiments show that the proposed methods are efficient.
基金supported in part by National Science Foundation of China under Grants No. 61303105 and 61402304the Humanity & Social Science general project of Ministry of Education under Grants No.14YJAZH046+2 种基金the Beijing Natural Science Foundation under Grants No. 4154065the Beijing Educational Committee Science and Technology Development Planned under Grants No.KM201410028017Beijing Key Disciplines of Computer Application Technology
文摘ESA is an unsupervised approach to word segmentation previously proposed by Wang, which is an iterative process consisting of three phases: Evaluation, Selection and Adjustment. In this article, we propose Ex ESA, the extension of ESA. In Ex ESA, the original approach is extended to a 2-pass process and the ratio of different word lengths is introduced as the third type of information combined with cohesion and separation. A maximum strategy is adopted to determine the best segmentation of a character sequence in the phrase of Selection. Besides, in Adjustment, Ex ESA re-evaluates separation information and individual information to overcome the overestimation frequencies. Additionally, a smoothing algorithm is applied to alleviate sparseness. The experiment results show that Ex ESA can further improve the performance and is time-saving by properly utilizing more information from un-annotated corpora. Moreover, the parameters of Ex ESA can be predicted by a set of empirical formulae or combined with the minimum description length principle.
文摘Artificial neural networks have the abilities to learn by example and are capable of solving problems that are hard to solve using ordinary rule-based programming. They have many design parameters that affect their performance such as the number and sizes of the hidden layers. Large sizes are slow and small sizes are generally not accurate. Tuning the neural network size is a hard task because the design space is often large and training is often a long process. We use design of experiments techniques to tune the recurrent neural network used in an Arabic handwriting recognition system. We show that best results are achieved with three hidden layers and two subsampling layers. To tune the sizes of these five layers, we use fractional factorial experiment design to limit the number of experiments to a feasible number. Moreover, we replicate the experiment configuration multiple times to overcome the randomness in the training process. The accuracy and time measurements are analyzed and modeled. The two models are then used to locate network sizes that are on the Pareto optimal frontier. The approach described in this paper reduces the label error from 26.2% to 19.8%.