The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain. However, in engineering scenarios, achieving such high-quality label annotation is difficult and expensive. Incorrect label annotation produces two negative effects: 1) the complex decision boundary of diagnosis models lowers the generalization performance on the target domain, and 2) the distribution of target domain samples becomes misaligned with the false-labeled samples. To overcome these negative effects, this article proposes a solution called the label recovery and trajectory designable network (LRTDN). LRTDN consists of three parts. First, a residual network with dual classifiers is used to learn features from cross-domain samples. Second, an annotation check module is constructed to generate a label anomaly indicator that can modify the abnormal labels of false-labeled samples in the source domain. By training on the relabeled samples, the complexity of the diagnosis model is reduced via semi-supervised learning. Third, adaptation trajectories are designed for sample distributions across domains. This ensures that the target domain samples are adapted only with the pure-labeled samples. The LRTDN is verified by two case studies, in which the diagnosis knowledge of bearings is transferred across different working conditions as well as across different yet related machines. The results show that LRTDN offers high diagnosis accuracy even in the presence of incorrect annotation.
Brassica oleracea has been developed into many important crops, including cabbage, kale, cauliflower, and broccoli. The genome and gene annotation of cabbage (cultivar JZS), a representative morphotype of B. oleracea, have been widely used as a common reference in biological research. Although its genome assembly has been updated twice, the current gene annotation still lacks information on untranslated regions (UTRs) and alternative splicing (AS). Here, we constructed a high-quality gene annotation (JZSv3) using a full-length transcriptome acquired by nanopore sequencing, yielding a total of 59452 genes and 75684 transcripts. Additionally, we re-analyzed previously reported transcriptome data related to the development of different tissues and to cold response using JZSv3 as a reference, and found that 3843 out of 11908 differentially expressed genes (DEGs) underwent AS during the development of different tissues and 309 out of 903 cold-related genes underwent AS in response to cold stress. We also identified many AS genes, including BolLHCB5 and BolHSP70, that displayed distinct expression patterns among variant transcripts of the same gene, highlighting the importance of JZSv3 as a pivotal reference for AS analysis. Overall, JZSv3 provides a valuable resource for exploring gene function, especially for obtaining a deeper understanding of AS regulation mechanisms.
● AIM: To investigate a pioneering framework for the segmentation of meibomian glands (MGs), using limited annotations to reduce the workload on ophthalmologists and enhance the efficiency of clinical diagnosis. ● METHODS: A total of 203 infrared meibomian images from 138 patients with dry eye disease, accompanied by corresponding annotations, were gathered for the study. A rectified scribble-supervised gland segmentation (RSSGS) model, incorporating temporal ensemble prediction, uncertainty estimation, and a transformation equivariance constraint, was introduced to address the constraints imposed by the limited supervision information inherent in scribble annotations. The viability and efficacy of the proposed model were assessed based on accuracy, intersection over union (IoU), and Dice coefficient. ● RESULTS: Using manual labels as the gold standard, RSSGS achieved an accuracy of 93.54%, a Dice coefficient of 78.02%, and an IoU of 64.18%. Notably, these performance metrics exceed the current weakly supervised state-of-the-art methods by 0.76%, 2.06%, and 2.69%, respectively. Furthermore, despite achieving a substantial 80% reduction in annotation costs, it lags behind fully annotated methods by only 0.72%, 1.51%, and 2.04%. ● CONCLUSION: An innovative automatic segmentation model is developed for MGs in infrared eyelid images, using scribble annotation for training. This model maintains an exceptionally high level of segmentation accuracy while substantially reducing training costs. It holds substantial utility for calculating clinical parameters, thereby greatly enhancing the diagnostic efficiency of ophthalmologists in evaluating meibomian gland dysfunction.
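The three evaluation metrics named in this abstract (accuracy, IoU, and Dice coefficient) can be illustrated with a minimal Python sketch. It assumes binary masks flattened into 0/1 sequences; the function name `segmentation_scores` and the choice to score two empty masks as a perfect match are illustrative assumptions, not details taken from the paper.

```python
def segmentation_scores(pred, truth):
    """Return (accuracy, iou, dice) for two equally sized binary masks.

    Both masks are flat sequences of 0/1 values (hypothetical input format).
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("masks must be non-empty and of equal length")

    # Count the four cells of the pixel-wise confusion matrix.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = len(pred) - tp - fp - fn

    accuracy = (tp + tn) / len(pred)
    union = tp + fp + fn
    # Assumption: two empty masks are treated as a perfect match.
    iou = tp / union if union else 1.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return accuracy, iou, dice


if __name__ == "__main__":
    print(segmentation_scores([1, 1, 0, 0], [1, 0, 1, 0]))
```

For a single pair of masks the two overlap scores are related by Dice = 2·IoU / (1 + IoU); values averaged over many images, as reported in abstracts like this one, need not satisfy that identity exactly.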
This paper discusses the placement of Chinese annotation from the point of view of graphics. Area features are classified as simple polygons, complex polygons, and special polygons. For simple polygons, annotations are placed along the longest edge. For complex polygons, the polygon is first simplified according to close points, the longest diagonal is then found by comparing lengths, and finally the annotations are placed along the longest diagonal. For special polygons, the polygon is partitioned into several parts by a certain rule to obtain their sub-diagonals, and the annotations are then placed using the second method.
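The simple-polygon rule (place the annotation along the longest edge) can be sketched in Python as follows. This is only an illustration under stated assumptions: vertices are given in order as (x, y) pairs, the label is anchored at the midpoint of the longest edge, and it is rotated to the direction of that edge. The function names `longest_edge` and `label_placement` are hypothetical, and the paper's actual placement details (offsets, character spacing) are not given in the abstract.

```python
import math


def longest_edge(vertices):
    """Return (length, start, end) of the longest edge of a closed polygon.

    `vertices` is a list of (x, y) tuples in order; the last vertex is
    implicitly connected back to the first.
    """
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    best = None
    for i, start in enumerate(vertices):
        end = vertices[(i + 1) % len(vertices)]
        length = math.dist(start, end)
        # Keep the first edge found when several edges tie for longest.
        if best is None or length > best[0]:
            best = (length, start, end)
    return best


def label_placement(vertices):
    """Return (midpoint, angle_in_degrees, edge_length) for the label."""
    length, (x1, y1), (x2, y2) = longest_edge(vertices)
    midpoint = ((x1 + x2) / 2, (y1 + y2) / 2)
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    return midpoint, angle, length


if __name__ == "__main__":
    print(label_placement([(0, 0), (4, 0), (4, 2), (0, 2)]))
```

The same comparison-by-length loop could be applied to diagonals instead of edges for the complex-polygon case, once the polygon has been simplified.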
In machine learning, sentiment analysis is a technique to find and analyze the sentiments hidden in text. Annotated data is a basic requirement for sentiment analysis, and this data is generally annotated manually. Manual annotation is a time-consuming, costly, and laborious process. To overcome these resource constraints, this research proposes a fully automated annotation technique for aspect-level sentiment analysis. The dataset is created from the reviews of the ten most popular songs on YouTube. Reviews of five aspects (voice, video, music, lyrics, and song) are extracted, and an N-gram based technique is proposed. The complete dataset consists of 369436 reviews that took 173.53 s to annotate using the proposed technique, whereas the same dataset might have taken approximately 2.07 million seconds (575 h) to annotate manually. To validate the proposed technique, a sub-dataset, Voice, is annotated both manually and with the proposed technique. Cohen's Kappa statistic is used to evaluate the degree of agreement between the two annotations. The high Kappa value (i.e., 0.9571) shows a high level of agreement between the two. This validates that the quality of annotation of the proposed technique is as good as manual annotation, at far lower cost. This research also contributes to consolidating the guidelines for the manual annotation process.
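Cohen's Kappa, the agreement statistic used in this abstract, compares the observed agreement between two annotations with the agreement expected by chance. The following Python sketch shows the standard formula; the function name `cohens_kappa` and the toy labels are illustrative assumptions and are not taken from the paper's dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotations of the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from the label frequencies.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("annotations must be non-empty and of equal length")

    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two annotators' label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    manual = ["pos", "pos", "neg", "neg"]
    automatic = ["pos", "neg", "neg", "neg"]
    print(cohens_kappa(manual, automatic))
```

Kappa ranges from -1 to 1, with 1 meaning perfect agreement, so a value close to 1 indicates that the automatic labels closely match the manual ones.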
Every day, websites and personal archives accumulate more and more photos, and the size of these archives is immeasurable. The ease of use of these huge digital image collections contributes to their popularity. However, not all of these collections provide relevant indexing information, so it is difficult to find the data a user is interested in. Therefore, to determine the significance of the data, it is important to identify the contents in an informative manner. Image annotation is one of the most challenging problems in multimedia research and computer vision. Hence, in this paper, an Adaptive Convolutional Deep Learning Model (ACDLM) is developed for automatic image annotation. Initially, the databases (Corel 5K, MSRC v2) are collected from an open-source system; they consist of some labelled images (for the training phase) and some unlabeled images. After that, the images are sent to the pre-processing step, which includes colour space quantization and a texture colour class map. The pre-processed images are segmented using J-image segmentation (JSEG) for efficient labelling. The final step is automatic annotation using ACDLM, which is a combination of a Convolutional Neural Network (CNN) and the Honey Badger Algorithm (HBA). Based on the proposed classifier, the unlabeled images are labelled. The proposed methodology is implemented in MATLAB, and performance is evaluated by metrics such as accuracy, precision, recall, and F1-measure.
Objective: To investigate the effect of Guangdong Shenqu (GSQ) on intestinal flora structure in mice with food stagnation through 16S rDNA sequencing. Methods: Mice were randomly assigned to control, model, GSQ low-dose (GSQL), GSQ medium-dose (GSQM), GSQ high-dose (GSQH), and lacidophilin tablets (LAB) groups, with each group containing 10 mice. A food stagnation and internal heat mouse model was established through intragastric administration of a mixture of beeswax and olive oil (1:15). The control group was administered normal saline, and the model group was administered beeswax and olive oil to maintain the model state. The GSQL (2 g/kg), GSQM (4 g/kg), GSQH (8 g/kg), and LAB (0.625 g/kg) groups were administered the corresponding drugs for 5 d. After administration, 16S rDNA sequencing was performed to assess gut microbiota in mouse fecal samples. Results: The model group exhibited significant intestinal flora changes. Following GSQ administration, the abundance and diversity index of the intestinal flora increased significantly, the number of bacterial species was regulated, and α and β diversity were improved. GSQ administration increased the abundance of probiotics, including Clostridia, Lachnospirales, and Lactobacillus, whereas the abundance of conditional pathogenic bacteria, such as Allobaculum, Erysipelotrichaceae, and Bacteroides, decreased. Functional prediction analysis indicated that the pathogenesis of food stagnation and GSQ intervention were primarily associated with carbohydrate, lipid, and amino acid metabolism, among other metabolic pathways. Conclusion: The digestive mechanism of GSQ may be attributed to its role in restoring diversity and abundance within the intestinal flora, thereby improving the composition and structure of the intestinal flora in mice and subsequently influencing the regulation of metabolic pathways.
Translation is an important medium of cultural communication. It is not a mere transfer between two languages but an interaction between two cultures. Cultural misreading, which results from cultural discrepancy and the translator's subjectivity, reveals where the barriers and conflicts in cultural communication lie. Cultural misreading is an objective phenomenon that exists throughout the process of translation. This paper makes a comprehensive analysis and discussion of The History of the Former Han Dynasty: a Critical Translation with Annotations, translated by Homer Hasenpflug Dubs. It divides the causes of cultural misreading into three types: language, habits of thinking, and traditional culture. It is hoped that this paper will draw more attention from translation circles to this phenomenon and contribute to the development of literary translation.
To deal with issues such as overly simple image features and word noise interference in product image sentence annotation, a product image sentence annotation model focusing on image feature learning and key word summarization is described. Three kernel descriptors (gradient, shape, and color) are extracted. Late feature fusion is then performed by a multiple kernel learning model to obtain more discriminative image features. The absolute rank and relative rank of the tag-rank model are used to boost the weights of the key words. A new word integration algorithm named word sequence blocks building (WSBB) is designed to create N-gram word sequences. Sentences are generated according to the N-gram word sequences and predefined templates. Experimental results show that both the BLEU-1 and BLEU-2 scores of the sentences are superior to those of the state-of-the-art baselines.
In bioinformatics, it is very important to use computers to perform functional annotation of newly sequenced biological sequences. Based on the GO database and the BLAST program, a novel method for the functional annotation of new biological sequences is presented using variable-precision rough set theory. The proposed method is applied to real data in the GO database to examine its effectiveness. Numerical results show that the proposed method achieves better precision, recall rate, and harmonic mean value than existing methods.
Representing the relationships between ontologies is the key problem of semantic annotation based on multiple ontologies. Traditional approaches can only denote simple concept subsumption relations between ontologies. By analyzing and classifying the relationships between ontologies, the idea of a bridge ontology is proposed, which can express complex relationships between concepts and relationships between relations across multiple ontologies. A new approach employing the bridge ontology is also proposed to deal with the multi-ontology-based semantic annotation problem. The bridge ontology is a special ontology that can be created and maintained conveniently and is effective in multi-ontology-based semantic annotation. The approach is low-cost, scalable, and robust in the web environment, and it avoids unnecessary ontology extension and integration. Key words: semantic web; bridge ontology; multi-ontologies; semantic annotation. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), the National Grand Fundamental Research 973 Program of China (2002CB312000), and the National Research Foundation for the Doctoral Program of Higher Education of China (20020286004). Biography: WANG Peng (1977-), male, Ph.D. candidate; research direction: semantic web, ontology, and knowledge representation on the Web.
The Chinese tree shrew (Tupaia belangeri chinensis) is emerging as an important experimental animal in multiple fields of biomedical research. Comprehensive reference genome annotation for both mRNA and long non-coding RNA (lncRNA) is crucial for developing animal models using this species. In the current study, we collected a total of 234 high-quality RNA sequencing (RNA-seq) datasets and two long-read isoform sequencing (ISO-seq) datasets and improved the annotation of our previously assembled high-quality chromosome-level tree shrew genome. We obtained a total of 3514 newly annotated coding genes and 50576 lncRNA genes. We also characterized the tissue-specific expression patterns and alternative splicing patterns of mRNAs and lncRNAs and mapped the orthologous relationships among 11 mammalian species using the current annotated genome. We identified 144 tree shrew-specific gene families, including interleukin 6 (IL6) and STT3 oligosaccharyltransferase complex catalytic subunit B (STT3B), which underwent significant changes in size. Comparison of the overall expression patterns in tissues and pathways across four species (human, rhesus monkey, tree shrew, and mouse) indicated that tree shrews are more similar to primates than to mice at the tissue-transcriptome level. Notably, the newly annotated purine rich element binding protein A (PURA) gene and the STT3B gene family showed dysregulation upon viral infection. The updated version of the tree shrew genome annotation (KIZ version 3: TS_3.0) is available at http://www.treeshrewdb.org and provides an essential reference for basic and biomedical studies using tree shrew animal models.
Since the publication of this article, the authors have noticed that the GeneIDs from the new and original genome annotations do not match in Table S6; the correct Table S6 is given here. The authors would like to apologize for this error.
To address the difficulty of obtaining semantic information from each problem in problem set archives, we propose a new method of ontology-based semantic annotation for problem set archives, which uses a programming knowledge domain ontology to add semantic annotations to problems on the Web. The system we developed adds a semantic annotation to each problem in the form of Extensible Markup Language (XML). Our method overcomes the difficulty of extracting semantics from problem set archives, and its efficiency is demonstrated through a case study. With semantic annotations of problems, a student can efficiently locate the problems that logically correspond to his or her knowledge.
Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. For the purpose of mining CRC metastasis-associated genes, an integrated analysis of microarray data is presented, combined with evidence acquired from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected using CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes were selected in their corresponding chromosomal aberration regions and analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the minimum number (14) of genes and the highest classification accuracy (100%) in both the PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrate that integrated analysis is an effective strategy for mining cancer-associated genes.
In recent years, the multimedia annotation problem has attracted significant research attention in the multimedia and computer vision areas, especially automatic image annotation, whose purpose is to provide an efficient and effective search environment in which users can query their images more easily. In this paper, a semi-supervised learning based probabilistic latent semantic analysis (PLSA) model for automatic image annotation is presented. Since it is often hard to obtain or create labeled images in large quantities while unlabeled ones are easier to collect, a transductive support vector machine (TSVM) is exploited to enhance the quality of the training image data. In addition, image features with different magnitudes lead to different performance in automatic image annotation. To this end, a Gaussian normalization method is used to normalize the different features extracted from effective image regions segmented by the normalized cuts algorithm, so as to preserve the intrinsic content of images as completely as possible. Finally, a PLSA model with asymmetric modalities is constructed based on the expectation maximization (EM) algorithm to predict a candidate set of annotations with confidence scores. Extensive experiments on the general-purpose Corel5k dataset demonstrate that the proposed model can significantly improve on the performance of traditional PLSA for the task of automatic image annotation.
A novel image auto-annotation method is presented based on the probabilistic latent semantic analysis (PLSA) model and multiple Markov random fields (MRF). A PLSA model with asymmetric modalities is first constructed to estimate the joint probability between images and semantic concepts; a subgraph is then extracted to serve as the corresponding structure of the Markov random fields, and inference over it is performed by iterated conditional modes to obtain the final annotation for the image. The novelty of the method lies mainly in two aspects: exploiting PLSA to estimate the joint probability between images and semantic concepts, and using multiple MRF to further explore the semantic context among keywords for accurate image annotation. To demonstrate the effectiveness of this approach, an experiment on the Corel5k dataset is conducted, and its results compare favorably with current state-of-the-art approaches.
As the sequencing stage of the Human Genome Project nears its end, work has begun on discovering novel genes from genome sequences and annotating their biological functions. The current major bioinformatics tools and technologies available for large-scale gene discovery and annotation from human genome sequences are reviewed here. Some ideas about possible future development are also provided.
Funding (LRTDN fault diagnosis study): the National Key R&D Program of China (2022YFB3402100); the National Science Fund for Distinguished Young Scholars of China (52025056); the National Natural Science Foundation of China (52305129); the China Postdoctoral Science Foundation (2023M732789); the China Postdoctoral Innovative Talents Support Program (BX20230290); the Open Foundation of Hunan Provincial Key Laboratory of Health Maintenance for Mechanical Equipment (2022JXKF JJ01); and the Fundamental Research Funds for Central Universities.
Funding (JZSv3 cabbage gene annotation study): supported by the National Natural Science Foundation of China (Grant Nos. 31972411, 31722048, and 31630068); the Central Public-interest Scientific Institution Basal Research Fund (Grant No. Y2022PT23); the Innovation Program of the Chinese Academy of Agricultural Sciences; and the Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture and Rural Affairs, P.R. China. Also supported by NIFA, the Department of Agriculture, via UC-Berkeley, USA.
Funding (meibomian gland segmentation study): supported by the Natural Science Foundation of Fujian Province (No. 2020J011084); the Fujian Province Technology and Economy Integration Service Platform (No. 2023XRH001); and the Fuzhou-Xiamen-Quanzhou National Independent Innovation Demonstration Zone Collaborative Innovation Platform (No. 2022FX5).
文摘This paper discusses the placement of Chinese annotation from point of view of graphics. Area Feature is classified as simple polygon, complex polygon and special polygon. For simple ones, annotations are placed along the longest edge. For complex ones, firstly the polygon are simplified according to close points, then the longest diagonal is gotten by comparing length, lastly, annotations are placed along long diagonal. For special ones, the polygon are partitioned into several parts by a certain rule for getting their sub\|diagonals, then their annotation are placed by means of the second.
文摘In machine learning,sentiment analysis is a technique to find and analyze the sentiments hidden in the text.For sentiment analysis,annotated data is a basic requirement.Generally,this data is manually annotated.Manual annotation is time consuming,costly and laborious process.To overcome these resource constraints this research has proposed a fully automated annotation technique for aspect level sentiment analysis.Dataset is created from the reviews of ten most popular songs on YouTube.Reviews of five aspects—voice,video,music,lyrics and song,are extracted.An N-Gram based technique is proposed.Complete dataset consists of 369436 reviews that took 173.53 s to annotate using the proposed technique while this dataset might have taken approximately 2.07 million seconds(575 h)if it was annotated manually.For the validation of the proposed technique,a sub-dataset—Voice,is annotated manually as well as with the proposed technique.Cohen’s Kappa statistics is used to evaluate the degree of agreement between the two annotations.The high Kappa value(i.e.,0.9571%)shows the high level of agreement between the two.This validates that the quality of annotation of the proposed technique is as good as manual annotation even with far less computational cost.This research also contributes in consolidating the guidelines for the manual annotation process.
Abstract: Every day, websites and personal archives create more and more photos. The size of these archives is immeasurable. The ease of use of these huge digital image collections contributes to their popularity. However, not all of these collections provide relevant indexing information. As a result, it is difficult to discover the data that the user is interested in. Therefore, in order to determine the significance of the data, it is important to identify the contents in an informative manner. Image annotation is one of the most challenging problems in multimedia research and computer vision. Hence, in this paper, an Adaptive Convolutional Deep Learning Model (ACDLM) is developed for automatic image annotation. Initially, the databases (Corel 5K, MSRC v2) are collected from open-source systems and consist of some labelled images (for the training phase) and some unlabeled images. After that, the images are sent to the pre-processing step, which includes colour space quantization and texture colour class mapping. The pre-processed images are sent to the segmentation stage, which uses J-image segmentation (JSEG) for efficient labelling. The final step is automatic annotation using ACDLM, which is a combination of a Convolutional Neural Network (CNN) and the Honey Badger Algorithm (HBA). Based on the proposed classifier, the unlabeled images are labelled. The proposed methodology is implemented in MATLAB, and performance is evaluated by performance metrics such as accuracy, precision, recall, and F1_Measure. With the assistance of the proposed methodology, the unlabeled images are labelled.
Funding: Supported by the National Natural Science Foundation of China (81872995).
Abstract: Objective: To investigate the effect of Guangdong Shenqu (GSQ) on intestinal flora structure in mice with food stagnation through 16S rDNA sequencing. Methods: Mice were randomly assigned to control, model, GSQ low-dose (GSQL), GSQ medium-dose (GSQM), GSQ high-dose (GSQH), and lacidophilin tablets (LAB) groups, with each group containing 10 mice. A food stagnation and internal heat mouse model was established through intragastric administration of a mixture of beeswax and olive oil (1:15). The control group was administered normal saline, and the model group was administered beeswax and olive oil to maintain the model state. The GSQL (2 g/kg), GSQM (4 g/kg), GSQH (8 g/kg), and LAB (0.625 g/kg) groups were administered the corresponding drugs for 5 d. After administration, 16S rDNA sequencing was performed to assess gut microbiota in mouse fecal samples. Results: The model group exhibited significant intestinal flora changes. Following GSQ administration, the abundance and diversity index of the intestinal flora increased significantly, the number of bacterial species was regulated, and α and β diversity were improved. GSQ administration increased the abundance of probiotics, including Clostridia, Lachnospirales, and Lactobacillus, whereas the abundance of conditional pathogenic bacteria, such as Allobaculum, Erysipelotrichaceae, and Bacteroides, decreased. Functional prediction analysis indicated that the pathogenesis of food stagnation and GSQ intervention were primarily associated with carbohydrate, lipid, and amino acid metabolism, among other metabolic pathways. Conclusion: The digestive mechanism of GSQ may be attributed to its role in restoring diversity and abundance within the intestinal flora, thereby improving the composition and structure of the intestinal flora in mice and subsequently influencing the regulation of metabolic pathways.
Abstract: Translation is an important medium of cultural communication. It is not a mere transfer between two languages, but an interaction between two cultures. Cultural misreading, which results from cultural discrepancy and the translator's subjectivity, reveals where the barriers and conflicts in cultural communication lie. Cultural misreading is an objective phenomenon that exists throughout the entire process of translation. This paper presents a comprehensive analysis and discussion of The History of the Former Han Dynasty: a Critical Translation with Annotations, translated by Homer Hasenpflug Dubs. This paper divides the causes of cultural misreading into three types: language, thinking habits, and traditional culture. It is hoped that this paper will draw more attention from the translation community to this phenomenon and contribute to the development of literary translation.
Funding: The National Natural Science Foundation of China (No. 61133012); the Humanity and Social Science Foundation of the Ministry of Education (No. 12YJCZH274); the Humanity and Social Science Foundation of Jiangxi Province (No. XW1502, TQ1503); the Science and Technology Project of Jiangxi Science and Technology Department (No. 20121BBG70050, 20142BBG70011).
Abstract: To deal with issues such as overly simple image features and word noise interference in product image sentence annotation, a product image sentence annotation model focusing on image feature learning and key word summarization is described. Three kernel descriptors (gradient, shape, and color) are extracted. Late fusion of the features is then performed by the multiple kernel learning model to obtain more discriminative image features. The absolute rank and relative rank of the tag-rank model are used to boost the key words' weights. A new word integration algorithm named word sequence blocks building (WSBB) is designed to create N-gram word sequences. Sentences are generated according to the N-gram word sequences and predefined templates. Experimental results show that both the BLEU-1 scores and BLEU-2 scores of the sentences are superior to those of the state-of-the-art baselines.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60673023, 60433020, 10501017, 3040016 and the European Commission for TH/Asia Link/010 under Grant No. 111084.
Abstract: In bioinformatics, it is very important to use computers to perform function annotation for newly sequenced biological sequences. Based on the GO database and the BLAST program, a novel method for the function annotation of new biological sequences is presented using variable-precision rough set theory. The proposed method is applied to real data in the GO database to examine its effectiveness. Numerical results show that the proposed method has better precision, recall rate, and harmonic mean value compared with existing methods.
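The three evaluation measures named above (precision, recall rate, and their harmonic mean) can be computed for a set of predicted function annotations as sketched below. The function name `precision_recall_f1` and the example term identifiers are illustrative assumptions; the abstract does not give the evaluation procedure in detail.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall, and their harmonic mean for two sets of annotation terms."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)                 # correctly predicted terms
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    # The harmonic mean is defined as 0 when there are no correct predictions.
    harmonic = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, harmonic


if __name__ == "__main__":
    # Hypothetical term sets used only to exercise the function.
    predicted_terms = {"GO:0003677", "GO:0005634", "GO:0006355"}
    known_terms = {"GO:0003677", "GO:0005634", "GO:0008270", "GO:0046872"}
    print(precision_recall_f1(predicted_terms, known_terms))
```

The harmonic mean penalizes a large gap between precision and recall, so a method cannot score well by predicting very few or very many terms.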
Abstract: Representing the relationships between ontologies is the key problem of semantic annotation based on multiple ontologies. Traditional approaches can only denote simple concept subsumption relations between ontologies. By analyzing and classifying the relationships between ontologies, the idea of a bridge ontology is proposed, which can express the complex relationships between concepts and the relationships between relations in multi-ontologies. Meanwhile, a new approach employing the bridge ontology is proposed to deal with the multi-ontologies-based semantic annotation problem. The bridge ontology is a peculiar ontology that can be created and maintained conveniently and is effective in multi-ontologies-based semantic annotation. The approach using the bridge ontology is low-cost, scalable, and robust in the web environment, and avoids unnecessary ontology extension and integration. Key words: semantic web; bridge ontology; multi-ontologies; semantic annotation. CLC number: TP 391. Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), the National Grand Fundamental Research 973 Program of China (2002CB312000), and the National Research Foundation for the Doctoral Program of Higher Education of China (20020286004). Biography: WANG Peng (1977-), male, Ph.D. candidate, research direction: semantic web, ontology, and knowledge representation on the Web.
Funding: This study was supported by the National Natural Science Foundation of China (U1902215 to Y.G.Y. and 31970542 to Y.F.), the Chinese Academy of Sciences (Light of West China Program xbzg-zdsys-201909 to Y.G.Y.), and Yunnan Province (202001AS070023 and 2018FB046 to D.D.Y. and 202002AA100007 to Y.G.Y.).
Abstract: The Chinese tree shrew (Tupaia belangeri chinensis) is emerging as an important experimental animal in multiple fields of biomedical research. Comprehensive reference genome annotation for both mRNA and long non-coding RNA (lncRNA) is crucial for developing animal models using this species. In the current study, we collected a total of 234 high-quality RNA sequencing (RNA-seq) datasets and two long-read isoform sequencing (ISO-seq) datasets and improved the annotation of our previously assembled high-quality chromosome-level tree shrew genome. We obtained a total of 3514 newly annotated coding genes and 50576 lncRNA genes. We also characterized the tissue-specific expression patterns and alternative splicing patterns of mRNAs and lncRNAs and mapped the orthologous relationships among 11 mammalian species using the current annotated genome. We identified 144 tree shrew-specific gene families, including interleukin 6 (IL6) and STT3 oligosaccharyltransferase complex catalytic subunit B (STT3B), which underwent significant changes in size. Comparison of the overall expression patterns in tissues and pathways across four species (human, rhesus monkey, tree shrew, and mouse) indicated that tree shrews are more similar to primates than to mice at the tissue-transcriptome level. Notably, the newly annotated purine rich element binding protein A (PURA) gene and the STT3B gene family showed dysregulation upon viral infection. The updated version of the tree shrew genome annotation (KIZ version 3: TS_3.0) is available at http://www.treeshrewdb.org and provides an essential reference for basic and biomedical studies using tree shrew animal models.
Abstract: Since the publication of this article, the authors have noticed that the GeneIDs from the new and original genome annotations do not match in Table S6; the correct Table S6 is given here. The authors would like to apologize for this error.
Funding: Supported by the National Natural Science Foundation of China (60273051).
Abstract: Aiming at the difficulty of obtaining semantic information from each problem in problem set archives, we propose a new method of ontology-based semantic annotation for problem set archives, which utilizes a programming knowledge domain ontology to add semantic annotations to problems on the Web. The system we developed adds a semantic annotation to each problem in the form of Extensible Markup Language (XML). Our method overcomes the difficulty of extracting semantics from problem set archives, and the efficiency of this method is demonstrated through a case study. With semantic annotations of problems, a student can efficiently locate the problems that logically correspond to his or her knowledge.
Funding: Supported by a grant from the National Natural Science Foundation of China (Grant No. 61373057) and a grant from the Zhejiang Provincial Natural Science Foundation of China (Grant No. Y1110763).
Abstract: Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. For the purpose of mining CRC metastasis-associated genes, an integrated analysis of microarray data is presented, combined with evidence acquired from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected using CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes were selected in their corresponding chromosomal aberration regions and analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-related genes in CRC. Results: A minimum gene set was obtained with the minimum number [14] of genes and the highest classification accuracy (100%) in both the PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrate that integrated analysis is an effective strategy for mining cancer-associated genes.
Funding: Supported by the National Program on Key Basic Research Project (No. 2013CB329502); the National Natural Science Foundation of China (No. 61202212); the Special Research Project of the Educational Department of Shaanxi Province of China (No. 15JK1038); the Key Research Project of Baoji University of Arts and Sciences (No. ZK16047).
Abstract: In recent years, the multimedia annotation problem has attracted significant research attention in the multimedia and computer vision areas, especially automatic image annotation, whose purpose is to provide an efficient and effective searching environment in which users can query their images more easily. In this paper, a semi-supervised learning based probabilistic latent semantic analysis (PLSA) model for automatic image annotation is presented. Since it is often hard to obtain or create labeled images in large quantities while unlabeled ones are easier to collect, a transductive support vector machine (TSVM) is exploited to enhance the quality of the training image data. In addition, image features with different magnitudes result in different performance for automatic image annotation. To this end, a Gaussian normalization method is used to normalize the different features extracted from effective image regions segmented by the normalized cuts algorithm, so as to preserve the intrinsic content of images as completely as possible. Finally, a PLSA model with asymmetric modalities is constructed based on the expectation maximization (EM) algorithm to predict a candidate set of annotations with confidence scores. Extensive experiments on the general-purpose Corel5k dataset demonstrate that the proposed model can significantly improve the performance of traditional PLSA for the task of automatic image annotation.
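The abstract above does not spell out its Gaussian normalization step. One common form, shown here purely as an assumption, centers each feature on its mean, divides by a multiple of its standard deviation, and clips the result so that features with very different magnitudes end up on a comparable scale. The function name `gaussian_normalize` and the factor `k=3.0` are illustrative choices, not taken from the paper.

```python
import statistics


def gaussian_normalize(column, k=3.0):
    """Scale one feature column to roughly [-1, 1] using its mean and std.

    Assumed form: (x - mean) / (k * std), clipped to [-1, 1].
    """
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)        # population standard deviation
    if std == 0:
        return [0.0] * len(column)         # a constant feature carries no signal
    return [max(-1.0, min(1.0, (x - mean) / (k * std))) for x in column]


if __name__ == "__main__":
    color_feature = [0.0, 10.0, 20.0]          # hypothetical small-magnitude feature
    texture_feature = [0.0, 1000.0, 2000.0]    # hypothetical large-magnitude feature
    print(gaussian_normalize(color_feature))
    print(gaussian_normalize(texture_feature))
```

After this scaling, both hypothetical features map to the same normalized values, so neither dominates a distance computation simply because of its units.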
Funding: Supported by the National Basic Research Priorities Program (No. 2013CB329502); the National High-tech R&D Program of China (No. 2012AA011003); the National Natural Science Foundation of China (No. 61035003, 61072085, 60933004, 60903141); the National Science and Technology Support Program of China (No. 2012BA107B02).
Abstract: A novel image auto-annotation method is presented based on the probabilistic latent semantic analysis (PLSA) model and multiple Markov random fields (MRF). A PLSA model with asymmetric modalities is first constructed to estimate the joint probability between images and semantic concepts. Then a subgraph is extracted to serve as the corresponding structure of the Markov random fields, and inference over it is performed by iterated conditional modes to obtain the final annotation for the image. The novelty of our method lies mainly in two aspects: exploiting PLSA to estimate the joint probability between images and semantic concepts, and using multiple MRFs to further explore the semantic context among keywords for accurate image annotation. To demonstrate the effectiveness of this approach, an experiment on the Corel5k dataset is conducted, and its results compare favorably with the current state-of-the-art approaches.
Funding: Supported by the National Natural Science Foundation of China (No. 39980005).
Abstract: As the sequencing stage of the Human Genome Project nears its end, work has begun on discovering novel genes from genome sequences and annotating their biological functions. This paper reviews the current major bioinformatics tools and technologies available for large-scale gene discovery and annotation from human genome sequences. Some ideas about possible future developments are also provided.