The study and comparison of sequences of characters from a finite alphabet is relevant to various areas of science, notably molecular biology. The measurement of sequence similarity involves the consideration of the p...The study and comparison of sequences of characters from a finite alphabet is relevant to various areas of science, notably molecular biology. The measurement of sequence similarity involves the consideration of the possible sequence alignments in order to find an optimal one for which the “distance” between sequences is minimum. In biology informatics area, it is a more important and difficult problem due to the long length (100 at least) of sequence, this cause the compute complexity and large memory require. By associating a path in a lattice to each alignment, a geometric insight can be brought into the problem of finding an optimal alignment, this give an obvious encoding of each path. This problem can be solved by applying genetic algorithm, which is more efficient than dynamic programming and hidden Markov model using commomly now.展开更多
Dealing with issues such as too simple image features and word noise inference in product image sentence anmotation, a product image sentence annotation model focusing on image feature learning and key words summariza...Dealing with issues such as too simple image features and word noise inference in product image sentence anmotation, a product image sentence annotation model focusing on image feature learning and key words summarization is described. Three kernel descriptors such as gradient, shape, and color are extracted, respectively. Feature late-fusion is executed in turn by the multiple kernel learning model to obtain more discriminant image features. Absolute rank and relative rank of the tag-rank model are used to boost the key words' weights. A new word integration algorithm named word sequence blocks building (WSBB) is designed to create N-gram word sequences. Sentences are generated according to the N-gram word sequences and predefined templates. Experimental results show that both the BLEU-1 scores and BLEU-2 scores of the sentences are superior to those of the state-of-art baselines.展开更多
文摘The study and comparison of sequences of characters from a finite alphabet is relevant to various areas of science, notably molecular biology. The measurement of sequence similarity involves the consideration of the possible sequence alignments in order to find an optimal one for which the “distance” between sequences is minimum. In biology informatics area, it is a more important and difficult problem due to the long length (100 at least) of sequence, this cause the compute complexity and large memory require. By associating a path in a lattice to each alignment, a geometric insight can be brought into the problem of finding an optimal alignment, this give an obvious encoding of each path. This problem can be solved by applying genetic algorithm, which is more efficient than dynamic programming and hidden Markov model using commomly now.
基金The National Natural Science Foundation of China(No.61133012)the Humanity and Social Science Foundation of the Ministry of Education(No.12YJCZH274)+1 种基金the Humanity and Social Science Foundation of Jiangxi Province(No.XW1502,TQ1503)the Science and Technology Project of Jiangxi Science and Technology Department(No.20121BBG70050,20142BBG70011)
文摘Dealing with issues such as too simple image features and word noise inference in product image sentence anmotation, a product image sentence annotation model focusing on image feature learning and key words summarization is described. Three kernel descriptors such as gradient, shape, and color are extracted, respectively. Feature late-fusion is executed in turn by the multiple kernel learning model to obtain more discriminant image features. Absolute rank and relative rank of the tag-rank model are used to boost the key words' weights. A new word integration algorithm named word sequence blocks building (WSBB) is designed to create N-gram word sequences. Sentences are generated according to the N-gram word sequences and predefined templates. Experimental results show that both the BLEU-1 scores and BLEU-2 scores of the sentences are superior to those of the state-of-art baselines.