Vector cleaner: a novel method for vector sequence removal

Abstract

Reads of target genes produced by automated Sanger sequencing frequently contain fragments of vector sequence alongside the target gene. To remove vector sequence from such reads quickly and to convert antisense-strand reads to the sense strand, a new method was proposed and implemented in Perl as a small program, Vector cleaner. Its key feature is that it removes vector sequences in batch and converts antisense-strand sequences to sense-strand sequences.

Vector cleaner works in three steps. First, it reads the primer information and the target gene sequencing reads in batch from input files. Second, it slides a window half the primer length along each primer, one base at a time, to generate seeds, and scans each read for perfect matches to those seeds. By counting seed matches it locates the primers and deletes the vector sequence flanking them. Third, it identifies the sense strand by comparing the number of seed matches obtained with the upstream primer against the number obtained with its reverse complement, and converts antisense reads accordingly.

The method was compared with two programs of similar function, SeqMan and Seqclean, using 12 sequencing results for the cotton gene GhVIN1. The 12 sequences were amplified from Gossypium arboreum cv. JLZM and Gossypium raimondii; the cDNA fragments were cloned into the pMD19-T vector and sequenced. Seqclean screened for vector against NCBI's UniVec database with default parameters. SeqMan was given the pMD19-T plasmid sequence and run with default parameters. The outputs of Vector cleaner, SeqMan and Seqclean were analysed with the multiple sequence alignment program Clustal X.

Vector cleaner successfully removed the vector sequences from the GhVIN1 reads and exported detailed results, including primer information, product size and target gene sequence, to an Excel file. GhVIN1-1, GhVIN1-2, GhVIN1-3, GhVIN1-4, GhVIN1-7, GhVIN1-8, GhVIN1-10 and GhVIN1-12 were detected as antisense-strand sequences and were automatically converted to sense-strand sequences. GhVIN1-2, which has 2 mismatched bases in the primer region, was still identified and corrected. Compared with Seqclean and SeqMan, Vector cleaner showed higher accuracy and sensitivity. The proportion of sequences cleaned correctly by Vector cleaner, SeqMan and Seqclean was 100%, 100% and 91.6%, respectively, and the proportion of correct nucleotide bases was 99.90%, 99.00% and 94.33%, so the SeqMan and Seqclean outputs deviated more at the base level.

Vector cleaner is therefore well suited to batch removal of vector sequence from sequenced target genes in gene cloning work. Compared with similar software, it offers higher accuracy and sensitivity and converts antisense-strand sequences automatically, and it avoids a weakness of conventional vector-removal tools, which require the vector sequence as input.
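
The seed-and-count procedure summarised in the abstract can be illustrated with a short sketch. It is written in Python for brevity, whereas the published program is in Perl, and it is not the authors' code. The function names, the rule that each seed match "votes" for the primer start it implies, and the convention that the downstream primer is supplied 5'→3' (so it appears as its reverse complement on the sense strand) are all assumptions made for illustration. Unlike the program described, the sketch keeps the read's own bases in the primer region and does not correct mismatches.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_seeds(primer):
    """Slide a half-primer-length window one base at a time along the primer."""
    k = max(1, len(primer) // 2)
    return [(offset, primer[offset:offset + k])
            for offset in range(len(primer) - k + 1)]


def locate_primer(primer, read):
    """Count perfect seed matches and return (hits, start).

    Assumption: every match votes for the primer start it implies
    (match position minus seed offset); the most-voted start wins.
    Returns (0, None) when no seed matches.
    """
    votes = Counter()
    for offset, seed in make_seeds(primer):
        pos = read.find(seed)
        while pos != -1:
            votes[pos - offset] += 1
            pos = read.find(seed, pos + 1)
    if not votes:
        return 0, None
    start, hits = votes.most_common(1)[0]
    return hits, start


def clean_read(read, upstream, downstream):
    """Trim flanking vector and return (sense_sequence, was_antisense).

    Returns None if either primer cannot be located.
    """
    read, upstream, downstream = read.upper(), upstream.upper(), downstream.upper()

    # Strand check: compare seed hits for the upstream primer with
    # seed hits for its reverse complement.
    sense_hits, _ = locate_primer(upstream, read)
    anti_hits, _ = locate_primer(reverse_complement(upstream), read)
    was_antisense = anti_hits > sense_hits
    if was_antisense:
        read = reverse_complement(read)

    # On the sense strand the upstream primer marks the 5' end and the
    # reverse complement of the downstream primer marks the 3' end.
    _, up_start = locate_primer(upstream, read)
    _, down_start = locate_primer(reverse_complement(downstream), read)
    if up_start is None or down_start is None:
        return None

    begin = max(0, up_start)
    end = min(len(read), down_start + len(downstream))
    return read[begin:end], was_antisense
```

Calling `clean_read(read, upstream, downstream)` on a read returns the primer-to-primer sequence on the sense strand together with a flag saying whether the read had to be reverse-complemented. Because only perfect seed matches are counted, a primer with a couple of mismatches is still located as long as at least one half-length window remains intact, which is one plausible reading of why the method tolerates the 2-base mismatch reported for GhVIN1-2.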

Authors: Zhao Ting, Zhou Baoliang

Affiliation: State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University

Source: Journal of Nanjing Agricultural University (indexed in CAS, CSCD, PKU Core), 2014, No. 4, pp. 9-14 (6 pages)

Funding: National 973 Program of China (2011CB109300)

Keywords: target gene sequencing; vector sequence removal; Vector cleaner; Perl language

Classification: S562 [Agricultural Sciences: Crop Science]; Q811.4 [Biology: Bioengineering]
