Sequence alignment and BLAST
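1. Global alignment: aligns the two sequences over their entire length, so every base is paired with a base or a gap in the other sequence; the result therefore tends to contain more mismatches and gaps. Strict global alignment also penalizes gaps at the sequence ends; some tools, and the semi-global variant, leave end gaps unpenalized.
2. Local alignment: finds the subregions of the two sequences that match best.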
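To make the contrast between the two definitions concrete, here is a minimal sketch that computes only the optimal alignment score (not the alignment itself) using awk. It assumes a simple scoring scheme that does not come from these notes (match +1, mismatch -1, linear gap -2), and the function names nw_score and sw_score are made up for illustration. The global version follows the classic Needleman-Wunsch recurrence, so end gaps are penalized; the local version follows the Smith-Waterman recurrence.

```shell
#!/bin/sh
# Assumed scoring scheme (illustrative only): match +1, mismatch -1, gap -2.

# Global score: every base of both sequences is aligned to a base or a gap.
nw_score() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    ms = 1; mm = -1; gap = -2
    n = length(a); m = length(b)
    for (i = 0; i <= n; i++) H[i, 0] = i * gap   # leading gaps cost too
    for (j = 0; j <= m; j++) H[0, j] = j * gap
    for (i = 1; i <= n; i++)
      for (j = 1; j <= m; j++) {
        s = (substr(a, i, 1) == substr(b, j, 1)) ? ms : mm
        best = H[i-1, j-1] + s
        if (H[i-1, j] + gap > best) best = H[i-1, j] + gap
        if (H[i, j-1] + gap > best) best = H[i, j-1] + gap
        H[i, j] = best
      }
    print H[n, m]
  }'
}

# Local score: cells never drop below 0, and the best cell anywhere wins.
sw_score() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    ms = 1; mm = -1; gap = -2
    n = length(a); m = length(b); top = 0
    for (i = 1; i <= n; i++)
      for (j = 1; j <= m; j++) {
        s = (substr(a, i, 1) == substr(b, j, 1)) ? ms : mm
        best = 0                                  # restart instead of going negative
        if (H[i-1, j-1] + s > best) best = H[i-1, j-1] + s
        if (H[i-1, j] + gap > best) best = H[i-1, j] + gap
        if (H[i, j-1] + gap > best) best = H[i, j-1] + gap
        H[i, j] = best
        if (best > top) top = best
      }
    print top
  }'
}

# Same pair of sequences, scored both ways.
nw_score TTTGATTACATTT CCGATTACACC
sw_score TTTGATTACATTT CCGATTACACC
```

The local score only rewards the shared GATTACA core, while the global score also pays for the flanking bases that do not match.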
3. Multiple sequence alignment: commonly used tools include mafft, muscle, clustal-omega, and t-coffee; these perform global alignment.
4. BLAST (Basic Local Alignment Search Tool) produces local alignments. Its purpose is to search a large body of known information for similarities (hits).
5. How to use BLAST
1) Prepare a BLAST database with makeblastdb. This only needs to be done once.
2) Pick a BLAST tool (blastn, blastp, blastx, tblastn, or tblastx) as appropriate; use -h to see and choose the parameters.
3) Run the tool and format the output as needed: -outfmt 6 or 7 gives tabular output (7 adds comment lines), and custom fields can be added to the format string.
4) If there are only two sequences, no database needs to be built; blastn can perform the pairwise alignment directly: blastn -query query.fa -subject ~/refs/ebola/KM233118.fa
5) The blastn -task option selects among several modes:
blastn - finds more divergent sequences; megablast - finds less divergent sequences (the default); blastn-short - short queries
6) BLAST automatically filters out low-complexity (highly repetitive) sequence; with blastn, use -dust no to turn the filter off.
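The steps above can be sketched as a small script. This is only an illustration under stated assumptions: the file names ref.fa and query.fa, the database name ref_db, the helper names filter_hits and sample_hits, and the thresholds are all hypothetical, and the sample rows are made up rather than real BLAST results. The filter relies on the default column order of -outfmt 6 (qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore).

```shell
#!/bin/sh
# Keep tabular (-outfmt 6) hits with percent identity >= $1 and E-value <= $2.
# Column 3 is pident and column 11 is evalue in the default layout.
filter_hits() {
  awk -F '\t' -v min_pident="$1" -v max_evalue="$2" \
    '$3 + 0 >= min_pident + 0 && $11 + 0 <= max_evalue + 0'
}

# Made-up rows in -outfmt 6 layout, for illustration only (not real results).
sample_hits() {
  printf 'q1\tsubjA\t99.50\t200\t1\t0\t1\t200\t101\t300\t1e-100\t364\n'
  printf 'q1\tsubjB\t82.00\t50\t9\t0\t10\t59\t400\t449\t2e-05\t40.1\n'
  printf 'q2\tsubjA\t97.00\t150\t4\t1\t1\t150\t20\t168\t3e-60\t250\n'
}

# Demonstrate the filter on the made-up rows.
sample_hits | filter_hits 95 1e-10

# Real run, attempted only if BLAST+ and the (hypothetical) input files exist.
if command -v blastn >/dev/null 2>&1 && [ -f ref.fa ] && [ -f query.fa ]; then
  makeblastdb -in ref.fa -dbtype nucl -out ref_db      # step 1: build once
  blastn -query query.fa -db ref_db -outfmt 6 |        # steps 2-3: search, tabular
    filter_hits 95 1e-10
fi
```

Filtering after the search keeps the BLAST command itself simple; the thresholds here are placeholders and should be chosen for the data at hand.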
6. BLAST databases
1) Ready-made databases: ftp://ftp.ncbi.nlm.nih.gov/blast/db/
2
3) blastdbcmd queries BLAST databases; the -info option prints a summary of a database.
List the content of a BLAST database: blastdbcmd -db index/all -entry 'all' -outfmt "%a"
4) elink: find which publications link to a sequence:
esearch -db nuccore -query NR_118889.1 | elink -target pubmed | efetch
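5) Reformat a database listing as a tab-separated table (%a accession, %l sequence length, %T taxid, %L common taxonomic name):
blastdbcmd -db ~/refs/refseq/16SMicrobial -entry 'all' -outfmt '%a,%l,%T,%L' | tr ',' '\t'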
6) Extract a specific entry
Get the first 20 bases of a specific 16S gene: blastdbcmd -db ~/refs/refseq/16SMicrobial -entry 'NR_118889.1' -range 1-20
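Once a database listing has been reformatted into a tab-separated table, it can be summarized with ordinary text tools. The sketch below assumes the column order accession, length, taxid, name; the helper name summarize_lengths is hypothetical, and the rows fed to it are made up for illustration rather than taken from any real database.

```shell
#!/bin/sh
# Summarize the sequence-length column (column 2) of a tab-separated listing.
summarize_lengths() {
  awk -F '\t' '
    NR == 1 { min = max = $2 + 0 }              # seed min/max from the first row
    {
      len = $2 + 0
      total += len
      if (len < min) min = len
      if (len > max) max = len
    }
    END {
      if (NR > 0)
        printf "n=%d min=%d max=%d mean=%.1f\n", NR, min, max, total / NR
    }'
}

# Made-up rows: accession, length, taxid, name (illustration only).
printf 'acc1\t1500\t562\tname one\nacc2\t1400\t1280\tname two\nacc3\t1600\t562\tname three\n' |
  summarize_lengths
```

With BLAST+ installed, the same function could be appended to the reformat pipeline, for example: blastdbcmd -db ~/refs/refseq/16SMicrobial -entry 'all' -outfmt '%a,%l,%T,%L' | tr ',' '\t' | summarize_lengths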
7. extractfeat