彭友松
Batch downloading sequences from NCBI
2016-12-30 23:16
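It has been a long time since I worked with large numbers of sequences, and I found that my old scripts no longer work, so I looked the topic up again on Bing. Below are a few of the simplest methods, as summarized by an expert. From https://edwards.sdsu.edu/research/ncbi-sequence-or-fasta-batch-download-using-entrez/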
Three easy ways to download multiple sequences from NCBI

There are several ways to download multiple sequences from the NCBI databases in a single request.

1) Using the batch Entrez website

http://www.ncbi.nlm.nih.gov/sites/batchentrez

2) Using Perl: (copy into your terminal and press return/enter)

perl -e 'use LWP::Simple;getstore("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&rettype=fasta&retmode=text&id=".join(",",qw(6701965 6701969 6702094 6702105 6702160)),"seqs.fasta");'

The one-liner takes two inputs: the IDs, separated by spaces inside qw(...), and the name of the FASTA file that will be written (seqs.fasta). If you want data from a database other than nucleotide, change the db value in the URL as well.
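For readers who prefer Python, the same request can be sketched as follows. This is a minimal sketch, not the original author's method: `build_efetch_url` and `download` are hypothetical helper names, and the `https` scheme is an assumption (the examples in this post use `http`). The endpoint and parameters are copied from the Perl one-liner.

```python
from urllib.parse import urlencode
from urllib.request import urlretrieve

# Endpoint copied from the Perl one-liner; the https scheme is an assumption.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(ids, db="nucleotide", rettype="fasta", retmode="text"):
    """Return an efetch URL for the given IDs (hypothetical helper)."""
    params = {
        "db": db,
        "rettype": rettype,
        "retmode": retmode,
        "id": ",".join(str(i) for i in ids),
    }
    # safe="," keeps the commas between IDs unescaped, as in the URLs in this post.
    return EFETCH + "?" + urlencode(params, safe=",")


def download(ids, filename="seqs.fasta", db="nucleotide"):
    """Fetch the records and save them to `filename` (needs network access)."""
    urlretrieve(build_efetch_url(ids, db=db), filename)


# Only builds and prints the URL; call download(...) to actually fetch.
print(build_efetch_url([6701965, 6701969, 6702094, 6702105, 6702160]))
```

Calling `download([6701965, 6701969], "seqs.fasta")` would write the FASTA records to seqs.fasta, and passing a different `db` value switches the database, mirroring the note above.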

3) Using your browser: (paste this to the address field)

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&rettype=fasta&retmode=text&id=6701965,6701969,6702094,6702105,6702160
This time the IDs are separated by commas. As before, if you need data from a different database, change the db parameter in the URL.
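When the ID list is long, a single URL becomes unwieldy, so one option is to split the IDs into smaller comma-separated groups and request each group separately. The sketch below only prepares the groups. `chunked` is a hypothetical helper, and the default group size of 200 is an assumption rather than a limit stated in this post.

```python
def chunked(ids, size=200):
    """Yield comma-joined groups of at most `size` IDs (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # Normalise to strings and drop empty entries such as blank lines.
    cleaned = [str(i).strip() for i in ids if str(i).strip()]
    for start in range(0, len(cleaned), size):
        yield ",".join(cleaned[start:start + size])


# Each printed line can be pasted after "id=" in the browser URL.
for group in chunked(["6701965", "6701969", "6702094", "6702105", "6702160"], size=2):
    print(group)
```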


To reprint this article, please contact the original author for permission, and note that it comes from 彭友松's ScienceNet blog.

Link: http://wap.sciencetimes.com.cn/blog-54276-1024373.html?mobile=1
