Map SNP IDs to genome coordinates
Here is a solution using the Bioconductor package biomaRt. It is a slightly corrected and reformatted version of the previously posted code.
library(biomaRt) # biomaRt_2.30.0
snp_mart = useMart("ENSEMBL_MART_SNP", dataset="hsapiens_snp")
snp_ids = c("rs16828074", "rs17232800")
snp_attributes = c("refsnp_id", "chr_name", "chrom_start")
snp_locations = getBM(attributes=snp_attributes, filters="snp_filter",
                      values=snp_ids, mart=snp_mart)
snp_locations
#    refsnp_id chr_name chrom_start
# 1 rs16828074        2   232318754
# 2 rs17232800       18    66292259
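A result like this does not always have exactly one row per requested ID: IDs that are not found are simply absent, and a variant may also be reported on alternate sequences or patches in addition to the primary assembly. The sketch below, in base R only, shows one possible way to tidy such a result. The data frame is typed in by hand to mimic getBM() output; its third row, the ID rs0000000 and the helper name tidy_snp_locations are made up for illustration and are not part of biomaRt.

```r
# Hand-typed stand-in for the data frame returned by getBM().
# The first two rows repeat the output shown above; the third row is
# invented to illustrate a hit on a non-primary sequence.
snp_locations <- data.frame(
  refsnp_id   = c("rs16828074", "rs17232800", "rs17232800"),
  chr_name    = c("2", "18", "CHR_HYPOTHETICAL_ALT"),
  chrom_start = c(232318754L, 66292259L, 12345L),
  stringsAsFactors = FALSE
)
requested_ids <- c("rs16828074", "rs17232800", "rs0000000")  # last ID is made up

# Hypothetical helper: keep primary-assembly rows and list unmatched IDs.
tidy_snp_locations <- function(locations, ids) {
  # Keep only rows on the primary chromosomes (1-22, X, Y, MT).
  primary <- locations[locations$chr_name %in% c(1:22, "X", "Y", "MT"), ]
  # Build a "chr:position" label, avoiding scientific notation.
  primary$coord <- paste0(primary$chr_name, ":",
                          format(primary$chrom_start, scientific = FALSE, trim = TRUE))
  # Requested IDs with no row on a primary chromosome.
  list(found = primary, missing = setdiff(ids, primary$refsnp_id))
}

result <- tidy_snp_locations(snp_locations, requested_ids)
print(result$found)
print(result$missing)
```

With real data you would pass the getBM() result and your own vector of rs IDs instead of the hand-typed objects.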
Users are encouraged to read the comprehensive biomaRt vignette and experiment with the following biomaRt functions:
listFilters(snp_mart)
listAttributes(snp_mart)
attributePages(snp_mart)
listDatasets(snp_mart)
listMarts()
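listFilters() and listAttributes() return ordinary data frames with name and description columns, so they can be searched with base R pattern matching, for instance to find the filter that accepts rs IDs. A minimal sketch follows; find_mart_entries is a hypothetical helper, and the small table is a hand-typed stand-in (names and descriptions are illustrative) so that the example runs without a network connection.

```r
# Hypothetical helper: case-insensitive search of the name and description
# columns of a table such as the one returned by listFilters() or
# listAttributes().
find_mart_entries <- function(tbl, pattern) {
  hit <- grepl(pattern, tbl$name, ignore.case = TRUE) |
         grepl(pattern, tbl$description, ignore.case = TRUE)
  tbl[hit, , drop = FALSE]
}

# Hand-typed stand-in for listFilters(snp_mart); entries are illustrative.
example_filters <- data.frame(
  name        = c("chr_name", "snp_filter", "start"),
  description = c("Chromosome name", "Filter by variant name", "Region start"),
  stringsAsFactors = FALSE
)

matches <- find_mart_entries(example_filters, "variant|snp")
print(matches)

# With a live connection (not run here):
# find_mart_entries(listFilters(snp_mart), "snp")
```

The same helper works on the output of listAttributes(snp_mart), which is useful when you are unsure what an attribute such as chrom_start is called.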
You can use Bioconductor's biomaRt R package.
It provides an easy way to query BioMart for information about SNPs, given an rs number (i.e. rsID).
For example, to retrieve SNP data for rs16828074 (one of the rs numbers listed in your post), use this:
Code:
library(biomaRt)
snp.id <- 'rs16828074' # an SNP rs number like those listed in the post
# Select the SNP database (mart name as used in the answer above):
snp.db <- useMart("ENSEMBL_MART_SNP", dataset="hsapiens_snp")
# Retrieve the SNP record from the human dataset:
nt.biomart <- getBM(c("refsnp_id","allele","chr_name","chrom_start",
                      "chrom_strand","associated_gene",
                      "ensembl_gene_stable_id"),
                    filters="snp_filter", # rs ID filter, as in the answer above
                    values=snp.id,
                    mart=snp.db)
Let me know in the comments how you get on with this, since my answer assumes some basic coding and package-importing ability.
Acknowledgement: this goes to Jorge Amigo (for his post on Biostars).
With Perl it is also quite easy to build code that queries for SNPs.
There is a web-based GUI tool, BioMart MartView (URLs in the instructions below), that generates Perl scripts for the database and dataset you wish to query with the BioMart library.
Instructions
- Go to http://www.ensembl.org/biomart/martview/ad23fb5685e6aecb59ab12ce73c89731 (for supported metazoans), or http://biomart.vectorbase.org/biomart/martview/6e274bc00b3c68a131a6947d02039ade (for up-to-date malaria vectors, e.g. A. gambiae).
- Select the database and dataset.
- Click on the "perl" button to generate Perl code for querying through the BioMart API, copy and paste the code into your Perl editor, and run it with the SNP rs numbers of your choice.
# An example script demonstrating the use of BioMart API.
use strict;
use BioMart::Initializer;
use BioMart::Query;
use BioMart::QueryRunner;

my $confFile = "PATH TO YOUR REGISTRY FILE UNDER biomart-perl/conf/.";
my $action = 'cached';
my $initializer = BioMart::Initializer->new('registryFile'=>$confFile, 'action'=>$action);
my $registry = $initializer->getRegistry;

my $query = BioMart::Query->new('registry'=>$registry, 'virtualSchemaName'=>'default');

$query->setDataset("hsapiens_snp");
# To restrict the query to specific SNPs, add a filter here, e.g.
# (filter name taken from the R answer above):
# $query->addFilter("snp_filter", ["rs16828074", "rs17232800"]);
$query->addAttribute("refsnp_id");
$query->addAttribute("refsnp_source");
$query->addAttribute("chr_name");
$query->addAttribute("chrom_start");

$query->formatter("TSV");

my $query_runner = BioMart::QueryRunner->new();
############################## GET RESULTS ##########################
$query_runner->execute($query);
$query_runner->printHeader();
$query_runner->printResults();
$query_runner->printFooter();
#####################################################################