openSNP | FAQ

What does openSNP offer me as a genotyping customer? +

If you were genotyped by 23andMe, deCODEme, or FamilyTreeDNA, you can upload the raw genotype data downloaded from your DTC test provider. The data will then be openly available for the world to see and download. We also parse these SNPs and annotate them.
For annotation, we include the manually curated SNPedia and find open-access primary publications appearing in the journals of The Public Library of Science (PLoS), an open-access publishing group. Additionally, we screen Mendeley, a crowd-sourced repository of scientific publications.
You can also publish some of your phenotypes so that, someday, it might become possible to associate SNPs with phenotypes. We really would like to encourage you to do so; helping science generates this warm, fuzzy feeling inside of you.
And, of course, you can share your knowledge about SNPs and phenotypes with other users and socialize.
Who is behind openSNP? +

The site is hosted and was coded by Bastian, Helge and Philipp. We are not working full time on this project; it is more of a hobby. Please give us some time to answer your questions and fix bugs, as we are doing this in our free time alongside our studies and day jobs. You can find some more details on our About Us page.
Why all this? +

openSNP is a non-profit, open-source project that is about sharing genetic and phenotypic information. The idea for this project came to Bastian after he was genotyped by 23andMe in May 2011 and started playing around with his data. During his research he became frustrated because it was not easy to find more data. He started working on openSNP to fix this. To be clear: this project is not about making money or selling data. Or, to quote Google: “We don’t wanna be evil”. We are just interested in making science more open and accessible.
How much does membership in openSNP cost? +

We take the "open" in openSNP seriously, so everything is free of charge.
Where can I download my raw data on the page of my DTC company? +

23andMe customers can download the data here. deCODEme customers can find the data here. If you were genotyped by FamilyTreeDNA, you can find your files here.
How long does it take until my genotyping data is available on the site after uploading? +

This depends heavily on how many other users are in the queue, waiting for their data to be parsed. Right now openSNP runs on a small-scale server, so it may take 2-3 days until all your variations can be found on the site. But you can start using openSNP right away, and others can download your raw data right from the start. The parsing only affects the "other users sharing this SNP" and the "my genotype" views.
Exomes? +

23andMe has a limited exome sequencing offer. If you've had your exome sequenced through 23andMe, you can now upload the VCF (Variant Call Format, see the filetypes FAQ) version of your exome as well. There are some limitations, though: while users will be able to download your complete VCF file, we currently don't have the computing power to parse all of your variation, so we only parse the already known SNPs out of your file. For the same reason, we currently can't offer uploads of the raw sequence data.

The feature is still in an early beta stage, so please notify us if something goes wrong.
I was genotyped by another DTC company, why can't I upload my data? +

We are always interested in adding more data. If your company is not supported yet, it is because we did not have an example file to adapt our parser to. If you contact us at parsing@opensnp.org and send us your file, we can fix this so that all customers of the DTC company of your choice can participate.
How open is my data? +

Completely open: Everyone can see everything you enter or upload (except your private messages and your password, of course). We warn every user twice about this: once during account creation and once before the genotyping file upload. You can find more information about the risks of publishing the data by reading the disclaimer.
The phenotype I'm interested in is not yet in openSNP. What can I do? +

You can easily add this phenotype as long as you are logged in. Just visit the page for phenotype creation and enter a name and a description that enables other users to understand what you are interested in.
I've found a bug, what can I do? +

Just send us an email.

What does openSNP offer me as a researcher working on human SNPs? +

We can offer you data, hopefully someday even lots of data. 23andMe alone has already genotyped over 100,000 people, and more and more of them are sharing their SNPs as well as their phenotypes.
Logged-in users can search our database for users with specific phenotypes and mass-download all corresponding SNP datasets. This allows you to get datasets like "All genotyping files of openSNP users that have Alzheimer's" and the corresponding control group.
If you'd like to be informed about new genotyping results in general, or just about all new data sets for a specific phenotype, you can use the RSS-feeds we offer.
How much does a membership in openSNP cost? +

We take the "open" in openSNP seriously, so everything is free of charge.
The phenotype I'm interested in is not yet in openSNP. What can I do? +

You can easily add this phenotype as long as you are logged in. Just visit the page for phenotype creation and enter a name and a description that enables other users to understand what you are interested in.
How can I start a mass download? +

This option is only available for logged-in users. Just visit the phenotype-page of your choice and click on the download-links for the variations you are interested in. Our server will grab all the data for you, make a handy zip-file out of it and send you the download link via email, as this can be a lot of data.
Is there an API for searching/downloading files? +

We are currently working on providing API access. If you are interested in learning more about how you can use JSON to get access to openSNP data, you might want to read this. There are also some other ways to get your hands on the data: if you want to automate the file downloads for a given phenotype, the RSS feeds could help you. Inside the RSS XML there are two elements you could use to automatically create correct genotype groups: <variation> gives you the variation of this user at the phenotype you are looking at and <dlink> gives you the download link.
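
The grouping idea can be sketched roughly as follows. This is only an illustration: the FAQ names the <variation> and <dlink> elements, but the surrounding <rss>/<channel>/<item> layout, the example URLs and the function name group_links_by_variation are assumptions, so check a real feed before relying on it.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical feed excerpt. Only <variation> and <dlink> come from the FAQ;
# the <rss>/<channel>/<item> wrapper and the URLs are assumptions.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <item>
      <variation>blue</variation>
      <dlink>http://example.org/files/user_a.txt</dlink>
    </item>
    <item>
      <variation>brown</variation>
      <dlink>http://example.org/files/user_b.txt</dlink>
    </item>
    <item>
      <variation>blue</variation>
      <dlink>http://example.org/files/user_c.txt</dlink>
    </item>
  </channel>
</rss>"""


def group_links_by_variation(feed_xml):
    """Map each <variation> value to the list of <dlink> URLs reported for it."""
    groups = defaultdict(list)
    for item in ET.fromstring(feed_xml).iter("item"):
        variation = item.findtext("variation")
        link = item.findtext("dlink")
        if variation is None or link is None:
            continue  # skip entries that lack either element
        groups[variation.strip()].append(link.strip())
    return dict(groups)


groups = group_links_by_variation(SAMPLE_FEED)
for variation, links in sorted(groups.items()):
    print(variation, links)
```

Each key of the resulting dictionary is one genotype group, and its value is the list of files to download for that group.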
How do I parse the files provided by openSNP? +

There's a public-domain Python 3 script on GitHub here: Click this link. Feel free to download and change this script.
How do I cite openSNP? +

If you have used openSNP (code or data) in your work, we would be really happy if you could cite our paper, which was published in PLOS ONE in 2014.

What do I have to consider while entering my variation for already existing phenotypes? +

1. You should read the description of the phenotype you are about to enter. Often the description gives further details on which kind of answers people are interested in. You can also take a look at the answers given so far by other users. This should also give you a feeling for what kind of information is wanted for this phenotype.
2. As soon as you start to enter your own variation we will try to auto-complete your writing to avoid duplicate entries into the openSNP-database. Having multiple entries for the same thing, like "blue-green" plus "green-blue" in case of eye-colour makes it harder for (citizen) scientists to use the data for their studies.
3. The thing you are about to enter doesn't get auto-completed? Great. You are about to enter a variation which has not been reported so far, have fun entering your data!
What should I consider while creating a new phenotype? +

1. Use a reasonable name for your characteristic. Don't make it too long, but at the same time make sure that people get the general idea of what you want to know.
2. You should also make sure that the same phenotype is not already available in the openSNP database. As soon as you start writing your characteristic's name, openSNP will try to auto-complete what you are writing. This should reduce duplicate phenotypes, so if you are about to enter "eye-colour" you will get a suggestion for the already available "eye-color" (Sorry, we can't solve the AE / BE issue here).
3. Write a description of the characteristic you are interested in. Tell users in a few words why it is interesting. Maybe add what research has been done so far on this phenotype. Links to further web resources and/or Wikipedia might also be a good idea. If you already have suggestions for which answers might apply to this phenotype, list them in the description.
4. Enter your own variation: The last field, "variation" is meant to carry your own variation for this phenotype. Don't enter all possible answers (You can list those in the description). Other users can easily add their own variation and take your suggestions from the description or - if you forgot to list a possible answer - give completely new answers.

Not sure what kind of file you've got? Take a look at our example-files

23andMe +

                		# More Comments up here
                		# More information on reference human assembly build 36:
                		# http://www.ncbi.nlm.nih.gov/projects/mapview/map_search.cgi?taxid=9606&build=36
                		#
                		# rsid	chromosome	position	genotype
                		rs4477212	  1	        72017	    AA
                		rs3094315	  1	        742429	  AA
                		rs3131972	  1	        742584	  GG
                		rs12124819	1	        766409	  AG
                		rs11240777	1	        788822	  AG
                		rs6681049	  1	        789870	  CC
                		rs4970383	  1	        828418	  CC
                		rs4475691	  1	        836671	  CC
                		rs7537756	  1	        844113	  AA
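
A file in this layout can be read with a few lines of Python. The following is a minimal sketch, not the official openSNP parser; the function name parse_23andme and the shortened sample are made up for illustration, and it assumes four whitespace-separated columns with comment lines starting with "#".

```python
# Shortened sample in the layout shown above (tab-separated columns).
SAMPLE_23ANDME = """# rsid\tchromosome\tposition\tgenotype
rs4477212\t1\t72017\tAA
rs12124819\t1\t766409\tAG
"""


def parse_23andme(text):
    """Return {rsid: (chromosome, position, genotype)} for every data line."""
    genotypes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split()  # splits on tabs as well as runs of spaces
        if len(fields) != 4:
            continue  # ignore lines that do not match the expected layout
        rsid, chromosome, position, genotype = fields
        # The chromosome stays a string so that values such as "X" would also fit.
        genotypes[rsid] = (chromosome, int(position), genotype)
    return genotypes


genotypes_23andme = parse_23andme(SAMPLE_23ANDME)
print(genotypes_23andme["rs12124819"])
```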

deCODEme +

                    Name,Variation,Chromosome,Position,Strand,YourCode
                    rs4477212,A/G,1,72017,+,AA
                    rs2185539,C/T,1,556738,+,CC
                    rs6681105,C/T,1,581938,+,TT
                    rs11240767,C/T,1,718814,+,CC
                    rs3094315,C/T,1,742429,-,TT
                    rs3131972,C/T,1,742584,-,CC
                    rs3131969,C/T,1,744045,-,CC
                    rs1048488,C/T,1,750775,+,TT
                    rs2905046,A/G,1,752518,-,GG
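
The deCODEme layout is plain CSV with a header row, so Python's csv module can read it directly. The sketch below is an illustration under assumptions: the function name parse_decodeme is hypothetical, and treating a "-" in the Strand column as "the call is reported on the reverse strand" is an interpretation that is not confirmed by this FAQ (it does match the 23andMe example above, where rs3094315 is AA while the deCODEme example reports TT on the "-" strand).

```python
import csv
import io

# Shortened sample in the layout shown above.
SAMPLE_DECODEME = """Name,Variation,Chromosome,Position,Strand,YourCode
rs4477212,A/G,1,72017,+,AA
rs3094315,C/T,1,742429,-,TT
"""

# Base complement table, used only under the strand assumption stated above.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_decodeme(text):
    """Return one dictionary per SNP, with the call as given and on the plus strand."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        call = row["YourCode"]
        # Assumption: "-" means the call must be complemented to get the plus strand.
        plus_call = call.translate(COMPLEMENT) if row["Strand"] == "-" else call
        records.append({
            "rsid": row["Name"],
            "chromosome": row["Chromosome"],
            "position": int(row["Position"]),
            "genotype": call,
            "plus_strand_genotype": plus_call,
        })
    return records


decodeme_records = parse_decodeme(SAMPLE_DECODEME)
for record in decodeme_records:
    print(record["rsid"], record["genotype"], record["plus_strand_genotype"])
```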

FamilyTreeDNA +

                		RSID,CHROMOSOME,POSITION,RESULT
                		"rs3094315","1","742429","AG"
                		"rs3131972","1","742584","AG"
                		"rs12562034","1","758311","GG"
                		"rs12124819","1","766409","AA"
                		"rs11240777","1","788822","GG"
                		"rs6681049","1","789870","CC"
                		"rs4970383","1","828418","CC"
                		"rs4475691","1","836671","CC"
                		"rs7537756","1","844113","AA"

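The FamilyTreeDNA layout differs from the deCODEme one mainly in that every field is quoted. A minimal sketch of reading it with Python's csv module, which removes the quotes on its own; the function name parse_ftdna and the shortened sample are assumptions for illustration only.

```python
import csv
import io

# Shortened sample in the layout shown above (every field is quoted).
SAMPLE_FTDNA = '''RSID,CHROMOSOME,POSITION,RESULT
"rs3094315","1","742429","AG"
"rs12562034","1","758311","GG"
'''


def parse_ftdna(text):
    """Return {rsid: (chromosome, position, genotype)} from a quoted CSV file."""
    genotypes = {}
    for row in csv.DictReader(io.StringIO(text)):
        genotypes[row["RSID"]] = (
            row["CHROMOSOME"],
            int(row["POSITION"]),
            row["RESULT"],
        )
    return genotypes


genotypes_ftdna = parse_ftdna(SAMPLE_FTDNA)
print(genotypes_ftdna["rs3094315"])
```
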
23andMe Exome in VCF format +

                  ##fileformat=VCFv4.1
                  ##CombineVariants=[...]
                  ##FILTER=<[...] 200.0">
                  [...]
                  ##FORMAT=<[...]>
                  [...]
                  ##INFO=<[...]>
                  [...]
                  ##contig=<[...]>
                  ##contig=<[...]>
                  [...]
                  #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  USERID
                  1       14907   rs79585140      A       G       317.34  MQFilter40 AB=0.604;AC=1;AF=0.50; [...]     GT:AD:DP:GQ:PL  0/1:32,21:54:99:347,0,514
                  1       14930   rs75454623      A       G       351.37  MQFilter40  AB=0.61;AC=1;AF=0.50; [...]     GT:AD:DP:GQ:PL  0/1:36,23:59:99:381,0,680
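
To turn a VCF data line into a genotype comparable to the other formats, the GT entry of the sample column has to be mapped back onto the REF and ALT alleles. The following is a rough sketch for a single-sample line; the function name parse_vcf_line is hypothetical, the INFO column of the sample is shortened, and real files should be read with tab-separated columns as the VCF format defines.

```python
# One data line in the layout shown above, tab-separated, with a shortened INFO column.
SAMPLE_VCF_LINE = (
    "1\t14907\trs79585140\tA\tG\t317.34\tMQFilter40\t"
    "AB=0.604;AC=1;AF=0.50\tGT:AD:DP:GQ:PL\t0/1:32,21:54:99:347,0,514"
)


def parse_vcf_line(line):
    """Decode one single-sample VCF data line into a small dictionary."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, rsid, ref, alt = fields[:5]
    # FORMAT (column 9) names the colon-separated entries of the sample column (column 10).
    sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
    # Allele index 0 is REF, 1 and higher are the comma-separated ALT alleles.
    alleles = [ref] + alt.split(",")
    # GT separates allele indices with "/" (unphased) or "|" (phased); "." is a missing call.
    indices = sample["GT"].replace("|", "/").split("/")
    genotype = "".join(alleles[int(i)] if i != "." else "-" for i in indices)
    return {
        "rsid": rsid,
        "chromosome": chrom,
        "position": int(pos),
        "genotype": genotype,
        "sample": sample,
    }


vcf_record = parse_vcf_line(SAMPLE_VCF_LINE)
print(vcf_record["rsid"], vcf_record["genotype"])
```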

My filetype is still missing, what can I do? +

We are always interested in adding more data. If your company is not supported yet, it is because we did not have an example file to adapt our parser to. If you contact us at parsing@opensnp.org and send us your file, we can fix this so that all customers of the DTC company of your choice can participate.

Using JSON +

We provide some methods to make it easier to get data back from openSNP. Currently we are implementing some JavaScript Object Notation (JSON for short) methods that will make it easier to code some awesome openSNP-utilizing stuff. As you can see on the Wikipedia page, there are JSON parsers for all major programming languages, and the format is fairly human-readable as well. Plus: it is one of the de facto web standards for APIs. The full overview of the openSNP JSON API is on GitHub at this link.
The method I need is missing and/or I need more help!? +

There are many ways in which this problem can be fixed. First of all: if you are able to code some Rails and have some basic understanding of JSON, feel invited to help us with the development of openSNP. The source code is freely available at GitHub and we are always looking for motivated people who want to join us. Just start coding the stuff you are interested in and make a pull request. Or how about joining the openSNP development mailing list/Google group to get to know us a bit better?
But don't worry if you don't feel like spending that much time with us (let alone having that much time left): We are working on implementing more of this stuff, so if you've got a great idea that we should absolutely include, just let us know: Write us at info@opensnp.org or drop Philipp or Basti a note. The same is true if you need some more support with the methods already in place.

DAS? +

We've implemented some basic commands of the Distributed Annotation System (DAS). These are used, for example, by the genome browsers on the SNP pages in openSNP.
Where can I learn more about DAS? +

If you want to learn more about the use at openSNP you can watch this video which explains a bit about it. You can also read the specifications or read about example commands at the DASRegistry.

How can I see all DAS sources? +

To find all DAS-sources you can point your browser to http://opensnp.org/das/sources. You should get back some XML which roughly looks like this:

                    <SOURCES>
                      <SOURCE uri="6" title="6" description="6">
                        <MAINTAINER email="admin@opensnp.org"/>
                        <VERSION created="29.09.2011 15:43" uri="6">
                        <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS311" taxid="9606" source="Chromosome" authority="GRCh" test_range="" version="37">GRCh_37,Chromosome,Homo sapiens</COORDINATES>
                        <CAPABILITY query_uri="http://opensnp.org/das/6/features" type="das1:features"/>
                        <CAPABILITY type="das1:unknown-segment"/>
                        </VERSION>
                      </SOURCE>
                        <SOURCE uri="10" title="10" description="10">
                        <MAINTAINER email="admin@opensnp.org"/>
                        <VERSION created="29.09.2011 20:54" uri="10">
                        <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS311" taxid="9606" source="Chromosome" authority="GRCh" test_range="" version="37">GRCh_37,Chromosome,Homo sapiens</COORDINATES>
                        <CAPABILITY query_uri="http://opensnp.org/das/10/features" type="das1:features"/>
                        <CAPABILITY type="das1:unknown-segment"/>
                        </VERSION>
                      </SOURCE>
                    </SOURCES>

The title of each source corresponds to a single openSNP-user. Even if users have provided more than one file you can query all the data available for this user through a single DAS-source.
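
A response like the one above can be turned into a list of user IDs and feature URLs with Python's standard XML parser. This is a minimal sketch that works on a shortened, hard-coded copy of the example response; the function name list_das_sources is made up for illustration, and no request is sent to the server.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response shown above.
SAMPLE_SOURCES = """<SOURCES>
  <SOURCE uri="6" title="6" description="6">
    <MAINTAINER email="admin@opensnp.org"/>
    <VERSION created="29.09.2011 15:43" uri="6">
      <CAPABILITY query_uri="http://opensnp.org/das/6/features" type="das1:features"/>
      <CAPABILITY type="das1:unknown-segment"/>
    </VERSION>
  </SOURCE>
  <SOURCE uri="10" title="10" description="10">
    <MAINTAINER email="admin@opensnp.org"/>
    <VERSION created="29.09.2011 20:54" uri="10">
      <CAPABILITY query_uri="http://opensnp.org/das/10/features" type="das1:features"/>
      <CAPABILITY type="das1:unknown-segment"/>
    </VERSION>
  </SOURCE>
</SOURCES>"""


def list_das_sources(sources_xml):
    """Return one entry per <SOURCE> with the user ID and its features URL."""
    sources = []
    for source in ET.fromstring(sources_xml).iter("SOURCE"):
        features_url = None
        for capability in source.iter("CAPABILITY"):
            # Only the das1:features capability carries a query_uri.
            if capability.get("type") == "das1:features":
                features_url = capability.get("query_uri")
        # The title of a source corresponds to a single openSNP user.
        sources.append({"user_id": source.get("title"), "features_url": features_url})
    return sources


das_sources = list_das_sources(SAMPLE_SOURCES)
for entry in das_sources:
    print(entry["user_id"], entry["features_url"])
```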

How can I query a DAS source? +

To query a DAS source you can use the features command and provide the chromosome and slice you are interested in. For example, http://opensnp.org/das/1/features?segment=1:1,1000000 will provide you with all SNPs located between base 1 and base 1000000 for the user with the ID #1. It produces XML output similar to the following:

                  <DASGFF>
                    <GFF version="1.0" href="http://opensnp.org/das/1/features?segment=1:1,1000000">
                    <SEGMENT id="1" version="1.0" start="1" stop="1000000">
                    <FEATURE id="rs6681049">
                      <TYPE id="CC"/>
                      <METHOD id=""/>
                      <START>789870</START>
                      <END>789870</END>
                      <LINK href="http://opensnp.org/snps/rs6681049">http://opensnp.org/snps/rs6681049</LINK>
                    </FEATURE>
                    <FEATURE id="rs6657048">
                      <TYPE id="CC"/>
                      <METHOD id=""/>
                      <START>947503</START>
                      <END>947503</END>
                      <LINK href="http://opensnp.org/snps/rs6657048">http://opensnp.org/snps/rs6657048</LINK>
                    </FEATURE>
                    <FEATURE id="rs28415373">
                      <TYPE id="CC"/>
                      <METHOD id=""/>
                      <START>883844</START>
                      <END>883844</END>
                      <LINK href="http://opensnp.org/snps/rs28415373">http://opensnp.org/snps/rs28415373</LINK>
                    </FEATURE>
                    </SEGMENT>
                    </GFF>
                  </DASGFF>

The general URL-schema for DAS-feature-requests is http://opensnp.org/das/$USER_ID/features?segment=$CHROMOSOME:$START_BASE,$STOP_BASE.
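
As an illustration of the URL schema, the sketch below builds a features URL and extracts the SNPs from a response. It does not contact the server: the XML is a shortened, hard-coded sample with closing tags added so that it parses, and the function names das_features_url and parse_das_features are made up for this example.

```python
import xml.etree.ElementTree as ET


def das_features_url(user_id, chromosome, start_base, stop_base):
    """Fill the URL schema for a DAS features request."""
    return (
        f"http://opensnp.org/das/{user_id}/features"
        f"?segment={chromosome}:{start_base},{stop_base}"
    )


# Shortened sample response, modelled on the example output shown above.
SAMPLE_DASGFF = """<DASGFF>
  <GFF version="1.0" href="http://opensnp.org/das/1/features?segment=1:1,1000000">
    <SEGMENT id="1" version="1.0" start="1" stop="1000000">
      <FEATURE id="rs6657048">
        <TYPE id="CC"/>
        <START>947503</START>
        <END>947503</END>
      </FEATURE>
      <FEATURE id="rs6681049">
        <TYPE id="CC"/>
        <START>789870</START>
        <END>789870</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def parse_das_features(dasgff_xml):
    """Return the SNPs of a features response, sorted by start position."""
    features = []
    for feature in ET.fromstring(dasgff_xml).iter("FEATURE"):
        type_element = feature.find("TYPE")
        features.append({
            "rsid": feature.get("id"),
            # In the example output the genotype is stored in the id of <TYPE>.
            "genotype": type_element.get("id") if type_element is not None else None,
            "start": int(feature.findtext("START")),
            "end": int(feature.findtext("END")),
        })
    return sorted(features, key=lambda f: f["start"])


das_url = das_features_url(1, 1, 1, 1000000)
das_features = parse_das_features(SAMPLE_DASGFF)
print(das_url)
for f in das_features:
    print(f["rsid"], f["genotype"], f["start"])
```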
