  T3/Oat

GBS import instructions

Genotype by Sequencing (GBS) Data, over 100K markers

Genotype Experiment

Markers aligned to reference
  • Each marker_name and each sequence must be unique.
  • Check each marker sequence against existing database entries for synonyms (the import page provides a BLAST tool for this).
  • We will BLAST the flanking sequences of newly submitted markers against previously submitted flanking sequences. When two sequences are identical along
    the full length of the shorter one, we assume the SNPs are identical, and the newly submitted marker becomes a synonym of
    the previously submitted marker.
  • The sequence for each marker should be long enough to uniquely define the marker within the genome. For the wheat genome anchored to the IWGSC assembly, we use a marker sequence of 128 bases.
  • file format - comma separated
    WCSS1_marker_name,marker_type,A_allele,B_allele,sequence
    WCSS1_contig3917765_1al-5470,GBS,G,A,GCCGGACTGAGGCGGCAACTTGATGCGGCGGATGCCAACATTGCGCTTGTGAACAAGCGGCTTG[G/A]CGAGGCACAGGGTATGTATTTTCGGGTGGTCAACAAATATTAAGAGGAGCATGATGCTAGTAT
    WCSS1_contig3917765_1al-5481,GBS,G,T,GCGGCAACTTGATGCGGCGGATGCCAACATTGCGCTTGTGAACAAGCGGCTTGGCGAGGCACAG[G/T]GTATGTATTTTCGGGTGGTCAACAAATATTAAGAGGAGCATGATGCTAGTATCTATAATATGC
    WCSS1_contig3917765_1al-5493,GBS,C,T,TGCGGCGGATGCCAACATTGCGCTTGTGAACAAGCGGCTTGGCGAGGCACAGGGTATGTATTTT[C/T]GGGTGGTCAACAAATATTAAGAGGAGCATGATGCTAGTATCTATAATATGCTGTGACTGCAGA
    
  • fields
    marker_name = valid characters are alphanumeric and “_-.”
    marker_type = GBS
    A_allele = reference allele
    B_allele = alternate allele
    sequence = bases A, C, G, T only; the SNP is embedded in the sequence in square brackets, as in [G/A], with the reference allele first and the alternate allele second
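
The format rules above can be checked before upload with a short script. The following Python sketch is not part of T3; the function name validate_markers and the error messages are hypothetical, and it assumes that the first line of the file is a header row, that bases are upper case, and that each sequence contains exactly one bracketed SNP. It checks only the field rules and uniqueness, not synonyms against the database.

```python
import csv
import io
import re

# marker_name: alphanumeric plus "_", "-" and "."
NAME_RE = re.compile(r"[A-Za-z0-9_.\-]+")
# sequence: A/C/G/T flanks around one embedded [ref/alt] SNP
SEQ_RE = re.compile(r"[ACGT]*\[([ACGT])/([ACGT])\][ACGT]*")


def validate_markers(text):
    """Return a list of (line_number, message) problems found in the file text."""
    problems = []
    seen_names, seen_seqs = set(), set()
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # assumption: the first line is the header row
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != 5:
            problems.append((line_no, "expected 5 comma-separated fields"))
            continue
        name, marker_type, a_allele, b_allele, seq = (f.strip() for f in row)
        if not NAME_RE.fullmatch(name):
            problems.append((line_no, "invalid characters in marker_name"))
        if marker_type != "GBS":
            problems.append((line_no, "marker_type must be GBS"))
        match = SEQ_RE.fullmatch(seq)
        if match is None:
            problems.append((line_no, "sequence must be ACGT with one [ref/alt] SNP"))
        elif (match.group(1), match.group(2)) != (a_allele, b_allele):
            problems.append((line_no, "A_allele/B_allele do not match the embedded SNP"))
        if name in seen_names:
            problems.append((line_no, "duplicate marker_name"))
        if seq in seen_seqs:
            problems.append((line_no, "duplicate sequence"))
        seen_names.add(name)
        seen_seqs.add(seq)
    return problems
```

Calling validate_markers(open("markers.csv").read()) on a hypothetical file markers.csv returns an empty list when every row passes these checks.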
Markers not aligned to reference

Genotype Results

Command line import instructions
  1. Load the file, in TASSEL or rrBLUP format, with the script load_gbs_bymarker.
  2. Calculate and load allele frequencies with the script load_gbs_frequencies.

In 2009 the Toronto International Data Release Workshop agreed on a policy statement about prepublication data sharing. Accordingly, the data producers are making many of the datasets in T3 available prior to publication of a global analysis. Guidelines for appropriate sharing of these data are given in the excerpt from the Toronto Statement.