Marker Alleles and Sequences

Genotyping results are submitted to T3, and can be downloaded from T3, in a variety of formats. To allow this flexibility, all types of markers are represented internally in a common format. In this format each marker has two allele states (plus "missing data"), referred to as A-allele and B-allele. Below are the mapping rules used to convert from each submission format to T3 format. Following that are the nucleotide calls corresponding to the A- and B-allele for each marker. In each Sequence, the SNP is given in square brackets as "[A-allele/B-allele]".

Illumina SNPs

Genotyping results from Illumina GoldenGate and Infinium assays are submitted to T3 in Illumina's A/B format, and stored as A-allele = A and B-allele = B. The actual nucleotide calls corresponding to A and B are taken from the Illumina manifest files, interpreted according to Illumina's Technical Note "TOP/BOT" Strand and "A/B" Allele. In short:
  • For [A/C] and [A/G] SNPs, Allele A is A and the sequence is designated TOP.
  • For [T/C] and [T/G] SNPs, Allele A is T and the sequence is designated BOT.
  • For [A/T] SNPs, when the Illumina Strand is TOP then Allele A = A and Allele B = T. When the Strand is BOT, then Allele A = T and Allele B = A.
  • For [C/G] SNPs, when the Illumina Strand is TOP then Allele A = C and Allele B = G. When the Strand is BOT then Allele A = G and Allele B = C.
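
The four rules above can be sketched as a small Python function. This is only an illustration, not T3's actual loader: the function name illumina_ab_alleles and its arguments are hypothetical, and it assumes the SNP is given as a bracketed string such as "[T/C]" and the strand as "TOP" or "BOT".

```python
def illumina_ab_alleles(snp, strand=None):
    """Return (allele_a, allele_b) nucleotides for a bracketed SNP like '[T/C]'.

    Hypothetical helper: 'strand' is only consulted for [A/T] and [C/G] SNPs.
    """
    nucleotides = snp.strip().strip("[]").upper().split("/")
    pair = set(nucleotides)
    if len(nucleotides) != 2 or len(pair) != 2 or not pair <= set("ACGT"):
        raise ValueError(f"not a two-allele nucleotide SNP: {snp!r}")

    if pair in ({"A", "T"}, {"C", "G"}):
        # Ambiguous pairs: the Illumina strand decides which allele is A.
        if strand is None or strand.upper() not in ("TOP", "BOT"):
            raise ValueError("strand must be 'TOP' or 'BOT' for [A/T] and [C/G] SNPs")
        first, second = sorted(pair)  # A before T, C before G
        return (first, second) if strand.upper() == "TOP" else (second, first)

    # Unambiguous pairs contain exactly one of A or T, and that one is Allele A.
    allele_a = "A" if "A" in pair else "T"
    (allele_b,) = pair - {allele_a}
    return allele_a, allele_b
```

For example, illumina_ab_alleles("[T/C]") gives ("T", "C"), while illumina_ab_alleles("[C/G]", "BOT") gives ("G", "C").
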
The Sequence shown in T3 is the reverse-complement of the manifest's TopGenomicSeq when the Illumina strand (manifest field "Ilmn Strand") is BOT. As a result, the bracketed SNP nucleotides in the Sequence are the same as the A-allele and B-allele nucleotides, although they are not necessarily given in the same order.
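
A minimal Python sketch of this strand handling follows. It is an assumption-laden illustration rather than T3's code: the function names are hypothetical, the input is assumed to be an uppercase sequence of A, C, G, T or N with exactly one bracketed two-allele SNP, and the bracketed nucleotides are assumed to be complemented in place without reordering.

```python
import re

# Complement table for plain nucleotides; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement_bracketed(seq):
    """Reverse-complement a sequence such as 'AACG[A/G]TTGC'."""
    match = re.fullmatch(r"([ACGTN]*)\[([ACGT])/([ACGT])\]([ACGTN]*)", seq)
    if match is None:
        raise ValueError(f"expected flank[X/Y]flank, got {seq!r}")
    left, first, second, right = match.groups()
    # The flanks swap sides and are each reversed and complemented.
    new_left = right[::-1].translate(_COMPLEMENT)
    new_right = left[::-1].translate(_COMPLEMENT)
    # Assumption: the two alleles are complemented but keep their order.
    snp = f"[{first.translate(_COMPLEMENT)}/{second.translate(_COMPLEMENT)}]"
    return new_left + snp + new_right


def t3_sequence(top_genomic_seq, ilmn_strand):
    """Return TopGenomicSeq as is for TOP, reverse-complemented for BOT."""
    if ilmn_strand.upper() == "BOT":
        return reverse_complement_bracketed(top_genomic_seq)
    return top_genomic_seq
```

With these assumptions, t3_sequence("AACG[A/G]TTGC", "BOT") yields "GCAA[T/C]CGTT", whose bracketed nucleotides T and C match the A- and B-allele of a BOT [T/C] SNP.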


For SNPs assayed by shotgun sequencing downstream from a restriction site, the Sequence is oriented with the common restriction site at the 5' (left) end. The alternative nucleotides for this strand are shown in square brackets in alphabetical order, A < C < G < T. The first nucleotide alphabetically is stored as A-allele, and the second as B-allele. Genotyping data are provided to T3 as single-letter scores: the nucleotide itself for a homozygote, "H" for a heterozygote, and "N" for missing data.
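
The conversion described above might look like the following Python sketch. The function name and the output codes "AA", "BB", "AB" and "--" are hypothetical, chosen only for illustration; this page does not specify how T3 encodes genotypes internally.

```python
MISSING = "--"  # hypothetical code for missing data


def gbs_call_to_ab(score, nuc1, nuc2):
    """Convert a single-letter call to A/B form for a SNP with alleles nuc1, nuc2."""
    alleles = {nuc1.upper(), nuc2.upper()}
    if len(alleles) != 2 or not alleles <= set("ACGT"):
        raise ValueError(f"need two distinct nucleotides, got {nuc1!r} and {nuc2!r}")
    # Alphabetical order decides: first is the A-allele, second the B-allele.
    allele_a, allele_b = sorted(alleles)

    score = score.upper()
    if score == "N":
        return MISSING
    if score == "H":
        return "AB"
    if score == allele_a:
        return "AA"
    if score == allele_b:
        return "BB"
    raise ValueError(f"score {score!r} does not match alleles {allele_a}/{allele_b}")
```

For a [C/T] SNP, a score of "C" becomes "AA", "T" becomes "BB", "H" becomes "AB", and "N" becomes "--".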


For DArT markers, Allele A is 1 (present) and Allele B is 0 (absent).

