Journal article
GBStools: A Statistical Method for Estimating Allelic Dropout in Reduced Representation Sequencing Data
PLoS genetics, Vol.12(2), pp.e1005631-e1005631
02/2016
Handle:
https://hdl.handle.net/2376/103912
PMCID: PMC4734769
PMID: 26828719
Abstract
Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
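For readers unfamiliar with the Hardy-Weinberg filter that the abstract uses as a point of comparison, the following is a minimal sketch of such a test. It is not the GBStools model. The function name `hwe_chi2_pvalue` and the genotype counts are hypothetical and chosen only for illustration. The sketch assumes a biallelic site and a one-degree-of-freedom chi-square test. It shows why allelic dropout can be flagged this way: when heterozygotes are miscalled as homozygotes, the heterozygote deficit drives the p-value toward zero.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for Hardy-Weinberg equilibrium at a biallelic site.

    Illustrative only: this is a generic HWE filter, not the GBStools method.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped samples")
    # Allele frequencies estimated from the observed genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic site: no departure can be tested
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: genotypes in exact HWE proportions.
p_clean = hwe_chi2_pvalue(25, 50, 25)
# Hypothetical counts: same allele frequency, but most heterozygotes
# appear as homozygotes, as could happen under allelic dropout.
p_dropout = hwe_chi2_pvalue(45, 10, 45)
```

A filter of this kind discards any site whose p-value falls below a chosen threshold; it does not attempt to correct the underlying genotypes.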
Details
- Title
- GBStools: A Statistical Method for Estimating Allelic Dropout in Reduced Representation Sequencing Data
- Creators
- Thomas F Cooke - Department of Genetics, Stanford University, Stanford, California, United States of America
- Muh-Ching Yee - Carnegie Institution for Science, Department of Plant Biology, Stanford, California, United States of America
- Marina Muzzio - Facultad de Ciencias Naturales y Museo, Universidad Nacional de La Plata, La Plata, Argentina
- Alexandra Sockell - Department of Genetics, Stanford University, Stanford, California, United States of America
- Ryan Bell - Department of Genetics, Stanford University, Stanford, California, United States of America
- Omar E Cornejo - School of Biological Sciences, Washington State University, Pullman, Washington, United States of America
- Joanna L Kelley - School of Biological Sciences, Washington State University, Pullman, Washington, United States of America
- Graciela Bailliet - Instituto Multidisciplinario de Biología Celular (CCT La Plata-CONICET, CICPBA), La Plata, Argentina
- Claudio M Bravi - Instituto Multidisciplinario de Biología Celular (CCT La Plata-CONICET, CICPBA), La Plata, Argentina
- Carlos D Bustamante - Department of Genetics, Stanford University, Stanford, California, United States of America
- Eimear E Kenny - Center of Statistical Genetics, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Publication Details
- PLoS genetics, Vol.12(2), pp.e1005631-e1005631
- Academic Unit
- Biological Sciences, School of
- Publisher
- United States
- Grant note
- T32 HG000044 / NHGRI NIH HHS
- T32 GM007276 / NIGMS NIH HHS
- Identifiers
- 99900546931201842
- Language
- English
- Resource Type
- Journal article