Introducing NOISYmputer
Low-coverage next-generation sequencing (LC-NGS) methods can be used to genotype bi-parental populations. This approach allows the creation of highly saturated genetic maps at reasonable cost, the precise localization of recombination breakpoints, and the minimization of mapping intervals for quantitative-trait locus analysis.
The main issues with these genotyping methods are (1) poor performance at heterozygous loci, (2) a high percentage of missing data, (3) local errors due to erroneous mapping of sequencing reads and reference genome mistakes, and (4) global, technical errors inherent to NGS itself.
Recent methods like Tassel-FSFHap or LB-Impute are excellent at addressing issues 1 and 2, but nonetheless perform poorly when issues 3 and 4 are persistent in a dataset (i.e. “noisy” data).
We present NOISYmputer, a program for imputation of LC-NGS data that eliminates the need for complex pre-filtering of noisy data, accurately types heterozygous chromosomal regions, corrects erroneous data, and imputes missing data.
On noisy data, NOISYmputer performs very well compared with Tassel-FSFHap, LB-Impute, or Genotype-Corrector.