Introduction

This R Markdown document is a companion for the preliminary analysis of NOISYmputer outputs. It can be modified as needed to produce a personalised report.

Prerequisite

For the R Markdown document to run properly, you may need to change the folder path. In the following chunk, set the workingdirectory variable to the path of the directory containing the NOISYmputer output files.

#workingdirectory = "Path/to/working/directory/containing/NOISYmputer/output/files"
workingdirectory = "~/Documents/PostDoc_2021_2025/Bioinformatique/NOISYmputer/V28.07.22/results/Freebayes_3X_384samples_with_30Xparents_on_AZUwithOrganelles"
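As a minimal sketch, the path can be checked before the remaining chunks run, so that a wrong path fails early with a readable message. The helper name check_workdir is hypothetical and not part of NOISYmputer; only base R is used.

```r
# Hypothetical helper (not part of NOISYmputer): stop early if the directory is missing.
check_workdir <- function(path) {
  path <- path.expand(path)  # resolve a leading "~"
  if (!dir.exists(path)) {
    stop("Directory not found: ", path, call. = FALSE)
  }
  invisible(path)
}

# Example call on a directory that is guaranteed to exist
check_workdir(tempdir())
```

In the report itself, the call would be check_workdir(workingdirectory), placed right after the assignment above.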

Genotype frequencies before imputation

Plots of genotypic frequencies across the population before imputation per SNP site. A: Scatterplot of genotypic frequencies per SNP site; B: Histogram of genotypic frequencies per SNP site; C: Boxplot of genotypic frequencies per SNP site.
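For orientation, per-site genotypic frequencies of the kind plotted here could be computed as follows. This is a sketch on invented data, assuming one row per SNP site, one column per sample, genotypes coded "A", "B" and "H", and NA for missing calls; the actual NOISYmputer output format may differ.

```r
# Toy genotype matrix (hypothetical data): rows = SNP sites, columns = samples
geno <- matrix(
  c("A", "A", "H", "B",
    "B", NA,  "B", "B",
    "H", "A", "A", NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("snp", 1:3), paste0("s", 1:4))
)

# Frequency of each genotype class per site, ignoring missing calls
site_freq <- function(g, classes = c("A", "B", "H")) {
  t(apply(g, 1, function(calls) {
    calls <- calls[!is.na(calls)]
    vapply(classes, function(k) mean(calls == k), numeric(1))
  }))
}

freq <- site_freq(geno)  # matrix: one row per site, one column per genotype class
```

Each row of freq sums to 1 because missing calls are dropped before the frequencies are taken.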

Genotype frequencies after imputation

Plots of genotypic frequencies after imputation. A: Genotypic frequencies per SNP site; B: Histogram of genotypic frequencies per SNP site; C: Boxplot of genotypic frequencies per SNP site.

Breakpoints analyses after imputation

Analyses of the estimated breakpoints after imputation across the population. The cyan rectangles mark the positions of breakpoints detected throughout the population. The continuous line represents the number of variants in each estimated window containing a breakpoint.
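The number of variants per breakpoint window could be obtained along these lines. This is a sketch only: snp_pos, the breakpoints data frame and its start and end columns are hypothetical names with invented values, not the NOISYmputer output format.

```r
# Variant positions in bp (made-up values)
snp_pos <- c(100, 250, 400, 900, 1200, 1500, 2100)

# Estimated breakpoint windows in bp (made-up values)
breakpoints <- data.frame(start = c(200, 1000), end = c(950, 1600))

# Number of variants falling inside each window (bounds inclusive)
breakpoints$n_variants <- mapply(
  function(s, e) sum(snp_pos >= s & snp_pos <= e),
  breakpoints$start, breakpoints$end
)
```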

Marey map estimation after imputation

The estimated total map size after imputation is 189.02 cM.

Marey map estimated after imputation. A: Marey map using physical position (in bp) and cumulative genetic distance (in cM); B: Estimated size of the chromosome in cM; C: Distribution of SNP sites along the chromosome.
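The cumulated cM values of a Marey map, and the total map size as their last value, can be sketched as below. The sketch assumes that distances between consecutive markers are already available in cM; both vectors are invented for illustration and do not reproduce the value reported in this document.

```r
pos_bp <- c(1e5, 5e5, 1.2e6, 2.0e6)   # marker positions in bp (made up)
interval_cm <- c(0.8, 1.5, 0.7)       # cM between consecutive markers (made up)

cum_cm <- c(0, cumsum(interval_cm))   # cumulated cM at each marker
total_cm <- cum_cm[length(cum_cm)]    # total map size in cM

# One row per marker: physical position against cumulated genetic distance
marey <- data.frame(pos_bp = pos_bp, cum_cm = cum_cm)
```

Plotting marey$cum_cm against marey$pos_bp gives the Marey map itself; its local slope reflects the recombination rate in cM per bp.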

Sample Statistics

Sample statistics after imputation. A: Boxplot of the genotypic frequencies per sample across the population after imputation; B: Boxplot of the number of transitions experienced by each sample across the population; C: Number of reads used to estimate breakpoint regions per sample across the population.
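The number of transitions per sample can be illustrated with a short sketch. It assumes that a transition is a change of genotype between two consecutive SNP sites, that sites are in chromosome order, and that no calls are missing after imputation; the matrix below is invented and the function name count_transitions is hypothetical.

```r
# Toy imputed genotypes (hypothetical): rows = SNP sites in order, columns = samples
imputed <- cbind(
  s1 = c("A", "A", "B", "B", "B"),
  s2 = c("A", "H", "H", "A", "B"),
  s3 = c("B", "B", "B", "B", "B")
)

# Count changes of genotype between consecutive sites
count_transitions <- function(calls) {
  sum(calls[-1] != calls[-length(calls)])
}

n_transitions <- apply(imputed, 2, count_transitions)  # one count per sample
```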

Graphical genotypes of imputed samples

Graphical genotypes of the whole population after imputation, collapsing and removal of alien segments along the chromosome.
