Statistical Modeling of Coverage in High-Throughput Data
In high-throughput sequencing experiments, the number of reads mapping to a genomic region, also known as the “coverage” or “coverage depth,” is often used as a proxy for the abundance of that region in the sample. This abundance estimate can, in turn, be used for many purposes, including calling SNPs, estimating allele frequencies in a pool of individuals, identifying copy number variations, and identifying differentially expressed shRNAs in shRNA-seq experiments.
In this chapter we describe the fundamentals of statistical modeling of coverage depth and discuss the problems of estimation and inference in the relevant experimental scenarios.
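As a concrete illustration of coverage depth as defined above, the following minimal sketch counts the reads overlapping each genomic region and converts the counts into a crude relative-abundance proxy. The interval representation (half-open `[start, end)` coordinates on a single chromosome), the function names `coverage` and `relative_abundance`, and the toy reads are assumptions made for this example only; they are not taken from the chapter, and the sketch does not implement any of the statistical models the chapter goes on to discuss.

```python
from typing import List, Tuple

# Hypothetical representation: a read or region is a half-open interval
# [start, end) on one chromosome.
Interval = Tuple[int, int]


def coverage(reads: List[Interval], region: Interval) -> int:
    """Count the reads that overlap the region by at least one base."""
    region_start, region_end = region
    return sum(
        1 for start, end in reads
        if start < region_end and end > region_start
    )


def relative_abundance(reads: List[Interval],
                       regions: List[Interval]) -> List[float]:
    """Normalize per-region read counts so that they sum to 1.

    This is a naive proxy only: it ignores region length, mappability,
    and sampling noise. A read spanning two regions is counted in both.
    """
    counts = [coverage(reads, region) for region in regions]
    total = sum(counts)
    return [count / total if total else 0.0 for count in counts]


# Toy data (invented for illustration).
reads = [(0, 50), (10, 60), (40, 90), (120, 170), (130, 180)]
regions = [(0, 100), (100, 200)]

counts = [coverage(reads, region) for region in regions]
proportions = relative_abundance(reads, regions)
print(counts, proportions)
```

Raw counts like these are only a starting point; treating them as random quantities is what motivates the statistical modeling of coverage depth.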