Developing a Statistical Model for Primer Design
This chapter describes a statistical method for predicting the success or failure of a designed primer from properties of the genomic sequence surrounding the primer extension, using the user's own existing genotyping database. Once scores that measure properties of the genomic sequence surrounding the primer extension have been developed as described in previous chapters, the chapter first shows how to use simple statistics to evaluate the correlation between each score and the likelihood of primer success or failure, based on the user's own empirical data. All scores that correlate significantly with primer success are kept for further analysis. Next, the logistic regression method is described in detail; it estimates the contribution of each primer score to the overall primer success/failure rate when all significant scores are weighted simultaneously in a single model. Statistics that evaluate model fit and model discrimination are provided as well. Finally, all significant scores are combined into one measure that predicts the overall success/failure rate of a given primer design. The estimated logistic regression score allows prioritization of primers, selection of the best possible primer pair, and grouping of primers into the best clusters for multiplex PCR. Software and hardware requirements and sample SAS programs are also included.
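To make the final step concrete, the following is a minimal sketch of how several primer scores might be combined into one predicted success probability with logistic regression and then used to rank candidate primers. It is written in plain Python rather than SAS and is not the chapter's own program. The data are synthetic, and the two scores, the generating weights (`TRUE_W`), the candidate names (`primer_A`, `primer_B`), and the fitting settings are all assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit an intercept and one weight per score by batch gradient
    ascent on the mean log-likelihood (a simple stand-in for the
    maximum-likelihood fit a statistics package would perform)."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def predict_success(w, scores):
    # Combined measure: predicted probability that the primer succeeds.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], scores)))


# Synthetic stand-in for an empirical genotyping database (assumption):
# two standardized scores per primer and a 0/1 success outcome.
random.seed(0)
TRUE_W = [0.5, -1.5, 1.0]  # hypothetical intercept and score effects
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(500)]
y = [
    1 if random.random() < sigmoid(TRUE_W[0] + TRUE_W[1] * a + TRUE_W[2] * b) else 0
    for a, b in X
]

w = fit_logistic(X, y)

# Rank hypothetical candidate primers by predicted success probability.
candidates = {"primer_A": [1.2, -0.3], "primer_B": [-0.8, 0.9]}
ranked = sorted(
    candidates,
    key=lambda name: predict_success(w, candidates[name]),
    reverse=True,
)
```

Here `ranked` lists the candidates from most to least likely to succeed under the fitted model. With real data, the rows of `X` would hold the scores that passed the univariate screening and `y` the observed success or failure of each primer.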