Title: | Efficient and Accurate P-Value Computation for Position Weight Matrices |
---|---|
Description: | When identifying putative Transcription Factor Binding Sites (TFBSs) in sequences or alignments, we are interested in the significance of a given match score. TFMPvalue provides accurate calculation of the P-value for a given score threshold for Position Weight Matrices, or of the score for a given P-value. It is an interface to code originally made available by Helene Touzet and Jean-Stephane Varre, 2007, Algorithms Mol Biol:2, 15. <doi:10.1186/1748-7188-2-15>. |
Authors: | Ge Tan <[email protected]> |
Maintainer: | Ge Tan <[email protected]> |
License: | GPL-2 |
Version: | 0.0.9 |
Built: | 2024-11-21 03:49:25 UTC |
Source: | https://github.com/ge11232002/tfmpvalue |
This package provides a novel algorithm that solves the P-value calculation problem for a given score based on a Position Weight Matrix (PWM), or the reverse problem: finding the score given the desired P-value. This package is an interface to code originally made available by Helene Touzet and Jean-Stephane Varre, 2007, Algorithms Mol Biol:2, 15.
The original code is taken from http://bioinfo.lifl.fr/TFM/TFMpvalue/TFM-Pvalue.tar.gz, retrieved 26/03/2014.
The algorithm is described in Touzet, H., and Varre, J.-S. (2007). Efficient and accurate P-value computation for Position Weight Matrices. Algorithms Mol Biol 2, 15.
Ge Tan
Computes the score threshold associated with a given P-value, using the algorithm of Beckstette 2006.
TFMLazyScore(mat, pvalue, bg=c(A=0.25, C=0.25, G=0.25, T=0.25), type=c("PFM", "PWM"), granularity=1e-5)
mat |
The input matrix. It can be a Position Frequency Matrix (PFM) or a Position Weight Matrix (PWM) in log ratio. The matrix must have the row names "A", "C", "G", "T". |
pvalue |
The required P-value. |
bg |
The background frequency of the sequences. A numeric vector with names "A", "C", "G", "T". |
type |
The type of input matrix. Can be "PFM" or "PWM". |
granularity |
The granularity used in the computation. |
Returns the score computed from the matrix for the given P-value and granularity.
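To give a feel for what `granularity` controls, here is a minimal base R sketch. It assumes that granularity acts as a rounding step: matrix entries are rounded down to a grid of that width before the score distribution is computed, which is how Touzet and Varre (2007) describe it; this help page does not spell that out. The helpers `score_distribution()` and `score_for_pvalue()` are hypothetical names introduced for illustration. They implement a plain dynamic program on a log-ratio matrix, not the lazy algorithm used by `TFMLazyScore()`, and they do not call the package.

```r
# Hypothetical helper, not part of TFMPvalue: exact score distribution of a
# log-ratio matrix after rounding every entry down to a multiple of `granularity`.
score_distribution <- function(pwm, bg, granularity = 0.01) {
  # Work in integer grid units; the small epsilon guards against
  # floating-point error in the division (e.g. 0.3 / 0.1).
  units <- floor(pwm / granularity + 1e-9)
  scores <- 0   # the empty prefix has score 0 ...
  probs <- 1    # ... with probability 1
  for (j in seq_len(ncol(units))) {
    # Extend every reachable prefix score by each of the four bases.
    new_scores <- as.vector(outer(scores, units[, j], "+"))
    new_probs <- as.vector(outer(probs, bg[rownames(units)], "*"))
    # Merge prefixes that land on the same grid score.
    merged <- tapply(new_probs, new_scores, sum)
    scores <- as.numeric(names(merged))
    probs <- as.vector(merged)
  }
  data.frame(score = scores * granularity, prob = probs)
}

# Hypothetical helper: lowest score s such that P(score >= s) <= pvalue.
score_for_pvalue <- function(dist, pvalue) {
  dist <- dist[order(dist$score, decreasing = TRUE), ]
  tail_prob <- cumsum(dist$prob)            # P(score >= s), highest score first
  ok <- which(tail_prob <= pvalue + 1e-12)
  if (length(ok) == 0) return(NA_real_)     # even the top score is too likely
  dist$score[max(ok)]
}

# Toy two-column log-ratio matrix (made-up values, for illustration only).
pwm <- matrix(c(1.0, -1.0, -1.0, -1.0,
                0.5,  0.5, -2.0, -2.0),
              nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))
bg <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)

dist <- score_distribution(pwm, bg, granularity = 0.5)
score_for_pvalue(dist, 0.125)
```

A coarser grid merges more distinct scores into one entry, so the table stays small at the cost of rounding error in the returned threshold; a finer grid does the opposite.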
Ge Tan
## This example is not tested due to running time > 5s
pfm <- matrix(c(3, 5, 4, 2, 7, 0, 3, 4, 9, 1, 1, 3, 3, 6, 4, 1,
                11, 0, 3, 0, 11, 0, 2, 1, 11, 0, 2, 1, 3, 3, 2, 6,
                4, 1, 8, 1, 3, 4, 6, 1, 8, 5, 1, 0, 8, 1, 4, 1,
                9, 0, 2, 3, 9, 5, 0, 0, 11, 0, 3, 0, 2, 7, 0, 5),
              nrow = 4, dimnames = list(c("A","C","G","T")))
bg <- c(A=0.25, C=0.25, G=0.25, T=0.25)
pvalue <- 1e-5
type <- "PFM"
granularity <- 1e-5
TFMLazyScore(pfm, pvalue, bg, type, granularity)
Computes the score threshold associated with a P-value.
TFMpv2sc(mat, pvalue, bg=c(A=0.25, C=0.25, G=0.25, T=0.25), type=c("PFM", "PWM"))
mat |
The input matrix. It can be a Position Frequency Matrix (PFM) or a Position Weight Matrix (PWM) in log ratio. The matrix must have the row names "A", "C", "G", "T". |
pvalue |
The required P-value. |
bg |
The background frequency of the sequences. A numeric vector with names "A", "C", "G", "T". |
type |
The type of input matrix. Can be "PFM" or "PWM". |
Returns the score computed from the matrix for the given P-value.
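The `type` argument distinguishes a count matrix ("PFM") from a log-ratio matrix ("PWM"). As a rough illustration of how the two relate, the sketch below converts counts to log ratios against the background. The pseudocount scheme (one pseudocount per column, split according to `bg`) and the natural logarithm are assumptions chosen for the example; the conversion the package applies internally may differ. `pfm_to_log_ratio()` is a hypothetical helper, not a TFMPvalue function.

```r
# Hypothetical helper, not part of TFMPvalue: turn a count matrix into a
# log-ratio matrix. Pseudocount scheme and log base are assumptions.
pfm_to_log_ratio <- function(pfm, bg, pseudocount = 1) {
  bg <- bg[rownames(pfm)]            # align background with the matrix rows
  col_totals <- colSums(pfm)
  # Smoothed base probabilities per column:
  # (count + pseudocount * bg) / (column total + pseudocount)
  probs <- sweep(pfm + pseudocount * bg, 2, col_totals + pseudocount, "/")
  log(probs / bg)                    # log ratio against the background
}

# Toy two-column count matrix (made-up values, for illustration only).
toy_pfm <- matrix(c(3, 0, 1, 0,
                    0, 4, 0, 0),
                  nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))
bg <- c(A = 0.25, C = 0.25, G = 0.25, T = 0.25)

toy_pwm <- pfm_to_log_ratio(toy_pfm, bg)
toy_pwm
```

The pseudocount keeps zero counts from producing `-Inf` entries, so every cell of the resulting matrix is finite.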
Ge Tan
Touzet, H., and Varre, J.-S. (2007). Efficient and accurate P-value computation for Position Weight Matrices. Algorithms Mol Biol 2, 15.
pfm <- matrix(c(3, 5, 4, 2, 7, 0, 3, 4, 9, 1, 1, 3, 3, 6, 4, 1,
                11, 0, 3, 0, 11, 0, 2, 1, 11, 0, 2, 1, 3, 3, 2, 6,
                4, 1, 8, 1, 3, 4, 6, 1, 8, 5, 1, 0, 8, 1, 4, 1,
                9, 0, 2, 3, 9, 5, 0, 0, 11, 0, 3, 0, 2, 7, 0, 5),
              nrow = 4, dimnames = list(c("A","C","G","T")))
bg <- c(A=0.25, C=0.25, G=0.25, T=0.25)
pvalue <- 1e-5
type <- "PFM"
score <- TFMpv2sc(pfm, pvalue, bg, type)
Computes the P-value associated with a score threshold.
TFMsc2pv(mat, score, bg=c(A=0.25, C=0.25, G=0.25, T=0.25), type=c("PFM", "PWM"))
mat |
The input matrix. It can be a Position Frequency Matrix (PFM) or a Position Weight Matrix (PWM) in log ratio. The matrix must have the row names "A", "C", "G", "T". |
score |
The required score. |
bg |
The background frequency of the sequences. A numeric vector with names "A", "C", "G", "T". |
type |
The type of input matrix. Can be "PFM" or "PWM". |
The P-value is returned based on the matrix, given the desired score.
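To make the quantity concrete: the P-value of a score threshold is the probability that a random word, drawn base by base from the background, scores at least that threshold. The sketch below computes it by enumerating all 4^n words for a tiny log-ratio matrix. This is only a reference definition that is feasible for very short matrices; it is not the algorithm of Touzet and Varre, and `naive_pvalue()` is a hypothetical helper that does not call the package.

```r
# Hypothetical helper, not part of TFMPvalue: P-value by brute-force
# enumeration of every word of length ncol(pwm). Exponential in matrix width.
naive_pvalue <- function(pwm, score, bg) {
  bases <- rownames(pwm)
  n <- ncol(pwm)
  # One row per word; entries are row indices into `pwm`.
  words <- as.matrix(expand.grid(rep(list(seq_along(bases)), n)))
  # Matching column index for every entry of `words`.
  cols <- matrix(seq_len(n), nrow = nrow(words), ncol = n, byrow = TRUE)
  # Score of a word = sum of the matrix cells it selects.
  cell_scores <- pwm[cbind(as.vector(words), as.vector(cols))]
  word_scores <- rowSums(matrix(cell_scores, ncol = n))
  # Probability of a word = product of its background base frequencies.
  word_probs <- apply(matrix(bg[bases][words], ncol = n), 1, prod)
  sum(word_probs[word_scores >= score])
}

# Toy two-column log-ratio matrix and a non-uniform background
# (made-up values, for illustration only).
toy_pwm <- matrix(c(1.0, -1.0, -1.0, -1.0,
                    0.5,  0.5, -2.0, -2.0),
                  nrow = 4, dimnames = list(c("A", "C", "G", "T"), NULL))
toy_bg <- c(A = 0.4, C = 0.1, G = 0.1, T = 0.4)

# Only the words "AA" and "AC" reach a score of 1.5.
p_top <- naive_pvalue(toy_pwm, 1.5, toy_bg)
p_top
```

Lowering the threshold can only admit more words, so the P-value is non-increasing in the score, which is what makes the reverse problem (finding the score for a given P-value) well defined.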
Ge Tan
Touzet, H., and Varre, J.-S. (2007). Efficient and accurate P-value computation for Position Weight Matrices. Algorithms Mol Biol 2, 15.
pfm <- matrix(c(3, 5, 4, 2, 7, 0, 3, 4, 9, 1, 1, 3, 3, 6, 4, 1,
                11, 0, 3, 0, 11, 0, 2, 1, 11, 0, 2, 1, 3, 3, 2, 6,
                4, 1, 8, 1, 3, 4, 6, 1, 8, 5, 1, 0, 8, 1, 4, 1,
                9, 0, 2, 3, 9, 5, 0, 0, 11, 0, 3, 0, 2, 7, 0, 5),
              nrow = 4, dimnames = list(c("A","C","G","T")))
bg <- c(A=0.25, C=0.25, G=0.25, T=0.25)
score <- 8.77
type <- "PFM"
pvalue <- TFMsc2pv(pfm, score, bg, type)