Title: | Compute the Label Error of Peak Calls |
---|---|
Description: | Chromatin immunoprecipitation DNA sequencing results in genomic tracks that show enriched regions or peaks where proteins are bound. This package implements fast C code that computes the true and false positives with respect to a database of annotated region labels. |
Authors: | Toby Dylan Hocking |
Maintainer: | Toby Dylan Hocking <[email protected]> |
License: | GPL-3 |
Version: | 2023.9.4 |
Built: | 2024-12-31 04:31:51 UTC |
Source: | https://github.com/tdhock/peakerror |
Check for a valid data.frame with chrom names.
checkChrom(df)
checkChrom(df)
df |
df |
Toby Dylan Hocking
Check for a valid data.frame with chromStart, chromEnd.
checkPositions(df)
checkPositions(df)
df |
df |
Toby Dylan Hocking
Compute true and false positive peak calls, with respect to a
database of annotated regions
.
PeakError(peaks, regions)
PeakError(peaks, regions)
peaks |
data.frame with columns chrom, chromStart, chromEnd. NOTE: chromStart should be 0-based and chromEnd should be 1-based. EXAMPLE: the first 100 base of of a chromosome are chromStart=0, chromEnd=100. The second 100 bases are chromStart=100, chromEnd=200. |
regions |
data.frame with columns chrom, chromStart, chromEnd, annotation. |
data.frame for each region with additional counts of true positives (tp, possible.tp), false positives (fp, possible.fp, fp.status), and false negatives (fn, fn.status).
Toby Dylan Hocking
x <- seq(5, 85, by=5) peaks <- rbind( Peaks("chr2", x, x+3), Peaks("chr3", c(25, 38, 57), c(33, 54, 75)), Peaks("chr4", c(5, 32, 38, 65), c(15, 35, 55, 85)), Peaks("chr5", c(12, 26, 56, 75), c(16, 54, 59, 85))) regions.list <- list() for(chr in 1:5){ regions.list[[chr]] <- data.frame( chrom=paste0("chr", chr), chromStart=c(10, 30, 50, 70), chromEnd=c(20, 40, 60, 80), annotation=c("noPeaks", "peakStart", "peakEnd", "peaks")) } regions <- do.call(rbind, regions.list) err <- PeakError(peaks, regions) ann.colors <- c( noPeaks="#f6f4bf", peakStart="#ffafaf", peakEnd="#ff4c4c", peaks="#a445ee") if(require(ggplot2)){ ggplot()+ geom_rect(aes( xmin=chromStart+1/2, xmax=chromEnd+1/2, ymin=-1, ymax=1, fill=annotation, linetype=fn.status, size=fp.status), data=err, color="black")+ scale_y_continuous("", breaks=NULL)+ scale_linetype_manual( values=c("false negative"="dotted", correct="solid"))+ scale_size_manual( values=c("false positive"=3, correct=1))+ scale_fill_manual( values=ann.colors, breaks=names(ann.colors))+ facet_grid(chrom ~ .)+ theme_bw()+ guides( fill=guide_legend(order=1), linetype=guide_legend(order=2, override.aes=list(fill="white")), size=guide_legend(order=3, override.aes=list(fill="white")))+ theme(panel.margin=grid::unit(0, "cm"))+ geom_segment(aes( chromStart+1/2, 1/2, xend=chromEnd+1/2, yend=1/2), data=peaks, color="deepskyblue", size=2)+ scale_x_continuous( "position on chromosome", breaks=seq(10, 90, by=10))+ geom_text(aes( base, -1/2, label="N"), data.frame(base=10:90), color="deepskyblue") }
x <- seq(5, 85, by=5) peaks <- rbind( Peaks("chr2", x, x+3), Peaks("chr3", c(25, 38, 57), c(33, 54, 75)), Peaks("chr4", c(5, 32, 38, 65), c(15, 35, 55, 85)), Peaks("chr5", c(12, 26, 56, 75), c(16, 54, 59, 85))) regions.list <- list() for(chr in 1:5){ regions.list[[chr]] <- data.frame( chrom=paste0("chr", chr), chromStart=c(10, 30, 50, 70), chromEnd=c(20, 40, 60, 80), annotation=c("noPeaks", "peakStart", "peakEnd", "peaks")) } regions <- do.call(rbind, regions.list) err <- PeakError(peaks, regions) ann.colors <- c( noPeaks="#f6f4bf", peakStart="#ffafaf", peakEnd="#ff4c4c", peaks="#a445ee") if(require(ggplot2)){ ggplot()+ geom_rect(aes( xmin=chromStart+1/2, xmax=chromEnd+1/2, ymin=-1, ymax=1, fill=annotation, linetype=fn.status, size=fp.status), data=err, color="black")+ scale_y_continuous("", breaks=NULL)+ scale_linetype_manual( values=c("false negative"="dotted", correct="solid"))+ scale_size_manual( values=c("false positive"=3, correct=1))+ scale_fill_manual( values=ann.colors, breaks=names(ann.colors))+ facet_grid(chrom ~ .)+ theme_bw()+ guides( fill=guide_legend(order=1), linetype=guide_legend(order=2, override.aes=list(fill="white")), size=guide_legend(order=3, override.aes=list(fill="white")))+ theme(panel.margin=grid::unit(0, "cm"))+ geom_segment(aes( chromStart+1/2, 1/2, xend=chromEnd+1/2, yend=1/2), data=peaks, color="deepskyblue", size=2)+ scale_x_continuous( "position on chromosome", breaks=seq(10, 90, by=10))+ geom_text(aes( base, -1/2, label="N"), data.frame(base=10:90), color="deepskyblue") }
Compute the PeakError
assuming that peaks
and regions
are on the
same chrom.
PeakErrorChrom(peaks, regions)
PeakErrorChrom(peaks, regions)
peaks |
data.frame with columns chromStart, chromEnd. NOTE: chromStart should be 0-based and chromEnd should be 1-based. EXAMPLE: the first 100 base of of a chromosome are chromStart=0, chromEnd=100. The second 100 bases are chromStart=100, chromEnd=200. |
regions |
data.frame with columns chromStart, chromEnd. |
data.frame with 1 row for each region and error columns.
Toby Dylan Hocking
Make a data.frame that represents a list of peaks.
Peaks(chrom = factor(), base.before = integer(), last.base = integer())
Peaks(chrom = factor(), base.before = integer(), last.base = integer())
chrom |
character or factor with |
base.before |
integer, base before peak. |
last.base |
integer, last base of peak. |
data.frame with columns chrom
, chromStart, chromEnd.
Toby Dylan Hocking