Projection test for query interval overlap.

Usage

bed_projection(x, y, genome, by_chrom = FALSE)

Arguments

x

ivl_df

y

ivl_df

genome

genome_df

by_chrom

if TRUE, compute the test separately for each chromosome

Value

ivl_df with the following columns:

  • chrom the name of the chromosome tested if by_chrom = TRUE; otherwise the value is whole_genome

  • p.value p-value from a binomial test. When the p-value is greater than 0.5, it is reported as 1 - p-value and lower_tail is set to FALSE

  • obs_exp_ratio ratio of observed to expected overlap frequency

  • lower_tail TRUE indicates that the observed overlaps are in the lower tail of the distribution (i.e., less overlap than expected). FALSE indicates that the observed overlaps are in the upper tail of the distribution (i.e., more overlap than expected)
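
The relationship between p.value, obs_exp_ratio and lower_tail described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the counts and the expected overlap probability (n_query, n_overlap, p_expected) are hypothetical values chosen for the example, and the use of pbinom() is an assumption about how a binomial tail probability might be computed.

```r
# Hypothetical inputs, chosen only for illustration (not taken from valr).
n_query    <- 1000   # number of query intervals tested
n_overlap  <- 120    # query intervals that overlap the reference set
p_expected <- 0.10   # assumed probability of overlap under the null

# Cumulative binomial probability of observing n_overlap or fewer overlaps.
p_lower <- pbinom(n_overlap, size = n_query, prob = p_expected)

# Fold the p-value as described in the Value section: values above 0.5
# are reported as 1 - p and flagged as falling in the upper tail.
lower_tail <- p_lower <= 0.5
p_value    <- if (lower_tail) p_lower else 1 - p_lower

# Ratio of the observed overlap frequency to the expected frequency.
obs_exp_ratio <- (n_overlap / n_query) / p_expected

print(data.frame(p.value = p_value, obs_exp_ratio, lower_tail))
```

With these made-up numbers the observed overlap exceeds the expectation, so obs_exp_ratio is above 1 and lower_tail is FALSE, mirroring how a row with more overlap than expected would read in the output of bed_projection().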

Details

Interval statistics can be used in combination with dplyr::group_by() and dplyr::do() to calculate statistics for subsets of data. See vignette('interval-stats') for examples.

Examples

genome <- read_genome(valr_example("hg19.chrom.sizes.gz"))

x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

bed_projection(x, y, genome)
#> # A tibble: 1 × 4
#>   chrom         p.value obs_exp_ratio lower_tail
#>   <chr>           <dbl>         <dbl> <chr>     
#> 1 whole_genome 0.000850          1.01 FALSE     

bed_projection(x, y, genome, by_chrom = TRUE)
#> # A tibble: 25 × 4
#>    chrom p.value obs_exp_ratio lower_tail
#>    <chr>   <dbl>         <dbl> <chr>     
#>  1 chr1   0.176          1.01  FALSE     
#>  2 chr10  0.0209         1.02  FALSE     
#>  3 chr11  0.118          1.01  FALSE     
#>  4 chr12  0.479          1.00  FALSE     
#>  5 chr13  0.245          1.01  FALSE     
#>  6 chr14  0.120          0.990 TRUE      
#>  7 chr15  0.333          1.00  FALSE     
#>  8 chr16  0.213          1.01  FALSE     
#>  9 chr17  0.181          0.991 TRUE      
#> 10 chr18  0.0306         1.02  FALSE     
#> # ℹ 15 more rows