Projection test for query interval overlap.

bed_projection(x, y, genome, by_chrom = FALSE)

Arguments

x

ivl_df

y

ivl_df

genome

genome_df

by_chrom

if TRUE, compute the test separately for each chromosome

Value

ivl_df with the following columns:

  • chrom the name of the chromosome tested if by_chrom = TRUE; otherwise whole_genome

  • p.value p-value from a binomial test. p-values greater than 0.5 are reported as 1 - p-value, and lower_tail is set to FALSE

  • obs_exp_ratio ratio of observed to expected overlap frequency

  • lower_tail TRUE indicates that the observed overlaps are in the lower tail of the distribution (i.e., less overlap than expected). FALSE indicates that the observed overlaps are in the upper tail of the distribution (i.e., more overlap than expected)
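
The relationship between p.value, obs_exp_ratio and lower_tail can be sketched in base R. This is a minimal illustration and not the package's own code: the counts and the expected overlap probability below are hypothetical values chosen for the example, and the variable names (n_query, n_overlap, exp_prob) are assumptions that do not come from valr.

```r
# Hypothetical inputs, chosen only for illustration
n_query   <- 1000   # number of query intervals tested
n_overlap <- 130    # number of query intervals that overlap the reference
exp_prob  <- 0.10   # assumed probability of an overlap occurring by chance

# Lower-tail binomial probability of observing at most n_overlap overlaps
p_lower <- pbinom(n_overlap, size = n_query, prob = exp_prob)

# Values above 0.5 are folded into the upper tail, as described for p.value
lower_tail <- p_lower <= 0.5
p_value <- if (lower_tail) p_lower else 1 - p_lower

# Ratio of the observed overlap frequency to the expected frequency
obs_exp_ratio <- (n_overlap / n_query) / exp_prob
```

With these made-up numbers the observed overlap frequency (0.13) exceeds the expected one (0.10), so the ratio is above 1 and lower_tail is FALSE.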

Details

Interval statistics can be used in combination with dplyr::group_by() and dplyr::do() to calculate statistics for subsets of data. See vignette('interval-stats') for examples.
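
As a rough sketch of the grouped usage described above, the projection test can be run once per subset of the query intervals. The batch column here is a hypothetical grouping variable added only for illustration; it is not produced by bed_random(). The sketch assumes the valr and dplyr packages are installed.

```r
library(valr)
library(dplyr)

genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))
x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

# Hypothetical grouping column: alternate the query intervals between two batches
x$batch <- rep(c('a', 'b'), length.out = nrow(x))

# Run the projection test separately for each batch of query intervals
res <- x %>%
  group_by(batch) %>%
  do(bed_projection(., y, genome))
```

The result should contain one row per batch, with the grouping column followed by the columns listed under Value.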

Examples

genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))
x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

bed_projection(x, y, genome)
#> # A tibble: 1 x 4
#>   chrom        p.value obs_exp_ratio lower_tail
#>   <chr>          <dbl>         <dbl> <chr>
#> 1 whole_genome  0.0714          1.00 FALSE

bed_projection(x, y, genome, by_chrom = TRUE)
#> # A tibble: 25 x 4
#>    chrom p.value obs_exp_ratio lower_tail
#>    <chr>   <dbl>         <dbl> <chr>
#>  1 chr1   0.302          1.00  FALSE
#>  2 chr10  0.305          1.00  FALSE
#>  3 chr11  0.306          0.996 TRUE
#>  4 chr12  0.0314         1.01  FALSE
#>  5 chr13  0.449          1.00  FALSE
#>  6 chr14  0.125          1.01  FALSE
#>  7 chr15  0.317          1.00  FALSE
#>  8 chr16  0.248          1.01  FALSE
#>  9 chr17  0.381          1.00  FALSE
#> 10 chr18  0.190          0.991 TRUE
#> # … with 15 more rows