Computes the absolute distance between the midpoint of each x interval and the midpoint of its closest y interval.

bed_absdist(x, y, genome)

Arguments

x

ivl_df

y

ivl_df

genome

genome_df

Value

ivl_df with .absdist and .absdist_scaled columns.

Details

Absolute distances are scaled by the inter-reference gap for the chromosome as follows. For Q query points and R reference points on a chromosome, the distance from each query point i to its closest reference point is divided by the mean inter-reference gap for that chromosome (the chromosome length divided by R). If an x interval has no matching y chromosome, .absdist is NA.

$$d_i(x,y) = \min_k\left(|q_i - r_k|\right)\frac{R}{\text{Length of chromosome}}$$

Both absolute and scaled distances are reported as .absdist and .absdist_scaled.
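
The scaling can be checked by hand on a toy chromosome. The following is a minimal base R sketch of the formula above, not valr's implementation: the helper name `absdist_sketch` and its arguments are hypothetical, and it assumes the midpoints for a single chromosome have already been computed.

```r
# Hypothetical helper (not part of valr): applies the scaling formula
# to midpoints that all lie on one chromosome.
absdist_sketch <- function(q, r, chrom_length) {
  # Distance from each query midpoint to its nearest reference midpoint
  absdist <- vapply(q, function(qi) min(abs(qi - r)), numeric(1))
  # Multiplying by R / chrom_length is the same as dividing by the
  # mean inter-reference gap (chrom_length / R)
  scaled <- absdist * length(r) / chrom_length
  data.frame(q = q, .absdist = absdist, .absdist_scaled = scaled)
}

# Toy data: 2 query midpoints, 3 reference midpoints, chromosome of length 1000
res <- absdist_sketch(q = c(150, 620), r = c(100, 500, 900), chrom_length = 1000)
print(res)
```

Here the query point at 150 is 50 bases from the nearest reference point (100), giving a scaled distance of 50 * 3 / 1000 = 0.15; the point at 620 is 120 bases from 500, giving 0.36.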

Interval statistics can be used in combination with dplyr::group_by() and dplyr::do() to calculate statistics for subsets of data. See vignette('interval-stats') for examples.

Examples

genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))

x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

bed_absdist(x, y, genome)
#> # A tibble: 1,000,000 x 5
#>    chrom start   end .absdist .absdist_scaled
#>    <chr> <int> <int>    <dbl>           <dbl>
#>  1 chr1    323  1323      302          0.0977
#>  2 chr1   2032  3032     2011          0.651
#>  3 chr1   2475  3475     2454          0.794
#>  4 chr1   2759  3759     2226          0.720
#>  5 chr1   2766  3766     2219          0.718
#>  6 chr1   3528  4528     1457          0.471
#>  7 chr1   8394  9394      207          0.0670
#>  8 chr1   8819  9819      218          0.0705
#>  9 chr1  12963 13963      788          0.255
#> 10 chr1  24939 25939      270          0.0873
#> # … with 999,990 more rows