Skip to contents

Quantify the usage of different V(D)J segments for each cell cluster. The usage of two V(D)J segments can also be calculated for a single chain. For example, calc_gene_usage() can calculate the frequency that different heavy chain V and J segments appear together.

Usage

calc_gene_usage(
  input,
  data_cols,
  cluster_col = NULL,
  chain = NULL,
  chain_col = global$chain_col,
  prefix = paste0(data_cols[1], "_"),
  return_df = FALSE,
  sep = global$sep
)

Arguments

input

Object containing V(D)J data. If a data.frame is provided, the cell barcodes should be stored as row names.

data_cols

meta.data column(s) containing V(D)J genes identified for each clonotype. If multiple columns are provided, paired usage of genes will be calculated.

cluster_col

meta.data column containing cell clusters to use when calculating gene usage

chain

Chain(s) to use for calculating gene usage. Set to NULL to include all chains.

chain_col

meta.data column containing chains for each cell

prefix

Prefix to add to new columns

return_df

Return results as a data.frame. If FALSE, results will be added to the input object.

sep

Separator used for storing per cell V(D)J data

Value

data.frame containing gene usage summary

Examples

# Calculate V(D)J segment usage for all cells
calc_gene_usage(
  vdj_sce,
  data_cols = "v_gene"
)
#> class: SingleCellExperiment 
#> dim: 200 200 
#> metadata(0):
#> assays(1): counts
#> rownames(200): Gm48486 Trav19 ... Hkdc1 Wdr62
#> rowData names(0):
#> colnames(200): 1_AAGCCGCAGCTTATCG-1 1_AATCCAGCATTACGAC-1 ...
#>   2_TTCTACAAGGCCCTTG-1 2_TTTGTCATCCAAAGTC-1
#> colData names(51): orig.ident nCount_RNA ... v_gene_freq
#>   v_gene_pct
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

# Calculate gene usage separately for cell clusters
calc_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  cluster_col = "orig.ident"
)
#> class: SingleCellExperiment 
#> dim: 200 200 
#> metadata(0):
#> assays(1): counts
#> rownames(200): Gm48486 Trav19 ... Hkdc1 Wdr62
#> rowData names(0):
#> colnames(200): 1_AAGCCGCAGCTTATCG-1 1_AATCCAGCATTACGAC-1 ...
#>   2_TTCTACAAGGCCCTTG-1 2_TTTGTCATCCAAAGTC-1
#> colData names(52): orig.ident nCount_RNA ... v_gene_pct
#>   v_gene_shared
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

# Calculate gene usage for a specific chain(s)
calc_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  chain = c("IGK", "IGL")
)
#> class: SingleCellExperiment 
#> dim: 200 200 
#> metadata(0):
#> assays(1): counts
#> rownames(200): Gm48486 Trav19 ... Hkdc1 Wdr62
#> rowData names(0):
#> colnames(200): 1_AAGCCGCAGCTTATCG-1 1_AATCCAGCATTACGAC-1 ...
#>   2_TTCTACAAGGCCCTTG-1 2_TTTGTCATCCAAAGTC-1
#> colData names(51): orig.ident nCount_RNA ... v_gene_freq
#>   v_gene_pct
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

# Calculate paired usage of V(D)J segments
calc_gene_usage(
  vdj_sce,
  data_cols = c("v_gene", "j_gene"),
)
#> # A tibble: 41 × 5
#>    v_gene     j_gene n_cells  freq   pct
#>    <chr>      <chr>    <int> <int> <dbl>
#>  1 IGKV5-43   IGKJ2       67    36 53.7 
#>  2 IGKV6-23   IGKJ5       67     3  4.48
#>  3 IGLV1      IGLJ1       67     3  4.48
#>  4 IGKV8-24   IGKJ5       67     2  2.99
#>  5 IGKV10-96  IGKJ1       67     2  2.99
#>  6 IGKV17-127 IGKJ5       67     2  2.99
#>  7 IGHV5-17   IGHJ4       67     1  1.49
#>  8 IGKV4-72   IGKJ2       67     1  1.49
#>  9 IGKV4-54   IGKJ5       67     1  1.49
#> 10 IGHV5-4    IGHJ4       67     1  1.49
#> # ℹ 31 more rows