Skip to contents

Plot the usage of different V(D)J segments for each cell cluster. The usage of two V(D)J segments can also be plotted for a single chain.

Usage

plot_gene_usage(
  input,
  data_cols,
  cluster_col = NULL,
  group_col = NULL,
  method = NULL,
  units = "percent",
  genes = 20,
  return_list = FALSE,
  plot_colors = NULL,
  plot_lvls = NULL,
  trans = "identity",
  rotate_labels = FALSE,
  panel_nrow = NULL,
  show_points = TRUE,
  show_zeros = TRUE,
  n_label = NULL,
  p_label = c(value = 0.05),
  p_method = NULL,
  p_file = NULL,
  label_params = list(),
  ...,
  chain = NULL,
  chain_col = global$chain_col,
  sep = global$sep
)

Arguments

input

Object containing V(D)J data. If a data.frame is provided, the cell barcodes should be stored as row names.

data_cols

meta.data column containing genes for each clonotype, provide a vector with two column names to plot paired usage of genes

cluster_col

meta.data column containing cell clusters to use for calculating gene usage

group_col

meta.data column to use for grouping cluster IDs present in cluster_col. This is useful when there are multiple replicates or patients for each treatment condition.

method

Method to use for plotting, possible values are:

  • 'bar', create a bargraph, this is the default when a single column is passed to the data_cols argument

  • 'boxplot', create boxplots, this can only be used when group_col is provided

  • 'heatmap', create a heatmap, this is the default when two columns are passed to the data_cols argument

  • 'circos', create a circos plot, this requires two columns to be provided to the data_cols argument

units

Units to plot on the y-axis, either 'frequency' or 'percent'

genes

An integer specifying the number of genes to plot, or a vector giving the names of genes to include.

return_list

Should a list of plots be returned, if FALSE plots will be combined and arranged into panels

plot_colors

Character vector containing colors to use for plot. If a bar graph is created this will specify how to color cell clusters. For a heatmap, these colors will be used to generate the color gradient.

plot_lvls

Levels to use for ordering clusters

trans

Transformation to use when plotting segment usage, e.g. 'log10'. By default values are not transformed, refer to ggplot2::continuous_scale() for more options.

rotate_labels

Should labels on circos plot be rotated to reduce overlapping text

panel_nrow

The number of rows to use for arranging plots when return_list is FALSE

show_points

If TRUE data points will be shown on boxplots, the point size can be adjusted using the point.size parameter

show_zeros

If TRUE cell labels that are missing from a cluster will still be shown on the plot

n_label

Location on plot where n label should be added, this is only applicable when method is 'bar' and can be any combination of the following:

  • 'corner', display the total number of cells plotted in the top right corner, the position of the label can be modified by passing x and y specifications with the label_params argument

  • 'legend', display the number of cells plotted for each group shown in the plot legend

  • 'none', do not display the number of cells plotted

p_label

Specification indicating how p-values should be labeled on plot, this can one of the following:

  • 'none', do not display p-values

  • 'all', show p-values for all groups

  • A named vector providing p-value cutoffs and labels to display, e.g. c('*' = 0.05, '**' = 0.01, '***' = 0.001). The keyword 'value' can be used to display the p-value for those less than a certain cutoff, e.g. c(value = 0.05, ns = Inf) will show significant p-values, all others will be labeled 'ns'.

p_method

Method to use for calculating p-values. By default when comparing two groups a t-test will be performed, when comparing more than two groups the Kruskal-Wallis test will be used. p-values are adjusted for multiple testing using Bonferroni correction. Possible methods include:

p_file

File path to save table containing p-values for each comparison.

label_params

Named list providing additional parameters to modify n label aesthetics, e.g. list(size = 4, color = "red")

...

Additional arguments to pass to plotting function, ggplot2::geom_col() for bargraph, ggplot2::geom_tile() for heatmap, circlize::chordDiagram() for circos plot

chain

Chain to use for calculating gene usage, set to NULL to include all chains

chain_col

meta.data column containing chains for each cell

sep

Separator used for storing per-chain V(D)J data for each cell

Value

ggplot object

Examples

# Plot V(D)J segment usage for all cells
plot_gene_usage(
  vdj_sce,
  data_cols = "v_gene"
)


# Plot gene usage separately for cell clusters
plot_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  cluster_col = "orig.ident"
)


# Plot gene usage for a specific chain
plot_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  chain = c("IGH", "IGK")
)


# Plot paired usage of V(D)J segments
plot_gene_usage(
  vdj_sce,
  data_cols = c("v_gene", "j_gene"),
  type = "circos"
)
#> Warning: Ignoring unknown parameters: `type`

# Specify colors to use for each cell cluster
plot_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  cluster_col = "orig.ident",
  plot_colors = c(avid_2 = "blue", avid_1 = "green")
)


# Specify order to use for plotting cell clusters
plot_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  cluster_col = "orig.ident",
  plot_lvls = c("avid_2", "avid_1")
)


# Specify certain V(D)J genes to include in plot
plot_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  vdj_genes = c("IGKV5-43", "IGLV1", "IGHV1-64")
)
#> Warning: Ignoring unknown parameters: `vdj_genes`


# Specify the number of top V(D)J genes to include in plot
plot_gene_usage(
  vdj_sce,
  data_cols = "v_gene",
  genes = 10
)