Main function to compare scRNA-seq data to gene lists.
Usage
clustify_lists(input, ...)
# Default S3 method
clustify_lists(
input,
marker,
marker_inmatrix = TRUE,
metadata = NULL,
cluster_col = NULL,
if_log = TRUE,
per_cell = FALSE,
topn = 800,
cut = 0,
genome_n = 30000,
metric = "hyper",
output_high = TRUE,
lookuptable = NULL,
obj_out = TRUE,
seurat_out = obj_out,
vec_out = FALSE,
rename_prefix = NULL,
threshold = 0,
low_threshold_cell = 0,
verbose = TRUE,
input_markers = FALSE,
details_out = FALSE,
...
)
# S3 method for class 'Seurat'
clustify_lists(
input,
metadata = NULL,
cluster_col = NULL,
if_log = TRUE,
per_cell = FALSE,
topn = 800,
cut = 0,
marker,
marker_inmatrix = TRUE,
genome_n = 30000,
metric = "hyper",
output_high = TRUE,
dr = "umap",
obj_out = TRUE,
seurat_out = obj_out,
vec_out = FALSE,
threshold = 0,
rename_prefix = NULL,
verbose = TRUE,
details_out = FALSE,
...
)
# S3 method for class 'SingleCellExperiment'
clustify_lists(
input,
metadata = NULL,
cluster_col = NULL,
if_log = TRUE,
per_cell = FALSE,
topn = 800,
cut = 0,
marker,
marker_inmatrix = TRUE,
genome_n = 30000,
metric = "hyper",
output_high = TRUE,
dr = "umap",
obj_out = TRUE,
seurat_out = obj_out,
vec_out = FALSE,
threshold = 0,
rename_prefix = NULL,
verbose = TRUE,
details_out = FALSE,
...
)
Arguments
- input
single-cell expression matrix, Seurat object, or SingleCellExperiment
- ...
passed to matrixize_markers
- marker
matrix or dataframe of candidate genes for each cluster
- marker_inmatrix
whether marker genes are already in preprocessed matrix form
- metadata
cell cluster assignments, supplied as a vector or data.frame. If data.frame is supplied then cluster_col needs to be set. Not required if running correlation per cell.
- cluster_col
column in metadata with cluster number
- if_log
whether input data is natural-log transformed; if so, averaging will be done on unlogged data
- per_cell
compare per cell or per cluster
- topn
number of top expressing genes to keep from input matrix
- cut
expression cut off from input matrix
- genome_n
number of genes in the genome
- metric
similarity metric to compute: "hyper" for the adjusted p-value from a hypergeometric test, or "jaccard" for the Jaccard index
- output_high
if TRUE (the default, for consistency with the rest of the package), -log10 transform the p-value so that higher values indicate stronger enrichment
- lookuptable
if not supplied, will look in built-in table for object parsing
- obj_out
whether to output object instead of cor matrix
- seurat_out
output cor matrix or called seurat object (deprecated, use obj_out instead)
- vec_out
only output a result vector in the same order as metadata
- rename_prefix
prefix to add to type and r column names
- threshold
minimum correlation score threshold for identity calling, only used when obj_out = TRUE
- low_threshold_cell
option to remove clusters with too few cells
- verbose
whether to report certain variables chosen and steps
- input_markers
whether input is a marker data.frame of 0s and 1s (output of pos_neg_marker); if so, an alternate enrichment mode is used
- details_out
whether to also output shared gene list from jaccard
- dr
stored dimension reduction
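To make the topn, cut, genome_n, metric, and output_high arguments more concrete, the following base R sketch shows the kind of comparison they describe for a single cluster and a single marker list. It is an illustration under stated assumptions, not the package's implementation: the gene names and expression values are made up, the variable names (avg_expr, cut_off, and so on) are hypothetical, the cut-off is assumed to be a strict "greater than", and the p-value is left unadjusted.

```r
# Hypothetical average expression for one cluster (made-up values)
avg_expr <- c(CD3D = 5.1, CD3E = 4.2, IL7R = 3.9, CCR7 = 2.5,
              LTB = 2.2, MS4A1 = 0.1)
marker_genes <- c("CD3D", "IL7R", "CD8A", "GZMK")  # one candidate gene list

topn     <- 5      # keep the top expressing genes (topn argument)
cut_off  <- 0      # expression cut-off (cut argument); strict ">" is assumed
genome_n <- 30000  # universe size, mirroring the genome_n default

# Keep genes above the cut-off, then take the topn highest
kept <- avg_expr[avg_expr > cut_off]
cluster_genes <- names(sort(kept, decreasing = TRUE))[
  seq_len(min(topn, length(kept)))
]

# Jaccard index: shared genes divided by genes present in either set
shared  <- intersect(cluster_genes, marker_genes)
jaccard <- length(shared) / length(union(cluster_genes, marker_genes))

# Hypergeometric test: probability of at least length(shared) overlapping
# genes when drawing length(cluster_genes) genes from the universe
p_hyper <- phyper(
  q = length(shared) - 1,
  m = length(marker_genes),
  n = genome_n - length(marker_genes),
  k = length(cluster_genes),
  lower.tail = FALSE
)

# -log10 transform so that larger values mean stronger enrichment,
# which is the idea behind output_high = TRUE
score_high <- -log10(p_hyper)
```

With several clusters and marker lists, the same calculation would be repeated for every pair, and the p-values could then be adjusted for multiple testing, for example with p.adjust().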
Value
matrix of numeric values, clusters from input as row names, cell types from marker as column names
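A returned score matrix can be turned into one call per cluster by taking the best-scoring column in each row. The sketch below uses base R only and a small hypothetical matrix; the threshold value, the "unassigned" label, and the use of ">=" are assumptions made for illustration, and the package's own identity calling may differ in its details.

```r
# Hypothetical score matrix: clusters in rows, cell types in columns
scores <- matrix(
  c(0.80, 0.10, 0.05,
    0.20, 0.70, 0.10,
    0.02, 0.03, 0.01),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("0", "1", "2"), c("T cell", "B cell", "Monocyte"))
)
threshold <- 0.1  # illustrative minimum score for accepting a call

# Column index of the highest score in each row
best_idx <- max.col(scores, ties.method = "first")

# The score at that position for each row
best_score <- scores[cbind(seq_len(nrow(scores)), best_idx)]

# Keep the call only when the best score reaches the threshold
calls <- ifelse(best_score >= threshold,
                colnames(scores)[best_idx],
                "unassigned")
names(calls) <- rownames(scores)
calls
```

Here cluster "2" has no score at or above the threshold, so it is left unassigned rather than forced into the nearest cell type.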
Examples
# Annotate a matrix and metadata
# Annotate using a different method
clustify_lists(
input = pbmc_matrix_small,
marker = cbmc_m,
metadata = pbmc_meta,
cluster_col = "classified",
verbose = TRUE,
metric = "jaccard"
)
#> list of markers instead of matrix, only supports jaccard
#> similarity computation completed, matrix of 9 x 13, preparing output
#>                    CD4 T       CD8 T Memory CD4 T  CD14+ Mono Naive CD4 T
#> Naive CD4 T  0.001246883 0.001246883            0 0.003750000           0
#> Memory CD4 T 0.001246883 0.000000000            0 0.003750000           0
#> CD14+ Mono   0.000000000 0.000000000            0 0.003750000           0
#> B            0.000000000 0.000000000            0 0.003750000           0
#> CD8 T        0.001246883 0.000000000            0 0.002496879           0
#> FCGR3A+ Mono 0.000000000 0.000000000            0 0.003750000           0
#> NK           0.000000000 0.001246883            0 0.003750000           0
#> DC           0.000000000 0.000000000            0 0.003750000           0
#> Platelet     0.001166861 0.000000000            0 0.003508772           0
#>                       NK           B  CD16+ Mono CD34+ Eryth          Mk
#> Naive CD4 T  0.003750000 0.000000000 0.000000000     0     0 0.000000000
#> Memory CD4 T 0.003750000 0.000000000 0.000000000     0     0 0.000000000
#> CD14+ Mono   0.002496879 0.000000000 0.000000000     0     0 0.000000000
#> B            0.002496879 0.002496879 0.000000000     0     0 0.000000000
#> CD8 T        0.003750000 0.000000000 0.000000000     0     0 0.000000000
#> FCGR3A+ Mono 0.001246883 0.000000000 0.001246883     0     0 0.000000000
#> NK           0.003750000 0.000000000 0.000000000     0     0 0.000000000
#> DC           0.002496879 0.001246883 0.000000000     0     0 0.000000000
#> Platelet     0.002336449 0.002336449 0.000000000     0     0 0.003508772
#>                       DC        pDCs
#> Naive CD4 T  0.000000000 0.000000000
#> Memory CD4 T 0.000000000 0.000000000
#> CD14+ Mono   0.000000000 0.000000000
#> B            0.000000000 0.000000000
#> CD8 T        0.000000000 0.000000000
#> FCGR3A+ Mono 0.000000000 0.000000000
#> NK           0.000000000 0.000000000
#> DC           0.001246883 0.002496879
#> Platelet     0.000000000 0.000000000