Title: | Unbiased Single-Cell Transcriptomic Data Cell Type Identification |
---|---|
Description: | Assignment of cell type labels to single-cell RNA sequencing (scRNA-seq) clusters is often a time-consuming process that involves manual inspection of the cluster marker genes complemented with a detailed literature search. This is especially challenging when unexpected or poorly described populations are present. The clustermole R package provides methods to query thousands of human and mouse cell identity markers sourced from a variety of databases. |
Authors: | Igor Dolgalev [aut, cre] |
Maintainer: | Igor Dolgalev <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.1.9000 |
Built: | 2024-11-03 05:09:18 UTC |
Source: | https://github.com/igordot/clustermole |
Perform enrichment of cell type signatures based on the full gene expression matrix.
clustermole_enrichment(expr_mat, species, method = "gsva")
expr_mat |
Expression matrix (logCPMs, logFPKMs, or logTPMs) with genes as rows and clusters/populations/samples as columns. |
species |
Species: "hs" for human or "mm" for mouse. |
method |
Enrichment method: |
A data frame of enrichment results.
Barbie, D., Tamayo, P., Boehm, J. et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 462, 108–112 (2009). doi:10.1038/nature08460
Hänzelmann, S., Castelo, R. & Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-Seq data. BMC Bioinformatics 14, 7 (2013). doi:10.1186/1471-2105-14-7
Foroutan, M., Bhuva, D.D., Lyu, R. et al. Single sample scoring of molecular phenotypes. BMC Bioinformatics 19, 404 (2018). doi:10.1186/s12859-018-2435-4
# my_enrichment <- clustermole_enrichment(expr_mat = my_expr_mat, species = "hs")
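The commented example above assumes that `my_expr_mat` already exists. As a minimal sketch of one way to build such a matrix (genes as rows, clusters as columns, log-scaled CPMs), the following uses base R only. The counts, gene selection, and cluster labels are made up for illustration, and the pseudocount of 1 is an assumption rather than a package requirement.

```r
# Toy raw counts: 4 genes (rows) x 6 cells (columns); values are simulated
set.seed(1)
counts <- matrix(
  rpois(24, lambda = 5),
  nrow = 4,
  dimnames = list(c("CD3D", "CD3E", "MS4A1", "CD79A"), paste0("cell", 1:6))
)

# Hypothetical cluster assignment for each cell
clusters <- c("c1", "c1", "c1", "c2", "c2", "c2")

# Sum counts per cluster; rowsum() groups rows, so transpose before and after
cluster_counts <- t(rowsum(t(counts), group = clusters))

# Counts per million within each cluster, then log2 with a pseudocount of 1
cpm <- t(t(cluster_counts) / colSums(cluster_counts)) * 1e6
my_expr_mat <- log2(cpm + 1)

dim(my_expr_mat) # genes x clusters

# With the package installed, the matrix could then be passed on:
# my_enrichment <- clustermole_enrichment(expr_mat = my_expr_mat, species = "hs")
```

A real analysis would use the full gene set rather than four genes, since enrichment methods rank or compare genes across the whole matrix.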
Retrieve the full list of cell type markers in the clustermole database.
clustermole_markers(species = c("hs", "mm"))
species |
Species: "hs" for human or "mm" for mouse. |
A data frame of cell type markers (one gene per row).
markers <- clustermole_markers()
head(markers)
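A one-gene-per-row table is often easier to use with other tools once it is grouped into a named list of gene sets. The sketch below shows that reshaping on a toy data frame. The column names `celltype` and `gene` are assumptions chosen for illustration and may not match the columns that `clustermole_markers()` returns.

```r
# Toy stand-in for a one-gene-per-row marker table; the column names
# `celltype` and `gene` are assumptions, not the package's documented schema
toy_markers <- data.frame(
  celltype = c("T cell", "T cell", "B cell", "B cell", "B cell"),
  gene = c("CD3D", "CD3E", "MS4A1", "CD79A", "CD79B"),
  stringsAsFactors = FALSE
)

# Group genes by cell type to obtain a named list of gene sets
gene_sets <- split(toy_markers$gene, toy_markers$celltype)

# Number of genes in each signature
lengths(gene_sets)
```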
Perform overrepresentation analysis for a set of genes compared to all cell type signatures.
clustermole_overlaps(genes, species)
genes |
A vector of genes. |
species |
Species: "hs" for human or "mm" for mouse. |
A data frame of enrichment results with hypergeometric test p-values.
my_genes <- c("CD2", "CD3D", "CD3E", "CD3G", "TRAC", "TRBC2", "LTB")
my_overlaps <- clustermole_overlaps(genes = my_genes, species = "hs")
head(my_overlaps)
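To make the hypergeometric p-value more concrete, here is a minimal base R sketch of an overrepresentation test for a single signature. It is not the package's implementation. The signature genes and the background size of 20,000 are hypothetical, and the sketch assumes every query and signature gene belongs to the background.

```r
# Query genes (same as the example above)
query <- c("CD2", "CD3D", "CD3E", "CD3G", "TRAC", "TRBC2", "LTB")

# Hypothetical signature, not taken from the clustermole database
signature <- c("CD2", "CD3D", "CD3E", "CD3G", "CD5", "CD7", "IL7R", "TRAC")

# Assumed number of background genes
universe_size <- 20000

k <- length(intersect(query, signature)) # genes shared by query and signature
m <- length(signature)                   # background genes in the signature
n <- universe_size - m                   # background genes not in the signature
N <- length(query)                       # genes drawn (query size)

# Probability of an overlap of at least k genes under random sampling
p_value <- phyper(k - 1, m, n, N, lower.tail = FALSE)
p_value
```

Passing `k - 1` with `lower.tail = FALSE` gives P(X >= k), the upper tail that includes the observed overlap.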
Read a GMT file into a data frame.
read_gmt(file, geneset_label = "celltype", gene_label = "gene")
file |
A connection object or a character string (can be a URL). |
geneset_label |
Column name for gene sets (first column of the GMT file) in the output data frame. |
gene_label |
Column name for genes (variable columns of the GMT file) in the output data frame. |
A data frame with gene sets as the first column and genes as the second column (one gene per row).
gmt <- "http://software.broadinstitute.org/gsea/msigdb/supplemental/scsig.all.v1.0.symbols.gmt"
gmt_tbl <- read_gmt(gmt)
head(gmt_tbl)
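For readers unfamiliar with the GMT layout, the sketch below writes a tiny tab-separated GMT file (set name, description, then a variable number of genes per line) and reshapes it into a long table in base R. It only illustrates the format and the shape of the output. It is not how `read_gmt()` is implemented, and the gene sets are made up.

```r
# Write a tiny GMT file: set name, description, then genes, tab-separated
gmt_file <- tempfile(fileext = ".gmt")
writeLines(
  c(
    "T_CELL\tdescription\tCD3D\tCD3E\tCD2",
    "B_CELL\tdescription\tMS4A1\tCD79A"
  ),
  gmt_file
)

# Split each line on tabs
fields <- strsplit(readLines(gmt_file), "\t", fixed = TRUE)

# Keep the set name, drop the description field, and emit one row per gene
gmt_long <- do.call(rbind, lapply(fields, function(x) {
  data.frame(celltype = x[1], gene = x[-(1:2)], stringsAsFactors = FALSE)
}))

gmt_long
```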