Chapter 2 Introduction

scRNASequest is a semi-automated single-cell RNA-seq (scRNA-seq) data analysis workflow that provides five main functions:

  • Pre-processing of scRNA-seq UMI count matrices (preferably generated with Cell Ranger)

  • Applying multiple harmonization methods for batch correction

  • Reference-dataset-based cell type label transfer and embedding projection

  • Multi-sample multi-condition single-cell level differential gene expression analysis

  • Seamless integration with cellxgene VIP for visualization and with CellDepot for data hosting and sharing, by generating a compatible h5ad file

Users can run the workflow on a local laptop or submit jobs through SGE/Slurm schedulers on high-performance computing (HPC) clusters.

This pipeline contains the following programs:

  • scAnalyzer

    This is the main script in scRNASequest for single-cell data processing and analysis.

    scAnalyzer is an all-in-one program for downstream data analysis, including pre-processing, batch correction, label transfer, differential gene expression analysis and CellDepot integration. It also embeds a Bookdown report generator, scReport, which produces a user-friendly, well-structured quality control report after each run.

  • scDEG

    scDEG is a standalone pipeline for differential expression (DE) analysis using harmonized data. It is also embedded in the scAnalyzer pipeline.

  • scRef

    scRef generates a reference dataset for label transfer.

    This program needs to run only once per reference dataset; the resulting reference h5ad file is then stored at a permanent path.

  • sc2celldepot

    sc2celldepot offers an easy and fast way to convert your own RDS file, or h5 results plus cell annotation, for cellxgene VIP without running the full pipeline. If you have already run the full pipeline script (scAnalyzer), you will have the h5ad (UMI, embedding, etc.) and db (DE information) files needed for CellDepot publishing, and you won’t need to run sc2celldepot.

  • scTool

    scTool modifies h5ad files, for example by editing annotations or extracting meta information.

  • scRMambient

    scRMambient is a wrapper that removes ambient RNA using CellBender.

A detailed comparison of the above scripts can be found below:

Table 2.1: Pipeline scripts and their function

| Command | Description | Input | Output |
|---|---|---|---|
| scAnalyzer | Main program to perform full scRNA-seq data analysis with QC and data harmonization | A path to an analysis config file (config.yml) | Final analysis results in h5ad and h5seurat format |
| scDEG | Program to perform DEG analysis between two phenotypes within each cluster of an annotation (such as cell types) | A path to a DEG config file (config_DEG.yml) | One DEG table for each cluster and an SQLite db file of all comparisons |
| scRef | Program to create Seurat ‘Azimuth’ references | A path to a reference config file (config_ref.yml) | An RDS object with ‘Azimuth’ reference for scAnalyzer |
| sc2celldepot | Program to transfer analyzed data into h5ad for cellxgene VIP (CellDepot) loading | A path to a data config file (config_convert.yml) | An h5ad file |
| scTool | Tool to add, remove or export expression/annotation from an h5ad | A path to an h5ad file | A modified h5ad file or a csv file |
| scRMambient | Remove ambient RNA with CellBender | A path to a sample metadata file containing paths to raw (unfiltered) UMI counts along with a few CellBender parameters | CellBender-filtered UMI counts in h5 format |
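
The four config-driven programs in Table 2.1 each take a path to a YAML config file as input. The sketch below records the config file names from the table and assembles a command line for one program. It is only an illustration: `CONFIG_NAMES` and `build_command` are hypothetical helper names, and the sketch assumes that each program accepts its config path as a single positional argument, which should be checked against each program's own usage message.

```python
from pathlib import Path

# Config file names as listed in Table 2.1 (config-driven programs only).
CONFIG_NAMES = {
    "scAnalyzer": "config.yml",
    "scDEG": "config_DEG.yml",
    "scRef": "config_ref.yml",
    "sc2celldepot": "config_convert.yml",
}


def build_command(program: str, work_dir: str) -> list[str]:
    """Build an argument list for one config-driven program.

    ASSUMPTION: the program takes the config path as its only positional
    argument. This is not confirmed by the text above.
    """
    if program not in CONFIG_NAMES:
        raise ValueError(f"not a config-driven program in Table 2.1: {program}")
    config_path = Path(work_dir) / CONFIG_NAMES[program]
    # The list form can be passed to subprocess.run without shell quoting.
    return [program, str(config_path)]
```

For example, `build_command("scDEG", "my_project")` returns the program name followed by the path to `config_DEG.yml` inside the hypothetical `my_project` directory. scTool and scRMambient are left out because their inputs are an h5ad file and a sample metadata file rather than a YAML config.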