Chapter 2 Getting started with CellDepot

2.1 Sources of annotation and metadata

The original metadata of each scRNA-seq dataset is retrieved from its h5ad file, the preferred on-disk format for storing and sharing an anndata object. When importing a dataset into the system, the user enters additional metadata, as shown in (4.6). Both sets of metadata are stored in a MySQL database table that is presented at http://celldepot.bxgenomics.com and on the Biogen internal instance, http://go.biogen.com/CellDepot.

2.2 Data format, availability, and preparation

CellDepot requires scRNA-seq data in an h5ad file in which the expression matrix is stored in CSC (compressed sparse column) rather than CSR (compressed sparse row) format, which speeds up data retrieval. For example, with genes stored as columns in the h5ad file, the interactive plot is created five times faster than with genes stored as rows. We provide sample scripts to help users generate h5ad files. Given a gene expression matrix, metadata, and layout files, users can combine and convert their data to an h5ad file by following the R script at https://github.com/interactivereport/CellDepot/blob/main/toH5ad.R. If no layout file is available, users can instead create an h5ad file by following the Jupyter notebook https://github.com/interactivereport/CellDepot/blob/main/raw2h5ad.ipynb with a custom Python script tailored to their own data. Categorical features extracted from an h5ad file are shown in the ‘annotation groups’ column of the table on the CellDepot home page, while numerical features are shown as histograms in the rightmost panel of cellxgene VIP (4.4.2).
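To illustrate why the CSC layout suits per-gene retrieval, the following is a minimal pure-Python sketch of the format, not CellDepot's own code. The helper names dense_to_csc and csc_column and the toy matrix are hypothetical and exist only for this illustration. It assumes the usual anndata orientation, with cells as rows and genes as columns.

```python
from typing import List, Sequence, Tuple


def dense_to_csc(
    matrix: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense cells-by-genes matrix to CSC arrays (data, indices, indptr)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    data: List[float] = []
    indices: List[int] = []
    indptr: List[int] = [0]
    for col in range(n_cols):  # walk the matrix gene by gene
        for row in range(n_rows):
            value = matrix[row][col]
            if value != 0:
                data.append(value)
                indices.append(row)  # cell (row) index of this nonzero value
        indptr.append(len(data))  # offset at which the next column starts
    return data, indices, indptr


def csc_column(
    data: Sequence[float],
    indices: Sequence[int],
    indptr: Sequence[int],
    col: int,
    n_rows: int,
) -> List[float]:
    """Return one dense column (one gene across all cells) from CSC arrays."""
    out: List[float] = [0] * n_rows
    # All nonzeros of a column sit in one contiguous slice, so no scan is needed.
    for k in range(indptr[col], indptr[col + 1]):
        out[indices[k]] = data[k]
    return out


# Toy expression matrix: 3 cells (rows) x 4 genes (columns).
counts = [
    [0, 2, 0, 0],
    [1, 0, 0, 3],
    [0, 4, 0, 0],
]
data, indices, indptr = dense_to_csc(counts)
gene_1 = csc_column(data, indices, indptr, col=1, n_rows=len(counts))
```

In CSC, the values of one gene occupy a single contiguous slice, whereas in CSR every cell's row has to be searched to collect them. In practice the conversion is usually done with a library rather than by hand, for example by calling tocsc() on a SciPy sparse matrix before the anndata object is written to an h5ad file.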

2.3 CellDepot platform and installation

The public version of the CellDepot web portal is hosted at http://celldepot.bxgenomics.com, and the Biogen internal instance at http://go.biogen.com/CellDepot. It is implemented with a MySQL database, an advanced search engine, and interactive visualization tools that allow users to explore dataset attributes as well as scRNA-seq analysis results. Users can select single-cell RNA-seq datasets for cross-dataset comparison on the web interface, either by browsing the online dataset table or by applying an advanced search. CellDepot also provides comprehensive data analysis tools via an embedded interactive visualization plugin. To host private datasets, a local instance of CellDepot can be installed on a Unix server by following the guide at https://celldepot.bxgenomics.com/celldepot_manual/install_environment.php.

2.4 How to set up a cron job?

The following cron job entry is needed to convert h5ad files to CSC format in the background:

@hourly <user-name> cd /var/www/html/celldepot/app/core; php ./api_toCSCh5ad.php

Note: Please make sure that this user has write permission for the data directory.
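One way to verify the write permission before enabling the cron job is a short check like the sketch below. It is only an illustration: the function name is_writable_dir is hypothetical, the location of the data directory depends on the local installation and is not specified here, and the check is only meaningful when run as the same user named in the cron entry.

```python
import os
import tempfile


def is_writable_dir(path: str) -> bool:
    """Return True if path is an existing directory the current user can write to."""
    # W_OK allows creating files; X_OK is needed to enter the directory.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


# Demonstration on a throwaway directory; substitute the real data directory
# of the local CellDepot installation when running this as the cron user.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_ok = is_writable_dir(demo_dir)

# The temporary directory has been removed by now, so this path does not exist.
missing_ok = is_writable_dir(os.path.join(demo_dir, "does-not-exist"))
```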

2.5 CellDepot API (Application Programming Interface)

The CellDepot API web service provides a direct way to generate figures that users can share or embed in a web page. For example, the following URL generates a violin plot of IRAK4 gene expression across cell clusters for the dataset with ID 1: https://celldepot.bxgenomics.com/celldepot/app/core/api_gene_plot.php?ID=1&Genes=IRAK4&Plot_Type=violin&Subsampling=0&n=0&g=0&Project_Group=CLUSTER. The complete URL format and an explanation of the parameters are given in the online documentation, https://celldepot.bxgenomics.com/celldepot_manual/api_gene_plot.php.
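Such URLs can also be assembled programmatically. The sketch below rebuilds the example URL above from its parameters using only the Python standard library. The function name gene_plot_url and its default values are hypothetical and simply mirror the example. The meaning of Subsampling, n, and g is not described here, so they are passed through unchanged; consult the online documentation for their semantics.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in this section.
API_GENE_PLOT = "https://celldepot.bxgenomics.com/celldepot/app/core/api_gene_plot.php"


def gene_plot_url(
    dataset_id: int,
    genes: str,
    plot_type: str = "violin",
    subsampling: int = 0,
    n: int = 0,
    g: int = 0,
    project_group: str = "CLUSTER",
) -> str:
    """Build a gene plot URL; parameter names follow the example URL."""
    params = {
        "ID": dataset_id,
        "Genes": genes,
        "Plot_Type": plot_type,
        "Subsampling": subsampling,
        "n": n,
        "g": g,
        "Project_Group": project_group,
    }
    # urlencode keeps the dict order and percent-encodes values where needed.
    return API_GENE_PLOT + "?" + urlencode(params)


url = gene_plot_url(1, "IRAK4")
```

The resulting string can then be shared as a link or used as the source of an embedded image or frame in a web page.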

2.6 Code availability

The source code, links to tutorials, and other supplementary documents are provided at https://github.com/interactivereport/CellDepot. With broad adoption and contribution in mind, CellDepot is released under the MIT open-source license. Detailed instructions for local installation are available at https://celldepot.bxgenomics.com/celldepot_manual.

2.7 Online tutorials

To help biologists use CellDepot and the integrated cellxgene VIP visual analytics tool, we created easy-to-access online HTML tutorials with step-by-step guides, available at https://interactivereport.github.io/CellDepot/bookdown/docs/SITutorial.html for CellDepot and https://interactivereport.github.io/cellxgene_VIP/tutorial/docs/how-to-use-cellxgene-vip.html for cellxgene VIP. In addition, a question mark next to the title of each VIP function module links directly to the corresponding section of the HTML document for help.