Chapter 3 Installation

3.1 Install ExpressionAnalysis

First, install ExpressionAnalysis by cloning the RNASequest repository from GitHub:

git clone https://github.com/interactivereport/RNASequest.git
cd RNASequest

#Install the RNASequest conda environment
#Make sure conda is installed before running this step; the installation may take a while
bash install

#Activate the conda environment
conda activate ExpressionAnalysis

#The installation also creates a .env file under the src directory
#(the path below assumes RNASequest was cloned into your home directory):
ls ~/RNASequest/src/.env

#Get the path of the current directory (the RNASequest directory) and add it to $PATH:
CurrentDir=$(pwd)
export PATH="$CurrentDir:$PATH"

#The command above adds the RNASequest directory to $PATH only for the current shell session
#To make the change permanent, edit ~/.bash_profile or ~/.bashrc:
vim ~/.bash_profile
#Add the full path of the RNASequest directory to $PATH, for example, $HOME/RNASequest:
PATH=$PATH:$HOME/RNASequest
#Save the file, then source it:
source ~/.bash_profile
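
Editing the profile by hand works, but re-running the steps can leave duplicate PATH entries. The following is a minimal sketch of an idempotent alternative. The helper name add_to_profile is hypothetical and not part of RNASequest; it appends the export line only if the exact same line is not already in the file.

```shell
# add_to_profile DIR [PROFILE]
# Hypothetical helper (not shipped with RNASequest): appends DIR to PATH
# in PROFILE (default: ~/.bash_profile) unless the same line already exists.
add_to_profile() {
  local dir="$1"
  local profile="${2:-$HOME/.bash_profile}"
  # The line is stored with a literal $PATH so it is expanded at login time
  local line="export PATH=\"\$PATH:$dir\""
  # Create the profile if it does not exist yet
  touch "$profile"
  # -F: fixed string, -x: match the whole line, -q: no output
  if ! grep -Fxq "$line" "$profile"; then
    printf '%s\n' "$line" >> "$profile"
  fi
}
```

For example, add_to_profile "$HOME/RNASequest" followed by source ~/.bash_profile has the same effect as the manual edit, and calling it a second time leaves the file unchanged.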

Set the default paths for ExpressionAnalysis:

The sys_example.yml file sets the default configuration of the whole pipeline, including the paths used for Quickomics visualization. Please modify this file before running the pipeline, since the paths differ from server to server.

Use sys_example.yml as a template to create a sys.yml file under the src directory, and fill in the necessary information.

The QuickOmics_test_folder and QuickOmics_publish_folder are defined by the config.csv file in the QuickOmics app folder (see 3.2.4). Here is an example of the sys.yml file:

## path to the reference genome on the HPC
genome_path: /path/to/genome/references
notCovariates: [Sapio_Submission_ID,Sapio_Sample_ID,Sapio_Request_Name,Sapio_Request_ID,Sapio_URL,Sapio_Plate_ID,Sapio_Profile,
                TST_ID,Status_Time,Well,Organism,ID,Sample_Process,Plate_Name,Volume, #Well_Row,
                Sample_Type,Sample_Name,Concentration, #Well_Column,
                Annotated_By,Ethnicity,Race]

#sample2meta: [Sapio_Plate_Name,RIN,Organ,Tissue,Cell_type,Cell_Line,Gender,Disease,Treatment,Timepoint,Genotype,Age]
qc2meta: [Intragenic Rate,Exonic Rate,Mapping Rate,Genes Detected,Mean Per Base Cov.,Estimated Library Size,Mapped,Intergenic Rate,Total Purity Filtered Reads Sequenced,Intronic Rate,Fragment Length Mean,rRNA rate,No. Covered 5',Duplication Rate of Mapped]
QuickOmics_test_folder: /path/to/QuickOmics_test_folder/
QuickOmics_test_link:  http://ngs.biogen.com:3838/Quickomics/?testfile=
QuickOmics_publish_folder: /path/to/QuickOmics_data/
QuickOmics_publish_link: http://ngs.biogen.com:3838/Quickomics/?serverfile=
shinyApp: http://ngs.biogen.com/shinyone/app/core/
DA_columns: [SampleID, ProjectName, PlatformName, CellType, Collection, DataSource, Description, DiseaseStage, DiseaseState,
             Ethnicity, Gender, Infection, Organism, Response, SamplePathology, SampleSource, SampleType, 
             SamplingTime, SubjectID, Symptom, Tissue, Title, Transfection, Treatment, Age]
DNAnexus:
  prjID: FASTR_NGS_PROJECT_ID
  species: FASTR_REFERENCE_SPECIES
  ref: FASTR_REFERENCE_VERSION
  uName: FASTR_USER_FULLNAME
  count: [genes.estcount_table,genes.expected_count]
  effL: genes.effective_length
  tpm: [genes.tpm_table,genes.tpm]
  indFlag: genes.results.gz
  seqQC: [combined.metrics,merged_metrics]
FileName:
  projectFile: project.yml
  prj_counts: counts
  prj_effLength: length
  prj_seqQC: seqQC
  prj_TPM: tpm
  sample_meta: meta
  gene_annotation: annotation
  comparison_file: comparison
  projectEntry:
    prjID: project
    prjTitle: project
    species: species
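
Creating sys.yml from the template can be scripted. The sketch below assumes that sys_example.yml sits in the src directory next to where sys.yml is expected; that location, and the helper name init_sys_yml, are assumptions rather than something the pipeline provides. It copies the template once and then reports any lines that still contain the /path/to placeholder.

```shell
# init_sys_yml SRC_DIR
# Hypothetical helper: creates SRC_DIR/sys.yml from SRC_DIR/sys_example.yml
# (assumed location) and lists placeholder paths that still need editing.
# Returns 0 if no placeholders remain, 1 if the template is missing,
# 2 if sys.yml still contains "/path/to".
init_sys_yml() {
  local src_dir="$1"
  local example="$src_dir/sys_example.yml"
  local target="$src_dir/sys.yml"
  if [ ! -f "$example" ]; then
    echo "missing $example" >&2
    return 1
  fi
  # Never overwrite a sys.yml that has already been edited
  if [ ! -f "$target" ]; then
    cp "$example" "$target"
  fi
  # grep prints the matching lines with line numbers and succeeds only if any match
  if grep -n '/path/to' "$target"; then
    echo "Edit the lines above in $target before running the pipeline." >&2
    return 2
  fi
  return 0
}
```

Running init_sys_yml ~/RNASequest/src right after installation is expected to print the placeholder lines (genome_path and the two QuickOmics folders in the example above); once they are replaced with real paths, the same call returns 0.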

3.2 Install Quickomics

3.2.1 Install R packages

Method 1. Create a conda environment from the yml file. This is the easiest way to install all packages. Download the file https://github.com/interactivereport/Quickomics/blob/master/conda_environment/QuickOmics.yml, load conda, then run:

conda env create -f QuickOmics.yml
#This command may take a while; when it finishes, a new conda environment called QuickOmics is available
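
When downloading from the command line, note that the link above is GitHub's file view, so fetching it directly with curl or wget saves an HTML page instead of the YAML file. The sketch below rewrites such a link into the corresponding raw.githubusercontent.com address. The helper name blob_to_raw is hypothetical, and the commented commands show one possible way to use it.

```shell
# blob_to_raw URL
# Hypothetical helper: converts a github.com ".../blob/..." file link
# into the matching raw.githubusercontent.com download link.
blob_to_raw() {
  printf '%s\n' "$1" |
    sed -E 's#^https://github\.com/([^/]+)/([^/]+)/blob/#https://raw.githubusercontent.com/\1/\2/#'
}

# Possible usage (requires network access and conda):
#   url=$(blob_to_raw https://github.com/interactivereport/Quickomics/blob/master/conda_environment/QuickOmics.yml)
#   curl -L -o QuickOmics.yml "$url"
#   conda env create -f QuickOmics.yml
#   conda activate QuickOmics
```

URLs that do not match the github.com blob pattern pass through unchanged, so the helper is safe to apply to links that are already raw.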

Method 2. Install all packages in your current R environment. Depending on your system, you may need to install other required programs. We recommend using the conda environment method whenever possible.

cran_packages=c("shiny", "shinythemes", "shinyjs", "plotly", "reshape2", "tidyverse", "gplots", "ggpubr",
"gridExtra", "ggrepel", "RColorBrewer", "pheatmap", "rgl", "car", "colourpicker", "VennDiagram", "factoextra",
"openxlsx", "visNetwork", "cowplot", "circlize", "svglite", "shinyjqui", "Hmisc", "ggrastr",
"ggExtra", "networkD3", "vctrs", "ragg", "textshaping", "stringi", "plyr", "png", "psych", "broom")

#Note: Hmisc is not required to run the Shiny app but is needed to prepare network data from expression matrix.
install.packages(cran_packages, repos="http://cran.r-project.org/")  #choose repos based on your location if needed
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("Mfuzz", "biomaRt", "ComplexHeatmap", "pathview"))

#Requirements: shiny >= v1.4.0.2

3.2.2 Clone Quickomics GitHub repository

git clone https://github.com/interactivereport/Quickomics.git

For more details on cloning, see https://docs.github.com/en/free-pro-team@latest/github/creating-cloning-and-archiving-repositories/cloning-a-repository

3.2.3 Launch the R Shiny App

See the following links for the various options to launch the app.

https://shiny.rstudio.com/articles/running.html

https://shiny.rstudio.com/deploy/

3.2.4 Configure QuickOmics to load projects directly from URLs

QuickOmics can load a project directly from a URL of the form https://quickomics.bxgenomics.com/?serverfile=SRP199678. This requires saving the project data files to a user-specified directory on the server hosting QuickOmics. To use this feature, create a file called config.csv in the QuickOmics app folder with the following content:

category,value
server_dir,{Path/to/Server/File}
test_dir,{Path/to/Test/Files}

Files stored in server_dir can be loaded by URLs like: https://quickomics.bxgenomics.com/?serverfile=project_ID

Files stored in test_dir can be loaded by URLs like: https://quickomics.bxgenomics.com/?testfile=project_ID
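
The config.csv file described above can also be generated from the shell. This is a minimal sketch: the helper name write_quickomics_config is hypothetical, and creating the two data directories if they are missing is a convenience added here, not something QuickOmics is known to require.

```shell
# write_quickomics_config APP_DIR SERVER_DIR TEST_DIR
# Hypothetical helper: writes APP_DIR/config.csv with the two rows shown above.
write_quickomics_config() {
  local app_dir="$1"
  local server_dir="$2"
  local test_dir="$3"
  # Assumption: it is convenient to create the data directories up front
  mkdir -p "$server_dir" "$test_dir"
  {
    echo "category,value"
    echo "server_dir,$server_dir"
    echo "test_dir,$test_dir"
  } > "$app_dir/config.csv"
}
```

For example, write_quickomics_config ./Quickomics /path/to/QuickOmics_data /path/to/QuickOmics_test_folder would pair the app with the publish and test folders used in the sys.yml example of section 3.1.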

3.2.5 Prepare project data to be loaded from URL

To prepare project files to be loaded via URL, do the following.

For each project, besides the two RData files (ProjectID.RData and ProjectID_network.RData), prepare a ProjectID.csv file with six columns, for example:

"Name","ShortName","ProjectID","Species","ExpressionUnit","Path"
"RNAseq analysis of sorted microglia","SRP199678","SRP199678","mouse","log2(TPM+0.25)","/camhpc/home/ysun4/RNASequest/example/SRP199678/EA20220329_0"

The last two columns in the csv file are optional but recommended.
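
A project description file in this format can be written with a small shell function. The sketch below is illustrative only: csv_quote and write_project_csv are hypothetical names, and the function always writes all six columns, so it does not show how QuickOmics treats a file where the optional columns are left out.

```shell
# csv_quote VALUE
# Wraps VALUE in double quotes and doubles any embedded double quotes.
csv_quote() {
  printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# write_project_csv OUT NAME SHORTNAME PROJECTID SPECIES EXPRESSIONUNIT PATH
# Hypothetical helper: writes the header and one quoted data row to OUT.
write_project_csv() {
  local out="$1"
  shift
  if [ "$#" -ne 6 ]; then
    echo "expected 6 fields, got $#" >&2
    return 1
  fi
  printf '%s\n' '"Name","ShortName","ProjectID","Species","ExpressionUnit","Path"' > "$out"
  local row=""
  local field
  for field in "$@"; do
    # ${row:+,} adds a comma only when row already holds a field
    row="$row${row:+,}$(csv_quote "$field")"
  done
  printf '%s\n' "$row" >> "$out"
}
```

Calling write_project_csv SRP199678.csv "RNAseq analysis of sorted microglia" SRP199678 SRP199678 mouse "log2(TPM+0.25)" followed by the project path as the final argument reproduces the layout of the example row above.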

Now copy the three files (ProjectID.RData, ProjectID_network.RData and ProjectID.csv) into the server_dir folder specified in the config.csv file. The project can then be loaded from https://quickomics.bxgenomics.com/?serverfile=project_ID (replace quickomics.bxgenomics.com with the URL of your own QuickOmics instance).
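
The copy step can be wrapped in a function that first checks that all three files exist, so that a partially published project is less likely. This is a sketch under stated assumptions: publish_project is a hypothetical name, the network file is assumed to be named ProjectID_network.RData, and the base URL argument stands for the address of your own QuickOmics instance.

```shell
# publish_project SRC_DIR PROJECT_ID SERVER_DIR BASE_URL
# Hypothetical helper: copies the three project files into SERVER_DIR
# and prints the URL under which the project should then load.
publish_project() {
  local src_dir="$1"
  local project_id="$2"
  local server_dir="$3"
  local base_url="$4"
  local f
  # Assumed file names: ID.RData, ID_network.RData, ID.csv
  for f in "$project_id.RData" "${project_id}_network.RData" "$project_id.csv"; do
    if [ ! -f "$src_dir/$f" ]; then
      echo "missing $src_dir/$f" >&2
      return 1
    fi
  done
  # Copy only after every file has been found
  for f in "$project_id.RData" "${project_id}_network.RData" "$project_id.csv"; do
    cp "$src_dir/$f" "$server_dir/" || return 1
  done
  echo "$base_url/?serverfile=$project_id"
}
```

With the example project, publish_project ./results SRP199678 /path/to/QuickOmics_data https://quickomics.bxgenomics.com would print https://quickomics.bxgenomics.com/?serverfile=SRP199678 once the files are in place; the ./results directory here is a placeholder for wherever the pipeline wrote its output.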