Chapter 7 GeoMx
GeoMx™ Digital Spatial Profiler (DSP) is a spatial transcriptomics platform developed by NanoString Technologies (now Bruker Corporation) that allows researchers to quantify RNA and/or protein expression in FFPE (formalin-fixed, paraffin-embedded) or fresh frozen samples. GeoMx enables high-plex, spatially resolved profiling of tissues based on user-defined regions of interest (ROIs). These ROIs can be defined using multiple staining markers. Each ROI usually contains at least 50–100 cells, and ideally 100–300 or more cells for a robust RNA signal. GeoMx therefore cannot provide a single-cell-resolution spatial transcriptomics readout, but its mini-bulk strategy allows better capture of lowly expressed genes.
A GeoMx experiment can be read out by next-generation sequencing (NGS) or counted on the NanoString nCounter. Here, we discuss data analysis for the NGS-based readout, with DCC files as the pipeline input format.
GeoMx workflow: This figure shows the general workflow of a GeoMx experiment. The workflow consists of sample preparation, imaging, data collection, and downstream data analysis. (Image from: https://nanostring.com/products/geomx-digital-spatial-profiler/geomx-dsp-overview/)
7.1 FASTQ to DCC conversion
NanoString Technologies (now Bruker Corporation) provides a toolkit to process GeoMx readouts in FASTQ format. As described here, users can follow the instructions to generate files with the .dcc suffix for downstream analysis.
Another useful reference is the GeoMx DSP NGS Readout User Manual.
7.2 Pipeline setup
Demo run directory: ~/SpaceSequest_demo/4_GeoMx
Here, we will demonstrate the pipeline using a public dataset provided by NanoString Technologies. The FASTQ->DCC step has already been run for this data, so we can directly start from DCC file processing.
The demo dataset is publicly available from NanoString: http://nanostring-public-share.s3-website-us-west-2.amazonaws.com/GeoScriptHub/Kidney_Dataset_for_GeomxTools.zip. After unzipping this file, please add the following file to the annotation folder inside Kidney_Dataset, as we modified the spreadsheet to simplify this test run: https://github.com/interactivereport/SpaceSequest/blob/gh-pages/tutorial/02-Introduction.Rmd
This is a kidney dataset. In our demo run, we perform data processing, quality control, and differential expression analysis by comparing two kidney regions: glomerulus vs. tubule.
First, we initiate the pipeline using the following script. This will generate templates of config.yml and compareInfo.csv:
#First step: generate the config.yml and compareInfo.csv templates
geomx ~/SpaceSequest_demo/4_GeoMx
After this step, fill in the config.yml file as below:
#config file for GeoMx. Please avoid using spaces in names or paths. All items are required.
project_ID: GeoMx_demo #name of the project
data_path: ~/SpaceSequest_demo/4_GeoMx/Kidney_Dataset/dccs #path to DCC files
data_annotation: ~/SpaceSequest_demo/4_GeoMx/Kidney_Dataset/annotation/kidney_demo_AOI_Annotations_selected_clean.xlsx #sample meta information
annotation_sheet: Template #Sheet name of annotation Excel file
pkcs_file: ~/SpaceSequest_demo/4_GeoMx/Kidney_Dataset/pkcs/TAP_H_WTA_v1.0.pkc #path to the pkc file
output_dir: ~/SpaceSequest_demo/4_GeoMx/results #path for output files
comparison: ~/SpaceSequest_demo/4_GeoMx/compareInfo.csv #comparison file to define DE groups
quickomics: True
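Because every item in config.yml is required, it can help to check the file before launching the run. The following is a minimal Python sketch and is not part of SpaceSequest. The helper names `parse_simple_config` and `missing_keys` are hypothetical, and the parser only handles flat `key: value #comment` lines like the ones above, not general YAML.

```python
# Hypothetical pre-flight check for a flat GeoMx config.yml (not part of the pipeline).
REQUIRED_KEYS = [
    "project_ID", "data_path", "data_annotation", "annotation_sheet",
    "pkcs_file", "output_dir", "comparison", "quickomics",
]


def parse_simple_config(text):
    """Parse flat 'key: value #comment' lines into a dict of strings."""
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line
        # Drop a trailing comment introduced by " #" (names and paths contain no spaces)
        value = value.split(" #", 1)[0].strip()
        config[key.strip()] = value
    return config


def missing_keys(config):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# A deliberately incomplete config, to show what the check reports
example = """\
#config file for GeoMx
project_ID: GeoMx_demo #name of the project
data_path: ~/SpaceSequest_demo/4_GeoMx/Kidney_Dataset/dccs #path to DCC files
quickomics: True
"""
cfg = parse_simple_config(example)
print(cfg["project_ID"])
print(missing_keys(cfg))  # keys that still need to be filled in
```

In practice the text would be read from the real config.yml, and an empty list from `missing_keys` would indicate that all eight items are present.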
Then prepare the compareInfo.csv file to define differential expression analysis comparisons:
CompareName,Model,Group_name,Group_test,Group_ctrl,Analysis_method
Glome_vs_Tubule,~region,region,glomerulus,tubule,Linear
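A malformed comparison file is an easy mistake to make, so a quick structural check may be useful. The sketch below is an illustration in Python, not part of the pipeline. The function name `check_compare_info` is hypothetical, and the rule that Group_name should appear in Model is an assumption inferred from the example row above.

```python
import csv
import io

EXPECTED_HEADER = [
    "CompareName", "Model", "Group_name",
    "Group_test", "Group_ctrl", "Analysis_method",
]


def check_compare_info(text):
    """Return a list of problems found in compareInfo.csv content."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]  # ignore empty lines
    if not rows or rows[0] != EXPECTED_HEADER:
        return ["header does not match the expected column names"]
    problems = []
    for number, row in enumerate(rows[1:], start=1):
        if len(row) != len(EXPECTED_HEADER):
            problems.append(f"data row {number}: expected {len(EXPECTED_HEADER)} fields")
            continue
        record = dict(zip(EXPECTED_HEADER, row))
        if any(not value.strip() for value in record.values()):
            problems.append(f"data row {number}: empty field")
        if record["Group_test"] == record["Group_ctrl"]:
            problems.append(f"data row {number}: test and control groups are identical")
        # Assumption: the grouping variable must be a term of the model formula
        if record["Group_name"] not in record["Model"]:
            problems.append(f"data row {number}: Group_name is not a term in Model")
    return problems


header = ",".join(EXPECTED_HEADER) + "\n"
demo = header + "Glome_vs_Tubule,~region,region,glomerulus,tubule,Linear\n"
bad = header + "Tubule_vs_Tubule,~region,region,tubule,tubule,Linear\n"

print(check_compare_info(demo))  # no problems expected for the demo row
print(check_compare_info(bad))
```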
Currently, users can run a single linear model to call the differentially expressed genes using Q3 normalized data.
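For intuition about what a model such as ~region estimates when the factor has two levels, the Python sketch below fits an ordinary least-squares slope on a 0/1 group indicator and shows that it equals the difference in group means. The expression values are made up, the function name `two_group_slope` is hypothetical, and the log2 transform is an assumption for illustration; this is not the pipeline's own implementation.

```python
import math
from statistics import mean


def two_group_slope(values, groups, test, ctrl):
    """OLS slope of log2(value) on a 0/1 indicator (1 = test, 0 = ctrl)."""
    pairs = [(1.0 if group == test else 0.0, math.log2(value))
             for value, group in zip(values, groups) if group in (test, ctrl)]
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Made-up normalized values for one gene in five segments
values = [80.0, 120.0, 100.0, 20.0, 30.0]
groups = ["glomerulus", "glomerulus", "glomerulus", "tubule", "tubule"]

slope = two_group_slope(values, groups, test="glomerulus", ctrl="tubule")

test_logs = [math.log2(v) for v, g in zip(values, groups) if g == "glomerulus"]
ctrl_logs = [math.log2(v) for v, g in zip(values, groups) if g == "tubule"]
diff = mean(test_logs) - mean(ctrl_logs)

# The slope is the log2 fold change of glomerulus over tubule
print(math.isclose(slope, diff))
```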
Run the pipeline as below:
#Run the data
geomx ~/SpaceSequest_demo/4_GeoMx/config.yml
Finally, check the results in the output_dir folder specified in the config.yml file. The output files contain the DE analysis results, and if the quickomics parameter was set to True, the pipeline also generates multiple .csv files that can be used to create a Quickomics visualization link. An example can be found here (Q3 normalized values were used to create the link): http://compbio.biogen.com:3838/Quickomics/?unlisted=PRJ_GeoMx_demo_Ij4YyG
7.3 Results
This test run contains a single simple comparison, so it should take only a few minutes.
The output is a set of .csv files containing gene expression and differential expression analysis results. These files are fully compatible with Quickomics, an R Shiny application for data exploration. Users can upload these .csv files to Quickomics for downstream analysis and figure generation.
Results in the directory:
~/SpaceSequest_demo/4_GeoMx/results
├── Comparison_result_1.txt #Differential expression result
#Quickomics files:
├── GeoMx_demo_Sample_metadata.csv
├── GeoMx_demo_Exp_Q3NormData.csv
├── GeoMx_demo_Comparison_Data.csv
├── GeoMx_demo_ProteinGeneName_Optional.csv
#Additional normalizations:
├── GeoMx_demo_Exp_NegNormData.csv
└── GeoMx_demo_Exp_QuantileNormData.csv
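To confirm that a run produced the four Quickomics files, a small check like the following may help. It is a sketch in Python, not part of the pipeline. The function names are hypothetical, and it assumes that output file names are prefixed with the project_ID from config.yml, as the listing above suggests.

```python
from pathlib import Path
import tempfile


def expected_outputs(project_id):
    """File names Quickomics needs, assuming a '<project_ID>_' prefix."""
    return [
        f"{project_id}_Sample_metadata.csv",
        f"{project_id}_Exp_Q3NormData.csv",
        f"{project_id}_Comparison_Data.csv",
        f"{project_id}_ProteinGeneName_Optional.csv",
    ]


def missing_outputs(output_dir, project_id):
    """Return the expected files that are not present in output_dir."""
    folder = Path(output_dir).expanduser()
    return [name for name in expected_outputs(project_id)
            if not (folder / name).is_file()]


# Self-contained demonstration on a temporary folder holding one of the four files
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "GeoMx_demo_Sample_metadata.csv").write_text("placeholder\n")
    still_missing = missing_outputs(tmp, "GeoMx_demo")

print(still_missing)
```

For the demo run, the call would be `missing_outputs("~/SpaceSequest_demo/4_GeoMx/results", "GeoMx_demo")`, where an empty list means all four files were found.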
NanoString Technologies (now Bruker Corporation) suggests using Q3 normalization, which was performed using:
# Q3 (75th percentile) normalization; results are stored in the "q_norm" element
target_demoData <- normalize(target_demoData,
                             norm_method = "quant",
                             desiredQuantile = .75,
                             toElt = "q_norm")
However, additional normalizations are also available, and users can upload those normalized values to Quickomics for analysis instead. For example, the 'Negative Probe' normalization was performed with the following command:
# Negative probe normalization of the raw counts; results are stored in "neg_norm"
target_demoData <- normalize(target_demoData,
                             norm_method = "neg",
                             fromElt = "exprs",
                             toElt = "neg_norm")
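To make the arithmetic behind Q3 normalization concrete, here is a minimal Python sketch with made-up counts. It reflects our understanding of the upper-quartile approach: each segment is divided by its own 75th percentile, rescaled by the geometric mean of the 75th percentiles across segments. The function names are hypothetical and the code is an illustration, not the GeomxTools implementation.

```python
from statistics import geometric_mean


def quantile_linear(values, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def q3_normalize(counts_by_segment, q=0.75):
    """Scale each segment by its Q3 relative to the geometric mean of all Q3s."""
    q3 = {seg: quantile_linear(vals, q) for seg, vals in counts_by_segment.items()}
    center = geometric_mean(list(q3.values()))
    factors = {seg: value / center for seg, value in q3.items()}
    normalized = {seg: [v / factors[seg] for v in vals]
                  for seg, vals in counts_by_segment.items()}
    return normalized, factors


# Made-up counts: seg_B was sequenced twice as deeply as seg_A
counts = {
    "seg_A": [10, 20, 30, 40, 50],
    "seg_B": [20, 40, 60, 80, 100],
}
normalized, factors = q3_normalize(counts)

# After normalization both segments share the same 75th percentile
for seg, vals in normalized.items():
    print(seg, round(quantile_linear(vals, 0.75), 3))
```

Dividing by the geometric mean of the Q3 values keeps the normalized data on roughly the original count scale, since the scaling factors multiply to one across segments.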
7.4 Quickomics exploration
Here, we upload the following four files to the Quickomics server:
GeoMx_demo_Sample_metadata.csv: Sample meta information, including sample name, disease condition, etc.
GeoMx_demo_Exp_Q3NormData.csv: Gene expression values normalized by Q3 normalization
GeoMx_demo_Comparison_Data.csv: Differential expression analysis results
GeoMx_demo_ProteinGeneName_Optional.csv: Gene-Protein name conversion
Quickomics link: https://quickomics.bxgenomics.com/?unlisted=PRJ_GeoMx_demo_238wZk
Below are a few example images from the Quickomics exploration. The first figure shows GeoMx PCA and volcano plots:
The following figure displays a GeoMx expression heatmap: