R data objects of the GeoMxSet class are provided with corresponding expression data tables pre-loaded for user convenience. The GeoMxSet class is an S4 data structure in current development at NanoString. This vignette gives a short introduction on how to work with the object as well as quick look at analysis on the object. Please refer to the readme and introduction documents and presentations for information about the kidney dataset and how this data was collected.
There are three different data objects representing data at different steps of the data processing pipeline.
featureType(kidney_probe)
#> [1] "Probe"
featureType(kidney_target)
#> [1] "Target"
featureType(kidney_norm)
#> [1] "Target"
dim(kidney_probe)
#> Features Samples
#> 18642 231
dim(kidney_target)
#> Features Samples
#> 18504 231
dim(kidney_norm)
#> Features Samples
#> 16084 231
Within each object there are sample (region of interest) and feature (probe or target) annotations that can be used to subset the object by. This can be thought of as subsetting by column and row, respectively. As an example, we can remove any features labeled as control targets and split objects by normal vs disease.
tail(svarLabels(kidney_norm))
#> [1] "AOINucleiCount" "ROICoordinateX" "ROICoordinateY" "disease_status"
#> [5] "pathology" "region"
fvarLabels(kidney_norm)
#> [1] "TargetName" "Module" "CodeClass" "ProbeID" "Negative"
<- kidney_norm[fData(kidney_norm)[["CodeClass"]] == "Endogenous01", ]
control_removed dim(control_removed[, sData(control_removed)[["disease_status"]] == "normal"])
#> Features Samples
#> 16083 100
dim(control_removed[, sData(control_removed)[["disease_status"]] == "DKD"])
#> Features Samples
#> 16083 131
rm(control_removed)
Functions such as quantile normalization can be performed on GeoMxSet objects and subsequent results can be saved to the current object as an assayDataElement. Expression data are accessible through accessor functions and replacement requires the same dimensions as well as sample and feature names to match.
<- setdiff(featureNames(kidney_target), featureNames(kidney_norm))
targets_qc_removed <- kidney_target[!featureNames(kidney_target) %in% targets_qc_removed, ]
qc_filtered <- esApply(qc_filtered, 2, quantile, probs=0.75)
quantiles <- ngeoMean(quantiles)
avg_quantiles <- quantiles / avg_quantiles
norm_factors <- as.matrix(exprs(qc_filtered)) %*% diag(1 / norm_factors)
normed colnames(normed) <- sampleNames(qc_filtered)
try(assayDataElement(kidney_target, "norm") <- normed)
#> Error in .validate_assayDataElementReplace(obj, value) :
#> object and replacement value have different dimensions
assayDataElement(qc_filtered, "norm") <- normed
assayDataElement(qc_filtered, "norm")[1:2, 1:2]
#> DSP-1001250007851-H-A02.dcc DSP-1001250007851-H-A03.dcc
#> A1BG 7.228423 6.56648
#> A1CF 12.232716 12.31215
exprs(qc_filtered)[1:2, 1:2]
#> DSP-1001250007851-H-A02.dcc DSP-1001250007851-H-A03.dcc
#> A1BG 13 8
#> A1CF 22 15
assayDataElement(qc_filtered, "exprs") <- normed
exprs(qc_filtered)[1:2, 1:2]
#> DSP-1001250007851-H-A02.dcc DSP-1001250007851-H-A03.dcc
#> A1BG 7.228423 6.56648
#> A1CF 12.232716 12.31215
rm(qc_filtered)
Differential expression is easily performed on the kidney data objects with packages such as limma. Designs can be appended to the objects to consistently pass the differential expression design through the analysis process. With the kidney dataset we can use these methods to compare diabetic kidney disease (DKD) glomeruli to normal and find the top 5 features (genes) that separate DKD from normal glomeruli.
<- kidney_norm[, sData(kidney_norm)[["region"]] == "glomerulus"]
glomeruli design(glomeruli) <- ~ `disease_status`
<- lmFit(exprs(glomeruli), model.matrix(design(glomeruli), sData(glomeruli)))
fit <- eBayes(fit)
fit <- topTable(fit, coef=2, adjust="fdr", n=5)
DE_table
DE_table#> logFC AveExpr t P.Value adj.P.Val B
#> DSTN 66.51779 83.72305 20.11222 1.794371e-44 2.886066e-40 86.85895
#> PCOLCE2 41.56335 41.57211 19.72293 1.518958e-43 8.195688e-40 84.92020
#> MAL2 45.76416 52.94508 19.72177 1.528666e-43 8.195688e-40 84.91441
#> VEGFA 222.79256 245.03567 18.39332 2.579344e-40 1.037154e-36 78.13117
#> CDKN1C 172.68002 163.29339 18.34772 3.341495e-40 1.074892e-36 77.89382
<- glomeruli[rownames(DE_table), ]
DE_glomeruli fData(DE_glomeruli)[["CodeClass"]] <- gsub("\\d+$", "", fData(DE_glomeruli)[["CodeClass"]])
<- autoplot(DE_glomeruli, type = "heatmap-genes", elt = "exprs",
heat log2scale = FALSE, heatmapGroup = "disease_status")
heat
sessionInfo()
#> R version 4.0.4 (2021-02-15)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04 LTS
#>
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats4 parallel stats graphics grDevices utils datasets
#> [8] methods base
#>
#> other attached packages:
#> [1] limma_3.46.0 NanoStringHackathon_0.0.1
#> [3] GeomxTools_0.99.3 NanoStringNCTools_0.99.8
#> [5] ggplot2_3.3.3 S4Vectors_0.28.1
#> [7] Biobase_2.50.0 BiocGenerics_0.36.0
#> [9] usethis_2.0.1 devtools_2.0.2
#>
#> loaded via a namespace (and not attached):
#> [1] sass_0.3.1 pkgload_1.2.0 jsonlite_1.7.2 bslib_0.2.4
#> [5] assertthat_0.2.1 highr_0.8 vipor_0.4.5 cellranger_1.1.0
#> [9] outliers_0.14 yaml_2.2.1 remotes_2.0.4 sessioninfo_1.1.1
#> [13] gdtools_0.2.3 pillar_1.5.0 glue_1.4.2 digest_0.6.27
#> [17] RColorBrewer_1.1-2 XVector_0.30.0 colorspace_2.0-0 plyr_1.8.6
#> [21] htmltools_0.5.1.1 pkgconfig_2.0.3 pheatmap_1.0.12 zlibbioc_1.36.0
#> [25] purrr_0.3.4 scales_1.1.1 processx_3.4.5 tibble_3.1.0
#> [29] farver_2.0.3 generics_0.1.0 IRanges_2.24.1 ellipsis_0.3.1
#> [33] withr_2.4.1 cli_2.3.1 magrittr_2.0.1 crayon_1.4.1
#> [37] readxl_1.3.1 memoise_1.1.0 evaluate_0.14 ps_1.5.0
#> [41] fs_1.5.0 fansi_0.4.2 xml2_1.3.2 beeswarm_0.2.3
#> [45] pkgbuild_1.2.0 ggthemes_4.1.1 tools_4.0.4 data.table_1.14.0
#> [49] prettyunits_1.1.1 lifecycle_1.0.0 stringr_1.4.0 munsell_0.5.0
#> [53] EnvStats_2.4.0 callr_3.5.1 Biostrings_2.58.0 compiler_4.0.4
#> [57] jquerylib_0.1.3 systemfonts_1.0.1 rlang_0.4.10 grid_4.0.4
#> [61] rjson_0.2.20 htmlwidgets_1.5.3 rmarkdown_2.7 testthat_3.0.2
#> [65] gtable_0.3.0 curl_4.3 DBI_1.1.1 reshape2_1.4.4
#> [69] R6_2.5.0 knitr_1.31 dplyr_1.0.4 ggiraph_0.6.1
#> [73] utf8_1.1.4 rprojroot_2.0.2 desc_1.2.0 stringi_1.5.3
#> [77] ggbeeswarm_0.6.0 Rcpp_1.0.6 vctrs_0.3.6 tidyselect_1.1.0
#> [81] xfun_0.21