R data objects of the GeoMxSet class are provided with corresponding expression data tables pre-loaded for user convenience. The GeoMxSet class is an S4 data structure in current development at NanoString. This vignette gives a short introduction on how to work with the object as well as quick look at analysis on the object. Please refer to the readme and introduction documents and presentations for information about the kidney dataset and how this data was collected.
There are three different data objects representing data at different steps of the data processing pipeline.
featureType(kidney_probe)
#> [1] "Probe"
featureType(kidney_target)
#> [1] "Target"
featureType(kidney_norm)
#> [1] "Target"
dim(kidney_probe)
#> Features Samples
#> 18642 231
dim(kidney_target)
#> Features Samples
#> 18504 231
dim(kidney_norm)
#> Features Samples
#> 16084 231Within each object there are sample (region of interest) and feature (probe or target) annotations that can be used to subset the object by. This can be thought of as subsetting by column and row, respectively. As an example, we can remove any features labeled as control targets and split objects by normal vs disease.
tail(svarLabels(kidney_norm))
#> [1] "AOINucleiCount" "ROICoordinateX" "ROICoordinateY" "disease_status"
#> [5] "pathology" "region"
fvarLabels(kidney_norm)
#> [1] "TargetName" "Module" "CodeClass" "ProbeID" "Negative"
control_removed <- kidney_norm[fData(kidney_norm)[["CodeClass"]] == "Endogenous01", ]
dim(control_removed[, sData(control_removed)[["disease_status"]] == "normal"])
#> Features Samples
#> 16083 100
dim(control_removed[, sData(control_removed)[["disease_status"]] == "DKD"])
#> Features Samples
#> 16083 131
rm(control_removed)Functions such as quantile normalization can be performed on GeoMxSet objects and subsequent results can be saved to the current object as an assayDataElement. Expression data are accessible through accessor functions and replacement requires the same dimensions as well as sample and feature names to match.
targets_qc_removed <- setdiff(featureNames(kidney_target), featureNames(kidney_norm))
qc_filtered <- kidney_target[!featureNames(kidney_target) %in% targets_qc_removed, ]
quantiles <- esApply(qc_filtered, 2, quantile, probs=0.75)
avg_quantiles <- ngeoMean(quantiles)
norm_factors <- quantiles / avg_quantiles
normed <- as.matrix(exprs(qc_filtered)) %*% diag(1 / norm_factors)
colnames(normed) <- sampleNames(qc_filtered)
try(assayDataElement(kidney_target, "norm") <- normed)
#> Error in .validate_assayDataElementReplace(obj, value) :
#> object and replacement value have different dimensions
assayDataElement(qc_filtered, "norm") <- normed
assayDataElement(qc_filtered, "norm")[1:2, 1:2]
#> DSP-1001250007851-H-A02.dcc DSP-1001250007851-H-A03.dcc
#> A1BG 7.228423 6.56648
#> A1CF 12.232716 12.31215
exprs(qc_filtered)[1:2, 1:2]
#> DSP-1001250007851-H-A02.dcc DSP-1001250007851-H-A03.dcc
#> A1BG 13 8
#> A1CF 22 15
assayDataElement(qc_filtered, "exprs") <- normed
exprs(qc_filtered)[1:2, 1:2]
#> DSP-1001250007851-H-A02.dcc DSP-1001250007851-H-A03.dcc
#> A1BG 7.228423 6.56648
#> A1CF 12.232716 12.31215
rm(qc_filtered)Differential expression is easily performed on the kidney data objects with packages such as limma. Designs can be appended to the objects to consistently pass the differential expression design through the analysis process. With the kidney dataset we can use these methods to compare diabetic kidney disease (DKD) glomeruli to normal and find the top 5 features (genes) that separate DKD from normal glomeruli.
glomeruli <- kidney_norm[, sData(kidney_norm)[["region"]] == "glomerulus"]
design(glomeruli) <- ~ `disease_status`
fit <- lmFit(exprs(glomeruli), model.matrix(design(glomeruli), sData(glomeruli)))
fit <- eBayes(fit)
DE_table <- topTable(fit, coef=2, adjust="fdr", n=5)
DE_table
#> logFC AveExpr t P.Value adj.P.Val B
#> DSTN 66.51779 83.72305 20.11222 1.794371e-44 2.886066e-40 86.85895
#> PCOLCE2 41.56335 41.57211 19.72293 1.518958e-43 8.195688e-40 84.92020
#> MAL2 45.76416 52.94508 19.72177 1.528666e-43 8.195688e-40 84.91441
#> VEGFA 222.79256 245.03567 18.39332 2.579344e-40 1.037154e-36 78.13117
#> CDKN1C 172.68002 163.29339 18.34772 3.341495e-40 1.074892e-36 77.89382
DE_glomeruli <- glomeruli[rownames(DE_table), ]
fData(DE_glomeruli)[["CodeClass"]] <- gsub("\\d+$", "", fData(DE_glomeruli)[["CodeClass"]])
heat <- autoplot(DE_glomeruli, type = "heatmap-genes", elt = "exprs",
log2scale = FALSE, heatmapGroup = "disease_status")
heatsessionInfo()
#> R version 4.0.4 (2021-02-15)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04 LTS
#>
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats4 parallel stats graphics grDevices utils datasets
#> [8] methods base
#>
#> other attached packages:
#> [1] limma_3.46.0 NanoStringHackathon_0.0.1
#> [3] GeomxTools_0.99.3 NanoStringNCTools_0.99.8
#> [5] ggplot2_3.3.3 S4Vectors_0.28.1
#> [7] Biobase_2.50.0 BiocGenerics_0.36.0
#> [9] usethis_2.0.1 devtools_2.0.2
#>
#> loaded via a namespace (and not attached):
#> [1] sass_0.3.1 pkgload_1.2.0 jsonlite_1.7.2 bslib_0.2.4
#> [5] assertthat_0.2.1 highr_0.8 vipor_0.4.5 cellranger_1.1.0
#> [9] outliers_0.14 yaml_2.2.1 remotes_2.0.4 sessioninfo_1.1.1
#> [13] gdtools_0.2.3 pillar_1.5.0 glue_1.4.2 digest_0.6.27
#> [17] RColorBrewer_1.1-2 XVector_0.30.0 colorspace_2.0-0 plyr_1.8.6
#> [21] htmltools_0.5.1.1 pkgconfig_2.0.3 pheatmap_1.0.12 zlibbioc_1.36.0
#> [25] purrr_0.3.4 scales_1.1.1 processx_3.4.5 tibble_3.1.0
#> [29] farver_2.0.3 generics_0.1.0 IRanges_2.24.1 ellipsis_0.3.1
#> [33] withr_2.4.1 cli_2.3.1 magrittr_2.0.1 crayon_1.4.1
#> [37] readxl_1.3.1 memoise_1.1.0 evaluate_0.14 ps_1.5.0
#> [41] fs_1.5.0 fansi_0.4.2 xml2_1.3.2 beeswarm_0.2.3
#> [45] pkgbuild_1.2.0 ggthemes_4.1.1 tools_4.0.4 data.table_1.14.0
#> [49] prettyunits_1.1.1 lifecycle_1.0.0 stringr_1.4.0 munsell_0.5.0
#> [53] EnvStats_2.4.0 callr_3.5.1 Biostrings_2.58.0 compiler_4.0.4
#> [57] jquerylib_0.1.3 systemfonts_1.0.1 rlang_0.4.10 grid_4.0.4
#> [61] rjson_0.2.20 htmlwidgets_1.5.3 rmarkdown_2.7 testthat_3.0.2
#> [65] gtable_0.3.0 curl_4.3 DBI_1.1.1 reshape2_1.4.4
#> [69] R6_2.5.0 knitr_1.31 dplyr_1.0.4 ggiraph_0.6.1
#> [73] utf8_1.1.4 rprojroot_2.0.2 desc_1.2.0 stringi_1.5.3
#> [77] ggbeeswarm_0.6.0 Rcpp_1.0.6 vctrs_0.3.6 tidyselect_1.1.0
#> [81] xfun_0.21