scRUtils 0.1.0
scRUtils provides various utilities for visualisation and functional analysis of RNA-seq data,
particularly single-cell datasets. It evolved from a collection of helper functions that were
used in our in-house scRNA-seq processing workflow.
The documentation of this package is divided into 5 sections.
This vignette (#2) demonstrates functions for general visualisation purposes.
To use scRUtils and relevant packages in an R session, we load them using the library()
command.
library(scRUtils)
library(ggforce)
library(ggplot2)
library(scater)
repr.plot.* behaviour
The fig() function uses options() to change the behaviour of repr.plot.*. It provides a
quick and easy way to change plot size and other repr.plot.* behaviours when running R in a
Jupyter Notebook. If used without any arguments, the plot behaviour is reset to default.
reset.fig() is an alias of fig().
The example below is not evaluated as the function has no effect in R Markdown.
library(ggplot2)
# Change plot area width to 8 inches and height to 5 inches
fig(width = 8, height = 5)
ggplot(mpg, aes(class)) + geom_bar()
# Reset to default settings
fig()
# Change plot wider and taller
fig(width = 14, height = 10)
ggplot(mpg, aes(class)) + geom_bar()
# Alias of fig()
reset.fig()
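To illustrate the mechanism described above, here is a minimal sketch of how a fig()-like helper could be built on options(). It is not the scRUtils implementation: the name fig_sketch() and its default values are assumptions made for illustration. The option names repr.plot.width, repr.plot.height and repr.plot.res are the ones read by the repr package in Jupyter.

```r
# Hypothetical sketch of a fig()-like helper; NOT the scRUtils source code.
# The default values below are assumptions chosen for illustration only.
fig_sketch <- function(width = 7, height = 7, res = 120) {
  # options() returns the previous values, so the caller can restore them later
  old <- options(
    repr.plot.width = width,
    repr.plot.height = height,
    repr.plot.res = res
  )
  invisible(old)
}

old <- fig_sketch(width = 8, height = 5)
getOption("repr.plot.width")   # 8
getOption("repr.plot.height")  # 5

# Restore whatever was set before the call
options(old)
```

Returning the previous option values invisibly is a common base R idiom, as it lets a notebook cell change the plot size temporarily and then undo the change.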
c30() and c40() palettes
The c30() palette has 30 unique colours and the c40() palette has 40 unique colours.
The c40() colour palette is taken from plotScoreHeatmap() of the SingleR package
(which is itself based in part on the Okabe-Ito colours).
# Show colours as pie charts
pie(rep(1,30), col = c30(), radius = 1.05)
pie(rep(1,40), col = c40(), radius = 1.05)
The choosePalette() function takes a character vector of features and, optionally, a vector of
colour codes, and checks whether the supplied colour codes contain enough colours. It returns
a named vector of colour codes based on the input features, one for each unique feature.
By default, it uses the c30() palette when no more than 30 colours are required, then the
c40() palette, and lastly the rainbow() colour palette when more than 40 colours are required.
The example below passes a character vector of 10 letters (5 unique values) as input, and
choosePalette() returns 5 colours.
feat <- rep(LETTERS[1:5], 2)
feat
## [1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E"
choosePalette(feat) # use c30()
## Loading required namespace: gtools
## A B C D E
## "#006400" "#ff0000" "#0000ff" "#ff8c00" "#800080"
The next example passes a factor of 15 letters with 3 levels as input, and choosePalette()
returns 3 of the 10 colours from the rainbow(10) colour palette.
feat <- factor(rep(LETTERS[1:3], 5))
feat
## [1] A B C A B C A B C A B C A B C
## Levels: A B C
choosePalette(feat, rainbow(10))
## A B C
## "#FF0000" "#FF9900" "#CCFF00"
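The default fallback order described above (30 colours, then 40, then rainbow()) can be sketched in base R as follows. This is an illustration only, not the scRUtils source: choose_palette_sketch() is a hypothetical name, and hcl.colors() palettes stand in for c30() and c40() so that the block runs without the package. How the real function orders feature names or handles an insufficient user-supplied palette is not covered here.

```r
# Hypothetical sketch of the default palette choice; NOT the scRUtils source code.
# pal30 and pal40 are stand-ins (assumptions) for c30() and c40().
choose_palette_sketch <- function(features,
                                  pal30 = grDevices::hcl.colors(30, "Dark 3"),
                                  pal40 = grDevices::hcl.colors(40, "Dark 3")) {
  # One colour per unique feature; factors keep the order of their used levels
  feats <- if (is.factor(features)) {
    levels(droplevels(features))
  } else {
    sort(unique(features))
  }
  n <- length(feats)

  cols <- if (n <= 30) {
    pal30[seq_len(n)]      # small sets: 30-colour palette
  } else if (n <= 40) {
    pal40[seq_len(n)]      # medium sets: 40-colour palette
  } else {
    grDevices::rainbow(n)  # large sets: fall back to rainbow()
  }

  # Name each colour by its feature
  stats::setNames(cols, feats)
}

pal <- choose_palette_sketch(rep(LETTERS[1:5], 2))
names(pal)   # "A" "B" "C" "D" "E"
length(pal)  # 5
```

A named vector is convenient because it can be passed directly to the values argument of ggplot2's scale_colour_manual() or scale_fill_manual(), which matches colours to levels by name.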
The geom_parallel_sets_labs() function in this package is the same as
geom_parallel_sets_labels() from the ggforce package, but with the ability to
nudge labels by a fixed distance. It is especially useful when the labels are too long to fit
inside the bars depicting the discrete categories. A pull request for the nudge enhancement has
been submitted to the ggforce GitHub repository and is awaiting approval.
library(ggforce)
data <- as.data.frame(Titanic)
data <- gather_set_data(data, 1:4)
# Use nudge_x to offset and hjust = 0 to left-justify label
ggplot(data, aes(x, id = id, split = y, value = Freq)) +
  geom_parallel_sets(aes(fill = Sex), alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labs(colour = "red", size = 6, angle = 0,
                          nudge_x = 0.1, hjust = 0) +
  theme_bw(20)
The plotParallel() function uses the ggforce package to produce a parallel
sets diagram for visualising the interaction between 2 variables. The inputs are two character
vectors containing membership information.
The example below uses the Titanic dataset to show the class and age of the passengers.
data <- as.data.frame(Titanic)
plotParallel(data$Class, data$Age, labels = c("class", "age"))
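A parallel sets diagram of two membership vectors is, in essence, a picture of their cross-tabulation. The base R sketch below shows that table for the Titanic example. It assumes (this is not confirmed by the plotParallel() source) that the ribbon widths reflect how often each pair of values occurs together in the two input vectors.

```r
data <- as.data.frame(Titanic)

# Unweighted cross-tabulation of the two vectors: counts rows of the data frame
row_counts <- table(Class = data$Class, Age = data$Age)
row_counts

# Weighted by Freq: counts passengers in each Class/Age combination
passenger_counts <- xtabs(Freq ~ Class + Age, data = data)
passenger_counts
sum(passenger_counts)  # 2201
```

Because as.data.frame(Titanic) stores one row per combination of Class, Sex, Age and Survived, the unweighted table counts rows (4 per cell), whereas the xtabs() version weights each combination by Freq and so gives passenger numbers.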
We can also use plotParallel() to show cell-specific features of a single-cell dataset, such
as clustering and cell type assignment.
data(sce)
plotParallel(sce$label, sce$CellType, labels = c("Cluster", "Cell Type"),
             add_counts = TRUE, text_size = 4)
sessionInfo()
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-conda-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
##
## Matrix products: default
## BLAS/LAPACK: /home/ihsuan/miniconda3/envs/jupyterlab/lib/libopenblasp-r0.3.20.so
##
## locale:
## [1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
## [5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
## [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] scater_1.22.0 scuttle_1.4.0
## [3] SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0
## [5] Biobase_2.54.0 GenomicRanges_1.46.1
## [7] GenomeInfoDb_1.30.1 IRanges_2.28.0
## [9] S4Vectors_0.32.3 BiocGenerics_0.40.0
## [11] MatrixGenerics_1.6.0 matrixStats_0.61.0
## [13] ggforce_0.3.3 ggplot2_3.3.5
## [15] scRUtils_0.1.0 BiocStyle_2.22.0
##
## loaded via a namespace (and not attached):
## [1] bitops_1.0-7 httr_1.4.2
## [3] tools_4.1.3 bslib_0.3.1
## [5] utf8_1.2.2 R6_2.5.1
## [7] irlba_2.3.5 vipor_0.4.5
## [9] DBI_1.1.2 colorspace_2.0-3
## [11] withr_2.5.0 gridExtra_2.3
## [13] tidyselect_1.1.2 compiler_4.1.3
## [15] cli_3.2.0 BiocNeighbors_1.12.0
## [17] enrichR_3.0 DelayedArray_0.20.0
## [19] labeling_0.4.2 bookdown_0.26
## [21] sass_0.4.1 scales_1.2.0
## [23] stringr_1.4.0 digest_0.6.29
## [25] rmarkdown_2.13 XVector_0.34.0
## [27] pkgconfig_2.0.3 htmltools_0.5.2
## [29] sparseMatrixStats_1.6.0 limma_3.50.1
## [31] highr_0.9 fastmap_1.1.0
## [33] rlang_1.0.2 DelayedMatrixStats_1.16.0
## [35] jquerylib_0.1.4 farver_2.1.0
## [37] generics_0.1.2 jsonlite_1.8.0
## [39] gtools_3.9.2 BiocParallel_1.28.3
## [41] dplyr_1.0.8 RCurl_1.98-1.6
## [43] magrittr_2.0.3 BiocSingular_1.10.0
## [45] GenomeInfoDbData_1.2.7 Matrix_1.4-1
## [47] Rcpp_1.0.8.3 ggbeeswarm_0.6.0
## [49] munsell_0.5.0 fansi_1.0.3
## [51] viridis_0.6.2 ggnewscale_0.4.7
## [53] lifecycle_1.0.1 edgeR_3.36.0
## [55] stringi_1.7.6 yaml_2.3.5
## [57] MASS_7.3-56 zlibbioc_1.40.0
## [59] grid_4.1.3 dqrng_0.3.0
## [61] parallel_4.1.3 ggrepel_0.9.1
## [63] crayon_1.5.1 lattice_0.20-45
## [65] cowplot_1.1.1 beachmat_2.10.0
## [67] locfit_1.5-9.5 magick_2.7.3
## [69] metapod_1.2.0 knitr_1.38
## [71] pillar_1.7.0 igraph_1.3.0
## [73] rjson_0.2.21 ScaledMatrix_1.2.0
## [75] glue_1.6.2 evaluate_0.15
## [77] scran_1.22.1 BiocManager_1.30.16
## [79] vctrs_0.4.1 tweenr_1.0.2
## [81] tidyr_1.2.0 gtable_0.3.0
## [83] purrr_0.3.4 polyclip_1.10-0
## [85] assertthat_0.2.1 xfun_0.30
## [87] rsvd_1.0.5 viridisLite_0.4.0
## [89] tibble_3.1.6 beeswarm_0.4.0
## [91] cluster_2.1.3 statmod_1.4.36
## [93] bluster_1.4.0 ellipsis_0.3.2