This vignette demonstrates how to
read and write SingleCellExperiment
objects using the
{anndataR} package, leveraging the interoperability
between SingleCellExperiment
and the AnnData
format.
Check out ?anndataR
for a full list of the functions
provided by this package.
SingleCellExperiment is a widely used class for storing single-cell
data in R, especially within the Bioconductor ecosystem.
{anndataR} enables conversion between
SingleCellExperiment
objects and AnnData
objects, allowing you to leverage the strengths of both the scverse and
Bioconductor ecosystems.
Before you begin, make sure you have both SingleCellExperiment and {anndataR} installed. You can install them using the following code:
Using an example .h5ad
file included in the package, we
will demonstrate how to read an .h5ad
file and convert it
to a SingleCellExperiment
object.
library(anndataR)
library(SingleCellExperiment)
h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR")
Read the .h5ad
file:
adata <- read_h5ad(h5ad_file)
adata
#> AnnData object with n_obs × n_vars = 50 × 100
#> obs: 'Float', 'FloatNA', 'Int', 'IntNA', 'Bool', 'BoolNA', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'leiden'
#> var: 'String', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
#> uns: 'Bool', 'BoolNA', 'Category', 'DataFrameEmpty', 'Int', 'IntNA', 'IntScalar', 'Sparse1D', 'String', 'String2D', 'StringScalar', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'umap'
#> obsm: 'X_pca', 'X_umap'
#> varm: 'PCs'
#> layers: 'counts', 'csc_counts', 'dense_X', 'dense_counts'
#> obsp: 'connectivities', 'distances'
Convert to a SingleCellExperiment
object:
sce_obj <- adata$to_SingleCellExperiment()
sce_obj
#> class: SingleCellExperiment
#> dim: 100 50
#> metadata(18): Bool BoolNA ... rank_genes_groups umap
#> assays(5): data counts csc_counts dense_X dense_counts
#> rownames(100): Gene000 Gene001 ... Gene098 Gene099
#> rowData names(11): String n_cells_by_counts ... dispersions
#> dispersions_norm
#> colnames(50): Cell000 Cell001 ... Cell048 Cell049
#> colData names(11): Float FloatNA ... log1p_total_counts leiden
#> reducedDimNames(2): pca umap
#> mainExpName: NULL
#> altExpNames(0):
Note that there is no one-to-one mapping possible between the AnnData and SingleCellExperiment data structures, so some information might be lost during conversion. It is recommended to carefully inspect the converted object to ensure that all necessary information has been transferred.
See the in-depth vignette on conversion to and from SingleCellExperiment objects for more details on how to customize the conversion process. For instance:
Here’s an example demonstrating how to create a
SingleCellExperiment
object from scratch, then convert it
to AnnData
and save it as .h5ad
counts <- matrix(rbinom(20000, 1000, .001), nrow = 100)
sce_obj <- SingleCellExperiment(list(counts = counts))
sce_obj
#> class: SingleCellExperiment
#> dim: 100 200
#> metadata(0):
#> assays(1): counts
#> rownames: NULL
#> rowData names(0):
#> colnames: NULL
#> colData names(0):
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
You can convert the SingleCellExperiment
object to an
AnnData
object using the
from_SingleCellExperiment
function:
adata <- from_SingleCellExperiment(sce_obj)
adata
#> AnnData object with n_obs × n_vars = 200 × 100
#> layers: 'counts'
Again note that there is no one-to-one mapping possible between the AnnData and SingleCellExperiment data structures, so some information might be lost during conversion. It is recommended to carefully inspect the converted object to ensure that all necessary information has been transferred.
See the in-depth vignette on conversion to and from SingleCellExperiment objects for more details on how to customize the conversion process. Example:
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] Seurat_5.2.1 tidyr_1.3.1
#> [3] dplyr_1.1.4 purrr_1.0.4
#> [5] stringr_1.5.1 rprojroot_2.0.4
#> [7] knitr_1.49 tibble_3.2.1
#> [9] rmarkdown_2.29 anndataR_0.99.0
#> [11] SingleCellExperiment_1.29.1 SummarizedExperiment_1.37.0
#> [13] Biobase_2.67.0 GenomicRanges_1.59.1
#> [15] GenomeInfoDb_1.43.4 IRanges_2.41.2
#> [17] S4Vectors_0.45.2 BiocGenerics_0.53.6
#> [19] generics_0.1.3 MatrixGenerics_1.19.1
#> [21] matrixStats_1.5.0 SeuratObject_5.0.2
#> [23] sp_2.2-0 BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] RColorBrewer_1.1-3 sys_3.4.3 jsonlite_1.8.9
#> [4] magrittr_2.0.3 spatstat.utils_3.1-2 farver_2.1.2
#> [7] vctrs_0.6.5 ROCR_1.0-11 spatstat.explore_3.3-4
#> [10] htmltools_0.5.8.1 S4Arrays_1.7.2 SparseArray_1.7.5
#> [13] sass_0.4.9 sctransform_0.4.1 parallelly_1.42.0
#> [16] KernSmooth_2.23-26 bslib_0.9.0 htmlwidgets_1.6.4
#> [19] ica_1.0-3 plyr_1.8.9 plotly_4.10.4
#> [22] zoo_1.8-12 cachem_1.1.0 buildtools_1.0.0
#> [25] igraph_2.1.4 mime_0.12 lifecycle_1.0.4
#> [28] pkgconfig_2.0.3 Matrix_1.7-2 R6_2.5.1
#> [31] fastmap_1.2.0 GenomeInfoDbData_1.2.13 fitdistrplus_1.2-2
#> [34] future_1.34.0 shiny_1.10.0 digest_0.6.37
#> [37] colorspace_2.1-1 patchwork_1.3.0 tensor_1.5
#> [40] RSpectra_0.16-2 irlba_2.3.5.1 progressr_0.15.1
#> [43] spatstat.sparse_3.1-0 polyclip_1.10-7 httr_1.4.7
#> [46] abind_1.4-8 compiler_4.4.2 withr_3.0.2
#> [49] bit64_4.6.0-1 fastDummies_1.7.5 MASS_7.3-64
#> [52] DelayedArray_0.33.5 tools_4.4.2 lmtest_0.9-40
#> [55] httpuv_1.6.15 future.apply_1.11.3 goftest_1.2-3
#> [58] glue_1.8.0 nlme_3.1-167 promises_1.3.2
#> [61] grid_4.4.2 Rtsne_0.17 cluster_2.1.8
#> [64] reshape2_1.4.4 hdf5r_1.3.12 spatstat.data_3.1-4
#> [67] gtable_0.3.6 data.table_1.16.4 XVector_0.47.2
#> [70] spatstat.geom_3.3-5 RcppAnnoy_0.0.22 ggrepel_0.9.6
#> [73] RANN_2.6.2 pillar_1.10.1 spam_2.11-1
#> [76] RcppHNSW_0.6.0 later_1.4.1 splines_4.4.2
#> [79] lattice_0.22-6 deldir_2.0-4 survival_3.8-3
#> [82] bit_4.5.0.1 tidyselect_1.2.1 maketools_1.3.1
#> [85] miniUI_0.1.1.1 pbapply_1.7-2 gridExtra_2.3
#> [88] scattermore_1.2 xfun_0.50 stringi_1.8.4
#> [91] UCSC.utils_1.3.1 lazyeval_0.2.2 yaml_2.3.10
#> [94] evaluate_1.0.3 codetools_0.2-20 BiocManager_1.30.25
#> [97] cli_3.6.3 uwot_0.2.2 xtable_1.8-4
#> [100] reticulate_1.40.0 munsell_0.5.1 jquerylib_0.1.4
#> [103] Rcpp_1.0.14 spatstat.random_3.3-2 globals_0.16.3
#> [106] png_0.1-8 spatstat.univar_3.1-1 parallel_4.4.2
#> [109] ggplot2_3.5.1 dotCall64_1.2 listenv_0.9.1
#> [112] viridisLite_0.4.2 scales_1.3.0 ggridges_0.5.6
#> [115] crayon_1.5.3 rlang_1.1.5 cowplot_1.1.3