This vignette demonstrates how to
read and write Seurat
objects using the
{anndataR} package, leveraging the interoperability
between Seurat
and the AnnData
format.
Check out ?anndataR
for a full list of the functions
provided by this package.
Seurat is a widely used toolkit for single-cell analysis in R.
{anndataR} enables conversion between
Seurat
objects and AnnData
objects, allowing
you to leverage the strengths of both the scverse and Seurat
ecosystems.
Before you begin, make sure you have both Seurat and {anndataR} installed. You can install them using the following code:
Using an example .h5ad
file included in the package, we
will demonstrate how to read an .h5ad
file and convert it
to a Seurat
object.
library(anndataR)
library(Seurat)
#>
#> Attaching package: 'Seurat'
#> The following object is masked from 'package:SummarizedExperiment':
#>
#> Assays
h5ad_file <- system.file("extdata", "example.h5ad", package = "anndataR")
Read the .h5ad
file:
adata <- read_h5ad(h5ad_file)
adata
#> AnnData object with n_obs × n_vars = 50 × 100
#> obs: 'Float', 'FloatNA', 'Int', 'IntNA', 'Bool', 'BoolNA', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'leiden'
#> var: 'String', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
#> uns: 'Bool', 'BoolNA', 'Category', 'DataFrameEmpty', 'Int', 'IntNA', 'IntScalar', 'Sparse1D', 'String', 'String2D', 'StringScalar', 'hvg', 'leiden', 'log1p', 'neighbors', 'pca', 'rank_genes_groups', 'umap'
#> obsm: 'X_pca', 'X_umap'
#> varm: 'PCs'
#> layers: 'counts', 'csc_counts', 'dense_X', 'dense_counts'
#> obsp: 'connectivities', 'distances'
Convert to a Seurat
object:
seurat_obj <- adata$to_Seurat()
seurat_obj
#> An object of class Seurat
#> 100 features across 50 samples within 1 assay
#> Active assay: RNA (100 features, 0 variable features)
#> 5 layers present: counts, data, csc_counts, dense_X, dense_counts
#> 2 dimensional reductions calculated: pca, umap
Note that there is no one-to-one mapping possible between the AnnData and SeuratObject data structures, so some information might be lost during conversion. It is recommended to carefully inspect the converted object to ensure that all necessary information has been transferred.
See ?to_Seurat
for more details on how to customize the
conversion process. For instance:
adata$to_Seurat(
assay_name = "ADT",
layers_mapping = c(counts = "dense_counts", data = "dense_X")
)
#> An object of class Seurat
#> 100 features across 50 samples within 1 assay
#> Active assay: ADT (100 features, 0 variable features)
#> 2 layers present: counts, data
#> 2 dimensional reductions calculated: pca, umap
Here’s an example demonstrating how to create a Seurat
object from scratch, then convert it to AnnData
and save it
as .h5ad
counts <- matrix(rbinom(20000, 1000, .001), nrow = 100)
seurat_obj <- CreateSeuratObject(counts = counts) |>
NormalizeData() |>
FindVariableFeatures() |>
ScaleData() |>
RunPCA(npcs = 10) |>
FindNeighbors() |>
RunUMAP(dims = 1:10)
#> Normalizing layer: counts
#> Finding variable features for layer counts
#> Centering and scaling data matrix
#> PC_ 1
#> Positive: Feature72, Feature67, Feature73, Feature52, Feature44, Feature18, Feature15, Feature2, Feature79, Feature91
#> Feature10, Feature29, Feature43, Feature74, Feature49, Feature22, Feature6, Feature36, Feature78, Feature25
#> Feature88, Feature7, Feature48, Feature100, Feature87, Feature27, Feature8, Feature33, Feature35, Feature57
#> Negative: Feature20, Feature1, Feature40, Feature21, Feature86, Feature60, Feature94, Feature81, Feature66, Feature71
#> Feature76, Feature50, Feature11, Feature32, Feature95, Feature14, Feature28, Feature64, Feature46, Feature23
#> Feature63, Feature70, Feature92, Feature38, Feature56, Feature54, Feature37, Feature77, Feature26, Feature97
#> PC_ 2
#> Positive: Feature24, Feature40, Feature15, Feature12, Feature25, Feature96, Feature16, Feature13, Feature67, Feature35
#> Feature32, Feature74, Feature21, Feature27, Feature26, Feature30, Feature82, Feature17, Feature80, Feature76
#> Feature72, Feature100, Feature87, Feature18, Feature31, Feature47, Feature52, Feature11, Feature36, Feature43
#> Negative: Feature68, Feature6, Feature33, Feature58, Feature23, Feature28, Feature91, Feature10, Feature63, Feature48
#> Feature88, Feature70, Feature83, Feature50, Feature29, Feature93, Feature49, Feature22, Feature57, Feature64
#> Feature94, Feature75, Feature41, Feature78, Feature5, Feature20, Feature61, Feature7, Feature92, Feature37
#> PC_ 3
#> Positive: Feature3, Feature70, Feature69, Feature18, Feature48, Feature84, Feature91, Feature19, Feature62, Feature66
#> Feature41, Feature61, Feature78, Feature34, Feature45, Feature100, Feature26, Feature16, Feature22, Feature20
#> Feature95, Feature47, Feature51, Feature72, Feature80, Feature13, Feature35, Feature86, Feature75, Feature10
#> Negative: Feature8, Feature43, Feature42, Feature28, Feature39, Feature68, Feature25, Feature37, Feature49, Feature94
#> Feature23, Feature85, Feature83, Feature74, Feature15, Feature52, Feature50, Feature58, Feature21, Feature82
#> Feature67, Feature59, Feature6, Feature5, Feature64, Feature30, Feature46, Feature12, Feature71, Feature54
#> PC_ 4
#> Positive: Feature4, Feature45, Feature85, Feature8, Feature78, Feature53, Feature62, Feature28, Feature83, Feature27
#> Feature69, Feature54, Feature48, Feature19, Feature67, Feature51, Feature30, Feature96, Feature31, Feature73
#> Feature46, Feature55, Feature92, Feature82, Feature86, Feature17, Feature33, Feature93, Feature91, Feature40
#> Negative: Feature38, Feature75, Feature7, Feature26, Feature5, Feature29, Feature14, Feature15, Feature37, Feature24
#> Feature61, Feature41, Feature58, Feature47, Feature11, Feature87, Feature60, Feature56, Feature99, Feature98
#> Feature34, Feature2, Feature39, Feature43, Feature80, Feature52, Feature95, Feature65, Feature77, Feature100
#> PC_ 5
#> Positive: Feature14, Feature66, Feature79, Feature13, Feature32, Feature90, Feature83, Feature33, Feature38, Feature18
#> Feature74, Feature47, Feature45, Feature69, Feature12, Feature65, Feature3, Feature43, Feature58, Feature96
#> Feature68, Feature21, Feature98, Feature2, Feature99, Feature61, Feature6, Feature35, Feature41, Feature23
#> Negative: Feature84, Feature34, Feature27, Feature72, Feature17, Feature9, Feature80, Feature37, Feature1, Feature42
#> Feature59, Feature22, Feature62, Feature95, Feature29, Feature57, Feature56, Feature15, Feature100, Feature8
#> Feature76, Feature44, Feature92, Feature87, Feature46, Feature88, Feature5, Feature20, Feature85, Feature36
#> Computing nearest neighbor graph
#> Computing SNN
#> 21:38:09 UMAP embedding parameters a = 0.9922 b = 1.112
#> 21:38:09 Read 200 rows and found 10 numeric columns
#> 21:38:09 Using Annoy for neighbor search, n_neighbors = 30
#> 21:38:09 Building Annoy index with metric = cosine, n_trees = 50
#> 0% 10 20 30 40 50 60 70 80 90 100%
#> [----|----|----|----|----|----|----|----|----|----|
#> **************************************************|
#> 21:38:09 Writing NN index file to temp file /tmp/RtmpPbpQkc/file252d3c2ad40
#> 21:38:09 Searching Annoy index using 1 thread, search_k = 3000
#> 21:38:09 Annoy recall = 100%
#> 21:38:10 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
#> 21:38:11 Initializing from normalized Laplacian + noise (using RSpectra)
#> 21:38:11 Commencing optimization for 500 epochs, with 6148 positive edges
#> 21:38:11 Optimization finished
seurat_obj
#> An object of class Seurat
#> 100 features across 200 samples within 1 assay
#> Active assay: RNA (100 features, 100 variable features)
#> 3 layers present: counts, data, scale.data
#> 2 dimensional reductions calculated: pca, umap
You can convert the Seurat
object to an
AnnData
object using the from_Seurat
function:
adata <- from_Seurat(seurat_obj)
adata
#> AnnData object with n_obs × n_vars = 200 × 100
#> obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA'
#> var: 'vf_vst_counts_mean', 'vf_vst_counts_variance', 'vf_vst_counts_variance.expected', 'vf_vst_counts_variance.standardized', 'vf_vst_counts_variable', 'vf_vst_counts_rank', 'var.features', 'var.features.rank'
#> obsm: 'X_pca', 'X_umap'
#> varm: 'PCs'
#> layers: 'counts', 'data', 'scale.data'
#> obsp: 'connectivities', 'snn'
Again note that there is no one-to-one mapping possible between the AnnData and SeuratObject data structures, so some information might be lost during conversion. It is recommended to carefully inspect the converted object to ensure that all necessary information has been transferred.
See ?from_Seurat
for more details on how to customize
the conversion process. Example:
from_Seurat(
seurat_obj,
assay_name = "RNA",
x_mapping = "data",
layers_mapping = c(foo = "counts")
)
#> AnnData object with n_obs × n_vars = 200 × 100
#> obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA'
#> var: 'vf_vst_counts_mean', 'vf_vst_counts_variance', 'vf_vst_counts_variance.expected', 'vf_vst_counts_variance.standardized', 'vf_vst_counts_variable', 'vf_vst_counts_rank', 'var.features', 'var.features.rank'
#> obsm: 'X_pca', 'X_umap'
#> varm: 'PCs'
#> layers: 'foo'
#> obsp: 'connectivities', 'snn'
sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] Seurat_5.2.1 tidyr_1.3.1
#> [3] dplyr_1.1.4 purrr_1.0.4
#> [5] stringr_1.5.1 rprojroot_2.0.4
#> [7] knitr_1.49 tibble_3.2.1
#> [9] rmarkdown_2.29 anndataR_0.99.0
#> [11] SingleCellExperiment_1.29.1 SummarizedExperiment_1.37.0
#> [13] Biobase_2.67.0 GenomicRanges_1.59.1
#> [15] GenomeInfoDb_1.43.4 IRanges_2.41.2
#> [17] S4Vectors_0.45.2 BiocGenerics_0.53.6
#> [19] generics_0.1.3 MatrixGenerics_1.19.1
#> [21] matrixStats_1.5.0 SeuratObject_5.0.2
#> [23] sp_2.2-0 BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] RColorBrewer_1.1-3 sys_3.4.3 jsonlite_1.8.9
#> [4] magrittr_2.0.3 spatstat.utils_3.1-2 farver_2.1.2
#> [7] vctrs_0.6.5 ROCR_1.0-11 spatstat.explore_3.3-4
#> [10] htmltools_0.5.8.1 S4Arrays_1.7.2 SparseArray_1.7.5
#> [13] sass_0.4.9 sctransform_0.4.1 parallelly_1.42.0
#> [16] KernSmooth_2.23-26 bslib_0.9.0 htmlwidgets_1.6.4
#> [19] ica_1.0-3 plyr_1.8.9 plotly_4.10.4
#> [22] zoo_1.8-12 cachem_1.1.0 buildtools_1.0.0
#> [25] igraph_2.1.4 mime_0.12 lifecycle_1.0.4
#> [28] pkgconfig_2.0.3 Matrix_1.7-2 R6_2.5.1
#> [31] fastmap_1.2.0 GenomeInfoDbData_1.2.13 fitdistrplus_1.2-2
#> [34] future_1.34.0 shiny_1.10.0 digest_0.6.37
#> [37] colorspace_2.1-1 patchwork_1.3.0 tensor_1.5
#> [40] RSpectra_0.16-2 irlba_2.3.5.1 progressr_0.15.1
#> [43] spatstat.sparse_3.1-0 polyclip_1.10-7 httr_1.4.7
#> [46] abind_1.4-8 compiler_4.4.2 withr_3.0.2
#> [49] bit64_4.6.0-1 fastDummies_1.7.5 MASS_7.3-64
#> [52] DelayedArray_0.33.5 tools_4.4.2 lmtest_0.9-40
#> [55] httpuv_1.6.15 future.apply_1.11.3 goftest_1.2-3
#> [58] glue_1.8.0 nlme_3.1-167 promises_1.3.2
#> [61] grid_4.4.2 Rtsne_0.17 cluster_2.1.8
#> [64] reshape2_1.4.4 hdf5r_1.3.12 spatstat.data_3.1-4
#> [67] gtable_0.3.6 data.table_1.16.4 XVector_0.47.2
#> [70] spatstat.geom_3.3-5 RcppAnnoy_0.0.22 ggrepel_0.9.6
#> [73] RANN_2.6.2 pillar_1.10.1 spam_2.11-1
#> [76] RcppHNSW_0.6.0 later_1.4.1 splines_4.4.2
#> [79] lattice_0.22-6 deldir_2.0-4 survival_3.8-3
#> [82] bit_4.5.0.1 tidyselect_1.2.1 maketools_1.3.1
#> [85] miniUI_0.1.1.1 pbapply_1.7-2 gridExtra_2.3
#> [88] scattermore_1.2 xfun_0.50 stringi_1.8.4
#> [91] UCSC.utils_1.3.1 lazyeval_0.2.2 yaml_2.3.10
#> [94] evaluate_1.0.3 codetools_0.2-20 BiocManager_1.30.25
#> [97] cli_3.6.3 uwot_0.2.2 xtable_1.8-4
#> [100] reticulate_1.40.0 munsell_0.5.1 jquerylib_0.1.4
#> [103] Rcpp_1.0.14 spatstat.random_3.3-2 globals_0.16.3
#> [106] png_0.1-8 spatstat.univar_3.1-1 parallel_4.4.2
#> [109] ggplot2_3.5.1 dotCall64_1.2 listenv_0.9.1
#> [112] viridisLite_0.4.2 scales_1.3.0 ggridges_0.5.6
#> [115] crayon_1.5.3 rlang_1.1.5 cowplot_1.1.3