Using anndataR to read and convert

Introduction

This package lets users work with .h5ad files, access the individual slots of the stored datasets, and convert them to SingleCellExperiment objects and SeuratObjects, and vice versa.

Check out ?anndataR for a full list of the functions provided by this package.

Installation

Install anndataR with BiocManager:

if (!require("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("anndataR")

Usage

This package provides an abstract interface for AnnData objects. The interface closely models its Python counterpart: it stores a data matrix X, annotations for observations (obs, obsm, obsp) and for variables (var, varm, varp), and unstructured metadata (uns).
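To make the layout concrete, here is a minimal base-R sketch of how these slots are expected to line up. It does not call anndataR: the list `toy` and the element names inside obsm, varm, obsp and varp are hypothetical, and the shapes follow the conventions of the Python AnnData format (per-observation slots have one row per observation, pairwise slots are square).

```r
# Illustrative sketch only: a plain list standing in for an AnnData object.
# `toy` and the names embedding/loadings/distances/correlations are hypothetical.
n_obs <- 3L
n_vars <- 5L

toy <- list(
  X = matrix(seq_len(n_obs * n_vars), n_obs, n_vars), # observations x variables
  obs = data.frame(row.names = LETTERS[seq_len(n_obs)], cell = seq_len(n_obs)),
  var = data.frame(row.names = letters[seq_len(n_vars)], gene = seq_len(n_vars)),
  obsm = list(embedding = matrix(0, n_obs, 2L)),       # one row per observation
  varm = list(loadings = matrix(0, n_vars, 2L)),       # one row per variable
  obsp = list(distances = matrix(0, n_obs, n_obs)),    # observation x observation
  varp = list(correlations = matrix(0, n_vars, n_vars)), # variable x variable
  uns = list(note = "free-form metadata")
)

# Check that every annotation agrees with the dimensions of X.
shapes_ok <- c(
  obs = nrow(toy$obs) == nrow(toy$X),
  var = nrow(toy$var) == ncol(toy$X),
  obsm = all(vapply(toy$obsm, nrow, integer(1)) == n_obs),
  varm = all(vapply(toy$varm, nrow, integer(1)) == n_vars),
  obsp = all(vapply(toy$obsp, function(m) all(dim(m) == n_obs), logical(1))),
  varp = all(vapply(toy$varp, function(m) all(dim(m) == n_vars), logical(1)))
)
print(shapes_ok)
```

The same kind of dimension check can be a useful sanity test on a real AnnData object before converting it to another format.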

This abstract interface is implemented by different backends. Currently, the following backends are implemented:

  1. InMemoryAnnData

  2. HDF5AnnData

The InMemoryAnnData backend allows you to construct an AnnData object in memory. The HDF5AnnData backend allows you to read in an AnnData object from an .h5ad file.

HDF5AnnData backend

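Here is an example of how to read in an .h5ad file. The to argument selects the class of the returned object; in this example the file contents are loaded as an InMemoryAnnData.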

library(anndataR)
file <- system.file("extdata", "example.h5ad", package = "anndataR")
adata <- read_h5ad(file, to = "InMemoryAnnData")

The individual slots can then be accessed with the $ operator:

X <- adata$X
layers <- adata$layers
obs <- adata$obs
var <- adata$var
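One detail worth guarding against when locating the example file: system.file() returns an empty string, not an error, when the package or file cannot be found. The sketch below only checks the path and does not call anndataR; the variable name `file_found` is made up for illustration.

```r
# system.file() yields "" if the package is not installed or the file is missing.
file <- system.file("extdata", "example.h5ad", package = "anndataR")
file_found <- nzchar(file)

if (file_found) {
  message("Found example file at: ", file)
} else {
  message("example.h5ad not found; check that anndataR is installed.")
}
```

Checking the path first tends to give a clearer message than letting the reader fail on an empty file name.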

InMemoryAnnData backend

The following example shows how to construct an InMemoryAnnData object and access its contents.

adata <- AnnData(
  X = matrix(1:15, 3L, 5L),
  layers = list(
    A = matrix(5:1, 3L, 5L),
    B = matrix(letters[1:5], 3L, 5L)
  ),
  obs = data.frame(row.names = LETTERS[1:3], cell = 1:3),
  var = data.frame(row.names = letters[1:5], gene = 1:5)
)
adata
#> AnnData object with n_obs × n_vars = 3 × 5
#>     obs: 'cell'
#>     var: 'gene'
#>     layers: 'A', 'B'
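Note that layer A in the example above is built from a five-element vector, which R recycles column by column to fill the 3 × 5 matrix. This base-R snippet shows the resulting values; it is independent of anndataR.

```r
# matrix() fills column-wise and recycles 5:1 three times to reach 15 cells.
A <- matrix(5:1, 3L, 5L)
dim(A)
#> [1] 3 5
A[, 1]
#> [1] 5 4 3
A[, 2]
#> [1] 2 1 5
```

Because 15 is a multiple of 5, the recycling happens silently, so it is worth inspecting such layers when the values matter.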

As before, the individual slots can be accessed with the $ operator:

X <- adata$X
layers <- adata$layers
obs <- adata$obs
var <- adata$var

You can convert the AnnData object to a SingleCellExperiment object or to a SeuratObject in the following way:

sce <- adata$to_SingleCellExperiment()
seurat <- adata$to_Seurat()
#> Warning in matrix(data = as.numeric(x = x), ncol = nc): NAs introduced by
#> coercion
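The warning above most likely stems from layer B, which holds character values (letters) that cannot be coerced to numbers during the conversion. This is an interpretation of the warning text, not something stated by the package; the base-R snippet below reproduces the same coercion behaviour in isolation.

```r
# Letters have no numeric interpretation, so as.numeric() returns NA for each
# value and emits "NAs introduced by coercion" (suppressed here).
b_values <- letters[1:5]
coerced <- suppressWarnings(as.numeric(b_values))
print(coerced)
#> [1] NA NA NA NA NA
```

If the warning is unwanted, using only numeric layers in the example object should avoid it.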

Session info

sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] anndataR_0.99.0             SingleCellExperiment_1.27.2
#>  [3] SummarizedExperiment_1.35.1 Biobase_2.65.1             
#>  [5] GenomicRanges_1.57.1        GenomeInfoDb_1.41.1        
#>  [7] IRanges_2.39.2              S4Vectors_0.43.2           
#>  [9] BiocGenerics_0.51.1         MatrixGenerics_1.17.0      
#> [11] matrixStats_1.4.1           SeuratObject_5.0.2         
#> [13] sp_2.1-4                    BiocStyle_2.33.1           
#> 
#> loaded via a namespace (and not attached):
#>  [1] sass_0.4.9              future_1.34.0           generics_0.1.3         
#>  [4] SparseArray_1.5.35      lattice_0.22-6          listenv_0.9.1          
#>  [7] digest_0.6.37           evaluate_0.24.0         grid_4.4.1             
#> [10] fastmap_1.2.0           jsonlite_1.8.8          Matrix_1.7-0           
#> [13] BiocManager_1.30.25     httr_1.4.7              spam_2.10-0            
#> [16] UCSC.utils_1.1.0        codetools_0.2-20        jquerylib_0.1.4        
#> [19] abind_1.4-8             cli_3.6.3               crayon_1.5.3           
#> [22] rlang_1.1.4             XVector_0.45.0          parallelly_1.38.0      
#> [25] future.apply_1.11.2     bit64_4.0.5             DelayedArray_0.31.11   
#> [28] cachem_1.1.0            yaml_2.3.10             S4Arrays_1.5.7         
#> [31] tools_4.4.1             parallel_4.4.1          GenomeInfoDbData_1.2.12
#> [34] globals_0.16.3          buildtools_1.0.0        R6_2.5.1               
#> [37] lifecycle_1.0.4         zlibbioc_1.51.1         bit_4.0.5              
#> [40] hdf5r_1.3.11            progressr_0.14.0        bslib_0.8.0            
#> [43] Rcpp_1.0.13             xfun_0.47               sys_3.4.2              
#> [46] knitr_1.48              htmltools_0.5.8.1       rmarkdown_2.28         
#> [49] maketools_1.3.0         dotCall64_1.1-1         compiler_4.4.1