Alevin-Tutorial


A support website for Alevin-tool (part of Salmon). How Tos and FAQs


How to use alevin with iSEE

Preparing alevin output for exploration with iSEE

iSEE (Rue-Albrecht, Marini, Soneson and Lun, F1000Research 2018) is an R package for interactive exploratory analysis and visualization of data provided in SummarizedExperiment objects (or any class extending SummarizedExperiment). Here, we show how to prepare the output generated by alevin in a form suitable for exploration with iSEE. We illustrate the procedure using the example data provided in the alevinQC R package, which consists of a mix of mouse and human cells, and was quantified using a combined mouse and human reference transcriptome.

suppressPackageStartupMessages({
    library(tximeta)
    library(iSEE)
    library(shiny)
    library(scater)
    library(scran)
    library(SC3)
    library(cowplot)
})

Read alevin output with tximeta

Alevin output files can be imported into R using the tximport or tximeta packages. While tximport returns a count matrix, tximeta generates a SummarizedExperiment object, and attempts to automatically retrieve information about the transcriptome that was used for the alevin quantification. Note that you need tximport >= 1.11.5 or tximeta >= 1.1.10 to read alevin output. Alternatively, you can load the alevin count matrix with the code provided in the alevin-monocle tutorial and create the SummarizedExperiment object manually.
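The version requirements above can be checked before running the import. The following is a minimal sketch using only base R; `has_min_version` is a hypothetical helper written for this illustration and is not part of tximport or tximeta.

```r
## Hypothetical helper (not part of tximport or tximeta): returns TRUE if
## `pkg` is installed and its version is at least `min_version`
has_min_version <- function(pkg, min_version) {
    requireNamespace(pkg, quietly = TRUE) &&
        utils::packageVersion(pkg) >= min_version
}

## Minimum versions quoted in the text above
required <- c(tximport = "1.11.5", tximeta = "1.1.10")

## One logical value per package, named by package
ok <- vapply(names(required),
             function(pkg) has_min_version(pkg, required[[pkg]]),
             logical(1))
print(ok)

if (!any(ok)) {
    message("Neither tximport nor tximeta is recent enough to read alevin ",
            "output; consider updating them with BiocManager::install().")
}
```

If both entries are `FALSE`, updating the packages through Bioconductor is likely the simplest fix.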

## Path to alevin quant matrix
alevinquant <- system.file("extdata/alevin_example_pre0.14/alevin/quants_mat.gz", 
                           package = "alevinQC")

## Read alevin output with tximeta and generate a SummarizedExperiment object
alevin <- tximeta::tximeta(coldata = data.frame(files = alevinquant,
                                                names = "sample1"), 
                           type = "alevin")
## importing quantifications
## using importer for pre-v0.14.0 Alevin files
## reading in alevin gene-level counts across cells
## couldn't find matching transcriptome, returning non-ranged SummarizedExperiment

The code above generates a SummarizedExperiment object, with the UMI counts from alevin contained in the counts assay.

alevin
## class: SummarizedExperiment 
## dim: 107450 298 
## metadata(3): tximetaInfo quantInfo countsFromAbundance
## assays(1): counts
## rownames(107450): ENSG00000223972.5 ENSG00000227232.5 ...
##   ENSMUSG00000094855.1 ENSMUSG00000095041.7
## rowData names(0):
## colnames(298): CGTAGCGCACACATGT CTGGTCTTCGGTTAAC ... TGGGCGTAGCGTCAAG
##   CGAGCACTCCCAGGTG
## colData names(0):

dim(assay(alevin, "counts"))
## [1] 107450    298
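For readers who load the count matrix by other means and build the object manually, the structure shown above can be mimicked on a small scale. This is only a sketch: the matrix values, gene names and cell barcodes below are made up for illustration, and a real alevin matrix would take the place of `counts`.

```r
library(SummarizedExperiment)

## Toy stand-in for a gene-by-cell UMI count matrix (made-up values,
## gene names and barcodes)
set.seed(1)
counts <- matrix(rpois(12, lambda = 2), nrow = 4,
                 dimnames = list(paste0("gene", 1:4),
                                 c("AAAC", "CCCG", "GGGT")))

## Wrap the matrix in a SummarizedExperiment with a single "counts" assay,
## mirroring the layout of the object returned by tximeta
se <- SummarizedExperiment(assays = list(counts = counts))

se
dim(assay(se, "counts"))
```

Genes are stored as rows and cells as columns, so row and column annotations can later be added through `rowData()` and `colData()`.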

The SummarizedExperiment object generated by tximeta can be passed directly to iSEE for exploration.

app <- iSEE(alevin)
shiny::runApp(app)

Add metadata and analysis results

At this stage, any additional row (gene) and/or column (cell) annotations, as well as reduced dimension representations, can be added to the object for visualization with iSEE. As an example, we will first flag the cells that are included in the final whitelist from alevin, and subset the SummarizedExperiment object to only those cells. Then, after normalizing and log-transforming the counts with the scran and scater packages, we will use scater to calculate QC metrics and perform dimension reduction using PCA and t-SNE, and finally cluster the cells using the SC3 package. All output of these operations is stored in the object created above (which is converted to a SingleCellExperiment to allow the inclusion of the reduced dimension representations). Note that the code below is just an example, and there are many other ways of performing these analyses. For more detailed information about each step, see the documentation and publications of the respective packages. Many of the concepts are also discussed in detail in the online book Orchestrating Single-Cell Analysis with Bioconductor.

## Add information about whitelist membership
whitelist <- read.delim(system.file("extdata/alevin_example_pre0.14/alevin/whitelist.txt", 
                                    package = "alevinQC"),
                        header = FALSE, as.is = TRUE)$V1
alevin$whitelist <- colnames(alevin) %in% whitelist

## Subset to only cells in whitelist
alevin <- alevin[, alevin$whitelist]
## Convert to SingleCellExperiment
alevin <- as(alevin, "SingleCellExperiment")

## Normalize using scran/scater
alevin <- scran::computeSumFactors(alevin, min.mean = 0.25)
alevin <- scater::logNormCounts(alevin)

## Calculate QC metrics using scater
alevin <- scater::addPerCellQC(alevin)
alevin <- scater::addPerFeatureQC(alevin)

## Dimension reduction
set.seed(42)
alevin <- scater::runPCA(alevin)
alevin <- scater::runTSNE(alevin)

## t-SNE plots, coloured by the expression of a human gene and a mouse gene, 
## respectively
cowplot::plot_grid(
    scater::plotTSNE(alevin, colour_by = "ENSG00000140988.15"),
    scater::plotTSNE(alevin, colour_by = "ENSMUSG00000091957.3"),
    nrow = 1
)

## Clustering with SC3
rowData(alevin)$feature_symbol <- rownames(alevin)
alevin <- sc3(alevin, n_cores = 1, ks = 2:5)
## Setting SC3 parameters...
## Calculating distances between the cells...
## Performing transformations and calculating eigenvectors...
## Performing k-means clustering...
## Calculating consensus matrix...

## t-SNE plot, coloured by the assigned cluster labels
scater::plotTSNE(alevin, colour_by = "sc3_2_clusters")

app <- iSEE(alevin)
shiny::runApp(app)
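By default, iSEE chooses which panels to display. As a hedged sketch, assuming iSEE version 2.0 or later (where panels are configured through panel classes and the `initial` argument), the app can be opened directly on a t-SNE plot coloured by a cluster label. The toy object, its `cluster` and `total` columns and the random coordinates below are made up for illustration; with the object from this tutorial one would pass `alevin` and a column such as `sc3_2_clusters` instead.

```r
library(SingleCellExperiment)
library(iSEE)

## Toy SingleCellExperiment (made-up counts, cluster labels and coordinates)
set.seed(1)
counts <- matrix(rpois(200, lambda = 5), nrow = 20,
                 dimnames = list(paste0("gene", 1:20),
                                 paste0("cell", 1:10)))
toy <- SingleCellExperiment(
    assays = list(counts = counts),
    colData = DataFrame(cluster = factor(rep(c("1", "2"), each = 5)),
                        total = colSums(counts))
)
reducedDim(toy, "TSNE") <- matrix(rnorm(20), ncol = 2)

## Panels to show at start-up: a t-SNE plot coloured by cluster, and the
## total count per cell split by cluster
initial <- list(
    ReducedDimensionPlot(Type = "TSNE",
                         ColorBy = "Column data",
                         ColorByColumnData = "cluster"),
    ColumnDataPlot(YAxis = "total",
                   XAxis = "Column data",
                   XAxisColumnData = "cluster")
)

app <- iSEE(toy, initial = initial)

## Only launch the app in an interactive session
if (interactive()) {
    shiny::runApp(app)
}
```

Panels that are not listed in `initial` should remain available from within the app, so this only changes what is shown first.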

Session info

sessionInfo()
## R Under development (unstable) (2020-03-13 r77937)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] cowplot_1.0.0               SC3_1.15.1                 
##  [3] scran_1.15.15               scater_1.15.28             
##  [5] ggplot2_3.3.0               shiny_1.4.0.2              
##  [7] iSEE_1.99.5                 SingleCellExperiment_1.9.2 
##  [9] SummarizedExperiment_1.17.3 DelayedArray_0.13.7        
## [11] BiocParallel_1.21.2         matrixStats_0.56.0         
## [13] Biobase_2.47.3              GenomicRanges_1.39.2       
## [15] GenomeInfoDb_1.23.13        IRanges_2.21.5             
## [17] S4Vectors_0.25.13           BiocGenerics_0.33.3        
## [19] tximeta_1.5.30             
## 
## loaded via a namespace (and not attached):
##   [1] circlize_0.4.8                AnnotationHub_2.19.7         
##   [3] BiocFileCache_1.11.4          igraph_1.2.4.2               
##   [5] lazyeval_0.2.2                shinydashboard_0.7.1         
##   [7] splines_4.0.0                 digest_0.6.25                
##   [9] foreach_1.4.8                 ensembldb_2.11.3             
##  [11] htmltools_0.4.0               viridis_0.5.1                
##  [13] gdata_2.18.0                  magrittr_1.5                 
##  [15] memoise_1.1.0                 doParallel_1.0.15            
##  [17] cluster_2.1.0                 ROCR_1.0-7                   
##  [19] limma_3.43.5                  ComplexHeatmap_2.3.2         
##  [21] Biostrings_2.55.6             askpass_1.1                  
##  [23] prettyunits_1.1.1             colorspace_1.4-1             
##  [25] rrcov_1.5-2                   blob_1.2.1                   
##  [27] rappdirs_0.3.1                WriteXLS_5.0.0               
##  [29] xfun_0.12                     dplyr_0.8.5                  
##  [31] crayon_1.3.4                  RCurl_1.98-1.1               
##  [33] jsonlite_1.6.1                tximport_1.15.11             
##  [35] iterators_1.0.12              glue_1.3.2                   
##  [37] gtable_0.3.0                  zlibbioc_1.33.1              
##  [39] XVector_0.27.1                GetoptLong_0.1.8             
##  [41] BiocSingular_1.3.2            shape_1.4.4                  
##  [43] DEoptimR_1.0-8                scales_1.1.0                 
##  [45] mvtnorm_1.1-0                 pheatmap_1.0.12              
##  [47] rngtools_1.5                  DBI_1.1.0                    
##  [49] edgeR_3.29.1                  miniUI_0.1.1.1               
##  [51] Rcpp_1.0.4                    viridisLite_0.3.0            
##  [53] xtable_1.8-4                  progress_1.2.2               
##  [55] clue_0.3-57                   dqrng_0.2.1                  
##  [57] bit_1.1-15.2                  rsvd_1.0.3                   
##  [59] DT_0.13                       htmlwidgets_1.5.1            
##  [61] httr_1.4.1                    gplots_3.0.3                 
##  [63] RColorBrewer_1.1-2            shinyAce_0.4.1               
##  [65] farver_2.0.3                  pkgconfig_2.0.3              
##  [67] XML_3.99-0.3                  dbplyr_1.4.2                 
##  [69] locfit_1.5-9.2                labeling_0.3                 
##  [71] tidyselect_1.0.0              rlang_0.4.5                  
##  [73] later_1.0.0                   AnnotationDbi_1.49.1         
##  [75] munsell_0.5.0                 BiocVersion_3.11.1           
##  [77] tools_4.0.0                   RSQLite_2.2.0                
##  [79] rintrojs_0.2.2                evaluate_0.14                
##  [81] stringr_1.4.0                 fastmap_1.0.1                
##  [83] yaml_2.2.1                    knitr_1.28                   
##  [85] bit64_0.9-7                   robustbase_0.93-6            
##  [87] caTools_1.18.0                purrr_0.3.3                  
##  [89] AnnotationFilter_1.11.0       doRNG_1.8.2                  
##  [91] nlme_3.1-145                  mime_0.9                     
##  [93] biomaRt_2.43.3                compiler_4.0.0               
##  [95] beeswarm_0.2.3                curl_4.3                     
##  [97] png_0.1-7                     interactiveDisplayBase_1.25.0
##  [99] e1071_1.7-3                   tibble_2.1.3                 
## [101] statmod_1.4.34                pcaPP_1.9-73                 
## [103] stringi_1.4.6                 GenomicFeatures_1.39.7       
## [105] lattice_0.20-40               ProtGenerics_1.19.3          
## [107] Matrix_1.2-18                 shinyjs_1.1                  
## [109] vctrs_0.2.4                   pillar_1.4.3                 
## [111] lifecycle_0.2.0               BiocManager_1.30.10          
## [113] GlobalOptions_0.1.1           BiocNeighbors_1.5.2          
## [115] bitops_1.0-6                  irlba_2.3.3                  
## [117] httpuv_1.5.2                  rtracklayer_1.47.0           
## [119] R6_2.4.1                      promises_1.1.0               
## [121] KernSmooth_2.23-16            gridExtra_2.3                
## [123] vipor_0.4.5                   codetools_0.2-16             
## [125] gtools_3.8.1                  colourpicker_1.0             
## [127] assertthat_0.2.1              openssl_1.4.1                
## [129] rjson_0.2.20                  shinyWidgets_0.5.1           
## [131] withr_2.1.2                   GenomicAlignments_1.23.1     
## [133] Rsamtools_2.3.7               GenomeInfoDbData_1.2.2       
## [135] mgcv_1.8-31                   hms_0.5.3                    
## [137] grid_4.0.0                    class_7.3-15                 
## [139] rmarkdown_2.1.1               DelayedMatrixStats_1.9.0     
## [141] Rtsne_0.15                    ggbeeswarm_0.6.0