SnowyOwl

SnowyOwl.Preprocess.quality_control_metrics - Function
quality_control_metrics(p; kwargs...)
quality_control_metrics(X, obs, var; obsindex, varindex, kwargs...)

Calculate quality control metrics.

Arguments

  • p::Profile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.
  • obs::DataFrame: The observation table with per-cell information.
  • var::DataFrame: The feature table with per-gene information.

Common Keyword arguments

  • obsindex::Symbol=:barcode: The index column of the obs DataFrame.
  • varindex::Symbol=:gene_symbols: The index column of the var DataFrame.
  • use_log1p::Bool: Whether to compute log1p-transformed quality control metrics.
  • qc_vars::AbstractVector{String}=["mt"]: Names of the boolean masks that identify the variables to calculate on.
  • percent_top=nothing: The proportions of top genes to calculate on.

Example
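
As a rough illustration of the kind of per-cell metrics involved, the sketch below computes total counts, detected genes, and the percentage of counts in a masked gene set using plain Julia. It does not call SnowyOwl: the genes × cells orientation, the toy matrix, the `mt` mask and every variable name are assumptions made for this example, and the columns SnowyOwl actually produces may differ.

```julia
# Toy count matrix, genes × cells (orientation assumed for this sketch).
X = [8 0 3;
     2 5 2;
     4 0 0;
     6 5 0]

# Hypothetical boolean mask marking "mt" genes, one entry per gene (row).
mt = [false, false, true, true]

total_counts  = vec(sum(X; dims=1))          # total counts per cell
n_genes       = vec(sum(X .> 0; dims=1))     # genes with nonzero counts per cell
mt_counts     = vec(sum(X[mt, :]; dims=1))   # counts inside the masked gene set
pct_counts_mt = 100 .* mt_counts ./ total_counts

# log1p-transformed variant of a metric, as the use_log1p option suggests.
log1p_total_counts = log1p.(total_counts)
```

Each result is a vector with one entry per cell, which is the shape you would expect to attach to a per-cell table such as obs.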

See also quality_control_metrics! for inplace operation.

SnowyOwl.Preprocess.quality_control_metrics! - Function
quality_control_metrics!(p; kwargs...)
quality_control_metrics!(X, obs, var; kwargs...)

Calculate quality control metrics.

Arguments

  • p::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.
  • obs::DataFrame: The observation table with per-cell information.
  • var::DataFrame: The feature table with per-gene information.

Common Keyword arguments

  • use_log1p::Bool: Whether to compute log1p-transformed quality control metrics.
  • qc_vars::AbstractVector{String}=["mt"]: Names of the boolean masks that identify the variables to calculate on.
  • percent_top=nothing: The proportions of top genes to calculate on.

Example

See also quality_control_metrics for non-inplace operation.

SnowyOwl.Preprocess.normalize - Function
normalize(prof, method; omicsname=:RNA, layer=:count, kwargs...)
normalize(X, method; kwargs...)

Normalize counts per cell.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.
  • method: Method used to normalize counts. Available methods are LogNormalization, RelativeNormalization, CenteredLogRatioNormalization and CustomNormalization.

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.

Common keyword arguments

  • scaled_size::Real=1e4: Total count that each cell is scaled to, summed over all genes.

Examples
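
To make the role of scaled_size concrete, here is a minimal plain-Julia sketch of per-cell scaling followed by a log1p step. It does not call SnowyOwl; the genes × cells orientation and the variable names are assumptions, and reading "log normalization" as "scale, then log1p" is an assumption borrowed from comparable toolkits rather than something this page states.

```julia
# Toy count matrix, genes × cells (orientation assumed for this sketch).
X = Float64[1 0;
            3 5;
            0 5]

scaled_size = 1e4

# Total counts per cell, kept as a 1 × ncell row so it broadcasts over genes.
totals = sum(X; dims=1)

# Relative normalization: every cell sums to scaled_size afterwards.
relative = X ./ totals .* scaled_size

# Assumed log-normalization step: log(1 + x) of the scaled values.
lognorm = log1p.(relative)
```

After scaling, every column of `relative` sums to `scaled_size`, so cells with different sequencing depth become comparable.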

See also normalize! for inplace operation.

SnowyOwl.Preprocess.normalize! - Function
normalize!(prof, method; omicsname=:RNA, layer=:count, kwargs...)
normalize!(X, method; kwargs...)

Normalize counts per cell.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.
  • method: Method used to normalize counts. Available methods are LogNormalization, RelativeNormalization, CenteredLogRatioNormalization and CustomNormalization.

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.

Common keyword arguments

  • scaled_size::Real=1e4: Total count that each cell is scaled to, summed over all genes.

Examples

See also normalize for non-inplace operation.

SnowyOwl.Preprocess.logarithmize - Function
logarithmize(prof; omicsname=:RNA, layer=:count, kwargs...)
logarithmize(X; kwargs...)

Logarithmize the data matrix.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.

Common keyword arguments

  • base::Real=ℯ: Base of the logarithm.

Examples
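
The following plain-Julia sketch illustrates a logarithm with a configurable base, together with the usual Julia convention that the `!` variant overwrites its argument. The helpers `log_transform` and `log_transform!` are hypothetical names for this example, not SnowyOwl functions, and the choice of log(1 + x) (so that zero counts stay finite) is an assumption about the transform.

```julia
# Out-of-place: returns a new matrix and leaves X untouched.
# Hypothetical helper; computes log(1 + x) in the requested base.
log_transform(X::AbstractMatrix; base::Real=ℯ) = log1p.(X) ./ log(base)

# In-place: overwrites X, so X needs a floating-point element type.
function log_transform!(X::AbstractMatrix{<:AbstractFloat}; base::Real=ℯ)
    X .= log1p.(X) ./ log(base)
    return X
end

X = [0.0 1.0;
     3.0 7.0]

Y = log_transform(X; base=2)   # X still holds the original values here
log_transform!(X; base=2)      # X now holds the transformed values
```

With base 2, the entries 0, 1, 3 and 7 map to approximately 0, 1, 2 and 3.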

See also logarithmize! for inplace operation.

SnowyOwl.Preprocess.logarithmize! - Function
logarithmize!(prof; omicsname=:RNA, layer=:count, kwargs...)
logarithmize!(X; kwargs...)

Logarithmize the data matrix.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.

Common keyword arguments

  • base::Real=ℯ: Base of the logarithm.

Examples

See also logarithmize for non-inplace operation.

SnowyOwl.Preprocess.highly_variable_genes - Function
highly_variable_genes(prof, method; omicsname=:RNA, layer=:count, kwargs...)
highly_variable_genes(X, var, method; varname=:gene_symbols, kwargs...)

Calculate highly variable genes and return a new DataFrame with a column :highlyvariable that marks the highly variable genes in the given gene set. The additional columns :means, :dispersions and :dispersionsnorm are added to the returned DataFrame.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.
  • var::DataFrame: The feature information matrix with gene information.
  • method::HighlyVariableMethod: Method used to calculate highly variable genes. Available methods are CellRangerHVG, SeuratHVG and Seuratv3HVG.

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.
  • varname::Symbol: The column of var used as the identifier for genes.

Common keyword arguments

  • ntop_genes::Int=-1: Number of top variable genes to be selected. Specify -1 to switch to selection by mean and dispersion. Available for CellRangerHVG and SeuratHVG methods.
  • min_disp::Real=0.5: Minimum dispersion for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.
  • max_disp::Real=Inf: Maximum dispersion for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.
  • min_mean::Real=0.0125: Minimum mean for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.
  • max_mean::Real=3.: Maximum mean for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.

Examples
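
The two selection modes described by the keyword arguments (top-n by dispersion, or thresholds on mean and dispersion) can be sketched in plain Julia as follows. `select_variable_genes` is a hypothetical helper, not a SnowyOwl function. It uses the raw variance-to-mean ratio as the dispersion and omits any binning or dispersion normalization that the listed methods may apply, so its output is not expected to match theirs.

```julia
using Statistics

# Hypothetical helper: returns a Bool mask with one entry per gene (row of X).
function select_variable_genes(X::AbstractMatrix; ntop_genes::Int=-1,
                               min_disp::Real=0.5, max_disp::Real=Inf,
                               min_mean::Real=0.0125, max_mean::Real=3.0)
    means = vec(mean(X; dims=2))
    variances = vec(var(X; dims=2))
    # Raw dispersion = variance / mean; genes with zero mean get dispersion 0.
    dispersions = [m > 0 ? v / m : 0.0 for (m, v) in zip(means, variances)]

    if ntop_genes > 0
        # Keep the ntop_genes genes with the largest dispersion.
        n = min(ntop_genes, length(dispersions))
        keep = falses(length(dispersions))
        keep[sortperm(dispersions; rev=true)[1:n]] .= true
        return keep
    end

    # ntop_genes == -1: select by mean and dispersion thresholds instead.
    return (means .>= min_mean) .& (means .<= max_mean) .&
           (dispersions .>= min_disp) .& (dispersions .<= max_disp)
end

# Toy matrix, genes × cells (orientation assumed for this sketch).
X = [ 1 1  1 1;
      0 4  0 4;
      0 0  0 0;
     10 0 10 0]

by_threshold = select_variable_genes(X)
by_rank = select_variable_genes(X; ntop_genes=1)
```

With the default thresholds only the second gene passes, because the fourth gene's mean exceeds max_mean. Ranking by dispersion alone picks the fourth gene instead.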

See also highly_variable_genes! for inplace operation.

SnowyOwl.Preprocess.highly_variable_genes! - Function
highly_variable_genes!(prof, method; omicsname=:RNA, layer=:count, kwargs...)
highly_variable_genes!(X, var, method; kwargs...)

Calculate highly variable genes and modify var directly by adding a column :highlyvariable that marks the highly variable genes in the given gene set. The additional columns :means, :dispersions and :dispersionsnorm are added to var.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on.
  • var::DataFrame: The feature information matrix with gene information.
  • method: Method used to calculate highly variable genes. Available methods are CellRangerHVG, SeuratHVG and Seuratv3HVG.

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.

Common keyword arguments

  • ntop_genes::Int=-1: Number of top variable genes to be selected. Specify -1 to switch to selection by mean and dispersion. Available for CellRangerHVG and SeuratHVG methods.
  • min_disp::Real=0.5: Minimum dispersion for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.
  • max_disp::Real=Inf: Maximum dispersion for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.
  • min_mean::Real=0.0125: Minimum mean for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.
  • max_mean::Real=3.: Maximum mean for selecting highly variable genes. Available for CellRangerHVG and SeuratHVG methods.

Examples

See also highly_variable_genes for non-inplace operation.

SnowyOwl.Analysis.project - Function
project(prof, method; omicsname=:RNA, layer=:count, kwargs...)
project(X, method; kwargs...)

Project the count matrix from the original space to a low-dimensional space.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on. X should have size (nfeature, nsample), where nfeature is the number of features and nsample is the number of samples.
  • method::AnalysisMethod: Method used to project the count matrix. Available methods are PCAMethod() and UMAPMethod().

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.

Common keyword arguments

  • dims::Int=50: The dimension of the target space. It must be less than or equal to the number of features in the count matrix, that is, dims <= nfeature.
  • n_neighbors::Int=20: The number of local neighborhood points used in approximations of manifold structure. It only takes effect when method = UMAPMethod().
  • min_dist::Real=0.5: Controls how tightly the embedding is allowed to compress points together. Larger values prevent points from packing together and tend to preserve global structure. Smaller values result in a more clustered embedding and tend to preserve local structure. It only takes effect when method = UMAPMethod().

Examples
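
For intuition about what a PCA-style projection of a (nfeature, nsample) matrix produces, the sketch below uses only Julia's standard LinearAlgebra and Statistics libraries. `pca_project` is a hypothetical helper, not SnowyOwl's implementation, and PCA via a thin SVD of the centered matrix is one common choice assumed here. The extra check against nsample is needed only because the thin SVD returns at most min(nfeature, nsample) components.

```julia
using LinearAlgebra, Statistics

# Hypothetical helper: project X (nfeature × nsample) onto its first
# `dims` principal axes, returning a dims × nsample matrix.
function pca_project(X::AbstractMatrix, dims::Int)
    nfeature, nsample = size(X)
    dims <= min(nfeature, nsample) ||
        throw(ArgumentError("dims must not exceed min(nfeature, nsample)"))
    centered = X .- mean(X; dims=2)      # center each feature across samples
    F = svd(centered)                    # centered == F.U * Diagonal(F.S) * F.Vt
    return F.U[:, 1:dims]' * centered    # coordinates along the leading axes
end

# Toy matrix with 3 features and 4 samples.
X = [1.0 2.0 3.0 4.0;
     2.0 4.0 6.0 8.0;
     0.0 1.0 0.0 1.0]

Y = pca_project(X, 2)
```

Each column of `Y` is one sample expressed in the reduced space, and the rows are ordered by decreasing variance.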

See also project! for inplace operation.

SnowyOwl.Analysis.project! - Function
project!(prof, method; omicsname=:RNA, layer=:count, kwargs...)
project!(X, method; kwargs...)

Project the data matrix from the original space to a (usually) low-dimensional space.

Arguments

  • prof::AnnotatedProfile: The profile object to calculate on.
  • X::AbstractMatrix: The count matrix to calculate on. X should have size (nfeature, nsample), where nfeature is the number of features and nsample is the number of samples.
  • method::AnalysisMethod: Method used to project the count matrix. Available methods are PCAMethod() and UMAPMethod().

Specific keyword arguments

  • omicsname::Symbol: The OmicsProfile specified to calculate on.
  • layer::Symbol: The layer specified to calculate on.

Common keyword arguments

  • dims::Int=50: The dimension of the target space. It must be less than or equal to the number of features in the count matrix, that is, dims <= nfeature.
  • n_neighbors::Int=20: The number of local neighborhood points used in approximations of manifold structure. It only takes effect when method = UMAPMethod().
  • min_dist::Real=0.5: Controls how tightly the embedding is allowed to compress points together. Larger values prevent points from packing together and tend to preserve global structure. Smaller values result in a more clustered embedding and tend to preserve local structure. It only takes effect when method = UMAPMethod().

Examples

See also project for non-inplace operation.
