Label isotypes based on productive/sterile transcript levels

getIsotype considers, for each cell in count_matrix, the transcript counts of the given isotypes and labels the cell with one of those isotypes. The labelling is controlled by a user-defined function (summary_function) that specifies how to nominate one isotype from a cell's transcript counts, for example the isotype with the maximum sterile transcript count. Optionally, if a k-nearest neighbour graph is given, the function assumes that the graph encodes cell similarity and imputes an isotype for any cell without isotype information by majority voting among its immediate neighbours in the graph.
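To make the role of summary_function concrete, the sketch below applies the default function from the Usage section to a single toy count vector and compares it with a hypothetical alternative that picks the isotype with the largest count. The isotype names, their order and the name max_count_summary are assumptions made for illustration only; they are not part of the package.

```r
# Default summary function, copied from the Usage section
default_summary <- function(z) max(which(z > 0)) - 1

# Hypothetical alternative: pick the isotype with the largest count
max_count_summary <- function(z) unname(which.max(z)) - 1

# Toy counts for one cell; isotype names and order are assumptions
z <- c(IgM = 5, IgG1 = 0, IgA1 = 2)

# 0-based index of the last isotype with a non-zero count (IgA1 here)
default_idx <- default_summary(z)

# 0-based index of the isotype with the largest count (IgM here)
max_idx <- max_count_summary(z)
```

Note that the two functions can disagree: the default as written selects the last non-zero entry rather than the largest one, so the order of the rows in count_matrix matters.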

Usage

getIsotype(
  count_matrix,
  summary_function = function(z) max(which(z > 0)) - 1,
  knn_graph = FALSE,
  impute_positive_counts = TRUE
)

Arguments

count_matrix: a matrix of transcript counts of different isotypes (rows) across cells (columns). You may subset productive/sterile transcripts prior to running this function.
summary_function: a function which takes an integer vector of transcript counts and returns an index (from 0 to n - 1, where n is the total number of isotypes) denoting the chosen isotype. This function is applied only if a kNN graph is supplied via knn_graph; it is applied to each direct neighbour of the given cell as defined in the kNN graph. The default, function(z) max(which(z > 0)) - 1, returns the index of the last isotype in the input vector with a non-zero transcript count.
knn_graph: a k-nearest neighbour graph (igraph object) of cell similarity. If supplied, it is used to impute the isotype annotation of cells in which no transcripts are found for any isotype, by majority voting among the direct neighbours of the cell in the kNN graph. If no information is available and no imputation is possible, the cell is assumed to express IgM. (Default: FALSE, i.e. no imputation is performed)
impute_positive_counts: should cells with observed transcript counts be imputed using the neighbour-voting strategy detailed above? (Default: TRUE)
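The arguments above can be illustrated with a small base-R sketch that subsets sterile transcripts from a toy count matrix and applies the default summary_function to each cell. The row names (a gene name plus an "-S" or "-P" suffix), the cell names and the NA handling for empty cells are assumptions made for this example; they may not match the naming or the internal behaviour of the package.

```r
# Toy count matrix: isotype transcripts in rows, cells in columns.
# Row names with "-S" (sterile) / "-P" (productive) are hypothetical labels.
counts <- matrix(
  c(4, 0, 0,
    1, 0, 2,
    0, 0, 3,
    2, 1, 0,
    0, 0, 1,
    0, 0, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("IGHM-S", "IGHG1-S", "IGHA1-S", "IGHM-P", "IGHG1-P", "IGHA1-P"),
    c("cell1", "cell2", "cell3")
  )
)

# Keep only the sterile transcripts before labelling
sterile <- counts[grepl("-S$", rownames(counts)), , drop = FALSE]

# Default summary function from the Usage section
summary_function <- function(z) max(which(z > 0)) - 1

# Apply it to each cell (column). Cells with no counts are set to NA here,
# because max() on an empty vector would return -Inf with a warning.
idx <- apply(sterile, 2, function(z) {
  if (any(z > 0)) summary_function(z) else NA_real_
})

# Turn the 0-based indices into a factor of isotype labels
isotype <- factor(rownames(sterile)[idx + 1], levels = rownames(sterile))

# With the package loaded, the subsetted matrix could instead be passed
# directly as getIsotype(sterile).
```

In this sketch cell2 has no sterile counts and is left as NA; such cells are the ones that the kNN-based imputation is meant to fill in.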

Value

a factor denoting the chosen isotype for each cell represented in count_matrix.

Details

If a k-nearest neighbour graph is given, the function assumes that it encodes cell similarity and imputes the isotype of any cell without isotype information by majority voting among its immediate neighbours in the graph.
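The neighbour-voting idea can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the adjacency list stands in for the igraph kNN graph, and the function name impute_by_vote, the tie-breaking rule and the toy labels are all assumptions. The IgM fallback follows the description of knn_graph above.

```r
# Isotype labels per cell; NA marks cells without isotype information
labels <- c(a = "IgM", b = "IgG1", c = NA, d = "IgG1", e = NA)

# Hypothetical adjacency list standing in for the kNN graph
neighbours <- list(
  a = c("b", "c"),
  b = c("a", "d"),
  c = c("b", "d", "a"),
  d = c("b", "c"),
  e = character(0)
)

# Impute missing labels by majority vote among labelled direct neighbours
impute_by_vote <- function(labels, neighbours, fallback = "IgM") {
  imputed <- labels
  for (cell in names(labels)[is.na(labels)]) {
    # Votes come from the original labels, not from other imputed cells
    votes <- labels[neighbours[[cell]]]
    votes <- votes[!is.na(votes)]
    imputed[cell] <- if (length(votes) > 0) {
      # Ties resolve to the first label in sorted order
      names(which.max(table(votes)))
    } else {
      # No labelled neighbours: fall back to IgM
      fallback
    }
  }
  imputed
}

imputed <- impute_by_vote(labels, neighbours)
```

Here cell c receives IgG1 (two of its three neighbours carry that label), while cell e has no neighbours and falls back to IgM.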